INTERNATIONAL JOURNAL OF SCIENTIFIC DEVELOPMENT AND RESEARCH International Peer Reviewed & Refereed Journals, Open Access Journal ISSN Approved Journal No: 2455-2631 | Impact factor: 8.15 | ESTD Year: 2016
V V S SAI YASASWINI, Nikhira Sajeevan, BURI SIVAKUMAR, R.J.Ramasree
Unique Id:
IJSDR2404047
Published In:
Volume 9 Issue 4, April-2024
Abstract:
Part-of-speech (POS) tagging is a fundamental task in Natural Language Processing (NLP) that assigns grammatical categories to the words in a sentence. This paper first outlines the importance of POS tagging in NLP tasks such as information retrieval, machine translation, and sentiment analysis. It then surveys the main approaches to POS tagging: rule-based methods, statistical techniques, and deep learning models. Rule-based methods rely on handcrafted linguistic rules to assign POS tags, while statistical approaches use probabilistic models trained on annotated corpora. Deep learning models, particularly Recurrent Neural Networks (RNNs) and transformer-based architectures such as BERT, have shown remarkable performance in POS tagging because they can capture complex linguistic patterns. The proposed solution uses Flask, a lightweight web framework for Python, to provide a user-friendly interface for POS tagging. Using NLTK's word tokenization and POS tagging functionality for Sanskrit text, developers can process text input from users and return accurate grammatical annotations. Error-handling mechanisms are also covered to ensure robustness and user-friendly error messages. The paper presents an innovative statistical POS tagging method tailored specifically to Sanskrit, an ancient and highly inflected language. By combining advanced machine learning techniques with linguistic insights specific to Sanskrit, the model aims to improve the precision and effectiveness of POS tagging for this intricate language. In NLP and computational linguistics, the gold standard typically refers to a corpus of text or a set of documents annotated or tagged with the desired results of the analysis.
Key elements of our investigation encompass the acquisition and preprocessing of annotated Sanskrit corpora, feature engineering customized to Sanskrit linguistic characteristics, model creation using state-of-the-art algorithms such as recurrent neural networks or transformer models, and assessment criteria to gauge the performance of the POS tagging system.
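As a rough illustration of what "statistical approaches trained on annotated corpora" and evaluation against a gold standard can look like, the following is a minimal sketch of a bigram hidden Markov model tagger with add-one smoothing and Viterbi decoding. It is not the method of the paper. The four-sentence corpus, the simplified tagset (NOUN, PRON, VERB), and the function names `train_hmm`, `viterbi`, and `accuracy` are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy gold-standard corpus (IAST transliteration, simplified
# tagset). A real system would use a large annotated Sanskrit corpus.
TRAIN = [
    [("rāmaḥ", "NOUN"), ("vanam", "NOUN"), ("gacchati", "VERB")],
    [("sītā", "NOUN"), ("phalam", "NOUN"), ("khādati", "VERB")],
    [("saḥ", "PRON"), ("pustakam", "NOUN"), ("paṭhati", "VERB")],
    [("saḥ", "PRON"), ("gacchati", "VERB")],
]


def train_hmm(sentences):
    """Count tag-to-tag transitions and tag-to-word emissions."""
    trans = defaultdict(Counter)  # trans[previous_tag][tag]
    emit = defaultdict(Counter)   # emit[tag][word]
    vocab = set()
    for sent in sentences:
        prev = "<s>"  # sentence-start marker
        for word, tag in sent:
            trans[prev][tag] += 1
            emit[tag][word] += 1
            vocab.add(word)
            prev = tag
    return trans, emit, sorted(emit), vocab


def log_prob(counter, key, n_outcomes):
    """Add-one smoothed log probability of `key` under `counter`."""
    return math.log((counter[key] + 1) / (sum(counter.values()) + n_outcomes))


def viterbi(words, model):
    """Return the most probable tag sequence for `words`."""
    if not words:
        return []
    trans, emit, tags, vocab = model
    n_tags = len(tags)
    n_words = len(vocab) + 1  # one extra slot for unseen words
    # best[i][tag] = (log score of best path ending in tag, previous tag)
    best = [{
        t: (log_prob(trans["<s>"], t, n_tags)
            + log_prob(emit[t], words[0], n_words), None)
        for t in tags
    }]
    for i in range(1, len(words)):
        column = {}
        for t in tags:
            score, prev = max(
                (best[i - 1][p][0] + log_prob(trans[p], t, n_tags), p)
                for p in tags
            )
            column[t] = (score + log_prob(emit[t], words[i], n_words), prev)
        best.append(column)
    # Follow back-pointers from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t][0])]
    for i in range(len(words) - 1, 0, -1):
        path.append(best[i][path[-1]][1])
    return path[::-1]


def accuracy(model, gold_sentences):
    """Token-level accuracy of the tagger against gold-standard tags."""
    correct = total = 0
    for sent in gold_sentences:
        predicted = viterbi([w for w, _ in sent], model)
        correct += sum(p == g for p, (_, g) in zip(predicted, sent))
        total += len(sent)
    return correct / total


if __name__ == "__main__":
    model = train_hmm(TRAIN)
    print(viterbi(["sītā", "pustakam", "paṭhati"], model))
    print(accuracy(model, TRAIN))
```

Token-level accuracy against held-out gold annotations is only one possible assessment criterion. Because Sanskrit is highly inflected, a realistic tagger would likely also need morphological features and sandhi-aware tokenization, which this sketch leaves out.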
"STATISTICAL POS TAGGING FOR SANSKRIT LANGUAGE", International Journal of Science & Engineering Development Research (www.ijsdr.org), ISSN: 2455-2631, Vol. 9, Issue 4, page no. 319-322, April-2024, Available: http://www.ijsdr.org/papers/IJSDR2404047.pdf
Publication Details:
Published Paper ID: IJSDR2404047
Registration ID:210727
Published In: Volume 9 Issue 4, April-2024
Page No: 319 - 322
Publisher: IJSDR | www.ijsdr.org
ISSN Number: 2455-2631