INTEGRATING MORPHOLOGICAL ANALYSIS INTO MACHINE LEARNING MODELS FOR LANGUAGE PROCESSING
Keywords: Computational linguistics, morphological analysis, machine learning, lemmatization, natural language processing, low-resource languages
Abstract: This thesis explores the integration of morphological analysis into machine learning models for natural language processing (NLP). It focuses on how explicit morphological features, such as roots, affixes, and grammatical tags, can improve tasks like lemmatization. The study targets morphologically rich languages and supports its findings with both theoretical frameworks and experimental evaluation.