CLASSIFICATION OF TEXTS BY EMOTIONAL TONE USING TRANSFORMER MODELS
Keywords:
text classification, BERT, Uzbek language, emotions, deep learning, Transformers, text analysis.
Abstract
The article addresses the task of classifying texts by emotional tone using a pretrained BERT model (specifically, the Tahrirchi model for the Uzbek language). A training and evaluation pipeline was implemented, with logging of memory usage during the runs. The results show high emotion-classification accuracy in a multi-class setting.
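The abstract summarizes the pipeline only at a high level. Below is a minimal sketch of such a setup using the Hugging Face Transformers Trainer API; it is not the paper's exact implementation. The model identifier tahrirchi/tahrirchi-bert-base, the CSV file names, the six-class label set, and the MemoryLogger callback are illustrative assumptions.

import numpy as np
import psutil
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainerCallback,
    TrainingArguments,
)

MODEL_NAME = "tahrirchi/tahrirchi-bert-base"  # assumed Hub identifier for the Tahrirchi model
NUM_LABELS = 6  # illustrative emotion-label count; set to the size of the actual label set

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=NUM_LABELS)

# Hypothetical CSV files with "text" and "label" columns.
dataset = load_dataset("csv", data_files={"train": "train.csv", "test": "test.csv"})
dataset = dataset.map(lambda b: tokenizer(b["text"], truncation=True, max_length=128), batched=True)

class MemoryLogger(TrainerCallback):
    """Print the resident memory (RSS) of the training process at each logging step."""
    def on_log(self, args, state, control, logs=None, **kwargs):
        rss_mib = psutil.Process().memory_info().rss / 2**20
        print(f"step {state.global_step}: RSS = {rss_mib:.1f} MiB")

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    return {"accuracy": float((np.argmax(logits, axis=-1) == labels).mean())}

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="uzbek-emotion-bert",
        num_train_epochs=3,
        per_device_train_batch_size=16,
        logging_steps=50,
    ),
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
    tokenizer=tokenizer,  # enables dynamic padding via the default data collator
    compute_metrics=compute_metrics,
    callbacks=[MemoryLogger()],
)
trainer.train()
print(trainer.evaluate())  # accuracy on the held-out split

Logging RSS inside a TrainerCallback keeps the memory accounting decoupled from the training loop itself; GPU memory could be tracked analogously with torch.cuda.max_memory_allocated().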