Using the bag of words algorithm in natural
language processing
Botir Elov¹
Nizomaddin Xudayberganov²
Zilola Xusainova³
Abstract:
A bag-of-words (BoW) model is a numerical representation of text that
machine learning algorithms can process. With the BoW algorithm, a text
is converted into a numerical matrix by counting how often each word
occurs in each document; word order is discarded. BoW is used in NLP
applications such as document comparison, information retrieval in
search engines, document classification, and topic modeling. This
article presents methods for converting Uzbek texts into numerical form
using the BoW algorithm.
Keywords: BoW, Bag of words, set of words, word vector, token, BoW
algorithm, TF-IDF method.
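
The conversion described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the whitespace tokenizer, the function names tokenize and bag_of_words, and the two sample Uzbek sentences are assumptions chosen only for illustration. Each row of the resulting matrix corresponds to a document and each column to a vocabulary word.

```python
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer, assumed here)."""
    return text.lower().split()


def bag_of_words(documents):
    """Return (vocabulary, matrix); matrix[i][j] is the count of word j in document i."""
    tokenized = [tokenize(doc) for doc in documents]
    # The vocabulary is the sorted set of all distinct tokens in the corpus.
    vocabulary = sorted({token for doc in tokenized for token in doc})
    matrix = []
    for doc in tokenized:
        counts = Counter(doc)
        # Counter returns 0 for words that do not occur in this document.
        matrix.append([counts[word] for word in vocabulary])
    return vocabulary, matrix


# Hypothetical sample sentences, not taken from the article.
docs = ["kitob yaxshi do'st", "yaxshi kitob yaxshi maslahatchi"]
vocabulary, matrix = bag_of_words(docs)

print(vocabulary)
for row in matrix:
    print(row)
```

A real pipeline for Uzbek would likely need a tokenizer that handles punctuation and the apostrophe variants used in oʻ and gʻ, and possibly stemming, since Uzbek is agglutinative; the sketch leaves those steps out.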
¹ Elov Botir Boltayevich, Doctor of Philosophy in Technical Sciences (PhD),
associate professor, Head of the Department of Computer Linguistics and Digital
Technologies of Tashkent State University of Uzbek Language and Literature
named after Alisher Navoi.
E-mail: elov@navoiy-uni.uz
ORCID: 0000-0001-5032-6648
² Xudayberganov Nizomaddin Uktamboy oʻgʻli, teacher at the Department of
Computer Linguistics and Digital Technologies of Tashkent State University of
Uzbek Language and Literature named after Alisher Navoi.