Albanian Text Classification: Bag of Words Model and Word Analogies

Open Access | May 2019

References

  1. Antonellis, I., Bouras, C., Poulopoulos, V. (2006), “Personalized news categorization through scalable text classification”, in Zhou, X., Li, J., Shen, H. T., Kitsuregawa, M., Zhang, Y. (Eds.), Frontiers of WWW Research and Development – APWeb 2006, Springer, Berlin, Heidelberg, pp. 391-401. DOI: https://doi.org/10.1007/11610113_35
  2. Bojanowski, P., Grave, E., Joulin, A., Mikolov, T. (2017), “Enriching word vectors with subword information”, Transactions of the Association for Computational Linguistics, Vol. 5, pp. 135-146. DOI: https://doi.org/10.1162/tacl_a_00051
  3. Chaudhari, S. V., Lade, S. (2013), “Classification of News and Research Articles Using Text Pattern Mining”, IOSR Journal of Computer Engineering (IOSR-JCE), Vol. 14, No. 5, pp. 120-126. DOI: https://doi.org/10.9790/0661-145120126
  4. Cortes, C., Vapnik, V. (1995), “Support-vector networks”, Machine Learning, Vol. 20, No. 3, pp. 273-297. DOI: https://doi.org/10.1007/BF00994018
  5. Crammer, K., Dekel, O., Keshet, J., Shalev-Shwartz, S., Singer, Y. (2006), “Online passive-aggressive algorithms”, Journal of Machine Learning Research, Vol. 7, pp. 551-585.
  6. Gui, Y., Gao, Z., Li, R., Yang, X. (2012), “Hierarchical text classification for news articles based-on named entities”, in Zhou, S., Zhang, S., Karypis, G. (Eds.), Advanced Data Mining and Applications, Springer, Berlin, Heidelberg, pp. 318-329. DOI: https://doi.org/10.1007/978-3-642-35527-1_27
  7. Hartmann, N., Fonseca, E., Shulby, C., Treviso, M., Rodrigues, J., Aluisio, S. (2017), “Portuguese Word Embeddings: Evaluating on Word Analogies and Natural Language Tasks”, in Proceedings of Symposium in Information and Human Language Technology, Uberlandia, MG, Brazil, pp. 122-131.
  8. Joulin, A., Grave, E., Bojanowski, P., Mikolov, T. (2016), “Bag of tricks for efficient text classification”, in Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Vol. 2, Short Papers, pp. 427-431. DOI: https://doi.org/10.18653/v1/E17-2068
  9. Jurka, T. P., Collingwood, L., Boydstun, A. E., Grossman, E., van Atteveldt, W. (2013), “RTextTools: A supervised learning package for text classification”, The R Journal, Vol. 5, No. 1, pp. 6-12. DOI: https://doi.org/10.32614/RJ-2013-001
  10. Liparas, D., HaCohen-Kerner, Y., Moumtzidou, A., Vrochidis, S., Kompatsiaris, I. (2014), “News Articles Classification Using Random Forests and Weighted Multimodal Features”, in Lamas, D., Buitelaar, P. (Eds.), Multidisciplinary Information Retrieval, Springer, Cham, pp. 63-75. DOI: https://doi.org/10.1007/978-3-319-12979-2_6
  11. Manning, C. D., Raghavan, P., Schutze, H. (2008), Introduction to Information Retrieval, New York, Cambridge University Press. DOI: https://doi.org/10.1017/CBO9780511809071
  12. Mikolov, T., Chen, K., Corrado, G., Dean, J. (2013), “Efficient estimation of word representations in vector space”, in Proceedings of the International Conference on Learning Representations (ICLR 2013), available at: https://arxiv.org/pdf/1301.3781.pdf (September 2013).
  13. Natural Language Processing Group (2014), Web corpora of Bosnian, Croatian and Serbian top-level domain published, available at: http://nlp.ffzg.hr/web-corpora-of-bosniancroatian-and-serbian-top-level-domain-published/ (7 September 2014).
  14. Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., Duchesnay, E. (2011), “Scikit-learn: Machine Learning in Python”, Journal of Machine Learning Research, Vol. 12, pp. 2825-2830.
  15. Raschka, S. (2015), Python machine learning, Birmingham, Packt Publishing Ltd.
  16. Rubin, T. N., Chambers, A., Smyth, P., Steyvers, M. (2012), “Statistical topic models for multilabel document classification”, Machine Learning, Vol. 88, No. 1-2, pp. 157-208. DOI: https://doi.org/10.1007/s10994-011-5272-5
  17. Scannell, K. P. (2007), “The Crúbadán Project: Corpus building for under-resourced languages”, in Fairon, C., Naets, H., Kilgarriff, A., de Schryver, G. M. (Eds.), Building and Exploring Web Corpora, Proceedings of the 3rd Web as Corpus Workshop, Vol. 4, pp. 5-15.
  18. Swezey, R. M., Sano, H., Shiramatsu, S., Ozono, T., Shintani, T. (2012), “Automatic detection of news articles of interest to regional communities”, International Journal of Computer Science and Network Security, Vol. 12, No. 6, pp. 99-106.
  19. Tyers, F. M., Alperen, M. S. (2010), “South-east European times: A parallel corpus of Balkan languages”, in Proceedings of the LREC Workshop on Exploitation of Multilingual Resources and Tools for Central and (South-) Eastern European Languages, pp. 49-53.
  20. Zhou, D., Resnick, P., Mei, Q. (2011), “Classifying the Political Leaning of News Articles and Users from User Votes”, in 5th International AAAI Conference on Web and Social Media, North America, pp. 417-424. DOI: https://doi.org/10.1609/icwsm.v5i1.14108
DOI: https://doi.org/10.2478/bsrj-2019-0006 | Journal eISSN: 1847-9375 | Journal ISSN: 1847-8344
Language: English
Page range: 74-87
Submitted on: Dec 1, 2017
Accepted on: Feb 22, 2018
Published on: May 9, 2019
Published by: IRENET - Society for Advancing Innovation and Research in Economy
In partnership with: Paradigm Publishing Services
Publication frequency: 2 times per year

© 2019 Arbana Kadriu, Lejla Abazi, Hyrije Abazi, published by IRENET - Society for Advancing Innovation and Research in Economy
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License.