Article title

Analysis of data pre-processing methods for sentiment analysis of reviews

Publication languages
EN
Abstracts
EN
The goals of this study are to analyze the effects of data pre-processing methods on sentiment analysis and to determine which of these methods (and their combinations) are effective for English as well as for an agglutinative language such as Turkish. We also investigate whether agglutinative and non-agglutinative languages differ in terms of pre-processing methods for sentiment analysis. We find that the performance results for the English reviews are generally higher than those for the Turkish reviews, owing to differences between the two languages in vocabulary, writing style, and the agglutinative property of Turkish.
Pages
123–141
Physical description
Bibliography: 28 items, tables.
Authors
author
  • Mustafa Kemal University, Hatay, Turkiye
author
  • Çukurova University, Adana, Turkiye
author
  • University of Guelph, Ontario, Canada
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-c8123943-cf0e-46d6-acd2-8b42784e4235