Deep learning hyper-parameter tuning for sentiment analysis in twitter based on evolutionary algorithms

Rodríguez-Barroso, Nuria; Moya, Antonio R.; Fernández, José A.; Romero, Elena; Martínez-Cámara, Eugenio; Herrera, Francisco

doi:10.15439/2019F183

Artykuł - szczegóły

Tytuł artykułu

Deep learning hyper-parameter tuning for sentiment analysis in twitter based on evolutionary algorithms

Autorzy

Rodríguez-Barroso Nuria , Moya Antonio R. , Fernández José A. , Romero Elena , Martínez-Cámara Eugenio , Herrera Francisco

Wybrane pełne teksty z tego czasopisma

http://annals-csis.org

Identyfikatory

DOI

10.15439/2019F183

Warianty tytułu

Konferencja

Federated Conference on Computer Science and Information Systems (14 ; 01-04.09.2019 ; Leipzig, Germany)

Języki publikacji

Abstrakty

The state of the art in Sentiment Analysis is defined by deep learning methods, and currently the research efforts are focused on improving the encoding of underlying contextual information in a sequence of text. However, those neural networks with a higher representation capacity are increasingly more complex, which means that they have more hyper-parameters that have to be defined by hand. We argue that the setting of hyper-parameters may be defined as an optimisation task, we thus claim that evolutionary algorithms may be used to the optimisation of the hyper-parameters of a deep learning method. We propose the use of the evolutionary algorithm SHADE for the optimisation of the configuration of a deep learning model for the task of sentiment analysis in Twitter. We evaluate our proposal in a corpus of Spanish tweets, and the results show that the hyper-parameters found by the evolutionary algorithm enhance the performance of the deep learning method.

Słowa kluczowe

Twitter neural network deep learning task analysis evolutionary computation proposal tuning

Twitter sieć neuronowa uczenie głębokie obliczenia ewolucyjne wniosek

Wydawca

Polskie Towarzystwo Informatyczne

Czasopismo

Annals of Computer Science and Information Systems

Rocznik

2019

Tom

Vol. 18

Strony

255–--264

Opis fizyczny

Bibliogr. 42 poz., wz., rys., tab.

Twórcy

autor

Rodríguez-Barroso Nuria

rbnuria@correo.ugr.es

Andalusian Research Institute in Data Science and Computational Intelligence, University of Granada, Granada (Spain)

autor

Moya Antonio R.

anmomar85@correo.ugr.es

Andalusian Research Institute in Data Science and Computational Intelligence, University of Granada, Granada (Spain)

autor

Fernández José A.

fernandezja@correo.ugr.es

Andalusian Research Institute in Data Science and Computational Intelligence, University of Granada, Granada (Spain)

autor

Romero Elena

elenaromeroc@correo.ugr.es

Andalusian Research Institute in Data Science and Computational Intelligence, University of Granada, Granada (Spain)

autor

Martínez-Cámara Eugenio

emcamara@decsai.ugr.es

Andalusian Research Institute in Data Science and Computational Intelligence, University of Granada, Granada (Spain)

autor

Herrera Francisco

herrera@decsai.ugr.es

Andalusian Research Institute in Data Science and Computational Intelligence, University of Granada, Granada (Spain)

Bibliografia

1. D. Zimbra, A. Abbasi, D. Zeng, and H. Chen, “The state-of-the-art in twitter sentiment analysis: A review and benchmark evaluation,” ACM Trans. Manage. Inf. Syst., vol. 9, no. 2, pp. 5:1–5:29, Aug. 2018. http://dx.doi.org/10.1145/3185045. [Online]. Available: http://doi.acm.org/10.1145/3185045
2. B. Pang and L. Lee, “Opinion mining and sentiment analysis,” Found. Trends Inf. Retr., vol. 2, no. 1-2, pp. 1–135, Jan. 2008. http://dx.doi.org/10.1561/1500000011
3. E. Martínez-Cámara, M. T. Martín-Valdivia, L. A. Ureña López, and A. Montejo-Ráez, “Sentiment analysis in Twitter,” Natural Language Engineering, vol. 20, no. 1, p. 1–28, 2014. http://dx.doi.org/10.1017/S1351324912000332
4. S. Mohammad, S. Kiritchenko, and X. Zhu, “NRC-canada: Building the state-of-the-art in sentiment analysis of tweets,” in Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013). Atlanta, Georgia, USA: Association for Computational Linguistics, Jun. 2013, pp. 321–327. [Online]. Available: https://www.aclweb.org/anthology/S13-2053
5. L. Hurtado, F. Pla, and D. Buscaldi, “ELiRF-UPV at TASS 2015: Sentiment analysis in twitter,” in Proceedings of TASS 2015: Workshop on Sentiment Analysis at SEPLN co-located with 31st SEPLN Conference (SEPLN 2015). Alicante, Spain: Spanish Society for Natural Language Processing, 2015, pp. 75–79.
6. M. Cliche, “BB_twtr at SemEval-2017 task 4: Twitter sentiment analysis with CNNs and LSTMs,” in Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017). Vancouver, Canada: Association for Computational Linguistics, Aug. 2017. http://dx.doi.org/10.18653/v1/S17-2094 pp. 573–580. [Online]. Available: https://www.aclweb.org/anthology/S17-2094
7. H. L. González, José-Ángel and F. Pla, “ELiRF-UPV at TASS 2018: Sentiment analysis in twitter based on deep learning,” in Proceedings of TASS 2018: Workshop on Semantic Analysis at SEPLN (TASS 2018) colocated with 34nd SEPLN Conference (SEPLN 2018). Sevilla, Spain: Spanish Society for Natural Language Processing, 2018, pp. 37–44.
8. A. Ambartsoumian and F. Popowich, “Self-attention: A better building block for sentiment analysis neural network classifiers,” in Proceedings of the 9th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis. Brussels, Belgium: Association for Computational Linguistics, Oct. 2018. http://dx.doi.org/10.18653/v1/P17 pp. 130–139. [Online]. Available: https://www.aclweb.org/anthology/W18-6219
9. N. Majumder, S. Poria, A. Gelbukh, M. S. Akhtar, E. Cambria, and A. Ekbal, “IARM: Inter-aspect relation modeling with memory networks in aspect-based sentiment analysis,” in Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Brussels, Belgium: Association for Computational Linguistics, Oct.-Nov. 2018, pp. 3402–3411. [Online]. Available: https://www.aclweb.org/anthology/D18-1377
10. B. Li, J. Li, K. Tang, and X. Yao, “Many-objective evolutionary algorithms: A survey,” ACM Comput. Surv., vol. 48, no. 1, pp. 13:1–13:35, Sep. 2015. http://dx.doi.org/10.1145/2792984. [Online]. Available: http://doi.acm.org/10.1145/2792984
11. R. Tanabe and A. Fukunaga, “Success-history based parameter adaptation for differential evolution,” in 2013 IEEE congress on evolutionary computation. IEEE, 2013. http://dx.doi.org/10.1109/CEC.2013.6557555 pp. 71–78.
12. M. C. Díaz-Galiano, M. A. García-Cumbreras, M. García-Vega, Y. Gutiérrez, E. M. Cámara, A. Piad-Morffis, and J. Villena-Román, “TASS 2018: The strength of deep learning in language understanding tasks,” Procesamiento del Lenguaje Natural, vol. 62, pp. 77–84, 2019. http://dx.doi.org/10.26342/2019-62-9
13. S. Tabik, D. Peralta, A. Herrera-Poyatos, and F. Herrera, “A snapshot of image pre-processing for convolutional neural networks: case study of MNIST,” International Journal of Computational Intelligence Systems, vol. 10, no. 1, pp. 555–568, 2017.
14. A. Java, X. Song, T. Finin, and B. Tseng, “Why we twitter: Understanding microblogging usage and communities,” in Proceedings of the 9th WebKDD and 1st SNA-KDD 2007 Workshop on Web Mining and Social Network Analysis, ser. WebKDD/SNA-KDD ’07. New York, NY, USA: ACM, 2007. http://dx.doi.org/10.1145/1348549.1348556. ISBN 978-1-59593-848-0 pp. 56–65. [Online]. Available: http://doi.acm.org/10.1145/1348549.1348556
15. B. J. Jansen, M. Zhang, K. Sobel, and A. Chowdury, “Micro-blogging as online word of mouth branding,” in CHI ’09 Extended Abstracts on Human Factors in Computing Systems, ser. CHI EA ’09. New York, NY, USA: ACM, 2009. http://dx.doi.org/10.1145/1520340.1520584. ISBN 978-1-60558-247-4 pp. 3859–3864. [Online]. Available: http://doi.acm.org/10.1145/1520340.1520584
16. B. Pang, L. Lee, and S. Vaithyanathan, “Thumbs up? sentiment classification using machine learning techniques,” in Proceedings of the 2002 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Jul. 2002. http://dx.doi.org/10.3115/1118693.1118704 pp. 79–86. [Online]. Available: https://www.aclweb.org/anthology/W02-1011
17. P. Turney, “Thumbs up or thumbs down? semantic orientation applied to unsupervised classification of reviews,” in Proceedings of 40th Annual Meeting of the Association for Computational Linguistics. Philadelphia, Pennsylvania, USA: Association for Computational Linguistics, Jul. 2002. http://dx.doi.org/10.3115/1073083.1073153 pp. 417–424. [Online]. Available: https://www.aclweb.org/anthology/P02-1053
18. A. Go, R. Bhayani, and L. Huang, “Twitter sentiment classification using distant supervision,” Stanford University, Stanford, CA, USA, Tech. Rep. CS224N Project Report, 2009.
19. L. Jiang, M. Yu, M. Zhou, X. Liu, and T. Zhao, “Target-dependent twitter sentiment classification,” in Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. Portland, Oregon, USA: Association for Computational Linguistics, Jun. 2011, pp. 151–160. [Online]. Available: https://www.aclweb.org/anthology/P11-1016
20. L. Zhang, R. Ghosh, M. Dekhil, M. Hsu, and B. Liu, “Combining lexicon-based and learning-based methods for twitter sentiment analysis,” HP Laboratories, USA, Tech. Rep. HPL-2011-89, 2011.
21. E. Martínez Cámara, M. A. García Cumbreras, M. T. Martín Valdivia, and L. A. Ureña López, “SINAI-EMMA: Vectors of words for sentiment analysis in twitter,” in Proceedings of TASS 2015: Workshop on Sentiment Analysis at SEPLN co-located with 31st SEPLN Conference (SEPLN 2015). Alicante, Spain: Spanish Society for Natural Language Processing, 2015, pp. 41–46.
22. A. Tumasjan, T. Sprenger, P. Sandner, and I. Welpe, “Predicting elections with twitter: What 140 characters reveal about political sentiment,” 2010.
23. A. Jungherr, P. Jürgens, and H. Schoen, “Why the pirate party won the german election of 2009 or the trouble with predictions: A response to tumasjan, a., sprenger, t. o., sander, p. g., & welpe, i. m. “predicting elections with twitter: What 140 characters reveal about political sentiment”,” Social Science Computer Review, vol. 30, no. 2, pp. 229–234, 2012. http://dx.doi.org/10.1177/0894439311404119
24. J. Bollen, H. Mao, and X. Zeng, “Twitter mood predicts the stock market,” Journal of Computational Science, vol. 2, no. 1, pp. 1 – 8, 2011. http://dx.doi.org/10.1016/j.jocs.2010.12.007
25. J. Wehrmann, W. E. Becker, and R. C. Barros, “A multi-task neural network for multilingual sentiment classification and language detection on twitter,” in Proceedings of the 33rd Annual ACM Symposium on Applied Computing, ser. SAC ’18. New York, NY, USA: ACM, 2018. http://dx.doi.org/10.1145/3167132.3167325. ISBN 978-1-4503-5191-1 pp. 1805–1812.
26. F. M. Luque and J. M. Pérez, “Atalaya at TASS 2018: Sentiment analysis with tweet embeddings and data augmentation,” in Proceedings of TASS 2018: Workshop on Sentiment Analysis at SEPLN co-located with 34th SEPLN Conference (SEPLN 2018). Sevilla, Spain: Spanish Society for Natural Language Processing, 2018, pp. 29–35.
27. J. Kapočiūtė-Dzikienė, R. Damaševičius, and M. Woźniak, “Sentiment analysis of lithuanian texts using traditional and deep learning approaches,” Computers, vol. 8, no. 1, 2019. http://dx.doi.org/10.3390/computers8010004. [Online]. Available: https://www.mdpi.com/2073-431X/8/1/4
28. N. Chen and P. Wang, “Advanced combined lstm-cnn model for twitter sentiment analysis,” in 2018 5th IEEE International Conference on Cloud Computing and Intelligence Systems (CCIS), Nov 2018. http://dx.doi.org/10.1109/CCIS.2018.8691381 pp. 684–687.
29. Y. Kim, “Convolutional neural networks for sentence classification,” in Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). Doha, Qatar: Association for Computational Linguistics, Oct. 2014. http://dx.doi.org/10.3115/v1/D14-1181 pp. 1746–1751. [Online]. Available: https://www.aclweb.org/anthology/D14-1181
30. J. Bergstra and Y. Bengio, “Random search for hyper-parameter optimization,” Journal of Machine Learning Research, vol. 13, pp. 281–305, Feb. 2012.
31. J. Snoek, H. Larochelle, and R. P. Adams, “Practical bayesian optimization of machine learning algorithms,” in Advances in Neural Information Processing Systems 25, F. Pereira, C. J. C. Burges, L. Bottou, and K. Q. Weinberger, Eds. Curran Associates, Inc., 2012, pp. 2951–2959.
32. A. Cano, A. Zafra, and S. Ventura, “Speeding up the evaluation phase of gp classification algorithms on gpus,” Soft Computing, vol. 16, no. 2, pp. 187–202, Feb 2012. http://dx.doi.org/10.1007/s00500-011-0713-4
33. I. Loshchilov and F. Hutter, “Cma-es for hyperparameter optimization of deep neural networks,” in Proceedings of ICLR 2016 - Workshop Track), 2014, pp. 1746–1751.
34. N. Hansen and A. Ostermeier, “Completely derandomized self-adaptation in evolution strategies,” Evol. Comput., vol. 9, no. 2, pp. 159–195, Jun. 2001. http://dx.doi.org/10.1162/106365601750190398
35. T. Tanaka, T. Moriya, T. Shinozaki, S. Watanabe, T. Hori, and K. Duh, “Automated structure discovery and parameter tuning of neural network language model based on evolution strategy,” in 2016 IEEE Spoken Language Technology Workshop (SLT), Dec 2016. http://dx.doi.org/10.1109/SLT.2016.7846334 pp. 665–671.
36. Y. Nalçakan and T. Ensari, “Decision of neural networks hyperparameters with a population-based algorithm,” in Machine Learning, Optimization, and Data Science, G. Nicosia, P. Pardalos, G. Giuffrida, R. Umeton, and V. Sciacca, Eds. Cham: Springer International Publishing, 2019. ISBN 978-3-030-13709-0 pp. 276–281.
37. Z. Lin, M. Feng, C. N. dos Santos, M. Yu, B. Xiang, B. Zhou, and Y. Bengio, “A structured self-attentive sentence embedding,” in International Conference on Learning Representations (ICLR 2017), 2017. [Online]. Available: https://openreview.net/forum?id=BJC_jUqxe
38. R. Tanabe and A. S. Fukunaga, “Improving the search performance of shade using linear population size reduction,” in 2014 IEEE congress on evolutionary computation (CEC). IEEE, 2014. http://dx.doi.org/10.1109/CEC.2014.6900380 pp. 1658–1665.
39. A. Fernández, S. García, M. Galar, R. C. Prati, B. Krawczyk, and F. Herrera, Learning from imbalanced data sets. Springer, 2018.
40. P. Bojanowski, E. Grave, A. Joulin, and T. Mikolov, “Enriching word vectors with subword information,” Transactions of the Association for Computational Linguistics, vol. 5, pp. 135–146, 2017. http://dx.doi.org/10.1162/tacl_a_00051. [Online]. Available: https://doi.org/10.1162/tacl_a_00051
41. Q. McNemar, “Note on the sampling error of the difference between correlated proportions or percentages,” Psychometrika, vol. 12, no. 2, pp. 153–157, Jun 1947. http://dx.doi.org/10.1007/BF02295996. [Online]. Available: https://doi.org/10.1007/BF02295996
42. L. Chiruzzo and A. Rosá, “RETUYT-InCo at TASS 2018: Sentiment analysis in spanish variants using neural networks and svm,” in Proceedings of TASS 2018: Workshop on Sentiment Analysis at SEPLN colocated with 34th SEPLN Conference (SEPLN 2018). Sevilla, Spain: Spanish Society for Natural Language Processing, 2018, pp. 57–63.

Uwagi

1. This work was supported by proyect TIN2017-89517-P, by the Spanish “Ministerio de Economía y Competitividad”, the project DeepSCOP-Ayudas Fundación BBVA a Equipos de Investigación Científica en Big Data 2018 and a grant from the Fondo Europeo de Desarrollo Regional (FEDER). Eugenio Martínez Cámara was supported by the Spanish Government Programme Juan de la Cierva Formación (FJCI-2016-28353).

2. Track 2: Computer Science & Systems

3. Technical Session: 4th International Workshop on Language Technologies and Applications

4. Opracowanie rekordu ze środków MNiSW, umowa Nr 461252 w ramach programu "Społeczna odpowiedzialność nauki" - moduł: Popularyzacja nauki i promocja sportu (2020).

Typ dokumentu

Bibliografia

Identyfikator YADDA

bwmeta1.element.baztech-30e9a2e4-6c89-4506-97f8-86cf9c683908