Recent advances in deep-learning-based language models have boosted performance on many downstream tasks, such as sentiment analysis, text summarization, and question answering. Personality prediction from text is a relatively new task that has attracted researchers' attention because of growing interest in personalized services and the availability of social media data. In this study, we propose a personality prediction system in which text embeddings from large language models such as BERT are combined with multiple statistical features extracted from the input text. To combine them, we use the self-attention mechanism, a popular choice when several information sources need to be merged. Our experiments with the Kaggle MBTI dataset clearly show that adding statistical text features improves system performance relative to using BERT embeddings alone. We also analyze the influence of personality type words on the overall results.
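As a rough illustration of the fusion idea described above, the following minimal sketch treats a text embedding and a vector of text statistics as two "tokens" and merges them with scaled dot-product self-attention. It is not the system evaluated in this study: the abstract does not list the statistical features or the attention configuration, so the four statistics, the function names (`text_statistics`, `make_projection`, `fuse`), the fixed random projections standing in for learned linear layers, and the random placeholder used instead of a real BERT embedding are all assumptions made for the example.

```python
import math
import random


def text_statistics(text):
    """Return a few illustrative surface statistics (hypothetical feature set)."""
    words = text.split()
    n_chars = max(len(text), 1)
    n_words = max(len(words), 1)
    return [
        float(len(words)),                           # word count
        sum(len(w) for w in words) / n_words,        # mean word length
        sum(ch in "!?" for ch in text) / n_chars,    # share of '!' and '?'
        sum(ch.isupper() for ch in text) / n_chars,  # share of uppercase letters
    ]


def make_projection(in_dim, out_dim, seed):
    """Fixed random matrix standing in for a learned linear layer (assumption)."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(in_dim)
    return [[rng.gauss(0.0, scale) for _ in range(in_dim)] for _ in range(out_dim)]


def project(matrix, vector):
    """Matrix-vector product: maps a source vector to the shared dimension."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(sources):
    """Scaled dot-product self-attention with Q = K = V = sources (no learned weights)."""
    dim = len(sources[0])
    weights, outputs = [], []
    for query in sources:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in sources
        ]
        row = softmax(scores)
        weights.append(row)
        outputs.append(
            [sum(a * v[i] for a, v in zip(row, sources)) for i in range(dim)]
        )
    return outputs, weights


def fuse(text_embedding, stats, dim=8):
    """Project both sources to a shared size, attend, and mean-pool into one vector."""
    emb_token = project(make_projection(len(text_embedding), dim, seed=0), text_embedding)
    stat_token = project(make_projection(len(stats), dim, seed=1), stats)
    attended, weights = self_attention([emb_token, stat_token])
    fused = [sum(column) / len(attended) for column in zip(*attended)]
    return fused, weights


post = "I love quiet evenings! Do you?"
stats = text_statistics(post)

# Placeholder for a BERT sentence embedding; a real one would come from the model.
rng = random.Random(42)
fake_embedding = [rng.uniform(-1.0, 1.0) for _ in range(16)]

fused, weights = fuse(fake_embedding, stats)
print("statistics:", stats)
print("attention weights:", weights)
print("fused vector length:", len(fused))
```

In a trained system the projections and attention weights would be learned jointly with the classifier, and the fused vector would feed a prediction layer for the personality labels; here it is only pooled so that the shape of the data flow is visible.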