We present a study on the automatic classification of speech acts in the domain of political communication, based on J. R. Searle's classification of illocutionary acts. We create a dataset using the US State of the Union corpus and the UN General Debate corpus (UNGD) as data sources. To compensate for the scarcity of labeled data, we combine weak supervision and active learning for dataset creation and model training. In a series of experiments, we investigate the influence of external and internal factors on speech act classification. In addition, we discuss the potential for further analysis of speech act usage by applying the trained model to the UNGD corpus. The findings demonstrate the effectiveness of Transformer-based models for automatic speech act classification, highlight the benefits of weak supervision and active learning for dataset creation and model training, and underscore the potential for large-scale statistical analysis of speech act usage in political communication.