As automatic text analysis has become an established methodological field in the humanities and social sciences, one of the most sought-after techniques is the automatic extraction of attitudes, emotions, judgments and opinions. Under the banner of sentiment analysis or opinion mining, these techniques have been widely used in scientific research as well as in professional applications. Because sentiment can be defined and operationalized in multiple ways, and its expression can differ greatly across domains, there is no single, universal sentiment analysis tool. Rather, dictionaries and models need to be tuned for specific use cases.
In this project we investigate active learning, a semi-supervised approach, as a potentially fast and powerful way to train customized, task-specific sentiment analysis models. The essence of active learning is that a human annotator interactively trains a machine learning model. An algorithm presents the annotator with the texts that are most relevant for improving the model, which greatly reduces the number of texts that require coding and thus enables researchers to supervise the training process themselves.
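The selection step can be illustrated with uncertainty sampling, one common query strategy in active learning. The following is a minimal sketch under stated assumptions, not the method used by the project or by any particular annotation tool: the function names, the word weights and the example texts are all hypothetical, and the toy scorer merely stands in for a trained model.

```python
import math


def uncertainty(prob_positive):
    """Binary entropy of a predicted probability; it peaks at 0.5."""
    # Clamp to avoid log(0) for fully confident predictions.
    p = min(max(prob_positive, 1e-12), 1 - 1e-12)
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def select_for_annotation(texts, predict_proba, batch_size=2):
    """Return the texts the current model is least certain about."""
    ranked = sorted(texts, key=lambda t: uncertainty(predict_proba(t)), reverse=True)
    return ranked[:batch_size]


# Hypothetical stand-in for a trained sentiment model: a tiny word-weight scorer.
WEIGHTS = {"good": 2.0, "safe": 1.5, "bad": -2.0, "attack": -1.5}


def toy_predict_proba(text):
    """Map the summed word weights to a probability of positive sentiment."""
    score = sum(WEIGHTS.get(word, 0.0) for word in text.lower().split())
    return 1 / (1 + math.exp(-score))


pool = [
    "the vaccine is safe and good",
    "the attack was bad",
    "officials commented on the report",
    "a good plan after a bad start",
]

# Texts with no sentiment words or with conflicting ones are queued first.
queue = select_for_annotation(pool, toy_predict_proba)
print(queue)
```

In a full loop, the annotator would label the queued texts, the model would be updated with those labels, and the selection would be repeated on the remaining pool.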
Previous studies show promising results, but focus mostly on document-level sentiment scores, often for short social media messages. In this project we investigate the application to journalistic texts, incorporating the holder and target of sentiment. We evaluate whether active learning enables us to train new models (RQ1) and retrain existing models (RQ2) for better performance on specific sentiment attribution tasks, using the Prodigy annotation tool. Using two corpora (on terrorism and vaccinations, respectively), we develop two separate models for performing the same task in different domains. Additionally, two gold standard sets will be annotated independently of the active learning annotation process to detect possible bias caused by this particular approach. Based on these analyses, we discuss the potential applications of active learning for sentiment analysis.
- Kasper Welbers
- Isa Maks
- Lisa Vasileva
- Eduardo Guerriero