An emotion can be defined as a natural, instinctive state of mind arising from one’s circumstances, mood, and relationships with others. How humans feel, and what it is that they feel, has long been an open question in psychology. Enabling computers to recognize human emotions has been of interest to researchers since the 1990s (Picard et al., 1995). Since then, this area of research has grown significantly, and emotion detection is becoming an important component of many natural language processing tasks.
Several theories of emotion exist, and researchers choose among them according to their needs. For instance, according to appraisal theory, a theory from psychology, emotions are produced by our evaluations (appraisals or estimates) of events, and these evaluations cause specific reactions in different people. Some emotions are simple and universal, while others are complex and nuanced. Emotion classification is generally the process of labeling a piece of text with one or more corresponding emotion labels.
Psychologists have developed numerous models and taxonomies of emotions. The appropriate model or taxonomy depends on the problem, and thorough study is often required to select the best one. Early studies of emotion classification focused on building computational models to classify basic emotion categories. In recent years, the growing volume of social media and the digitization of data have opened a new horizon in this area of study, where emotion classification is a key component of applications such as mood and behavioral studies and disaster relief, among many others. Sophisticated models have been built to detect and classify emotion in text, but few studies analyze how well a model is able to learn emotion cues. The ability to learn emotion cues properly and to generalize this learning is very important. This work investigates the robustness of emotion classification approaches across genres and languages, with a focus on quantifying how well state-of-the-art models are able to learn emotion cues.
First, we use multi-task learning and hierarchical models to build emotion models trained on data combined from multiple genres. Our hypothesis is that a multi-genre, noisy training environment will help the classifier learn emotion cues that are prevalent across genres. Second, we explore splitting text (i.e., a sentence) into its clauses and test whether the model’s performance improves. Emotion analysis needs fine-grained annotation, and clause-level annotation can help in designing features that improve emotion detection performance. Intuitively, clause-level annotations may help the model focus on emotion cues while ignoring irrelevant portions of the text. Third, we adopt a transfer learning approach for cross-lingual and cross-genre emotion classification to focus the classifier’s attention on emotion cues that are consistent across languages. Fourth, we empirically show how to combine different genres to build robust models that can be used as source models for emotion transfer to low-resource target languages. Finally, this study involved curating and re-annotating popular emotion data sets in different genres, annotating a multi-genre corpus of Persian tweets and news, and generating a collection of emotional sentences for a low-resource language, Azerbaijani, which is spoken in the northwest of Iran.
|Committee:||Resnik, Philip, Youssef, Abdu, Pless, Robert, Caliskan, Aylin, McKeown, Kathleen|
|School:||The George Washington University|
|School Location:||United States -- District of Columbia|
|Source:||DAI-A 82/6(E), Dissertation Abstracts International|
|Subjects:||Artificial intelligence, Genetics, Linguistics, Design, Language arts, Middle Eastern Studies, Educational psychology|
|Keywords:||Cross-genre, Emotion classification, Low-resource languages, Text annotation, Natural language processing skills, Persia, Iran, Azerbaijani|
Copyright in each Dissertation and Thesis is retained by the author. All Rights Reserved