Words, Tweets, and Reviews: Leveraging Affective Knowledge Between Multiple Domains
Author
dc.contributor.author
Bravo Márquez, Felipe
Author
dc.contributor.author
Tamblay Veas, Cristián Felipe
Admission date
dc.date.accessioned
2022-01-27T14:11:59Z
Available date
dc.date.available
2022-01-27T14:11:59Z
Publication date
dc.date.issued
2021
Item citation
dc.identifier.citation
Noname manuscript No. (will be inserted by the editor)
es_ES
Identifier
dc.identifier.uri
https://repositorio.uchile.cl/handle/2250/183869
Abstract
dc.description.abstract
Three popular application domains of sentiment and emotion analysis are: 1) the automatic rating of movie reviews, 2) extracting opinions and
emotions on Twitter, and 3) inferring sentiment and emotion associations of words.
The textual elements of these domains differ in length (movie reviews are
usually longer than tweets, and words are shorter than tweets), but they also
share the property that they can be plausibly annotated according to the same affective
categories (e.g., positive, negative, anger, joy). Moreover, state-of-the-art models for
these domains are all based on the approach of training supervised machine learning
models on manually annotated examples. This approach suffers from an important
bottleneck: manually annotated examples are expensive and time-consuming to obtain and not always available.
Methods: In this paper, we propose a method for transferring affective knowledge between words, tweets, and movie reviews using two representation techniques:
Word2Vec static embeddings and BERT contextualized embeddings. We build compatible representations for movie reviews, tweets, and words using these techniques,
and we train and evaluate supervised models on all combinations of source and target
domains.
Results and Conclusions: Our experimental results show that affective knowledge
can be successfully transferred between our three domains, that contextualized embeddings tend to outperform their static counterparts, and that better transfer learning
results are obtained when the source domain has longer textual units than the target
domain.
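The source-to-target evaluation described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the two-dimensional vectors in `EMB` are hypothetical toy values standing in for Word2Vec or BERT embeddings, token averaging is only one possible way to obtain same-sized representations for words, tweets, and reviews (the abstract does not say how the compatible representations are built), and the nearest-centroid classifier is a stand-in for the supervised models, which the abstract does not specify. The helper names (`embed`, `fit_centroids`, `predict`, `accuracy`) and the example texts are likewise hypothetical.

```python
from itertools import product

# Toy 2-d "embeddings": hypothetical values, NOT real Word2Vec or BERT vectors.
EMB = {
    "great": (0.9, 0.1), "love": (0.8, 0.2), "happy": (0.7, 0.1),
    "awful": (-0.9, 0.2), "hate": (-0.8, 0.1), "boring": (-0.7, 0.3),
    "movie": (0.0, 0.5), "the": (0.0, 0.1),
}

def embed(text):
    """Average the vectors of known tokens (assumed pooling strategy).

    A single word is treated as a one-token text, so words, tweets, and
    reviews all end up in the same vector space.
    """
    vecs = [EMB[tok] for tok in text.lower().split() if tok in EMB]
    if not vecs:
        return (0.0, 0.0)
    return tuple(sum(dim) / len(vecs) for dim in zip(*vecs))

def fit_centroids(examples):
    """Stand-in supervised model: one mean vector per label."""
    by_label = {}
    for text, label in examples:
        by_label.setdefault(label, []).append(embed(text))
    return {
        label: tuple(sum(dim) / len(vecs) for dim in zip(*vecs))
        for label, vecs in by_label.items()
    }

def predict(centroids, text):
    """Return the label whose centroid is closest to the text vector."""
    vec = embed(text)
    return min(
        centroids,
        key=lambda label: sum((a - b) ** 2 for a, b in zip(vec, centroids[label])),
    )

def accuracy(centroids, examples):
    """Fraction of examples whose predicted label matches the gold label."""
    return sum(predict(centroids, t) == y for t, y in examples) / len(examples)

# Tiny hypothetical datasets, one per domain.
DOMAINS = {
    "words": [("great", "positive"), ("awful", "negative"),
              ("happy", "positive"), ("boring", "negative")],
    "tweets": [("love the movie", "positive"), ("hate the movie", "negative")],
    "reviews": [("the movie great love the happy movie", "positive"),
                ("the movie awful boring hate the movie", "negative")],
}

# Train on every source domain and evaluate on every target domain.
results = {}
for source, target in product(DOMAINS, repeat=2):
    model = fit_centroids(DOMAINS[source])
    results[(source, target)] = accuracy(model, DOMAINS[target])

for (source, target), acc in sorted(results.items()):
    print(f"{source} -> {target}: {acc:.2f}")
```

The nested loop over `product(DOMAINS, repeat=2)` mirrors the "all combinations of source and target domains" setup; with real embeddings and datasets, the same grid would be filled once per representation technique.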
es_ES
Language
dc.language.iso
es
es_ES
Type of license
dc.rights
Attribution-NonCommercial-NoDerivs 3.0 United States