An integrated model for textual social media data with spatio-temporal dimensions
Author
dc.contributor.author
Díaz Zamora, Juglar
Author
dc.contributor.author
Poblete Labra, Bárbara
Author
dc.contributor.author
Bravo Márquez, Felipe
Admission date
dc.date.accessioned
2020-07-30T23:16:33Z
Available date
dc.date.available
2020-07-30T23:16:33Z
Publication date
dc.date.issued
2020
Item citation
dc.identifier.citation
Information Processing & Management 57(5):102219 (2020)
Identifier
dc.identifier.other
10.1016/j.ipm.2020.102219
Identifier
dc.identifier.uri
https://repositorio.uchile.cl/handle/2250/176213
Abstract
dc.description.abstract
GPS-enabled devices and social media popularity have created an unprecedented opportunity for researchers to collect, explore, and analyze text data with fine-grained spatial and temporal metadata. Text, time, and space are different domains, each with its own representation scales and methods. This poses the challenge of detecting relevant patterns that may only arise from the combination of text with spatio-temporal elements. In particular, spatio-temporal textual data representation has relied on feature embedding techniques. This can limit a model's expressiveness for representing certain patterns extracted from the sequence structure of textual data. To deal with these problems, we propose an Acceptor recurrent neural network model that jointly models spatio-temporal textual data. Our goal is to represent the mutual influence and relationships that can exist between written language and the time and place where it was produced. We represent space, time, and text as tuples, and use pairs of elements to predict the third one. This results in three predictive tasks that are trained simultaneously. We conduct experiments on two social media datasets and on a crime dataset; we use Mean Reciprocal Rank as the evaluation metric. Our experiments show that our model outperforms state-of-the-art methods, with improvements ranging from 5.5% to 24.7% for location and time prediction.
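The abstract names Mean Reciprocal Rank (MRR) as the evaluation metric. As a reading aid, the following is a minimal sketch of how MRR is commonly computed for a ranking task such as location prediction. It is not taken from the paper: the function names and the candidate labels (`cell_3` and so on) are hypothetical, and assigning a score of 0 when the true item is missing from the ranking is an assumed convention.

```python
from typing import Hashable, Sequence


def reciprocal_rank(ranked: Sequence[Hashable], truth: Hashable) -> float:
    """Return 1/rank of `truth` in `ranked` (ranks start at 1).

    Assumption: a query whose true item is absent from the ranking scores 0.0.
    """
    for position, candidate in enumerate(ranked, start=1):
        if candidate == truth:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(
    rankings: Sequence[Sequence[Hashable]], truths: Sequence[Hashable]
) -> float:
    """Average the reciprocal rank over all queries."""
    if len(rankings) != len(truths):
        raise ValueError("rankings and truths must have the same length")
    if not rankings:
        return 0.0
    total = sum(reciprocal_rank(r, t) for r, t in zip(rankings, truths))
    return total / len(rankings)


# Hypothetical example: three queries, each with candidate locations
# ranked from most to least likely by some model.
rankings = [
    ["cell_3", "cell_1", "cell_7"],  # true location ranked 1st -> 1.0
    ["cell_2", "cell_5", "cell_9"],  # true location ranked 2nd -> 0.5
    ["cell_4", "cell_8", "cell_6"],  # true location not ranked -> 0.0
]
truths = ["cell_3", "cell_5", "cell_0"]

mrr = mean_reciprocal_rank(rankings, truths)
print(f"MRR = {mrr:.3f}")
```

In a setting like the one described, each query would correspond to one record with one element held out (for example, the location), and the ranking would be the model's ordering of candidate values for that element.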
Sponsor
dc.description.sponsorship
Millennium Institute for Foundational Research on Data (IMFD)
Comisión Nacional de Investigación Científica y Tecnológica (CONICYT)
CONICYT FONDECYT
1191604
CONICYT-PCHA/Doctorado Nacional
2016-21160142