Learning to leverage microblog information for QA retrieval
Author
dc.contributor.author
Herrera, Jose
Author
dc.contributor.author
Poblete Labra, Bárbara
Author
dc.contributor.author
Parra, Denis
Admission date
dc.date.accessioned
2019-05-31T15:19:04Z
Available date
dc.date.available
2019-05-31T15:19:04Z
Publication date
dc.date.issued
2018
Item citation
dc.identifier.citation
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Volume 10772 LNCS, 2018
Identifier
dc.identifier.issn
16113349
Identifier
dc.identifier.issn
03029743
Identifier
dc.identifier.other
10.1007/978-3-319-76941-7_38
Identifier
dc.identifier.uri
https://repositorio.uchile.cl/handle/2250/169308
Abstract
dc.description.abstract
Community Question Answering (cQA) sites have emerged as platforms designed specifically for the exchange of questions and answers among
users. Although users tend to find good-quality answers on cQA sites, they also engage in a significant volume of QA interactions on other platforms, such as microblog networking sites. This is partly explained by the fact that microblog platforms contain up-to-date information on current events, provide rapid information propagation, and benefit from social trust.
Despite the potential of microblog platforms, such as Twitter, for automatic QA retrieval, it is not clear how to leverage them for this task. Unique characteristics differentiate Twitter from traditional cQA platforms (e.g., short message length, low-quality and noisy information), which prevent the direct application of prior findings in the area. In this work, we address this problem by
studying: 1) the feasibility of Twitter as a QA platform and 2) the discriminating
features that identify relevant answers to a particular query. In particular, we create a document model at the conversation-thread level, which enables us to aggregate
microblog information, and set up a learning-to-rank framework, using factoid
QA as a proxy task. Our experimental results show that microblog data can indeed be used to perform QA retrieval effectively. We identify domain-specific features, and combinations of those features, that better account for improving QA ranking, achieving an MRR of 0.7795 (a 62% improvement over our baseline method). In addition, we provide evidence that our method can retrieve complex answers to non-factoid questions.
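
As a brief illustration of the evaluation metric cited in the abstract, the following minimal Python sketch shows how Mean Reciprocal Rank (MRR) is typically computed over ranked candidate answer threads. It is not taken from the paper; the function and variable names are illustrative assumptions.

def mean_reciprocal_rank(rankings, relevant):
    # rankings: dict mapping a question id to its ranked list of candidate thread ids
    # relevant: dict mapping a question id to the set of thread ids judged relevant
    total = 0.0
    for qid, ranked_threads in rankings.items():
        for rank, thread_id in enumerate(ranked_threads, start=1):
            if thread_id in relevant.get(qid, set()):
                total += 1.0 / rank  # reciprocal rank of the first relevant thread
                break
    return total / len(rankings) if rankings else 0.0

# Example: the first relevant thread for "q1" is ranked second, so MRR = 0.5
print(mean_reciprocal_rank({"q1": ["t3", "t7", "t1"]}, {"q1": {"t7"}}))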