VoxEL: A benchmark dataset for multilingual entity linking

Rosales-Méndez, Henry; Hogan, Aidan; Poblete Labra, Bárbara

Author	dc.contributor.author	Rosales-Méndez, Henry
Author	dc.contributor.author	Hogan, Aidan
Author	dc.contributor.author	Poblete Labra, Bárbara
Accession date	dc.date.accessioned	2019-05-31T15:21:01Z
Available date	dc.date.available	2019-05-31T15:21:01Z
Publication date	dc.date.issued	2018
Item citation	dc.identifier.citation	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Volume 11137 LNCS, 2018, Pages 170-186.
Identifier	dc.identifier.issn	16113349
Identifier	dc.identifier.issn	03029743
Identifier	dc.identifier.other	10.1007/978-3-030-00668-6_11
Identifier	dc.identifier.uri	https://repositorio.uchile.cl/handle/2250/169479
Abstract	dc.description.abstract	The Entity Linking (EL) task identifies entity mentions in a text corpus and associates them with corresponding entities in a given knowledge base. While traditional EL approaches have largely focused on English texts, current trends are towards language-agnostic or otherwise multilingual approaches that can perform EL over texts in many languages. One of the obstacles to ongoing research on multilingual EL is a scarcity of annotated datasets with the same text in different languages. In this work we thus propose VoxEL: a manually annotated gold standard for multilingual EL featuring the same text expressed in five European languages. We first motivate and describe the VoxEL dataset, then use it to compare the behaviour of state-of-the-art multilingual EL systems for five different languages, contrasting these results with those obtained using machine translation to English. Overall, our results identify how five state-of-the-art multilingual EL systems compare for various languages and how the results for different languages compare, and further suggest that machine translation of input text to English is now a competitive alternative to dedicated multilingual EL configurations.
Language	dc.language.iso	en
Publisher	dc.publisher	Springer Verlag
Type of license	dc.rights	Attribution-NonCommercial-NoDerivs 3.0 Chile
Link to License	dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/3.0/cl/
Source	dc.source	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Keywords	dc.subject	Entity linking
Keywords	dc.subject	Information extraction
Keywords	dc.subject	Multilingual
Title	dc.title	VoxEL: A benchmark dataset for multilingual entity linking
Document type	dc.type	Journal article
Cataloguer	uchile.catalogador	jmm
Indexation	uchile.index	SCOPUS publication article
Harvest	uchile.cosecha	Yes

Files in this item

Name: Voxel_multilingual_entity_link ...
Size: 374.4Kb
Format: PDF

This item appears in the following Collection(s)

Journal articles

Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 Chile