Fine-grained entity linking
Artículo
Open/ Download
Access note
Acceso Abierto
Publication date
2020
Abstract
The Entity Linking (EL) task involves linking mentions of entities in a text with their identifier in a Knowledge Base (KB) such as Wikipedia, BabelNet, DBpedia, Freebase, Wikidata, YAGO, etc. Numerous techniques have been proposed to address this task down through the years. However, not all works adopt the same convention regarding the entities that the EL task should target; for example, while some EL works target common entities like "interview"appearing in the KB, others only target named entities like "Michael Jackson". The lack of consensus on this issue (and others) complicates research on the EL task; for example, how can the performance of EL systems be evaluated and compared when systems may target different types of entities? In this work, we first design a questionnaire to understand what kinds of mentions and links the EL research community believes should be targeted by the task. Based on these results we propose a fine-grained categorization scheme for EL that distinguishes different types of mentions and links. We propose a vocabulary extension that allows to express such categories in EL benchmark datasets. We then relabel (subsets of) three popular EL datasets according to our novel categorization scheme, where we additionally discuss a tool used to semi-automate the labeling process. We next present the performance results of five EL systems for individual categories. We further extend EL systems with Word Sense Disambiguation and Coreference Resolution components, creating initial versions of what we call Fine-Grained Entity Linking (FEL) systems, measuring the impact on performance per category. Finally, we propose a configurable performance measure based on fuzzy sets that can be adapted for different application scenarios Our results highlight a lack of consensus on the goals of the EL task, show that the evaluated systems do indeed target different entities, and further reveal some open challenges for the (F)EL task regarding more complex forms of reference for entities.
Patrocinador
Comision Nacional de Investigacion Cientifica y Tecnologica (CONICYT)
PCHA/Doctorado Nacional/2016-21160017
Millennium Institute for Foundational Research on Data (IMFD), Chile
Comision Nacional de Investigacion Cientifica y Tecnologica (CONICYT)
CONICYT FONDECYT
1181896
1191604
Indexation
Artículo de publicación ISI Artículo de publicación SCOPUS
Quote Item
Web Semantics: Science, Services and Agents on the World Wide Web 65 (2020) 100600
Collections
The following license files are associated with this item: