Divide and conquer: An extreme multi-label classification approach for coding diseases and procedures in Spanish
Professor Advisor
dc.contributor.advisor
Dunstan Escudero, Jocelyn
Professor Advisor
dc.contributor.advisor
Abeliuk Kimelman, Andrés
Author
dc.contributor.author
Barros Sanfuentes, José Miguel
Associate professor
dc.contributor.other
Bustos Cárdenas, Benjamín
Associate professor
dc.contributor.other
Parra Santander, Denis
Admission date
dc.date.accessioned
2023-05-30T15:35:34Z
Available date
dc.date.available
2023-05-30T15:35:34Z
Publication date
dc.date.issued
2023
Identifier
dc.identifier.uri
https://repositorio.uchile.cl/handle/2250/193936
Abstract
dc.description.abstract
Clinical coding is the task of transforming medical documents into structured codes following a standard ontology. Since these terminologies are composed of thousands of codes, this problem can be considered an Extreme Multi-label Classification task. This thesis proposes a novel neural network-based architecture for clinical coding.
First, we take full advantage of the hierarchical nature of ontologies to create clusters based on semantic relations. Then, we use a Matcher module to assign the probability of documents belonging to each cluster. Finally, the Ranker calculates the probability of each code considering only the documents within the cluster. This division allows a fine-grained differentiation within the cluster, which cannot be addressed using a single classifier.
In addition, since most previous work has focused on solving this task in English, we conducted our experiments on four clinical coding corpora in Spanish. The experimental results demonstrate the effectiveness of our model, which achieves state-of-the-art results on three of the four datasets. Specifically, we outperformed previous models on two subtasks of the CodiEsp shared task, CodiEsp-D and CodiEsp-P, and we also obtained state-of-the-art results on the FALP corpus.
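The Matcher and Ranker division described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the prefix-based clustering, the function names `build_clusters` and `rank_codes`, the `top_clusters` parameter, the rule of multiplying the two probabilities, and all codes and probability values are assumptions made for illustration only.

```python
from collections import defaultdict


def build_clusters(codes, prefix_len=3):
    """Group codes by shared prefix (an assumed stand-in for hierarchy-based clustering)."""
    clusters = defaultdict(list)
    for code in codes:
        clusters[code[:prefix_len]].append(code)
    return dict(clusters)


def rank_codes(matcher_probs, ranker_probs, clusters, top_clusters=2):
    """Score codes from the most likely clusters and return them sorted by score."""
    # Keep only the clusters that the (hypothetical) matcher rates highest.
    best = sorted(matcher_probs, key=matcher_probs.get, reverse=True)[:top_clusters]
    scores = {}
    for cluster in best:
        for code in clusters[cluster]:
            # Assumed combination rule: P(cluster | doc) * P(code | doc, cluster).
            scores[code] = matcher_probs[cluster] * ranker_probs[cluster][code]
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Illustrative codes and made-up model outputs for a single document.
codes = ["a01.0", "a01.1", "b20.0", "c34.1"]
clusters = build_clusters(codes)
matcher_probs = {"a01": 0.8, "b20": 0.15, "c34": 0.05}
ranker_probs = {
    "a01": {"a01.0": 0.9, "a01.1": 0.2},
    "b20": {"b20.0": 0.7},
    "c34": {"c34.1": 0.5},
}
ranking = rank_codes(matcher_probs, ranker_probs, clusters)
for code, score in ranking:
    print(code, round(score, 3))
```

In this sketch, restricting the second stage to the top clusters means each ranker only has to separate a handful of closely related codes, which is the fine-grained differentiation the abstract attributes to the division.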
Language
dc.language.iso
en
Publisher
dc.publisher
Universidad de Chile
Type of license
dc.rights
Attribution-NonCommercial-NoDerivs 3.0 United States