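Building text resources for the automatic identification of clinical information in unstructured narratives (original title: Construcción de recursos de texto para la identificación automática de información clínica en narrativas no estructuradas)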
Author
dc.contributor.author
Báez, Pablo
Author
dc.contributor.author
Villena, Fabián
Author
dc.contributor.author
Zúñiga, Karen
Author
dc.contributor.author
Jones, Natalia
Author
dc.contributor.author
Fernández, Gustavo
Author
dc.contributor.author
Durán, Manuel
Author
dc.contributor.author
Dunstan Escudero, Jocelyn Mariel
Admission date
dc.date.accessioned
2022-05-03T16:35:55Z
Available date
dc.date.available
2022-05-03T16:35:55Z
Publication date
dc.date.issued
2021
Item citation
dc.identifier.citation
Rev Med Chile 2021; 149: 1014-1022
Identifier
dc.identifier.issn
0034-9887
Identifier
dc.identifier.uri
https://repositorio.uchile.cl/handle/2250/185227
Abstract
dc.description.abstract
A significant proportion of the clinical record is in free text
format, making it difficult to extract key information and make secondary use
of patient data. Automatic detection of information within narratives initially
requires humans, following specific protocols and rules, to identify medical
entities of interest. Aim: To build a linguistic resource of annotated medical
entities on texts produced in Chilean hospitals. Material and Methods: A
clinical corpus was constructed using 150 referrals in public hospitals. Three
annotators identified six medical entities: clinical findings, diagnoses, body
parts, medications, abbreviations, and family members. An annotation scheme
was designed, and an iterative approach to train the annotators was applied.
The F1-Score metric was used to assess the progress of the annotators' agreement
during their training. Results: An average F1-Score of 0.73 was observed at
the beginning of the project. After the training period, it increased to 0.87.
Annotation of clinical findings and body parts showed significant discrepancy,
while abbreviations, medications, and family members showed high agreement.
Conclusions: A linguistic resource with annotated medical entities on texts
produced in Chilean hospitals was built and made available, working with
annotators related to medicine. The iterative annotation approach allowed
us to improve performance metrics. The corpus and annotation protocols will
be released to the research community.
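
The agreement measure mentioned in the abstract can be illustrated with a minimal sketch. It assumes exact-match comparison of entity spans, with each annotation represented as a (start, end, label) tuple, and averages F1 over all annotator pairs. The abstract does not state the matching criterion or the averaging scheme, so both are assumptions, and the function names, annotator names, and sample annotations are hypothetical.

```python
from itertools import combinations


def pairwise_f1(reference, candidate):
    """F1 between two annotators' entity sets, using exact span+label match.

    Each entity is a (start, end, label) tuple. Exact matching is an
    assumption of this sketch, not a detail taken from the abstract.
    """
    reference, candidate = set(reference), set(candidate)
    if not reference and not candidate:
        return 1.0  # both annotators marked nothing: treat as full agreement
    matched = len(reference & candidate)
    if matched == 0:
        return 0.0
    precision = matched / len(candidate)
    recall = matched / len(reference)
    return 2 * precision * recall / (precision + recall)


def mean_pairwise_f1(annotations):
    """Average F1 over every unordered pair of annotators."""
    pairs = list(combinations(annotations.values(), 2))
    return sum(pairwise_f1(a, b) for a, b in pairs) / len(pairs)


# Hypothetical annotations of one referral by three annotators.
annotations = {
    "annotator_a": {(0, 5, "Finding"), (10, 18, "Body_Part"), (25, 30, "Medication")},
    "annotator_b": {(0, 5, "Finding"), (10, 18, "Body_Part")},
    "annotator_c": {(0, 5, "Finding"), (10, 20, "Body_Part"), (25, 30, "Medication")},
}

if __name__ == "__main__":
    print(round(mean_pairwise_f1(annotations), 3))
```

F1 is symmetric in its two arguments, so it does not matter which annotator of a pair is treated as the reference. In this toy data the disagreement comes from a missing medication and a differing body-part boundary.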
Language
dc.language.iso
es
Publisher
dc.publisher
Soc Medica Santiago
Type of license
dc.rights
Attribution-NonCommercial-NoDerivs 3.0 United States