Adverse drug reaction extraction on electronic health records written in Spanish

SANTISO GONZALEZ, SARA

Supervised by:

Alicia Pérez Ramírez (Supervisor)
Arantza Casillas Rubio (Supervisor)

University of defense: Universidad del País Vasco - Euskal Herriko Unibertsitatea

Defense date: 13 June 2019

Examination committee:

Raquel Martínez Unanue (Chair)
Arantza Díaz de Ilarraza Sánchez (Secretary)
Lluís Padró Cirera (Member)

Department:

Lenguajes y Sistemas Informáticos

Type: Thesis

Teseo: 149860

Abstract

This work focuses on the automatic extraction of Adverse Drug Reactions (ADRs) from Electronic Health Records (EHRs), that is, on extracting responses to a medicine that are noxious and unintended and that occur at doses normally used. From a Natural Language Processing (NLP) perspective, this was approached as a relation extraction task in which the drug is the causative agent of a disease, sign or symptom, that is, of the adverse reaction.

ADR extraction from EHRs involves major challenges. First, ADRs are rare events: relations between drugs and diseases found in an EHR are seldom ADRs (the two are often unrelated or, instead, related as treatment). This implies inference from samples with a skewed class distribution. Second, EHRs are written by experts, often under time pressure, who combine rich medical jargon with colloquial expressions (not always grammatical); misspellings and both standard and non-standard abbreviations are not infrequent. All this leads to high lexical variability.

We explored several ADR detection algorithms and representations to characterize the ADR candidates. In addition, we assessed the tolerance of the ADR detection model to external noise, such as the incorrect detection of the medical entities involved in ADR extraction, i.e. drugs and diseases. We took the first steps on ADR extraction in Spanish using a corpus of real EHRs.
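
The relation extraction framing and the skewed class distribution described in the abstract can be illustrated with a minimal sketch. It is not the method used in the thesis: the entity lists, the gold set and the function names `build_candidates` and `balanced_class_weights` are all hypothetical, invented for illustration. It assumes that every drug-disease pair found in a record becomes a candidate, and it uses inverse-frequency class weighting as one common way to compensate for rare positive examples.

```python
from collections import Counter
from itertools import product

# Hypothetical toy annotations, invented for illustration
# (not taken from the thesis corpus).
drugs = ["amoxicillin", "ibuprofen"]
diseases = ["rash", "pneumonia", "headache"]

# Hypothetical gold set: the only drug-disease pair labelled as an ADR.
gold_adrs = {("amoxicillin", "rash")}


def build_candidates(drugs, diseases, gold_adrs):
    """Pair every drug with every disease and label each pair."""
    return [
        ((drug, disease), "ADR" if (drug, disease) in gold_adrs else "non-ADR")
        for drug, disease in product(drugs, diseases)
    ]


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    return {
        label: n_samples / (len(counts) * count)
        for label, count in counts.items()
    }


candidates = build_candidates(drugs, diseases, gold_adrs)
labels = [label for _, label in candidates]
weights = balanced_class_weights(labels)

for (drug, disease), label in candidates:
    print(f"{drug} -> {disease}: {label}")
print(weights)
```

With these toy inputs, one of the six candidate pairs is an ADR, so the rare class receives a larger weight than the frequent one. Whether weighting, resampling or another strategy suits a real EHR corpus is an empirical question that this sketch does not settle.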