Document-level adverse drug reaction event extraction on electronic health records in Spanish

  1. Sara Santiso
  2. Arantza Casillas
  3. Alicia Pérez
  4. Maite Oronoz
  5. Koldo Gojenola
Journal: Procesamiento del lenguaje natural

ISSN: 1135-5948

Year of publication: 2016

Issue: 56

Pages: 49-56

Type: Article

Abstract

We outline an Adverse Drug Reaction (ADR) extraction system for Electronic Health Records (EHRs) written in Spanish. The goal of the system is to help pharmacy experts decide whether a patient suffers from one or more ADRs. The core of the system is a predictive model inferred from a manually tagged corpus that relies on both semantic and syntactic features. This model extracts ADRs from the disease-drug pairs in a given EHR. Finally, the automatically extracted ADRs are post-processed with a heuristic that presents the information compactly. This stage reports the drugs and diseases in the document together with their frequencies, and links the pairs related as ADRs. In brief, the system not only presents the ADRs found in the text but also provides concise information on request to pharmacy experts (the potential users of the system).
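
The abstract does not spell out the post-processing heuristic, so the following is only a minimal Python sketch of what a document-level summary of this kind might look like. It assumes the classifier has already labelled each disease-drug pair. The names `PairPrediction` and `summarize_document`, the case-insensitive counting, and the example mentions are all hypothetical and not taken from the paper.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class PairPrediction:
    """One classified disease-drug pair (hypothetical structure)."""
    drug: str
    disease: str
    is_adr: bool


def summarize_document(drug_mentions, disease_mentions, predictions):
    """Build a compact view: entity frequencies plus the pairs linked as ADRs."""
    # Assumption: mentions differing only in case refer to the same entity.
    drug_freq = Counter(m.lower() for m in drug_mentions)
    disease_freq = Counter(m.lower() for m in disease_mentions)

    # Keep each ADR-labelled pair once, in order of first appearance.
    adr_links = []
    seen = set()
    for p in predictions:
        key = (p.drug.lower(), p.disease.lower())
        if p.is_adr and key not in seen:
            seen.add(key)
            adr_links.append(key)

    return {
        "drugs": dict(drug_freq),
        "diseases": dict(disease_freq),
        "adr_links": adr_links,
    }


# Illustrative input only; these mentions are invented for the example.
summary = summarize_document(
    drug_mentions=["ibuprofeno", "Ibuprofeno", "amoxicilina"],
    disease_mentions=["urticaria", "fiebre"],
    predictions=[
        PairPrediction("amoxicilina", "urticaria", True),
        PairPrediction("ibuprofeno", "fiebre", False),
        PairPrediction("amoxicilina", "urticaria", True),
    ],
)
```

Collapsing repeated ADR pairs into a single link is one possible design choice for keeping the output compact; a real system might instead keep per-mention evidence so that an expert can trace each link back to the text.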
