Overview of TESTLINK at IberLEF 2023: Linking Laboratory Tests and Clinical Measurements to Their Results

  1. Zanoli, Roberto
  2. Karunakaran, Goutham
  3. Altuna Díaz, Begoña
  4. Agerri Gascón, Rodrigo
  5. Salas Espejo, Lidia
  6. Saiz, José Javier
  7. Lavelli, Alberto
  8. Magnini, Bernardo
  9. Speranza, Manuela
Journal:
Procesamiento del lenguaje natural

ISSN: 1135-5948

Year of publication: 2023

Issue: 71

Pages: 313-320

Type: Article

Abstract

The TESTLINK task at IberLEF 2023 focuses on relation extraction from clinical cases in Spanish and Basque. The task consists of identifying clinical results and measures and linking them to the tests and measurements from which they were obtained. Three teams participated, and several (supervised) deep learning models were evaluated. Interestingly, none of the teams explored few-shot learning. The evaluation shows that in-domain fine-tuning and larger training datasets improve results. Indeed, the supervised models significantly outperformed the baseline based on few-shot learning, which shows the crucial role that the availability of annotated training data still plays.
