Analysis and automatic identification of spontaneous emotions in speech from human-human and human-machine communication

  1. DE VELASCO VAZQUEZ, MIKEL
Dirigée par:
  1. María Inés Torres Barañano Directeur/trice
  2. Raquel Justo Blanco Directeur/trice

Université de défendre: Universidad del País Vasco - Euskal Herriko Unibertsitatea

Fecha de defensa: 13 septembre 2023

Département:
  1. Electricidad y Electrónica

Type: Thèses

Teseo: 823411 DIALNET lock_openADDI editor

Résumé

This research mainly focuses on improving our understanding of human-human and human-machineinteractions by analysing paricipants¿ emotional status. For this purpose, we have developed andenhanced Speech Emotion Recognition (SER) systems for both interactions in real-life scenarios,explicitly emphasising the Spanish language. In this framework, we have conducted an in-depth analysisof how humans express emotions using speech when communicating with other persons or machines inactual situations. Thus, we have analysed and studied the way in which emotional information isexpressed in a variety of true-to-life environments, which is a crucial aspect for the development of SERsystems. This study aimed to comprehensively understand the challenge we wanted to address:identifying emotional information on speech using machine learning technologies. Neural networks havebeen demonstrated to be adequate tools for identifying events in speech and language. Most of themaimed to make local comparisons between some specific aspects; thus, the experimental conditions weretailored to each particular analysis. The experiments across different articles (from P1 to P19) are hardlycomparable due to our continuous learning of dealing with the difficult task of identifying emotions inspeech. In order to make a fair comparison, additional unpublished results are presented in the Appendix.These experiments were carried out under identical and rigorous conditions. This general comparisonoffers an overview of the advantages and disadvantages of the different methodologies for the automaticrecognition of emotions in speech.