Nueva técnica de fusión de clasificadores aplicada a la mejora de la segmentación de audio [New classifier fusion technique applied to improving audio segmentation]

  1. Tavarez Arriba, David
  2. Navas Cordón, Eva
  3. Erro Eslava, Daniel
  4. Saratxaga Couceiro, Ibon
  5. Hernáez Rioja, Inmaculada
Journal:
Procesamiento del lenguaje natural

ISSN: 1135-5948

Year of publication: 2013

Issue: 51

Pages: 161-168

Type: Article

Abstract

This paper presents a new classifier fusion algorithm based on the classifiers' confusion matrices, from which the corresponding precision and recall values are extracted. The only data needed to apply the method are the classes (labels) assigned by each classifier and the reference classes for the development part of the database. The algorithm is described and applied to the fusion of two audio segmentation systems that took part in the Albayzin 2012 evaluation campaign. Its robustness has been assessed, and a relative improvement of 6.28% was achieved when combining the results of the best and worst systems submitted to the evaluation.
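
As an illustration of the idea summarized in the abstract, the following minimal Python sketch derives per-class precision and recall from each classifier's confusion counts on development data and then fuses two label sequences. The abstract does not state the paper's actual decision rule, so the rule in `fuse` (on disagreement, keep the label whose development-set precision is higher) is an assumption made for illustration only, and all function names are hypothetical.

```python
from collections import Counter


def precision_recall(reference, predicted):
    """Per-class precision and recall from (reference, predicted) label pairs."""
    # Confusion counts: key is (reference_label, predicted_label).
    counts = Counter(zip(reference, predicted))
    labels = set(reference) | set(predicted)
    precision, recall = {}, {}
    for label in labels:
        true_pos = counts[(label, label)]
        n_predicted = sum(n for (ref, hyp), n in counts.items() if hyp == label)
        n_reference = sum(n for (ref, hyp), n in counts.items() if ref == label)
        precision[label] = true_pos / n_predicted if n_predicted else 0.0
        recall[label] = true_pos / n_reference if n_reference else 0.0
    return precision, recall


def fuse(labels_a, labels_b, precision_a, precision_b):
    """Illustrative rule (an assumption, not the paper's): when the two
    classifiers disagree, keep the label with the higher development-set
    precision; ties go to classifier A."""
    fused = []
    for a, b in zip(labels_a, labels_b):
        if a == b or precision_a.get(a, 0.0) >= precision_b.get(b, 0.0):
            fused.append(a)
        else:
            fused.append(b)
    return fused
```

In this sketch the precision tables would be estimated once on the development part of the database and then applied unchanged to the labels each system produces on new audio; only the assigned labels and the reference labels are required, as the abstract notes.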

Bibliographic References

  • Aguilo, M., T. Butko, A. Temko, and C. Nadeu. 2009. A hierarchical architecture for audio segmentation in a broadcast news task. In Proc. I Iberian SLTech, pages 17-20, Porto Salvo, Portugal.
  • Asman, A. J. and B. A. Landman. 2011. Robust statistical label fusion through consensus level, labeler accuracy, and truth estimation (COLLATE). IEEE Transactions on Medical Imaging, 30(10):1779-1794.
  • Butko, T. and C. Nadeu. 2011. Audio segmentation of broadcast news in the Albayzin-2010 evaluation: overview, results, and discussion. EURASIP Journal on Audio, Speech, and Music Processing, 2011(1):1-10.
  • Jain, A. K. and A. Ross. 2004. Multibiometric systems. Communications of the ACM, 47(1):34-40, January.
  • Kittler, J. and M. Hatef. 1998. On combining classifiers. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(3):226-239.
  • Lu, L., H. J. Zhang, and H. Jiang. 2002. Content analysis for audio classification and segmentation. IEEE Transactions on Speech and Audio Processing, 10(7):504-516.
  • Meinedo, H. and J. Neto. 2003. Audio segmentation, classification and clustering in a broadcast news task. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), volume 2, pages 5-8, Hong-Kong, China.
  • Moattar, M. H. and M. M. Homayounpour. 2012. A review on speaker diarization systems and approaches. Speech Communication, 54(10):1065-1103, June.
  • Ore, B. M., R. E. Slyh, and E. G. Hansen. 2006. Speaker Segmentation and Clustering using Gender Information. In Proceedings IEEE Odyssey'06 Conference, pages 1-8.
  • Reynolds, D., W. Andrews, J. Campbell, J. Navratil, B. Peskin, A. Adomi, D. Kluracek, J. Abramson, R. Mihaescu, J. Godfrey, D. Jones, and S. Xiang. 2003. The SuperSID Project: Exploiting High-level Information. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), volume 4, pages 784-787, Hong-Kong, China.
  • Reynolds, D. A. and P. Torres-Carrasquillo. 2005. Approaches and applications of audio diarization. In IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 953-956, Philadelphia, USA.
  • Ross, A. and A. K. Jain. 2003. Information fusion in biometrics. Pattern Recognition Letters, 24(13):2115-2125, September.
  • Ruta, D. and B. Gabrys. 2000. An overview of classifier fusion methods. Computing and Information Systems, 7:1-10.
  • Rybach, D. and C. Gollan. 2009. Audio segmentation for speech recognition using segment features. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 4197-4200, Taipei, Taiwan.
  • Schuller, B. 2012. The Computational Paralinguistics Challenge. Signal Processing Magazine, IEEE, (July):97-101.
  • Tavarez, D., E. Navas, D. Erro, and I. Saratxaga. 2012. Audio Segmentation System by Aholab for Albayzin 2012 Evaluation Campaign. In Iberspeech, pages 577-584, Madrid, Spain.
  • Xu, L., A. Krzyzak, and C. Y. Suen. 1992. Methods of combining multiple classifiers and their applications to handwriting recognition. IEEE Transactions on Systems, Man, and Cybernetics, 22(3):418-435.