Speaker Verification on Unbalanced Data with Genetic Programming

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9597)

Abstract

Automatic Speaker Verification (ASV) is a highly unbalanced binary classification problem, in which any given speaker must be verified against everyone else. We apply Genetic Programming (GP) to this problem with the aim of both prediction and inference. We examine the generalisation of evolved programs using a variety of fitness functions and data sampling techniques found in the literature. A significant difference between training and test performance, which can indicate overfitting, is found in the evolutionary runs for all to-be-verified speakers. Nevertheless, for every speaker, the best test performance attained is superior to that of merely predicting the majority class. We also examine which features are used by individuals that generalise well. The findings can inform future applications of GP or other machine learning techniques to ASV about the suitability of feature-extraction techniques.

Notes

  1. http://www.nist.gov/itl/iad/mig/ivec.cfm.

  2. \(I(\cdot)\) is the indicator function.

  3. The number of examples correctly classified as a fraction of the total number of training examples.
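
Notes 2 and 3 together describe classification accuracy as an average of indicator values. The following is a minimal sketch of that definition, along with the accuracy obtained by always predicting the majority class. The function names and the 5:95 label split are hypothetical and chosen only for illustration; they are not taken from the paper's data or code.

```python
from collections import Counter
from typing import Sequence


def indicator(condition: bool) -> int:
    """I(.) as in note 2: 1 if the condition holds, otherwise 0."""
    return 1 if condition else 0


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of examples classified correctly, as in note 3."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label sequences must be non-empty and of equal length")
    return sum(indicator(t == p) for t, p in zip(y_true, y_pred)) / len(y_true)


def majority_baseline(y_true: Sequence[int]) -> float:
    """Accuracy of a classifier that always predicts the most frequent class."""
    majority_label, _ = Counter(y_true).most_common(1)[0]
    return accuracy(y_true, [majority_label] * len(y_true))


# Hypothetical labels: 1 = target speaker (5 examples), 0 = everyone else (95).
labels = [1] * 5 + [0] * 95
baseline = majority_baseline(labels)
print(f"majority-class accuracy: {baseline:.2f}")  # prints 0.95
```

With labels this skewed, a classifier that never accepts the target speaker still scores 0.95, so an evolved program is only useful if it exceeds that figure.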

Acknowledgments

This work was carried out as a collaboration between projects funded by Science Foundation Ireland under Grant Numbers 08/SRC/FM1389 and 13/IA/1850.

Author information

Corresponding author

Correspondence to Róisín Loughran.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Loughran, R., Agapitos, A., Kattan, A., Brabazon, A., O’Neill, M. (2016). Speaker Verification on Unbalanced Data with Genetic Programming. In: Squillero, G., Burelli, P. (eds) Applications of Evolutionary Computation. EvoApplications 2016. Lecture Notes in Computer Science, vol 9597. Springer, Cham. https://doi.org/10.1007/978-3-319-31204-0_47

  • DOI: https://doi.org/10.1007/978-3-319-31204-0_47

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-31203-3

  • Online ISBN: 978-3-319-31204-0

  • eBook Packages: Computer Science, Computer Science (R0)
