Real-Time, Non-intrusive Speech Quality Estimation: A Signal-Based Model

Raja, Adil; Flanagan, Colin

doi:10.1007/978-3-540-78671-9_4

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4971)

Included in the following conference series:

European Conference on Genetic Programming

Abstract

Estimating speech quality as perceived by humans is vital to the proper functioning of telecommunications networks, because speech quality can be degraded by various network-related problems. In this paper we present a model for speech quality estimation that is a function of various time- and frequency-domain features of human speech. We employ a hybrid optimization approach: Genetic Programming (GP) finds a suitable structure for the model, while a traditional genetic algorithm (GA) and a numerical method known as linear scaling optimize its coefficients. In terms of prediction accuracy, the proposed model outperforms ITU-T Recommendation P.563, the current non-intrusive speech quality estimation model. The proposed model also has significantly reduced dimensionality, which may reduce its computational requirements.
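
The linear scaling step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the usual least-squares form of linear scaling from scaled symbolic regression: given the raw outputs y of an evolved expression and the target scores t, choose an intercept a and a slope b that minimise the mean squared error of a + b*y. The function names `linear_scaling` and `scaled_mse` are hypothetical, and the numbers in the usage lines are made-up illustrative values, not data from the paper. How the authors combine this step with the GA is not shown here.

```python
def linear_scaling(outputs, targets):
    """Return (intercept, slope) minimising the MSE of intercept + slope * output.

    Assumption: ordinary least-squares scaling, slope = cov(t, y) / var(y)
    and intercept = mean(t) - slope * mean(y).
    """
    n = len(outputs)
    if n == 0 or n != len(targets):
        raise ValueError("outputs and targets must be non-empty and equal in length")

    mean_y = sum(outputs) / n
    mean_t = sum(targets) / n

    # Sums of squared deviations and cross deviations (the 1/n factors cancel).
    var_y = sum((y - mean_y) ** 2 for y in outputs)
    if var_y == 0.0:
        # A constant expression cannot be rescaled; fall back to the target mean.
        return mean_t, 0.0

    cov_ty = sum((y - mean_y) * (t - mean_t) for y, t in zip(outputs, targets))
    slope = cov_ty / var_y
    intercept = mean_t - slope * mean_y
    return intercept, slope


def scaled_mse(outputs, targets):
    """Mean squared error of the outputs after optimal linear scaling."""
    intercept, slope = linear_scaling(outputs, targets)
    return sum(
        (t - (intercept + slope * y)) ** 2 for y, t in zip(outputs, targets)
    ) / len(outputs)


# Illustrative values only: raw expression outputs and target quality scores.
raw = [1.0, 2.0, 3.0, 4.0]
scores = [3.0, 5.0, 7.0, 9.0]
a, b = linear_scaling(raw, scores)
```

Because the scaling coefficients are computed in closed form, an evolutionary search using this step only needs to find an expression whose shape matches the targets, not its offset and gain.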

Author information

Authors and Affiliations

Department of Electronic and Computer Engineering, University of Limerick, Limerick, Ireland
Adil Raja & Colin Flanagan

Editor information

Michael O’Neill, Leonardo Vanneschi, Steven Gustafson, Anna Isabel Esparcia Alcázar, Ivanoe De Falco, Antonio Della Cioppa, Ernesto Tarantino

Cite this paper

Raja, A., Flanagan, C. (2008). Real-Time, Non-intrusive Speech Quality Estimation: A Signal-Based Model. In: O’Neill, M., et al. Genetic Programming. EuroGP 2008. Lecture Notes in Computer Science, vol 4971. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78671-9_4

DOI: https://doi.org/10.1007/978-3-540-78671-9_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-78670-2
Online ISBN: 978-3-540-78671-9
eBook Packages: Computer Science, Computer Science (R0)
