Proceedings of the 2nd International Conference on Measurement of Speech and Audio Quality in Networks, Prague, Czech Republic, May 2003.
Abstract: Temporal discontinuities in received speech are a reality of Internet Telephony or Voice over Internet Protocol (VoIP) systems. These relatively new impairments pose unique challenges to objective estimators of perceived speech quality. We suggest that objective estimators may benefit from the addition of a temporal discontinuity impairment processor, and we provide subjective test results that may help with the design of such processors. We added loss, pause, and jump impairments (nine different levels of each) at random locations in active segments of G.723.1 coded speech. We then measured the resulting perceived speech quality via a formal absolute category rating subjective experiment using the mean opinion score (MOS) scale. The results show that these three impairments have similar influences on perceived speech quality, even though the pause and jump impairments are exact opposites (temporal dilation vs. temporal contraction). The results also demonstrate that, at a fixed impairment rate, dispersing these impairments is less detrimental to perceived speech quality than clustering them. We offer a simple mathematical model that relates impairment parameters to experimental MOS values. We expect these results to be of value to those who develop objective estimators of packetized speech quality, as well as to those who design jitter buffers and jitter buffer management (or playout) algorithms.
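As an illustration of the three impairment types named above, the following minimal Python sketch applies each one to a list of samples. It is not the procedure used in the experiment: the function names, the sample-list representation, and the choice to model a loss as a segment replaced by silence are all assumptions made for illustration. The pause and jump functions follow the abstract's description of temporal dilation and temporal contraction.

```python
def add_loss(samples, start, length):
    """Assumed loss model: overwrite a segment with silence; duration is unchanged."""
    out = list(samples)
    end = min(start + length, len(out))
    out[start:end] = [0] * (end - start)
    return out


def add_pause(samples, start, length):
    """Insert silence at `start`; the signal gets longer (temporal dilation)."""
    return list(samples[:start]) + [0] * length + list(samples[start:])


def add_jump(samples, start, length):
    """Remove a segment at `start`; the signal gets shorter (temporal contraction)."""
    return list(samples[:start]) + list(samples[start + length:])


if __name__ == "__main__":
    speech = list(range(1, 11))  # hypothetical stand-in for decoded speech samples
    print(len(add_loss(speech, 3, 4)))   # same length as the input
    print(len(add_pause(speech, 3, 2)))  # longer than the input
    print(len(add_jump(speech, 3, 4)))   # shorter than the input
```

In this sketch a pause and a jump of the same length at the same location change the duration by equal and opposite amounts, which is the sense in which the abstract calls them exact opposites.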
For technical information concerning this report, contact:
Stephen D. Voran
Institute for Telecommunication Sciences
Disclaimer: Certain commercial equipment, components, and software may be identified in this report to specify adequately the technical aspects of the reported results. In no case does such identification imply recommendation or endorsement by the National Telecommunications and Information Administration, nor does it imply that the equipment or software identified is necessarily the best available for the particular application or uses.