Audio dilation in real time speech communication

May 18th, 2015

Categories: Audio Research

Diapix Image Test


Novak, J.S., Archer, J., Kenyon, R.V.


Algorithmically decreasing speech tempo, or “audio dilation,” can improve speech perception in attentionally demanding tasks [Gygi & Shafiro (2014), Hear. Res. 310, pp. 76-86]. On-line audio dilation [Novak, et al. (2013), Interspeech 1869-1871] is a recently developed technique that decreases the tempo of audio signals as they are generated. This pilot study investigated effects of on-line audio dilation on performance in interactive problem-solving tasks. We used a Diapix task [Baker & Hazan (2011), Behav. Res. Methods 43(3), 761-770] to elicit and record spontaneous speech from pairs of participants under various dilation conditions: participants, seated in different rooms, were asked to find ten differences between two similar pictures, while their speech was either transmitted as spoken or dilated. The conditions tested included stretching one, both, or neither audio signal by 40%. Subsequent analysis shows that the technique, even using this substantial increase, did not interfere with interactive problem-solving tasks, and did not lead to changes in speech production rate, measured as number of syllables per second. The lack of negative effects of on-line speech dilation provides a preliminary basis for further assessment of this method in speech perception tasks with high attentional and memory load.
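As a rough illustration of the two quantities mentioned in the abstract, the sketch below works through the arithmetic of a 40% stretch and of a syllables-per-second speech rate. It assumes that "stretching by 40%" means the dilated signal lasts 1.4 times as long as the original; the abstract does not spell this out. The function names are hypothetical, and this is not the authors' dilation algorithm or analysis code.

```python
# Illustrative arithmetic only; names and the interpretation of "40%" are assumptions.

def dilated_duration(duration_s: float, stretch: float = 0.40) -> float:
    """Duration after stretching, assuming a 40% stretch means 1.4x the original length."""
    return duration_s * (1.0 + stretch)


def tempo_factor(stretch: float = 0.40) -> float:
    """Playback tempo relative to the original under the same assumption (about 0.714 for 40%)."""
    return 1.0 / (1.0 + stretch)


def syllable_rate(n_syllables: int, speaking_time_s: float) -> float:
    """Speech production rate in syllables per second."""
    if speaking_time_s <= 0:
        raise ValueError("speaking_time_s must be positive")
    return n_syllables / speaking_time_s


if __name__ == "__main__":
    # A hypothetical 10-second utterance containing 42 syllables.
    print(dilated_duration(10.0))      # length the listener would hear, in seconds
    print(round(tempo_factor(), 3))    # relative tempo of the dilated signal
    print(syllable_rate(42, 10.0))     # talker's production rate, measured on the undilated speech
```

Under this reading, dilation changes what the listener hears but not the talker's own production rate, which is why the rate is computed from the original speaking time.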




Novak, J.S., Archer, J., Kenyon, R.V., Audio dilation in real time speech communication, The Journal of the Acoustical Society of America 137.4 (2015), Pittsburgh, PA, p. 2303, May 18th, 2015. DOI:10.1121/1.4920407.