We have been investigating how sensitive listeners are to the exact form of pitch contours in speech. Using the STRAIGHT analysis-synthesis package, we extract the original pitch track from recorded speech, simplify it with our stylization algorithm, and then resynthesize a new token from the simplified pitch contour that otherwise sounds very much like the original. We used these tokens in a subjective experiment, which confirmed that our linear pitch contour stylization is nearly perceptually equivalent to the original pitch contour.
This page provides some sound examples to illustrate this surprising (to us) result.
You can also read our ICASSP-08 paper on this work.
Below are spectrograms of the three examples. We use a long (64 ms) analysis window and zoom in on the bottom 1000 Hz of the spectrum to emphasize the differences between the pitch tracks, which are visible in the harmonics.
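For readers who want to produce a comparable view of their own recordings, the following is a minimal sketch of a narrowband spectrogram with a 64 ms window, cropped to the bottom 1000 Hz. It is not the code used to make the figures on this page: the function name, the Hann window, and the 8 ms hop are assumptions chosen for illustration. A 64 ms window gives a frequency spacing of about 15.6 Hz (1 / 0.064 s), which is fine enough to separate individual harmonics of typical speech.

```python
import numpy as np


def narrowband_spectrogram(signal, sample_rate, window_s=0.064,
                           hop_s=0.008, max_hz=1000.0):
    """Return (freqs, times, magnitude_db) for frequencies <= max_hz.

    window_s=0.064 and max_hz=1000.0 follow the text; the Hann window
    and hop_s=0.008 are illustrative assumptions.
    """
    signal = np.asarray(signal, dtype=float)
    win_len = int(round(window_s * sample_rate))
    hop = int(round(hop_s * sample_rate))
    if len(signal) < win_len:
        raise ValueError("signal is shorter than one analysis window")

    window = np.hanning(win_len)
    n_frames = 1 + (len(signal) - win_len) // hop

    # Slice the signal into overlapping, windowed frames.
    frames = np.stack([
        signal[i * hop: i * hop + win_len] * window
        for i in range(n_frames)
    ])

    # One-sided FFT of each frame, then keep only the low-frequency bins.
    spectrum = np.fft.rfft(frames, axis=1)
    freqs = np.fft.rfftfreq(win_len, d=1.0 / sample_rate)
    keep = freqs <= max_hz

    # Small offset avoids log(0) in silent frames.
    magnitude_db = 20.0 * np.log10(np.abs(spectrum[:, keep]) + 1e-10)

    # Time stamp of each frame is taken at the window centre.
    times = (np.arange(n_frames) * hop + win_len / 2.0) / sample_rate
    return freqs[keep], times, magnitude_db.T
```

The returned array has one row per frequency bin and one column per frame, so it can be passed directly to a plotting routine such as an image or mesh plot with `times` on the horizontal axis and `freqs` on the vertical axis.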
This material is based in part upon work supported by the National Science Foundation under Grant No. IIS-0535168. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation (NSF).