Multiscale Fourier descriptors for classifying semivowels in spectrograms

Authors:

Abstract

Fourier descriptors (FDs) have been used successfully to characterize the boundaries of objects in images. This paper demonstrates that FDs are also appropriate for characterizing objects in speech spectrograms, using a data set of 40 sounds representing 10 speaker-dependent words that contain the English semivowels /w y l r/. With 10 FDs, a 97.5% recognition rate is attained. The wide- and narrow-band methods misclassify different sounds, suggesting that multiscaling and FD changes (differences) may be appropriate features. With an FD difference approach, recognition rates equaled or exceeded those obtained with a conventional linear predictive coding (LPC) classifier as well as those obtained with the wide- and narrow-band FD methods alone.
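For readers unfamiliar with FDs, the following is a minimal sketch of one common way to compute them from an ordered, closed boundary. It is not the authors' implementation: the function name `fourier_descriptors`, the choice of magnitude-only coefficients, and the normalization by the first harmonic are all assumptions made for illustration, and the abstract does not say how the spectrogram boundaries were extracted.

```python
import numpy as np


def fourier_descriptors(boundary, n_descriptors=10):
    """Return n_descriptors normalized Fourier descriptor magnitudes.

    boundary: (N, 2) array of ordered (x, y) points along a closed contour.
    Hypothetical helper for illustration; normalization choices are assumptions.
    """
    pts = np.asarray(boundary, dtype=float)
    if pts.ndim != 2 or pts.shape[1] != 2:
        raise ValueError("boundary must have shape (N, 2)")
    if pts.shape[0] < n_descriptors + 2:
        raise ValueError("not enough boundary points for the requested descriptors")

    # Treat each boundary point as a complex number x + iy.
    z = pts[:, 0] + 1j * pts[:, 1]
    coeffs = np.fft.fft(z)

    # Dropping coefficient 0 removes the dependence on translation.
    # Taking magnitudes removes the dependence on rotation and start point.
    mags = np.abs(coeffs[1:])

    # Dividing by the first harmonic removes the dependence on scale.
    if mags[0] == 0:
        raise ValueError("degenerate boundary: first harmonic is zero")
    normalized = mags / mags[0]

    # Skip the first harmonic itself (always 1 after normalization).
    return normalized[1:n_descriptors + 1]


# Usage: a three-lobed closed curve sampled at 64 points.
t = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
r = 1.0 + 0.3 * np.cos(3.0 * t)
example_boundary = np.column_stack([r * np.cos(t), r * np.sin(t)])
example_fd = fourier_descriptors(example_boundary, n_descriptors=10)
```

Because the descriptors in this sketch are invariant to translation, rotation, and scale, the same shape yields the same feature vector wherever it appears. Whether that invariance is desirable for spectrogram objects, where position along the frequency axis carries information, is a design question this sketch does not settle.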

Keywords: Fourier descriptors, Multiscale, Planar shape, Spectrogram, Cluster analysis

Review history: Received 16 September 1992, Revised 4 March 1993, Accepted 7 April 1993, Available online 19 May 2003.

Official paper URL: https://doi.org/10.1016/0031-3203(93)90163-Q