Articles written in Sadhana
Volume 44 Issue 3 March 2019 Article ID 0054
Mel-Frequency Cepstral Coefficients (MFCCs) are features widely and successfully used in various speech processing applications. These features are extracted using the Fourier transform. However, this transform suffers from some crucial restrictions when used to analyze nonlinear and non-stationary signals such as speech. To address this problem, the present study investigates the application of Empirical Mode Decomposition (EMD) to extracting more efficient and robust features for automatic gender identification. In the proposed approach, the speech signal is first decomposed by EMD into a set of narrow-band oscillatory modes, the intrinsic mode functions (IMFs), from which mel-frequency cepstral features can be extracted. However, multi-band decomposition of all modes yields some redundant and even irrelevant features that can degrade classification performance. Therefore, we propose to efficiently select the most discriminative frequency bands over all modes. The minimal-redundancy-maximal-relevance (mRMR) feature selection algorithm is also examined for this purpose. The proposed EMD-based features are then extracted by applying the discrete cosine transform (DCT) to log power values calculated over the selected mel-scale bands of the IMFs. Simulation results show that using the proposed features for automatic gender identification considerably improves system performance, particularly in noisy environments.
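
The final extraction step described in the abstract (log power over selected mel-scale bands of the IMFs, followed by a DCT) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the IMFs of one frame have already been computed by some EMD routine, and the function names (`emd_mel_features`, `mel_filterbank`, `dct2`), the band count, the common 2595/700 mel formula, and the `(imf_index, band_index)` selection format are all hypothetical choices made for illustration. The band selection itself (for example by mRMR) is taken as given.

```python
import cmath
import math

def hz_to_mel(f):
    # Common mel-scale formula (an assumption; the abstract does not specify one).
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def power_spectrum(frame):
    """Naive DFT power spectrum (bins 0..N/2) of one frame."""
    n = len(frame)
    spec = []
    for k in range(n // 2 + 1):
        s = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(frame))
        spec.append(abs(s) ** 2 / n)
    return spec

def mel_filterbank(n_bands, n_bins, sample_rate):
    """Triangular filters with centers equally spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    edges = [mel_to_hz(top * i / (n_bands + 1)) for i in range(n_bands + 2)]
    bin_hz = (sample_rate / 2.0) / (n_bins - 1)
    bank = []
    for b in range(n_bands):
        lo, mid, hi = edges[b], edges[b + 1], edges[b + 2]
        weights = []
        for k in range(n_bins):
            f = k * bin_hz
            if lo < f <= mid:
                weights.append((f - lo) / (mid - lo))
            elif mid < f < hi:
                weights.append((hi - f) / (hi - mid))
            else:
                weights.append(0.0)
        bank.append(weights)
    return bank

def dct2(values):
    """Unnormalized DCT-II of a list of values."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (i + 0.5) / n)
                for i, v in enumerate(values))
            for k in range(n)]

def emd_mel_features(imf_frames, selected, sample_rate, n_bands=8, n_coeffs=4):
    """Cepstral-style features from selected mel bands of precomputed IMFs.

    imf_frames: list of equal-length IMF frames (EMD is assumed done elsewhere).
    selected:   hypothetical list of (imf_index, band_index) pairs to keep.
    """
    n_bins = len(imf_frames[0]) // 2 + 1
    bank = mel_filterbank(n_bands, n_bins, sample_rate)
    log_powers = []
    for imf_idx, band_idx in selected:
        spec = power_spectrum(imf_frames[imf_idx])
        energy = sum(w * p for w, p in zip(bank[band_idx], spec))
        log_powers.append(math.log(energy + 1e-10))  # small floor avoids log(0)
    return dct2(log_powers)[:n_coeffs]
```

For instance, calling `emd_mel_features(imfs, [(0, 5), (0, 6), (1, 0), (1, 1)], 8000)` on two IMF frames would return four coefficients built only from the listed bands. The naive DFT keeps the sketch dependency-free; a real system would use an FFT.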