A study on conventional and syllable-based approaches for automatic speech recognition in Malayalam
JASMIN S, ASHISH ABRAHAM SAMUEL, RAJEEV RAJAN
Permanent link:
https://www.ias.ac.in/article/fulltext/sadh/047/0284
This paper investigates conventional and syllable-based ASR systems for a low-resource south Indian language, Malayalam. The standard Kaldi framework is employed. While the first approach uses a word-phoneme lexicon, the second approach uses a syllable-phoneme lexicon as the pronunciation dictionary. Mel-frequency cepstral coefficient (MFCC) features of the audio corpus are extracted, and acoustic modelling is done using the Gaussian mixture model-hidden Markov model (GMM-HMM) and a deep neural network (DNN). The dependence of the systems' performance on factors such as the type of modelling and the alignment algorithms employed is studied. The number of hidden layers and units is varied, and the results are analysed. Fine-tuning of phoneme positions plays an integral part in the recognition process of the Kaldi speech recognition toolkit. The syllable-based study is conducted using a novel phonetic analyser, Mlphone. The analysis shows that Kaldi performed well for phoneme-level DNN acoustic modelling, providing a word error rate of 2.86%, lower than that of the syllable-based model.
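The word error rate quoted above is conventionally defined as the word-level edit distance (substitutions, deletions and insertions) between the recognised and reference transcripts, divided by the number of reference words. The following is a minimal Python sketch of that standard definition, not the scoring pipeline used in the paper; the function name `word_error_rate` and the sample strings are hypothetical and chosen only for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative sketch of the standard WER definition; tokenisation is
    a plain whitespace split, which is an assumption made here.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical transcripts: one substitution and one deletion
    # against a four-word reference.
    print(word_error_rate("a b c d", "a x c"))
```

Because insertions count as errors, this ratio can exceed 1.0 when the hypothesis is much longer than the reference.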
Volume 48, 2023
© 2022-2023 Indian Academy of Sciences, Bengaluru.