    • Keywords


      Speech recognition; syllable; deep neural network; word error rate.

    • Abstract


      This paper investigates conventional and syllable-based ASR systems for a low-resource South Indian language, Malayalam. The standard Kaldi framework is employed. While the first approach uses a word-phoneme lexicon, the second uses a syllable-phoneme lexicon as the pronunciation dictionary. Mel-frequency cepstral coefficient features of the audio corpus are extracted, and acoustic modelling is done using the Gaussian mixture model-hidden Markov model (GMM-HMM) and a deep neural network (DNN). The dependence of system performance on factors such as the type of modelling and the alignment algorithms employed is studied. The number of hidden layers and units is varied, and the results are analysed. Fine-tuning of phoneme positions plays an integral part in the recognition process of the Kaldi speech recognition toolkit. The syllable-based study is conducted using a novel phonetic analyser, Mlphone. The analysis shows that Kaldi performed well for phoneme-level DNN acoustic modelling, achieving a word error rate of 2.86%, lower than that of the syllable-based model.
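
The word error rate quoted above is conventionally the word-level edit distance (substitutions + deletions + insertions) between the reference transcript and the recogniser output, divided by the number of reference words. The sketch below illustrates that standard definition only; it is not the paper's or Kaldi's scoring code, and the function name `word_error_rate` and the example strings are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference words.

    Illustrative sketch: tokens are obtained by whitespace splitting, which
    is an assumption and may differ from the tokenisation used in the paper.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical transcripts: one substitution and one deletion
    # against a four-word reference.
    print(word_error_rate("w1 w2 w3 w4", "w1 wX w3"))
```

Under this definition, a word error rate of 2.86% corresponds to roughly 2.86 word-level errors per 100 reference words.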

    • Author Affiliations



      1. L &T Technology Services, Mysore, Karnataka, India
      2. College of Engineering, Trivandrum, APJ Abdul Kalam Technological University, Thiruvananthapuram, India
© 2022-2023 Indian Academy of Sciences, Bengaluru.