• Fulltext

       


      Permanent link:
      https://www.ias.ac.in/article/fulltext/sadh/047/0284

    • Keywords

       

      Speech recognition; syllable; deep neural network; word error rate.

    • Abstract

       

      This paper investigates conventional and syllable-based ASR systems for a low-resource South Indian language, Malayalam. The standard Kaldi framework is employed. The first approach uses a word-phoneme lexicon as the pronunciation dictionary, while the second uses a syllable-phoneme lexicon. Mel-frequency cepstral coefficient features are extracted from the audio corpus, and acoustic modelling is done using the Gaussian mixture model-hidden Markov model (GMM-HMM) and a deep neural network (DNN). The dependence of the systems' performance on factors such as the type of modelling and the alignment algorithms employed is studied. The number of hidden layers and units is varied, and the results are analysed. The fine-tuning of phoneme positions plays an integral part in the recognition process of the Kaldi speech recognition toolkit. The syllable-based study is conducted using a novel phonetic analyser, Mlphone. The analysis shows that Kaldi performed well for phoneme-level DNN acoustic modelling, achieving a word error rate of 2.86%, lower than that of the syllable-based model.
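
The word error rate quoted in the abstract is conventionally defined as the number of word substitutions, deletions and insertions needed to turn the recognised text into the reference transcript, divided by the number of reference words. The following is a minimal Python sketch of that standard definition, not the scoring code used by the authors or by Kaldi; the function name `word_error_rate` and the example sentences are hypothetical and chosen only for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference length.

    Illustrative helper (hypothetical name); words are split on whitespace.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Made-up example: one substitution and one deletion against 4 reference words.
    print(word_error_rate("the cat sat down", "the bat sat"))
```

Under this definition, the made-up example has two word errors against four reference words. Because insertions are counted against the reference length, the value can exceed 1 when the hypothesis is much longer than the reference.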

    • Author Affiliations

       

      JASMIN S¹, ASHISH ABRAHAM SAMUEL², RAJEEV RAJAN²

      1. L&T Technology Services, Mysore, Karnataka, India
      2. College of Engineering, Trivandrum, APJ Abdul Kalam Technological University, Thiruvananthapuram, India

© 2022-2023 Indian Academy of Sciences, Bengaluru.