    • Keywords


      Automatic speech recognition; deep learning; gated recurrent unit; Indic BERT; Indic spell corrector.

    • Abstract


      India is a land of unity; it is home to 122 major languages and 1599 other languages. Around 70% of people in India speak Indo-Aryan languages, whereas 19% speak Dravidian languages, which are agglutinative and morphologically rich. Speech is a lucid, time-saving, and effortless means of communication. Automatic speech recognition (ASR) is a process that accurately transcribes spoken utterances into text. Speech recognition in Indian languages will empower people to easily access any content they desire in their regional language. The goal of this work is to develop a novel deep sequence modeling-based ASR system with an improved spell corrector for seven low-resource languages. The efficacy of the proposed model is evaluated using word error rate (WER) and sequence match ratio. The end-to-end ASR system, based on a recurrent neural network with gated recurrent units (RNN-GRU), achieves plausible results with an average WER of 0.62. A key concern in ASR systems is spelling errors in the transcribed text. Despite the intricacy of spell correction in natural language processing, the transformer-based Indic Bidirectional Encoder Representations from Transformers (Indic BERT) language model reduces the average WER from 0.62 to 0.52, an absolute improvement of 0.10 (10 percentage points).
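
      The two evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes the standard definition of WER (word-level edit distance divided by the number of reference words). The function names `word_error_rate` and `sequence_match_ratio` are hypothetical, and computing the sequence match ratio with Python's `difflib.SequenceMatcher` is an assumption; the paper may define that metric differently.

```python
from difflib import SequenceMatcher


def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def sequence_match_ratio(reference: str, hypothesis: str) -> float:
    """Character-level similarity in [0, 1].

    Assumption: uses difflib's ratio; the paper's exact definition is not
    given in the abstract and may differ.
    """
    return SequenceMatcher(None, reference, hypothesis).ratio()
```

      Under this definition, a WER of 0.62 corresponds to roughly 62 word-level edits per 100 reference words, so the drop to 0.52 reported above removes about 10 of those edits per 100 words.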

    • Author Affiliations



      1. Department of IT, PSG College of Technology, Coimbatore 641 004, India
      2. Department of EEE, PSG College of Technology, Coimbatore 641 004, India
      3. Department of CSE, PSG College of Technology, Coimbatore 641 004, India

  • Sadhana | News


© 2022-2023 Indian Academy of Sciences, Bengaluru.