• Fulltext

       


      Permanent link:
      https://www.ias.ac.in/article/fulltext/sadh/047/0123

    • Keywords

       

      Low Resource Language; Machine Translation; Evaluation.

    • Abstract

       

      A recent United Nations Educational, Scientific and Cultural Organization (UNESCO) survey states that India has 197 endangered languages. Himachal Pradesh, a state in India, tops the list with seven definitely endangered languages, Kinnauri-Pahari being one of them. Because digitized resources are scarce, compiling a corpus for the language is difficult. This paper presents and releases the Kinnauri-Pahari (ISO 639-3: kjo) dataset, consisting of 43,362 monolingual and 20,307 parallel sentences in version_0.1. The dataset was tested on statistical and neural machine translation systems, and the results were evaluated using different evaluation metrics. The corpus is freely available for non-commercial use and research (https://github.com/phildani7/dlnith/tree/master/Kinnauri-Pahari).

    • Author Affiliations

       

      SHEFALI SAXENA¹, SHWETA CHAUHAN¹, PHILEMON DANIEL¹

      1. Electronics and Communication Department, National Institute of Technology, Hamirpur, H. P., India

© 2022-2023 Indian Academy of Sciences, Bengaluru.