• Parteek Kumar

      Articles written in Sadhana

    • Punjabi to UNL enconversion system

      Parteek Kumar, R K Sharma


      This paper reports work on the EnConversion of input Punjabi sentences to an interlingua representation called Universal Networking Language (UNL). The UNL system consists of two main components: the EnConverter (which converts text from a source language to UNL) and the DeConverter (which converts text from UNL to a target language). This paper discusses the framework for designing the EnConverter for the Punjabi language, with a special focus on the generation of UNL attributes and relations from Punjabi source text. It also describes the working of the Punjabi Shallow Parser used to process the input sentence, which performs the tasks of a Tokenizer, Morph-analyzer, Part-of-Speech Tagger and Chunker. The paper also considers the seven phases used in the EnConversion of input Punjabi text to UNL representation. It highlights the EnConversion analysis rules used by the EnConverter and indicates their usage in the generation of UNL expressions. It also covers the results of the implementation of the Punjabi EnConverter and its evaluation on sample UNL sentences available at the Spanish Language Server. The accuracy of the developed system is also presented.

    • UNLization of Punjabi text for natural language processing applications



      During the last couple of years, immense research activity on the Universal Networking Language (UNL) has been witnessed in the field of Natural Language Processing. This paper illustrates the UNLization of Punjabi natural language for UC-A1, UGO-A1, and AESOP-A1 with the IAN (Interactive Analyzer) tool using the X-Bar approach. It also discusses the UNLization process in depth, step by step, with the help of tree diagrams and tables.

    • Sense disambiguation for Punjabi language using supervised machine learning techniques



      Automatic identification of the meaning of a word in context is termed Word Sense Disambiguation (WSD). It is a vital and hard artificial intelligence problem that arises in several natural language processing applications such as machine translation, question answering, and information retrieval. In this paper, an explicit WSD system for the Punjabi language using supervised techniques has been analysed. A sense-tagged corpus of 150 ambiguous Punjabi noun words has been manually prepared. Six supervised machine learning techniques, namely Decision List, Decision Tree, Naive Bayes, K-Nearest Neighbour (K-NN), Random Forest and Support Vector Machines (SVM), have been investigated in this work. Every classifier uses the same feature space, encompassing lexical (unigram, bigram, collocations, and co-occurrence) and syntactic (part of speech) count-based features. The semantic features of the Punjabi language have been derived from unlabelled Punjabi Wikipedia text using the word2vec continuous bag of words and skip-gram shallow neural network models. Two deep learning neural network classifiers, multilayer perceptron and long short term memory (LSTM), have also been applied for WSD of Punjabi words. The word embedding features have been tested on six classifiers for the Punjabi WSD task. It has been observed that the performance of the supervised classifiers applied to the WSD task for the Punjabi language is enhanced by the application of word embedding features. In this work, an accuracy of 84% has been achieved by the LSTM classifier using word embedding features.


© 2017-2019 Indian Academy of Sciences, Bengaluru.