SUDIP KUMAR NASKAR
Articles written in Sadhana
Volume 44 Issue 1 January 2019 Article ID 0012
Bio-molecular event extraction by integrating multiple event extraction systems
AMIT MAJUMDER, ASIF EKBAL, SUDIP KUMAR NASKAR
Event extraction from biomedical text is a very important task in text mining and natural language processing. The overall task involves finding event-related expressions, classifying these into predefined categories and attaching arguments to these events. We perform event detection and event classification in one step using an ensemble of classifiers. For event argument extraction, we also use an ensemble of classification models. Our base models are developed using supervised machine learning that makes use of statistical, contextual and syntactic features. Our experimental result on the benchmark datasets of BioNLP-2011 shared task shows the recall, precision and F-measure values of 51.20%, 65.78% and 57.58%, respectively.
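As a quick consistency check on the figures reported in the abstract above, the following Python sketch recomputes the F-measure from the stated recall and precision. It assumes the reported F-measure is the balanced F1 score (the harmonic mean of precision and recall), which the abstract does not state explicitly; the function name `f_measure` is illustrative only.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall.

    Assumption: the abstract's "F-measure" is the balanced F1 score.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values taken from the abstract (in percent).
reported_recall = 51.20
reported_precision = 65.78

f1 = f_measure(reported_precision, reported_recall)
print(round(f1, 2))  # 57.58
```

Under that assumption, the recomputed value agrees with the 57.58% given in the abstract.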
Volume 44 Issue 8 August 2019 Article ID 0181
A novel approach to word sense disambiguation in Bengali language using supervised methodology
ALOK RANJAN PAL, DIGANTA SAHA, NILADRI SEKHAR DASH, SUDIP KUMAR NASKAR, ANTARA PAL
An attempt is made in this paper to report how a supervised methodology has been adopted for the task of Word Sense Disambiguation (WSD) in Bengali with necessary modifications. At the initial stage, four commonly used supervised methods, Decision Tree (DT), Support Vector Machine (SVM), Artificial Neural Network (ANN) and Naïve Bayes (NB), are developed at the baseline. These algorithms are applied individually on a data set of 13 most frequently used Bengali ambiguous words. On experimental basis, the baseline strategy is modified with two extensions: (a) inclusion of lemmatization process into the system and (b) bootstrapping of the operational process. As a result, the levels of accuracy of the baseline methods are slightly improved, which is a positive signal for the whole process of disambiguation as it opens scope for further modification of the existing method for better result. In this experiment, the data sets are prepared from the Bengali corpus, developed in the Technology Development for Indian Languages (TDIL) project of the Government of India and from the Bengali WordNet, which is developed at the Indian Statistical Institute, Kolkata. The paper reports the challenges and pitfalls of the work that have been closely observed during the experiment.
Volume 44 Issue 12 December 2019 Article ID 0247
Classifier combination approach for question classification for Bengali question answering system
SOMNATH BANERJEE, SUDIP KUMAR NASKAR, PAOLO ROSSO, SIVAJI BANDYOPADHYAY
Question classification (QC) is a prime constituent of an automated question answering system. The work presented here demonstrates that a combination of multiple models achieves better classification performance than those obtained with existing individual models for the QC task in Bengali. We have exploited state-of-the-art multiple model combination techniques, i.e., ensemble, stacking and voting, to increase QC accuracy. Lexical, syntactic and semantic features of Bengali questions are used for four well-known classifiers, namely Naïve Bayes, kernel Naïve Bayes, Rule Induction and Decision Tree, which serve as our base learners. Single-layer question-class taxonomy with 8 coarse-grained classes is extended to two-layer taxonomy by adding 69 fine-grained classes. We carried out the experiments both on single-layer and two-layer taxonomies. Experimental results confirmed that classifier combination approaches outperform single-classifier classification approaches by 4.02% for coarse-grained question classes. Overall, the stacking approach produces the best results for fine-grained classification and achieves 87.79% of accuracy. The approach presented here could be used in other Indo-Aryan or Indic languages to develop a question answering system.
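The voting variant of classifier combination named in the abstract above can be illustrated with a minimal Python sketch. Plain majority voting is an assumption here, and the class labels and per-classifier predictions are hypothetical; the paper's actual combination scheme, features and taxonomy labels are not reproduced.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by the largest number of base classifiers.

    `predictions` holds one label per base classifier for a single question.
    The handling of ties is left to Counter.most_common and is not part of
    the technique itself.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical outputs of four base learners for two questions.
# The labels "PERSON" and "LOCATION" are illustrative, not the paper's taxonomy.
question_1 = ["PERSON", "PERSON", "LOCATION", "PERSON"]
question_2 = ["LOCATION", "LOCATION", "LOCATION", "PERSON"]

combined = [majority_vote(question_1), majority_vote(question_2)]
print(combined)  # ['PERSON', 'LOCATION']
```

Stacking differs from this sketch in that a further classifier is trained on the base learners' outputs rather than counting votes.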
© 2022-2023 Indian Academy of Sciences, Bengaluru.