Articles written in Sadhana
Volume 47, Published: 1 January 2022, Article ID 0002
An improvement of Bengali factoid question answering system using unsupervised statistical methods
ARIJIT DAS, JAYDEEP MANDAL, ZARGHAM DANIAL, ALOK RANJAN PAL, DIGANTA SAHA
Virtual Assistants (VA) and chatbots have accelerated research on Question Answering (QA) systems. A QA system is expected to answer a question by processing a backend repository, where both the questions and the repository text are in natural language. A substantial number of projects have built QA systems for high-resource languages, but for low-resource languages progress is still at an early stage. In this work, we design, develop and evaluate a factoid QA system for a low-resource language: Bengali. The system takes a question from a human user and retrieves all prospective answers from a multi-domain repository. The answers are then ranked on six parameters and returned. The performance of the system is evaluated and compared with earlier systems using standard metrics. The algorithm is tested on two repositories. The first is the TDIL corpus, a large collection of famous Bengali literature developed in the Technology Development for Indian Languages (TDIL) project. The second is the translated SQuAD, a Bengali translation of the Stanford Question Answering Dataset. The system ranks the correct answer first in 88.23% of cases. Based on a confusion-matrix evaluation, accuracy and F1 score are 97.64% and 98.5%, respectively, for the TDIL corpus, and 97.16% and 98.51% for the translated SQuAD.
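For readers unfamiliar with the reported metrics, the following is a minimal sketch of how accuracy and F1 score are conventionally derived from a binary confusion matrix. It uses the standard textbook definitions, not the authors' evaluation code; the class name `ConfusionMatrix` and the counts in the usage example are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class ConfusionMatrix:
    """Counts from a binary evaluation (hypothetical container, not from the paper)."""
    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives
    tn: int  # true negatives

    def accuracy(self) -> float:
        # Fraction of all decisions that were correct.
        total = self.tp + self.fp + self.fn + self.tn
        return (self.tp + self.tn) / total if total else 0.0

    def precision(self) -> float:
        # Fraction of predicted positives that were truly positive.
        denom = self.tp + self.fp
        return self.tp / denom if denom else 0.0

    def recall(self) -> float:
        # Fraction of actual positives that were found.
        denom = self.tp + self.fn
        return self.tp / denom if denom else 0.0

    def f1(self) -> float:
        # Harmonic mean of precision and recall.
        p, r = self.precision(), self.recall()
        return 2 * p * r / (p + r) if (p + r) else 0.0


if __name__ == "__main__":
    # Illustrative counts only; they do not reproduce the paper's results.
    cm = ConfusionMatrix(tp=6, fp=2, fn=2, tn=10)
    print(f"accuracy = {cm.accuracy():.4f}")
    print(f"F1 score = {cm.f1():.4f}")
```

Because F1 ignores true negatives while accuracy counts them, the two figures can differ noticeably on the same evaluation.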