• Fulltext

       

      Permanent link:
      https://www.ias.ac.in/article/fulltext/sadh/045/0168

    • Keywords

       

      Text categorization; fuzzy logic; inference rule; defuzzification

    • Abstract

       

      The digital world is flooded with documents belonging to a wide range of categories. Most of these documents are uncategorized, which hinders efficient retrieval. News texts, one of the largest and most common sources of text information, often do not belong to one particular category and contain content from multiple domains. This calls for a text categorization system that segregates such texts into their respective domains for efficient information retrieval. The main challenge lies in handling the overlap of vocabulary among different domains at the time of categorization, which we tackle using an approach based on fuzzy logic. In the present work, a fuzzy rule inference system is presented that works with newly proposed statistical features for segregating documents that belong to more than one category or to an undefined category. The generated model was defuzzified using five different techniques for determining the category of a document, and the highest accuracy, 98.63%, was obtained with the Centroid method. Experiments were also carried out on standard English datasets (Reuters-21578 R8 and 20 Newsgroups). We obtained better results than those of reported works, which points to the language independence of our system.
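
      For readers unfamiliar with centroid defuzzification, the following is a minimal sketch of the general technique, not the authors' implementation. The abstract does not specify the membership functions, rules, or features used, so the triangular sets, the firing strengths, and the names `triangular` and `centroid_defuzzify` are all hypothetical choices made for illustration.

```python
def triangular(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def centroid_defuzzify(xs, memberships):
    """Discrete centroid: sum(x * mu(x)) / sum(mu(x))."""
    total = sum(memberships)
    if total == 0:
        raise ValueError("aggregated membership is zero everywhere")
    return sum(x * m for x, m in zip(xs, memberships)) / total


# Hypothetical output universe: a category-affinity score sampled on [0, 1].
xs = [i / 100 for i in range(101)]

# Hypothetical firing strengths of two rules ("low" and "high" affinity).
low_strength, high_strength = 0.2, 0.7

# Mamdani-style aggregation: clip each output set at its rule's firing
# strength (min), then combine the clipped sets point-wise (max).
aggregated = [
    max(
        min(low_strength, triangular(x, -0.5, 0.0, 0.5)),
        min(high_strength, triangular(x, 0.5, 1.0, 1.5)),
    )
    for x in xs
]

# The crisp score is the centre of gravity of the aggregated fuzzy set.
score = centroid_defuzzify(xs, aggregated)
```

      Because the "high" rule fires more strongly in this made-up example, the crisp score lands in the upper half of the range. Other defuzzifiers, such as bisector or mean of maxima, would replace only the final step.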

    • Author Affiliations

       

      ANKITA DHAR¹, HIMADRI MUKHERJEE¹, NILADRI SEKHAR DASH², KAUSHIK ROY¹

      1. Department of Computer Science, West Bengal State University, Kolkata, India
      2. Linguistic Research Unit, Indian Statistical Institute, Kolkata, India

© 2021-2022 Indian Academy of Sciences, Bengaluru.