• A novel extractive text summarization system with self-organizing map clustering and entity recognition

    • Fulltext

       

      Permanent link:
      https://www.ias.ac.in/article/fulltext/sadh/045/0032

    • Keywords

       

      Natural language processing; self-organizing maps; semantic role labeling; relevance; redundancy; text summarization; entity recognition.

    • Abstract

       

      Extractive text summarization yields the salient parts of a document by discarding irrelevant and redundant information. In this paper, we propose a new strategy for extractive single-document summarization in Malayalam. First, entity recognition is performed, followed by relevance analysis based on context-aware features. The scored sentences are then clustered using self-organizing maps (SOM), and relevant sentences are extracted from these clusters by the proposed algorithm. Both theoretical and practical evaluations are carried out to analyze the implemented system. In the theoretical evaluation, gradients of the relevance equations are computed to determine which sentence-scoring features contribute most. The relevance equation is optimized using the Lagrange multiplier method. A complexity analysis of the proposed algorithms is also performed. In the practical evaluation, the system is compared with online and offline summarizers on metrics such as precision, recall, and F-measure. The system is also tested with a non-clustering approach in order to analyze the impact of the clustering used in our work. Existing evaluation strategies such as question game evaluation, sentence rank evaluation, and keyword association are also applied to assess parameters such as the relevance of sentences and important entity words.
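
The precision, recall, and F-measure comparison mentioned in the abstract can be illustrated with a minimal sketch. It assumes that a system summary and a reference summary are each represented as a set of selected sentence indices; that representation, the function name `summary_prf`, and the example indices are hypothetical and are not taken from the paper, whose exact evaluation protocol is not stated here.

```python
def summary_prf(system_ids, reference_ids):
    """Return (precision, recall, F-measure) for an extractive summary.

    Assumption: both summaries are given as collections of sentence
    indices drawn from the same source document.
    """
    system = set(system_ids)
    reference = set(reference_ids)
    overlap = len(system & reference)  # sentences chosen by both

    # Precision: share of system-selected sentences that are in the reference.
    precision = overlap / len(system) if system else 0.0
    # Recall: share of reference sentences that the system recovered.
    recall = overlap / len(reference) if reference else 0.0
    # F-measure: harmonic mean of precision and recall (balanced F1).
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Hypothetical example: the system extracts sentences 0, 2, 5,
# while a human reference summary contains sentences 0, 2, 3, 7.
precision, recall, f_measure = summary_prf([0, 2, 5], [0, 2, 3, 7])
print(f"P={precision:.3f} R={recall:.3f} F={f_measure:.3f}")
```

Comparing a clustering and a non-clustering variant of a summarizer, as the abstract describes, would then amount to calling such a function once per variant against the same reference summary.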

    • Author Affiliations

       

      M RAHUL RAJ1 ROSNA P HAROON1 N V SOBHANA2

      1. Department of Computer Science and Engineering, Ilahia College of Engineering & Technology, Muvattupuzha, India
      2. Department of Computer Science, Rajiv Gandhi Institute of Technology, Kottayam, India

© 2021-2022 Indian Academy of Sciences, Bengaluru.