• Fulltext

      Permanent link:
      https://www.ias.ac.in/article/fulltext/jbsc/040/04/0721-0730

    • Keywords

       

      Clustering; dissimilarity; eigenvalue; feature selection

    • Abstract

       

      Dimensionality reduction has become a routine step in modelling complex biological systems. A large number of feature selection techniques have been reported in the literature to improve model performance in terms of accuracy and speed. In the present article, an unsupervised feature selection technique is proposed that uses the maximum information compression index as the dissimilarity measure and the well-known density-based clustering technique DBSCAN to identify the largest natural group of dissimilar features. The algorithm is fast and less sensitive to the user-supplied parameters. Moreover, the method automatically determines the required number of features and identifies them. We used the proposed method to reduce the dimensionality of a number of benchmark data sets of varying sizes. Its performance was also extensively compared with that of some other well-known feature selection methods.
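
The dissimilarity measure named in the abstract can be illustrated with a small sketch. The maximum information compression index of two features is commonly defined as the smallest eigenvalue of their 2x2 covariance matrix; it is zero when the features are linearly dependent and grows as they become less related. The code below is a minimal sketch under that definition, not the authors' implementation: the function names `mici` and `mici_matrix` are hypothetical, population variance (division by n) is an assumption, and the DBSCAN step is left out because the abstract does not say how the clustering is applied to the dissimilarities.

```python
import math


def mici(x, y):
    """Smallest eigenvalue of the 2x2 covariance matrix of features x and y.

    Assumption: population (divide-by-n) variance and covariance are used.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("features must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    # Eigenvalues of [[var_x, cov_xy], [cov_xy, var_y]] follow from the
    # trace and determinant; the smaller root is the index.
    trace = var_x + var_y
    det = var_x * var_y - cov_xy ** 2
    disc = max(trace * trace - 4.0 * det, 0.0)  # guard against rounding
    return max((trace - math.sqrt(disc)) / 2.0, 0.0)


def mici_matrix(features):
    """Symmetric pairwise dissimilarity matrix for a list of feature columns."""
    k = len(features)
    matrix = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            d = mici(features[i], features[j])
            matrix[i][j] = d
            matrix[j][i] = d
    return matrix


if __name__ == "__main__":
    columns = [[1, 2, 3, 4], [2, 4, 6, 8], [1, -1, 1, -1]]
    for row in mici_matrix(columns):
        print(row)
```

A matrix of this kind could in principle be passed to a density-based clustering routine that accepts precomputed distances; whether the published method does exactly that is not stated here.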

    • Author Affiliations

       

      Debarka Sengupta¹, Indranil Aich², Sanghamitra Bandyopadhyay³

      1. Genome Institute of Singapore, Singapore 138 672, Singapore
      2. HTL Co. India Pvt. Ltd., New Delhi 110 092, India
      3. Machine Intelligence Unit, Indian Statistical Institute, Kolkata 700 108, India

© 2017-2019 Indian Academy of Sciences, Bengaluru.