• Fulltext

       

        Click here to view fulltext PDF


      Permanent link:
      https://www.ias.ac.in/article/fulltext/sadh/038/03/0407-0419

    • Keywords

       

      Data mining; unsupervised classification; kernel 𝑘-means clustering method; gradient descent method.

    • Abstract

       

      In unsupervised classification, kernel 𝑘-means clustering method has been shown to perform better than conventional 𝑘-means clustering method in identifying non-isotropic clusters in a data set. The space and time requirements of this method are $O(n^2)$, where 𝑛 is the data set size. Because of this quadratic time complexity, the kernel 𝑘-means method is not applicable to work with large data sets. The paper proposes a simple and faster version of the kernel 𝑘-means clustering method, called single pass kernel k-means clustering method. The proposed method works as follows. First, a random sample $\mathcal{S}$ is selected from the data set $\mathcal{D}$. A partition $\Pi_{\mathcal{S}}$ is obtained by applying the conventional kernel 𝑘-means method on the random sample $\mathcal{S}$. The novelty of the paper is, for each cluster in $\Pi_{\mathcal{S}}$, the exact cluster center in the input space is obtained using the gradient descent approach. Finally, each unsampled pattern is assigned to its closest exact cluster center to get a partition of the entire data set. The proposed method needs to scan the data set only once and it is much faster than the conventional kernel 𝑘-means method. The time complexity of this method is $O(s^2+t+nk)$ where 𝑠 is the size of the random sample $\mathcal{S}$, 𝑘 is the number of clusters required, and 𝑡 is the time taken by the gradient descent method (to find exact cluster centers). The space complexity of the method is $O(s^2)$. The proposed method can be easily implemented and is suitable for large data sets, like those in data mining applications. Experimental results show that, with a small loss of quality, the proposed method can significantly reduce the time taken than the conventional kernel 𝑘-means clustering method. The proposed method is also compared with other recent similar methods.

    • Author Affiliations

       

      T Hitendra Sarma1 P Viswanath2 B Eswara Reddy3

      1. Department of Computer Science and Engineering, Srinivasa Ramanujan Institute of Technology, Anantapur 515701, India
      2. Department of Computer Science and Engineering, Rajeev Gandhi Memorial College of Engineering and Technology, Nandyal 518501, India
      3. Department of Computer Science and Engineering, Jawaharlal Nehru Technological University, Anantapur College of Engineering, Anantapur 515002, India
    • Dates

       
  • Sadhana | News

    • Editorial Note on Continuous Article Publication

      Posted on July 25, 2019

      Click here for Editorial Note on CAP Mode

© 2017-2019 Indian Academy of Sciences, Bengaluru.