    • Keywords


      Bioinformatics pipelines; clustering; metagenomics algorithms; microbiome; next generation sequencing; operational taxonomic units

    • Abstract


      Taxonomic profiling using hyper-variable regions of 16S rRNA is one of the important goals of metagenomics analysis. Operational taxonomic unit (OTU) clustering algorithms are important tools for taxonomic profiling: they group 16S rRNA sequence reads into OTU clusters. Various OTU clustering algorithms are currently available within different pipelines, and some pipelines implement more than one clustering algorithm, but little literature is available on the relative performance and features of these algorithms. This makes the choice among these methods unclear. In this study, five current state-of-the-art OTU clustering algorithms (CDHIT, Mothur’s Average Neighbour, SUMACLUST, Swarm, and UCLUST) were comprehensively evaluated on metagenomics sequencing data. In all the datasets, Mothur’s Average Neighbour and Swarm created a larger number of OTU clusters. Based on normalized mutual information (NMI) and normalized information difference (NID), Swarm and Mothur’s Average Neighbour showed better clustering quality than the others. In terms of time complexity, however, the greedy algorithms (SUMACLUST, CDHIT, and UCLUST) performed well. There is thus a trade-off between clustering quality and time, and this trade-off needs to be taken into account when analysing large 16S rRNA gene sequencing datasets.
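
      As an illustration of the two quality measures named in the abstract, the sketch below compares two OTU assignments of the same reads. It is not the authors' code. The abstract does not state which normalization was used, so the sketch assumes NMI is mutual information divided by the larger of the two entropies, and NID = 1 - NMI. The function names and the toy labels are hypothetical.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy (in nats) of a cluster assignment."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information (in nats) between two assignments of the same reads."""
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    joint = Counter(zip(a, b))
    return sum(
        (n_ij / n) * log(n * n_ij / (count_a[x] * count_b[y]))
        for (x, y), n_ij in joint.items()
    )


def nmi_max(a, b):
    """NMI with max-entropy normalization (an assumption, see lead-in)."""
    if len(a) != len(b) or not a:
        raise ValueError("assignments must be non-empty and of equal length")
    denom = max(entropy(a), entropy(b))
    # Two single-cluster assignments carry no information; treat as identical.
    return 1.0 if denom == 0 else mutual_information(a, b) / denom


def nid(a, b):
    """Distance counterpart under the same assumption: 1 - NMI."""
    return 1.0 - nmi_max(a, b)


# Toy example: OTU labels assigned to six reads by two hypothetical tools.
reference = ["otu1", "otu1", "otu2", "otu2", "otu3", "otu3"]
predicted = ["a", "a", "b", "b", "b", "c"]
print(f"NMI = {nmi_max(reference, predicted):.3f}")
print(f"NID = {nid(reference, predicted):.3f}")
```

      Under this convention, higher NMI and lower NID both indicate closer agreement with the reference grouping, and the two values always sum to 1.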

    • Author Affiliations



      1. K.S. Rangasamy College of Technology, Tiruchengode 637 215, India
      2. Indian Institute of Technology Madras, Chennai, India

© 2021-2022 Indian Academy of Sciences, Bengaluru.