Genomic islands (GIs) are regions of a genome that are believed to have been acquired through horizontal gene transfer and are therefore likely to be compositionally distinct from the rest of the genome. Most of the genes located in a given GI encode a particular function. Depending on the genes they encode, GIs can be classified into categories such as 'metabolic islands', 'symbiotic islands', 'resistance islands', and 'pathogenicity islands'. Computational detection of GIs is well established, and many algorithms are available for it. We present a new method for the identification of GIs, termed Improved N-mer based Detection of Genomic Islands Using Sequence-clustering (INDeGenIUS). The method was applied to 400 completely sequenced species belonging to the Proteobacteria. Based on the genes they encode, the identified GIs were grouped into six categories: metabolic islands, symbiotic islands, resistance islands, secretion islands, pathogenicity islands, and motility islands. INDeGenIUS identified several new islands of interest that had been missed by earlier algorithms. The algorithm has potential application in identifying functionally relevant GIs in the large number of genomes now being sequenced. Investigation of the predicted GIs in pathogens may lead to the identification of potential drug or vaccine candidates.
Volume 44 | Issue 5