• Volume 37, Issue 5

November 2012, pages 807-919

• Editorial

• Advances in genetics and molecular breeding of three legume crops of semi-arid tropics using next-generation sequencing and high-throughput genotyping technologies

Molecular markers are the most powerful genomic tools to increase the efficiency and precision of breeding practices for crop improvement. Progress in the development of genomic resources in the leading legume crops of the semi-arid tropics (SAT), namely, chickpea (Cicer arietinum), pigeonpea (Cajanus cajan) and groundnut (Arachis hypogaea), as compared to other crop species like cereals, has been very slow. With the advances in next-generation sequencing (NGS) and high-throughput (HTP) genotyping methods, there is a shift in the development of genomic resources, including molecular markers, in these crops. For instance, 2,000 to 3,000 novel simple sequence repeat (SSR) markers have each been developed for chickpea, pigeonpea and groundnut. Based on Sanger, 454/FLX and Illumina transcript reads, transcriptome assemblies have been developed for chickpea (44,845 transcript assembly contigs, or TACs) and pigeonpea (21,434 TACs). Illumina sequencing of some parental genotypes of mapping populations has resulted in the development of 120 million reads for chickpea and 128.9 million reads for pigeonpea. Alignment of these Illumina reads with the respective transcriptome assemblies has provided > 10,000 SNPs each in chickpea and pigeonpea. A variety of SNP genotyping platforms including GoldenGate, VeraCode and Competitive Allele Specific PCR (KASPar) assays have been developed in chickpea and pigeonpea. Using the above resources, first-generation or comprehensive genetic maps have been developed in the three legume species mentioned above. Analysis of phenotyping data together with genotyping data has provided candidate markers for drought-tolerance-related root traits in chickpea, resistance to foliar diseases in groundnut and sterility mosaic disease (SMD) and fertility restoration in pigeonpea. Using these trait-associated markers together with those already available, molecular breeding programmes have been initiated for enhancing drought tolerance and resistance to fusarium wilt and ascochyta blight in chickpea, and resistance to foliar diseases in groundnut. These robust trait-associated markers, along with other genomic resources including genetic maps, will certainly accelerate crop improvement programmes in the SAT legumes.

• Large SNP arrays for genotyping in crop plants

Genotyping with large numbers of molecular markers is now an indispensable tool within plant genetics and breeding. Especially through the identification of large numbers of single nucleotide polymorphism (SNP) markers using the novel high-throughput sequencing technologies, it is now possible to reliably identify many thousands of SNPs at many different loci in a given plant genome. For a number of important crop plants, SNP markers are now being used to design genotyping arrays containing thousands of markers spread over the entire genome and to analyse large numbers of samples. In this article, we discuss aspects that should be considered during the design of such large genotyping arrays and the analysis of individuals. The fact that crop plants are also often autopolyploid or allopolyploid is given due consideration. Furthermore, we outline some potential applications of large genotyping arrays including high-density genetic mapping, characterization (fingerprinting) of genetic material and breeding-related aspects such as association studies and genomic selection.

• Application of large-scale sequencing to marker discovery in plants

Advances in DNA sequencing provide tools for efficient large-scale discovery of markers for use in plants. Discovery options include large-scale amplicon sequencing, transcriptome sequencing, gene-enriched genome sequencing and whole genome sequencing. Examples of each of these approaches and their potential to generate molecular markers for specific applications have been described. Sequencing the whole genome of parents identifies all the polymorphisms available for analysis in their progeny. Sequencing PCR amplicons of sets of candidate genes from DNA bulks can be used to define the available variation in these genes that might be exploited in a population or germplasm collection. Sequencing of the transcriptomes of genotypes varying for the trait of interest may identify genes with patterns of expression that could explain the phenotypic variation. Sequencing genomic DNA enriched for genes by hybridization with probes for all or some of the known genes simplifies sequencing and analysis of differences in gene sequences between large numbers of genotypes and genes, especially when working with complex genomes. Examples of application of the above-mentioned techniques have been described.

• Diversity in global maize germplasm: Characterization and utilization

Maize (Zea mays L.) is not only of worldwide importance as food, feed and a source of diverse industrially important products, but is also a model genetic organism with immense genetic diversity. Although it was first domesticated in Mexico, maize landraces are widely found across the continents. Several studies in Mexico and other countries highlighted the genetic variability in the maize germplasm. Applications of molecular markers, particularly in the last two decades, have led to new insights into the patterns of genetic diversity in maize globally, including landraces as well as wild relatives (especially teosintes) in Latin America, helping in tracking the migration routes of maize from the centers of origin, and understanding the fate of genetic diversity during maize domestication. The genome sequencing of B73 (a highly popular US Corn Belt inbred) and Palomero (a popcorn landrace in Mexico) in recent years are important landmarks in maize research, with significant implications for our understanding of maize genome organization and evolution. Next-generation sequencing and high-throughput genotyping platforms promise to further revolutionize our understanding of genetic diversity and the design of strategies to utilize the genomic information for maize improvement. However, the major limiting factor in exploiting the genetic diversity in crops like maize is no longer genotyping, but high-throughput and precision phenotyping. There is an urgent need to establish a global phenotyping network for comprehensive and efficient characterization of maize germplasm for an array of target traits, particularly for biotic and abiotic stress tolerance and nutritional quality. ‘Seeds of Discovery’ (SeeD), a novel initiative by CIMMYT with financial support from the Mexican Government for generating international public goods, has initiated intensive exploration of phenotypic and molecular diversity of maize germplasm conserved in the CIMMYT Gene Bank; this is expected to aid in effective identification and use of novel alleles and haplotypes for maize improvement. Multi-institutional efforts are required at the global level to systematically explore the maize germplasm to diversify the genetic base of elite breeding materials, create novel varieties and counter the effects of global climate change.

• Divergence of flowering genes in soybean

Soybean genome sequences were searched by BLAST with Arabidopsis thaliana regulatory genes involved in photoperiod-dependent flowering. This approach enabled the identification of 118 genes involved in the flowering pathway. Two genome sequences of cultivated (Williams 82) and wild (IT182932) soybeans were employed to survey functional DNA variations in the flowering-related homologs. Forty genes exhibiting nonsynonymous substitutions between G. max and G. soja were catalogued. In addition, 22 genes were found to co-localize with QTLs for six traits including flowering time, first flower, pod maturity, beginning of pod, reproductive period, and seed filling period. Among the genes overlapping the QTL regions, two LHY/CCA1 genes, GI and SFR6 contained amino acid changes. The recently duplicated sequence regions of the soybean genome were used as an additional criterion for inferring the putative function of the homologs. Two duplicated regions showed redundancy of both flowering-related genes and QTLs. ID 12398025, which contains the homeologous regions between chr 7 and chr 16, was redundant for the LHY/CCA1 and SPA1 homologs and the QTLs. Retention of the CRY1 gene and the pod maturity QTLs was observed in the duplicated region of ID 23546507 on chr 4 and chr 6. Functional DNA variation of the LHY/CCA1 gene (Glyma07g05410) was present in a counterpart of the duplicated region on chr 7, while the gene (Glyma16g01980) present in the other portion of the duplicated region on chr 16 did not show a functional sequence change. The gene list catalogued in this study provides primary insight for understanding the regulation of flowering time and maturity in soybean.

• Molecular markers in management of ex situ PGR - A case study

Worldwide germplasm collections contain about 7.4 million accessions of plant genetic resources for food and agriculture. One of the 10 largest ex situ genebanks in the world is located at the Leibniz Institute of Plant Genetics and Crop Plant Research in Gatersleben, Germany. Molecular tools have been used for various gene bank management practices including characterization and utilization of the germplasm. The results on genetic integrity of long-term-stored gene bank accessions of wheat (self-pollinating) and rye (open-pollinating) cereal crops revealed a high degree of identity for wheat. In contrast, the out-pollinating accessions of rye exhibited shifts in allele frequencies. The genetic diversity of wheat and barley germplasm collected at intervals of 40 to 50 years in comparable geographical regions showed a qualitative rather than a quantitative change in diversity. The inter- and intraspecific variation of seed longevity was analysed and differences were detected. Genetic studies in barley, wheat and oilseed rape revealed numerous QTL, indicating the complex and quantitative nature of seed longevity. Some of the loci identified were in genomic regions that co-localize with genes determining agronomic traits such as spike architecture or biotic and abiotic stress response. Finally, a genome-wide association mapping analysis of a core collection of wheat for flowering time was performed using diversity array technology (DArT) markers. Marker-trait associations were detected in genomic regions where major genes or QTL have been described earlier. In addition, new loci were also detected, providing opportunities to monitor genetic variation for crop improvement.

• Genetic mapping and coccidial parasites: past achievements and future prospects

Coccidial parasites including Cryptosporidium parvum, Cyclospora cayetanensis, Neospora caninum, Toxoplasma gondii and the Eimeria species can cause severe disease of medical and veterinary importance. As many as one-third of the human population may carry T. gondii infection, and Eimeria are thought to cost the global poultry production industry in excess of US$2 billion per annum. Despite their significance, effective vaccines are scarce and have been confined to the veterinary field. As sequencing and genotyping technologies continue to develop, genetic mapping remains a valuable tool for the identification of genes that underlie phenotypic traits of interest and the assembly of contiguous genome sequences. For the coccidia, cross-fertilization still requires in vivo infection, a feature of their life cycle which limits the use of genetic mapping strategies. Importantly, the development of population-based approaches has now removed the need to isolate clonal lines for genetic mapping of selectable traits, complementing the classical clone-based techniques. To date, four coccidial species, representing three genera, have been investigated using genetic mapping. In this review we will discuss recent progress with these species and examine the prospects for future initiatives.

• Molecular-based rapid inventories of sympatric diversity: A comparison of DNA barcode clustering methods applied to geography-based vs clade-based sampling of amphibians

Molecular markers offer a universal source of data for quantifying biodiversity. DNA barcoding uses a standardized genetic marker and a curated reference database to identify known species and to reveal cryptic diversity within well-sampled clades. Rapid biological inventories, e.g. rapid assessment programs (RAPs), unlike most barcoding campaigns, are focused on particular geographic localities rather than on clades. Because of the potentially sparse phylogenetic sampling, the addition of DNA barcoding to RAPs may present a greater challenge for the identification of named species or for revealing cryptic diversity. In this article we evaluate the use of DNA barcoding for quantifying lineage diversity within a single sampling site as compared to clade-based sampling, and present examples from amphibians. We compared algorithms for identifying DNA barcode clusters (e.g. species, cryptic species or Evolutionarily Significant Units) using previously published DNA barcode data obtained from geography-based sampling at a site in Central Panama, and from clade-based sampling in Madagascar. We found that clustering algorithms based on genetic distance performed similarly on sympatric as well as clade-based barcode data, while a promising coalescent-based method performed poorly on sympatric data. The various clustering algorithms were also compared in terms of speed and software implementation. Although each method has its shortcomings in certain contexts, we recommend the use of the ABGD method, which not only performs fairly well under either sampling method, but does so in a few seconds and with a user-friendly Web interface.
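To illustrate what clustering barcodes by genetic distance can look like, here is a minimal Python sketch. It is not ABGD or any other method evaluated in the article: it uses an uncorrected p-distance and single-linkage grouping at a fixed, arbitrarily chosen threshold. The sample names, sequences and the 0.1 threshold are hypothetical.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    # Compare only sites where both sequences have an unambiguous base.
    pairs = [(x, y) for x, y in zip(a, b) if x in "ACGT" and y in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def cluster_barcodes(seqs: dict, threshold: float) -> list:
    """Single-linkage clustering: link samples whose p-distance is <= threshold."""
    parent = {name: name for name in seqs}

    def find(n):
        # Union-find lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in combinations(seqs, 2):
        if p_distance(seqs[a], seqs[b]) <= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for name in seqs:
        clusters.setdefault(find(name), set()).add(name)
    # Sort clusters by their member names for a stable result.
    return sorted(clusters.values(), key=lambda c: sorted(c))


# Hypothetical toy alignment (20 sites), not real barcode data.
toy = {
    "frog_A1": "ACGTACGTACGTACGTACGT",
    "frog_A2": "ACGTACGTACGTACGTACGA",  # 1 difference from frog_A1
    "frog_B1": "TCGAACGTTCGTACCTACGG",  # 5 differences from frog_A1
}

if __name__ == "__main__":
    for cluster in cluster_barcodes(toy, threshold=0.1):
        print(sorted(cluster))
```

A fixed threshold is the simplest possible choice. Methods such as ABGD instead infer the gap between intra- and interspecific distances from the data, which this sketch does not attempt.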

• Exploring the correlations between sequence evolution rate and phenotypic divergence across the Mammalian tree provides insights into adaptive evolution

Sequence evolution behaves in a relatively consistent manner, leading to one of the fundamental paradigms in biology, the existence of a ‘molecular clock’. The molecular clock can be distilled to the concept of accumulation of substitutions through time, yielding a stable rate from which we can estimate lineage divergence. Over the last 50 years, evolutionary biologists have obtained an in-depth understanding of this clock’s nuances. It has been fine-tuned by taking into account the vast heterogeneity in rates across lineages and genes, leading to ‘relaxed’ molecular clock methods for timetree reconstruction. Sequence rate varies with life history traits including body size, generation time and metabolic rate, and we review recent studies on this topic. However, few studies have explicitly examined correlates between molecular evolution and morphological evolution. The patterns observed across diverse lineages suggest that rates of molecular and morphological evolution are largely decoupled. We discuss how identifying the molecular mechanisms behind rapid functional radiations is central to understanding evolution. The vast functional divergence within mammalian lineages that have relatively ‘slow’ sequence evolution refutes the hypothesis that pulses in diversification yielding major phenotypic change are the result of steady accumulation of substitutions. Patterns rather suggest phenotypic divergence is likely caused by regulatory alterations mediated through mechanisms such as insertions/deletions in functional regions. These can rapidly arise and sweep to fixation faster than predicted from a lineage’s sequence neutral substitution rate, enabling species to leapfrog between ‘phenotypic islands’. We suggest research directions that could illuminate mechanisms behind the functional diversity we see today.
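The idea of estimating lineage divergence from a stable substitution rate can be sketched with the simplest strict-clock relation, T = d / (2r), where d is the pairwise distance in substitutions per site and r is the rate per site per year along each lineage. The Python sketch below assumes a single constant rate and therefore ignores the rate heterogeneity that relaxed-clock methods model. The function name and the numbers are hypothetical, chosen only for illustration.

```python
def strict_clock_divergence_time(pairwise_distance: float, rate_per_site_per_year: float) -> float:
    """Estimate divergence time in years under a strict molecular clock.

    Substitutions accumulate along both lineages since their split,
    so the pairwise distance d equals 2 * r * T, giving T = d / (2 * r).
    """
    if pairwise_distance < 0:
        raise ValueError("distance must be non-negative")
    if rate_per_site_per_year <= 0:
        raise ValueError("rate must be positive")
    return pairwise_distance / (2.0 * rate_per_site_per_year)


if __name__ == "__main__":
    # Hypothetical values: 2% pairwise divergence, 1e-9 substitutions/site/year.
    years = strict_clock_divergence_time(0.02, 1e-9)
    print(f"Estimated divergence: {years / 1e6:.1f} million years ago")
```

With these illustrative inputs the estimate is about 10 million years. Doubling the assumed rate halves the estimate, which is why rate variation across lineages matters so much in practice.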

• Complex genetic origin of Indian populations and its implications

Indian populations are classified into various caste, tribe and religious groups, which together make them unique compared to the rest of the world. The long-term firm socio-religious boundaries and the strict endogamy practices, along with evolutionary forces, have further added to the existing high level of diversity. As a result, drawing definite conclusions on their overall origin, affinity, health and disease conditions becomes even more complicated than was thought earlier. In spite of these challenges, researchers have undertaken tireless and extensive investigations using various genetic markers to estimate genetic variation and its implications in health and disease. We have demonstrated that the Indian populations are the descendants of the very first modern humans, who undertook the out-of-Africa journey about 65,000 years ago. The recent gene flow from east and west Eurasia is also evident. Thus, this review attempts to summarize the unique genetic variation among Indian populations as evident from our extensive study of approximately 20,000 samples across India.

• # Journal of Biosciences
