      Here we present the first whole-genome analysis of an anonymous Kinh Vietnamese (KHV) trio whose genomes were deeply sequenced to 30-fold average coverage. The resulting short reads covered 99.91% of the human reference genome (GRCh37d5). We identified 4,719,412 SNPs and 827,385 short indels that were consistent with Mendelian inheritance. Of these, 109,914 (2.3%) SNPs and 59,119 (7.1%) short indels were novel. We also detected 30,171 structural variants, of which 27,604 (91.5%) were large indels. The child genome carried 6,681 large indels in the range 0.1–100 kbp that were also confirmed in either the father or mother genome. We compared these large indels against the DGV database and found that 1,499 (22.44%) were KHV-specific. De novo assembly of high-quality unmapped reads yielded 789 contigs of length ≥ 300 bp. Of the 235 contigs from the child genome, 199 (84.7%) significantly matched at least one contig from the father or mother genome. Aligning these 199 contigs with BLAST against other alternative human genomes revealed 4 novel contigs. The novel variants identified in our study demonstrate the need for more genome-wide studies, not only of the Kinh but also of other ethnic groups in Vietnam.

      Dang Thanh Hai1, Nguyen Dai Thanh1, Pham Thi Minh Trang1, Le Si Quang2, Phan Thi Thu Hang2, Dang Cao Cuong1, Hoang Kim Phuc1, Nguyen Huu Duc3, Do Duc Dong4, Bui Quang Minh5, Pham Bao Son1, Le Sy Vinh1,4

      1. University of Engineering and Technology, Vietnam National University Hanoi, Hanoi, Vietnam
      2. Wellcome Trust Center for Human Genetics, Oxford University, Oxford, UK
      3. High Performance Computing Center, Hanoi University of Science and Technology, Hanoi, Vietnam
      4. Information Technology Institute, Vietnam National University Hanoi, Hanoi, Vietnam
      5. Center for Integrative Bioinformatics Vienna, Max F. Perutz Laboratories, University of Vienna, Medical University of Vienna, Vienna, Austria
