TY - JOUR
T1 - Quantifying the number of independent organelle DNA insertions in genome evolution and human health
AU - Hazkani-Covo, Einat
AU - Martin, William F.
N1 - Publisher Copyright: © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
PY - 2017/5
Y1 - 2017/5
N2 - Fragments of organelle genomes are often found as insertions in nuclear DNA. These fragments of mitochondrial DNA (numts) and plastid DNA (nupts) are ubiquitous components of eukaryotic genomes. They are, however, often edited out during the genome assembly process, leading to systematic underestimation of their frequency. Numts and nupts, once inserted, can become further fragmented through subsequent insertion of mobile elements or other recombinational events that disrupt the continuity of the inserted sequence relative to the genuine organelle DNA copy. Because numts and nupts are typically identified through sequence comparison tools such as BLAST, disruption of insertions into smaller fragments can lead to systematic overestimation of numt and nupt frequencies. Accurate identification of numts and nupts is important, however, both for better understanding of their role during evolution, and for monitoring their increasingly evident role in human disease. Human populations are polymorphic for 141 numt loci, five numts are causal to genetic disease, and cancer genomic studies are revealing an abundance of numts associated with tumor progression. Here, we report investigation of salient parameters involved in obtaining accurate estimates of numt and nupt numbers in genome sequence data. Numts and nupts from 44 sequenced eukaryotic genomes reveal lineage-specific differences in the number, relative age and frequency of insertional events as well as lineage-specific dynamics of their postinsertional fragmentation. Our findings outline the main technical parameters influencing accurate identification and frequency estimation of numts in genomic studies pertinent to both evolution and human health.
AB - Fragments of organelle genomes are often found as insertions in nuclear DNA. These fragments of mitochondrial DNA (numts) and plastid DNA (nupts) are ubiquitous components of eukaryotic genomes. They are, however, often edited out during the genome assembly process, leading to systematic underestimation of their frequency. Numts and nupts, once inserted, can become further fragmented through subsequent insertion of mobile elements or other recombinational events that disrupt the continuity of the inserted sequence relative to the genuine organelle DNA copy. Because numts and nupts are typically identified through sequence comparison tools such as BLAST, disruption of insertions into smaller fragments can lead to systematic overestimation of numt and nupt frequencies. Accurate identification of numts and nupts is important, however, both for better understanding of their role during evolution, and for monitoring their increasingly evident role in human disease. Human populations are polymorphic for 141 numt loci, five numts are causal to genetic disease, and cancer genomic studies are revealing an abundance of numts associated with tumor progression. Here, we report investigation of salient parameters involved in obtaining accurate estimates of numt and nupt numbers in genome sequence data. Numts and nupts from 44 sequenced eukaryotic genomes reveal lineage-specific differences in the number, relative age and frequency of insertional events as well as lineage-specific dynamics of their postinsertional fragmentation. Our findings outline the main technical parameters influencing accurate identification and frequency estimation of numts in genomic studies pertinent to both evolution and human health.
KW - Cancer genomics
KW - Mitochondria
KW - Numts
KW - Nupts
KW - Organelle insertions
UR - http://www.scopus.com/inward/record.url?scp=85026658042&partnerID=8YFLogxK
U2 - 10.1093/gbe/evx078
DO - 10.1093/gbe/evx078
M3 - Article
C2 - 28444372
AN - SCOPUS:85026658042
SN - 1759-6653
VL - 9
SP - 1190
EP - 1203
JO - Genome Biology and Evolution
JF - Genome Biology and Evolution
IS - 5
ER -