TY - JOUR
T1 - Failure to recover major events of gene flux in real biological data due to method misapplication
AU - Kapust, Nils
AU - Nelson-Sathi, Shijulal
AU - Schönfeld, Barbara
AU - Hazkani-Covo, Einat
AU - Bryant, David
AU - Lockhart, Peter J.
AU - Röttger, Mayo
AU - Xavier, Joana C.
AU - Martin, William F.
N1 - Publisher Copyright:
© The Author(s) 2018.
PY - 2018/5/1
Y1 - 2018/5/1
N2 - In prokaryotes, known mechanisms of lateral gene transfer (transformation, transduction, conjugation, and gene transfer agents) generate new combinations of genes among chromosomes during evolution. In eukaryotes, whose host lineage is descended from archaea, lateral gene transfer from organelles to the nucleus occurs at endosymbiotic events. Recent genome analyses studying gene distributions have uncovered evidence for sporadic, discontinuous events of gene transfer from bacteria to archaea during evolution. Other studies have used traditional models designed to investigate gene family size evolution (Count) to support claims that gene transfer to archaea was continuous during evolution, rather than involving occasional periodic mass gene influx events. Here, we show that the methodology used in analyses favoring continuous gene transfers to archaea was misapplied in other studies and does not recover known events of single simultaneous origin for many genes followed by differential loss in real data: plastid genomes. Using the same software and the same settings, we reanalyzed presence/absence pattern data for proteins encoded in plastid genomes and for eukaryotic protein families acquired from plastids. Contrary to expectations under a plastid origin model, we found that the methodology employed inferred that gene acquisitions occurred uniformly across the plant tree. Sometimes as many as nine different acquisitions by plastid DNA were inferred for the same protein family. That is, the methodology that recovered gradual and continuous lateral gene transfer among lineages for archaea obtains the same result for plastids, even though it is known that massive gains followed by gradual differential loss is the true evolutionary process that generated plastid gene distribution data.
Our findings caution against the use of models designed to study gene family size evolution for investigating gene transfer processes, especially when transfers involving more than one gene per event are possible.
AB - In prokaryotes, known mechanisms of lateral gene transfer (transformation, transduction, conjugation, and gene transfer agents) generate new combinations of genes among chromosomes during evolution. In eukaryotes, whose host lineage is descended from archaea, lateral gene transfer from organelles to the nucleus occurs at endosymbiotic events. Recent genome analyses studying gene distributions have uncovered evidence for sporadic, discontinuous events of gene transfer from bacteria to archaea during evolution. Other studies have used traditional models designed to investigate gene family size evolution (Count) to support claims that gene transfer to archaea was continuous during evolution, rather than involving occasional periodic mass gene influx events. Here, we show that the methodology used in analyses favoring continuous gene transfers to archaea was misapplied in other studies and does not recover known events of single simultaneous origin for many genes followed by differential loss in real data: plastid genomes. Using the same software and the same settings, we reanalyzed presence/absence pattern data for proteins encoded in plastid genomes and for eukaryotic protein families acquired from plastids. Contrary to expectations under a plastid origin model, we found that the methodology employed inferred that gene acquisitions occurred uniformly across the plant tree. Sometimes as many as nine different acquisitions by plastid DNA were inferred for the same protein family. That is, the methodology that recovered gradual and continuous lateral gene transfer among lineages for archaea obtains the same result for plastids, even though it is known that massive gains followed by gradual differential loss is the true evolutionary process that generated plastid gene distribution data.
Our findings caution against the use of models designed to study gene family size evolution for investigating gene transfer processes, especially when transfers involving more than one gene per event are possible.
KW - Archaea
KW - Evolutionary models
KW - LGT
KW - Plastid genomes
UR - http://www.scopus.com/inward/record.url?scp=85048261903&partnerID=8YFLogxK
U2 - 10.1093/gbe/evy080
DO - 10.1093/gbe/evy080
M3 - Article
C2 - 29718211
AN - SCOPUS:85048261903
SN - 1759-6653
VL - 10
SP - 1198
EP - 1209
JO - Genome Biology and Evolution
JF - Genome Biology and Evolution
IS - 5
ER -