Failure to recover major events of gene flux in real biological data due to method misapplication

Nils Kapust, Shijulal Nelson-Sathi, Barbara Schönfeld, Einat Hazkani-Covo, David Bryant, Peter J. Lockhart, Mayo Röttger, Joana C. Xavier, William F. Martin

Research output: Contribution to journal › Article › peer-review


In prokaryotes, known mechanisms of lateral gene transfer (transformation, transduction, conjugation, and gene transfer agents) generate new combinations of genes among chromosomes during evolution. In eukaryotes, whose host lineage is descended from archaea, lateral gene transfer from organelles to the nucleus occurs at endosymbiotic events. Recent genome analyses studying gene distributions have uncovered evidence for sporadic, discontinuous events of gene transfer from bacteria to archaea during evolution. Other studies have used traditional models designed to investigate gene family size evolution (Count) to support claims that gene transfer to archaea was continuous during evolution, rather than involving occasional periodic mass gene influx events. Here, we show that the methodology used in analyses favoring continuous gene transfers to archaea was misapplied in other studies and does not recover known events of single simultaneous origin for many genes followed by differential loss in real data: plastid genomes. Using the same software and the same settings, we reanalyzed presence/absence pattern data for proteins encoded in plastid genomes and for eukaryotic protein families acquired from plastids. Contrary to expectations under a plastid origin model, we found that the methodology employed inferred that gene acquisitions occurred uniformly across the plant tree. Sometimes as many as nine different acquisitions by plastid DNA were inferred for the same protein family. That is, the methodology that recovered gradual and continuous lateral gene transfer among lineages for archaea obtains the same result for plastids, even though it is known that massive gains followed by gradual differential loss is the true evolutionary process that generated plastid gene distribution data. Our findings caution against the use of models designed to study gene family size evolution for investigating gene transfer processes, especially when transfers involving more than one gene per event are possible.
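
The process the abstract describes for plastids, a single simultaneous origin of many genes followed by differential loss, can be illustrated with a toy simulation. The sketch below is not the analysis from the article and does not use Count; the tree shape, the gene names, the per-branch loss probability, and the function name `simulate_differential_loss` are all hypothetical choices made for illustration. It shows only that a loss-only process produces patchy presence/absence patterns in which no gene is ever gained after the origin.

```python
import random

# Hypothetical toy tree: nested tuples are internal nodes, strings are leaves.
TOY_TREE = ((("A", "B"), "C"), ("D", "E"))


def simulate_differential_loss(tree, ancestral_genes, loss_prob, seed=0):
    """Pass one ancestral gene set down the tree, with losses only.

    On every branch, each inherited gene is lost independently with
    probability loss_prob. No gene is ever gained after the root.
    Returns a dict mapping leaf name -> set of genes still present.
    """
    rng = random.Random(seed)
    leaves = {}

    def descend(node, inherited):
        # Sort so that the random draws are consumed in a reproducible order.
        kept = {g for g in sorted(inherited) if rng.random() >= loss_prob}
        if isinstance(node, str):
            leaves[node] = kept
        else:
            for child in node:
                descend(child, kept)

    for child in tree:
        descend(child, set(ancestral_genes))
    return leaves


if __name__ == "__main__":
    genes = ["g1", "g2", "g3", "g4", "g5", "g6"]
    pattern = simulate_differential_loss(TOY_TREE, genes, loss_prob=0.3, seed=1)
    # Print a presence/absence matrix: one row per leaf, one column per gene.
    for leaf in sorted(pattern):
        row = "".join("1" if g in pattern[leaf] else "0" for g in genes)
        print(leaf, row)
```

Every 1 in the printed matrix traces back to the single ancestral gene set, so a method that explains such a matrix with several independent gains of the same gene is misreading a loss-only history.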

Original language: English
Pages (from-to): 1198-1209
Number of pages: 12
Journal: Genome Biology and Evolution
Issue number: 5
State: Published - 1 May 2018

Bibliographical note

Publisher Copyright:
© The Author(s) 2018.


  • Archaea
  • Evolutionary models
  • LGT
  • Plastid genomes

