
Estimating the timing of early eukaryotic diversification with multigene molecular clocks


Laura Wegener Parfrey (a,b,2), Daniel J. G. Lahr (a,b), Andrew H. Knoll (c,1), and Laura A. Katz (a,b,1)
(a) Program in Organismic and Evolutionary Biology, University of Massachusetts, Amherst, MA 01003; (b) Department of Biological Sciences, Smith College, Northampton, MA 01063; and (c) Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138

Contributed by Andrew H. Knoll, July 1, 2011 (sent for review February 9, 2011)

Author contributions: L.W.P. and L.A.K. designed research; L.W.P. and D.J.G.L. performed research; L.W.P., D.J.G.L., A.H.K., and L.A.K. analyzed data; and L.W.P., D.J.G.L., A.H.K., and L.A.K. wrote the paper.
The authors declare no conflict of interest.
Freely available online through the PNAS open access option.
1 To whom correspondence may be addressed. E-mail: aknoll@oeb.harvard.edu or lkatz@smith.edu
2 Present address: Department of Chemistry and Biochemistry, University of Colorado, Boulder, CO 80309.
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1110633108/-/DCSupplemental.

Although macroscopic plants, animals, and fungi are the most familiar eukaryotes, the bulk of eukaryotic diversity is microbial. Elucidating the timing of diversification among the more than 70 lineages is key to understanding the evolution of eukaryotes. Here, we use taxon-rich multigene data combined with diverse fossils and a relaxed molecular clock framework to estimate the timing of the last common ancestor of extant eukaryotes and the divergence of major clades. Overall, these analyses suggest that the last common ancestor lived between 1866 and 1679 Ma, consistent with the earliest microfossils interpreted with confidence as eukaryotic. During this interval, the Earth's surface differed markedly from today; for example, the oceans were incompletely ventilated, with ferruginous and, after about 1800 Ma, sulfidic water masses commonly lying beneath moderately oxygenated surface waters. Our time estimates also indicate that the major clades of eukaryotes diverged before 1000 Ma, with most or all probably diverging before 1200 Ma. Fossils, however, suggest that diversity within major extant clades expanded later, beginning about 800 Ma, when the oceans began their transition to a more modern chemical state. In combination, paleontological and molecular approaches indicate that long stems preceded diversification in the major eukaryotic lineages.

microbial eukaryotes | Proterozoic oceans | taxon sampling | origin of eukaryotes

The antiquity of eukaryotes and the tempo of early eukaryotic diversification remain open questions in evolutionary biology. Proposed dates for the origin of the domain based on the fossil record and molecular clock analyses differ by up to 2 billion years (1). Microfossils attributed to eukaryotes occur at about 1800 Ma (2) and putative biomarkers of early eukaryotes have been found in 2700 Ma rocks (3). Such geological interpretations contrast with both molecular clock studies that place the origin of eukaryotes at 1250–850 Ma (4, 5), and a controversial hypothesis that rejects the eukaryotic interpretation of all older fossils and places eukaryogenesis at 850 Ma (6, 7).

Paleontologists generally agree that an unambiguous record of eukaryotic microfossils extends back to ∼1800 Ma (2, 8, 9). Microfossils of this age are assigned to eukaryotes because they combine informative characters that include complex morphology (e.g., the presence of processes and evidence for real-time modification of vegetative morphology), complex wall ultrastructure, and specific inferred behaviors (2, 9, 10). Despite being interpreted as eukaryotic, the taxonomic affinities of these fossils remain unclear (2). Eukaryotic fossils that can be assigned to extant taxonomic groups begin to appear ∼1200 Ma (11) and become more widespread, abundant, and diverse in rocks ∼800 Ma and younger (2, 12, 13).

Molecular estimation of divergence times has improved dramatically in recent years due to the development of methods that incorporate uncertainty from sources that include phylogenetic reconstruction, fossil calibrations, and heterogeneous rates of molecular evolution (1, 14, 15). Relaxed clock approaches account for heterogeneity in evolutionary rates across branches and enable the use of complex models of sequence evolution (reviewed in refs. 16 and 17), although debate continues as to the best method for relaxing the clock (18–20). The process of calibrating molecular clocks has also been greatly improved with both the recognition that single calibration points are insufficient (21, 22), and the availability of methods that incorporate uncertainty from the fossil record by specifying calibrations as time distributions rather than points (15, 16). Additional limitations in previous molecular clock studies of eukaryotes stem from the tradeoff between analyses of many taxa and calibration points but only a single gene (4), and analyses of many genes but a small number of taxa and calibrations (5, 23).

Molecular clock estimates rely on robust phylogenies. Reconstructions of relationships among eukaryotes have begun to stabilize in recent years with the increasing availability of multigene data from diverse lineages (24–26). The majority of the >70 lineages of eukaryotes fall within four major groups: Opisthokonta; Excavata; Amoebozoa; and Stramenopiles, Alveolates, and Rhizaria (SAR) (25, 26), while the placement of some photosynthetic lineages remains controversial (25, 27, 28). Greater data availability also yields more accurate estimates of divergence times because more nodes are available for calibration (29).

The availability of taxon- and gene-rich datasets coupled with flexible molecular clock methods makes this an ideal time to revisit the timing of early eukaryotic evolution. Here, broadly sampled multigene trees are used to estimate dates, with rate heterogeneity across the tree and among genes incorporated into the model. We use 23 calibration points derived from diverse fossils of Proterozoic and Phanerozoic age specified as prior distributions (Table 1). The Proterozoic fossil record is sparse (2, 8, 9), and the taxonomic assignment of some Proterozoic fossils has been called into question by a minority of researchers (6). In the spirit of testing these ideas, we assess the impact of including calibration constraints derived from Phanerozoic fossils alone and Phanerozoic plus Proterozoic fossils. We also assess divergence dates across analyses that varied in the position of the root, and the number of taxa included, as well as across different software platforms and models.

Results
Taxon-rich analyses of multiple genes reveal a stability in divergence dates across the eukaryotic tree of life that is robust to changing taxon inclusion, position of the root, molecular clock model, and choice of calibration points (Phanerozoic only or both Phanerozoic and Proterozoic fossils). Collectively, these analyses place the mean age of the root of extant eukaryotes at 1866–1679 Ma in analyses including both Proterozoic and Phanerozoic calibrations ("All" analyses; Fig. 1A and Table S1). Varying the position of the root had little impact on divergence

13624–13629 | PNAS | August 16, 2011 | vol. 108 | no. 33 www.pnas.org/cgi/doi/10.1073/pnas.1110633108


Table 1. Calibration constraints for dating the eukaryotic tree of life

Taxon              Fossil                      Eon*      Min (Ma)   Dist† (shape, scale)   Ref(s).
Amniota            Westlothiana                Phan      328.3      4, 3                   (54)
Angiosperms        Oldest angio pollen         Phan      133.9      2, 10                  (55)
Ascomycetes        Paleopyrenomycites          Phan      400        4, 50                  (56)
Coccolithophores   Earliest Heterococcolith    Phan      203.6      2, 8                   (57)
Diatoms            Earliest diatoms            Phan      133.9      2, 100                 (58)
Dinoflagellates    Earliest gonyaulacales      Phan      240        2, 10                  (59)
Embryophytes       Land plant spores           Phan      471        2, 20                  (60)
Endopterygota      Mecoptera                   Phan      284.4      5, 5                   (61)
Eudicots           Eudicot pollen              Phan      125        2, 1.5                 (62, 63)
Euglenids          Moyeria                     Phan      450        2, 40                  (64)
Foraminifera       Oldest forams               Phan      542        2, 200                 (65)
Gonyaulacales      Gonyaulacaceae split        Phan      196        2, 10                  (59)
Pennate diatoms    Oldest pennate              Phan      80         3, 5                   (66)
Spirotrichs        Oldest tintinnids           Phan      444        2.5, 100               (67)
Tracheophytes      Earliest tracheophytes      Phan      425        4, 2.5                 (68)
Vertebrates        Haikouichthys               Phan      520        3, 5                   (69)
Animals            LOEMs, sponge biomarkers    Protero   632        2, 300                 (70, 71)
Arcellinida        Paleoarcella                Protero   736        2, 300                 (12)
Bilateria          Kimberella                  Protero   555        2, 30                  (72)
Chlorophytes       Palaeastrum                 Protero   700        2.5, 300               (73)
Ciliates           Gammacerane                 Protero   736        2.5, 300               (74)
Florideophyceae    Doushantuo red algae        Protero   550        2.5, 100               (75)
Red algae‡         Bangiomorpha                Protero   1174       3, 250                 (11)

*Eon: Phan, Phanerozoic; Protero, Proterozoic. Proterozoic calibrations are excluded from Phan analyses.
†Calibration constraints are specified for BEAST using a gamma distribution with a minimum date in Ma based on the fossil record, with parameters as indicated: Min, minimum divergence date; Dist, gamma prior distribution (shape, scale). See Table S3 for details of PhyloBayes calibrations.
‡In the All 720 analysis (c), the minimum age constraint for the red algae node is set to 720 Ma.
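
To make the calibration columns of Table 1 concrete, the following is a minimal sketch of what a (Min, shape, scale) triple implies for a node age. It assumes the prior is a gamma distribution shifted by the minimum age, i.e., age = Min + Gamma(shape, scale); that reading of the table footnote, and the function name `offset_gamma_summary`, are assumptions of this sketch rather than a description of the BEAST configuration actually used. Only three rows of Table 1 are shown.

```python
import math

# (taxon, minimum age in Ma, gamma shape, gamma scale), copied from Table 1.
CALIBRATIONS = [
    ("Angiosperms", 133.9, 2.0, 10.0),
    ("Bilateria", 555.0, 2.0, 30.0),
    ("Red algae", 1174.0, 3.0, 250.0),
]


def offset_gamma_summary(min_age, shape, scale):
    """Summarize age = min_age + Gamma(shape, scale) (assumed parameterization).

    Returns (mean, mode, standard deviation); mean and mode are in Ma,
    the standard deviation is a duration in myr.
    """
    mean = min_age + shape * scale
    # The gamma mode is (shape - 1) * scale when shape >= 1, otherwise 0.
    mode = min_age + max(shape - 1.0, 0.0) * scale
    sd = math.sqrt(shape) * scale
    return mean, mode, sd


for taxon, min_age, shape, scale in CALIBRATIONS:
    mean, mode, sd = offset_gamma_summary(min_age, shape, scale)
    print(f"{taxon}: min {min_age} Ma, prior mean {mean:.1f} Ma, "
          f"mode {mode:.1f} Ma, sd {sd:.1f} myr")
```

Under this assumption, a small scale (e.g., angiosperms) keeps the prior close to the fossil minimum, whereas a large scale (e.g., red algae) gives a long tail toward older ages.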

dates, especially for the estimated date of the root itself, which generally changed by <100 million years (myr; Fig. 1A). PhyloBayes estimates generally showed more uncertainty than those from BEAST analyses, but around similar means. Similarly, estimates were robust to changing models (uncorrelated or autocorrelated) and to the inclusion of only Phanerozoic (Phan) or all calibrations (All) with one exception: under the autocorrelated Cox–Ingersoll–Ross (CIR) model, estimates are much more recent in Phan analyses (1038 Ma and 1180 Ma; Fig. 1A).

Impact of Calibration Constraints on Estimates of the Origin of Extant Eukaryotes. We assessed the impact of including Proterozoic fossils, which are considered controversial by some (6, 7), by analyzing datasets without these seven calibration constraints (Phan analyses). In BEAST analyses, the exclusion of Proterozoic fossils shifted estimated divergence times toward the present, but not dramatically so: estimates for the mean age of the root of extant eukaryotes fall between 1506 and 1471 Ma in Phan analyses [95% highest-probability density (HPD) range 1643–1347 Ma; Fig. 1A, Figs. S1, S5, and S7, analyses b, f, and h] compared with 1837–1717 Ma (95% HPD range 1954–1601 Ma; Figs. 1A and 2 and Figs. S4 and S6; analyses a, e, and g) when Proterozoic fossils were included (All analyses). Similar dates were recovered in Phan and All PhyloBayes analyses when the uncorrelated gamma model (UGAM) of the molecular clock was assumed (Fig. 1A, analyses i–l).

Of the seven Proterozoic calibration points used in our analyses, only the Bangiomorpha point is controversial in terms of either systematic attribution or age. The Bangiomorpha calibration constraint is more than 400 myr older than our other Proterozoic constraints (Table 1). To determine whether this calibration point drives results in analyses with All calibrations, we assessed the age of the root with a much more conservative estimate for the age of this red alga (All 720; Fig. 1, analysis c). A number of factors place the age of Bangiomorpha ∼1200 Ma (SI Text); however, given the importance of the fossil we assigned an age of 720 Ma to this constraint, representing the absolute younger bound of the Hunting Formation, Canada, in which it is found (SI Text) (11). In BEAST, placing the Bangiomorpha constraint at 720 Ma shifted the estimated age of the root by only 95 myr toward the present (Fig. 1A and Fig. S3, analysis c).

The autocorrelated CIR model combined with the low number of substitutions on deep branches of the eukaryotic tree appears more sensitive to the distribution of calibration dates included in these analyses. Under the CIR autocorrelated model, a consistent age was estimated with All calibrations included (1798–1691 Ma; Fig. 1A, analyses m and o), although confidence intervals are greater in PhyloBayes analyses in general (Fig. 1A, analyses i–p). However, excluding Proterozoic calibration points did cause estimated ages to shift more than 600 myr younger under the CIR model (1180–1038 Ma; Fig. 1A, analyses n and p), pushing the estimated age for the root of extant eukaryotes younger than the widely accepted date for the Bangiomorpha fossils. Similarly, the CIR analyses in PhyloBayes were sensitive to the age of the Bangiomorpha constraint, shifting more than 500 myr younger to 1296 Ma and 1167 Ma in analyses with All calibration points rooted with Opisthokonta and "Unikonta," respectively (Dataset S1). The necessity of using PhyloBayes to explore the differences between autocorrelated and uncorrelated models introduces confounding factors, as PhyloBayes requires both uniform distributions around calibration points and a fixed tree topology. Given that calibration points are likely best represented by more informative distributions, and that the topology of the tree is not fully known, we focus the rest of our discussions on the results from BEAST, although data from all PhyloBayes analyses are available in Fig. 1A and Dataset S1.

Origin of Major Clades. In most analyses, the major clades of extant eukaryotes diverged before 1200 Ma, with SAR, Excavata, and Amoebozoa arising within a similar time frame, as evidenced by overlapping 95% HPD ranges (Figs. 1 and 2, Figs. S1–S7, and Dataset S1). The 95% HPD intervals are wider for clades with few

[Figure 1: two panels, A and B, each with a vertical age axis running from 800 to 2400 Ma. Panel A groups the analyses by software (BEAST, PhyloBayes) and by clock model (uncorrelated, autocorrelated). The rows beneath panel A give the root position and calibration set of each analysis:]

Analysis      a    b    c    d    e    f    g    h    i    j    k    l    m    n    o    p
Root          op   op   op   op   est  est  un   un   op   op   un   un   op   op   un   un
Calibration   All  Ph   720  All  All  Ph   All  Ph   All  Ph   All  Ph   All  Ph   All  Ph

Fig. 1. Summary of mean divergence dates for the most recent common ancestor of major clades of extant eukaryotes. Letters are at the mean divergence time and denote analyses, as detailed in Table S1. Error bars represent 95% HPD for BEAST analyses (a–h) and the 95% confidence interval for PhyloBayes (analyses i–p). (A) Estimated age of the root of extant eukaryotes across analyses. Root position: Opis, root constrained to Opisthokonta; Uni, root constrained to "Unikonta"; Estim, root estimated by BEAST. Calibration: All, all Phanerozoic and Proterozoic CCs; Phan, Phanerozoic CCs only; 720, All CCs with the minimum age of red algae set to 720 Ma. d = 91 taxa. (B) Estimated ages of major clades from BEAST analyses.

calibration points, such as Excavata and Amoebozoa (Fig. 1B). Estimates for the last common ancestor of extant Opisthokonta are younger than those for the other clades, at 1389–1240 Ma in analyses with All calibration constraints.

Exclusion of Proterozoic calibration constraints (Phan analyses) shifted age estimates for the origins of major extant eukaryotic clades younger by 200–300 myr (Fig. 1B). Differences in divergence times are relatively small for nested clades; e.g., the 95% HPD for Alveolata shifts from 1445–1236 Ma in analysis a (Fig. 2) to 1206–1020 Ma with only Phanerozoic calibration points (analysis b; Fig. S1). Not surprisingly, the differing calibration schemes had their most dramatic impact on the estimated age of the red algae, which changes from 1285–1180 Ma 95% HPD (Fig. 2) to 959–625 Ma 95% HPD when Proterozoic calibration points, including the constraint on red algae at 1174 Ma in accordance with the widely cited age for Bangiomorpha, are excluded (Fig. S1). Estimated ages of major clades were also much younger in analyses using the CIR model with Phan calibrations (analyses n and p; Dataset S1).

The topology of the eukaryotic tree produced through coestimation of phylogeny and divergence times in BEAST is broadly consistent with other analyses (SI Text) (25, 26). Hence, the BEAST topology was also used for the PhyloBayes analyses, which require a fixed topology. Though the relationships among the photosynthetic eukaryotes remain uncertain (25), our analyses suggest that many photosynthetic clades, such as red and green algae, diverged within a similar time frame (Fig. 2). These results imply an early acquisition of photosynthesis in eukaryotes, in accordance with both previous molecular clock estimates (30) and the ∼1200 Ma age assigned to the red algal fossil Bangiomorpha (11).

Discussion
The molecular clock analyses presented here suggest that the last common ancestor of extant eukaryotes lived between 1866 and 1679 Ma when both Phanerozoic and Proterozoic fossils are considered. We favor these more-inclusive analyses as they should reveal a more accurate picture of eukaryotic diversification, especially because the chosen fossils are widely accepted by paleontologists, and calibration constraints were assigned in a conservative manner that accounts for age uncertainties. Estimated ages are younger when we remove Proterozoic calibration constraints, though not dramatically so, with the notable exception of the autocorrelated model CIR as implemented in PhyloBayes with only Phanerozoic calibrations. Thus, our results tend to place the last common ancestor of extant eukaryotes deep within the Proterozoic Eon.

Our estimates for the timing of the origin of extant eukaryotes are in line with fossil evidence (2, 13), but reject the hypothesis that eukaryotes originated only 850 Ma (6, 7). Fossils provide minimum dates, leaving open the possibility that clades evolved much earlier than their first fossil appearance (2, 31). Thus, it is not surprising that divergence times for many eukaryotic clades are older than their first unambiguous fossil occurrence (Table 2). The paleontological literature contains some references to eukaryotic fossils older than our estimate of the last common ancestor. In some cases, these paleontological reports are incorrect or ambiguous. For example, large carbonaceous fossils assigned to the genus Grypania were originally reported to be older than our molecular clock estimate (32), but more recent radiometric dates indicate an age of 1874 ± 9 Ma (33), consistent with the clock analyses presented here. Older still are the 50- to 300-μm spheroidal microfossils described from ∼3200 Ma rocks by Javaux et al. (34), and proposed as possible eukaryotes by Buick (35), and sterane biomarkers from 2700 Ma shales (3). Whether these materials record Archean eukaryotes remains a subject of debate (34, 36). Our molecular clock estimates suggest that if these fossils do represent eukaryotes, they record stem lineages (early representatives of eukaryotic groups that went extinct) that were present before the emergence of extant eukaryotic clades.

The major lineages of extant eukaryotes (Opisthokonta, SAR, Excavata, and Amoebozoa) are projected to have diverged from one another by the Mesoproterozoic era (1600–1000 Ma), relatively early in the history of the domain (Fig. 1 and Table 2). This, in turn, suggests that these lineages were present for hundreds of millions of years before the observed increase in the abundance and diversity of eukaryotic microfossils beginning ∼800 Ma (2, 37–40). Our molecular clock estimates indicate that stem groups were present well before recognizable members of crown lineages (monophyletic groups consisting of living representatives and their ancestors) diversified. A similar pattern of long stems preceding diversification is seen in animals and plants and may be a consistent pattern in evolution (38).

Fossils and our molecular clock analyses agree that eukaryotes originated and diversified during a time when oceans differed substantially from the modern seas. Increasingly, geochemical data indicate that for much of the Proterozoic eon, mildly oxic surface waters lay above an oxygen-minimum zone that was persistently anoxic and commonly sulfidic (41, 42). Such conditions are compatible with scenarios for eukaryogenesis that rely on anaerobic methanogens in symbiotic partnership with facultatively aerobic proteobacteria or sulfate reducers (see references in ref. 43), because facultatively anaerobic mitochondria may have enabled early eukaryotes to live in the sulfidic Proterozoic oceans (44). Because sulfide interferes with the function of mitochondria in aerobically respiring eukaryotes, the radiation of diverse species within eukaryotic clades may have become possible only when sulfidic subsurface waters began to wane about 800 Ma (45). Alternatively, early eukaryotic evolution may have occurred in coastal environments sheltered from the impact of sulfidic waters or in freshwater systems, which are both poorly sampled by the geologic record and not impacted by sulfidic oceanic water masses (46). Consistent with this view, moderately diverse assemblages of fossil eukaryotes occur in well-ventilated lake deposits of the 1200 to 900 Ma Torridonian succession, Scotland (47, 48), and in coastal marine deposits of the ∼1500 to 1400 Ma Roper Group, Australia (49).

Within Proterozoic oceans, low concentrations of biologically available nitrogen may also have inhibited the diversification of photosynthetic eukaryotes (50). Many cyanobacteria and other photosynthetic bacteria are capable of nitrogen fixation, ameliorating the impact of nitrate and ammonia limitation on primary production. Eukaryotes, however, have no such capacity; thus, it may not be a coincidence that biomarkers indicating an expanding importance of algae in marine primary production occur in conjunction with geochemical data recording the spread of oxygen through later Neoproterozoic oceans (51). In our analyses, the clade that contains extant photosynthetic taxa, including green algae plus land plants and red algae, arose between 1670 and 1428 Ma, but diversification within these lineages occurred later in the Neoproterozoic and may correspond to a changing redox profile in the oceans (Fig. 2).

[Figure 2: time-calibrated tree. Upper scale: Paleoproterozoic, Mesoproterozoic, Neoproterozoic, Phanerozoic. Lower scale: 2000, 1750, 1500, 1250, 1000, 750, 500, 250, 0 Ma.]
Clade labels on the figure, top to bottom: SAR (Alveolates, Rhizaria, Stramenopiles), Haptophytes, Green algae, Cryptomonads, Red algae, Glaucocystophytes, Excavata, Amoebozoa, Opisthokonta.
Tip labels, top to bottom: Heterocapsa rotundata, Alexandrium tamarense, Crypthecodinium cohnii, Karenia brevis, Oxyrrhis marina, Perkinsus marinus, Theileria parva, Plasmodium berghei, Toxoplasma gondii, Eimeria tenella, Stylonychia lemnae, Sterkiella histriomuscorum, Nyctotherus ovalis, Paramecium tetraurelia, Tetrahymena thermophila, Chilodonella uncinata, Reticulomyxa filosa, Ovammina opaca, Plasmodiophora brassicae, Bigelowiella natans, Gromia, Corallomyxa tenera, Heteromita globosa, Thalassiosira pseudonana, Phaeodactylum tricornutum, Aureococcus anophagefferens, Heterosigma akashiwo, Ectocarpus siliculosus, Apodachlya brachynema, Phytophthora infestans, Isochrysis galbana, Emiliania huxleyi, Prymnesium parvum, Pavlova lutheri, Oryza sativa, Arabidopsis thaliana, Welwitschia mirabilis, Ginkgo biloba, Physcomitrella patens, Mesostigma viride, Volvox carteri, Chlamydomonas reinhardtii, Dunaliella salina, Acetabularia acetabulum, Micromonas pusilla, Ostreococcus tauri, Goniomonas, Guillardia theta, Leucocryptos marina, Gracilaria changii, Chondrus crispus, Porphyra yezoensis, Cyanidioschyzon merolae, Glaucocystis nostochinearum, Cyanophora paradoxa, Trypanosoma brucei, Leishmania major, Bodo saltans, Diplonema papillatum, Euglena longa, Euglena gracilis, Entosiphon sulcatum, Jakoba libera, Reclinomonas americana, Seculamonas ecuadoriensis, Naegleria gruberi, Sawyeria marylandensis, Trichomonas vaginalis, Giardia duodenalis, Spironucleus barkhanus, Carpediemonas membranifera, Monocercomonoides sp., Streblomastix strix, Trimastix pyriformis, Malawimonas californiana, Malawimonas jakobiformis, Acanthamoeba castellanii, Hartmannella vermiformis, Arcella hemisphaerica, Rhizamoeba sp., Entamoeba histolytica, Mastigamoeba balamuthi, Dictyostelium discoideum, Physarum polycephalum, Capitella capitata, Aplysia californica, Schistosoma mansoni, Apis mellifera, Drosophila melanogaster, Caenorhabditis elegans, Gallus gallus, Homo sapiens, Branchiostoma floridae, Mnemiopsis leidyi, Oscarella carmela, Aphrocallistes vastus, Nematostella vectensis, Monosiga brevicollis, Amoebidium parasiticum, Sphaeroforma arctica, Capsaspora owczarzaki, Candida albicans, Saccharomyces cerevisiae, Schizosaccharomyces pombe, Phanerochaete chrysosporium, Ustilago maydis, Glomus intraradices, Allomyces macrogynus, Spizellomyces punctatus.

Fig. 2. Time-calibrated tree of extant eukaryotes using All calibration points, 109 taxa, and root constrained to Opisthokonta. Nodes are at mean divergence times and gray bars represent 95% HPD of node age. (Upper) Geological time scale; (Lower) Absolute time scale in Ma. Thick vertical bars demarcate eras and thin vertical lines denote periods, with dates derived from the 2009 International Stratigraphic Chart. Node calibrated with Phanerozoic fossils ( ); node calibrated with Proterozoic fossils (◯). Estimated ages of calibrated nodes differ from calibration constraints (Table 1) because they have been modified by relaxed clock analysis of sequence data.
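
The geological intervals used throughout the text can be sketched as a simple lookup that assigns an age in Ma to the interval labeled on the upper time scale of Fig. 2. The Paleoproterozoic (2500–1600 Ma) and Mesoproterozoic (1600–1000 Ma) bounds are the ones quoted in the text; the 542 Ma base of the Phanerozoic is an assumption of this sketch, taken to match the 2009 chart cited in the Fig. 2 caption. The function name `interval_of` and the choice to assign a boundary age to the older interval are illustrative only.

```python
# (interval name, older bound in Ma, younger bound in Ma), oldest first.
# Proterozoic bounds are quoted in the text; 542 Ma is an assumed value
# for the base of the Phanerozoic (2009 chart).
INTERVALS = [
    ("Paleoproterozoic", 2500.0, 1600.0),
    ("Mesoproterozoic", 1600.0, 1000.0),
    ("Neoproterozoic", 1000.0, 542.0),
    ("Phanerozoic", 542.0, 0.0),
]


def interval_of(age_ma):
    """Return the interval containing age_ma (boundary ages go to the older interval)."""
    if age_ma < 0.0 or age_ma > INTERVALS[0][1]:
        raise ValueError(f"age {age_ma} Ma is outside 0-2500 Ma")
    for name, _older, younger in INTERVALS:
        if age_ma >= younger:
            return name
    raise AssertionError("unreachable: 0 Ma falls in the last interval")


# Root of extant eukaryotes (All analyses) and the microfossil expansion.
for label, age in [("root, older mean", 1866.0),
                   ("root, younger mean", 1679.0),
                   ("microfossil expansion", 800.0)]:
    print(f"{label}: {age:.0f} Ma -> {interval_of(age)}")
```

With these bounds, both ends of the 1866–1679 Ma range for the root fall in the Paleoproterozoic, while the ∼800 Ma increase in microfossil diversity falls in the Neoproterozoic.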

Table 2. Comparison of major node ages to fossil dates

Major clade         Estimated age, Ma   Oldest fossil, Ma   Ref.
Eukaryotes          *                   1800                (2)
Extant eukaryotes   1679–1866           1200                (11)
Amoebozoa           1384–1624           800                 (12)
Excavata            1510–1699           450                 (64)
Opisthokonta        1240–1481           632                 (71)
Rhizaria            1017–1256           550                 (65)
SAR                 1365–1577           736                 (74)

Estimated age is the range of mean dates from All analyses.
*The age of the root of all eukaryotes is not estimated because molecular clock studies can only inform the timing of extant clades.

Discrepancy Between These and Previous Molecular Clock Studies. Previous molecular clock studies yielded vastly different dates for the root of extant eukaryotes, ranging from 3970 to 1100 Ma (1). In a recent analysis of small subunit ribosomal DNA (SSU-rDNA) from 83 broadly sampled eukaryotes, Berney and Pawlowski (4) placed the origin of eukaryotes at 1100 Ma, a conclusion that was robust to changing the position of the root. They had numerous Phanerozoic calibration constraints specified as either minimum or maximum divergence dates (4), but they found that including Proterozoic calibration points, such as Bangiomorpha at 1200 Ma, shifted their estimates of the origin and diversification of eukaryotes by 1000–2500 myr. The age discrepancy observed by Berney and Pawlowski (4), when Proterozoic calibration constraints are included, contrasts sharply with the relative stability of dates seen in our analyses (Fig. 1A). We hypothesize that the increased gene and taxon sampling, as well as the use of flexible prior distributions of calibration points as implemented in BEAST, are major factors contributing to the stability of molecular clock estimation in our analyses.

Conclusion
Our molecular clock analyses yield a timeline of eukaryotic evolution that is congruent with the paleontological record and robust to varying analytical conditions. According to our analyses, crown (extant) groups of eukaryotes arose in the Paleoproterozoic era (2500–1600 Ma) and began to diversify soon thereafter, suggesting that early eukaryotic evolution was influenced by anoxic and sulfidic water masses in contemporaneous oceans. The stability in our analysis across a range of variables is a welcome departure from the large age discrepancies reported in earlier molecular analyses, reflecting improved paleontological interpretation, advancements in molecular methods, and the rapidly growing body of molecular data from diverse eukaryotes.

Materials and Methods
Alignments. Alignments are derived from the 15 protein-coding genes analyzed in Parfrey et al. (dataset 15:10 of ref. 25). Using this 88-taxon dataset as a starting point, taxa were added to capture additional lineages, particularly those with fossil data available (Table S2). Rapidly evolving taxa (e.g., Encephalitozoon cuniculi) and orphans (e.g., Breviata anathema) were removed to minimize rate heterogeneity for the clock analysis. The resulting 109-taxon data matrix includes 5,696 characters, with each taxon having between three and 15 of the target genes (36% missing character data; Table S2; analyses a–c and e–p). A 91-taxon alignment was created by removing additional taxa with either long branches or high levels of missing data to ensure that our results were not driven by these potential sources of artifact (analysis d).

Molecular Dating Analyses. Dating analyses were predominantly performed in BEAST v1.5.4 (52), and we also assessed results obtained in PhyloBayes 3.2f (53) (see SI Text for analysis details). BEAST offers a number of desirable features, including flexible specification of prior distributions that enable the uncertainty of the fossil record to be realistically modeled, as well as the ability to coestimate divergence times with topology (15). We compared divergence dates for eukaryotes obtained from different models to assess whether our conclusions were driven by the choice of a particular model (SI Text, Fig. 1, and Table S1).

Calibration Constraints. Calibration constraints were specified with prior distributions to incorporate errors arising from age dating, stratigraphy, and clade assignment (Table 1). The impact of Proterozoic fossils was assessed by analyzing the data with only the 16 Phanerozoic calibration constraints (Phan analyses b, f, h, j, l, n, and p) or with Phanerozoic and Proterozoic calibration constraints (All analyses a, c–e, g, i, k, m, and o). Calibration constraints were specified with prior distributions in BEAST using BEAUti v1.5.4 (52) and were derived from a conservative reading of the fossil record (i.e., we err toward younger rather than older ages; SI Text). Distributions were specified with long tails unless the fossil record provided minimum-divergence information. Calibration constraints used for PhyloBayes had to be specified as a uniform distribution (Table S3).

Assessing Impact of the Root on the Inferred Age of Eukaryotes. Molecular clock analyses require a rooted tree. However, the position of the eukaryotic root remains an open question; therefore, we compared age estimates from molecular clock analyses with multiple positions for the root of extant eukaryotes. First, the root was constrained to the branch leading to the Opisthokonta or to Opisthokonta + Amoebozoa ("Unikonta") in accordance with current hypotheses (see SI Text for discussion of the position of the eukaryotic root). In BEAST, the root was specified by constraining a monophyletic ingroup. PhyloBayes requires the tree topology to be fixed, and we used the tree in Fig. 2 rooted on either Opisthokonta or "Unikonta". Finally, for the third condition, the root was estimated by the molecular clock criterion, as implemented in BEAST (SI Text), which yielded variable estimates of the location of the root.

ACKNOWLEDGMENTS. We thank Ben Normark, Rob Dorit, and Sam Bowser for useful discussions, and Jeff Thorne and Bengt Sennblad for helpful discussions about molecular clock models. This manuscript has been improved following the comments of Emmanuelle Javaux, Andrew Roger, and Heroen Verbruggen. We thank Jessica Grant and Tony Caldanaro for technical help. This research was supported by the National Aeronautics and Space Administration Astrobiology Institute (A.H.K.) and by National Science Foundation Assembling the Tree of Life Grant 043115 and National Science Foundation Systematics Grant 0919152 (to L.A.K.). D.J.G.L. is supported by Conselho Nacional de Desenvolvimento Científico e Tecnológico-Brazil Doutorado no Exterior Fellowship 200853/2007-4.

1. Roger AJ, Hug LA (2006) The origin and diversification of eukaryotes: Problems with molecular phylogenetics and molecular clock estimation. Philos Trans R Soc Lond B Biol Sci 361:1039–1054.
2. Knoll AH, Javaux EJ, Hewitt D, Cohen P (2006) Eukaryotic organisms in Proterozoic oceans. Philos Trans R Soc Lond B Biol Sci 361:1023–1038.
3. Brocks JJ, Logan GA, Buick R, Summons RE (1999) Archean molecular fossils and the early rise of eukaryotes. Science 285:1033–1036.
4. Berney C, Pawlowski J (2006) A molecular time-scale for eukaryote evolution recalibrated with the continuous microfossil record. Proc Roy Soc Lond B 273:1867–1872.
5. Douzery EJP, Snell EA, Bapteste E, Delsuc F, Philippe H (2004) The timing of eukaryotic evolution: Does a relaxed molecular clock reconcile proteins and fossils? Proc Natl Acad Sci USA 101:15386–15391.
6. Cavalier-Smith T (2002) The phagotrophic origin of eukaryotes and phylogenetic classification of Protozoa. Int J Syst Evol Microbiol 52:297–354.
7. Cavalier-Smith T (2010) Deep phylogeny, ancestral groups and the four ages of life. Philos Trans R Soc Lond B Biol Sci 365:111–132.
8. Porter SM (2004) The fossil record of early eukaryotic diversification. Paleontol Soc Papers 10:35–50.
9. Javaux EJ, Knoll AH, Walter M (2003) Recognizing and interpreting the fossils of early eukaryotes. Orig Life Evol Biosph 33:75–94.
10. Javaux EJ, Knoll AH, Walter MR (2004) TEM evidence for eukaryotic diversity in mid-Proterozoic oceans. Geobiology 2:121–132.
11. Butterfield NJ (2000) Bangiomorpha pubescens n. gen., n. sp.: Implications for the evolution of sex, multicellularity, and the Mesoproterozoic/Neoproterozoic radiation of eukaryotes. Paleobiol 26:386–404.
12. Porter SM, Meisterfeld R, Knoll AH (2003) Vase-shaped microfossils from the Neoproterozoic Chuar Group, Grand Canyon: A classification guided by modern testate amoebae. J Paleontol 77:409–429.
13. Javaux EJ (2007) The early eukaryotic fossil record. Adv Exp Med Biol 607:1–19.
14. Welch JJ, Bromham L (2005) Molecular dating when rates vary. Trends Ecol Evol 20:320–327.
15. Drummond AJ, Ho SYW, Phillips MJ, Rambaut A (2006) Relaxed phylogenetics and dating with confidence. PLoS Biol 4:e88.
16. Ho SYW, Phillips MJ (2009) Accounting for calibration uncertainty in phylogenetic estimation of evolutionary divergence times. Syst Biol 58:367–380.

13628 | www.pnas.org/cgi/doi/10.1073/pnas.1110633108 Parfrey et al.


17. Rutschmann F (2006) Molecular dating of phylogenetic trees: A brief review of current methods that estimate divergence times. Divers Distrib 12:35–48.
18. Linder M, Britton T, Sennblad B (2011) Evaluation of Bayesian models of substitution rate evolution: Parental guidance versus mutual independence. Syst Biol 60:329–342.
19. Ho SYW (2009) An examination of phylogenetic models of substitution rate variation among lineages. Biol Lett 5:421–424.
20. Lepage T, Bryant D, Philippe H, Lartillot N (2007) A general comparison of relaxed molecular clock models. Mol Biol Evol 24:2669–2680.
21. Graur D, Martin W (2004) Reading the entrails of chickens: Molecular timescales of evolution and the illusion of precision. Trends Genet 20:80–86.
22. Hug LA, Roger AJ (2007) The impact of fossils and taxon sampling on ancient molecular dating analyses. Mol Biol Evol 24:1889–1897.
23. Hedges SB, Blair JE, Venturi ML, Shoe JL (2004) A molecular timescale of eukaryote evolution and the rise of complex multicellular life. BMC Evol Biol 4:2.
24. Adl SM, et al. (2005) The new higher level classification of eukaryotes with emphasis on the taxonomy of protists. J Eukaryot Microbiol 52:399–451.
25. Parfrey LW, et al. (2010) Broadly sampled multigene analyses yield a well-resolved eukaryotic tree of life. Syst Biol 59:518–533.
26. Hampl V, et al. (2009) Phylogenomic analyses support the monophyly of Excavata and resolve relationships among eukaryotic “supergroups”. Proc Natl Acad Sci USA 106:3859–3864.
27. Lane CE, Archibald JM (2008) The eukaryotic tree of life: Endosymbiosis takes its TOL. Trends Ecol Evol 23:268–275.
28. Baurain D, et al. (2010) Phylogenomic evidence for separate acquisition of plastids in cryptophytes, haptophytes, and stramenopiles. Mol Biol Evol 27:1698–1709.
29. Heath TA, Hedtke SM, Hillis DM (2008) Taxon sampling and the accuracy of phylogenetic analyses. J Syst Evol 46:239–257.
30. Yoon HS, Hackett JD, Ciniglia C, Pinto G, Bhattacharya D (2004) A molecular timeline for the origin of photosynthetic eukaryotes. Mol Biol Evol 21:809–818.
31. Donoghue PCJ, Benton MJ (2007) Rocks and clocks: Calibrating the Tree of Life using fossils and molecules. Trends Ecol Evol 22:424–431.
32. Han TM, Runnegar B (1992) Megascopic eukaryotic algae from the 2.1-billion-year-old Negaunee-Iron-Formation, Michigan. Science 257:232–235.
33. Schneider DA, Bickford ME, Cannon WF, Schulz KJ, Hamilton MA (2002) Age of volcanic rocks and syndepositional iron formations, Marquette Range Supergroup: Implications for the tectonic setting of Paleoproterozoic iron formations of the Lake Superior. Can J Earth Sci 39:999–1012.
34. Javaux EJ, Marshall CP, Bekker A (2010) Organic-walled microfossils in 3.2-billion-year-old shallow-marine siliciclastic deposits. Nature 463:934–938.
35. Buick R (2010) Early life: Ancient acritarchs. Nature 463:885–886.
36. Rasmussen B, Fletcher IR, Brocks JJ, Kilburn MR (2008) Reassessing the first appearance of eukaryotes and cyanobacteria. Nature 455:1101–1104.
37. Knoll AH (1994) Proterozoic and early Cambrian protists: Evidence for accelerating evolutionary tempo. Proc Natl Acad Sci USA 91:6743–6750.
38. Knoll AH (2011) The multiple origins of complex multicellularity. Annu Rev Earth Planet Sci 39:217–239.
39. Yin L, Yuan X (2007) Radiation of Meso-Neoproterozoic and early Cambrian protists inferred from the microfossil record of China. Palaeogeogr Palaeocl 254:350–361.
40. Porter SM (2006) Heterotrophic Eukaryotes. Neoproterozoic Geobiology and Paleobiology, eds Xiao S, Kaufman AJ (Springer, Dordrecht, The Netherlands), pp 1–21.
41. Canfield DE (1998) A new model for Proterozoic ocean chemistry. Nature 396:450–453.
42. Johnston DT, Wolfe-Simon F, Pearson A, Knoll AH (2009) Anoxygenic photosynthesis modulated Proterozoic oxygen and sustained Earth’s middle age. Proc Natl Acad Sci USA 106:16925–16929.
43. Embley TM, Martin W (2006) Eukaryotic evolution, changes and challenges. Nature 440:623–630.
44. Mentel M, Martin W (2008) Energy metabolism among eukaryotic anaerobes in light of Proterozoic ocean chemistry. Philos Trans R Soc Lond B Biol Sci 363:2717–2729.
45. Johnston DT, et al. (2010) An emerging picture of Neoproterozoic ocean chemistry: Insights from the Chuar Group, Grand Canyon, USA. Earth Planet Sci Lett 290:64–73.
46. Cavalier-Smith T (2009) Megaphylogeny, cell body plans, adaptive zones: Causes and timing of eukaryote basal radiations. J Eukaryot Microbiol 56:26–33.
47. Strother PK, Battison L, Brasier MD, Wellman CH (2011) Earth’s earliest non-marine eukaryotes. Nature 473:505–509.
48. Parnell J, Boyce AJ, Mark D, Bowden S, Spinks S (2010) Early oxygenation of the terrestrial environment during the Mesoproterozoic. Nature 468:290–293.
49. Javaux EJ, Knoll AH, Walter MR (2001) Morphological and ecological complexity in early eukaryotic ecosystems. Nature 412:66–69.
50. Anbar AD, Knoll AH (2002) Proterozoic ocean chemistry and evolution: A bioinorganic bridge? Science 297:1137–1142.
51. Knoll AH, Summons RE, Waldbauer JR, Zumberge J (2007) The geological succession of primary producers in the oceans. The Evolution of Primary Producers in the Sea, eds Falkowski PG, Knoll AH (Elsevier, Burlington, MA), pp 133–163.
52. Drummond AJ, Rambaut A (2007) BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol Biol 7:214.
53. Lartillot N, Lepage T, Blanquart S (2009) PhyloBayes 3: A Bayesian software package for phylogenetic reconstruction and molecular dating. Bioinformatics 25:2286–2288.
54. Smithson TR, Rolfe WDI (1990) Westlothiana Gen. nov.: Naming the earliest known reptile. Scott J Geol 26:137–138.
55. Crane PR, Friis EM, Pedersen KR (1995) The origin and early diversification of angiosperms. Nature 374:27–33.
56. Taylor TN, Hass H, Kerp H (1999) The oldest fossil ascomycetes. Nature 399:648.
57. Bown PR (1998) Calcareous Nannofossil Biostratigraphy (Kluwer Academic, London).
58. Harwood DM, Nikolaev VA, Winter DM (2007) Cretaceous records of diatom evolution, radiation, and expansion. Paleontol Soc Papers 13:33–59.
59. Fensome RA, Saldarriaga JF, Taylor F (1999) Dinoflagellate phylogeny revisited: Reconciling morphological and molecular based phylogenies. Grana 38:66–80.
60. Rubinstein CV, Gerrienne P, de la Puente GS, Astini RA, Steemans P (2010) Early Middle Ordovician evidence for land plants in Argentina (eastern Gondwana). New Phytol 188:365–369.
61. Dostál O, Prokop J (2009) New fossil insects (Diaphanopterodea: Martynoviidae) from the Lower Permian of the Boskovice Basin, southern Moravia. Geobios 42:495–502.
62. Friis EM, Pedersen KR, Crane PR (2010) Diversity in obscurity: Fossil flowers and the early history of angiosperms. Philos Trans R Soc Lond B Biol Sci 365:369–382.
63. Sun G, Dilcher DL, Wang H, Chen Z (2011) A eudicot from the Early Cretaceous of China. Nature 471:625–628.
64. Gray J, Boucot AJ (1989) Is Moyeria a euglenoid? Lethaia 22:447–456.
65. McIlroy D, Green OR, Brasier MD (2001) Palaeobiology and evolution of the earliest agglutinated Foraminifera: Platysolenites, Spirosolenites and related forms. Lethaia 34:13–29.
66. Kooistra W, Gersonde R, Medlin L, Mann DG (2007) The origin and evolution of the diatoms: Their adaptation to a planktonic existence. The Evolution of Primary Producers in the Sea, eds Falkowski PG, Knoll AH (Elsevier, Burlington, MA), pp 201–249.
67. Lipps HJ (1993) Fossil Prokaryotes and Protists (Blackwell Scientific, Boston).
68. Kenrick P, Crane PR (1997) The origin and early evolution of plants on land. Nature 389:33–39.
69. Shu DG, et al. (1999) Lower Cambrian vertebrates from South China. Nature 402:42–46.
70. Love GD, et al. (2009) Fossil steroids record the appearance of Demospongiae during the Cryogenian period. Nature 457:718–721.
71. Cohen PA, Knoll AH, Kodner RB (2009) Large spinose microfossils in Ediacaran rocks as resting stages of early animals. Proc Natl Acad Sci USA 106:6519–6524.
72. Martin MW, et al. (2000) Age of Neoproterozoic bilatarian body and trace fossils, White Sea, Russia: Implications for metazoan evolution. Science 288:841–845.
73. Butterfield NJ, Knoll AH, Swett K (1994) Paleobiology of the Neoproterozoic Svanbergfjellet Formation, Spitsbergen. Fossils Strata 34:1–84.
74. Summons RE, Walter MR (1990) Molecular fossils and microfossils of prokaryotes and protists from Proterozoic sediments. Am J Sci 290-A:212–244.
75. Xiao SH, Knoll AH, Yuan XL, Pueschel CM (2004) Phosphatized multicellular algae in the Neoproterozoic Doushantuo Formation, China, and the early evolution of florideophyte red algae. Am J Bot 91:214–227.

Parfrey et al. PNAS | August 16, 2011 | vol. 108 | no. 33 | 13629
Supporting Information
Parfrey et al. 10.1073/pnas.1110633108
SI Text
Calibration Constraints. Calibration constraints (CCs) were assigned from the fossil record and were set to take into account the multiple sources of uncertainty that arise from using paleontological information to calibrate molecular clocks (1–3). The level of informativeness of the priors varied among calibration constraints to reflect the level of confidence in the timing of the split. For many lineages (including all Proterozoic CCs) the fossil record provides only a minimum divergence time, which is reflected as a very long tail in the prior probability that extends back to ∼3500 Ma. In most cases, the CC was placed at the node where the clade with an available fossil split from its sister group; for example, the first recorded angiosperm pollen (4) is used to constrain the split of angiosperms from their gymnosperm ancestors (Table 1 and Fig. 2). In cases where the fossil falls within the crown clade, the CC was placed at the base of the clade, as in the Endopterygota, where the first Mecoptera fossils constrain the split between Apis and Drosophila (Table 1). Minimum dates (offsets in BEAST) were assigned conservatively. We used radiometric dates when available, and set the minimum constraint to the youngest edge of the reported confidence interval. Thus, the minimum age of the CC for Arcellinida is 736 Ma, because arcellinid fossils are found in rocks older than 742 ± 6 Ma (Table 1) (5). For fossils assigned to geological stages, we used the upper boundary of the stage according to the 2009 International Stratigraphy Chart published by the International Commission on Stratigraphy (http://www.stratigraphy.org/). For example, angiosperm pollen is first found in Valanginian rocks (4) and so was constrained to a minimum date of 133.9 Ma.
Prior distributions were set in one of two ways depending on the level of uncertainty. For clades with robust fossil records where the maximum age of the clade is unlikely to be substantially earlier than its first occurrence (e.g., angiosperms), the prior distribution was set to include 95% of the probable age of the clade. In contrast, Proterozoic records and fossils of groups with a poor fossilization potential provide only minimum dates for lineage origin and, commonly, no information on maximum clade age (e.g., Arcellinida). In these cases the prior distribution was specified with a very long tail, as assessed in BEAUti, that extended back to ∼3500 Ma.
Selected CCs are discussed here (see Table 1 for details of the remaining CCs). Fossils of the earliest red alga, Bangiomorpha, occur in the lower section of the Hunting Formation, Canada, which is bracketed by U-Pb radiometric dates on volcanic rocks of 1267 ± 2 Ma and 723 ± 3 Ma. Direct Pb-Pb dates on carbonates correlative with those containing the fossils yield a much narrower constraint of 1198 ± 24 Ma (6), but this date remains unpublished, and radiometric dating of carbonates can be problematic. The true age of the Hunting Bangiomorpha fossils may nevertheless lie closer to the lower U-Pb age constraint than the upper, because of the sequence stratigraphic position of fossiliferous strata relative to constraining volcanic rocks and chemo- and biostratigraphic data consistent with a later Mesoproterozoic (<1250 Ma) age (6). In most All analyses, the minimum date for the Bangiomorpha constraint was set at 1174 Ma. Given the importance of the Bangiomorpha calibration as potentially the oldest phylogenetically constraining fossil by roughly 450 myr, we also ran the All analysis with the constraint for Bangiomorpha set at 720 Ma (the minimum age for the Hunting Formation) for comparison (analysis d). Because of the controversy surrounding Bangiomorpha, we have placed the calibration on the base of the red algae rather than on a particular node within the clade to be conservative.
The single CC in the Excavata is placed within the euglenids. Although the Excavata generally have a poor fossil record, the euglenid Moyeria is widely distributed in the Ordovician and Silurian, with an earliest occurrence in the Caradocian (7), dated at 450 Ma. Moyeria is thought to have been photosynthetic based on the patterning of its pellicle, indicating an early acquisition of the secondary green alga endosymbiont (8); thus, the CC is placed at the split between photosynthetic (Euglena) and heterotrophic (Entosiphon) euglenids in the tree (Fig. 1).
The calibration constraint for diatoms is based on the earliest diatom fossils from the Valanginian to Hauterivian Myogok Formation in Korea (9); a date of 133.9 Ma is used to represent the upper Valanginian boundary. The CC of this node is younger than in other clock analyses (10) because we do not rely on Pyxidicula, a putative Toarcian diatom (11) for which the material has been lost.
We include a CC for ciliates in the All analyses that is based on the presence of gammacerane in Neoproterozoic sedimentary rocks (12). Tetrahymanol, the precursor of gammacerane, is commonly found in some ciliates, although it has also been found in bacteria (13). Tetrahymanol production is documented from the Oligohymenophorea and the Plagiopylea (Trimyema), which are not included in this analysis, so the CC was placed at the stem of the Oligohymenophorea (Tetrahymena and Paramecium). This CC was included despite the possibility of bacterial origin, because the 736 Ma constraint is much younger than the date estimated for ciliates (∼1150 Ma) without this constraint in the Phan analyses.

Root of the Eukaryotic Tree of Life. Although our goal was to elucidate timing of major events in eukaryotic evolution, we also explored the impact of changing the position of the root, because rooted phylogenies are crucial for interpreting the evolutionary events in the history of a lineage. A root must be either provided or estimated for molecular clock analyses (14, 15). However, the root of the eukaryotic tree of life is difficult to determine because the common methods for rooting phylogenies are vulnerable to artifacts caused by rate heterogeneity among lineages of eukaryotes and the vast distance between eukaryotes and archaea or bacteria (16–18). Although numerous hypotheses have been proposed (19–23), the position of the root remains an open debate (16, 17, 24, 25). The most popular hypothesis of recent years places the root of eukaryotes between the Opisthokonta + Amoebozoa (unikonts) and the remaining eukaryotes (bikonts) (19, 26), and previous molecular clock analyses of eukaryotes rooted trees in this manner (10, 27, 28). However, several lines of evidence contradict the unikont/bikont split (23, 24), and alternative roots have been suggested, including at the base of Opisthokonta (23, 29), within Archaeplastida (21, 22), or along the lineage leading to Euglenozoa (20). Rooting the tree of extant eukaryotes along the branch leading to Opisthokonta is supported by ongoing gene-tree species-tree reconciliation work by Gordon Burleigh (University of Florida).
Here, we assess the impact different positions of the root have on estimates of the age of eukaryotes. The root is (i) estimated in BEAST using the molecular clock criterion (30); (ii) placed between Opisthokonta and the rest of eukaryotes (23, 29); or (iii) placed between “Unikonta” and the rest of eukaryotes (19). PhyloBayes requires a fixed topology for molecular dating analyses, hence those analyses were run rooted either on Opisthokonta or “Unikonta”. No strongly supported alternative root emerged from the analyses in which BEAST determined the root based on the molecular clock. Rooting by the molecular clock criterion uses information from branch lengths and has been shown in simulation studies to work reasonably well in situations where there is no outgroup available (31). The position of the root varied across runs, falling between Excavata or Excavata + unikonts and the rest of eukaryotes (analyses e and f). All trees are available in SI Text (Figs. S1–S7 and Dataset S1).

Topology of the Eukaryotic Tree. The topology of the trees produced by BEAST through coestimation of phylogeny and dates is consistent with the broad outlines of eukaryotic topologies recovered in other analyses (32, 33). Of note, three relationships emerge in the BEAST topologies that were not present in the RAxML analyses (Dataset S1). First, RAxML analyses divide Excavata into two clades when all major lineages of Excavata are included (109-taxon set; Dataset S1); however, Excavata is monophyletic in all BEAST analyses (unless the root of eukaryotes falls within Excavata). Second, BEAST consistently places haptophytes as the sister clade of SAR (Fig. 2 and Figs. S1–S7). Finally, cryptomonads consistently branch as sister to kathablepharids and within the clade of primary photosynthetic eukaryotes: red algae, green algae (including plants), and glaucophytes (the Archaeplastida or Plantae hypothesis; Fig. 2 and Figs. S1–S7).

Prior-Only Analyses. The differences in estimated node age between the calibration sets (Phan vs. All) are driven by the sequence data, rather than the prior distribution of calibration constraints. All analysis conditions in Table S1 were assessed without the data to determine the impact of priors on estimated divergence dates (prior-only analyses). BEAST analyses of priors alone, without sequence data, yield dates for major nodes that are 200–800 myr younger than analyses with data. The disparity was much greater for Phan analyses: prior-only analyses yielded dates 500–800 myr younger here. For example, the root age is 1478 Ma in the 109-taxon Phan analysis b, but only 717 Ma when this analysis is run without sequence data. In contrast, all PhyloBayes prior-only analyses produce dates much older than analyses run with the data, with the root falling between 3817 Ma and an unreasonable 5047 Ma. These results demonstrate that the CCs chosen did not determine our dates.

SI Materials and Methods
Phylogenetic Analysis. BEAST requires a reasonable starting tree to analyze complex datasets, so the initial topology was obtained in RAxML. Two hundred bootstrap replicates followed by an exhaustive maximum likelihood search were done using the MPI version of RAxML 7.0.4 with rapid bootstrapping and the WAG + gamma model (34). The best-fitting amino acid substitution matrix available in BEAST was WAG for all partitions as estimated in ProtTest (35). This resulted in a highly supported topology consistent with that found by RAxML analyses in Parfrey et al. (33) (Dataset S1).

BEAST Model Conditions and Analyses. We ran preliminary analyses in BEAST to assess the impact of several options, including type of molecular clock and partitioning of genes. Analyses were run at Smith College and on the freely available Oslo Bioportal (www.bioportal.uio.no/). Parameters were deemed a poor fit for the data if likelihood values did not converge across four runs of 10 million generations. Based on this criterion, we selected the UCL relaxed clock model combined with each gene treated as a separate partition for subsequent BEAST analyses (analyses a–h).
A two-phase approach was used to increase chain mixing, as measured by effective sample size in Tracer. In the first phase, four initial chains of 10 million generations each were run with the best RAxML tree as the starting tree and the remaining priors at default settings (excluding CCs). The starting tree had branch heights (ages) set to 360 so that all nodes were older than the CCs assigned to them. Five million generations were removed from each of these chains as burnin (as determined by convergence of likelihood values in Tracer v1.5.4), and chains were combined in LogCombiner v1.5.4 (distributed with BEAST) (36). The final 1 million generations of the preliminary runs were used to generate a starting tree for subsequent analyses that had a robust tree topology with realistic branch heights. Trees were annotated using TreeAnnotator v1.5.4 (36) using the mean node heights and maximum clade credibility tree settings. In the second phase of the analysis, eight runs of 10 million generations each were conducted for each analysis (Table S1). Operator values and prior distributions on substitution rates were informed by the results of initial runs. One million generations were removed from each chain as burnin, and the remaining generations were combined from both log and tree files in LogCombiner v1.5.4. Trees were annotated in TreeAnnotator v1.5.4 and assessed in FigTree v1.3.1 (http://tree.bio.ed.ac.uk/software/figtree/).
Model conditions for BEAST were determined in preliminary analyses of four chains run for 10 million generations each. If likelihood values did not converge across four runs of 10 million generations, as assessed in Tracer v1.5 (distributed with BEAST v1.5.4) (36), the model was deemed a poor fit for the data. The strict clock and uncorrelated exponential molecular clock models were both rejected based on this criterion, as was analyzing the 15 genes as a single partition. If competing models converged in likelihood scores, the likelihoods were compared using Bayes factors (37) as assessed in Tracer, although we did not rely on this metric as the harmonic mean calculation of Bayes factors has been demonstrated to be unreliable (38). In these cases, the estimated divergence dates were compared between competing models. Fixing tree topology to the most likely RAxML tree (Dataset S1) resulted in lower likelihood scores. Further, allowing BEAST to coestimate phylogeny and divergence dates might yield better results for both, so for all subsequent analyses the topology was estimated. Allowing BEAST to modify tree topology resulted in a highly supported topology that was broadly consistent with other analyses (32, 33, 39).
Each gene was analyzed as a separate partition for both site models and molecular clock models because the analyses did not converge when the 15 genes were analyzed as a single partition. The WAG amino acid substitution matrix was used for all genes, as it was the best-fitting model available in BEAST as determined by ProtTest (35). A model of amino acid substitution that included gamma-distributed rate classes was found to be a better fit for the data; however, this resulted in a 10-fold increase in computational cost, so the gamma correction was used in only a few cases for comparison and yielded similar dates and topologies compared with analyses without a gamma correction. For example, adding gamma correction to BEAST analyses of 109 taxa rooted on opisthokonts with All and Phan CCs (analyses a and b) yielded the same topology as Fig. 2 and Fig. S1, respectively, with the age of the root shifted from 1774 Ma to 1668 Ma in analysis a and 1478 Ma to 1433 Ma in analysis b, where Proterozoic CCs were excluded. These analyses are not included in Table S1 because they were run for only 5 million generations (rather than 10 million) due to constraints on compute resources.
The UCL relaxed clock model was found to be the best clock model available in BEAST for these data, as analyses using either a strict molecular clock or an uncorrelated exponential relaxed clock did not converge. Only uncorrelated models are implemented in BEAST (36). The UCL relaxed clock is expected to perform better on datasets with deep divergences and rate heterogeneity across the tree, because the SD parameter captures



the variation in rates across the tree (30). The coefficient of variation of the UCL clock ranges from 0.3 to 2.5 for different genes, which indicates that rates vary between 30% and 250% across the tree depending on the gene. Thus, these data are not clock-like.

PhyloBayes Analyses. Analyses were run in PhyloBayes version 3.2f (40). All chains were run for at least 1,000 cycles. Calibrations for PhyloBayes were based on the calibrations in BEAST but specified as a date range as required by PhyloBayes with soft bounds at the default setting (Table S3). The model of sequence evolution was the same across all PhyloBayes analyses with a generalized time-reversible (GTR) amino acid substitution matrix (-gtr), a Dirichlet mixture profile (-cat), and a Dirichlet process modeling rates across sites (-ratecat). For each condition, two replicate chains were run. Analyses were run with the tree topology of Fig. 2, which was fixed and rooted either on the Opisthokonta or “Unikonta”. Analyses were run under either the CIR molecular clock model, in which rates across branches are autocorrelated, or the UGAM model, which is nonautocorrelated and thus similar to BEAST (40, 41). There is much debate as to whether substitution rates are best modeled as autocorrelated across the tree or uncorrelated (30, 41–43). Autocorrelated models of the molecular clock assume that evolutionary rates along a branch are dependent on the rate of the parent branch (16, 41), whereas uncorrelated models draw rates of evolution for each branch from a distribution (30, 42). These clock models were chosen because CIR was shown to be a good fit for many different datasets (41), and UGAM is similar to the uncorrelated model run in BEAST. CIR (logBF 61) was preferred to UGAM (logBF 32) in Bayes factor analyses comparing clock models to deconstrained models in PhyloBayes. Dates were assessed by running readdiv with 250 generations removed as burn-in for each analysis. The mean dates were averaged, and the error bars were derived from the overall minimum and maximum of the 95% confidence interval for the two chains.
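
The date summary described for the two replicate chains (discard the burn-in, average the chain means, and take the overall minimum and maximum of the two 95% intervals) can be sketched in Python. This is only an illustration under stated assumptions, not the readdiv implementation: the function names are hypothetical, each chain is assumed to be a plain list of sampled ages (in Ma) for a single node, and the 95% interval is assumed to be the central interval between the 2.5th and 97.5th percentiles.

```python
from statistics import mean

BURN_IN = 250  # samples discarded per chain, matching the burn-in stated in the text


def percentile(sorted_vals, q):
    """Linear-interpolation percentile (q in [0, 100]) of an already sorted list."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)


def summarize_chain(ages, burn_in=BURN_IN):
    """Return (mean, lower, upper) for one chain's post-burn-in node ages.

    The interval is the central 95% interval (an assumption made for this sketch).
    """
    kept = sorted(ages[burn_in:])
    if not kept:
        raise ValueError("chain is shorter than the burn-in")
    return mean(kept), percentile(kept, 2.5), percentile(kept, 97.5)


def combine_chains(chain_a, chain_b, burn_in=BURN_IN):
    """Average the two chain means and span the two intervals.

    The lower bound is the smaller of the two lower limits and the upper
    bound is the larger of the two upper limits.
    """
    mean_a, lo_a, hi_a = summarize_chain(chain_a, burn_in)
    mean_b, lo_b, hi_b = summarize_chain(chain_b, burn_in)
    return (mean_a + mean_b) / 2, min(lo_a, lo_b), max(hi_a, hi_b)
```

Taking the overall minimum and maximum, rather than averaging the interval limits, gives error bars at least as wide as either chain's own interval, so disagreement between chains widens the reported uncertainty.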

1. Donoghue PCJ, Benton MJ (2007) Rocks and clocks: Calibrating the Tree of Life using fossils and molecules. Trends Ecol Evol 22:424–431.
2. Ho SYW, Phillips MJ (2009) Accounting for calibration uncertainty in phylogenetic estimation of evolutionary divergence times. Syst Biol 58:367–380.
3. Rutschmann F, Eriksson T, Salim KA, Conti E (2007) Assessing calibration uncertainty in molecular dating: The assignment of fossils to alternative calibration points. Syst Biol 56:591–608.
4. Crane PR, Friis EM, Pedersen KR (1995) The origin and early diversification of angiosperms. Nature 374:27–33.
5. Porter SM, Meisterfeld R, Knoll AH (2003) Vase-shaped microfossils from the Neoproterozoic Chuar Group, Grand Canyon: A classification guided by modern testate amoebae. J Paleontol 77:409–429.
6. Butterfield NJ (2000) Bangiomorpha pubescens n. gen., n. sp.: Implications for the evolution of sex, multicellularity, and the Mesoproterozoic/Neoproterozoic radiation of eukaryotes. Paleobiology 26:386–404.
7. Gray J, Boucot AJ (1989) Is Moyeria a euglenoid? Lethaia 22:447–456.
8. Leander BS, Witek RP, Farmer MA (2001) Trends in the evolution of the euglenid pellicle. Evolution 55:2215–2235.
9. Harwood DM, Nikolaev VA, Winter DM (2007) Cretaceous records of diatom evolution, radiation, and expansion. Paleontol Soc Papers 13:33–59.
10. Berney C, Pawlowski J (2006) A molecular time-scale for eukaryote evolution recalibrated with the continuous microfossil record. Proc Biol Sci 273:1867–1872.
11. Rothpletz A (1896) On the flysch fucoids and a few other fossil algae, as well as diatoms from Liassic sponge reefs (Translated from German). Z Dtsch Geol Ges 52:154–160.
12. Summons RE, Walter MR (1990) Molecular fossils and microfossils of prokaryotes and protists from Proterozoic sediments. Am J Sci 290-A:212–244.
13. Kleemann G, et al. (1990) Tetrahymanol from the phototrophic bacterium Rhodopseudomonas palustris: First report of a gammacerane triterpene from a prokaryote. J Gen Microbiol 136:2551–2553.
14. Renner SS, Grimm GW, Schneeweiss GM, Stuessy TF, Ricklefs RE (2008) Rooting and dating maples (Acer) with an uncorrelated-rates molecular clock: Implications for north American/Asian disjunctions. Syst Biol 57:795–808.
15. Sanderson MJ, Doyle JA (2001) Sources of error and confidence intervals in estimating the age of angiosperms from rbcL and 18S rDNA data. Am J Bot 88:1499–1516.
16. Roger AJ, Hug LA (2006) The origin and diversification of eukaryotes: Problems with molecular phylogenetics and molecular clock estimation. Philos Trans R Soc Lond B Biol Sci 361:1039–1054.
22. Rogozin IB, Basu MK, Csürös M, Koonin EV (2009) Analysis of rare genomic changes does not support the unikont-bikont phylogeny and suggests cyanobacterial symbiosis as the point of primary radiation of eukaryotes. Genome Biol Evol 1:99–113.
23. Arisue N, Hasegawa M, Hashimoto T (2005) Root of the Eukaryota tree as inferred from combined maximum likelihood analyses of multiple molecular sequence data. Mol Biol Evol 22:409–420.
24. Roger AJ, Simpson AGB (2009) Evolution: Revisiting the root of the eukaryote tree. Curr Biol 19:R165–R167.
25. Koonin EV (2010) The origin and early evolution of eukaryotes in the light of phylogenomics. Genome Biol 11:209.
26. Keeling PJ, et al. (2005) The tree of eukaryotes. Trends Ecol Evol 20:670–676.
27. Douzery EJP, Snell EA, Bapteste E, Delsuc F, Philippe H (2004) The timing of eukaryotic evolution: Does a relaxed molecular clock reconcile proteins and fossils? Proc Natl Acad Sci USA 101:15386–15391.
28. Hug LA, Roger AJ (2007) The impact of fossils and taxon sampling on ancient molecular dating analyses. Mol Biol Evol 24:1889–1897.
29. Stechmann A, Cavalier-Smith T (2002) Rooting the eukaryote tree by using a derived gene fusion. Science 297:89–91.
30. Drummond AJ, Ho SYW, Phillips MJ, Rambaut A (2006) Relaxed phylogenetics and dating with confidence. PLoS Biol 4:e88.
31. Huelsenbeck JP, Bollback JP, Levine AM (2002) Inferring the root of a phylogenetic tree. Syst Biol 51:32–43.
32. Hampl V, et al. (2009) Phylogenomic analyses support the monophyly of Excavata and resolve relationships among eukaryotic “supergroups”. Proc Natl Acad Sci USA 106:3859–3864.
33. Parfrey LW, et al. (2010) Broadly sampled multigene analyses yield a well-resolved eukaryotic tree of life. Syst Biol 59:518–533.
34. Stamatakis A, Hoover P, Rougemont J (2008) A rapid bootstrap algorithm for the RAxML Web servers. Syst Biol 57:758–771.
35. Abascal F, Zardoya R, Posada D (2005) ProtTest: Selection of best-fit models of protein evolution. Bioinformatics 21:2104–2105.
36. Drummond AJ, Rambaut A (2007) BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol Biol 7:214.
37. Suchard MA, Weiss RE, Sinsheimer JS (2001) Bayesian selection of continuous-time Markov chain evolutionary models. Mol Biol Evol 18:1001–1013.
38. Xie W, Lewis PO, Fan Y, Kuo L, Chen MH (2011) Improving marginal likelihood estimation for Bayesian phylogenetic model selection. Syst Biol 60:150–160.
39. Burki F, et al. (2009) Large-scale phylogenomic analyses reveal that two enigmatic
17. Embley TM, Martin W (2006) Eukaryotic evolution, changes and challenges. Nature protist lineages, telonemia and centroheliozoa, are related to photosynthetic
440:623–630. chromalveolates. Genome Biol Evol 1:231–238.
18. Tekle YI, Parfrey LW, Katz LA (2009) Molecular data are transforming hypotheses on 40. Lartillot N, Lepage T, Blanquart S (2009) PhyloBayes 3: A Bayesian software package
the origin and diversification of eukaryotes. Bioscience 59:471–481. for phylogenetic reconstruction and molecular dating. Bioinformatics 25:2286–2288.
19. Stechmann A, Cavalier-Smith T (2003) The root of the eukaryote tree pinpointed. Curr 41. Lepage T, Bryant D, Philippe H, Lartillot N (2007) A general comparison of relaxed
Biol 13:R665–R666. molecular clock models. Mol Biol Evol 24:2669–2680.
20. Cavalier-Smith T (2010) Kingdoms Protozoa and Chromista and the eozoan root of 42. Linder M, Britton T, Sennblad B (2011) Evaluation of Bayesian models of substitution
the eukaryotic tree. Biol Lett 6:342–345. rate evolution—parental guidance versus mutual independence. Syst Biol 60:329–342.
21. Nozaki H (2005) A new scenario of plastid evolution: Plastid primary endosymbiosis 43. Ho SYW (2009) An examination of phylogenetic models of substitution rate variation
before the divergence of the “Plantae,” emended. J Plant Res 118:247–255. among lineages. Biol Lett 5:421–424.

Parfrey et al. www.pnas.org/cgi/content/short/1110633108 3 of 15


[Figure S1 image: time-calibrated tree with one tip label per taxon. Clade labels, top to bottom: SAR, Alveolates, Rhizaria, Stramenopiles, Haptophytes, Green algae, Red algae, Cryptomonads, Glaucocystophytes, Excavata, Amoebozoa, Opisthokonta. Upper scale: Paleoproterozoic, Mesoproterozoic, Neoproterozoic, Phanerozoic. Lower scale: 2000, 1750, 1500, 1250, 1000, 750, 500, 250, 0.]

Fig. S1. Time-calibrated tree of eukaryotes using Phanerozoic calibration points, 109 taxa, rooted on Opisthokonta, and constructed in BEAST (analysis b).
Nodes are at mean divergence times, and gray bars represent 95% HPD of node age. (Upper) Geological time scale. (Lower) Absolute time scale (in Ma). Thick
vertical bars demarcate eras, and thin vertical lines denote periods, with dates derived from the 2009 International Stratigraphic Chart.
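
The gray bars in these figures report 95% highest posterior density (HPD) intervals. As a rough illustration of what that summary means, the Python sketch below finds the narrowest interval that contains a chosen share of sampled node ages. The function name `hpd_interval` and the toy ages are hypothetical and made up for this example; the sketch does not reproduce BEAST's own summary tools.

```python
import math

def hpd_interval(samples, mass=0.95):
    """Return (low, high), the narrowest interval covering `mass` of the samples."""
    if not samples:
        raise ValueError("samples must be non-empty")
    if not 0 < mass <= 1:
        raise ValueError("mass must be in (0, 1]")
    ordered = sorted(samples)
    n = len(ordered)
    # Number of consecutive sorted samples the interval has to span.
    k = max(1, math.ceil(mass * n))
    best_low, best_high = ordered[0], ordered[k - 1]
    # Slide a window of k samples and keep the narrowest one.
    for start in range(1, n - k + 1):
        low, high = ordered[start], ordered[start + k - 1]
        if high - low < best_high - best_low:
            best_low, best_high = low, high
    return best_low, best_high

# Toy node-age samples in Ma (invented for illustration, not from the paper).
toy_ages = [1700, 1760, 1770, 1775, 1780, 1790, 1800, 1950]
print(hpd_interval(toy_ages, mass=0.75))
```

Unlike an equal-tailed interval, this window is free to shift toward wherever the samples are densest, which is why an HPD bar need not be symmetric around the mean node age.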

[Figure S2 image: time-calibrated tree with one tip label per taxon. Clade labels, top to bottom: SAR, Alveolates, Rhizaria, Stramenopiles, Haptophytes, Green algae, Red algae, Cryptomonads, Glaucocystophytes, Excavata, Amoebozoa, Opisthokonta. Upper scale: Paleoproterozoic, Mesoproterozoic, Neoproterozoic, Phanerozoic. Lower scale: 2000, 1750, 1500, 1250, 1000, 750, 500, 250, 0.]

Fig. S2. Time-calibrated tree of eukaryotes using All (Proterozoic and Phanerozoic) calibration points with the Bangiomorpha CC set at 720 Ma, 109 taxa,
rooted on Opisthokonta, and constructed in BEAST (analysis c). Other notes as in Fig. S1.

[Figure S3 image: time-calibrated tree with one tip label per taxon. Clade labels, top to bottom: SAR, Alveolates, Rhizaria, Stramenopiles, Haptophytes, Green algae, Red algae, Glaucocystophytes, Cryptomonads, Excavata, Amoebozoa, Opisthokonta. Upper scale: Paleoproterozoic, Mesoproterozoic, Neoproterozoic, Phanerozoic. Lower scale: 2000, 1750, 1500, 1250, 1000, 750, 500, 250, 0.]

Fig. S3. Time-calibrated tree of eukaryotes using All calibration points, 91 taxa, rooted on Opisthokonta, and constructed in BEAST (analysis d). Other notes as
in Fig. S1.

[Figure S4 image: time-calibrated tree with one tip label per taxon. Clade labels, top to bottom: SAR, Alveolates, Rhizaria, Stramenopiles, Haptophytes, Green algae, Cryptomonads, Red algae, Glaucocystophytes, Opisthokonta, Amoebozoa, Excavata. Upper scale: Paleoproterozoic, Mesoproterozoic, Neoproterozoic, Phanerozoic. Lower scale: 2000, 1750, 1500, 1250, 1000, 750, 500, 250, 0.]

Fig. S4. Time-calibrated tree of eukaryotes using All calibration points, 109 taxa, root estimated by BEAST, and constructed in BEAST (analysis e). Other notes
as in Fig. S1.

[Figure S5 image: time-calibrated tree with one tip label per taxon. Clade labels, top to bottom: SAR, Alveolates, Rhizaria, Stramenopiles, Haptophytes, Green algae, Red algae, Cryptomonads, Glaucocystophytes, Opisthokonta, Amoebozoa, Excavata. Upper scale: Paleoproterozoic, Mesoproterozoic, Neoproterozoic, Phanerozoic. Lower scale: 2000, 1750, 1500, 1250, 1000, 750, 500, 250, 0.]

Fig. S5. Time-calibrated tree of eukaryotes using Phanerozoic calibration points, 109 taxa, root estimated by BEAST, and constructed in BEAST (analysis f).
Other notes as in Fig. S1.

[Figure S6 image: time-calibrated tree with one tip label per taxon. Clade labels, top to bottom: SAR, Alveolates, Stramenopiles, Rhizaria, Haptophytes, Green algae, Cryptomonads, Glaucocystophytes, Red algae, Excavata, Opisthokonta, Amoebozoa. Upper scale: Paleoproterozoic, Mesoproterozoic, Neoproterozoic, Phanerozoic. Lower scale: 2000, 1750, 1500, 1250, 1000, 750, 500, 250, 0.]

Fig. S6. Time-calibrated tree of eukaryotes using All calibration points, 109 taxa, rooted on “Unikonta” and constructed in BEAST (analysis g). Other notes as
in Fig. S1.

[Figure S7 image: time-calibrated tree with one tip label per taxon. Clade labels, top to bottom: SAR, Alveolates, Rhizaria, Stramenopiles, Haptophytes, Green algae, Red algae, Cryptomonads, Glaucocystophytes, Excavata, Opisthokonta, Amoebozoa. Upper scale: Paleoproterozoic, Mesoproterozoic, Neoproterozoic, Phanerozoic. Lower scale: 2000, 1750, 1500, 1250, 1000, 750, 500, 250, 0.]

Fig. S7. Time-calibrated tree of eukaryotes using Phanerozoic calibration points, 109 taxa, rooted on “Unikonta” and constructed in BEAST (analysis h). Other
notes as in Fig. S1.

Table S1. Estimates of dates for the last common ancestor of extant eukaryotes across analyses

Analysis  Taxa  CCs      Root   Root age mean (Ma)  Root age range (Ma)  Model  Program     Tree
a         109   All      Opis   1774                1632–1911            UCL    BEAST       Fig. 2
b         109   Phan     Opis   1478                1362–1595            UCL    BEAST       Fig. S1
c         109   All 720  Opis   1679                1548–1797            UCL    BEAST       Fig. S2
d         91    All      Opis   1837                1725–1954            UCL    BEAST       Fig. S3
e         109   All      Estim  1784                1639–1939            UCL    BEAST       Fig. S4
f         109   Phan     Estim  1506                1365–1643            UCL    BEAST       Fig. S5
g         109   All      Uni    1717                1601–1819            UCL    BEAST       Fig. S6
h         109   Phan     Uni    1471                1347–1604            UCL    BEAST       Fig. S7
i         109   All      Opis   1866                1569–2235            UGAM   PhyloBayes  —
j         109   Phan     Opis   1594                1288–1979            UGAM   PhyloBayes  —
k         109   All      Uni    1810                1549–2161            UGAM   PhyloBayes  —
l         109   Phan     Uni    1561                1268–1886            UGAM   PhyloBayes  —
m         109   All      Opis   1798                1441–2133            CIR    PhyloBayes  —
n         109   Phan     Opis   1038                889–1350             CIR    PhyloBayes  —
o         109   All      Uni    1691                1048–2357            CIR    PhyloBayes  —
p         109   Phan     Uni    1180                897–1839             CIR    PhyloBayes  —

Root age range is the 95% HPD for BEAST analyses and the minimum and maximum ages of the 95% confidence interval for PhyloBayes. See Table S2 for details of taxon sampling, and Table 1 for calibration constraints. All trees are available in Dataset S1. All, 22 calibration points of Phanerozoic and Proterozoic age included; All 720, Bangiomorpha CC set to 720 Ma; CCs, calibration constraints; CIR, autocorrelated CIR model; Estim, root estimated by BEAST; model, molecular clock model; Opis, root constrained to Opisthokonta; Phan, calibration points of Phanerozoic age included; root, position of the root; UCL, uncorrelated log normal; UGAM, uncorrelated gamma model; Uni, root constrained to “Unikonta”.
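
One way to read Table S1 is to pair analyses that differ only in their calibration set and compare mean root ages. The Python sketch below does that arithmetic on the means transcribed from the table. The dictionary layout and the function name `calibration_effect` are hypothetical choices made for this illustration, and analyses c (All 720) and d (91 taxa) are left out because they have no Phanerozoic-only counterpart.

```python
# Mean root ages (Ma) transcribed from Table S1:
# analysis -> (taxa, CCs, root, model, program, mean root age)
TABLE_S1 = {
    "a": (109, "All", "Opis", "UCL", "BEAST", 1774),
    "b": (109, "Phan", "Opis", "UCL", "BEAST", 1478),
    "c": (109, "All 720", "Opis", "UCL", "BEAST", 1679),
    "d": (91, "All", "Opis", "UCL", "BEAST", 1837),
    "e": (109, "All", "Estim", "UCL", "BEAST", 1784),
    "f": (109, "Phan", "Estim", "UCL", "BEAST", 1506),
    "g": (109, "All", "Uni", "UCL", "BEAST", 1717),
    "h": (109, "Phan", "Uni", "UCL", "BEAST", 1471),
    "i": (109, "All", "Opis", "UGAM", "PhyloBayes", 1866),
    "j": (109, "Phan", "Opis", "UGAM", "PhyloBayes", 1594),
    "k": (109, "All", "Uni", "UGAM", "PhyloBayes", 1810),
    "l": (109, "Phan", "Uni", "UGAM", "PhyloBayes", 1561),
    "m": (109, "All", "Opis", "CIR", "PhyloBayes", 1798),
    "n": (109, "Phan", "Opis", "CIR", "PhyloBayes", 1038),
    "o": (109, "All", "Uni", "CIR", "PhyloBayes", 1691),
    "p": (109, "Phan", "Uni", "CIR", "PhyloBayes", 1180),
}

def calibration_effect(table):
    """Return {(all_analysis, phan_analysis): mean(All) - mean(Phan)} in Ma."""
    by_setting = {}
    for analysis, (taxa, ccs, root, model, program, mean) in table.items():
        if ccs in ("All", "Phan"):
            # Group analyses that share every setting except the calibration set.
            key = (taxa, root, model, program)
            by_setting.setdefault(key, {})[ccs] = (analysis, mean)
    effects = {}
    for pair in by_setting.values():
        if len(pair) == 2:
            (name_all, mean_all), (name_phan, mean_phan) = pair["All"], pair["Phan"]
            effects[(name_all, name_phan)] = mean_all - mean_phan
    return effects

for (name_all, name_phan), diff in calibration_effect(TABLE_S1).items():
    print(f"{name_all} vs {name_phan}: {diff} Ma older with All calibrations")
```

In every matched pair the mean root age is older when the Proterozoic calibrations are included, and the gap is largest under the autocorrelated CIR model.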

Table S2. Details of gene and taxon sampling
Lineage Taxon* 14-3-3 40S Actin αtub βtub Ef1α Ef2 Enolase Grc5 Hsp70cyt Hsp90 MetK Rps22a Rps23a Tsec61 Sum
Alveolates Alexandrium tamarense 1 1 1 1 1 1 1 1 1 1 1 1 1 14
Alveolates Chilodonella uncinata 1 1 1 1 1 1 1 1 1 10
Alveolates Crypthecodinium cohnii 1 1 1 1 5
Alveolates Eimeria tenella 1 1 1 1 1 1 1 1 1 10
Alveolates Heterocapsa rotundata 1 1 1 1 1 6
Alveolates Karenia brevis 1 1 1 1 1 1 1 1 1 1 11
Alveolates Nyctotherus ovalis 1 1 1 1 1 1 1 1 1 1 1 12
Alveolates Oxyrrhis marina 1 1 1 1 1 6
Alveolates Paramecium tetraurelia 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 16
Alveolates Perkinsus marinus 1 1 1 1 1 1 1 1 1 1 1 1 1 1 15
Alveolates Plasmodium berghei 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 16
Alveolates Sterkiella histriomuscorum 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 16
Alveolates Stylonychia lemnae 1 1 1 1 1 6
Alveolates Tetrahymena thermophila 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 16
Alveolates Theileria parva 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 16
Alveolates Toxoplasma gondii 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 16
Amoebozoa Acanthamoeba castellanii 1 1 1 1 1 1 1 1 1 1 1 1 1 1 15
Amoebozoa Arcella hemisphaerica 1 1 3
Amoebozoa Dictyostelium discoideum 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 16
Amoebozoa Entamoeba histolytica 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 16
Amoebozoa Hartmannella vermiformis 1 1 1 1 1 1 1 1 1 1 1 1 1 14
Amoebozoa Mastigamoeba balamuthi 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 16
Amoebozoa Physarum polycephalum 1 1 1 1 1 1 1 1 1 1 1 1 1 1 15
Amoebozoa Rhizamoeba sp. ATCC 50933 1 1 1 4
Animals Aphrocallistes vastus 1 1 1 1 5
Animals Apis mellifera 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 16
Animals Aplysia californica 1 1 1 1 1 1 1 1 1 1 1 1 1 14
Animals Branchiostoma floridae 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 16
Animals Caenorhabditis elegans 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 16
Animals Capitella capitata 1 1 1 1 1 1 1 1 1 1 11
Animals Drosophila melanogaster 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 16
Animals Gallus gallus 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 16
Animals Homo sapiens 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 16
Animals Mnemiopsis leidyi 1 1 1 1 1 1 1 1 1 1 1 12
Animals Nematostella vectensis 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 16
Animals Oscarella carmela 1 1 1 1 1 1 1 1 1 1 1 12
Animals Schistosoma mansoni 1 1 1 1 1 1 1 1 1 1 1 1 1 14
Choanoflagellida Monosiga brevicollis 1 1 1 1 1 1 1 1 1 1 1 1 1 1 15
Cryptophyta Goniomonas† 1 1 1 1 1 1 1 1 1 1 11
Cryptophyta Guillardia theta 1 1 1 1 1 1 1 1 1 10
Euglenozoa Bodo saltans 1 1 1 1 1 1 1 8
Euglenozoa Diplonema papillatum 1 1 1 1 1 1 1 1 1 1 1 1 1 14
Euglenozoa Entosiphon sulcatum 1 1 1 4
Euglenozoa Euglena gracilis 1 1 1 1 1 1 1 1 1 1 1 1 1 14
Euglenozoa Euglena longa 1 1 1 1 1 1 1 1 1 1 1 1 1 14
Euglenozoa Leishmania major 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 16
Euglenozoa Trypanosoma brucei 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 16
Fornicata Carpediemonas membranifera 1 1 1 1 5
Fornicata Giardia duodenalis ATCC 50803 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 16
Fornicata Spironucleus barkhanus 1 1 1 1 1 1 1 1 1 1 1 12
Fungi Allomyces macrogynus 1 1 1 1 1 1 1 1 1 1 1 1 13
Fungi Candida albicans 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 16
Fungi Glomus intraradices 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 16
Fungi Phanerochaete chrysosporium 1 1 1 1 1 1 1 1 1 1 1 1 1 1 15
Fungi Saccharomyces cerevisiae 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 16
Fungi Schizosaccharomyces pombe 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 16
Fungi Spizellomyces punctatus 1 1 1 1 1 1 1 1 1 1 1 12
Fungi Ustilago maydis 1 1 1 1 1 1 1 1 1 1 1 1 1 1 15
Glaucophytes Cyanophora paradoxa 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 16
Glaucophytes Glaucocystis nostochinearum 1 1 1 1 1 1 1 1 1 1 1 1 13
Haptophytes Emiliania huxleyi 1 1 1 1 1 1 1 1 1 1 1 1 1 1 15
Haptophytes Isochrysis galbana 1 1 1 1 1 1 1 1 1 1 1 1 1 1 15
Haptophytes Pavlova lutheri 1 1 1 1 1 1 1 1 1 1 1 1 13
Haptophytes Prymnesium parvum 1 1 1 1 1 1 1 1 1 1 1 1 13
Heterolobosea Naegleria gruberi 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 16
Heterolobosea Sawyeria marylandensis 1 1 1 1 1 1 1 1 1 1 1 1 13
Ichthyosporea Amoebidium parasiticum 1 1 1 1 1 1 1 1 1 1 11
Ichthyosporea Capsaspora owczarzaki 1 1 1 1 1 1 1 1 1 1 1 1 1 1 15
Ichthyosporea Sphaeroforma arctica 1 1 1 1 1 1 1 1 1 1 1 1 1 14
Jakodidae Jakoba libera 1 1 1 1 1 1 1 1 1 1 1 12
Jakodidae Reclinomonas americana 1 1 1 1 1 1 1 1 1 1 1 1 1 1 15
Jakodidae ‘Seculamonas ecuadoriensis’ 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 16
Kathablepharidae Leucocryptos marina 1 1 1 1 1 1 7
Malawimonas Malawimonas californiana 1 1 1 1 1 1 1 1 1 1 1 1 12
Malawimonas Malawimonas jakobiformis 1 1 1 1 1 1 1 1 1 1 1 1 1 1 15
Parabasalidea Trichomonas vaginalis 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 16
Preaxostyla Monocercomonoides sp. 1 1 1 1 1 6
Preaxostyla Streblomastix strix 1 1 1 1 5
Preaxostyla Trimastix pyriformis 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 16
Rhizaria Bigelowiella natans 1 1 1 1 1 1 1 1 1 1 1 12
Rhizaria Corallomyxa tenera 1 1 1 1 1 1 7
Rhizaria Gromia‡ 1 1 3
Rhizaria Heteromita§ 1 1 1 1 1 1 1 1 1 10
Rhizaria Ovammina opaca 1 1 1 4
Rhizaria Plasmodiophora brassicae 1 1 1 4
Rhizaria Reticulomyxa filosa 1 1 1 1 1 1 1 1 1 10
Red algae Chondrus crispus 1 1 1 1 1 1 1 1 1 1 1 1 13
Red algae Cyanidioschyzon merolae 1 1 1 1 1 1 7
Red algae Gracilaria changii 1 1 1 1 1 1 1 1 1 1 1 1 1 1 14
Red algae Porphyra yezoensis 1 1 1 1 1 1 1 1 1 1 1 1 1 14
Stramenopiles Apodachlya brachynema 1 1 1 1 1 1 7
Stramenopiles Aureococcus anophagefferens 1 1 1 1 1 1 1 1 1 1 1 1 1 1 15
Stramenopiles Ectocarpus siliculosus 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 16
Stramenopiles Heterosigma akashiwo 1 1 1 1 1 1 7
Stramenopiles Phaeodactylum tricornutum 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 16
Stramenopiles Phytophthora infestans 1 1 1 1 1 1 1 1 1 1 1 1 1 1 15
Stramenopiles Thalassiosira pseudonana 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 16
Green algae Acetabularia acetabulum 1 1 1 1 1 1 1 1 1 10
Green algae Arabidopsis thaliana 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 16
Green algae Chlamydomonas reinhardtii 1 1 1 1 1 1 1 1 1 1 1 1 1 1 15
Green algae Dunaliella salina 1 1 1 1 1 1 1 1 1 1 1 12
Green algae Ginkgo biloba 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 16
Green algae Mesostigma viride 1 1 1 1 1 1 1 1 1 1 1 1 1 14
Green algae Micromonas pusilla 1 1 1 1 1 1 1 1 1 1 1 1 1 1 15
Green algae Oryza sativa 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 16
Green algae Ostreococcus tauri 1 1 1 1 1 1 1 1 1 1 1 1 13
Green algae Physcomitrella patens 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 16
Green algae Volvox carteri 1 1 1 1 1 1 1 1 1 1 1 1 1 1 15
Green algae Welwitschia mirabilis 1 1 1 1 1 1 1 1 1 1 1 1 1 14

*Taxa in bold are included in both the 91-taxon and 109-taxon analyses.
†Composite of Goniomonas truncata and Goniomonas cf. pacifica.
‡Composite of G. oviformis and Gromia sp. Antarctica.
§Composite of H. globosa and Heteromita sp. ATCC PRA-74.

Table S3. PhyloBayes calibrations

                  Node specification                                        Calibration† (Ma)
Taxon*            Species 1                    Species 2                    Max    Min
Amniota           Gallus gallus                Homo sapiens                 400    328.3
Angiosperms       Arabidopsis thaliana         Welwitschia mirabilis        425    133.9
Ascomycetes       Schizosaccharomyces pombe    Phanerochaete chrysosporium  1,000  400
Coccolithophores  Emiliania huxleyi            Isochrysis galbana           260    203.6
Diatoms           Aureococcus anophagefferens  Thalassiosira pseudonana     550    133.9
Dinoflagellates   Karenia brevis               Crypthecodinium cohnii       300    240
Embryophytes      Mesostigma viride            Oryza sativa                 600    471
Endopterygota     Apis mellifera               Drosophila melanogaster      350    284.4
Eudicots          Arabidopsis thaliana         Oryza sativa                 133.9  125
Euglenids         Entosiphon sulcatum          Euglena gracilis             3,000  450
Foraminifera      Ovammina opaca               Reticulomyxa filosa          3,000  542
Gonyaulacales     Alexandrium tamarense        Crypthecodinium cohnii       240    196
Pennate diatoms   Phaeodactylum tricornutum    Thalassiosira pseudonana     110    80
Spirotrichs       Sterkiella histriomuscorum   Stylonychia lemnae           3,000  444
Tracheophytes     Physcomitrella patens        Arabidopsis thaliana         471    425
Vertebrates       Branchiostoma floridae       Homo sapiens                 555    520
Animals           Nematostella vectensis       Capitella capitata           3,000  632
Arcellinida       Arcella hemisphaerica        Rhizamoeba sp                3,000  736
Bilateria         Branchiostoma floridae       Capitella capitata           630    555
Chlorophytes      Acetabularia acetabulum      Volvox carteri               3,000  700
Ciliates          Paramecium tetraurelia       Chilodonella uncinata        3,000  736
Florideophyceae   Chondrus crispus             Porphyra yezoensis           3,000  550
Red algae         Cyanidioschyzon merolae      Chondrus crispus             3,000  1,174

*Taxon is same as in Table 1; see Table 1 for other notes.
†Calibrations in PhyloBayes are specified as a uniform distribution with minimum and maximum dates, and were run with soft bounds.
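
The source says only that the PhyloBayes calibrations were uniform between a minimum and a maximum date and were run with soft bounds; it does not give the tail shape. As a minimal sketch of the general idea, the Python function below places most of the prior mass uniformly between the bounds and lets a small share leak beyond each one. The 2.5% tail mass, the power-law lower tail, the exponential upper tail, and the name `soft_uniform_density` are all assumptions chosen for illustration, not PhyloBayes's documented implementation.

```python
import math

def soft_uniform_density(t, t_min, t_max, tail=0.025):
    """Density at age t of a uniform prior on [t_min, t_max] with soft bounds.

    Assumption: each bound leaks `tail` probability mass, and the tails are
    shaped so the density is continuous at t_min and t_max.
    """
    if not 0 < t_min < t_max:
        raise ValueError("require 0 < t_min < t_max")
    if not 0 < tail < 0.5:
        raise ValueError("tail must be in (0, 0.5)")
    body = (1.0 - 2.0 * tail) / (t_max - t_min)  # flat part between the bounds
    if t <= 0:
        return 0.0  # ages cannot be negative
    if t < t_min:
        # Power-law tail below the minimum age; integrates to `tail` on (0, t_min).
        theta = body * t_min / tail
        return tail * (theta / t_min) * (t / t_min) ** (theta - 1.0)
    if t <= t_max:
        return body
    # Exponential tail above the maximum age; integrates to `tail` beyond t_max.
    rate = body / tail
    return tail * rate * math.exp(-rate * (t - t_max))

# Amniota bounds from Table S3 (Ma): min 328.3, max 400.
for age in (320.0, 350.0, 405.0):
    print(age, soft_uniform_density(age, 328.3, 400.0))
```

With hard bounds the density would be exactly zero outside the interval, so a node age could never fall outside the calibration. Soft bounds keep a small nonzero density there, which lets the sequence data pull a node slightly past a fossil-based limit.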

Other Supporting Information Files

Dataset S1 (XLS)
