doi: 10.1111/j.1365-294X.2006.02882.x
INVITED REVIEW
Abstract
Microbial ecology examines the diversity and activity of micro-organisms in Earth's
biosphere. In the last 20 years, the application of genomics tools has revolutionized
microbial ecological studies and drastically expanded our view of the previously underappreciated microbial world. This review first introduces the basic concepts in microbial
ecology and the main genomics methods that have been used to examine natural microbial
populations and communities. In the ensuing three sections, the applications of
genomics in microbial ecological research are highlighted. The first describes the
widespread application of multilocus sequence typing and representational difference
analysis in studying genetic variation within microbial species. Such investigations have
identified that migration, horizontal gene transfer and recombination are common in
natural microbial populations and that microbial strains can be highly variable in genome
size and gene content. The second section highlights and summarizes the use of four
specific genomics methods (phylogenetic analysis of ribosomal RNA, DNA-DNA reassociation kinetics, metagenomics, and micro-arrays) in analysing the diversity and
potential activity of microbial populations and communities from a variety of terrestrial
and aquatic environments. Such analyses have identified many unexpected phylogenetic
lineages in viruses, bacteria, archaea, and microbial eukaryotes. Functional analyses of
environmental DNA also revealed highly prevalent, but previously unknown, metabolic
processes in natural microbial communities. In the third section, the ecological implications
of sequenced microbial genomes are briefly discussed. Comparative analyses of prokaryotic
genomic sequences suggest the importance of ecology in determining microbial genome
size and gene content. The significant variability in genome size and gene content among
strains and species of prokaryotes indicates the highly fluid nature of prokaryotic genomes,
a result consistent with those from multilocus sequence typing and representational difference
analyses. The integration of various levels of ecological analyses, coupled with the application
and further development of high throughput technologies, is accelerating the pace of
discovery in microbial ecology.
Keywords: Cryptococcus, gene genealogy, microbial diversity, microbial sex, systems microbiology
Received 22 September 2005; revision accepted 14 December 2005
Introduction
Micro-organisms have been integral to the history and
function of life on Earth. They have played central roles
in Earth's climatic, geological, geochemical, and biological
evolution. However, until very recently, the general importance of micro-organisms has been appreciated by only a
few specialists. Indeed, micro-organisms are still most
often considered from an anthropocentric perspective, with
attention focused on the relatively few species that cause
human diseases and the potential of micro-organisms to
provide useful products and services. The recent advances
in genomics are offering fresh perspectives on this
previously underappreciated microbial world.
1714 J . X U
The microbial world contains a highly heterogeneous
group of organisms sharing only one common characteristic,
their small size. These organisms make up two (out of
three) entire Domains of life on Earth, the prokaryotic
Bacteria and Archaea (Woese 1987). Within the third
Domain, Eukarya, the majority of the phylogenetic diversity is contained within eukaryotic micro-organisms such
as protozoa, algae, and fungi. Prokaryotic life emerged
about 3.8 billion years ago, about 2 billion years before
eukaryotic life arose. Currently, microbial life forms are
found in virtually every imaginable ecological niche on
Earth, from the tropics to the Arctic and Antarctica, from
underground mines and oil fields to the stratosphere and
the top of great mountains, from deserts to the Dead Sea,
from above-ground hot springs to underwater hydrothermal vents.
Microbial ecology examines the diversity of micro-organisms and how micro-organisms interact with each
other and with their environment to generate and
maintain such diversity. Consequently, microbial ecologists have traditionally focused on two areas of study:
(i) microbial diversity, including the isolation, identification
and quantification of micro-organisms in various habitats;
and (ii) microbial activity, that is, what micro-organisms are
doing in their habitats and how their activities contribute
to the observed microbial diversity and biogeochemical
cycling.
Microbial diversity in the environment can be measured
by various indices such as phylogenetic diversity, species
diversity, genotype diversity, and gene diversity (Box 1).
Above the species level, microbial diversity is commonly
quantified based on evolutionary distances among observed
taxonomic groups from a specific environment (e.g. the
phylogenetic diversity based on a common chronometer
such as the 16S ribosomal RNA subunit). Below the species
level, microbial diversity is typically described using
population genetic parameters such as gene diversity and
genotype diversity. Gene diversity and genotype diversity
refer respectively to the probability that two randomly
drawn genes and genotypes in a population will be different.
At the species level, microbial diversity is measured as
[Box fragment: functional diversity; morphological diversity; structural diversity; metabolic diversity; metabolite diversity; protein diversity]
M I C R O B I A L E C O L O G I C A L G E N O M I C S 1715
and biogeography. The patterns of distributions are often
discussed in the context of environmental factors such as
temperature, pH, salinity, pressure, the availabilities of
water and nutrients, and the sources of energy and carbon.
These ecological factors influence microbial activities and
play very important roles in determining the spatial and
temporal dynamics of micro-organisms in natural environments. Consequently, microbial ecologists often group
micro-organisms into specific metabolic categories. For
example, depending on the energy source, micro-organisms
are called either phototrophs (obtaining energy from light)
or chemotrophs (obtaining energy from chemicals). Among
chemotrophs, if the energy sources are from inorganic
molecules (such as H2S, H2, NH3, and Fe2+), they are called
chemolithotrophs. In contrast, if their energy sources are
from organic compounds, they are called chemoorganotrophs. Similarly, depending on the carbon source, microorganisms can be either autotrophs (obtaining carbon from
inorganic sources such as CO2 and HCO3 ) or heterotrophs
(obtaining carbon from organic compounds). Some microorganisms, either in a free-living state or in association
with other organisms, can use atmospheric nitrogen as its
nitrogen source. Indeed, the diversity of microbial metabolisms extends far beyond the typical animal and plant
metabolic capabilities. Even more striking are the extreme
environmental conditions in which many micro-organisms
are found thriving. These conditions include extremes
of pressure, pH, oxygen and metal concentration, salinity, radiation, desiccation, and temperature
(Rothschild & Mancinelli 2001). For example, the nitrate-reducing chemolithoautotroph Pyrolobus fumarii can grow
at temperatures of up to 113 °C (Blöchl et al. 1997).
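The naming scheme described above is compositional: one part of the name reflects the energy source and the other the carbon source. A minimal sketch can make this explicit. The function name and the argument labels below are assumptions introduced for illustration, not an established classification tool.

```python
def classify_metabolism(energy_source: str, carbon_source: str) -> str:
    """Compose a metabolic category name from energy and carbon sources.

    energy_source: "light", "inorganic" or "organic" (hypothetical labels)
    carbon_source: "inorganic" (e.g. CO2) or "organic" (hypothetical labels)
    """
    # Energy source determines the prefix of the name.
    energy_prefix = {
        "light": "photo",
        "inorganic": "chemolitho",
        "organic": "chemoorgano",
    }
    # Carbon source determines the suffix of the name.
    carbon_suffix = {
        "inorganic": "autotroph",
        "organic": "heterotroph",
    }
    if energy_source not in energy_prefix:
        raise ValueError(f"unknown energy source: {energy_source!r}")
    if carbon_source not in carbon_suffix:
        raise ValueError(f"unknown carbon source: {carbon_source!r}")
    return energy_prefix[energy_source] + carbon_suffix[carbon_source]


# Pyrolobus fumarii is described in the text as a chemolithoautotroph.
print(classify_metabolism("inorganic", "inorganic"))  # chemolithoautotroph
```

The sketch covers only the energy and carbon axes named in the text; other traits, such as nitrogen fixation, would need separate fields.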
Micro-organisms in the environment are commonly
organized into several hierarchical levels,
from simple to complex: individuals, populations, guilds
(metabolically related populations), communities (sets of
interacting guilds), and ecosystems. A microbial ecosystem consists of both the microbial community and its interacting biotic (macro-organisms such as plants and animals)
and abiotic environmental factors (pH, temperature, inorganic and organic nutrients, etc.). While we commonly
think of micro-organisms as decomposers of organic wastes
and pathogens of plants, animals and humans, micro-organisms can also form mutualistic associations with each
other as well as be fierce predators of other micro-organisms.
For example, the minute bacterium Bdellovibrio (0.3 µm in
diameter) can quickly destroy an Escherichia coli cell many
times its own size (1-2 µm) (Nunez et al. 2003).
Until very recently, most of what we know about microbial diversity and microbial activity was derived from
cultured microbes and ex situ laboratory experimental
investigations. While such studies are essential, recent
investigations using high resolution microelectronic, microscopic, and genomic tools have shown that much of what
© 2006 Blackwell Publishing Ltd, Molecular Ecology, 15, 1713-1731
Genomics tools
The word 'genomics' has become a trendy term widely
used by the scientific community and the general public.
Originally, the term was used to describe a specific discipline in genetics that deals with mapping, sequencing and
analysing genomes. A genome refers to the complete set of
genes and chromosomes in an organism. While many people
use 'genomics' in this narrow sense, an increasing number
of people have expanded its use to include functional analysis of entire genomes as well. These functional analytical
aspects include those on whole genome RNA transcripts
(called transcriptomics), proteins (proteomics), and metabolites (metabolomics). In addition, various combinations
of -omics terms have recently become highly fashionable.
For example, the discipline that uses genomics methods to
analyse natural ecological communities has been called
metagenomics, ecological genomics, community genomics,
and environmental genomics. In this section, the main
genomics tools and methods are briefly described with a
focus on those dealing with DNA (Box 3).
DNA sequencing
The most significant technical advance in genomics is the
development of efficient, high throughput DNA-sequencing
techniques and instruments. While the basic principle for
DNA sequencing was established in the mid-1970s, it was
not until the mid-1990s that efficient automated DNA
sequencers and fluorescent dyes to tag the dideoxyribonucleotides (with one colour for each of the four types of
nucleotides) were developed. At present, high throughput
DNA sequencing facilities are found in most academic
institutions and many molecular biology laboratories.
Furthermore, faster and cheaper sequencing methods and
equipment are continuously being developed. For example, the
recently developed pyrosequencing protocol used a novel
fibre-optic slide of individual wells. This method could
sequence 25 million bases in one 4-hour run with an
accuracy of 99.96% (Margulies et al. 2005).
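The throughput figures quoted above can be turned into a per-second rate and an expected error count with simple arithmetic. The sketch below assumes that the reported 99.96% is a per-base accuracy applied uniformly across the run, which is an interpretation made here for illustration; the function name is likewise hypothetical.

```python
def run_summary(bases_per_run: int, run_hours: float, accuracy: float):
    """Return (bases per second, expected number of miscalled bases).

    Assumes `accuracy` is a uniform per-base accuracy (an assumption).
    """
    seconds = run_hours * 3600
    bases_per_second = bases_per_run / seconds
    expected_errors = bases_per_run * (1 - accuracy)
    return bases_per_second, expected_errors


# Figures quoted in the text: 25 million bases, one 4-hour run, 99.96% accuracy.
rate, errors = run_summary(25_000_000, 4, 0.9996)
print(f"{rate:.0f} bases per second")        # 1736 bases per second
print(f"{errors:.0f} expected miscalls")     # 10000 expected miscalls
```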
[Box fragment: gas chromatography; mass spectrophotometry; bioinformatics]
Hybridization techniques
Several other traditional DNA analytical techniques have
also been widely used in microbial ecological studies. These
include DNA re-association kinetic analysis and fluorescent
in situ hybridization (FISH). Using fluorescently tagged
specific probes, FISH allows the direct observation and
estimation of micro-organisms from specific species, genera,
families or phyla in a given environmental sample. In contrast, the analyses of DNA re-association kinetics can be
used to provide estimates on the diversity of microbial
genomes in environmental DNA samples.
More recently, the high throughput micro-array technology has been applied to analyse the distributions of
genes and species in natural microbial consortia (Zhou
2003). DNA micro-arrays are glass surfaces to which arrays
of specific DNA fragments of various lengths have been
attached at discrete locations. These fragments serve as
probes for hybridization. Under conditions suitable for
hybridization, the DNA spots on the chip are exposed to a
solution containing a complex sample of fluorescently labelled
DNA. These arrays may contain probes of lengths from
25 to several hundred or even over a thousand base pairs.
While most micro-arrays are derived from single genomes,
arrays containing specific genes from multiple genomes
can also be very useful for studying the distributions and
activities of groups of micro-organisms in nature (Zhou
2003; Lehner et al. 2005).
                  No. of genomes   Mean (± SD)      Range
Terrestrial       11               4.92 (± 1.13)    3.28-7.25
Multiple          65               4.29 (± 1.87)    1.40-9.12
Aquatic           26               3.14 (± 1.60)    1.31-7.15
Host-associated   122              2.57 (± 1.64)    0.49-9.11
Specialized       23               2.29 (± 0.92)    0.71-5.37
Unknown           3                3.47 (± 2.37)    0.80-5.31
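The per-category counts and means tabulated above can be combined into a single pooled mean by weighting each category mean by its number of genomes. The sketch below does only that arithmetic; the variable and function names are assumptions, and the values carry whatever unit the table uses (presumably megabases, although the extracted table does not state it).

```python
# (category, number of genomes, mean) as listed in the table above.
table = [
    ("Terrestrial", 11, 4.92),
    ("Multiple", 65, 4.29),
    ("Aquatic", 26, 3.14),
    ("Host-associated", 122, 2.57),
    ("Specialized", 23, 2.29),
    ("Unknown", 3, 3.47),
]


def pooled_mean(rows):
    """Weighted mean of the category means, weighted by genome count."""
    total = sum(n for _, n, _ in rows)
    return sum(n * mean for _, n, mean in rows) / total


print(sum(n for _, n, _ in table))    # 250 genomes in total
print(round(pooled_mean(table), 2))   # 3.16
```

Because host-associated genomes make up almost half of the sample, the pooled mean sits much closer to their mean than to that of the terrestrial group.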
Bioinformatics
Of all the methods mentioned above, none would have
been successful in microbial ecological research without
bioinformatics tools. Broadly defined, bioinformatics refers
to the use of computers to seek patterns in the observed
biological data and to propose mechanisms for such patterns.
As can be seen below, bioinformatics not only can
help us directly address experimental research objectives but
can also integrate information from various sources and seek
patterns not achievable through experimentation alone.
amplified polymorphic DNA or RAPD, amplified fragment
length polymorphisms or AFLP, restriction fragment length
polymorphisms or RFLP, PCR-RFLP and PCR fingerprinting)
that have been applied to analyse microbial populations,
DNA sequence-based typing has many advantages. First,
nucleotides in a DNA sequence are unambiguous. Such
certainty is essential for many analyses. Second, nucleotides
in a given DNA fragment typically share an extended evolutionary history. Such sharing cannot be assumed between
genetic markers from different parts of the genome, such as those
obtained with other methods. Third, DNA sequences can
be easily stored in and retrieved from public databases
such as GenBank. Existence of such public databases makes
data-sharing among investigators possible. Fourth, many
analytical tools for DNA sequences are available. Indeed,
many methods have been developed to infer a variety of
processes governing the changes in populations and
species (Xu 2005).
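Because nucleotides are unambiguous, sequence-based typing reduces naturally to bookkeeping: each distinct sequence at a locus receives an allele number, and each distinct combination of allele numbers receives a sequence type. The following is a minimal sketch of that bookkeeping under simplifying assumptions (equal, pre-trimmed fragments and numbering in order of first appearance); the function name, locus names and toy sequences are hypothetical and do not reproduce any curated MLST database scheme.

```python
def assign_sequence_types(strain_sequences):
    """Map each strain to (allele profile, sequence type number).

    strain_sequences: dict of strain name -> dict of locus name -> sequence.
    Alleles and sequence types are numbered in order of first appearance.
    """
    loci = sorted({locus for seqs in strain_sequences.values() for locus in seqs})
    allele_numbers = {}  # locus -> {sequence: allele number}
    st_numbers = {}      # allele profile (tuple) -> sequence type number
    result = {}
    for strain, seqs in strain_sequences.items():
        profile = []
        for locus in loci:
            known = allele_numbers.setdefault(locus, {})
            # A new sequence gets the next unused number at this locus.
            profile.append(known.setdefault(seqs[locus], len(known) + 1))
        profile = tuple(profile)
        result[strain] = (profile, st_numbers.setdefault(profile, len(st_numbers) + 1))
    return result


# Toy data: two hypothetical loci, three hypothetical strains.
strains = {
    "s1": {"geneA": "ACGT", "geneB": "TTGA"},
    "s2": {"geneA": "ACGT", "geneB": "TTGC"},
    "s3": {"geneA": "ACGT", "geneB": "TTGA"},
}
for name, (profile, st) in assign_sequence_types(strains).items():
    print(name, profile, st)
```

In this toy example, strains s1 and s3 share the profile (1, 1) and therefore the same sequence type, while s2 differs at the second locus.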
MLST has been used to study the ecological genetics of
many microbial populations. It provides fine-scale measures
of gene diversity and genotype diversity among microbial
populations. These patterns of diversity have been used to
infer a variety of ecological and evolutionary processes such
as gene flow, cryptic speciation, hybridization, and the
relative importance of clonality and recombination among
analysed populations (Box 4). In human pathogenic
bacteria where much of the initial MLST work was carried
out, MLST allows the identification of medically important
strains and clones. There are several recent topical reviews
for readers interested in MLST of human bacterial pathogens (e.g. Urwin & Maiden 2003; Feil & Enright 2004). In
contrast, other environmentally more relevant groups of
micro-organisms are less researched or discussed.
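Gene diversity and genotype diversity, defined earlier as the probability that two randomly drawn genes or genotypes differ, can be computed directly from MLST allele profiles. The sketch below uses the sampling-without-replacement form of that probability, which is a choice made here for illustration; the function name and the toy profiles are assumptions.

```python
from collections import Counter


def diversity(items):
    """Probability that two items drawn without replacement are different."""
    n = len(items)
    if n < 2:
        raise ValueError("need at least two items")
    counts = Counter(items)
    # Number of ordered pairs of identical items.
    same_pairs = sum(c * (c - 1) for c in counts.values())
    return 1 - same_pairs / (n * (n - 1))


# Toy allele profiles for four hypothetical strains at two loci.
profiles = [(1, 1), (1, 2), (1, 1), (2, 2)]

genotype_diversity = diversity(profiles)
gene_diversity = [diversity([p[i] for p in profiles]) for i in range(2)]

print(round(genotype_diversity, 3))             # 0.833
print([round(h, 3) for h in gene_diversity])    # [0.5, 0.667]
```

Applying the same function to whole profiles gives genotype diversity, and applying it to one locus at a time gives gene diversity per locus.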
Using specific examples, the following two subsections
illustrate how MLST has been used to address microbial
ecological questions. The first subsection provides a brief
description on how MLST has been used to address
evolutionary divergence, dispersion, hybridization, and
the origin of a population in a soil basidiomycete fungus,
Cryptococcus neoformans. The second subsection highlights
recent evidence for recombination in natural populations
of viruses, bacteria, protozoa, algae and fungi.
[Box fragment: hybridization; niche specialization; host shifts; adaptive evolution]
MLST in C. neoformans.
C. neoformans (= Filobasidiella neoformans) is a soil fungus
that can cause significant infections in humans and other
mammals throughout the world. This species has been
traditionally classified into five serotypes A, B, C, D, and
AD. To understand the evolutionary relationships among
strains, geographic populations, and serotypes and to address
ecological genetic questions, a series of gene genealogy-based studies were conducted. The first analysed 34 strains
from various locations around the world, including 14
serotype A strains, 7 serotype D strains, 3 serotype B strains,
5 serotype C strains, 3 serotype AD strains and 2 strains
whose serotypes could not be determined (Xu et al. 2000).
Fragments of four genes were analysed for each strain, three
from different chromosomes of the nuclear genome and one
from the mitochondrial genome. Phylogenetic analysis of each
of the four genes indicated considerable divergence among
serotypes A, D, B, and C, suggesting that individual serotypes
A, D, B, and C are good phylogenetic species (Fig. 1).
However, there was little geographic pattern of genetic
variation. No correlation between geographic distance and
DNA sequence divergence among strains was observed
either within a serotype or the whole analysed population.
The results are consistent with recent dispersals of C.
neoformans throughout the world (Xu et al. 2000; Xu 2002).
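One common way to ask whether geographic distance and sequence divergence are correlated among strains is a Mantel-type permutation test on the two pairwise distance matrices. The text does not state which test was used in the cited studies, so the sketch below is an illustrative stand-in rather than the authors' method; the function names and the toy matrices are assumptions.

```python
import random
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def mantel_test(geo, gen, permutations=999, seed=0):
    """Correlate two symmetric distance matrices; one-sided permutation p-value."""
    n = len(geo)
    pairs = list(combinations(range(n), 2))
    geo_vec = [geo[i][j] for i, j in pairs]
    observed = pearson(geo_vec, [gen[i][j] for i, j in pairs])
    rng = random.Random(seed)
    order = list(range(n))
    hits = 0
    for _ in range(permutations):
        # Shuffle strain labels of the genetic matrix, keeping its structure.
        rng.shuffle(order)
        permuted = [gen[order[i]][order[j]] for i, j in pairs]
        if pearson(geo_vec, permuted) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Toy data: four hypothetical strains sampled along a line at 0, 1, 3 and 7 units.
positions = [0, 1, 3, 7]
geo = [[abs(a - b) for b in positions] for a in positions]
gen = [[0.01 * d for d in row] for row in geo]  # divergence proportional to distance
r, p = mantel_test(geo, gen)
print(round(r, 3), p)
```

In the toy data, divergence is constructed to increase with distance, so the correlation is close to 1. The pattern reported in the text corresponds to the opposite outcome, a correlation indistinguishable from zero.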
Strains of serotype AD were quite different from those of
serotypes A, B, C, and D. While most strains of
serotypes A, B, C, and D examined so far were haploid,
strains of serotype AD are diploid or aneuploid. Furthermore, direct sequencing of PCR products from serotype
AD strains often failed to obtain clear chromatograms and
DNA sequences. Such results suggested sequence heterogeneity within individual strains. To investigate their origin
and relationships to strains of other serotypes, alleles of
two different genes from strains of serotype AD were
individually cloned, sequenced and compared to strains of
serotypes A, B, C, and D (Xu et al. 2002; Xu & Mitchell
2003). Sequence comparisons revealed that most strains
contained two different alleles with one allele highly similar to the serotype A group and the other to the serotype D
group. Further phylogenetic analyses identified that these
serotype AD strains were recent hybrids between strains
of serotypes A and D, and that there have been multiple
hybridization events in C. neoformans (Fig. 2; Xu et al. 2002;
Xu & Mitchell 2003). A recent study applied the same
MLST method to identify the origin of a Cryptococcus population responsible for an unusual outbreak in animal and
human populations on Vancouver Island, British Columbia,
Canada (Kidd et al. 2005). The analyses suggested that the
Vancouver Island population contained at least two evolutionarily divergent elements shared by strains from many
other geographic areas, consistent with cryptic speciation
and recent migration observed earlier for Cryptococcus (Xu
et al. 2000; Kidd et al. 2005).
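The logic used to recognise serotype AD strains as hybrids (two cloned alleles per strain, one resembling serotype A and the other serotype D) can be sketched with a simple distance comparison. The cited studies relied on phylogenetic analyses, so the nearest-reference rule below is a deliberately simplified stand-in; the function names and the toy sequences are assumptions.

```python
def hamming(a, b):
    """Number of differing positions between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x != y for x, y in zip(a, b))


def nearest_group(allele, references):
    """Return the reference group containing the sequence closest to `allele`.

    references: dict of group name -> list of aligned sequences.
    Ties are resolved in favour of the group listed first.
    """
    return min(references, key=lambda g: min(hamming(allele, r) for r in references[g]))


def looks_like_ad_hybrid(allele_1, allele_2, references):
    """True if one allele is closest to group A and the other to group D."""
    groups = {nearest_group(allele_1, references), nearest_group(allele_2, references)}
    return groups == {"A", "D"}


# Toy reference sequences for serotype A and D groups (hypothetical).
references = {
    "A": ["ACGTACGT", "ACGTACGA"],
    "D": ["TCGTTCGA", "TCGTTCGT"],
}
print(looks_like_ad_hybrid("ACGTACGT", "TCGTTCGA", references))  # True
print(looks_like_ad_hybrid("ACGTACGT", "ACGTACGA", references))  # False
```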
Fig. 1 One most parsimonious tree for 34 isolates of Cryptococcus neoformans from each of the four gene regions sequenced. CI, consistency
index; RI, retention index. Numbers above each branch are bootstrap values > 50% and based on 500 replicates. For URA5 and LAC trees,
branches with > 50% of bootstrap values were also strict consensus branches. Strain designation indicates serotype, isolate name, and
geographic origin (CA, California; NYC, New York City; NC, North Carolina, all from the USA). With the exception of five strains (see text),
all major phylogenetic groups correspond to traditional classifications. Of the two serologically untypable strains, one (M0024) clustered
consistently with the serotype D group and the other (M0053) clustered consistently with the serotype A group. Two of the three strains of
serotype AD, CN110.97 and CN196.88, clustered consistently with the serotype A group, while the other (KW5) lacked a consistent affinity
with any of the serotypes. Scale bar represents one nucleotide substitution. (Xu et al. 2000). Reproduced by permission.
Fig. 2 One of the 10 most parsimonious
trees for the 28 LAC sequences from 14 strains
of serotype AD in Cryptococcus neoformans.
For comparison, five representative sequences
from serotype A (E1, CN-A, MMRL750, J10
and ZG280) and five from serotype D (B10,
CN-D, J9, MMRL751 and MMRL757) were
included in this figure. These 10 sequences
were shown in Fig. 1 and represented the
genetic diversity of serotypes A and D
strains. Numbers above branches are
bootstrap values > 50% and based on 1000
replicates. Designations for strains of serotypes A and D included the isolate name,
geographic origin (CA, California; NYC,
New York City, both in the USA), and
serotype. For the 28 serotype AD sequences,
strain designations are followed by 1 or
2 to indicate the two alleles within each
strain. Midpoint rooting is used for this
phylogeny but the tree topology is identical
to that when serotype B or C sequences were
used as outgroups. Scale bar represents
one nucleotide substitution (Xu et al. 2002).
Reproduced by permission.
Box 5 Genomic studies suggest all microbial populations have a clonal component. However, signatures
of recombination are pervasive in natural populations of viruses, bacteria, fungi, algae and protozoa.
Despite significant efforts, no ancient asexual microbes
have been convincingly demonstrated.
and occasionally across kingdoms and/or domains. Indeed,
signatures of horizontal gene transfer are ubiquitous
among the sequenced prokaryotic genomes (e.g. Koonin
2003).
Recombination in eukaryotic microbes. Similar to observations
in natural viral and bacterial populations, molecular
investigations have identified that almost all eukaryotic
microbial populations show signatures of recombination
in nature. Examples include those from the algal species
Bostrychia moritziana (West & Zuccarello 1999); pathogenic
protozoan species such as Trypanosoma cruzi (the causal
agent of Chagas disease, Bogliolo et al. 1996) and
the malaria parasites Plasmodium falciparum (Conway et al.
1999) and Plasmodium vivax (Putaporntip et al. 2002); and fungal
species such as C. neoformans mentioned above (Xu &
Mitchell 2003). Interestingly, many of the fungal species
previously thought to reproduce only asexually (the
Deuteromycota or Fungi Imperfecti) have been found to
contain signatures of recombination in natural populations
(Xu 2005). Among examined fungi, the degrees of sexuality
differ greatly, from panmictic to largely clonal (James 2005;
Pujol et al. 2005; Xu et al. 2005). At present, plant and human
pathogens dominate the examined species in the literature.
However, limited evidence from other groups of fungi
suggests a similar pattern: abundant evidence for clonality
and limited but unambiguous evidence for recombination
(James 2005; Pujol et al. 2005; Xu et al. 2005).
Fig. 3 Overview of the representational
difference analysis of genomic differences
between strains of Sinorhizobium meliloti
(modified from Guo et al. in press). Tester
(T): ATCC9930. Driver (D): Rm1021. Filled
black boxes: DNA adaptors. Unfilled boxes:
tester DNA. Shaded boxes: driver DNA.
Fig. 4 Application of micro-array in the analysis of genomic differences between strains of Sinorhizobium meliloti. In this figure, red
represents hybridization signal from one strain; green represents hybridization signal from a different strain; and yellow represents that
both strains have the probe sequence. In each of the four subarrays, there are three vertically divided repeats. As can be seen from the arrays,
the repeatability of using micro-arrays to screen for gene content differences among strains is high.
Box 7 Genomic analyses of natural microbial communities are revealing extremely rich and highly
variable DNA sequences from forest soils, pastures, and
aquatic environments, both pristine and contaminated. Bioinformatic analyses of such
sequences suggest the existence of many uncultured
taxonomic groups of viruses, bacteria, archaea, fungi
and protozoa.
five main phyla: Chytridiomycota, Zygomycota, Glomeromycota, Basidiomycota, and Ascomycota (Moncalvo 2005).
Several recent studies of environmental DNA identified
major groups of unexpected fungal diversity in a variety of
environments. For example, in the analysis of fungal DNA
from the roots of the grass Arrhenatherum elatius, Vandenkoornhuyse et al. (2002) found 49 unique phylotypes from
a random library of 200 18S rRNA clones. Surprisingly,
only 7 of the 49 were found closely related to known sequences (> 99% identity). They found five distinct lineages
significantly different from all known fungal sequences (in
a pool of over 1200 at their time of analysis). In another
study by Schadt et al. (2003), culture-independent methods
were used to assess the seasonal dynamics of fungal
diversity in tundra soil in Colorado. Results revealed three
major groups of fungi significantly different from existing
classes and phyla. Their results also demonstrated that
fungi account for the majority of the biomass under snow
in the analysed environment (Schadt et al. 2003). Results
from these and other fungal community studies suggest
that there are likely over 1.5 million species of fungi in
Earth's biosphere, about 20 times the number of currently
named fungal species.
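The criterion used above for calling an environmental clone closely related to a known sequence (> 99% identity) can be sketched as a pairwise comparison of aligned sequences. The example assumes the sequences are already aligned to equal length and ignores alignment columns containing gaps, which is a simplification; the function names and toy sequences are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percent identical positions between two pre-aligned sequences.

    Columns with a gap ('-') in either sequence are ignored (an assumption).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    compared = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not compared:
        raise ValueError("no comparable positions")
    matches = sum(a == b for a, b in compared)
    return 100.0 * matches / len(compared)


def is_closely_related(query, known_sequences, threshold=99.0):
    """True if the query exceeds the identity threshold with any known sequence."""
    return any(percent_identity(query, k) > threshold for k in known_sequences)


# Toy 200-base reference and two hypothetical clones.
known = "ACGT" * 50
clone_one_mismatch = "T" + known[1:]        # 199 of 200 positions identical
clone_four_mismatches = "TGCA" + known[4:]  # 196 of 200 positions identical

print(percent_identity(clone_one_mismatch, known))         # 99.5
print(is_closely_related(clone_one_mismatch, [known]))     # True
print(is_closely_related(clone_four_mismatches, [known]))  # False
```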
Viruses. Viruses are extremely abundant in natural environments. They contribute significantly to both prokaryote
and eukaryote population dynamics. Recent culture-independent studies have identified that both DNA-based and
RNA-based viruses are common in terrestrial as well as
freshwater and marine environments (Edwards & Rohwer
2005). For example, in an analysis of picorna-like viruses (a
group of positive-sense single-stranded RNA viruses that
are major pathogens of plants and animals), Culley et al.
(2003) identified high, unexpected diversity in the sea.
Indeed, all of the picorna-like sequences from marine
samples were different from known picorna-like viruses
in the databases. Of specific note is a virus isolated in this
study that is a lytic pathogen of the toxic-bloom-forming alga
Heterosigma akashiwo. This result suggests that picorna-like
viruses may be important contributors in the regulation of
marine phytoplankton population dynamics.
Metagenomics
Metagenomics refers to the study of the collective genomes
in an environmental community. Such a community may
be a soil or a marine water sample that contains substantially more genetic information than is available in the
cultured subset. Studies of metagenomes typically involve
cloning fragments of DNA isolated directly from microbes
in natural environments, followed by sequencing and
functional analysis of the cloned fragments. While most of
the techniques for metagenomics have existed for quite
some time and are used routinely in molecular biology
research, their application in analysing unknown environmental DNA samples has opened a floodgate of exciting
research findings.
The phylogenetic analysis of environmental microbial
diversity was an early form of metagenomics. Over the
years, several significant trends for metagenomic studies
have emerged. First, the cloned DNA fragments have been
getting larger and larger in attempts to clone long stretches
of DNA from the same genome to allow the study of the
structure and function of potentially whole unknown/
uncultured genomes in the environments. Such an objective has propelled the development of new DNA isolation
methods as well as improved cloning systems. At present,
availability) within the soil, pH and the availability of
various nutrients. In addition, unlike aquatic habitats, soil
surfaces may undergo dramatic daily or seasonal cyclic
changes in their physical-chemical properties. Such spatial
and temporal environmental microheterogeneity poses
significant challenges for microbial ecologists. However,
recent investigations, especially those based on culture-independent approaches, are revealing the amazing diversity of micro-organisms in the soil.
Many studies of soil microbial diversity have been
carried out. Based on a variety of culture-independent
methods, current estimates indicate that a single gram of
soil may contain over 10 billion microbial cells representing several thousand to over a million distinct genomic
species (e.g. Torsvik et al. 2002; Gans et al. 2005). This number
is remarkable given that the total number of known
prokaryotes listed in the website of the National Center
for Biotechnology Information is about 17 000 (including
uncultured prokaryotes). Comparisons of culture-dependent
and culture-independent methods revealed that in most soil
environmental samples, only 0.1-1% of microbial species
are cultured by standard microbiological methods. Therefore, a tremendous amount of microbial genetic, physiological and metabolic diversity in the soil remains to be
discovered and explored. Significant efforts are underway
to clone and analyse the soil metagenome diversity. Daniel
(2005) summarized the studies of soil metagenomic libraries
constructed to date. These libraries include soil samples
from a variety of ecological niches, including meadows,
crop fields, and forests.
Functional analyses of the soil metagenome are typically
conducted by one of two approaches. The first is based on
nucleotide sequences using either PCR or target-specific
probes to screen the soil metagenome library. This approach
has been used successfully to clone genes with highly
conserved domains, e.g. the gluconic acid reductase, an
essential enzyme during glucose metabolism (Eschenfeldt
et al. 2001). The second approach is based on functional
screening for metabolic activity of metagenomic clones.
Several novel genes coding for proteases, lipases, amylases,
agarases, alcohol oxidoreductases, antibiotics, and antibiotic resistance have been found through this screening
(Voget et al. 2003). Some of these products hold great
commercial potential and are actively pursued by biotechnology companies.
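The first, sequence-based approach can be illustrated in silico: a degenerate probe written with IUPAC nucleotide codes is converted to a pattern and used to search the inserts of library clones. Laboratory hybridization and PCR tolerate mismatches and detect both strands, whereas this sketch requires an exact match on the given strand, so it is a simplification. The function names, clone names and sequences are hypothetical.

```python
import re

# IUPAC nucleotide ambiguity codes expressed as regular-expression classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def probe_to_regex(probe):
    """Compile a degenerate probe into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in probe.upper()))


def screen_library(clones, probe):
    """Return names of clones whose insert contains a match to the probe.

    clones: dict of clone name -> insert sequence (forward strand only).
    """
    pattern = probe_to_regex(probe)
    return [name for name, seq in clones.items() if pattern.search(seq.upper())]


# Toy library of three hypothetical clones and a hypothetical degenerate probe.
clones = {
    "clone_1": "TTGACCGGATGCA",
    "clone_2": "AAAAAAAAAAAA",
    "clone_3": "CCGATCGGTTGA",
}
print(screen_library(clones, "GAYCGG"))  # ['clone_1', 'clone_3']
```

The second, function-based approach has no comparable in silico shortcut, since it depends on expressing the cloned DNA and assaying the resulting activity.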
Metagenomic analysis of a microbial community from an acid
mine drainage. Acid mine drainages are seminatural
environments rich in extremophiles. These drainages are
created as a result of mining and the exposure of predominantly ferrous iron in pyrite (FeS2) to the oxygen-rich
atmosphere. Iron is one of the most abundant elements in
Earth's crust and exists naturally in two oxidation states,
ferrous (Fe2+) and ferric (Fe3+). In nature, these two forms
cycle as a result of reduction and oxidation by micro-organisms and by abiotic geochemical processes. The
reduction of Fe3+ to Fe2+ occurs in anoxic environments (e.g.
bogs and waterlogged soil) by bacteria such as Shewanella
putrefaciens, with organic compounds in these environments
acting as the electron donor. In contrast, the oxidation
occurs in oxygenated environments with O2 as the electron
acceptor. Though the released energy is small during
oxidation, several groups of chemolithotrophic organisms
(e.g. Acidithiobacillus ferrooxidans and Leptospirillum
ferrooxidans) can actively participate in the reaction and
thrive in such environments by oxidizing a large amount
of ferrous iron. Because pyrite (FeS2) is one of the most
common forms of iron in nature, the oxidation of pyrite
will release large amounts of sulphate (SO₄²⁻) and sulfuric
acid, allowing the development of acid conditions in the
surrounding environment with pH values as low as 0.
Mixing of acidic mine water with natural waters in rivers
and lakes causes major environmental problems.
The metagenomic analyses of a single biofilm sample
from an acid mine drainage from the Richmond Mine at Iron
Mountain, California, have provided important insights
into the microbial community structure (Tyson et al. 2004).
From the 78 Mb of sequence obtained from this sample, the
genomes of the dominant species were reconstructed. These
included the dominant bacterium Leptospirillum group II
(10X coverage) and the dominant archaeon, Ferroplasma
acidarmanus (also 10X coverage). Ferroplasma is a group of
cell wall-less prokaryotes. These two species were also
found to be dominant in this community by other analytical
methods. In addition to the above two genomes, other
reconstructed partial genomes were also identified,
including that of a group III Leptospirillum (3X coverage),
and an unknown species in the genus Sulfobacillus (0.5X
coverage) that is closely related to the cultured Sulfobacillus
thermosulfidooxidans.
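To see why the coverage depths quoted above yield near-complete genomes in some cases and only partial ones in others, the following minimal sketch applies the Lander-Waterman approximation, in which the expected fraction of a genome sampled by at least one read at coverage c is 1 - exp(-c). This assumes idealized random shotgun sampling, which is a simplification and is not the method used by Tyson et al. (2004); the function name is hypothetical.

```python
import math


def expected_fraction_covered(coverage: float) -> float:
    """Lander-Waterman estimate of the fraction of bases hit by >= 1 read."""
    if coverage < 0:
        raise ValueError("coverage must be non-negative")
    return 1.0 - math.exp(-coverage)


# Coverage depths as reported in the text for the Iron Mountain biofilm.
reported_coverage = {
    "Leptospirillum group II": 10.0,
    "Ferroplasma acidarmanus": 10.0,
    "Leptospirillum group III": 3.0,
    "Sulfobacillus sp.": 0.5,
}

for name, depth in reported_coverage.items():
    fraction = expected_fraction_covered(depth)
    print(f"{name}: {depth}X -> {fraction:.1%} of bases expected to be sampled")
```

Under this idealized model, 10X coverage samples essentially the whole genome, 3X about 95% of it, and 0.5X a little under 40%, which is consistent with only partial genomes being recovered for the less abundant members.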
Bioinformatic analyses of the metagenome sequence data
yielded several interesting results. First, the Leptospirillum
group III strain was found to contain genes homologous to
those for biological nitrogen fixation. This knowledge subsequently led to the design of a selective isolation strategy
that allowed the isolation of this organism (Allen & Banfield
2005). Second, genes involved in essential pathways (such
as nitrogen and carbon dioxide fixation and iron metabolism) in the above chemolithoautotrophs were revealed.
Third, the genomic sequence data identified genetic polymorphisms for many genes and suggested evidence for
genetic recombination in the Ferroplasma acidarmanus
population of this community. The metagenome sequence
information established a solid foundation for fine-scale
comparisons of microbial communities. In addition, a
recent proteomic analysis of this community identified an
abundant novel protein, a cytochrome, as an essential component of iron oxidation and acid mine drainage formation
(Ram et al. 2005). These results have the potential to guide
the remediation of sites contaminated by acid mine
drainages.
Micro-arrays
Micro-array technology is a powerful, high-throughput
experimental system that allows the simultaneous analysis
of thousands to hundreds of thousands of genes.
Originally developed for monitoring whole-genome gene expression, micro-arrays have also been used
for other purposes, such as genome-wide mutational
screening for single nucleotide polymorphisms and determining the
distributions of species and strains in natural microbial
communities. Recently, several types of micro-arrays have
been developed and evaluated for bacterial detection
and microbial community analysis. These arrays include
(i) phylogenetic oligonucleotide arrays that contain
signature sequences from rRNA of specific groups of
organisms; (ii) community genome arrays that contain
highly specific signature gene sequences from known
cultured microbial species; and (iii) functional gene arrays
that contain conserved domains of genes involved in
specific metabolic pathways such as the biogeochemical
cycling of carbon, nitrogen, sulphate, phosphate and metals
(Zhou 2003). The number of genes and the sizes of arrayed
DNA fragments in the functional gene arrays can vary
according to analytical purposes.
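As a rough illustration of how a phylogenetic oligonucleotide array of type (i) assigns a signal to a group of organisms, the sketch below treats each probe as a short sequence that binds wherever its reverse complement occurs in a target rRNA gene. The sequences, taxon names and function names are all hypothetical, and exact string matching is a deliberate simplification: real hybridization tolerates some mismatches and depends on experimental conditions.

```python
# Complement table for DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def probe_hits(probe: str, targets: dict[str, str]) -> list[str]:
    """List the targets that contain the probe's exact binding site."""
    binding_site = reverse_complement(probe)
    return [name for name, seq in targets.items() if binding_site in seq]


# Toy rRNA gene fragments (hypothetical, for illustration only).
targets = {
    "taxon_A": "GGATTAGATACCCTGGTAGTCCACGCCGTAAACGATG",
    "taxon_B": "TTGACGGGGGCCCGCACAAGCGGTGGAGCATGTGG",
}

# Design an 18-nucleotide probe against a window found only in taxon_A.
probe = reverse_complement(targets["taxon_A"][5:23])

print(probe, probe_hits(probe, targets))
```

A full array repeats this test for every probe, and the pattern of positive probes is then read as a profile of which groups are present in the sample.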
Preliminary evaluations suggest that micro-arrays have
great potential for the detection, identification and characterization of micro-organisms in natural habitats (Wu et al.
2004). For example, Loy et al. (2002) constructed a micro-array with 132 16S rRNA-targeted oligonucleotide probes
(18 nucleotides long) representing all recognized groups
of sulphate-reducing prokaryotes and showed that this
micro-array could be used to distinguish most of the
reference strains. Using this array, they determined the
diversity of sulphate-reducing prokaryotes in periodontal
tooth pockets and a hypersaline cyanobacterial mat. Results
from the micro-array study were similar to those from
cloning and sequencing of environmental 16S rRNA. These
Box 9 The published 250 prokaryotic genomes as of September 2005 suggest several general features of these
genomes relevant to microbial ecology:
1. Prokaryotic genomes are highly variable in genome size and gene content among strains from both within and
between species.
2. Microbial species with narrow ecological niches generally have smaller genomes than those with broader
ecological niches.
3. A large fraction (20–40%) of identified open reading frames in sequenced microbial genomes code for proteins
with unknown functions. Most of these genes are likely regulated by ecological-niche specific factors.
both genome size and gene content (Table 1). Among the
completely sequenced and annotated 250 unique prokaryotic genomes (four strains were sequenced twice for a total
of 254 completed genomes as of August 2005), the genome
sizes vary more than 18-fold, from that of the archaeon
Nanoarchaeum equitans (0.49 Mb, Waters et al. 2003), the smallest, to that of
Streptomyces avermitilis (9.12 Mb, Omura et al. 2001), the largest.
The genome sizes vary not only among species but also
among strains within individual species. An example is the
common Escherichia coli where whole genome sequences of
four strains are now available: the model laboratory strain
K12, the enterohemorrhagic O157:H7 RIMD and O157:H7
EDL933, and the uropathogenic CFT073 (Parkhill &
Thomson 2004). While all three pathogenic strains have
genomes essentially colinear with each other and with the
nonpathogenic K12, both the genome size and gene content vary considerably among the four strains. For example, the two pathogenic O157:H7 strains have genomes
over 5.5 Mb, almost 1 Mb bigger than that of strain K12
(4.6 Mb) and about 300 kb bigger than that of strain CFT073
(5.2 Mb). Furthermore, about 25% of the genes in the pathogenic O157:H7 strains were not found in strain K12.
When all four strains are considered, only about 3000 genes
were shared among them, out of 4288, 5349, 5361
and 5379 predicted protein-coding genes
for strains K12, O157:H7 RIMD, O157:H7 EDL933 and
CFT073, respectively. Most of these extra genes have
unusual sequence characteristics and were likely obtained
through horizontal gene transfer events from external
sources and by the action of mobile genetic elements. Some
of these genes play important roles in their ecological
adaptation, including adhesion to specific host cell types.
Comparisons between strains in other human pathogenic
bacteria (e.g. Streptococcus pneumoniae and Burkholderia
cepacia) as well as the nonpathogenic plant symbiont Si.
meliloti revealed similarly highly variable genome size and
gene contents (Fraser et al. 2004; Guo et al. 2005; Sun S.,
unpublished). At present, population-level studies of
variation in genome size and gene content are still largely
limited to human pathogens.
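The figures quoted in this paragraph can be checked with a few lines of arithmetic. The sketch below uses only the genome sizes and gene counts given in the text, and it treats the "about 3000" shared genes as an exact value purely for illustration; the variable names are hypothetical.

```python
# Genome sizes (Mb) quoted in the text.
smallest_mb = 0.49  # Nanoarchaeum equitans
largest_mb = 9.12   # Streptomyces avermitilis
fold_range = largest_mb / smallest_mb

# Predicted protein-coding genes per Escherichia coli strain, as listed in the text.
predicted_genes = {
    "K12": 4288,
    "O157:H7 RIMD": 5349,
    "O157:H7 EDL933": 5361,
    "CFT073": 5379,
}
shared_genes = 3000  # "about 3000" in the text; assumed exact here

# Fraction of each strain's genes that belongs to the shared set.
shared_fraction = {
    strain: shared_genes / count for strain, count in predicted_genes.items()
}

print(f"Genome size range: {fold_range:.1f}-fold")
for strain, fraction in shared_fraction.items():
    print(f"{strain}: {fraction:.0%} of predicted genes are shared")
```

On these numbers the shared set makes up roughly 70% of the K12 gene complement but only about 56% of each pathogenic strain's, which makes concrete how much strain-specific material the pathogens carry.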
Second, species with narrow ecological niches (e.g. obligate human pathogens) on average have smaller genomes
than those capable of living in diverse ecological conditions
(Table 1). For example, the obligate intracellular pathogen
Mycoplasma genitalium has a genome size of 580 kb (encoding 484 genes) and the aphid endosymbiont Buchnera aphidicola
has a genome size of 650 kb (504 genes). These genomes
lack many of the genes essential for metabolic functions in
many free-living organisms. The deletion and degeneration
of such genes likely occurred because their functions are nonessential in obligate parasites, whose hosts can supply
the corresponding resources. Indeed, in several obligate intracellular parasites such as Rickettsia prowazekii and Rickettsia
conorii, there is evidence that their genomes are in the
environments? And, how best to use microbial ecological
data gained through genomic analysis in practical applications such as mining, environmental remediation, the control
of infectious diseases, the modulation of the global climate,
and the production of biotechnology goods and services?
To address these questions, an interdisciplinary systems
approach is needed. This approach requires the integration
of the analyses at various levels of ecological organization,
from subcellular and cellular levels to those of individuals,
populations, communities and ecosystems. The approach
also requires the development and complementary analysis of biological variations at the genome, transcriptome,
proteome and metabolome levels. Indeed, the American
Society for Microbiology has issued a call to create systems
microbiology and systems microbial ecology to coordinate
such efforts and to make this a priority area for future development (Buckley 2005). There is no doubt that such coordinated efforts will yield many exciting new discoveries.
Acknowledgements
I thank Dr Hong Guo for preparing Figs 3 and 4 and Dr Turlough
M. Finan for comments on the manuscript. During the preparation
of this review, research in my lab was supported by the Natural
Sciences and Engineering Research Council (NSERC) of Canada,
the Ontario Premier's Research Excellence Award, and Genome
Canada.
References
Allen EE, Banfield JF (2005) Community genomics in microbial
ecology and evolution. Nature Reviews. Microbiology, 3, 489–498.
Allen NL, Hilton AC, Betts R, Penn CW (2001) Use of representational difference analysis to identify Escherichia coli O157-specific DNA sequences. FEMS Microbiology Letters, 197, 195–201.
Andersson SGE (2004) Obligate intracellular pathogens. In:
Microbial Genomes (eds Fraser CM, Read TD, Nelson KE),
pp. 291–308. Humana Press, Totowa, New Jersey.
Bart A, Dankert J, van der Ende A (2000) Representational difference
analysis of Neisseria meningitidis identifies sequences that are
specific for the hyper-virulent lineage III clone. FEMS Microbiology Letters, 188, 111–114.
Beja O, Aravind L, Koonin EV, Suzuki MT et al. (2000) Bacterial
rhodopsin: evidence for a new type of phototrophy in the sea.
Science, 289, 1902–1906.
Beja O, Spudich EN, Spudich JL, Leclerc M, DeLong EF (2001)
Proteorhodopsin phototrophy in the ocean. Nature, 411, 786–789.
Beja O, Suzuki MT, Heidelberg JF et al. (2002) Unsuspected diversity among marine aerobic anoxygenic phototrophs. Nature,
415, 630–633.
Bergthorsson U, Ochman H (1995) Heterogeneity of genome sizes
among natural isolates of Escherichia coli. Journal of Bacteriology,
177, 5784–5789.
Bergthorsson U, Ochman H (1998) Distribution of chromosome
length variation in natural isolates of Escherichia coli. Molecular
Biology and Evolution, 15, 6–16.
Blochl E, Rachel R, Burggraf S, Hafenbradl D, Jannasch HW,
Hugenholtz P, Goebel BM, Pace NR (1998) Impact of culture-independent studies on the emerging phylogenetic view of
bacterial diversity. Journal of Bacteriology, 180, 4765–4774.
James TY (2005) The population genetics of phycomycetes. In:
Evolutionary Genetics of Fungi (ed. Xu J), pp. 117–148. Horizon
Biosciences, Norfolk, UK.
Kanagawa K, Oishi M, Negoro S, Urabe I, Okada H (1993)
Characterization of the 6-aminohexanoate-dimer hydrolase from
Pseudomonas sp. NK87. Journal of General Microbiology, 139, 787–795.
Karner MB, DeLong EF, Karl DM (2001) Archaeal dominance in
the mesopelagic zone of the Pacific Ocean. Nature, 409, 507–510.
Keese P, Gibbs A (1993) Plant viruses: master explorers of evolutionary space. Current Opinion in Genetics and Development, 3,
873–877.
Kidd SE, Guo H, Bartlett K, Xu J, Kronstad JW (2005) Comparative
gene genealogies indicate that two clonal lineages of Cryptococcus
gattii in British Columbia resemble strains from other geographical areas. Eukaryotic Cell, 4, 1629–1638.
Konstantinidis K, Tiedje JM (2004) Microbial diversity and genomics. In: Microbial Functional Genomics (eds Zhou J, Thompson DK,
Xu Y, Tiedje JM), pp. 21–40. John Wiley & Sons, New Jersey.
Konstantinidis KT, Tiedje JM (2005) Genomic insights that
advance the species definition for prokaryotes. Proceedings of the
National Academy of Sciences, USA, 102, 2567–2572.
Koonin EV (2003) Horizontal gene transfer: the path to maturity.
Molecular Microbiology, 50, 725–727.
Lehner A, Loy A, Behr T et al. (2005) Oligonucleotide microarray
for identification of Enterococcus species. FEMS Microbiology Letters,
246, 133–142.
Lenski RE (1993) Assessing the genetic structure of microbial
populations. Proceedings of the National Academy of Sciences, USA,
90, 4334–4336.
Lisitsyn N, Lisitsyn N, Wigler M (1993) Cloning the differences
between two complex genomes. Science, 259, 946–951.
Liu Y, Zhou J, Omelchenko MV, Beliaev AS, Venkateswaran A,
Stair J, Wu L, Thompson DK, Xu D, Rogozin IB, Gaidamakova EK,
Zhai M, Makarova KS, Koonin EV, Daly MJ (2003) Transcriptome dynamics of Deinococcus radiodurans recovering from
ionizing radiation. Proceedings of the National Academy of Sciences,
USA, 100, 4191–4196.
Loy A, Lehner A, Lee N et al. (2002) Oligonucleotide microarray
for 16S rRNA gene-based detection of all recognized lineages
of sulfate-reducing prokaryotes in the environment. Applied and
Environmental Microbiology, 68, 5064–5081.
Loy A, Schulz C, Lucker S et al. (2005) 16S rRNA gene-based
oligonucleotide microarray for environmental monitoring of
the betaproteobacterial order Rhodocyclales. Applied and
Environmental Microbiology, 71, 1373–1386.
Maiden MC, Bygraves JA, Feil E et al. (1998) Multilocus sequence
typing: a portable approach to the identification of clones within
populations of pathogenic microorganisms. Proceedings of the
National Academy of Sciences, USA, 95, 3140–3145.
Margulies M, Egholm M, Altman WE et al. (2005) Genome
sequencing in microfabricated high-density picolitre reactors.
Nature, 437, 376–380.
Mayden RL (1997) A hierarchy of species concepts: the denouement in the saga of the species problem. In: Species: The Unit of
Biodiversity (eds Claridge MF, Dawah HA, Wilson MR), pp. 381–424.
Chapman & Hall, London.
Maynard Smith J, Smith NH, O'Rourke M, Spratt BG (1993) How
clonal are bacteria? Proceedings of the National Academy of Sciences,
USA, 90, 4384–4388.
Vandenkoornhuyse P, Baldauf SL, Leyval C, Straczek J, Young JP
(2002) Extensive fungal diversity in plant roots. Science, 295,
2051.
Venter JC, Remington K, Heidelberg JF et al. (2004) Environmental
genome shotgun sequencing of the Sargasso Sea. Science, 304,
66–74.
Voget S, Leggewie C, Uesbeck A, Raasch C, Jaeger KE, Streit WR
(2003) Prospecting for novel biocatalysts in a soil metagenome.
Applied and Environmental Microbiology, 69, 6235–6242.
Waters E, Hohn MJ, Ahel I et al. (2003) The genome of Nanoarchaeum
equitans: insights into early archaeal evolution and derived
parasitism. Proceedings of the National Academy of Sciences, USA,
100, 12984–12988.
West JA, Zuccarello GC (1999) Biogeography of sexual and
asexual populations in Bostrychia moritziana (Rhodomelaceae,
Rhodophyta). Phycological Research, 47, 115–123.
Woese CR (1987) Bacterial evolution. Microbiological Reviews, 51,
221–271.
Wu L, Thompson DK, Liu X et al. (2004) Development and
evaluation of microarray-based whole-genome hybridization
for detection of microorganisms within the context of environmental applications. Environmental Science and Technology, 38,
6775–6782.
Xu J (2002) Mitochondrial DNA polymorphisms in the human
pathogenic fungus Cryptococcus neoformans. Current Genetics, 41,
43–47.
Xu J (2004) The prevalence and evolution of sex in microorganisms.
Genome, 47, 775–780.
Xu J (2005) Evolutionary Genetics of Fungi. Horizon Biosciences,
Norfolk, UK.
Xu J, Cheng M, Tan Q, Pan Y (2005) Molecular population genetics
of basidiomycete fungi. In: Evolutionary Genetics of Fungi (ed. Xu
J), pp. 221–252. Horizon Biosciences, Norfolk, UK.
Xu J, Mitchell TG (2003) Comparative gene genealogical analyses