Advances in Dental Research
Published by:
http://www.sagepublications.com
On behalf of:
International and American Associations for Dental Research
Additional services and information for Advances in Dental Research can be found at:
Subscriptions: http://adr.sagepub.com/subscriptions
Reprints: http://www.sagepub.com/journalsReprints.nav
Permissions: http://www.sagepub.com/journalsPermissions.nav
Downloaded from adr.sagepub.com at Sichuan University on September 17, 2012 For personal use only. No other uses without permission.
Currently, two approaches are used in microbial community sequence data analysis. In the analysis based on OTUs, each OTU is treated equally. This way, poorly defined microbial ecosystems can be compared without the need to exclude the undefined, potentially novel phylotypes that cannot be assigned to a consensus taxonomy lineage. The other approach is based on phylogeny. Here, the phylogenetic information of each OTU is used to account for the degree of divergence between sequences. It has been shown that variances in 16S rRNA sequences correlate positively with phenotypic variances of microbiota (Nubel et al., 1999). In the phylogenetic approach, different communities or sample types are compared based on their evolutionary distances.

Regardless of which of the two approaches is used, a broad range of diversity measures can be applied to the obtained sequence dataset. Alpha diversity describes the diversity within each sample, e.g., species richness, species evenness, and diversity. The oral microbiome, for instance, represents a microbial ecosystem with low evenness: few taxa, such as streptococci, veillonellae, and prevotellae, dominate the samples obtained from dental plaque or saliva (Keijser et al., 2008; Crielaard et al., 2011). Alpha diversity can be used to describe the stability of a particular ecosystem. Highly diverse ecosystems are considered more stable or healthier than communities that are dominated by few taxa.

Beta diversity considers the differences (both qualitative and quantitative) between different environments. Beta diversity tests include statistical comparisons that allow for the assessment of the association of a certain microbial profile with a certain clinical status. One widely accepted method, UniFrac (Lozupone and Knight, 2005), is a phylogeny-based method that can be used to detect qualitative (Unweighted UniFrac) and quantitative (Weighted UniFrac) differences among sample groups (Lozupone et al., 2007). Making sense of this data deluge is and will remain the major challenge. Next to microbiology, training in bioinformatics and molecular microbial ecology will become mandatory for the researchers of today and tomorrow.

PROs of Amplicon Sequencing by the NGS Approach

Amplicon sequencing by the NGS approach provides high-throughput sequence information at an exceptional depth. Hundreds of thousands of sequences can be obtained from a single sample (Keijser et al., 2008), or, if required, specific nucleotide barcodes can be added to mark each sample so that many samples can be pooled in a single run. This will reduce not only the sequencing depth but also, simultaneously, the costs incurred per individual sample. There is a trade-off, however. By reducing the sequencing depth, one might lack the discriminatory power for effects of subtle interventions, such as supplementation with pre- and probiotics. Major ecological shifts, such as effects of potent antimicrobials, would still be discernible at a relatively low sequencing depth (Kuczynski et al., 2010). The actual number of reads per sample, however, needs to be validated per intervention.

The results of NGS are not biased by fast-growing, easily cultivable taxa, and allow for hypothesis-driven research on previously unknown and unclassified micro-organisms. The information obtained by NGS is not targeted to a priori-selected taxa as, e.g., in the microarray approach. This allows for an open-ended view of the whole breadth of the microbiome and provides a new opportunity for oral-health-related studies. Research questions on microbial stability and ecological shifts due to environmental and host factors can now be addressed on a full scale of community complexity.

CONs of Amplicon Sequencing in Microbiome Studies and Ways to Increase the Success Rate

The sequencing throughput of NGS provides exciting opportunities for research. The downside of this enthusiasm is that the quality of the data generated is often undermined by failure to address fundamental aspects of experimental design (Rogers and Bruce, 2010). Only 18% of sequencing studies published in 2009 in major ecological journals analyzed replicate samples (Prosser, 2010). This may be due to a belief that these cutting-edge technologies are exempt from normal standards, and that the costs involved justify the lack of proper design (Prosser, 2010). With the decreasing costs of sequencing and the availability of sample identification tags, a lack of sample replicates should not be accepted. As with any conventional study, statistics should be planned in advance, before the start of the study.

The major disadvantage of current NGS used for amplicon sequencing is the short read length. This precludes accurate taxonomic identification and results in low taxonomic resolution. Most reads can be identified to genus level, but only a fraction to species level. This depends on 16S rRNA gene sequence homology among the members of the same genus. Depending on the hypervariable region of the 16S rRNA gene that is targeted, different taxa are either missed or can be classified only at a higher taxonomic level (family, class, or even phylum). The problem can be diminished by in silico analyses of 16S rRNA sequences prior to selection of the target region (Brandt et al., 2012). However, some taxa, like several closely related streptococci, will be hard or even impossible to distinguish by 16S gene sequence. In that case, either a more specific gene instead of the small subunit rRNA gene should be targeted, or full metagenomic sequencing (described below) should be applied.

Like any other molecular-biology-based method, NGS suffers from the typical biases of DNA extraction and amplification (Hong et al., 2009; Nadkarni et al., 2009), such as selectivity of primers and intrinsic differences in the amplification efficiency of templates. Unfortunately, bias introduced by polymerase chain reaction (PCR) amplification can be avoided only by avoiding the amplification step itself. This would mean choosing the full metagenomic sequencing approach instead of amplicon sequencing of hypervariable regions of the 16S rRNA gene.

Figure. Flow diagram of two next-generation sequencing approaches for microbiome studies. (A) Sequencing of DNA amplicons targeting specific 16S rRNA gene fragments (hypervariable regions). The obtained sequence data are compared with sequences in small subunit ribosomal RNA gene databases and are used for taxonomic profiling and diversity analyses. (B) Direct sequencing of random DNA fragments, also called metagenomic shotgun sequencing. A sequence contig is a contiguous, overlapping sequence resulting from the re-assembly of the small DNA fragments. The obtained sequence data are compared with full-genome reference databases and are used to describe the predominant functions of the microbial communities, as well as to identify the microbial taxa. Steps indicated in gray differ between the two methods.

The quality of the DNA extraction protocol should be evaluated in advance. In standard protocols, all DNA, including DNA from dead or damaged cells and extracellular matrix, is used in the downstream analysis. In intervention studies where the treatment has antimicrobial potential, this would lead to underestimation of the effects of the intervention. The solution lies in the removal of all DNA that does not originate from intact cells before the DNA extraction process. One such approach is to
incubate samples with propidium monoazide (PMA) (Loozen et al., 2011), where PMA will bind to extracellular DNA and to DNA of cells that have lost their structural integrity. On exposure to visible light, this reaction becomes irreversible and renders the PMA-bound DNA unable to act as a template for PCR amplification (Rogers and Bruce, 2010). Membrane integrity, however, is considered a conservative criterion for microbial viability, and other approaches related to cell activity have been proposed (Nocker and Camper, 2009).

Any sequencing method, but especially high-throughput NGS technologies, suffers from sequencing errors (Kunin et al., 2010; Balzer et al., 2011). Although newest-generation technologies are able to generate reads up to 1,000 bp long, there is a trade-off: the error rate correlates positively with the read length. Data pre-processing after sequencing is used to minimize the inaccuracy of the output. Low-quality reads, reads below or above a certain length cutoff, and reads with ambiguous base calls and homopolymers are usually filtered out from the dataset. After these pre-processing steps, the dataset is frequently used for community analyses, although more recent reports indicate the need for additional cleaning steps (Kunin et al., 2010). PyroNoise (Quince et al., 2009) and Denoiser (Reeder and Knight, 2010) are two examples of tools that are applied to remove the sequencing noise. In this process, however, valid, low-abundance sequences are also de-noised to appear more similar to reads that are found at a higher abundance and are assumed to be without errors. This, in turn, may lead to clustering of the sequences into lower numbers of OTUs and to underestimation of sample diversity.

Another cleaning step, besides removing the noise, is the identification and removal of chimeric sequences (Edgar et al., 2011; Haas et al., 2011). Chimeras are sequences that are created in the PCR amplification process from two or more templates instead of a single parent template. The exact reasons for chimera formation are not well understood, but it has been associated with PCR conditions, fragment length, and, possibly, sample composition.

After the cleaning steps, the sequences are clustered into OTUs at a predetermined similarity level, usually 97%. It has been demonstrated that the specific choice of the clustering algorithm affects the output by either under- or overestimating the diversity of the sample (Sun et al., 2011). New, improved algorithms appear continually and preclude direct comparison with earlier results. For that reason, all the steps of the pipeline, from sequence preprocessing through cleaning and clustering, should be performed at once.

Thus far, statistical data analyses have overlooked the fact that microbiome profiles are not representations of absolute measurements (e.g., microbial counts), but are typical examples of compositional data, which are in the form of relative proportions of taxa. These taxa can be either observed or missed, depending on the sampling effort, amplification bias, and sequencing depth. An increase in the relative abundance of some taxa will be accompanied by a compositional decrease of other taxa. This may lead to statistically significant, though spurious, correlations without any biological dependence among the taxa involved. Computational tools suitable for compositional data analyses should be developed and implemented.
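
The compositional effect described above can be illustrated with a small numerical sketch. The counts below are hypothetical, and the centered log-ratio (CLR) transform is shown only as one commonly used compositional technique; it is an assumption of this example, not a method named in the text. In the sketch, only taxon A changes in absolute terms, yet the relative abundance of taxon B correlates almost perfectly negatively with that of taxon A, whereas the log-ratio between the unchanged taxa B and C stays constant.

```python
import math


def to_relative(counts):
    """Close a vector of absolute counts to relative proportions (sum = 1)."""
    total = sum(counts)
    return [c / total for c in counts]


def clr(proportions, pseudocount=1e-9):
    """Centered log-ratio transform; the pseudocount (an assumption) guards against zeros."""
    logs = [math.log(p + pseudocount) for p in proportions]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical absolute counts (taxon A, taxon B, taxon C) in five samples.
# Only taxon A changes; B and C are constant in absolute terms.
absolute = [
    [100, 50, 50],
    [200, 50, 50],
    [400, 50, 50],
    [800, 50, 50],
    [1600, 50, 50],
]

relative = [to_relative(sample) for sample in absolute]
rel_a = [sample[0] for sample in relative]
rel_b = [sample[1] for sample in relative]

# B never changed in absolute terms, yet its proportion falls as A rises,
# so the correlation of the two relative abundances is close to -1.
spurious_r = pearson(rel_a, rel_b)
print(f"correlation of relative abundances A vs B: {spurious_r:.3f}")

# The B-to-C log-ratio is unaffected by the bloom of A (zero in every sample).
clr_samples = [clr(sample) for sample in relative]
log_ratio_b_c = [sample[1] - sample[2] for sample in clr_samples]
print("CLR(B) - CLR(C) per sample:", [round(v, 6) for v in log_ratio_b_c])
```

Log-ratios between taxa do not depend on the total, which is why they are a frequent starting point for compositional analyses; real datasets would additionally need a considered treatment of zero counts, which the fixed pseudocount here only approximates.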
Direct Shotgun Sequencing of Collective Genome of the Microbiome (Metagenome)

Amplicon sequencing (sequencing of a targeted fragment of the 16S rRNA gene) provides information on only the microbial taxa (the players) in the community. With NGS technologies, full genomic DNA can be sequenced without the targeting step and without the PCR bias of amplicon sequencing (Fig., B). Instead of sequencing a single isolated clone (as with the traditional cloning and sequencing approach), the genome of the entire community (metagenome) is sequenced. For this, extracted DNA is sheared into random fragments and sequenced directly. Contaminating DNA, e.g., human DNA, should be removed either prior to (Hunter et al., 2011) or after the sequencing step by filtering the data. Then, exhaustive bioinformatics steps follow, where DNA fragments need to be assembled into genomes against a reference genome database, followed by functional gene annotation (Mitra et al., 2011; Simon and Daniel, 2011).

The major difficulty in the approach is genome assembly from incomplete information. First, the sampling is incomplete, and most genomes are partially sequenced, if at all. Sequences originating from predominant micro-organisms will dominate the data, while less abundant species might be missed if sequencing depth is not sufficient. Second, the information on individual genomes (reference genomes) is often incomplete, lacks accuracy, and has inconsistent annotations, and may preclude the mapping of individual reads to the species of origin. Nevertheless, impressive insights have already been obtained in research on the gastrointestinal tract (Arumugam et al., 2011). The stage in the oral field has just been set (Xie et al., 2010; Belda-Ferre et al., 2011).

Toward the Interactome

Future developments should bring us toward translating the genotype into phenotype (Henry et al., 2011). The question that we should answer is "What are they doing?" instead of just wondering "Who is there?" The expertise of researchers will need to go beyond the field of microbiology and cariology, and will have to apply systems ecology principles. A complex and integrated systems biology and ecology approach should bring us closer to understanding the underlying forces that facilitate the stability (or imbalance) of the microbiome. The integration of bacterial, viral, and fungal meta-omes such as the meta-transcriptome, meta-proteome, and meta-metabolome, together with the host as a major co-factor, should be the ultimate goal in unraveling the complexity of the oral interactome.

Acknowledgments

The author received no financial support and declares no potential conflicts of interest with respect to the authorship and/or publication of this article.

References

Arumugam M, Raes J, Pelletier E, Le Paslier D, Yamada T, Mende DR, et al. (2011). Enterotypes of the human gut microbiome. Nature 473:174-180 (erratum in Nature 474:666, 2011).
Balzer S, Malde K, Jonassen I (2011). Systematic exploration of error sources in pyrosequencing flowgram data. Bioinformatics 27:i304-i309.
Belda-Ferre P, Alcaraz LD, Cabrera-Rubio R, Romero H, Simon-Soro A, Mira A, et al. (2011). The oral metagenome in health and disease. ISME J 6:46-56.
Brandt BW, Bonder MJ, Huse SM, Zaura E (2012). TaxMan: a server to trim rRNA reference databases. Nucleic Acids Res [Epub ahead of print 5/22/2012] (in press).
Cole JR, Wang Q, Cardenas E, Fish J, Chai B, Farris RJ, et al. (2009). The Ribosomal Database Project: improved alignments and new tools for rRNA analysis. Nucleic Acids Res 37:D141-D145.
Crielaard W, Zaura E, Schuller A, Huse S, Montijn R, Keijser BJ, et al. (2011). Exploring the oral microbiota of children at various developmental stages of their dentition in the relation to their oral health. BMC Med Genomics 4:22.
Dewhirst FE, Chen T, Izard J, Paster BJ, Tanner AC, Yu WH, et al. (2010). The human oral microbiome. J Bacteriol 192:5002-5017.
Edgar RC, Haas BJ, Clemente JC, Quince C, Knight R (2011). UCHIME improves sensitivity and speed of chimera detection. Bioinformatics 27:2194-2200.
Haas BJ, Gevers D, Earl AM, Feldgarden M, Ward DV, Knight R, et al. (2011). Chimeric 16S rRNA sequence formation and detection in Sanger and 454-pyrosequenced PCR amplicons. Genome Res 21:494-504.
Henry CS, Overbeek R, Xia F, Best A, Glass E, Gilbert J, et al. (2011). Connecting genotype to phenotype in the era of high-throughput sequencing. Biochim Biophys Acta 1810:967-977.
Hong S, Bunge J, Leslin C, Jeon S, Epstein SS (2009). Polymerase chain reaction primers miss half of rRNA microbial diversity. ISME J 3:1365-1373.
Hunter SJ, Easton S, Booth V, Henderson B, Wade WG, Ward JM, et al. (2011). Selective removal of human DNA from metagenomic DNA samples extracted from dental plaque. J Basic Microbiol 51:442-446.
Keijser BJ, Zaura E, Huse SM, van der Vossen JM, Schuren FH, ten Cate JM, et al. (2008). Pyrosequencing analysis of the oral microflora of healthy adults. J Dent Res 87:1016-1020.
Kuczynski J, Costello EK, Nemergut DR, Zaneveld J, Lauber CL, Knights D, et al. (2010). Direct sequencing of the human microbiome readily reveals community differences. Genome Biol 11:210.
Kunin V, Engelbrektson A, Ochman H, Hugenholtz P (2010). Wrinkles in the rare biosphere: pyrosequencing errors can lead to artificial inflation of diversity estimates. Environ Microbiol 12:118-123.
Loozen G, Boon N, Pauwels M, Quirynen M, Teughels W (2011). Live/dead real-time polymerase chain reaction to assess new therapies against dental plaque-related pathologies. Mol Oral Microbiol 26:253-261.
Lozupone C, Knight R (2005). UniFrac: a new phylogenetic method for comparing microbial communities. Appl Environ Microbiol 71:8228-8235.
Lozupone CA, Hamady M, Kelley ST, Knight R (2007). Quantitative and qualitative diversity measures lead to different insights into factors that structure microbial communities. Appl Environ Microbiol 73:1576-1585.
Mitra S, Rupek P, Richter D, Urich T, Gilbert J, Meyer F, et al. (2011). Functional analysis of metagenomes and metatranscriptomes using SEED and KEGG. BMC Bioinformatics 12(Suppl 1):S21.
Nadkarni MA, Martin FE, Hunter N, Jacques NA (2009). Methods for optimizing DNA extraction before quantifying oral bacterial numbers by real-time PCR. FEMS Microbiol Lett 296:45-51.
Nocker A, Camper AK (2009). Novel approaches toward preferential detection of viable cells using nucleic acid amplification techniques. FEMS Microbiol Lett 291:137-142.
Nubel U, Garcia-Pichel F, Kuhl M, Muyzer G (1999). Quantifying microbial diversity: morphotypes, 16S rRNA genes, and carotenoids of oxygenic phototrophs in microbial mats. Appl Environ Microbiol 65:422-430.
Prosser JI (2010). Replicate or lie. Environ Microbiol 12:1806-1810.
Quince C, Lanzen A, Curtis TP, Davenport RJ, Hall N, Head IM, et al. (2009). Accurate determination of microbial diversity from 454 pyrosequencing data. Nat Methods 6:639-641.
Reeder J, Knight R (2010). Rapidly denoising pyrosequencing amplicon reads by exploiting rank-abundance distributions. Nat Methods 7:668-669.
Rogers GB, Bruce KD (2010). Next-generation sequencing in the analysis of human microbiota: essential considerations for clinical application. Mol Diagn Ther 14:343-350.
Simon C, Daniel R (2011). Metagenomic analyses: past and future trends. Appl Environ Microbiol 77:1153-1161.
Siqueira JF Jr, Fouad AF, Rocas IN (2012). Pyrosequencing as a tool for better understanding of human microbiomes. J Oral Microbiol [Epub ahead of print 1/23/2012] (in press).
Sun Y, Cai Y, Huse SM, Knight R, Farmerie WG, Mai V, et al. (2011). A large-scale benchmark study of existing algorithms for taxonomy-independent microbial community analysis. Brief Bioinform 13:107-121.
Xie G, Chain PS, Lo CC, Liu KL, Gans J, Qui F, et al. (2010). Community and gene composition of a human dental plaque microbiota obtained by metagenomic sequencing. Mol Oral Microbiol 25:391-405.