
Characterization of the complete chloroplast genome of Hevea brasiliensis reveals genome rearrangement, RNA editing sites and phylogenetic relationships


Sithichoke Tangphatsornruang, Pichahpuk Uthaipaisanwong, Duangjai Sangsrakru, Juntima Chanprasert,
Thippawan Yoocha, Nukoon Jomchai, Somvong Tragoonrung
National Center for Genetic Engineering and Biotechnology, 113 Phaholyothin Rd., Klong 1, Klong Luang, Pathumthani, 12120, Thailand
Article info
Article history:
Accepted 5 January 2011
Available online 15 January 2011
Received by Jean-Marc Deragon
Keywords:
Hevea brasiliensis
Chloroplast/plastid genome
RNA editing
Abstract
Rubber tree (Hevea brasiliensis) is an economically important plant, widely grown for natural rubber production.
However, genomic research on rubber tree has lagged behind that on other species in the Euphorbiaceae family. We
report the complete chloroplast genome sequence of rubber tree, which is 161,191 bp in length and includes a
pair of inverted repeats of 26,810 bp separated by a small single copy region of 18,362 bp and a large single
copy region of 89,209 bp. The chloroplast genome contains 112 unique genes, 16 of which are duplicated
in the inverted repeat. Of the 112 unique genes, 78 are predicted protein-coding genes, 4 are ribosomal
RNA genes and 30 are tRNA genes. Relative to other plant chloroplast genomes, we observed a unique
rearrangement in the rubber tree chloroplast genome: a 30-kb inversion between the trnE(UUC)-trnS(GCU)
and the trnT(GGU)-trnR(UCU). A comparison between the rubber tree chloroplast genes and cDNA sequences
revealed 51 RNA editing sites, of which most (48 sites) were located in 26 protein coding genes and the other 3 sites
were in introns. Phylogenetic analysis based on chloroplast genes demonstrated a close relationship between
Hevea and Manihot in Euphorbiaceae and provided strong support for a monophyletic eurosid I group.
© 2011 Elsevier B.V. All rights reserved.
1. Introduction
Chloroplasts are plant organelles with their own genome containing
genes coding for transcription, translation machinery and components
of the photosynthetic complex. Since the first complete chloroplast (cp)
genome sequence, that of liverwort (Marchantia polymorpha), was reported in
1986 (Ohyama et al., 1986), more than 150 chloroplast genomes have
been sequenced and characterized, revealing an enormous amount of
evolutionary and functional information about chloroplasts. Chloroplast
genomes are sufficiently large and complex to include structural and
point mutations that are useful for evolutionary studies from intraspecific
to interspecific levels (Neale et al., 1988; McCauley, 1992; Graham
and Olmstead, 2000; Provan et al., 2001). Structural mutations such as
gene duplications of tRNA genes (Hipkins et al., 1995), rpl19, rpl2, rpl23
(Bowman et al., 1988), psbA (Lidholm et al., 1991); losses of ndh genes
(Wakasugi et al., 1994), hypothetical chloroplast open reading frame
(ycf) genes, infA, and accD (Hiratsuka et al., 1989; Maier et al., 1995;
Millen et al., 2001); as well as rearrangements of cp genomes (Palmer
et al., 1987; Wolfe et al., 1991; Wojciechowski et al., 2004; Guo et al.,
2007; Tangphatsornruang et al., 2010b) have been reported in plants
and algae. Therefore, chloroplast genome sequences have been used to
study phylogenetic relationships (Provan et al., 2001; Lee et al., 2006;
Tangphatsornruang et al., 2010b) and to test hypotheses of seed dispersal,
intraspecific differentiation and interspecific introgression (Petit et al.,
2003, 2005).
In chloroplasts, transcripts undergo a series of RNA processing
steps such as intron splicing, polycistronic cleavage, and RNA editing.
RNA editing is a mechanism that changes genetic information at the
transcript level by nucleotide insertion, deletion or conversion (Bock,
2000; Knoop, 2010). Since the first report of RNA editing in
chloroplasts, in the maize rpl2 gene (Hoch et al., 1991), several editing
sites have been reported in Arabidopsis thaliana (Tillich et al., 2005),
Atropa belladonna (Schmitz-Linneweber et al., 2002), Lotus japonicus
(Kato et al., 2000), black pine (Wakasugi et al., 1996), cassava (Daniell
et al., 2008), pea (Miyamoto et al., 2002), tobacco (Sasaki et al., 2003),
maize (Maier et al., 1995; Halter et al., 2004) and rice (Corneille et al.,
2000). Comparison of sequences surrounding the editing sites
revealed no consensus sequence or secondary structure (Hirose
et al., 1999). This raised the question of how RNA editing sites are
recognized. Previous studies suggested the involvement of distinct
cis-acting elements and trans-acting factors in recognition of an
individual editing site (Chaudhuri et al., 1995; Bock et al., 1996;
Chaudhuri and Maliga, 1996; Hirose and Sugiura, 2001; Miyamoto et al.,
2002; Lurin et al., 2004; Kotera et al., 2005; Hayes and Hanson, 2007).
The RNA-binding pentatricopeptide repeat (PPR) proteins were identified
as trans-acting factors responsible for targeting specific editing
events (Lurin et al., 2004; Kotera et al., 2005; Hammani et al., 2009).
Gene 475 (2011) 104–112
Abbreviations: bp, base pair; cp, chloroplast; IDP, Isopentenyl diphosphate; MVA,
Mevalonate; MEP, 1-Deoxy-D-xylulose 5-phosphate/2-C-methyl-D-erythritol 4-phosphate;
H. brasiliensis, Hevea brasiliensis; PCR, Polymerase chain reaction; RCA, Rolling
circle amplification; ML, Maximum likelihood; MP, Maximum parsimony; TBR, Tree
bisection and reconnection; A, Adenosine; C, Cytidine; I, Inosine; U, Uridine; G,
Guanosine; LSC, Large single copy; SSC, Small single copy; IR, Inverted repeat; ycf,
Hypothetical chloroplast open reading frame; EST, Expressed sequence tag.
Corresponding author. Tel.: +66 2 564 6700x3259; fax: +66 2 564 6584.
E-mail address: sithichoke.tan@biotec.or.th (S. Tangphatsornruang).
0378-1119/$ - see front matter © 2011 Elsevier B.V. All rights reserved.
doi:10.1016/j.gene.2011.01.002
Journal homepage: www.elsevier.com/locate/gene
Hevea brasiliensis is a perennial plant in the Euphorbiaceae family
and is the most widely cultivated species for commercial production
of natural rubber. The chemical composition of natural rubber is cis-
polyisoprene, a high-molecular weight polymer formed from sequen-
tial condensation of isopentenyl diphosphate (IDP) units catalysed by
the action of rubber transferase (Cornish, 2001a). IDP is also an
important intermediate for biosynthesis of essential oils, abscisic acid,
cytokinin, phytoalexin, sterols, chlorophyll, carotenoids and gibber-
ellins (Chappell, 1995a; McGarvey and Croteau, 1995; Lichtenthaler et
al., 1997; Cornish, 2001b). There are two IDP biosynthesis pathways:
the mevalonate (MVA) pathway which occurs in cytosol (Chappell,
1995b); and the 1-deoxy-D-xylulose 5-phosphate/2-C-methyl-D-
erythritol 4-phosphate (MEP) pathway which occurs in plastids
(Lichtenthaler, 1999; Ko et al., 2003). One approach to improving
rubber production in H. brasiliensis would be to engineer chloroplasts
and modify metabolic flux to produce more biosynthetic intermediates.
The availability of the complete chloroplast genome sequence
should also facilitate chloroplast transformation. Improved
transformation efficiency and foreign gene expression can
be achieved through the use of endogenous flanking sequences
and regulatory elements (Birch-Machin et al., 2004; Maliga, 2004;
Tangphatsornruang et al., 2010a). Transformation of the chloroplast
genome offers a number of advantages over nuclear transformation,
including a high level of transgene expression, polycistronic
transcription, lack of gene silencing or positional effects, and transgene
containment (Daniell et al., 2002; Maliga, 2002, 2004; Bock, 2007).
We sequenced the chloroplast genome of H. brasiliensis in order to
gain information for genome annotation and comparative genomic studies,
and also to lay the groundwork for chloroplast engineering. We
employed the massively parallel pyrosequencing technology developed
by 454 Life Sciences (Margulies et al., 2005). This technology
has been applied to the sequencing of genomes, transcriptome profiling
and methylation studies. Previous work demonstrated the success of
high-throughput sequencing technology in obtaining chloroplast
genome sequences (Cai et al., 2006; Moore et al., 2006; Cronn et al.,
2008; Tangphatsornruang et al., 2010b). This approach avoids the traditional
labor-intensive methods involving isolation of chloroplast DNA followed
by random shearing and cloning into vectors, long PCR amplification
with conserved primers (Goremykin et al., 2003, 2004; Dhingra and Folta,
2005; Heinze, 2007), or rolling circle amplification (RCA) (Jansen et al.,
2005; Bausher et al., 2006). In this study, we determined the complete
nucleotide sequence of the H. brasiliensis chloroplast genome, annotated
it, compared its structure with those of other plant species, identified RNA
editing sites, and used the rubber tree chloroplast genome to determine
phylogenetic relationships among angiosperms.
2. Materials and methods
2.1. DNA sequencing, assembly and annotation
DNA was isolated from 1 g of leaves of H. brasiliensis, clone RRIM600,
using the DNeasy Plant Mini Kit (Qiagen). The DNA (10 µg) was sheared
by nebulization, subjected to 454 library preparation and shotgun
sequencing using the GS FLX Titanium platform (Margulies et al., 2005)
at the in-house facility (National Center for Genetic Engineering and
Biotechnology, Thailand). The obtained nucleotide sequence reads were
assembled using the Newbler de novo sequence assembly software
(Roche). The chloroplast genome sequence was compared with the
reference sequence from the complete chloroplast genome of Manihot
esculenta (Daniell et al., 2008) using the Sequencher software (Gene Codes
Corporation). Remaining gaps were closed by PCR and Sanger
sequencing using the BigDye Terminator v3.1 Cycle Sequencing Kit. The 3
primer pairs used for closing the gaps were
1) gap_LSCF: 5′-GGGCTCTAAAAAGACATCTCCA-3′, gap_LSCR: 5′-CTTTCTGTCTTTCACGATTCCA-3′,
2) gap_SSC1F: 5′-TGTATGACCATCGAGGAACTTG-3′, gap_SSC1R: 5′-GTCGGAGTGATGGAAAAGAAAG-3′ and
3) gap_SSC2F: 5′-GCTGAATAGACAAATCGATTGAA-3′, gap_SSC2R: 5′-TGATCCATTTTCTAGCCCAAG-3′.
PCR products were purified by agarose gel electrophoresis
using the QIAquick Gel Extraction Kit (QIAGEN).
2.2. Genome analysis
The genome was annotated using the program DOGMA (Dual
Organellar GenoMe Annotator; Wyman et al., 2004). The predicted
annotations were verified using BLAST similarity search (Altschul et al.,
1990). All genes, rRNAs, and tRNAs were identified using the plastid/
bacterial genetic code. The chloroplast genome of H. brasiliensis was
compared with chloroplast genomes of Arabidopsis (Sato et al., 1999),
Populus, Jatropha and Manihot (Daniell et al., 2008) using the Mauve
software (Darling et al., 2004). REPuter (Kurtz and Schleiermacher,
1999) was used to identify and locate direct repeat and inverted repeat
sequences in the rubber tree chloroplast genome with cutoff criteria of
n ≥ 30 bp and a sequence identity ≥ 90%.
2.3. RNA editing
To reveal RNA editing sites, more than two million cDNA
sequences of rubber tree were downloaded from the DDBJ read
archive (ID=DRA000170) and aligned with the protein coding
genes extracted from the rubber tree chloroplast genome using GS
Reference Mapper version 2.3 (Roche).
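As a rough illustration of the comparison step described above, the following Python sketch reports positions at which a genomic coding sequence and a transcript-derived sequence disagree, and flags C-to-U candidates. It is not the GS Reference Mapper workflow used in this study: the function names and the toy sequences are hypothetical, and the two inputs are assumed to be already aligned, gap-free and of equal length.

```python
def find_edit_candidates(genomic_cds, cdna):
    """Return (1-based position, genomic base, cDNA base) for each mismatch.

    Assumption: both inputs cover the same coding sequence, are already
    aligned without gaps, and use the DNA alphabet (T stands in for U).
    """
    if len(genomic_cds) != len(cdna):
        raise ValueError("sequences must be pre-aligned to equal length")
    candidates = []
    pairs = zip(genomic_cds.upper(), cdna.upper())
    for position, (g_base, c_base) in enumerate(pairs, start=1):
        if g_base != c_base:
            candidates.append((position, g_base, c_base))
    return candidates


def is_c_to_u(candidate):
    """True when the genome has C and the transcript shows T (U in RNA)."""
    _, g_base, c_base = candidate
    return g_base == "C" and c_base == "T"


# Toy example with invented sequences (not rubber tree data).
genomic = "ATGTCATTACCA"
cdna = "ATGTTATTACTA"
hits = find_edit_candidates(genomic, cdna)
c_to_u_hits = [hit for hit in hits if is_c_to_u(hit)]
print(hits)
```

In practice each candidate would also need read-level support, since a single mismatching read can reflect a sequencing error rather than an editing event.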
Some RNA editing sites (rps2eU134TI, rps14eU149PL, ndhKeU65SL,
petBi178, ndhBeC1290YY, ndhBeU467PL, ndhDeU887PL, ndhDeU878SL
and ndhDeU599SL) were confirmed by Sanger sequencing of cDNA
products. In brief, total RNA was extracted from 0.5 g of
young leaf using Concert Plant RNA Reagent (Invitrogen), treated with
DNA-free DNaseI (Ambion) and converted to a pool of cDNA using the
RevertAid H Minus First Strand cDNA Synthesis Kit (Fermentas). Primer
sequences are given in Supplementary Fig. 2.
2.4. Phylogenetic analysis
A set of 33 protein-coding genes including atpA, atpB, atpE, atpF,
atpH, atpI, ccsA, cemA, matK, petA, petG, petN, psaA, psaB, psaC, psbC,
psbD, psbE, psbF, psbI, psbJ, psbK, psbN, psbZ, rbcL, rpl2, rpl20, rpoB,
rpoC2, rps4, rps14, rps15 and ycf3 from 39 chloroplast genomes
representing all lineages of angiosperms, were analyzed. These 33
genes are commonly present in all 39 chloroplast genomes and
publicly available in the GenBank database. Sequences were aligned
using MUSCLE (version 3.6) (Edgar, 2004) and edited manually. For
maximum likelihood (ML) analysis, RAxML version 7.0 (Stamatakis,
2006) was used with the GTR + I + G substitution model. The local bootstrap
probability of each branch was calculated from 100 replicates.
Phylogenetic analyses using maximum parsimony (MP) were performed
using PAUP version 4.0b10 (Swofford, 2002). MP searches
included 1000 random addition replicates and a heuristic search using
tree bisection and reconnection (TBR) branch swapping with the
Multrees option. Bootstrap analysis was performed with 100
replicates with TBR branch swapping. TreeView (Page, 1996) was
used for displaying and printing phylogenetic trees.
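The text does not state how the 33 per-gene alignments were combined before tree building. One common approach, shown here only as an assumption, is to concatenate them into a single matrix with one row per taxon. The function name, data layout and toy sequences below are hypothetical.

```python
def concatenate_alignments(gene_alignments, gene_order):
    """Join per-gene alignments into one concatenated sequence per taxon.

    gene_alignments maps gene name -> {taxon: aligned sequence}.
    Every gene must contain the same taxa, and all sequences within a
    gene must share one aligned length.
    """
    taxa = sorted(gene_alignments[gene_order[0]])
    parts = {taxon: [] for taxon in taxa}
    for gene in gene_order:
        alignment = gene_alignments[gene]
        if sorted(alignment) != taxa:
            raise ValueError(f"{gene}: taxon set differs from the first gene")
        if len({len(seq) for seq in alignment.values()}) != 1:
            raise ValueError(f"{gene}: sequences are not of equal length")
        for taxon in taxa:
            parts[taxon].append(alignment[taxon])
    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}


# Toy input with invented sequences; the study used 33 genes and 39 taxa.
toy = {
    "atpA": {"Hevea": "ATGGCA", "Manihot": "ATGGCT"},
    "rbcL": {"Hevea": "ATGTCA", "Manihot": "ATG-CA"},
}
matrix = concatenate_alignments(toy, ["atpA", "rbcL"])
print(matrix["Hevea"])
```

Requiring an identical taxon set for every gene mirrors the selection criterion above, namely that all 33 genes are present in all 39 genomes.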
3. Results and discussion
3.1. Sequencing and assembly of the H. brasiliensis chloroplast genome
A total of 995,092 quality-filtered sequence reads were generated
with an average read length of 332 bases, covering 330 Mb. From the
assembly analysis, 3 contigs, assembled from 60,855 reads (5.49%),
were shown to be parts of the chloroplast genome by alignment with the
M. esculenta chloroplast genome. The proportion of sequences from the
chloroplast genome in rubber tree (5.49%) is similar to that in a previous study in
mungbean (5.22%) (Tangphatsornruang et al., 2010b). The gaps between
contigs were located in the large single copy region (LSC; between trnS-GCU and
trnE-UUC) with a size of 84 bp, in the small single copy region (SSC; between
ndhF and trnL-UAG) with a size of 1074 bp, and at the SSC-IRa junction
with a size of 299 bp. A common characteristic of these gaps is
the presence of multiple copies of AT-rich repeats, as also found by
Tangphatsornruang et al. (2010b). Closing the gaps with Sanger
sequencing resulted in a complete chloroplast genome sequence.
Since 454 sequencing technology has a limitation in reading long
homopolymer regions (Moore et al., 2006; Huse et al., 2007;
Tangphatsornruang et al., 2010b), we performed Sanger sequencing
of all homopolymers (>7 bp) present in the chloroplast genome
(Supplementary Table 1). Throughout the rubber tree chloroplast
genome, there are 229 homopolymers (>7 bp); 45 homopolymers are
present in 18 coding genes and 184 are present in non-coding regions.
Among the protein coding sequences, ycf1 contains the highest
number of homopolymers (21), followed by ycf2 (4). The longest
homopolymer stretch is 19 bp, located in the intergenic region
between atpF and atpA. Out of 229 homopolymers, 221 were polyA/T
and only 8 were polyG/C. The number of homopolymeric bases from
GS FLX Titanium that required correction in this study was 258
out of 2227 (11.58%), which is 4 times higher than the previous
report on errors in homopolymers from the previous version of the GS
FLX platform (Tangphatsornruang et al., 2010b).
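To make the homopolymer criterion concrete, the sketch below scans a sequence for single-base runs longer than 7 bp and separates A/T runs from G/C runs. It only illustrates the idea and is not the procedure used in this study; the function name and the toy sequence are hypothetical.

```python
import re


def find_homopolymers(seq, min_len=8):
    """Return (1-based start, base, run length) for runs of >= min_len bases.

    min_len=8 corresponds to the ">7 bp" criterion in the text.
    """
    # A captured base followed by at least (min_len - 1) copies of itself.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return [(m.start() + 1, m.group(1), len(m.group(0)))
            for m in pattern.finditer(seq.upper())]


# Toy sequence (invented): a 9-bp T run, an 8-bp G run and a 7-bp A run.
toy = "GA" + "T" * 9 + "CA" + "G" * 8 + "C" + "A" * 7 + "T"
runs = find_homopolymers(toy)
at_runs = [run for run in runs if run[1] in "AT"]
gc_runs = [run for run in runs if run[1] in "GC"]
print(runs)
```

The 7-bp A run is deliberately not reported, because it does not exceed the 7-bp threshold.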
The complete chloroplast genome sequence was deposited in the
NCBI database (accession HQ285842). The chloroplast genome contains a pair of
identical inverted repeat regions (IRa and IRb), which are 26,810 bp
each. The inverted repeats are separated by a large single-copy (LSC)
region of 89,209 bp and a small single-copy (SSC) region of 18,362 bp.
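The region sizes can be checked against the total genome length of 161,191 bp given in the abstract, since the circular genome consists of the LSC, the SSC and two copies of the IR. The short Python check below uses only figures quoted in the text; the variable names are arbitrary.

```python
# Region sizes in bp, as reported in the text.
LSC = 89_209
SSC = 18_362
IR = 26_810  # each of the two inverted repeat copies

# The genome is LSC + SSC + two IR copies.
total = LSC + SSC + 2 * IR
print(total)
```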
Fig. 1. Map of the H. brasiliensis chloroplast genome. The thick lines indicate the extent of the inverted repeats (IRa and IRb), which separate the genome into small and large single
copy regions. Genes on the outside of the map are transcribed clockwise and those on the inside of the map are transcribed counterclockwise. Genes containing introns and
pseudogenes are marked with * and #, respectively. Arrows indicate the positions of a unique 30-kb rearrangement relative to the cassava chloroplast genome.
3.2. Genome content and organization
The positions of all the genes identified in the H. brasiliensis
chloroplast genome and the functional categorization of these genes are
presented in Fig. 1. The genome contains 112 unique genes including
30 tRNA genes, 4 rRNA genes and 78 predicted protein coding genes
(Table 1). In addition, there are 16 genes duplicated in the inverted
repeat (IR), making a total of 128 genes present in the rubber tree
chloroplast genome. Coding regions (90,532 bp; 56.16%) account for
over half of the chloroplast genome, with the peptide-coding regions
forming the largest group (78,681 bp; 48.81%) followed by ribosomal
RNA genes (9050 bp; 5.61%) and transfer RNA genes (2801 bp; 1.74%).
The remaining 43.84% is covered by intergenic regions (29.61%) and a
total of 23 introns (13.27%) present within 22 genes (or 17 unique
genes). The trnK-UUU gene has the largest intron (2535 bp), in which
the matK gene is present. There are 30 unique tRNA genes (7 of which
are duplicated in the IR), which recognize all RNA codons for the 20
amino acids according to the wobble rules. Based on the sequences of
protein-coding genes and tRNA genes within the chloroplast genome,
we were able to deduce the frequency of codon usage as summarized
in Supplementary Table 3. We observed that the codon usage was
biased towards a high representation of A and U at the third codon
position, as in other land plants (Shimada and Sugiura, 1991; Cai
et al., 2006; Gao et al., 2009). The rubber tree psbC and rps19 genes
contain GUG as a start codon. Sequence alignment between the
chloroplast genome and the rubber tree ESTs also confirmed that both
psbC and rps19 transcripts have GUG as the start codon. Studies of
psbC and rps19 translation have also shown that GUG is the
initiation codon in several plants and algae (Rochaix et al., 1989;
Carpenter et al., 1990; Yukawa et al., 2005; Kuroda et al., 2007).
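The third-position bias mentioned above can be quantified by tallying codons in each coding sequence and computing the share that end in A or U. The sketch below shows one minimal way to do this; it is not the method used to produce Supplementary Table 3, and the function names and the toy sequence are hypothetical.

```python
from collections import Counter


def codon_counts(cds):
    """Count codons in an in-frame coding sequence (DNA alphabet)."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length must be a multiple of three")
    return Counter(cds[i:i + 3] for i in range(0, len(cds), 3))


def third_position_at_fraction(counts):
    """Fraction of codons whose third base is A or T (U in RNA)."""
    total = sum(counts.values())
    at_ending = sum(n for codon, n in counts.items() if codon[2] in "AT")
    return at_ending / total


# Invented toy CDS that starts with GTG, the DNA form of the GUG start codon.
toy_cds = "GTGGCTAAAGGATTTTAA"
counts = codon_counts(toy_cds)
fraction = third_position_at_fraction(counts)
print(counts, round(fraction, 3))
```

Here the stop codon is counted like any other codon; whether to include it is a design choice that should be stated when reporting such fractions.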
The previously sequenced chloroplast genomes of Malpighiales
(Manihot, Jatropha and Populus) and that of Hevea reported here were
compared using the Arabidopsis chloroplast genome as the reference
sequence (Darling et al., 2004) (Supplementary Fig. 1). We observed a
unique rearrangement of a 30-kb fragment in the LSC
between trnS(GCU)-trnE(UUC) and trnR(UCU)-trnT(GGU) in the
rubber tree chloroplast genome compared with the others. Although
we were unable to identify any significant repeats in the spacers between
the rubber tree trnT(GGU)-trnR(UCU) and trnE(UUC)-trnS(GCU),
these regions are biased towards high AT content (85.54% and
84.92%, respectively).
Analysis of the repeat sequences in the rubber tree chloroplast
genome identified twenty-five direct repeats and seventeen inverted
repeats of 30 bp or longer with a sequence identity of ≥90%
(Supplementary Table 4). Eighteen repeats are 30 to 40 bp long,
eleven repeats are 41–50 bp long, seven repeats are 51–80 bp long,
and six repeats are longer than 80 bp. The longest repeat in rubber
tree chloroplast DNA is a 151-bp direct repeat between trnG-GCC
and trnT-GGU. Most of the direct repeats are distributed within the
intergenic spacer regions, the intron sequences, and the tRNA and
ycf2 genes.
Two ycf genes (ycf15 and ycf68) are probably not functional in the
rubber tree chloroplast genome due to the presence of premature stop
codons. In several chloroplast genomes, ycf15 and ycf68 have also
been reported as non-functional genes (Sato et al., 1999; Schmitz-
Linneweber et al., 2001; Steane, 2005; Raubeson et al., 2007; Daniell
et al., 2008). The infA gene is present but probably non-functional
in the rubber tree chloroplast genome due to the presence of a
premature stop codon in both chloroplast DNA and cDNA sequences.
The loss of infA from the chloroplast genome has been reported to
occur multiple times during angiosperm evolution (Millen et al.,
2001). The infA gene has been lost from the M. esculenta chloroplast
genome, the closest fully sequenced relative of H. brasiliensis, but it is
present in Populus, another plant species in the order Malpighiales
(Millen et al., 2001; Daniell et al., 2008).
Table 1
Genes encoded by the Hevea brasiliensis chloroplast genome.
1. Photosystem I: psaA, psaB, psaC, psaI, psaJ, ycf3[a], ycf4
2. Photosystem II: psbA, psbB, psbC, psbD, psbE, psbF, psbH, psbI, psbJ, psbK, psbL, psbM, psbN, psbT, psbZ
3. Cytochrome b6/f: petA, petB[b], petD[b], petG, petL, petN
4. ATP synthase: atpA, atpB, atpE, atpF, atpH, atpI
5. Rubisco: rbcL
6. NADH oxidoreductase: ndhA[b], ndhB[b,c], ndhC, ndhD, ndhE, ndhF, ndhG, ndhH, ndhI, ndhJ, ndhK
7. Large subunit ribosomal proteins: rpl2[b,c], rpl14, rpl16[b], rpl20, rpl22, rpl23[c], rpl32, rpl33, rpl36
8. Small subunit ribosomal proteins: rps2, rps3, rps4, rps7[c], rps8, rps11, rps12[b,c,d], rps14, rps15, rps16[b], rps18, rps19
9. RNAP: rpoA, rpoB, rpoC1[b], rpoC2
10. Other proteins: accD, ccsA, cemA, clpP[a], matK
11. Proteins of unknown function: ycf1, ycf2[c]
12. Ribosomal RNAs: rrn16[c], rrn23[c], rrn4.5[c], rrn5[c]
13. Transfer RNAs: A(UGC)[b,c], C(GCA), D(GUC), E(UUC), F(GAA), G(GCC)[b], G(UCC), H(GUG), I(CAU)[c], I(GAU)[b,c], K(UUU)[b], L(CAA)[c], L(UAA)[b], L(UAG), fM(CAU), M(CAU), N(GUU)[c], P(UGG), Q(UUG), R(ACG)[c], R(UCU), S(GCU), S(GGA), S(UGA), T(GGU), T(UGU), V(GAC)[c], V(UAC)[b], W(CCA), Y(GUA)
[a] Gene containing two introns.
[b] Gene containing a single intron.
[c] Two gene copies in the IRs.
[d] Gene divided into two independent transcription units.
Table 2
RNA editing events in the rubber tree chloroplast genes.
The annotation nomenclature of RNA editing events follows Lenz et al. (2009).
Number RNA editing site
1 matKeU1168RW
2 matKeU634HY
3 matKeU149SF
4 rps16i493
5 rpoBeU551SL
6 rpoC1eU41SL
7 rpoC2eU3746SL
8 rps2eU134TI
9 rps2eU248SL
10 atpIeU635SL
11 psbDeU435II
12 rps14eU149PL
13 rps14eU80SL
14 ndhKeU65SL
15 ndhCeU323SL
16 psbEeU214PS
17 petLeU5PL
18 rps18eU221SL
19 clpPeU556HY
20 psbBeU414II
21 petBi178
22 petBeU611SL
23 petDeU481SL
24 rpoAeU836SL
25 rpoAeU200SF
26 rpl23eU89SL
27 ycf2eU467PL
28 ycf2eC1608VV
29 ycf2eA1645VI
30 ndhBeU1481PL
31 ndhBeC1290YY
32 ndhBeU1255HY
33 ndhBeU59SL
34 ndhBeU830SL
35 ndhBeU746SF
36 ndhBeU611SL
37 ndhBeU586HY
38 ndhBeU542TM
39 ndhBeU467PL
40 ndhBeU149SL
41 rps12-3endi186
42 ndhDeU887PL
43 ndhDeU878SL
44 ndhDeU674SL
45 ndhDeU599SL
46 ndhDeU313RW
47 ndhEeU233PL
48 ndhGeU347PL
49 ndhAeU961PS
50 ndhAeU566SL
51 ndhHeU505HY
3.3. RNA editing search by comparison between coding sequences and
cDNAs
To determine the RNA editing sites, we compared the protein
coding sequences extracted from the rubber tree chloroplast genome
with 2,265,782 rubber tree cDNA sequences downloaded from the
DDBJ read archive (ID=DRA000170). The chloroplast gene sequences
matched 52,971 out of 2.2 million rubber tree ESTs (2.23%). There
were 6765 EST reads (2,059,201 bp) mapped to chloroplast protein
coding genes, which is equivalent to 23× coverage of the chloroplast
coding region. Table 2 presents the 51 RNA editing sites identified,
named according to the universal nomenclature proposed by Lenz
et al. (2009). Forty-eight were in the coding regions of 26 protein
coding genes and 3 were in introns of rps16, petB and
rps12. Of the 48 RNA editing sites in mRNA, a C-to-U change was the most
common (45), followed by a U-to-C change (2) and a G-to-A change
(1). In chloroplasts and mitochondria of seed plants, conversion
from C to U is the predominant form (Bock, 2000). The reverse
U-to-C editing is rarely observed in seed plants (Gualberto et al., 1990;
Schuster et al., 1990), but it is common in hornworts and ferns
(Yoshinaga et al., 1996; Steinhauser et al., 1999; Vangerow et al.,
1999). The two U-to-C events of RNA editing were found only in ndhB
and ycf2 transcripts, at sites that are very close to each other (7266 bp
apart). It is also possible that this fragment of the chloroplast genome
has been transferred to the mitochondrial genome, where extensive RNA
editing occurs. Several lines of evidence have suggested
translocation of chloroplast DNA fragments to mitochondrial genomes
in many plant species (Stern and Lonsdale, 1982; Stern and Palmer,
1984; Moon et al., 1987). Further experiments on cDNA sequencing of
transcripts extracted from isolated chloroplasts will be required to
test this hypothesis.
An uncommon G-to-A change at ycf2eA1645VI observed here
has not previously been reported in chloroplasts of higher land plants,
although A-to-I/G editing has been commonly observed in tRNAs,
where it expands the ability to read additional codons (Pfitzinger et al., 1990;
Dao et al., 1994; Agris et al., 2007). Recently, the adenosine deaminase
acting on tRNAs (ADAT) gene responsible for editing the
adenosine at the wobble position of cp-tRNA-Arg(ACG) was
identified in Arabidopsis chloroplasts (Delannoy et al., 2009; Karcher
and Bock, 2009). However, it should be noted that only two edited
ycf2 transcripts (G to A) were found, compared with six unedited
ycf2 transcripts, and this G-to-A conversion may be due to sequencing
error, in which case the number of RNA editing events in this study
would be overestimated.
There are 45 non-synonymous substitutions, which occur
most frequently in ndhB (11), followed by ndhD (5). The ndhB
At.genomic MIWHVQNENFILDSTRIFMKAFHLLLFDGSFIFPECILIFGLILLLMIDSTSDQKDIPWLYFISSTSFVMSITALLFRWREEPMISFSGNFQTNNFNEIFQFLILLCSTLCIPLSVEYIECTEMAITE
At.cDNA MIWHVQNENFILDSTRIFMKAFHLLLFDGSFIFPECILIFGLILLLMIDLTSDQKDIPWLYFISSTSFVMSITALLFRWREEPMISFSGNFQTNNFNEIFQFLILLCSTLCIPLSVEYIECTEMAITE
-------------------------------------------------*------------------------------------------------------------------------------
Sl.genomic MIWHVQNENFILDSTRIFMKAFHLLLFDGSLIFPECILIFGLILLLMIDSTSDQKDIPWLYFISSTSLVMSITALLFRWREEPMISFSGNFQTNNFNEIFQFLILLCSTLCIPLSVEYIECTEMAITE
Sl.cDNA MIWHVQNENFILDSTRIFMKAFHLLLFDGSLIFPECILIFGLILLLMIDLTSDQKDIPWLYFISSTSLVMSITALLFRWREEPMISFSGNFQTNNFNEIFQFLILLCSTLCIPLSVEYIECTEMAITE
-------------------------------------------------*------------------------------------------------------------------------------
Nt.genomic MIWHVQNENFILDSTRIFMKAFHLLLFDGSLIFPECILIFGLILLLMIDSTSDQKDIPWLYFISSTSLVMSITALLFRWREEPMISFSGNFQTNNFNEIFQFLILLCSTLCIPLSVEYIECTEMAITE
Nt.cDNA MIWHVQNENFILDSTRIFMKAFHLLLFDGSLIFPECILIFGLILLLMIDLTSDQKDIPWLYFISSTSLVMSITALLFRWREEPMISFSGNFQTNNFNEIFQFLILLCSTLCIPLSVEYIECTEMAITE
-------------------------------------------------*------------------------------------------------------------------------------
Hb.genomic MIWHVQNENFILDSTRIFMKAFHLLLFDGSFIFPECILIFGLILLLMIDSTSDQKDIPWLYFISSTSLVMSITALLFRWREEPMISFSGNFQTNNFNEIFQFLILLCSTLCIPLSVEYIECTEMAITE
Hb.cDNA MIWHVQNENFILDSTRIFMKAFHLLLFDGSFIFPECILIFGLILLLMIDLTSDQKDIPWLYFISSTSLVMSITALLFRWREEPMISFSGNFQTNNFNEIFQFLILLCSTLCIPLSVEYIECTEMAITE
-------------------------------------------------*------------------------------------------------------------------------------
At.genomic FLLFILTATLGGMFLCGANDLITIFVAPECFSLCSYLLSGYTKKDIRSNEATMKYLLMGGASSSILVHGFSWLYGSSGGEIELQEIVNGLINTQMYNSPGISIALIFITVGIGFKLSLAPSHQWTPDV
At.cDNA FLLFILTATLGGMFLCGANDLITIFVALECFSLCSYLLSGYTKKDIRSNEATMKYLLMGGASSSILVYGFSWLYGSSGGEIELQEIVNGLINTQMYNSPGISIALIFITVGIGFKLSLAPFHQWTPDV
---------------------------*---------------------------------------*----------------------------------------------------*-------
Sl.genomic FLLFVLTATLGGMFLCGANDLITIFVAPECFSLCSYLLSGYTKKDVRSNEATMKYLLMGGASSSILVHGFSWLYGSSGGEIELQEIVNGLINTQMYNSPGISIALIFITVGIGFKLSPAPSHQWTPDV
Sl.cDNA FLLFVLTATLGGMFLCGANDLITIFVALECFSLCSYLLSGYTKKDVRSNEATMKYLLMGGASSSILVYGFSWLYGLSGGEIELQEIVNGLINTQMYNSPGISIALIFITVGIGFKLSLAPFHQWTPDV
---------------------------*---------------------------------------*-------*-----------------------------------------*--*-------
Nt.genomic FLLFVLTATLGGMFLCGANDLITIFVAPECFSLCSYLLSGYTKKDVRSNEATMKYLLMGGASSSILVHGFSWLYGSSGGEIELQEIVNGLINTQMYNSPGISIALIFITVGIGFKLSPAPSHQWTPDV
Nt.cDNA FLLFVLTATLGGMFLCGANDLITIFVALECFSLCSYLLSGYTKKDVRSNEATMKYLLMGGASSSILVYGFSWLYGLSGGEIELQEIVNGLINTQMYNSPGISIALIFITVGIGFKLSLAPFHQWTPDV
---------------------------*---------------------------------------*-------*-----------------------------------------*--*-------
Hb.genomic FLLFVLTATLGGMFLCGANDLITIFVAPECFSLCSYLLSGYTKKDVRSNEATTKYLLMGGASSSILVHAFSWLYGSSGGEIELQEIVNGLINTQMYNSPGISIALIFITVGIGFKLSLAPSHQWTPDV
Hb.cDNA FLLFVLTATLGGMFLCGANDLITIFVALECFSLCSYLLSGYTKKDVRSNEATMKYLLMGGASSSILVYAFSWLYGLSGGEIELQEIVNGLINTQMYNSPGISIALIFITVGIGFKLSLAPFHQWTPDV
---------------------------*------------------------*--------------*-------*--------------------------------------------*-------
At.genomic YEGSPTPVVAFLSVTSKVAASASATRIFDIPFYFSSNEWHLLLEILAILSMIFGNLIAITQTSMKRMLAYSSIGQIGYVIIGIIVGDSNGGYASMITYMLFYIAMNLGTFACIILFGLRTGTDNIRDY
At.cDNA YEGSPTPVVAFLSVTSKVAALALATRIFDIPFYFSSNEWHLLLEILAILSMIFGNLIAITQTSMKRMLAYSSIGQIGYVIIGIIVGDSNGGYASMITYMLFYIAMNLGTFACIILFGLRTGTDNIRDY
--------------------*-*---------------------------------------------------------------------------------------------------------
Sl.genomic YEGSPTPVVAFLSVTSKVAASASATRIFNIPFYFSSNEWHLLLEILAILSMILGNLIAITQTSMKRMLAYSSIGQIGYVIIGIIVGDSNDGYASMITYMLFYISMNLGTFACIVLFGLRTGTDNIRDY
Sl.cDNA YEGSPTPVVAFLSVTSKVAALALATRIFNIPFYFSSNEWHLLLEILAILSMILGNLIAITQTSMKRMLAYSSIGQIGYVIIGIIVGDSNDGYASMITYMLFYISMNLGTFACIVLFGLRTGTDNIRDY
--------------------*-*---------------------------------------------------------------------------------------------------------
Nt.genomic YEGSPTPVVAFLSVTSKVAASASATRIFDIPFYFSSNEWHLLLEILAILSMILGNLIAITQTSMKRMLAYSSIGQIGYVIIGIIVGDSNDGYASMITYMLFYISMNLGTFACIVLFGLRTGTDNIRDY
Nt.cDNA YEGSPTPVVAFLSVTSKVAALALATRIFDIPFYFSSNEWHLLLEILAILSMILGNLIAITQTSMKRMLAYSSIGQIGYVIIGIIVGDSNDGYASMITYMLFYISMNLGTFACIVLFGLRTGTDNIRDY
--------------------*-*---------------------------------------------------------------------------------------------------------
Hb.genomic YEGSPTPVVAFLSVTSKVAASASATRIFDIPFYFSSNEWHLLLEILAILSMIVGNLIAITQTSMKRMLAYSSIGQIGYVIIGIIVGDSNGGYASMITYMLFYISMNLGTFACIVLFGLRTGTDNIRDY
Hb.cDNA YEGSPTPVVAFLSVTSKVAALALATRIFDIPFYFSSNEWHLLLEILAILSMIVGNLIAITQTSMKRMLAYSSIGQIGYVIIGIIVGDSNGGYASMITYMLFYISMNLGTFACIVLFGLRTGTDNIRDY
--------------------*-*---------------------------------------------------------------------------------------------------------
At.genomic AGLYTKDPFLALSLALCLLSLGGLPPLAGFFGKLHLFWCGWQAGLYFLVSIGLLTSVLSIYYYLKIIKLLMTGRNQEITPHMRNYRISPLRSNNSIELSMIVCVIASTIPGISMNPIIAIAQDTLFSF
At.cDNA AGLYTKDPFLALSLALCLLSLGGLPPLAGFFGKLYLFWCGWQAGLYFLVSIGLLTSVLSIYYYLKIIKLLMTGRNQEITPHMRNYRISPLRSNNSIELSMIVCVIASTILGISMNPIIAIAQDTLFSF
----------------------------------*--------------------------------------------------------------------------*------------------
Sl.genomic AGLYTKDPFLALSLALCLLSLGGLPPLAGFFGKLYLFWCGWQAGLYFLVLIGLLTSVVSIYYYLKIIKLLMTGRNQEITPHVRNYRRSPLRSNNSIELSMIVCVIASTIPGISMNPIIAIAQDSLF--
Sl.cDNA AGLYTKDPFLALSLALCLLSLGGLPPLAGFFGKLYLFWCGWQAGLYFLVLIGLLTSVVSIYYYLKIIKLLMTGRNQEITPHVRNYRRSPLRSNNSIELSMIVCVIASTILGISMNPIIAIAQDSLF--
-------------------------------------------------------------------------------------------------------------*----------------
Nt.genomic AGLYTKDPFLALSLALCLLSLGGLPPLAGFFGKLYLFWCGWQAGLYFLVLIGLLTSVVSIYYYLKIIKLLMTGRNQEITPHVRNYRRSPLRSNNSIELSMIVCVIASTIPGISMNPIIAIAQDSLF--
Nt.cDNA AGLYTKDPFLALSLALCLLSLGGLPPLAGFFGKLYLFWCGWQAGLYFLVLIGLLTSVVSIYYYLKIIKLLMTGRNQEITPHVRNYRRSPLRSNNSIELSMIVCVIASTILGISMNPIIAIAQDSLF--
-------------------------------------------------------------------------------------------------------------*----------------
Hb.genomic AGLYTKDPFLALSLALCLLSLGGLPPLAGFFGKLHLFWCGWQAGLYFLVLIGLLTSVVSIYYYLKIIKLLMTGRNQEITPHVRNYRRSPLRSNNSIELSMIVCVIASTIPGISMNPIVEIAQDTLF--
Hb.cDNA AGLYTKDPFLALSLALCLLSLGGLPPLAGFFGKLYLFWCGWQAGLYFLVLIGLLTSVVSIYYYLKIIKLLMTGRNQEITPHVRNYRRSPLRSNNSIELSMIVCVIASTILGISMNPIVEIAQDTLF--
----------------------------------*----------*---------------------------------------------------------------*----------------
Fig. 2. Sequence alignment of ndhB proteins translated from chloroplast genomes before RNA editing and cDNAs after RNA editing of Arabidopsis thaliana (At), Solanum lycopersicum
(Sl), Nicotiana tabacum (Nt) and Hevea brasiliensis (Hb) using CLUSTAL 2.0.12. Stars represent RNA editing sites and hyphens represent unedited sites.
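The site calls in Fig. 2 amount to a column-by-column comparison of the protein translated from the genome with the protein translated from the cDNA. The sketch below illustrates that comparison on the first 30 residues of the Hb.genomic and Hb.cDNA rows shown in the figure. The function name `editing_sites` is a hypothetical helper introduced here for illustration; it is not part of CLUSTAL or of the authors' pipeline.

```python
def editing_sites(genomic, cdna):
    """Return (position, genomic_residue, cdna_residue) for each mismatch.

    Positions are 1-based. Both strings must already be aligned and of
    equal length, as the rows of Fig. 2 are.
    """
    if len(genomic) != len(cdna):
        raise ValueError("aligned sequences must have equal length")
    return [
        (pos, g, c)
        for pos, (g, c) in enumerate(zip(genomic, cdna), start=1)
        if g != c
    ]


# First 30 residues of the Hevea brasiliensis ndhB rows in Fig. 2.
hb_genomic = "YEGSPTPVVAFLSVTSKVAASASATRIFDI"
hb_cdna = "YEGSPTPVVAFLSVTSKVAALALATRIFDI"

for pos, before, after in editing_sites(hb_genomic, hb_cdna):
    # Prints one line per edited residue, e.g. S21L.
    print(f"{before}{pos}{after}")
```

On this excerpt the comparison reports two serine-to-leucine changes, at residues 21 and 23 of the excerpt, matching the two stars under those rows.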
S. Tangphatsornruang et al. / Gene 475 (2011) 104–112
transcripts were also found to be highly edited in other plants such as
maize, sugarcane, rice, barley, tomato, tobacco and Arabidopsis (Freyer
et al., 1995; Kahlau et al., 2006; Chateigner-Boutin and Small, 2007).
Fig. 2 shows an amino acid sequence alignment of the ndhB proteins
from Arabidopsis, tobacco, tomato and rubber tree with the RNA editing
positions marked. All RNA editing events in the highly edited ndhB transcripts
maintained the conserved ndhB amino acid sequence in all 4 plant
species. We also observed that 40 RNA editing events in rubber tree
chloroplasts changed the encoded amino acid to a highly hydrophobic
residue (such as L, F, I, M, V and W), with conversions from serine
to leucine being the most frequent. The majority of RNA
editing events in messenger RNAs occurred at the second codon position
(36), followed by the first codon position (10) and the third codon
position (2). For editing events at the second codon position, there was a
bias toward a pyrimidine nucleotide 5′ (upstream) and a purine
nucleotide 3′ (downstream) of the edited site. However, it is unclear whether these
biases reflect evolutionary constraints or mechanistic limitations of the editing
process.
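To make the link between codon position and amino acid change concrete, the following sketch applies a C-to-U edit to a codon and translates it before and after the edit using the standard genetic code. The three codons are illustrative choices that reproduce the residue changes visible in Fig. 2 (S to L, P to L, H to Y); they are assumptions for the example and are not taken from the rubber tree sequence. The helper names are likewise hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def edit_c_to_u(codon, position):
    """Apply a C-to-U edit at a 1-based codon position (U is written as T)."""
    idx = position - 1
    if codon[idx] != "C":
        raise ValueError(f"no C at position {position} of {codon}")
    return codon[:idx] + "T" + codon[idx + 1:]


def residue_change(codon, position):
    """Return (residue before editing, residue after editing)."""
    return CODON_TABLE[codon], CODON_TABLE[edit_c_to_u(codon, position)]


print(residue_change("TCA", 2))  # ('S', 'L'): second-position edit
print(residue_change("CCA", 2))  # ('P', 'L'): second-position edit
print(residue_change("CAT", 1))  # ('H', 'Y'): first-position edit

# Codon-position counts reported in the text for edits in mRNAs.
counts = {"first": 10, "second": 36, "third": 2}
print(sum(counts.values()))  # 48, the number of sites in protein-coding genes
```

A second-position edit in a serine or proline codon always yields a hydrophobic residue here, which is consistent with the predominance of second-position edits reported above.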
3.4. Phylogenetic analysis
Our phylogenetic data set included 33 protein-coding genes for 39
plant taxa (Supplementary Table 5), comprising 37 angiosperms and
two outgroup gymnosperms (Ginkgo and Pinus). These 33 genes are
present in the chloroplast genome of each of the 39 species, so
missing data in the sequence alignment were
minimized. The sequence alignment that was used for phylogenetic
analyses comprised 26,585 characters. ML analysis resulted in a single
tree with −ln L = 230655.55 (Fig. 3). ML bootstrap values were also
high, with values of ≥95% for 36 of the 37 nodes, and 34 nodes with
100% bootstrap support (Fig. 3). MP analysis resulted in a single
resolved tree with a length of 40,109, a consistency index of 0.48 and a
retention index of 0.657 (not shown). Bootstrap analyses indicated
that 32 of the 36 nodes had values of 100%.
Fig. 3. The phylogenetic relationships based on 33 protein-coding genes from 39 plant taxa, with an ML value of −lnL = 230655.55. Numbers above nodes are bootstrap support
values. Ordinal and higher level group names are also indicated. Scale bar: 0.1 substitutions/site.
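The bootstrap summary for the ML tree can be checked with a simple tally. The sketch below assumes the 37 node support values as they were printed on the tree in Fig. 3 (34 nodes at 100, plus single nodes at 99, 97 and 73); the helper `count_at_least` is a hypothetical name used only for this illustration.

```python
# Node support values as printed on the ML tree in Fig. 3 (one per node).
bootstrap = [100] * 34 + [99, 97, 73]


def count_at_least(values, threshold):
    """Count how many support values meet or exceed the threshold."""
    return sum(v >= threshold for v in values)


total_nodes = len(bootstrap)
print(total_nodes)                     # 37 nodes in total
print(count_at_least(bootstrap, 95))   # 36 nodes with support of at least 95%
print(count_at_least(bootstrap, 100))  # 34 nodes with 100% support
```

Only one node (support 73) falls below the 95% threshold, which agrees with the counts quoted in the text.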
Both the MP and ML trees had similar topologies with two major
clades, Monocots and Eudicots, and with Amborella as the earliest
diverging angiosperm lineage. The only incongruence between the
MP and ML trees was the position of Calycanthus. In the MP tree,
Calycanthus was placed sister to Eudicots, whereas it was positioned
close to both Monocots and Eudicots in the ML tree. This incongruence
was also observed in previous phylogenetic studies (Leebens-Mack et al.,
2005; Bausher et al., 2006; Jansen et al., 2006; Ruhlman et al., 2006).
Some studies supported Monocots as the sister clade to Magnoliids +
Eudicots (Nickrent et al., 2002; Zanis et al., 2002). However,
phylogenies based on phytochromes (Mathews and Donoghue,
1999), 17 cp genes (Graham and Olmstead, 2000), 21 cp genes
(Tangphatsornruang et al., 2010b) and 61 cp genes (Cai et al., 2006;
Lee et al., 2006; Hansen et al., 2007) supported Magnoliids as sister to
Monocots and Eudicots. By sequencing three chloroplast genomes of
Magnoliids, Cai et al. (2006) provided strong support for Monocots and
Eudicots as sister clades, with Magnoliids diverging before the
Monocots–Eudicots split.
Our MP and ML trees recovered Monocots and Eudicots as monophyletic
groups, with Ranunculales placed sister to the remaining
Eudicots. The overall structure of the trees is similar to that of previously
reported trees (Lee et al., 2006; Daniell et al., 2008; Logacheva et al.,
2008; Tangphatsornruang et al., 2010b). Addition of the H. brasiliensis
chloroplast genes placed Hevea sister to Manihot, grouped both
with Jatropha and Populus in the order Malpighiales, and
provided strong support for a monophyletic eurosid I group.
The relationships within Malpighiales were also supported by
a study based on the atpF gene (Daniell et al., 2008).
4. Conclusion
We performed shotgun genome sequencing of H. brasiliensis using
454 pyrosequencing technology and obtained the complete
chloroplast genome sequence. The approach is demonstrated
here to be a fast and efficient way of obtaining organellar genomes.
The gene content and structural organization of the rubber tree
chloroplast genome are similar to those of M. esculenta, with the
exception of the rearrangement of a 30-kb fragment in the LSC. By
comparing the rubber tree chloroplast genes with the cDNA sequences,
we determined the distribution and location of RNA editing sites
in the chloroplast genome. The proposed phylogenetic relationships
among angiosperms, based on chloroplast DNA sequences including
those of the rubber tree reported here, provided
strong support for a monophyletic eurosid I group and
demonstrated a close relationship between Hevea, Manihot, Jatropha
and Populus in Malpighiales.
Supplementary materials related to this article can be found online
at doi:10.1016/j.gene.2011.01.002.
Acknowledgements
We acknowledge funding support by the National Center for
Genetic Engineering and Biotechnology, Thailand.
References
Agris, P.F., Vendeix, F.A., Graham, W.D., 2007. tRNA's wobble decoding of the genome:
40 years of modification. J. Mol. Biol. 366, 1–13.
Altschul, S.F., Gish, W., Miller, W., Myers, E.W., Lipman, D.J., 1990. Basic local alignment
search tool. J. Mol. Biol. 215, 403–410.
Bausher, M.G., Singh, N.D., Lee, S.B., Jansen, R.K., Daniell, H., 2006. The complete
chloroplast genome sequence of Citrus sinensis (L.) Osbeck var Ridge Pineapple:
organization and phylogenetic relationships to other angiosperms. BMC Plant Biol.
6, 21.
Birch-Machin, I., Newell, C.A., Hibberd, J.M., Gray, J.C., 2004. Accumulation of rotavirus
VP6 protein in chloroplasts of transplastomic tobacco is limited by protein stability.
Plant Biotechnol. J. 2, 261–270.
Bock, R., 2000. Sense from nonsense: how the genetic information of chloroplasts is
altered by RNA editing. Biochimie 82, 549–557.
Bock, R., 2007. Plastid biotechnology: prospects for herbicide and insect resistance,
metabolic engineering and molecular farming. Curr. Opin. Biotechnol. 18, 100–106.
Bock, R., Hermann, M., Kossel, H., 1996. In vivo dissection of cis-acting determinants for
plastid RNA editing. EMBO J. 15, 5052–5059.
Bowman, C.M., Barker, R.F., Dyer, T.A., 1988. In wheat ctDNA, segments of ribosomal
protein genes are dispersed repeats, probably conserved by nonreciprocal
recombination. Curr. Genet. 14, 127–136.
Cai, Z., Penaflor, C., Kuehl, J.V., Leebens-Mack, J., Carlson, J.E., dePamphilis, C.W.,
Boore, J.L., Jansen, R.K., 2006. Complete plastid genome sequences of Drimys,
Liriodendron, and Piper: implications for the phylogenetic relationships of
magnoliids. BMC Evol. Biol. 6, 77.
Carpenter, S.D., Charite, J., Eggers, B., Vermaas, W.F., 1990. The psbC start codon in
Synechocystis sp. PCC 6803. FEBS Lett. 260, 135–137.
Chappell, J., 1995a. The biochemistry and molecular biology of isoprenoid metabolism.
Plant Physiol. 107, 1–6.
Chappell, J., 1995b. Biochemistry and molecular biology of the isoprenoid biosynthetic
pathway in plants. Annu. Rev. Plant Physiol. Plant Mol. Biol. 46, 521–547.
Chateigner-Boutin, A.L., Small, I., 2007. A rapid high-throughput method for the
detection and quantification of RNA editing based on high-resolution melting of
amplicons. Nucleic Acids Res. 35, e114.
Chaudhuri, S., Maliga, P., 1996. Sequences directing C to U editing of the plastid psbL
mRNA are located within a 22 nucleotide segment spanning the editing site. EMBO
J. 15, 5958–5964.
Chaudhuri, S., Carrer, H., Maliga, P., 1995. Site-specific factor involved in the editing of
the psbL mRNA in tobacco plastids. EMBO J. 14, 2951–2957.
Corneille, S., Lutz, K., Maliga, P., 2000. Conservation of RNA editing between rice and maize
plastids: are most editing events dispensable? Mol. Gen. Genet. 264, 419–424.
Cornish, K., 2001a. Similarities and differences in rubber biochemistry among plant
species. Phytochemistry 57, 1123–1134.
Cornish, K., 2001b. Similarities and differences in rubber biochemistry among plant
species. Phytochemistry 57, 1123–1134.
Cronn, R., Liston, A., Parks, M., Gernandt, D.S., Shen, R., Mockler, T., 2008. Multiplex
sequencing of plant chloroplast genomes using Solexa sequencing-by-synthesis
technology. Nucleic Acids Res. 36.
Daniell, H., Khan, M.S., Allison, L., 2002. Milestones in chloroplast genetic engineering:
an environmentally friendly era in biotechnology. Trends Plant Sci. 7, 84–91.
Daniell, H., Wurdack, K.J., Kanagaraj, A., Lee, S.B., Saski, C., Jansen, R.K., 2008. The
complete nucleotide sequence of the cassava (Manihot esculenta) chloroplast
genome and the evolution of atpF in Malpighiales: RNA editing and multiple losses
of a group II intron. Theor. Appl. Genet. 116, 723–737.
Dao, V., Guenther, R., Malkiewicz, A., Nawrot, B., Sochacka, E., Kraszewski, A.,
Jankowska, J., Everett, K., Agris, P.F., 1994. Ribosome binding of DNA analogs of
tRNA requires base modifications and supports the extended anticodon. Proc.
Natl Acad. Sci. USA 91, 2125–2129.
Darling, A.C., Mau, B., Blattner, F.R., Perna, N.T., 2004. Mauve: multiple alignment of
conserved genomic sequence with rearrangements. Genome Res. 14, 1394–1403.
Delannoy, E., Le Ret, M., Faivre-Nitschke, E., Estavillo, G.M., Bergdoll, M., Taylor, N.L.,
Pogson, B.J., Small, I., Imbault, P., Gualberto, J.M., 2009. Arabidopsis tRNA adenosine
deaminase arginine edits the wobble nucleotide of chloroplast tRNAArg(ACG) and
is essential for efficient chloroplast translation. Plant Cell 21, 2058–2071.
Dhingra, A., Folta, K.M., 2005. ASAP: amplification, sequencing & annotation of
plastomes. BMC Genomics 6.
Edgar, R.C., 2004. MUSCLE: multiple sequence alignment with high accuracy and high
throughput. Nucleic Acids Res. 32, 1792–1797.
Freyer, R., Lopez, C., Maier, R.M., Martin, M., Sabater, B., Kossel, H., 1995. Editing of the
chloroplast ndhB encoded transcript shows divergence between closely related
members of the grass family (Poaceae). Plant Mol. Biol. 29, 679–684.
Gao, L., Yi, X., Yang, Y.X., Su, Y.J., Wang, T., 2009. Complete chloroplast genome sequence
of a tree fern Alsophila spinulosa: insights into evolutionary changes in fern
chloroplast genomes. BMC Evol. Biol. 9, 130.
Goremykin, V.V., Hirsch-Ernst, K.I., Wolfl, S., Hellwig, F.H., 2003. Analysis of the
Amborella trichopoda chloroplast genome sequence suggests that Amborella is not a
basal angiosperm. Mol. Biol. Evol. 20, 1499–1505.
Goremykin, V.V., Hirsch-Ernst, K.I., Wolfl, S., Hellwig, F.H., 2004. The chloroplast
genome of Nymphaea alba: whole-genome analyses and the problem of identifying
the most basal angiosperm. Mol. Biol. Evol. 21, 1445–1454.
Graham, S.W., Olmstead, R.G., 2000. Utility of 17 chloroplast genes for inferring the
phylogeny of the basal angiosperms. Am. J. Bot. 87, 1712–1730.
Gualberto, J.M., Weil, J.H., Grienenberger, J.M., 1990. Editing of the wheat coxIII
transcript: evidence for twelve C to U and one U to C conversions and for sequence
similarities around editing sites. Nucleic Acids Res. 18, 3771–3776.
Guo, X., Castillo-Ramirez, S., Gonzalez, V., Bustos, P., Fernandez-Vazquez, J.L.,
Santamaria, R.I., Arellano, J., Cevallos, M.A., Davila, G., 2007. Rapid evolutionary
change of common bean (Phaseolus vulgaris L) plastome, and the genomic
diversification of legume chloroplasts. BMC Genomics 8, 228.
Halter, C.P., Peeters, N.M., Hanson, M.R., 2004. RNA editing in ribosome-less plastids of
iojap maize. Curr. Genet. 45, 331–337.
Hammani, K., Okuda, K., Tanz, S.K., Chateigner-Boutin, A.L., Shikanai, T., Small, I., 2009. A
study of new Arabidopsis chloroplast RNA editing mutants reveals general features
of editing factors and their target sites. Plant Cell 21, 3686–3699.
Hansen, D.R., Dastidar, S.G., Cai, Z., Penaflor, C., Kuehl, J.V., Boore, J.L., Jansen, R.K., 2007.
Phylogenetic and evolutionary implications of complete chloroplast genome
sequences of four early-diverging angiosperms: Buxus (Buxaceae), Chloranthus
(Chloranthaceae), Dioscorea (Dioscoreaceae), and Illicium (Schisandraceae). Mol.
Phylogenet. Evol. 45, 547–563.
Hayes, M.L., Hanson, M.R., 2007. Identification of a sequence motif critical for editing of
a tobacco chloroplast transcript. RNA 13, 281–288.
Heinze, B., 2007. A database of PCR primers for the chloroplast genomes of higher
plants. Plant Meth. 3, 4.
Hipkins, V.D., Marshall, K.A., Neale, D.B., Rottmann, W.H., Strauss, S.H., 1995. A mutation
hotspot in the chloroplast genome of a conifer (Douglas-fir: Pseudotsuga) is caused
by variability in the number of direct repeats derived from a partially duplicated
tRNA gene. Curr. Genet. 27, 572–579.
Hiratsuka, J., Shimada, H., Whittier, R., Ishibashi, T., Sakamoto, M., Mori, M., Kondo, C.,
Honji, Y., Sun, C.R., Meng, B.Y., et al., 1989. The complete sequence of the rice (Oryza
sativa) chloroplast genome: intermolecular recombination between distinct tRNA
genes accounts for a major plastid DNA inversion during the evolution of the
cereals. Mol. Gen. Genet. 217, 185–194.
Hirose, T., Sugiura, M., 2001. Involvement of a site-specific trans-acting factor and a
common RNA-binding protein in the editing of chloroplast mRNAs: development of
a chloroplast in vitro RNA editing system. EMBO J. 20, 1144–1152.
Hirose, T., Kusumegi, T., Tsudzuki, T., Sugiura, M., 1999. RNA editing sites in tobacco
chloroplast transcripts: editing as a possible regulator of chloroplast RNA
polymerase activity. Mol. Gen. Genet. 262, 462–467.
Hoch, B., Maier, R.M., Appel, K., Igloi, G.L., Kossel, H., 1991. Editing of a chloroplast
mRNA by creation of an initiation codon. Nature 353, 178–180.
Huse, S.M., Huber, J.A., Morrison, H.G., Sogin, M.L., Welch, D.M., 2007. Accuracy and
quality of massively parallel DNA pyrosequencing. Genome Biol. 8, R143.
Jansen, R.K., Raubeson, L.A., Boore, J.L., dePamphilis, C.W., Chumley, T.W., Haberle, R.C.,
Wyman, S.K., Alverson, A.J., Peery, R., Herman, S.J., Fourcade, H.M., Kuehl, J.V.,
McNeal, J.R., Leebens-Mack, J., Cui, L., 2005. Methods for obtaining and analyzing
whole chloroplast genome sequences. Meth. Enzymol. 395, 348–384.
Jansen, R.K., Kaittanis, C., Saski, C., Lee, S.B., Tomkins, J., Alverson, A.J., Daniell, H., 2006.
Phylogenetic analyses of Vitis (Vitaceae) based on complete chloroplast genome
sequences: effects of taxon sampling and phylogenetic methods on resolving
relationships among rosids. BMC Evol. Biol. 6, 32.
Kahlau, S., Aspinall, S., Gray, J.C., Bock, R., 2006. Sequence of the tomato chloroplast DNA and
evolutionary comparison of solanaceous plastid genomes. J. Mol. Evol. 63, 194–207.
Karcher, D., Bock, R., 2009. Identification of the chloroplast adenosine-to-inosine tRNA
editing enzyme. RNA 15, 1251–1257.
Kato, T., Kaneko, T., Sato, S., Nakamura, Y., Tabata, S., 2000. Complete structure of the
chloroplast genome of a legume, Lotus japonicus. DNA Res. 7, 323–330.
Knoop, V., 2010. When you can't trust the DNA: RNA editing changes transcript
sequences. Cell. Mol. Life Sci. 68, 567–586.
Ko, J.H., Chow, K.S., Han, K.H., 2003. Transcriptome analysis reveals novel features of the
molecular events occurring in the laticifers of Hevea brasiliensis (para rubber tree).
Plant Mol. Biol. 53, 479–492.
Kotera, E., Tasaka, M., Shikanai, T., 2005. A pentatricopeptide repeat protein is essential
for RNA editing in chloroplasts. Nature 433, 326–330.
Kuroda, H., Suzuki, H., Kusumegi, T., Hirose, T., Yukawa, Y., Sugiura, M., 2007.
Translation of psbC mRNAs starts from the downstream GUG, not the upstream
AUG, and requires the extended Shine–Dalgarno sequence in tobacco chloroplasts.
Plant Cell Physiol. 48, 1374–1378.
Kurtz, S., Schleiermacher, C., 1999. REPuter: fast computation of maximal repeats in
complete genomes. Bioinformatics 15, 426–427.
Lee, S.B., Kaittanis, C., Jansen, R.K., Hostetler, J.B., Tallon, L.J., Town, C.D., Daniell, H., 2006.
The complete chloroplast genome sequence of Gossypium hirsutum: organization
and phylogenetic relationships to other angiosperms. BMC Genomics 7, 61.
Leebens-Mack, J., Raubeson, L.A., Cui, L., Kuehl, J.V., Fourcade, M.H., Chumley, T.W.,
Boore, J.L., Jansen, R.K., dePamphilis, C.W., 2005. Identifying the basal angiosperm
node in chloroplast genome phylogenies: sampling one's way out of the Felsenstein
zone. Mol. Biol. Evol. 22, 1948–1963.
Lenz, H., Rudinger, M., Volkmar, U., Fischer, S., Herres, S., Grewe, F., Knoop, V., 2009.
Introducing the plant RNA editing prediction and analysis computer tool PREPACT
and an update on RNA editing site nomenclature. Curr. Genet. 56, 189–201.
Lichtenthaler, H., 1999. The 1-deoxy-D-xylulose-5-phosphate pathway of isoprenoid
biosynthesis in plants. Annu. Rev. Plant Physiol. Plant Mol. Biol. 50, 47–65.
Lichtenthaler, H.K., Schwender, J., Disch, A., Rohmer, M., 1997. Biosynthesis of
isoprenoids in higher plant chloroplasts proceeds via a mevalonate-independent
pathway. FEBS Lett. 400, 271–274.
Lidholm, J., Szmidt, A., Gustafsson, P., 1991. Duplication of the psbA gene in the
chloroplast genome of two Pinus species. Mol. Gen. Genet. 226, 345–352.
Logacheva, M.D., Samigullin, T.H., Dhingra, A., Penin, A.A., 2008. Comparative chloroplast
genomics and phylogenetics of Fagopyrum esculentum ssp. ancestrale – a wild ancestor
of cultivated buckwheat. BMC Plant Biol. 8, 59.
Lurin, C., Andres, C., Aubourg, S., Bellaoui, M., Bitton, F., Bruyere, C., Caboche, M., Debast,
C., Gualberto, J., Hoffmann, B., Lecharny, A., Le Ret, M., Martin-Magniette, M.L.,
Mireau, H., Peeters, N., Renou, J.P., Szurek, B., Taconnat, L., Small, I., 2004. Genome-
wide analysis of Arabidopsis pentatricopeptide repeat proteins reveals their
essential role in organelle biogenesis. Plant Cell 16, 2089–2103.
Maier, R.M., Neckermann, K., Igloi, G.L., Kossel, H., 1995. Complete sequence of the
maize chloroplast genome: gene content, hotspots of divergence and fine tuning of
genetic information by transcript editing. J. Mol. Biol. 251, 614–628.
Maliga, P., 2002. Engineering the plastid genome of higher plants. Curr. Opin. Plant Biol.
5, 164–172.
Maliga, P., 2004. Plastid transformation in higher plants. Annu. Rev. Plant Biol. 55,
289–313.
Margulies, M., Egholm, M., Altman, W.E., Attiya, S., Bader, J.S., Bemben, L.A., Berka,
J., Braverman, M.S., Chen, Y.J., Chen, Z., Dewell, S.B., Du, L., Fierro, J.M., Gomes, X.V.,
Godwin, B.C., He, W., Helgesen, S., Ho, C.H., Irzyk, G.P., Jando, S.C., Alenquer, M.L., Jarvie,
T.P., Jirage, K.B., Kim, J.B., Knight, J.R., Lanza, J.R., Leamon, J.H., Lefkowitz, S.M., Lei, M., Li,
J., Lohman, K.L., Lu, H., Makhijani, V.B., McDade, K.E., McKenna, M.P., Myers, E.W.,
Nickerson, E., Nobile, J.R., Plant, R., Puc, B.P., Ronan, M.T., Roth, G.T., Sarkis, G.J., Simons,
J.F., Simpson, J.W., Srinivasan, M., Tartaro, K.R., Tomasz, A., Vogt, K.A., Volkmer, G.A.,
Wang, S.H., Wang, Y., Weiner, M.P., Yu, P., Begley, R.F., Rothberg, J.M., 2005. Genome
sequencing in microfabricated high-density picolitre reactors. Nature 437, 376–380.
Mathews, S., Donoghue, M.J., 1999. The root of angiosperm phylogeny inferred from
duplicate phytochrome genes. Science 286, 947–950.
McCauley, D.E., 1992. The use of chloroplast DNA polymorphism in studies of gene flow
in plants. Trends Ecol. Evol. 10, 198–202.
McGarvey, D.J., Croteau, R., 1995. Terpenoid metabolism. Plant Cell 7, 1015–1026.
Millen, R.S., Olmstead, R.G., Adams, K.L., Palmer, J.D., Lao, N.T., Heggie, L., Kavanagh, T.A.,
Hibberd, J.M., Gray, J.C., Morden, C.W., Calie, P.J., Jermiin, L.S., Wolfe, K.H., 2001.
Many parallel losses of infA from chloroplast DNA during angiosperm evolution
with multiple independent transfers to the nucleus. Plant Cell 13, 645–658.
Miyamoto, T., Obokata, J., Sugiura, M., 2002. Recognition of RNA editing sites is directed
by unique proteins in chloroplasts: biochemical identification of cis-acting
elements and trans-acting factors involved in RNA editing in tobacco and pea
chloroplasts. Mol. Cell. Biol. 22, 6726–6734.
Moon, E., Kao, T.H., Wu, R., 1987. Rice chloroplast DNA molecules are heterogeneous as
revealed by DNA sequences of a cluster of genes. Nucleic Acids Res. 15, 611–630.
Moore, M.J., Dhingra, A., Soltis, P.S., Shaw, R., Farmerie, W.G., Folta, K.M., Soltis, D.E.,
2006. Rapid and accurate pyrosequencing of angiosperm plastid genomes. BMC
Plant Biol. 6, 17.
Neale, D.B., Saghai-Maroof, M.A., Allard, R.W., Zhang, Q., Jorgensen, R., 1988. Chloroplast
DNA diversity in populations of wild and cultivated barley. Genetics 120, 1105–1110.
Nickrent, D.L., Blarer, A., Qiu, Y.-L., Soltis, D.E., Soltis, P.S., Zanis, M., 2002. Molecular data
place Hydnoraceae with Aristolochiaceae. Am. J. Bot. 89, 1809–1817.
Ohyama, K., Fukuzawa, H., Kohchi, T., Shirai, H., Sano, T., Sano, S., Umesono, K., Shiki, Y.,
Takeuchi, M., Chang, Z., Aota, S., Inokuchi, H., Ozeki, H., 1986. Chloroplast gene
organization deduced from complete sequence of liverwort Marchantia polymorpha
chloroplast DNA. Nature 322, 572–574.
Page, R.D., 1996. TreeView: an application to display phylogenetic trees on personal
computers. Comput. Appl. Biosci. 12, 357–358.
Palmer, J.D., Osorio, B., Aldrich, J., Thompson, W.F., 1987. Chloroplast DNA evolution
among legumes: loss of a large inverted repeat occurred prior to other sequence
rearrangements. Curr. Genet. 11, 275–286.
Petit, R.J., Aguinagalde, I., de Beaulieu, J.L., Bittkau, C., Brewer, S., Cheddadi, R., Ennos, R.,
Fineschi, S., Grivet, D., Lascoux, M., Mohanty, A., Muller-Starck, G.M., Demesure-
Musch, B., Palme, A., Martin, J.P., Rendell, S., Vendramin, G.G., 2003. Glacial refugia:
hotspots but not melting pots of genetic diversity. Science 300, 1563–1565.
Petit, R.J., Duminil, J., Fineschi, S., Hampe, A., Salvini, D., Vendramin, G.G., 2005.
Comparative organization of chloroplast, mitochondrial and nuclear diversity in
plant populations. Mol. Ecol. 14, 689–701.
Pfitzinger, H., Weil, J.H., Pillay, D.T., Guillemaut, P., 1990. Codon recognition
mechanisms in plant chloroplasts. Plant Mol. Biol. 14, 805–814.
Provan, J., Powell, W., Hollingsworth, P.M., 2001. Chloroplast microsatellites: new tools
for studies in plant ecology and evolution. Trends Ecol. Evol. 16, 142–147.
Raubeson, L.A., Peery, R., Chumley, T.W., Dziubek, C., Fourcade, H.M., Boore, J.L., Jansen, R.K.,
2007. Comparative chloroplast genomics: analyses including new sequences from the
angiosperms Nuphar advena and Ranunculus macranthus. BMC Genomics 8, 174.
Rochaix, J.D., Kuchka, M., Mayfield, S., Schirmer-Rahire, M., Girard-Bascou, J., Bennoun, P.,
1989. Nuclear and chloroplast mutations affect the synthesis or stability of the
chloroplast psbC gene product in Chlamydomonas reinhardtii. EMBO J. 8, 1013–1021.
Ruhlman, T., Lee, S.B., Jansen, R.K., Hostetler, J.B., Tallon, L.J., Town, C.D., Daniell, H.,
2006. Complete plastid genome sequence of Daucus carota: implications for
biotechnology and phylogeny of angiosperms. BMC Genomics 7.
Sasaki, T., Yukawa, Y., Miyamoto, T., Obokata, J., Sugiura, M., 2003. Identification of RNA
editing sites in chloroplast transcripts from the maternal and paternal progenitors
of tobacco (Nicotiana tabacum): comparative analysis shows the involvement of
distinct trans-factors for ndhB editing. Mol. Biol. Evol. 20, 1028–1035.
Sato, S., Nakamura, Y., Kaneko, T., Asamizu, E., Tabata, S., 1999. Complete structure of
the chloroplast genome of Arabidopsis thaliana. DNA Res. 6, 283–290.
Schmitz-Linneweber, C., Maier, R.M., Alcaraz, J.P., Cottet, A., Herrmann, R.G., Mache, R.,
2001. The plastid chromosome of spinach (Spinacia oleracea): complete nucleotide
sequence and gene organization. Plant Mol. Biol. 45, 307–315.
Schmitz-Linneweber, C., Regel, R., Du, T.G., Hupfer, H., Herrmann, R.G., Maier, R.M.,
2002. The plastid chromosome of Atropa belladonna and its comparison with that of
Nicotiana tabacum: the role of RNA editing in generating divergence in the process
of plant speciation. Mol. Biol. Evol. 19, 1602–1612.
Schuster, W., Hiesel, R., Wissinger, B., Brennicke, A., 1990. RNA editing in the
cytochrome b locus of the higher plant Oenothera berteriana includes a U-to-C
transition. Mol. Cell. Biol. 10, 2428–2431.
Shimada, H., Sugiura, M., 1991. Fine structural features of the chloroplast genome:
comparison of the sequenced chloroplast genomes. Nucleic Acids Res. 19, 983–995.
Stamatakis, A., 2006. RAxML-VI-HPC: maximum likelihood-based phylogenetic
analyses with thousands of taxa and mixed models. Bioinformatics 22, 2688–2690.
Steane, D.A., 2005. Complete nucleotide sequence of the chloroplast genome from the
Tasmanian blue gum, Eucalyptus globulus (Myrtaceae). DNA Res. 12, 215–220.
Steinhauser, S., Beckert, S., Capesius, I., Malek, O., Knoop, V., 1999. Plant mitochondrial
RNA editing. J. Mol. Evol. 48, 303–312.
Stern, D.B., Lonsdale, D.M., 1982. Mitochondrial and chloroplast genomes of maize have
a 12-kilobase DNA sequence in common. Nature 299, 698–702.
Stern, D.B., Palmer, J.D., 1984. Extensive and widespread homologies between mitochondrial
DNA and chloroplast DNA in plants. Proc. Natl Acad. Sci. USA 81, 1946–1950.
Swofford, D.L., 2002. PAUP: Phylogenetic Analysis Using Parsimony version 4.0b.
Sinauer Associates, Sunderland, Massachusetts.
Tangphatsornruang, S., Birch-Machin, I., Newell, C.A., Gray, J., 2010a. The effect of
different 3′ untranslated regions on the accumulation and stability of transcripts of
a gfp transgene in chloroplasts of transplastomic tobacco. Plant Mol. Biol.
doi:10.1007/s11103-010-9689-1 (Epub).
Tangphatsornruang, S., Sangsrakru, D., Chanprasert, J., Uthaipaisanwong, P., Yoocha, T.,
Jomchai, N., Tragoonrung, S., 2010b. The chloroplast genome sequence of
mungbean (Vigna radiata) determined by high-throughput pyrosequencing:
structural organization and phylogenetic relationships. DNA Res. 17, 11–22.
Tillich, M., Funk, H.T., Schmitz-Linneweber, C., Poltnigg, P., Sabater, B., Martin, M., Maier, R.M.,
2005. Editing of plastid RNA in Arabidopsis thaliana ecotypes. Plant J. 43, 708–715.
Vangerow, S., Teerkorn, T., Knoop, V., 1999. Phylogenetic information in the
mitochondrial nad5 gene of pteridophytes: RNA editing and intron sequences.
Plant Biol. 1, 235–243.
Wakasugi, T., Tsudzuki, J., Ito, S., Nakashima, K., Tsudzuki, T., Sugiura, M., 1994. Loss of
all ndh genes as determined by sequencing the entire chloroplast genome of the
black pine Pinus thunbergii. Proc. Natl Acad. Sci. USA 91, 9794–9798.
Wakasugi, T., Hirose, T., Horihata, M., Tsudzuki, T., Kossel, H., Sugiura, M., 1996. Creation
of a novel protein-coding region at the RNA level in black pine chloroplasts: the
pattern of RNA editing in the gymnosperm chloroplast is different from that in
angiosperms. Proc. Natl Acad. Sci. USA 93, 8766–8770.
Wojciechowski, M.F., Lavin, M., Sanderson, M.J., 2004. A phylogeny of legume
(Leguminosae) based on analysis of the plastid matK gene resolves many well-
supported subclades within the family. Am. J. Bot. 91, 1846–1862.
Wolfe, K.H., Morden, C.W., Palmer, J.D., 1991. Ins and outs of plastid genome evolution.
Curr. Opin. Genet. Dev. 1, 523–529.
Wyman, S.K., Jansen, R.K., Boore, J.L., 2004. Automatic annotation of organellar genomes
with DOGMA. Bioinformatics 20, 3252–3255.
Yoshinaga, K., Iinuma, H., Masuzawa, T., Uedal, K., 1996. Extensive RNA editing of U to C
in addition to C to U substitution in the rbcL transcripts of hornwort chloroplasts
and the origin of RNA editing in green plants. Nucleic Acids Res. 24, 1008–1014.
Yukawa, M., Tsudzuki, T., Sugiura, M., 2005. The 2005 version of the chloroplast DNA
sequence from tobacco (Nicotiana tabacum). Plant Mol. Biol. Rep. 23, 17.
Zanis, M.J., Soltis, D.E., Soltis, P.S., Mathews, S., Donoghue, M.J., 2002. The root of the
angiosperms revisited. Proc. Natl Acad. Sci. USA 99, 6848–6853.