
This article was downloaded by: [Arpa Samadder]

On: 21 February 2015, At: 07:34


Publisher: Taylor & Francis
Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House,
37-41 Mortimer Street, London W1T 3JH, UK

Journal of Biomolecular Structure and Dynamics


Publication details, including instructions for authors and subscription information:
http://www.tandfonline.com/loi/tbsd20

Eukaryotic tRNA paradox


Sanga Mitra (a), Arpa Samadder (a), Pijush Das (b), Smarajit Das (c) & Jayprokas Chakrabarti (a, d)

(a) Computational Biology Group, Indian Association for the Cultivation of Science, Jadavpur, Kolkata 700032, India
(b) Cancer Biology & Inflammatory Disorder Division, Indian Institute of Chemical Biology, Kolkata, India
(c) Department of Medical Biochemistry and Cell Biology, Institute of Biomedicine, University of Gothenburg, Gothenburg, Sweden
(d) Gyanxet, BF 286 Salt Lake, Kolkata, India


Published online: 18 Feb 2015.



To cite this article: Sanga Mitra, Arpa Samadder, Pijush Das, Smarajit Das & Jayprokas Chakrabarti (2015): Eukaryotic tRNA
paradox, Journal of Biomolecular Structure and Dynamics, DOI: 10.1080/07391102.2014.1003198
To link to this article: http://dx.doi.org/10.1080/07391102.2014.1003198




Journal of Biomolecular Structure and Dynamics, 2015


http://dx.doi.org/10.1080/07391102.2014.1003198

Eukaryotic tRNA paradox


Sanga Mitra (a), Arpa Samadder (a), Pijush Das (b), Smarajit Das (c) and Jayprokas Chakrabarti (a, d)*
(a) Computational Biology Group, Indian Association for the Cultivation of Science, Jadavpur, Kolkata 700032, India; (b) Cancer Biology & Inflammatory Disorder Division, Indian Institute of Chemical Biology, Kolkata, India; (c) Department of Medical Biochemistry and Cell Biology, Institute of Biomedicine, University of Gothenburg, Gothenburg, Sweden; (d) Gyanxet, BF 286 Salt Lake, Kolkata, India

Communicated by Ramaswamy H. Sarma


(Received 23 October 2014; accepted 25 December 2014)


tRNAs are widely believed to segregate into two classes, I and II. Computational analysis of eukaryotic tRNA entries in
the Genomic tRNA Database, however, reveals the new, albeit paradoxical, presence of more than a thousand class-I tRNAs
with uncharacteristically long variable arms (V-arms), like those in class-II. Out of 62,202 tRNAs from 69 eukaryotes, as many as
1431 class-I tRNAs have these novel extended V-arms, and we refer to them as paradoxical tRNAs (pxtRNAs). A great
majority of these 1431 pxtRNA genes are located in intergenic regions, about 18% are embedded in introns of genes or
ESTs, and just one is in a 3′ UTR. A check on the conservation of 2D and 3D base pairs for each position of these pxtRNAs
reveals a few variations, but they seem to have almost all the already known identity and conserved elements of tRNA.
Analyses of the A-Box and B-Box of these pxtRNA genes in eukaryotes display salient deviations
from the previously annotated conserved features of the standard promoters, whereas the transcription termination signals
are just canonical and non-canonical runs of thymidine, similar to the ones in standard tRNA genes. There is just one
such pxtRNAProAGG gene in the entire human genome, and the availability of data allows epigenetic analysis of this
human pxtRNAProAGG in three different cell lines, H1 hESC, K562, and NHEK, to assess the level of its expression.
Histone acetylation and methylation of this lone pxtRNAProAGG gene in human differ from those of the nine standard
human tRNAProAGG genes. The V-arm nucleotide sequences and their secondary structures in pxtRNA differ from those
of class-II tRNA. Considering these differences, hypotheses of alternative splicing, non-canonical intron and gene
transfer are examined to partially improve the Cove scores of these pxtRNAs and to critically question their antecedence and
novelty.
Keywords: epigenetic analysis; A-Box; B-Box; conserved features; transcription termination signal; non-canonical intron

1. Introduction
tRNAs are primarily molecules needed for translation of
mRNAs into proteins (Novoa & de Pouplana, 2012;
Woese, 1970). Over the years, many other usages of
tRNAs have come to light (Raab et al., 2012). tRNAs are
believed to be separable into two distinct classes, denoted
I and II. In class-I tRNAs, the V-arm is up to 6 nucleotides long. In class-II, the V-arm is longer, extending
up to 23 nucleotides. That aside, there are other differences in the D-arm; for example, base positions 13–22 in
the D-stem remain unpaired in class-II (Marck & Grosjean,
2002). tRNAs appear universally in all three domains of
life, albeit with minor differences (Fujishima & Kanai,
2014). They are also known to be present, whole or in part,
in virus genomes (Das, Mitra, Sahoo, & Chakrabarti,
2014). In recent years, there has been increasing interest in eukaryotic tRNAs for a number of reasons (Sharp,
Schaack, Cooley, Burke, & Söll, 1985). First, the numbers of eukaryotic tRNA genes are inexplicably high
(Goodenbour & Pan, 2006). The latest version of the
Genomic tRNA Database (http://gtrnadb.ucsc.edu/) has
62,202 tRNA entries for just 69 eukaryotes (Chan &
Lowe, 2009). The presence of such a huge number of tRNA
genes has begun to refocus attention on
eukaryotic tRNAs (Bermudez-Santana et al., 2010). Second, tRNAs are increasingly coming under suspicion as
causes of human disease (Abbott, Francklyn, &
Robey-Bond, 2014). tRNA modifications are highly
correlated with human diseases (Torres, Batlle, &
de Pouplana, 2014). Third, recent transcriptomics data
suggest that in higher eukaryotes, very high percentages
of genes are actually transcribed, though what this may
mean for the tRNA genes remains to be seen (Pertea,
2012). Last, but not least, tRNA biogenesis and its
quality control are functionally coupled with cell cycle
progression (Ghavidel et al., 2007; Kramer & Hopper,
2013; Weinert & Hopper, 2007).

*Corresponding author. Email: j.chakrabarti@gyanxet.com
© 2015 Taylor & Francis

tRNAs of eukaryotes were revisited to map the
diversity of tRNA genes; the non-standard ones were the
foci (Sugahara, Fujishima, Morita, Tomita, & Kanai,
2009). Subtle but significant differences underlined the
tRNAs of non-vertebrates (without skeleton) and
vertebrates (with skeleton). From the recent version of
the Genomic tRNA Database, a large number of tRNAs
were scanned that maintained their gross structure, yet
with significant differences at the sequence level. While
studying the tRNA secondary motifs, several assortments
were found. Although most followed the well-known
secondary structures (RajBhandary & Köhrer, 2006),
there were many with significant deviations (Maruyama,
Sugahara, Kanai, & Nozaki, 2010; Randau & Söll, 2008;
Soma, 2014; Soma et al., 2013). Some of these deviant
structures correlated to those of organellar tRNAs
(Frazer-Abel & Hagerman, 2004), but the others stood
apart from motifs known so far. Remarkably, among
such deviated tRNA sequences in eukaryotes were the
more than a thousand class-I tRNA genes with uncharacteristic long V-arms. It was well known that long V-arms
belonged to class-II tRNAs of leucine and serine in
eukaryotes (Marck & Grosjean, 2002). In this paper,
long V-arms are reported for tRNAs corresponding to all
the class-I tRNAs of the remaining eighteen amino acids.
These paradoxical tRNAs are denoted as pxtRNAs here.
pxtRNAs appeared to have the standard cloverleaf structure of tRNAs,
with deviations only in the V-arms, and were innocuously listed in
the Genomic tRNA Database. The conserved sequence
features of pxtRNA genes vis-à-vis their normal counterparts were extensively studied. To check for their gene
expression, the promoter organizations (Galli, Hofstetter,
& Birnstiel, 1981), namely the A-Box and the B-Box of
pxtRNA genes, were examined and found to have salient differences
from the previously annotated conserved promoter
regions of standard tRNA genes. Further, we studied
pxtRNA gene regulation by observing the transcription
termination signals. Inspection of the signal
sequences revealed the existence of standard canonical
and non-canonical terminators, reinforcing that
RNA polymerase III probably transcribed the pxtRNA genes.
There were eukaryotes rich in pxtRNAs, one
example being Felis catus. The human genome had just
one pxtRNAProAGG gene, along with nine standard
tRNAProAGG genes. Because of this negligible presence,
it was suspected that the pxtRNA biogenesis in human
could differ substantially from that in Felis catus. However, the availability of huge amount of human ChIP-Seq
data allowed epigenetic analysis based on polymerase
enzyme epigenetic markers on human pxtRNAProAGG
gene. A comprehensive study of histone modifications of the
pxtRNAProAGG gene across three different cell lines,
NHEK, K562, and H1 hESC, characterized the regulatory elements and their functional interactions. Interestingly, it was found that the lone human pxtRNAProAGG
gene did not show any modifications in the three different cell lines compared to the nine standard human
tRNAProAGG genes.
When the nucleotide sequences and the secondary
structures of the V-arms of the pxtRNAs were compared
with those of class-II, they turned out to be quite
different. The respective Cove scores were different as
well; the pxtRNAs generally had somewhat lower values. These differences persuasively pointed to different
origins for the pxtRNAs. However, the very significant
presence of pxtRNAs in many eukaryotes made them
important to investigate.
Hence, several hypotheses were advanced; among
them, one possibility was that these unusual V-arms were
instances of non-canonical introns (NCIs). It was noted
that in eukaryotes, NCIs were observed in a red alga,
Cyanidioschyzon merolae (Matsuzaki et al., 2004; Soma
et al., 2007, 2013); a green alga, Ostreococcus
lucimarinus (Maruyama et al., 2010); and nucleomorph,
Guillardia theta (Kawach et al., 2005). In red alga and
in nucleomorph, there were instances of NCIs in V-arms
(Maruyama et al., 2010; Yoshihisa, 2014). An example
was found in the euryarchaea, Natrinema pellirubrum,
where a paradoxically long V-arm of a class-I tRNA
gene could be alchemized with NCIs, but with a new
anticodon in the mature tRNA. If something similar happened to eukaryotic pxtRNAs, these might be examples
of NCIs in higher eukaryotes, perhaps requiring new biochemical pathways of intron elimination. Underlying it
all could be the more profound issue of the origin of
class-I and their divergence from class-II.
The framework of our observations and analyses is
presented in Figure 1.
The large number, 62,202, of eukaryotic tRNAs
called for the development of specialized tools to handle
their novel features efficiently. Optical character recognition
(OCR) was adapted and developed to handle the task
(Mori, Suen, & Yamamoto, 1992).

2. Materials and methods


2.1. Data
The fasta sequences of 62,202 tRNA genes, of non-vertebrates and vertebrates, encoded in the nuclear genomes,
cataloged in Genomic tRNA Database, were retrieved for
our study. Several in-house algorithms were developed
to handle this huge array of data. The already denoted
pseudo-tRNAs, numbering 1473, and the undetermined
ones, 185 in number, were excluded from this analysis.
Epigenetic analysis involved data mining from the UCSC
Genome Browser (https://genome.ucsc.edu/) (Kent et al.,
2009), using the Chromatin State Segmentation data
from ENCODE/Broad Histone data (Broad Institute and
the Bradley E. Bernstein lab at the Massachusetts General
Hospital/Harvard Medical School, and Manolis Kellis's
Computational Biology group at the Massachusetts Institute of Technology), and Histone Modifications by ChIP-seq
from ENCODE/Broad Institute (Broad Institute and
the Bernstein lab at the Massachusetts General
Hospital/Harvard Medical School). The February 2009
(GRCh37/hg19) human genome data were used for
pxtRNAProAGG gene analysis.

Figure 1. Framework of our study.
Notes: The framework presents the main analyses done in the manuscript to unravel the pxtRNA and the tools used in each step, as
well as the main results.


2.2. pxtRNA in silico scanning


We adapted and developed OCR technology to classify
different structures and features of tRNAs on the basis of
their characteristics, such as number of arms (including
V-arm), number of base pairs present in each arm, anticodon, and some other features with image processing.
To outline the quality control steps towards pxtRNAs, a
weight value, derived from perfect tRNA structures
(class-I and class-II), was used to measure the deviations
of tRNA images. The weight value of the tRNA of each
class was determined by considering the perfect tRNA
structures of that class. A perfect class-I tRNA structure
contained four bonded arms, a nucleotide at the 73rd position,
and a paired bond at position 13–22, whereas a perfect
class-II tRNA structure contained five arms, a nucleotide
at the 73rd position, and unpaired nucleotides at position 13–22.
Significant deviations from these perfect cases
pointed to new, albeit deviant, tRNAs.
Among this new set were the pxtRNAs.
Images of tRNA in jpeg format were downloaded
from tRNAscan-SE search server (Lowe & Eddy, 1997).
The images were made of the characters (i.e. nucleotides)
A, U, G, and C. In OCR, images of handwritten, typewritten, or printed text (usually captured by a scanner)
were converted into machine-editable text. Figure S1
summarizes the procedure of how tRNA images were
converted into text format with characters (nucleotides)
and its coordinates (with respect to the image). Each
nucleotide had its own coordinate location dependent
upon the frame of the image. Initially, all the tRNA
images with near-equal frame size were preprocessed.
Images were binarized, followed by cropping of nucleotide to extract the feature, and stored in a database for
future usage. Once past the training phase, we entered the
recognition phase where the digital image of each tRNA
was entered into the cropping phase, to crop each nucleotide. The features were extracted from that cropped
image. Next, the feature was looked for in the database.
Next was biological property identification from the
tRNA image. The biological properties were required to
distinguish each tRNA and classify the given image.
tRNA arms, bonding in the stems, and the anticodon were
recognized as distinguishing markers of tRNAs. It was
noted that class-I tRNAs had four arms, and class-II
had five. The base pair position 13–22 distinguished
class-I tRNAs from class-II. The detailed algorithms for
the training phase and identification are in supplementary
information Text S1.
The aim of the study was to decipher tRNAs that differed in structure from the standard ones. In distorted
tRNAs, there were two kinds of deviations compared to
the normal. In some cases, tRNAs showed a lot of distortions in the stems or the loops. These tRNAs were
considered 'Undet' in the tRNAscan-SE search server.
The other variety of tRNAs, numbering 1431, looked
normal at first glance, but had long V-arms even though
they had class-I anticodons. These were the pxtRNAs.
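
The selection rule described above (a class-I anticodon combined with a long V-arm) can be illustrated with a minimal sketch. It is not the authors' OCR-based pipeline: the function name, the input fields, and the use of the 6-nucleotide limit from the Introduction as a threshold are assumptions made here for illustration only.

```python
# Hypothetical helper, not the authors' in-house OCR workflow.
# Assumption: Leu and Ser are the only class-II amino acids in eukaryotes,
# and a class-I V-arm is at most 6 nucleotides long (both stated in the text).
CLASS_II_AMINO_ACIDS = {"Leu", "Ser"}
MAX_CLASS_I_V_ARM = 6  # nucleotides


def classify_trna(amino_acid: str, v_arm_length: int) -> str:
    """Return 'class-II', 'class-I', or 'pxtRNA' for one tRNA record."""
    if amino_acid in CLASS_II_AMINO_ACIDS:
        return "class-II"
    if v_arm_length > MAX_CLASS_I_V_ARM:
        # Class-I anticodon with an uncharacteristically long V-arm.
        return "pxtRNA"
    return "class-I"


if __name__ == "__main__":
    for record in (("Pro", 12), ("Leu", 15), ("Ala", 5)):
        print(record, classify_trna(*record))
```

A real workflow would also need the arm counts and the 13–22 pairing state that the text lists as structural markers; they are left out here to keep the rule visible.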
2.3. Conserved features and consensus region determination
In order to study the conserved features, we divided the
pxtRNA into two parts: the core pxtRNA region (all the
arms and loops except the V-arm) and the V-arm.
First, the conserved elements of the core pxtRNA
region were identified for 31 distinct positions of the secondary cloverleaf structure, mainly the 2D (base pairs
required to form the cloverleaf structure of tRNA) and 3D
(intramolecular base pairings required to form the L-shaped
tertiary structure of tRNA) pairs (Marck & Grosjean,
2002). These conserved pxtRNA identities were compared with the conserved elements of normal tRNAs.
The V-arms of all pxtRNAs were extracted and
arranged in both horizontal and vertical directions in an
Excel sheet, generating a 1431 × 1431 matrix. The
Smith–Waterman algorithm was applied to generate the
matching score of each pair. Once all the scores were generated, the average score for each V-arm was calculated. The V-arm with the best average score was taken as the consensus
V-arm of pxtRNAs. The conserved stretch of the V-arm of
the class-II tRNA was deduced in the same manner.
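
The all-against-all scoring step can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the authors' in-house code: the scoring parameters (match 2, mismatch -1, linear gap -1), the exclusion of self-comparisons from the average, and the toy sequences are all choices made here.

```python
# Minimal sketch of Smith-Waterman scoring plus "best average score" selection.
# Scoring values and the toy input are assumptions, not taken from the paper.

def smith_waterman(a: str, b: str, match: int = 2, mismatch: int = -1, gap: int = -1) -> int:
    """Best local alignment score between a and b (linear gap penalty)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def consensus_by_average_score(v_arms: list[str]) -> str:
    """Return the V-arm whose mean score against all other V-arms is highest."""
    if len(v_arms) < 2:
        raise ValueError("need at least two V-arm sequences")
    averages = []
    for i, x in enumerate(v_arms):
        others = [smith_waterman(x, y) for j, y in enumerate(v_arms) if j != i]
        averages.append(sum(others) / len(others))
    return v_arms[averages.index(max(averages))]


if __name__ == "__main__":
    toy_v_arms = ["UCAUGAUCUC", "UCAUGAUCUC", "GGGG"]  # made-up examples
    print(consensus_by_average_score(toy_v_arms))
```

For 1431 sequences this computes roughly two million pairwise alignments, which is feasible because V-arms are short.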
2.4. Promoter analysis
The nucleotides from N8 to N19 (N means nucleotide),
known as the A-Box, as well as the nucleotides from N52 to N62,
known as the B-Box, were extracted computationally for each
pxtRNA gene and stored in an Excel file. The software
WebLogo (Crooks, Hon, Chandonia, & Brenner, 2004)
was used to generate the consensus logos of the A-Box and
B-Box. Further filtering criteria, namely amino acid and short- or
long-variant A-Box, were applied to the raw data-set.
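
The extraction and the per-position counts that underlie a sequence logo can be sketched as below. The helper names are hypothetical, and the sketch assumes each gene sequence has already been mapped to the standard tRNA numbering, so that string index i corresponds to position N(i+1); with a long V-arm the raw sequence index would otherwise drift before the B-Box.

```python
from collections import Counter

A_BOX = (8, 19)   # N8..N19, 1-based inclusive, as given in the text
B_BOX = (52, 62)  # N52..N62, 1-based inclusive


def extract_box(numbered_gene: str, start: int, end: int) -> str:
    """Slice a sequence by 1-based inclusive positions.

    Assumption: the sequence is already aligned to standard tRNA numbering.
    """
    return numbered_gene[start - 1:end]


def position_frequencies(boxes: list[str]) -> list[dict[str, float]]:
    """Per-column nucleotide frequencies for equal-length box sequences."""
    if not boxes or len({len(b) for b in boxes}) != 1:
        raise ValueError("need at least one box, all of the same length")
    total = len(boxes)
    return [
        {nt: n / total for nt, n in Counter(column).items()}
        for column in zip(*boxes)
    ]


if __name__ == "__main__":
    toy_gene = "ACGTACGTACGTACGTACGTAC"  # made-up sequence
    print(extract_box(toy_gene, *A_BOX))
    print(position_frequencies(["TA", "TG"]))
```

WebLogo additionally converts such frequencies into information content per position; that step is not reproduced here.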
2.5. T stretch detection
The downstream regions (up to 500 nucleotides) of
pxtRNA genes were retrieved from NCBI. The presence
of canonical as well as non-canonical runs of thymidine
was searched along with their coordinates.
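
A search of this kind can be sketched with regular expressions. The function below is an assumption-laden illustration rather than the authors' procedure: it treats any run of four or more T as canonical, falls back to a subset of the non-canonical patterns named in Table 3 (V = A/G/C), and reports 0-based offsets from the start of the downstream region.

```python
import re

CANONICAL = re.compile(r"T{4,}")  # T4 or longer

# Subset of the non-canonical patterns named in Table 3 (V = A/G/C).
# List order only breaks ties between hits at the same offset.
NON_CANONICAL = [
    ("T2VT3", re.compile(r"TT[ACG]TTT")),
    ("T3VT2", re.compile(r"TTT[ACG]TT")),
    ("TVT3", re.compile(r"T[ACG]TTT")),
    ("T3VT", re.compile(r"TTT[ACG]T")),
    ("T2VT2", re.compile(r"TT[ACG]TT")),
]


def find_terminator(downstream: str, window: int = 500):
    """Return (kind, name, 0-based offset) of the first terminator, or None."""
    region = downstream[:window].upper()
    hit = CANONICAL.search(region)
    if hit:
        return ("canonical", f"T{len(hit.group())}", hit.start())
    best = None
    for name, pattern in NON_CANONICAL:
        hit = pattern.search(region)
        if hit and (best is None or hit.start() < best[2]):
            best = ("non-canonical", name, hit.start())
    return best


if __name__ == "__main__":
    for seq in ("ACGTTTTTGA", "ACGTTATTTGC", "ACGACG"):  # made-up sequences
        print(seq, find_terminator(seq))
```

The canonical search takes priority, so a non-canonical pattern is reported only when no T4 or longer run exists within the window.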
2.6. Loss and Gain of pxtRNA
The 69 eukaryotes were separated into non-vertebrates
and vertebrates, which were further segregated into 9 and 14
groups, respectively. Based on loss and gain of the V-arm, a
score was generated. Depending on the score, two matrices, one 9 × 9 for non-vertebrates, the other 14 × 14 for
vertebrates, were formed. Euclidean distances were calculated. Finally, a phylogenetic tree was constructed
[Tree = seqlinkage(Dist, Method, Names), where
Dist = Euclidean distance, Method = UPGMA/neighbor joining, Names = Non-vertebrates/Vertebrates]. The
distance calculation and tree generation were done on the
MATLAB (http://www.mathworks.in/products/matlab/)
platform.
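
The distance-then-cluster step can be sketched in plain Python as a stand-in for the MATLAB call. This is not the authors' script: the score vectors are invented, only the UPGMA (average-linkage) option is shown, and the tree is returned as nested tuples rather than a MATLAB phytree object.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length score vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def upgma(names, vectors):
    """Average-linkage (UPGMA) clustering; returns a nested-tuple tree."""
    # Each cluster is (subtree, list of member indices into `vectors`).
    clusters = [(name, [i]) for i, name in enumerate(names)]

    def cluster_distance(a, b):
        # Unweighted mean of all pairwise member distances.
        total = sum(euclidean(vectors[i], vectors[j]) for i in a[1] for j in b[1])
        return total / (len(a[1]) * len(b[1]))

    while len(clusters) > 1:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = cluster_distance(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        _, x, y = best
        merged = ((clusters[x][0], clusters[y][0]), clusters[x][1] + clusters[y][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)] + [merged]
    return clusters[0][0]


if __name__ == "__main__":
    # Hypothetical loss/gain score vectors for three groups.
    print(upgma(["A", "B", "C"], [[0, 0], [0, 1], [5, 5]]))
```

Neighbor joining, the other method named in the text, uses a different merging criterion and would need its own implementation.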

3. Results
In-silico scanning of tRNAs of 69 eukaryotes from the
Genomic tRNA Database revealed 1431 pxtRNAs, whose
V-arm lengths ranged from 9 to 26 nucleotides (Table 1).
The conserved elements of these pxtRNA sequences compared to the standard tRNAs were mapped and identified
to analyze the variations. Moreover, the promoter
sequences and transcription terminator sequences of these
pxtRNA genes were carefully analyzed. A comparative
analysis of chromatin state and the corresponding histone
modifications, such as methylation and acetylation, of the
lone human pxtRNAProAGG gene with the nine standard
human tRNAProAGG genes was performed. Finally, the
study of loss and gain of V-arms in class-I tRNAs provided the lineages where most of the pxtRNAs were present
in eukaryotes.

Table 1. Summary of pxtRNAs present in eukaryotes.

| Sl. No. | Species name | Amino acid (one-letter code) | V-arm length, max | V-arm length, min | Total no. of tRNA genes with V-arm |
|---|---|---|---|---|---|
| | Non-vertebrates | | | | |
| 1. | Strongylocentrotus purpuratus | G | 14 | 14 | 1 |
| 2. | Pristionchus pacificus | I, T | 15 | 12 | 10 |
| 3. | Physcomitrella patens | F | 15 | 15 | 1 |
| 4. | Sorghum bicolor | Y | 15 | 14 | 2 |
| 5. | Vitis vinifera | D, G, M, Y | 19 | 13 | 9 |
| 6. | Caenorhabditis brenneri | Y, V | 17 | 14 | 70 |
| 7. | Caenorhabditis briggsae | A, R, C, G, I, P | 19 | 14 | 29 |
| 8. | Caenorhabditis elegans | R, D, H, I, Q, G, K | 17 | 15 | 13 |
| 9. | Caenorhabditis japonica | V, G, I, F, P, T, W | 16 | 13 | 13 |
| 10. | Caenorhabditis remanei | V, R, N, Q, G, H, I, P | 17 | 10 | 23 |
| 11. | Oryza sativa | V, Y | 15 | 14 | 8 |
| 12. | Populus trichocarpa | C | 19 | 19 | 1 |
| 13. | Arabidopsis thaliana | K, Y | 15 | 13 | 5 |
| 14. | Glycine max | Y | 15 | 15 | 2 |
| 15. | Medicago truncatula | Y | 15 | 14 | 3 |
| | Vertebrates | | | | |
| 1. | Danio rerio | R, N, C, G, I, M, F, P, T, Y | 15 | 13 | 23 |
| 2. | Gasterosteus aculeatus | I, Q | 14 | 14 | 2 |
| 3. | Takifugu rubripes | H | 14 | 14 | 1 |
| 4. | Xenopus tropicalis | N, Y | 14 | 14 | 5 |
| 5. | Bos taurus | A, R, C, E, G, K, W | 24 | 9 | 17 |
| 6. | Ovis aries | K, E | 19 | 10 | 2 |
| 7. | Sus scrofa | N, D | 22 | 12 | 7 |
| 8. | Ailuropoda melanoleuca | A, R, C, Q, G, K, P, T | 17 | 12 | 105 |
| 9. | Canis familiaris | R, Q, E, G, K, P, W | 17 | 12 | 44 |
| 10. | Felis catus | R, N, C, Q, E, G, H, I, K, F, P, T, W | 26 | 11 | 1016 |
| 11. | Petromyzon marinus | R, N, P, V | 15 | 13 | 5 |
| 12. | Ornithorhynchus anatinus | C, F, P, V | 14 | 12 | 4 |
| 13. | Loxodonta africana | G, T | 17 | 14 | 4 |
| 14. | Heterocephalus glaber | A | 15 | 12 | 2 |
| 15. | Mus musculus | A | 16 | 16 | 1 |
| 16. | Homo sapiens | P | 12 | 12 | 1 |
| 17. | Pan troglodytes | P, V | 15 | 12 | 2 |
| 18. | Pongo pygmaeus abelii | P | 12 | 12 | 1 |
| | Total | | 26 | 9 | 1431 |

3.1. Conserved features and consensus region comparison
The pxtRNAs seemed to have almost all conserved features of normal tRNAs, except in a few cases. In the
case of pxtRNAAla, the trio of conserved base
pairs in the acceptor stem was replaced by G1-C72,
C2-G71, and C3-G70. Noteworthy was position 20
in the D-loop, where the highly conserved A was completely
absent for pxtRNAArg; instead, any of the three bases, U/
G/C, was found. Moreover, for pxtRNAArg and
pxtRNATrp, the 3D base pairs were G15-C/U48 in place
of the A15-U48 of tRNAArg and tRNATrp. Changes were
observed at positions 3–70 and 5–68 for pxtRNAGln,
with C-G and G-C, respectively. The uniqueness of
pxtRNAGlu at the first base pair of the A-stem was nullified
by G1-C72. The exclusive features of tRNAPro, that is,
G10-U25 and C73, were not reflected in pxtRNAPro;
instead, G10-C25 and U73 were found most often.
The exceptional C1-G72 of tRNATyr was not observed
for pxtRNATyr; instead, G1-C72 dominated. U60 was
dominant in pxtRNAVal. Overall, 7–66 in the A-stem had a
tendency to remain unpaired with G:G. Similarly, 13–22
of the D-stem had mismatched base pairs in the decreasing
order C:A > G:A > A:A, except in some cases where
paired C-G and G-U were found. The 3D base pairs
remained more highly conserved in standard tRNAs than
in pxtRNAs.
For the V-arms, from N44 to N48, of all pxtRNAs,
the consensus sequence shared by them was determined.
Using the Smith–Waterman algorithm (Liao, Yin, & Cheng,
2004), the stretch UCAUGAUCUCACGGCUC
appeared common to all pxtRNAs. To check whether this was
a replica of the V-arm of class-II tRNAs, we extracted
the V-arms of tRNALeu and tRNASer and found their conserved
stretch to be CUGGGGUCUCCCCGC. The differences between the
two stretches were evident. Further, to support our
observation, we deduced their secondary structures via
RNAfold (Hofacker et al., 1994) and checked their folding through the tRNAscan-SE search server, using their corresponding whole tRNAs. The V-arms of standard class-II
tRNAs formed well-defined hairpin loop structures,
but the ones for the pxtRNAs had weaker energies
(Figure 2). When visualized through the tRNAscan-SE
search server, the V-arm of the standard tRNA formed a
G-C-rich stem of three base pairs, whereas that of the
pxtRNA formed a relaxed stem of two base pairs.
The striking presence of the nucleotides UCUC in the
loops of the secondary structures of the V-arms of both pxtRNAs and
class-II tRNAs was noticed. Thus, the conserved
sequence differences, and the secondary structure divergence, indicated that pxtRNAs and standard class-II
tRNAs perhaps had different origins.
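
The two consensus stretches reported above can be compared with a few lines of code. The sketch below only computes simple descriptive statistics (length, G+C fraction, and the offset of the shared UCUC motif); it does not reproduce the RNAfold energy calculation, and the function name is a hypothetical helper.

```python
# Consensus stretches as reported in the text.
PXT_CONSENSUS = "UCAUGAUCUCACGGCUC"     # pxtRNA V-arm
CLASS_II_CONSENSUS = "CUGGGGUCUCCCCGC"  # tRNALeu/tRNASer V-arm


def gc_fraction(seq: str) -> float:
    """Fraction of G and C nucleotides in a sequence."""
    return sum(nt in "GC" for nt in seq) / len(seq)


if __name__ == "__main__":
    for label, seq in (("pxtRNA", PXT_CONSENSUS), ("class-II", CLASS_II_CONSENSUS)):
        # str.find returns the 0-based offset of the first UCUC occurrence.
        print(label, len(seq), round(gc_fraction(seq), 2), seq.find("UCUC"))
```

The higher G+C fraction of the class-II stretch is in line with the G-C-rich stem described in the text, although base composition alone says nothing definite about folding energy.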
The secondary structures of all V-arms of pxtRNAs
confirmed the formation of hyphenated hairpin palindromes, but most of them had delicate structures. However, the formation of hairpin loops indicated that
compensatory mutations were helping to maintain the
stem structure of these pxtRNAs, or rather their V-arms,
which implied some functionality for them.
3.2. pxtRNA expression and regulation
To determine the expression and regulation of pxtRNA
genes, their promoter sequences, positions in the
genomes, and finally their transcription termination signals
were investigated.

Figure 2. V-arm comparison.
Notes: The consensus V-arm of all pxtRNAs is compared to that of class-II tRNAs. This figure clearly reveals the differences that
pxtRNA has at the level of the V-arm compared to class-II tRNA. The strong V-arm of class-II tRNA compared to the weak V-arm of
pxtRNA points to plausibly different functionalities.


3.2.1. Promoter detection


To gain further insight into the process of transcription,
and the expression of these pxtRNA genes, it was helpful to identify the transcriptional motifs, known as promoters, recognized by RNA polymerase III. In
eukaryotes, RNA polymerase III recognized and bound
to the type 2 promoters of tRNA genes, as well as the type 1
and type 3 promoters of 5S rRNA and U6 snRNA genes,
respectively, to initiate transcription (Good et al., 2013;
Hamada, Huang, Lowe, & Maraia, 2001).
It was known that two separated sequence stretches,
namely the A-Box and the B-Box, in the coding part of
the tRNA genes were essential and sufficient for the initiation of transcription. The A-Box, located from N8 to N19
at the 5′ end of the tRNA gene, was specific to the initiation of transcription; the B-Box, from N52 to N62 in
the TΨC arm, served as the other promoter (Naykova
et al., 2003). Additionally, it was reported that among
eukaryotes, the A-Box had two subclasses, the short variant, A1, 11 bases long, and the long variant, A2, of 12
bases; the B-Box was always 11 bases in length
(Rogozin et al., 2000). The A-Box was generally of the
short type in pxtRNA genes. A comparison between
pxtRNA genes and tRNA genes revealed the variations
in the A-Boxes and the B-Boxes (Figure 3).
Comparison of the short A-Box with the long
A-Box brought out an important characteristic of
pxtRNA promoters: C12 in the short variant was replaced by T12 in
the long. Other identifying characteristics of pxtRNA
genes included the relatively high occurrence of C13 in
the long variant, but G13 in the short. Corresponding to the short and
long A-Boxes, variability was found in the B-Box. For
pxtRNA genes, T54, A59, and T60 predominated in the B-Box
corresponding to the short-variant A-Box, but A54,
G59, and C60 of the B-Box correlated with the long-variant
A-Box. In the B-Box of pxtRNA genes, the consensus
sequence, GGTTCGANNCC (Galli, Hofstetter, &
Birnstiel, 1981), was generally maintained. Noteworthy
was the reversion (transition/transversion) of nucleotide
predominance in the B-Box of pxtRNA genes with the
change of A-Box variant (Figure 3). In both the A-Box and the B-Box promoters, transitions and transversions of nucleotides were noticed (Figure 3). Surprisingly,
the dual variant of the A-Box had a dichotomous presence
of transition and transversion at N13. The frequency of transitions was higher, being one for the A-Box and two for
the B-Box, whereas the number of transversions was low,
being just one in the B-Box and none at all for the
A-Box. Mutagenesis in pxtRNA promoters might impose
some functional change in their activities.
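
The transition/transversion distinction used in this comparison follows the standard definition: a purine-to-purine or pyrimidine-to-pyrimidine change is a transition, and any purine-pyrimidine exchange is a transversion. A minimal sketch, with a hypothetical function name and with the substitutions taken from the short-variant versus long-variant predominances listed above (N13 is left out because the text describes it as mixed):

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_type(ref: str, alt: str) -> str:
    """Classify a single-nucleotide DNA change as transition or transversion."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt or not {ref, alt} <= PURINES | PYRIMIDINES:
        raise ValueError("expected two different nucleotides from A, C, G, T")
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


if __name__ == "__main__":
    # Short-variant base -> long-variant base at the positions named in the text.
    changes = {"N12": ("C", "T"), "N54": ("T", "A"), "N59": ("A", "G"), "N60": ("T", "C")}
    for position, (short_nt, long_nt) in changes.items():
        print(position, substitution_type(short_nt, long_nt))
```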
These variations in promoters of the pxtRNA genes
provided the motive to check whether the A-Box had
any subtle differences across the eighteen amino acids.

Figure 3. A-Box and B-Box Promoter.
Notes: A comparison of the short-variant A-Box (11 bases in length) with the long variant (12 bases in length) is presented. The
figure shows that with changing length of the A-Box, the corresponding B-Box experiences alterations. The changes at the nucleotide
level are represented in the figure. In this figure, the A-Box stretch corresponds to the 8th to 19th nucleotide base positions of the tRNA gene;
the B-Box corresponds to the 52nd to 62nd nucleotide base positions of the tRNA gene.


In a previous report, the sequence heterogeneity of the
A-Box for all tRNA gene types in eukaryotes was illustrated (Naykova et al., 2003). Both variants of the
A-Box promoters of pxtRNA genes of 18 amino acids
were compared with the previously annotated A-Box of
standard tRNA genes. It was remarkable that positions 8,
14, 15, 18, and 19 of the A-Box of these pxtRNA genes
were more conserved and characterized by the consensus
Y.RR.RR (Y means T/C; R stands for G/A; the first '.'
between Y and R represents the nucleotides between N8
and N14, and the second '.' represents the nucleotides between
N15 and N18 of the A-Box promoters). In pxtRNA genes, it
was always T8, never C8. The only exception to this consensus motif was the pxtRNATrp gene, with A8/T8. A detailed
analysis of the promoters of pxtRNA genes vis-à-vis the
standard tRNA genes is shown in Table 2. A survey of
Table 2 demonstrated that the N12 and N13 positions of the
A-Box were affected more in pxtRNA genes. Extending
the inter-amino-acid comparison to the B-Box, T54 dominated but was sometimes replaced by A54 in
pxtRNAArg and pxtRNAGln genes. It was believed earlier that G57 was generally conserved in all eukaryotic
tRNA genes (Marck & Grosjean, 2002). Remarkably,
A57 was observed in the B-Box of pxtRNACys,
pxtRNAGlu and pxtRNATyr genes. N59 was the most variable nucleotide in the B-Box of pxtRNA genes, with all
possible nucleotide combinations. Moreover, pyrimidine
shuttling was observed at N60, where either T60 or C60
dominated. The rest of the positions remained almost conserved.
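
The Y.RR.RR consensus described above can be written as a regular expression over a 12-nucleotide A-Box (N8 to N19). The sketch below is an illustration only: it assumes the long-variant length, and the function name and the optional `require_t8` flag (reflecting the observation that pxtRNA genes carried T8 rather than C8) are hypothetical.

```python
import re

# N8 = Y (T/C); N9-N13 any; N14-N15 = RR (G/A); N16-N17 any; N18-N19 = RR.
# Assumption: a 12-nucleotide (long-variant) A-Box given as a DNA string.
A_BOX_CONSENSUS = re.compile(r"[TC][ACGT]{5}[AG]{2}[ACGT]{2}[AG]{2}")


def matches_a_box_consensus(a_box: str, require_t8: bool = False) -> bool:
    """Check an A-Box against Y.RR.RR; optionally insist on T at N8."""
    seq = a_box.upper()
    if A_BOX_CONSENSUS.fullmatch(seq) is None:
        return False
    return seq[0] == "T" if require_t8 else True


if __name__ == "__main__":
    for box in ("TAGCTCAGTGGA", "CAGCTCAGTGGA", "AAGCTCAGTGGA"):  # made-up boxes
        print(box, matches_a_box_consensus(box), matches_a_box_consensus(box, require_t8=True))
```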
3.2.2. Transcription termination signal location
To verify whether these pxtRNA genes were transcript
candidates of RNA polymerase III, the anticipated runs
of thymidine downstream had to be ascertained. The
study revealed the presence of both canonical and noncanonical stretches of T (Table 3). Among canonical
stretches, while T4 (i.e. TTTT) was the most abundant,
T5 and T6 were also observed. Noteworthy was the fact
that these canonical stretches were not always located in
the immediate vicinity, sometimes even 240 nucleotides
downstream. In cases where the canonical terminal signals were missing from immediate downstream, noncanonical T-stretches took their places (Gunnery, Ma, &
Mathews, 1999). Mostly disrupted T5 and T4 were
encountered (Orioli et al., 2011). Though most of the
pxtRNA genes had either canonical or non-canonical Tstretches, about 5% pxtRNA genes failed to have any
transcription termination signal within 500 nucleotides
downstream. In about 20% cases (out of all the pxtRNAs
for which transcription termination sequences were

A-box promoter comparison of pxtrnA genes with normal tRNA genes.


Nucleotide Position of the A-Box

Sl. No.

tRNA gene types


Ala

10

11

12

13

15

16

CG
TG
TA
CT
TG
CG

GC

GA

CT
GC

TA
GA

GT
CT

GT
GC

1.
2.
3.

tRNA
tRNAArg
tRNAAsn

4.
5.
6.

tRNAAsp
tRNACys
tRNAGln

GA
GA

GC

CT

7.

tRNAGlu

8.
9.
10.
11.
12.
13.

tRNAGly
tRNAHis
tRNAIle
tRNALys
tRNAMet
tRNAPhe

GC

GA

CT

14.
15.

tRNAPro
tRNAThr

16.
17.

tRNATrp
tRNATyr

GA

18.

tRNAVal

CA
CT

CT
AT

CT
TG
CG
CT
CG
TG

GT
AT

GC

GA

*A-Box begins form N8 (8th Nucleotide) of tRNA gene. In this Table, only those positions, which show variations, are represented.

Eukaryotic tRNA paradox


Table 3. Transcription termination signals of pxtRNA genes in the respective species (V = A/G/C).

Columns: Sl. No.; Species name; Canonical stretches (T6, T5, T4); Non-canonical stretches: Disrupted T5 (T2VT3, T3VT2, T3V2T2, TVT4) and Disrupted T4 (TVT3, T3VT, T2VT2, T2V2T2).

Non-vertebrates: 1. Strongylocentrotus purpuratus; 2. Pristionchus pacificus; 3. Physcomitrella patens; 4. Sorghum bicolor; 5. Vitis vinifera; 6. Caenorhabditis brenneri; 7. Caenorhabditis briggsae; 8. Caenorhabditis elegans; 9. Caenorhabditis japonica; 10. Caenorhabditis remanei; 11. Oryza sativa; 12. Populus trichocarpa; 13. Arabidopsis thaliana; 14. Glycine max; 15. Medicago truncatula.

Vertebrates: 1. Danio rerio; 2. Gasterosteus aculeatus; 3. Takifugu rubripes; 4. Xenopus tropicalis; 5. Bos taurus; 6. Ovis aries; 7. Sus scrofa; 8. Ailuropoda melanoleuca; 9. Canis familiaris; 10. Felis catus; 11. Petromyzon marinus; 12. Ornithorhynchus anatinus; 13. Loxodonta africana; 14. Heterocephalus glaber; 15. Mus musculus; 16. Homo sapiens; 17. Pan troglodytes; 18. Pongo pygmaeus abelii.

determined), T3 was encountered, but was followed downstream by either canonical or non-canonical T-stretches. There were
cases of two to three runs of T3 followed by a canonical
or non-canonical termination signal. In Table S1, we
note the canonical and non-canonical thymidine stretches
downstream of T3. In just about 1% of cases, no canonical or non-canonical termination signal followed T3; these occurred in Pan troglodytes, Pongo pygmaeus abelii, and
Felis catus. An earlier report showed that T3 made
RNA polymerase III leaky, synthesizing other novel ncRNAs (Orioli et al., 2011). However, T3 followed by
canonical or non-canonical thymidine stretches presumably kept the outcome of transcription unaltered. From
the nature of the transcription termination signals, it was
assumed that RNA polymerase III transcribed pxtRNA
genes. However, the presence of long 3′ trailer
sequences, and the absence of termination signals in a
few cases, raised the question of whether all the
pxtRNA genes were transcribed into pxtRNAs.
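The downstream search described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `find_signals`, the default 500-nucleotide window argument, and the tie-breaking rule (nearest hit first, longer motif preferred at the same offset) are assumptions made here for illustration. The motif names follow the column headings of Table 3, with V = A/G/C.

```python
import re

V = "[ACG]"  # V = A/G/C, as defined in the Table 3 caption

# (label, kind, regex). Non-canonical patterns are named after the Table 3 columns.
MOTIFS = [
    ("canonical", "canonical", r"T{4,}"),      # T4, T5, T6, ...
    ("T2VT3", "non-canonical", f"TT{V}TTT"),
    ("T3VT2", "non-canonical", f"TTT{V}TT"),
    ("T3V2T2", "non-canonical", f"TTT{V}{V}TT"),
    ("TVT4", "non-canonical", f"T{V}TTTT"),
    ("TVT3", "non-canonical", f"T{V}TTT"),
    ("T3VT", "non-canonical", f"TTT{V}T"),
    ("T2VT2", "non-canonical", f"TT{V}TT"),
    ("T2V2T2", "non-canonical", f"TT{V}{V}TT"),
]


def find_signals(downstream, window=500):
    """Return (offset, motif, kind) tuples found in the first `window` nucleotides.

    Offsets are 0-based from the first nucleotide after the gene.
    Hits are sorted by offset; at equal offsets the longer match comes first.
    (Hypothetical helper: the ordering rule is an assumption, not the paper's.)
    """
    seq = downstream[:window].upper()
    hits = []
    for label, kind, pattern in MOTIFS:
        for m in re.finditer(pattern, seq):
            # Canonical runs are labelled by their length, e.g. T5 for TTTTT.
            name = f"T{len(m.group())}" if kind == "canonical" else label
            hits.append((m.start(), -len(m.group()), name, kind))
    hits.sort()
    return [(start, name, kind) for start, _, name, kind in hits]
```

Calling `find_signals(seq)[0]` on a made-up downstream sequence gives the nearest candidate terminator; an empty list corresponds to the cases with no signal inside the window.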
To investigate further, the locations of pxtRNA genes
within the genomes were scanned. We used the NCBI
GenBank files and the UCSC Genome Browser to detect
their locations. It was reported earlier that a relaxed terminator, which allowed leaky termination through substantial RNA polymerase III read-through,
could, along with tRNA maturation, produce other small
RNAs (Barbezier et al., 2009; Orioli et al., 2011). It was therefore
necessary to check whether the pxtRNA genes were
dicistronic. A thorough investigation determined that
pxtRNA genes were mostly intergenic. In the rest of the
cases, pxtRNA genes were embedded within introns of
some coding genes or ESTs. In a single case in Felis
catus, pxtRNAArgTCG was located in the 3′ UTR of
the gene CCDC55. Previous studies established
that a mature tRNA could be derived from a hybrid
pre-mRNA/pre-tRNA complex (Segni, Gastaldi, &
Tocchini-Valentini, 2008). On this basis, it was presumed
that the embedded pxtRNA genes could be transcribed
successfully. The termination signals of embedded
pxtRNA genes were compared with those of intergenic
ones. Both had combinations of canonical and non-canonical T-stretches. In Figure 4(A), we
portray the probable locations of pxtRNA genes along
with their abundances. In Figure 4(B), we highlight the
locations of those pxtRNA genes that were either embedded or had other molecular entities upstream or downstream.
3.3. Epigenetic analysis of the lone pxtRNA gene in human
Eukaryotic tRNA genes are transcribed by RNA polymerase III (Haeusler & Engelke, 2006; Zhang, Lukoszek, Mueller-Roeber, & Ignatova, 2011). Recent studies demonstrated that the promoter regions of genes transcribed by either RNA polymerase III or RNA polymerase II shared similar epigenetic signatures (Barski et al., 2010; Ebersole et al., 2011; Oler et al., 2010). In human, there was only one pxtRNAProAGG gene among the ten reported tRNAProAGG genes; the remaining nine tRNAProAGG genes were standard class-I. Owing to the limited epigenetic data available in the UCSC Genome Browser, transcriptomics could be performed only for the single human pxtRNAProAGG gene and the nine human tRNAProAGG genes. To survey the global chromatin environment and changes in DNA-based biological functions, histone modifications (Kouzarides, 2007) of the standard tRNAProAGG genes and the pxtRNAProAGG gene in Homo sapiens were compared. Epigenetic modifications in the pxtRNAProAGG gene were analyzed in three different cell lines: embryonic stem cells (H1 hESC), erythrocytic leukemia cells (K562), and normal epidermal keratinocytes (NHEK). Different sets of chromatin states, corresponding to active and weak promoter, inactive and poised promoter, strong and weak enhancer, insulator, heterochromatin, and transcriptional transition and elongation, were used. These chromatin states portrayed a dynamic landscape of modifications across the different cell types. The modification data (H3K4me1, H3K4me2, H3K4me3, H3K9me3, H3K9ac, H3K27me3, H3K27ac, H3K36me3, H3K79me2, and H4K20me1) (Ernst et al., 2010) from the UCSC Genome Browser were screened and correlated with expression status across the three cell lines. From this analysis, it was observed that the distribution patterns of the chromatin states, as well as the modifications, varied from one tRNA gene to another and among the three cell lines (Figure 5(A)).

Figure 4. pxtRNA gene loci. (A) The overall locations of pxtRNA genes within the genome, represented graphically. The graph shows where pxtRNA genes occur and what lies upstream, downstream, and on the opposite strand. (B) The locations of pxtRNA genes that are either embedded or have other molecular entities upstream or downstream.
In the H1 hESC cell line, the pxtRNAProAGG gene was found to be in a heterochromatin state, in which chromatin is inaccessible for transcription. Consistently, no modifications were observed over this stretch of the pxtRNAProAGG gene. Taken together, we suspected that the pxtRNAProAGG gene in the H1 hESC cell line might be transcribed differently. Further, we checked the modification patterns in the other two cell lines, K562 and NHEK, and found that the results were substantially similar. An important observation in the K562 cell line was that the pxtRNAProAGG gene lay in a weak enhancer region. A weak enhancer can be distinguished from an active enhancer by H3K27ac modifications (Creyghton et al., 2013). Enrichment of H3K4me1 and H3K27ac modifications is a key determinant for identifying actively transcribed genes (Kimura, 2013). In this work, the pxtRNAProAGG gene at the weak enhancer region in the K562 cell line had no such modifications. Overall, the distinguishing feature of the pxtRNAProAGG gene in all three cell lines was the absence of modifications compared with the standard tRNAProAGG genes. This result indicated that, in the absence of any activating modification marker, the pxtRNAProAGG gene might remain transcriptionally different.
Next, extending our chromatin state analysis, modifications of the remaining nine normal tRNAProAGG genes in the three cell lines were explored. This analysis examined whether these modifications varied across the three cell lines for the nine normal tRNAProAGG genes and, if so, how far they were comparable to the case of the human pxtRNAProAGG gene. All active promoters of the nine standard human tRNAProAGG genes were generally characterized by enrichment of the H3K9ac modification in the K562 cell line. H3K9ac is one of the most important epigenetic markers associated with active promoter regions (Bernstein et al., 2005; Heintzman et al., 2007; Suzuki, Kondo, Wakayama, Cizdziel, & Hayashizaki, 2008; Du et al., 2013). However, it was noteworthy that in one such standard tRNAProAGG gene, on chromosome 17, this active marker was entirely absent in the K562 cell line. How significant, then, was this activating epigenetic marker in determining the active transcription of the gene? Indeed, it has been reported that in mouse, active transcription and gene expression were not correlated with the degree of H3K9ac (Nishida et al., 2006; Suzuki et al., 2008). At active promoter regions, all the standard tRNAProAGG genes showed di- and tri-methylation and the H3K9ac mark in the NHEK cell line, but no such signals were found for the lone human pxtRNAProAGG gene (Figure 5(B)). In accordance with this result, it might be assumed that in the NHEK cell line, this pxtRNAProAGG gene followed a different pathway. Finally, in the H1 hESC cell line, the standard tRNAProAGG genes showed significant levels of modifications, but the pxtRNAProAGG gene showed none and hence was probably different.
This sketch of the epigenetic landscape (Figure 5(B)) gave a slender hint that the pxtRNAProAGG gene in human probably followed a different course from the normal tRNAProAGG genes. This issue remained open for future investigation. Though the pxtRNAProAGG gene differed significantly in epigenetic status, its promoter sequences (A-Box: TGGTCTAGGGG and B-Box: GGTTCAAATCC) were the same as those of the standard tRNAProAGG genes of human.
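A promoter comparison of this kind can be sketched as follows. The two box sequences are the ones quoted in the text, and the A-Box start at nucleotide 8 follows the table footnote. Everything else is an assumption for illustration: the helper names, the simple linear numbering (which ignores any inserted positions of standard tRNA numbering), and the toy sequence, which is not a real tRNA gene.

```python
A_BOX = "TGGTCTAGGGG"   # A-Box quoted in the text for human tRNAProAGG genes
B_BOX = "GGTTCAAATCC"   # B-Box quoted in the text
A_BOX_START = 8         # 1-based; the A-Box begins at N8 of the tRNA gene


def a_box_differences(gene, reference=A_BOX, start=A_BOX_START):
    """List (position, reference_base, observed_base) where the A-Box differs.

    Positions are 1-based and counted linearly along the gene (an assumption).
    """
    window = gene[start - 1:start - 1 + len(reference)].upper()
    return [
        (start + i, ref, obs)
        for i, (ref, obs) in enumerate(zip(reference, window))
        if ref != obs
    ]


def b_box_position(gene, reference=B_BOX):
    """Return the 1-based start of the first exact B-Box match, or None."""
    idx = gene.upper().find(reference)
    return idx + 1 if idx >= 0 else None


# Toy sequence built around the two quoted motifs; NOT a real gene.
toy_gene = "ACGTACG" + A_BOX + "ACGT" * 5 + B_BOX + "ACGT"
# Hypothetical variant with a single substitution at position 10 (G -> C).
toy_variant = toy_gene[:9] + "C" + toy_gene[10:]
```

Here `a_box_differences(toy_gene)` returns an empty list, while the variant reports the single changed position, which mirrors the way only the varying A-Box positions are tabulated.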
As mentioned earlier, the total number of pxtRNAs in eukaryotes ran into more than a thousand. Of these, just one is in human. There were eukaryotes with significant numbers of pxtRNAs. Though epigenetic data were currently available only for human, pxtRNAs were far more numerous elsewhere. It is in these other eukaryotes that they might be more significant, simply because of the sheer number of pxtRNAs present.
3.4. Lineages of V-arm of the pxtRNAs
The epigenetic marks on the lone human pxtRNAProAGG gene might not be typical of pxtRNAs in other eukaryotes. There were other eukaryotes with larger numbers of pxtRNAs, and hence the loss and gain of the V-arm from the core region needed to be studied. The tree in Figure 6 shows the pattern of gain and loss of the V-arm in phylogenetic space, to understand how class-I tRNAs of some lineages evolved with a long V-arm. pxtRNAs were concentrated in particular groups of non-vertebrates and vertebrates. They were not found in Fungi, Haemosporida, Insecta, and Leishmania among non-vertebrates, or in Aves, Didelphimorphia, Lagomorpha, Perissodactyla, and Reptilia among vertebrates. The Figure 6 tree shows remarkable divergence compared with the already known phylogeny of eukaryotes. The most striking pattern revealed in the vertebrate group was the formation of a single clade by Carnivora, with the maximum amount of gain (~81.35%). Further, the dendrogram for the individual groups of Carnivora revealed an astonishingly large divergence of Felis catus from Canis familiaris. The large difference between them was most likely the result of further V-arm gain after they diverged from each other on the geological timeline. Among non-vertebrates, Rhabditida showed the maximum gain.

Figure 5. Epigenetic analysis of human pxtRNAProAGG gene. (A) Distribution patterns of the chromatin state segmentation of the pxtRNAProAGG gene, compared with the standard tRNAProAGG genes, in three different cell lines (H1 hESC, K562 and NHEK). The figure also shows that the pxtRNAProAGG gene remains in an inactive chromatin state, whereas the standard tRNAProAGG genes stay in an active chromatin state in the three cell lines. (B) Comparison of histone modifications between the pxtRNAProAGG gene and the standard tRNAProAGG genes. The differences in histone methylation and acetylation marks between the pxtRNAProAGG gene and the standard tRNAProAGG genes are clearly visible. The solid colored lines (green: H1 hESC, violet: K562, pink: NHEK) represent data from the ChIP-Seq Peak File, and the grayscale solid line represents data from the Signal File.

Figure 6. V-arm loss and gain.
Notes: The phylogenetic tree is constructed based on loss and gain of the V-arm from the core tRNA region for all lineages, considering non-vertebrates and vertebrates. The highest gains, in Rhabditida and Carnivora respectively, are further magnified in both trees for better illustration. The respective color in each pie chart represents the percentage of amino acid sharing by pxtRNAs corresponding to each group of non-vertebrates and vertebrates (details of the percentages are given in Table S2).

4. Discussion
Our manuscript describes the computational analysis of an unusual group of more than a thousand eukaryotic class-I tRNA genes with paradoxically long V-arms. Analysis of the conserved sequence features of these pxtRNAs revealed only minor differences from the standard ones, but there were observable distinguishing marks in the A-Box and B-Box promoter regions. Transcription termination signals were found to be an assortment of canonical and non-canonical runs of thymidine, either immediately following the gene or at a distance downstream. Since computationally analyzable data were not available for all eukaryotic lineages, further investigations were needed. The formation of universal cloverleaf tRNA structures in thousand-plus cases could not just be a random phenomenon. Recent transcriptomics investigations in eukaryotes had consistently pointed towards very high percentages of genomic regions being transcribed into RNAs (Costa, 2011; Hangauer, Vaughn, & McManus, 2013). Even the so-called pseudogenes with active promoters were known to be transcribed into competing endogenous RNAs (Pei et al., 2012; Tay, Rinn, & Pandolfi, 2014).
These tRNA genes were different since they belonged to class-I but had the long variable arm characteristic of class-II. The supporting observations on RNA polymerase III promoters and terminators, epigenetic signatures, and evolutionary conservation provided suggestive hypotheses for further experimental validation. One of the referees considered the necessary biochemical experiments important. These, however, were beyond the scope of this computational work.
In the human genome, out of several hundred tRNA genes, there was just one pxtRNA gene, namely the pxtRNAProAGG gene. As such, the minuscule presence of pxtRNA genes in human would not have drawn much attention, but for the fact that a huge amount of data on epigenetic marks and gene expression was currently available. For the first time, the epigenetic analysis of these tRNA genes was taken up, and its role in active transcription studied. It was known that RNA polymerase III and RNA polymerase II sometimes leave similar epigenetic signatures at the promoter regions of the genes transcribed by them (Barski et al., 2010; Oler et al., 2010). It was remarkable that in human, this lone pxtRNAProAGG gene lacked histone modifications in three different cell lines (normal, stem, and cancer), whereas the nine normal tRNAProAGG genes had significant modifications. As recorded in the UCSC Genome Browser, all standard tRNAProAGG genes had salient epigenetic marks, notably H3K4me2/3 and H3K9ac, in the three cell lines. These are histone marks associated with active genes (Du et al., 2013). However, in the three cell lines, the lone pxtRNAProAGG gene had no such epigenetic signature. The chromatin state segmentation study also showed that the pxtRNAProAGG gene either remained in a heterochromatin state in stem cells or was accompanied by a weak enhancer or promoter. Taken together, the role and the level of transcription of the lone pxtRNA gene in human appeared to be different from those of the standard ones. What, then, was the role of the pxtRNAProAGG gene in human? After all, recent transcriptomics data in human spoke in favor of exhaustive levels of transcription (Pertea, 2012). Hence, experimental data were required to decipher the expression level and the role, if any, of the lone human pxtRNA gene.
Since there was just a single pxtRNA gene in human, its role, if any, was perhaps circumscribed by this rather minuscule presence. Contrast that with the large number of pxtRNA genes in Felis catus of Carnivora. In the absence of epigenetic data on these, the promoter sequences of the pxtRNA genes were studied across all eukaryotes for further insight. Strikingly, some salient deviations of nucleotides were observed in the A-Box of pxtRNA genes. TFIIIC binds to the B-Box and A-Box and recruits RNA polymerase III (Schramm & Hernandez, 2002). Variations observed in the A-Box could be a significant pointer to the differences from the standard tRNA genes, though their implications remained an open issue to be assessed experimentally in future.
Along with the promoter, the other check was the transcription termination signals. Just as with the standard eukaryotic tRNA genes, pxtRNA genes had canonical and non-canonical T-stretches as transcription termination motifs. It was noteworthy that non-vertebrates mostly had canonical T-stretches, with just a few non-canonical termination signals. On the other hand, the probability of finding long runs of T immediately following the pxtRNA gene was significantly lower in vertebrates. Generally, the non-canonical termination sequences were followed downstream by canonical termination sequences. The maximum distance at which we could locate a transcription termination signal was about 240 nucleotides downstream of the pxtRNA gene.
Conservation and variation of the core regions and the V-arms of pxtRNAs vis-à-vis the standard tRNAs were studied. The two did not differ significantly. Many of the conserved features of tRNAs were shared by pxtRNAs as well. From the study of loss and gain of V-arms of class-I tRNAs, it was deduced that pxtRNAs were present only in certain lineages. V-arm gain per species fluctuated across eukaryotes from several V-arm gains down to only a few or nil per species, and showed no simple phylogenetic pattern, with V-arm-rich and V-arm-deprived species commingled across the eukaryotic tree. It was clear that to study the relevance and functions of pxtRNAs, we needed more data on regions with maximal gain. We further checked whether the number of V-arm gains had any direct correlation with the total tRNA pool of the species considered, but failed to observe any such association. The dearth of V-arm gain in many of the lineages made us curious about whether the gains were recent, or whether V-arms had actually been lost recently in many genera, leaving remnants in certain lineages. If the gains were recent, then a relative lack of pressure towards genome reduction during evolution may also have played a role in the gain (Wolf & Koonin, 2013). Clearly, the significant appearance of pxtRNAs in certain lineages stood out, and their functionality needed future validation. These pxtRNAs could conceivably be the missing link between class-I and class-II tRNAs. It was predicted earlier that class-II tRNAs evolved before class-I (Sun & Caetano-Anolles, 2008a, 2008b). Hence, with evolutionary progression, the length of the V-arm diminished. The distribution pattern did not appear to follow any simple evolutionary route and was perhaps produced by a different selection process. We explored the possibility of alternative splicing in tRNA genes experiencing V-arm loss and gain. No evidence was found for alternative splicing, although this was not conclusive because a sufficient amount of data was not available. If this apparent V-arm retention event reflected inefficiency of splicing of the V-arm, positive selection for transcript fidelity could have driven this loss. If, instead, this alternative splicing event was functional, it would be surprising for the V-arm to be lost in most cases, unless the alternative splicing pattern evolved recently in some lineages, or had been lost in other lineages prior to V-arm loss (Barbosa-Morais et al., 2012).
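The check for an association between the number of V-arm gains and the total tRNA pool could, for instance, be made with a Pearson correlation coefficient. The text does not say which statistic was used, so this is only one plausible sketch. The per-species counts below are placeholders invented for illustration and are not values from the study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Hypothetical placeholder data: species -> (total tRNA genes, V-arm gains).
# These numbers are made up for the example, NOT taken from the paper.
toy_counts = {
    "species_A": (500, 1),
    "species_B": (300, 40),
    "species_C": (2500, 3),
    "species_D": (600, 0),
    "species_E": (450, 12),
}

totals = [total for total, _ in toy_counts.values()]
gains = [gain for _, gain in toy_counts.values()]
r_toy = pearson_r(totals, gains)
```

A value of `r_toy` close to zero would correspond to the absence of a direct linear association; a rank-based statistic could be substituted if the counts are heavily skewed.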
In view of their novelty, it was natural to propose and examine several alternative hypotheses on pxtRNAs. The first possibility was that these were a new class of tRNAs with divergent functions (Giegé, 2008), but that could only be validated experimentally. The increased length of pxtRNAs could foster altered tertiary structures leading to mistranslation (Ling et al., 2009; Yadavalli & Ibba, 2013).
The next possibility was that the long V-arm was perhaps a NCI, removed during tRNA maturation. If the long V-arm was removed on maturation, the paradox would disappear, though the deviations in the A-Box would persist. However, which route of excision could this NCI follow, if it were an intron at all? These extra nucleotides could be excised neither by the RNA motif rule (Tocchini-Valentini, Fruscoloni, & Tocchini-Valentini, 2005, 2007) nor through the typical presence of segment I/II, as in many eukaryotic cases (Di Nicola Negri et al., 1997). On the other hand, since the extra nucleotides formed a stem-loop structure in the V-arm, could it be treated as a relaxed BHL motif, and perhaps removed? An example, portrayed in Figure S2, shows that removing this extra notch between bases 46 and 47 generated a better Cove score in the tRNAscan-SE search server.
The other hypothesis was that these pxtRNAs actually coded for some other tRNA with a different anticodon. In that case, some extra nucleotides needed to be readjusted as NCIs, leading to a change in the anticodon. In support of this logic, the scenario in the euryarchaeon N. pellirubrum was worth recounting. A copy of the tRNAGlyACC gene was annotated for N. pellirubrum in NCBI, and that was supported by the tRNAscan-SE search server. But, paradoxically, there was a long V-arm for this class-I tRNA. There were two odd features in this otherwise innocuous tRNAGlyACC gene: (1) an A-starting anticodon was rarely, if ever, present in archaea, and (2) being a class-I tRNA, how did it have a long V-arm? Interestingly, it was found that the tRNAArgCCG gene was not yet annotated in NCBI for N. pellirubrum. Therefore, it was natural to hypothesize that this paradoxical copy of the tRNAGlyACC gene could actually be the standard tRNAArgCCG gene, provided the proper NCIs could be located and adjusted. This is precisely what is shown in Figure S3. Hence, taking a cue from this example, it might be possible that the pxtRNAs in eukaryotes were actually candidates for other tRNA genes mis-annotated in NCBI. In the case of archaea, it was simple to decipher, since the total tRNA count was small, with just one missing candidate (Das, Mitra, & Chakrabarti, 2010). In eukaryotes, by contrast, there was a large measure of ambiguity, since every species had multiple isoacceptor tRNAs (Chaley, Korotkov, & Phoenix, 1999).
Finally, pxtRNAs might be derived from other genomic portions, or from organellar genomes. Indeed, Sorghum bicolor had two pxtRNAsTyrGUA: one matched a chloroplast tRNA, the other a mitochondrial tRNA. Interestingly, for Caenorhabditis elegans, one pxtRNAIleUAU and another pxtRNAAspAUC had 100% similarity with the complements of introns of the genes for proteins T22C1.12 and Y92H12A.5, respectively. However, such matches were too few, and it could not be concluded that the pxtRNAs were just copied from elsewhere.
To conclude, the 1431 pxtRNA genes in eukaryotes appeared to be distributed preferentially in a few lineages, most notably in Felis catus. The conserved sequence features of pxtRNAs closely matched those of the standard tRNAs; however, the promoter regions had salient deviations. The nature of the transcription termination signals strongly pointed towards the possibility that pxtRNAs were the transcript products of RNA polymerase III. Their V-arm sequences and secondary structures made them different from the standard class-II tRNAs. In comparing pxtRNAs with other tRNAs, the case of the euryarchaeon N. pellirubrum was worth recounting. Here, the innocuously annotated class-I tRNAGlyACC gene in NCBI had a long class-II V-arm. The paradox in this archaeon could be solved by properly identifying NCIs in the tRNAGlyACC gene. Maturation, that is, elimination of the NCIs, led to a standard class-I tRNA, but with a different anticodon, namely that of the tRNAArgCCG gene. In eukaryotes, however, NCIs were observed only in very few instances; hence the process of tRNA maturation and the elimination of introns remained an open issue.
Supplementary material
The supplementary material for this paper is available
online at http://dx.doi.org/10.1080/07391102.2014.1003198.

References
Abbott, J. A., Francklyn, C. S., & Robey-Bond, S. M. (2014). Transfer RNA and human disease. Frontiers in Genetics, 5, 118.
Barbezier, N., Canino, G., Rodor, J., Jobet, E., Saez-Vasquez, J., Marchfelder, A., & Echeverría, M. (2009). Processing of a dicistronic tRNA-snoRNA precursor: Combined analysis in vitro and in vivo reveals alternate pathways and coupling to assembly of snoRNP. Plant Physiology, 150, 1598–1610.
Barbosa-Morais, N. L., Irimia, M., Pan, Q., Xiong, H. Y., Gueroussov, S., Lee, L. J., ... Blencowe, B. J. (2012). The evolutionary landscape of alternative splicing in vertebrate species. Science, 338, 1587–1593.
Barski, A., Chepelev, I., Liko, D., Cuddapah, S., Fleming, A. B., Birch, J., ... Zhao, K. (2010). Pol II and its associated epigenetic marks are present at Pol III-transcribed noncoding RNA genes. Nature Structural & Molecular Biology, 17, 629–634.
Bermudez-Santana, C., Attolini, C. S., Kirsten, T., Engelhardt, J., Prohaska, S. J., Steigele, S., ... Stadler, P. F. (2010). Genomic organization of eukaryotic tRNAs. BMC Genomics, 11, 114.
Bernstein, B. E., Kamal, M., Lindblad-Toh, K., Bekiranov, S., Bailey, D. K., Huebert, D. J., ... Lander, E. S. (2005). Genomic maps and comparative analysis of histone modifications in human and mouse. Cell, 120, 169–181.
Chaley, M. B., Korotkov, E. V., & Phoenix, D. A. (1999). Relationships among isoacceptor tRNAs seems to support the coevolution theory of the origin of the genetic code. Journal of Molecular Evolution, 48, 168–177.
Chan, P., & Lowe, T. (2009). GtRNAdb: A database of transfer RNA genes detected in genomic sequence. Nucleic Acids Research, 37, D93–D97.

Downloaded by [Arpa Samadder] at 07:34 21 February 2015

16

S. Mitra et al.

Costa, F. F. (2011). Non-coding RNAs: Could they be the


answer? Briengs in Functional Genomics, 10, 316319.
Creyghton, M. P., Cheng, A. W., Welstead, G. G., Kooistra, T.,
Carey, B. W., Steine, E. J., ... Jaenisch, R. (2013). Histone
H3K27ac separates active from poised enhancers and predicts developmental state. Proceedings of the National
academy of Sciences of the United States of America, 107,
2193121936.
Crooks, G. E., Hon, G., Chandonia, J. M., & Brenner, S. E.
(2004). WebLogo: A sequence logo generator. Genome
Research, 14, 11881190.
Das, S., Mitra, S., & Chakrabarti, J. (2010). Multiply expressed
tRNA genes? Journal of Bimolecular Structure & Dynamics, 28, 239246.
Das, S., Mitra, S., Sahoo, S., & Chakrabarti, J. (2014). Viral/
plasmid captures in Crenarchaea. Journal of Biomolecular
Structure & Dynamics, 32, 546554.
Di Nicola Negri, E., Fabbri, S., Bufardeci, E., Baldi, M. I.,
Gandini Attardi, D., Mattoccia, E., & Tocchini-Valentini,
G. P. (1997). The Eucaryal tRNA splicing endonuclease
recognizes a tripartite set of RNA elements. Cell, 89, 859
866.
Du, Z., Li, H., Wei, Q., Zhao, X., Wang, C., Zhu, Q., ... Su, Z.
(2013). Genome-wide analysis of histone modications:
H3K4me2, H3K4me3, H3K9ac, and H3K27ac in Oryza sativa L. Japonica. Molecular Plant, 6, 14631472.
Ebersole, T., Kim, J. H., Samoshkin, A., Kouprina, N.,
Pavlicek, A., White, R. J., ... Larionov, V. (2011). tRNA
genes protect a reporter gene from epigenetic silencing in
mouse cells. Cell Cycle, 10, 27792791.
Ernst, J., Kheradpour, P., Mikkelsen, T. S., Shoresh, N., Ward,
L. D., Epstein, C. B., ... Bernstein, B. E. (2010). Mapping
and analysis of chromatin state dynamics in nine human
cell types. Nature, 473, 4349.
Frazer-Abel, A. A., & Hagerman, P. J. (2004). Variation of the
acceptoranticodon interstem angles among mitochondrial
and non-mitochondrial tRNAs. Journal of Molecular
Biology, 343, 313325.
Fujishima, K., & Kanai, A. (2014). tRNA gene diversity in the
three domains of life. Frontiers in Gentics, 5, 111.
Galli, G., Hofstetter, H., & Birnstiel, M. L. (1981). Two conserved sequence blocks within eukaryotic tRNA genes are
major promoter elements. Nature, 294, 626631.
Ghavidel, A., Kislinger, T., Pogoutse, O., Sopko, R., Jurisica,
I., & Emili, A. (2007). Impaired tRNA nuclear export links
DNA damage and cell-cycle checkpoint. Cell, 131, 915
926.
Gieg, R. (2008). Toward a more complete view of tRNA biology. Nature Structural & Molecular Biology, 15, 1007
1014.
Good, P. D., Kendall, A., Ignatz-Hoover, J., Miller, E. L., Pai,
D. A., Rivera, S. R., ... Engelke, D. R. (2013). Silencing
near tRNA genes is nucleosome-mediated and distinct from
boundary element function. Gene, 526, 715.
Goodenbour, J. M., & Pan, T. (2006). Diversity of tRNA genes
in eukaryotes. Nucleic Acids Research, 34, 61376146.
Gunnery, S., Ma, Y., & Mathews, M. B. (1999). Termination
sequence requirements vary among genes transcribed by
RNA polymerase III. Journal of Molecular Biology, 286,
745757.
Haeusler, R. A., & Engelke, D. R. (2006). Spatial organization
of transcription by RNA polymerase III. Nucleic Acids
Research, 34, 48264836.
Hamada, M., Huang, Y., Lowe, T. M., & Maraia, R. J. (2001).
Widespread use of TATA elements in the core promoters

for RNA polymerases III, II, and I in ssion yeast. Molecular and Cellular Biology, 21, 68706881.
Hangauer, M. J., Vaughn, I. W., & McManus, M. T. (2013).
Pervasive transcription of the human genome produces
thousands of previously unidentied long intergenic noncoding RNAs. PLoS Genetics, 9, e1003569.
Heintzman, N. D., Stuart, R. K., Hon, G., Fu, Y., Ching, C.
W., Hawkins, R. D., ... Ren, B. (2007). Distinct and predictive chromatin signatures of transcriptional promoters and
enhancers in the human genome. Nature Genetics, 39,
311318.
Hofacker, L., Fontana, W., Stadler, P. F., Bonhoeffer, S.,
Tacker, M., & Schuster, P. (1994). Fast folding and
comparison of RNA secondary structures. Monatshefte fr
Chemie, 125, 167188.
Kawach, O., Vob, C., Wolff, J., Had, K., Maier, U. G., &
Zauner, S. (2005). Unique tRNA introns of an enslaved
algal cell. Molecular Biology and Evolution, 22, 1694
1701.
Kent, W. J., Sugnet, C. W., Furey, T. S., Roskin, K. M.,
Pringle, T. H., Zahler, A. M., & Haussler, D. (2009). The
human genome browser at UCSC. Genome Research, 12,
9961006.
Kimura, H. (2013). Histone modications for human epigenome analysis. Journal of Human Genetics, 58, 439445.
Kouzarides, T. (2007). Chromatin modications and their function. Cell, 128, 693705.
Kramer, E. B., & Hopper, A. K. (2013). Retrograde transfer
RNA nuclear import provides a new level of tRNA quality
control in Saccharomyces cerevisiae. Proceedings of the
National Academy of Sciences, 110, 2104221047.
Liao, H. Y., Yin, M. L., & Cheng, Y. (2004). A parallel implementation of the SmithWaterman algorithm for massive
sequences searching. Proceedings Conference of IEEE
Engineering in Medicine and Biology Society, 4, 2817
2820.
Ling, J., So, B. R., Yadavalli, S. S., Roy, H., Shoji, S., Fredrick, K., ... Ibba, M. (2009). Resampling and editing of
mischarged tRNA Prior to translation elongation. Molecular
Cell, 33, 654660.
Lowe, T. M., & Eddy, S. R. (1997). tRNAscan-SE: A program
for improved detection of transfer RNA genes in genomic
sequence. Nucleic Acids Research, 25, 955964.
Marck, C., & Grosjean, H. (2002). tRNomics: Analysis of
tRNA genes from 50 genomes of Eukarya, Archaea, and
Bacteria reveals anticodon-sparing strategies and domainspecic features. RNA, 8, 11891232.
Maruyama, S., Sugahara, J., Kanai, A., & Nozaki, H. (2010).
Permuted tRNA genes in the nuclear and nucleomorph genomes of photosynthetic eukaryotes. Molecular Biology and
Evolution, 27, 1070–1076.
Matsuzaki, M., Misumi, O., Shin-i, T., Maruyama, S., Takahara, M., Miyagishima, S., ... Kuroiwa, T. (2004). Genome
sequence of the ultrasmall unicellular red alga Cyanidioschyzon merolae 10D. Nature, 428, 653–657.
Mori, S., Suen, C. Y., & Yamamoto, K. (1992). Historical
review of OCR research and development. Proceedings of
the IEEE, 80, 1029–1058.
Naykova, T. M., Kondrakhin, Y. V., Rogozin, I. B., Voevoda,
M. I., Yudin, N. S., & Romaschenko, A. G. (2003). Concerted changes in the nucleotide sequences of the intragenic
promoter regions of eukaryotic genes for tRNAs of all
specificities. Journal of Molecular Evolution, 57, 520–532.
Nishida, H., Suzuki, T., Kondo, S., Miura, H., Fujimura, Y., &
Hayashizaki, Y. (2006). Histone H3 acetylated at lysine 9

in promoter is associated with low nucleosome density in
the vicinity of transcription start site in human cell. Chromosome Research, 14, 203–211.
Novoa, E. M., & de Pouplana, L. R. (2012). Speeding with
control: codon usage, tRNAs, and ribosomes. Trends in
Genetics, 28, 574–581.
Oler, A. J., Alla, R. K., Roberts, D. N., Wong, A., Hollenhorst,
P. C., Chandler, K. J., ... Cairns, B. R. (2010). Human
RNA polymerase III transcriptomes and relationships to
Pol II promoter chromatin and enhancer-binding factors.
Nature Structural & Molecular Biology, 17, 620–628.
Orioli, A., Pascali, C., Quartararo, J., Diebel, K. W., Praz, V.,
Romascano, D., ... Dieci, G. (2011). Widespread occurrence
of non-canonical transcription termination by human RNA
polymerase III. Nucleic Acids Research, 39, 5499–5512.
Pei, B., Sisu, C., Frankish, A., Howald, C., Habegger, L., Mu,
X. J., ... Gerstein, M. B. (2012). The GENCODE pseudogene resource. Genome Biology, 13, 1–26.
Pertea, M. (2012). The human transcriptome: An unfinished
story. Genes, 3, 344–360.
Raab, J. R., Chiu, J., Zhu, J., Katzman, S., Kurukuti, S., Wade, P.
A., ... Kamakaka, R. T. (2012). Human tRNA genes function
as chromatin insulators. The EMBO Journal, 31, 330–350.
RajBhandary, U. L., & Köhrer, C. (2006). Early days of tRNA
research: Discovery, function, purification and sequence
analysis. Journal of Biosciences, 31, 439–451.
Randau, L., & Söll, D. (2008). Transfer RNA genes in pieces.
EMBO reports, 9, 623–628.
Rogozin, I. B., Kondrakhin, Y. V., Naykova, T. M., Yudin, N. S.,
Voevoda, M. I., & Romaschenko, A. G. (2000). The module
organisation of the A and B-Boxes in the tRNA intragenic
promoter. Proceedings of BGRS2000, 1, 106–110.
Schramm, L., & Hernandez, N. (2002). Recruitment of RNA
polymerase III to its target promoters. Genes & Development, 16, 2593–2620.
Segni, G. D., Gastaldi, S., & Tocchini-Valentini, G. P. (2008).
Cis- and trans-splicing of mRNAs mediated by tRNA
sequences in eukaryotic cells. Proceedings of the National
Academy of Sciences, 105, 6864–6869.
Sharp, S. J., Schaack, J., Cooley, L., Burke, D. J., & Söll, D.
(1985). Structure and Transcription of Eukaryotic tRNA
Gene. Critical Reviews in Biochemistry and Molecular
Biology, 19, 107–144.
Soma, A. (2014). Circularly permuted tRNA genes: Their
expression and implications for their physiological relevance and development. Frontiers in Genetics, 5, 1–17.
Soma, A., Onodera, A., Sugahara, J., Kanai, A., Yachie, N.,
Tomita, M., ... Sekine, Y. (2007). Permuted tRNA genes
expressed via a circular RNA intermediate in Cyanidioschyzon merolae. Science, 318, 450–453.

Soma, A., Sugahara, J., Onodera, A., Yachie, N., Kanai, A.,
Watanabe, S., ... Sekine, Y. (2013). Identification of
highly-disrupted tRNA genes in nuclear genome of the red
alga, Cyanidioschyzon merolae 10D. Scientific Reports, 3,
1–9.
Sugahara, J., Fujishima, K., Morita, K., Tomita, M., & Kanai,
A. (2009). Disrupted tRNA gene diversity and possible
evolutionary scenarios. Journal of Molecular Evolution, 69,
497–504.
Sun, F. J., & Caetano-Anollés, G. (2008a). Evolutionary patterns in the sequence and structure of transfer RNA: Early
origins of archaea and viruses. PLoS Computational Biology, 4, e1000018.
Sun, F. J., & Caetano-Anollés, G. (2008b). The evolutionary
significance of the long variable arm in transfer RNA.
Complexity, 14, 26–39.
Suzuki, T., Kondo, S., Wakayama, T., Cizdziel, P. E., &
Hayashizaki, Y. (2008). Genome-wide analysis of abnormal
H3K9 acetylation in cloned mice. PLoS ONE, 3, e1905.
Tay, Y., Rinn, J., & Pandolfi, P. P. (2014). The multilayered
complexity of ceRNA crosstalk and competition. Nature,
505, 344–352.
Tocchini-Valentini, G. D., Fruscoloni, P., & Tocchini-Valentini,
G. P. (2005). Coevolution of tRNA intron motifs and tRNA
endonuclease architecture in Archaea. Proceedings of the
National Academy of Sciences, 102, 15418–15422.
Tocchini-Valentini, G. D., Fruscoloni, P., & Tocchini-Valentini,
G. P. (2007). The dawn of dominance by the mature
domain in tRNA splicing. Proceedings of the National
Academy of Sciences, 104, 12300–12305.
Torres, A. G., Batlle, E., & de Pouplana, L. R. (2014). Role of
tRNA modifications in human diseases. Trends in Molecular Medicine, 20, 306–314.
Weinert, T., & Hopper, A. K. (2007). tRNA traffic meets a
cell-cycle checkpoint. Cell, 131, 838–840.
Woese, C. (1970). Molecular mechanics of translation: A reciprocating ratchet mechanism. Nature, 226, 817–820.
Wolf, Y. I., & Koonin, E. V. (2013). Genome reduction as the
dominant mode of evolution. BioEssays, 35, 829–837.
Yadavalli, S. S., & Ibba, M. (2013). Selection of tRNA charging quality control mechanisms that increase mistranslation
of the genetic code. Nucleic Acids Research, 41, 1104–1112.
Yoshihisa, T. (2014). Handling tRNA introns, archaeal way and
eukaryotic way. Frontiers in Genetics, 10, 1–16.
Zhang, G., Lukoszek, R., Mueller-Roeber, B., & Ignatova, Z.
(2011). Different sequence signatures in the upstream
regions of plant and animal tRNA genes shape distinct
modes of regulation. Nucleic Acids Research, 39, 3331–3339.
