FEATURES OF RNA POLYMERASE PAUSE SITES
HEMANTH PRABHAR
11B205
ABSTRACT:
During transcription, the highly regulated process that
produces mRNA, the polymerase moving along the DNA can
1) pause, 2) arrest or 3) terminate when it encounters certain
sequence motifs. This behaviour was identified from the
observation that polymerase molecules pause at particular
sites. Transcription arrest has also been found to occur when
the RNA-DNA hybrid slips out of place because of structures
called R loops.
ABOUT THE PROJECT:
The project is about identifying the effect of sequence
features of DNA such as tandem repeats and R loops on
polymerase pausing.
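As an illustration of what a tandem-repeat feature looks like in sequence data, here is a minimal sketch that finds perfect tandem repeats with a regular expression. The function name and the thresholds (unit length 2 to 6, at least 3 copies) are assumptions chosen for the example; the report does not say which repeat finder or parameters were used.

```python
import re

def find_tandem_repeats(seq, min_unit=2, max_unit=6, min_copies=3):
    """Return (start, unit, copies) for perfect tandem repeats in seq.

    Hypothetical helper: the thresholds are illustrative, not project values.
    """
    seq = seq.upper()
    # A short unit (captured lazily) followed by at least min_copies - 1
    # further back-to-back copies of the same unit.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_copies - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits

# Example: four copies of CAG embedded in flanking sequence.
print(find_tandem_repeats("AATCAGCAGCAGCAGTT"))
```

Only exact repeats are reported here; dedicated repeat finders also tolerate mismatches and indels, which this sketch does not attempt.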
BACKGROUND WORK:
Gene transcription takes place in three phases: initiation,
elongation and termination.
The elongation step is the most highly regulated, and this
regulation takes place through RNA polymerase pausing. The
two causes of polymerase pausing are:
1) Thermodynamic stability of the RNA-DNA hybrid (A-T
rich regions)
2) Structures formed by the RNA that displace it from
the polymerase catalytic site (1)
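The first cause above depends on local base composition, so one simple way to flag candidate regions is a sliding-window A-T fraction. The sketch below assumes a window of 10 bases and a cutoff of 0.8; both values and the function name are hypothetical and are not taken from the project.

```python
def at_rich_windows(seq, window=10, threshold=0.8):
    """Return (start, at_fraction) for every window whose A+T fraction
    is at least threshold. Window size and threshold are illustrative."""
    seq = seq.upper()
    hits = []
    for start in range(len(seq) - window + 1):
        chunk = seq[start:start + window]
        at_fraction = sum(base in "AT" for base in chunk) / window
        if at_fraction >= threshold:
            hits.append((start, at_fraction))
    return hits

# Example: a 10-base A-T stretch flanked by G-C sequence.
example = "GCGCGC" + "ATATATATAT" + "GCGCGC"
print(at_rich_windows(example, window=10, threshold=1.0))
```

Swapping `"AT"` for `"GC"` in the membership test gives the corresponding G-C scan.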
STEPS INVOLVED:
1) The initial pausing happens at an A-T rich region.
2) The polymerase backtracks to a thermodynamically stable
G-C region.
3) RNA cleavage takes place and stable pausing occurs.
Pausing also happens due to the presence of structures
called R loops. An R loop is a structure in which one strand
of DNA is partially or completely hybridized with RNA,
leaving the other strand unpaired (2)
R loops can cause pausing downstream of the poly A site. G-rich
regions are present downstream of the poly A site and
stabilize R loops. It has been proposed that R loops may be critical
for RNAPII to pause downstream of the poly A site (3)
AIM:
METHODOLOGY:
A total of 2200 genes were downloaded from BioMart, taking 100
genes at random from each chromosome so that the data set is a
random sample.
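The sampling step can be sketched as follows: group genes by chromosome, then draw 100 from each group without replacement (2200 genes at 100 per chromosome corresponds to 22 chromosomes). The input layout of (gene_id, chromosome) pairs, the function name and the fixed seed are assumptions for illustration; the report does not describe how the BioMart export was processed.

```python
import random
from collections import defaultdict

def sample_genes_per_chromosome(genes, per_chromosome=100, seed=0):
    """Draw up to per_chromosome gene IDs at random from each chromosome.

    genes: iterable of (gene_id, chromosome) pairs (hypothetical layout).
    Returns a dict mapping chromosome -> list of sampled gene IDs.
    """
    by_chrom = defaultdict(list)
    for gene_id, chrom in genes:
        by_chrom[chrom].append(gene_id)

    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    sampled = {}
    for chrom, ids in sorted(by_chrom.items()):
        k = min(per_chromosome, len(ids))
        sampled[chrom] = rng.sample(ids, k)  # sampling without replacement
    return sampled

# Example with made-up gene IDs on chromosomes 1..22.
toy_genes = [(f"gene_{c}_{i}", str(c)) for c in range(1, 23) for i in range(150)]
picked = sample_genes_per_chromosome(toy_genes)
print(sum(len(ids) for ids in picked.values()))
```

Sampling a fixed number per chromosome gives every chromosome equal weight regardless of how many genes it carries, which is a stratified sample rather than a uniform draw over all genes.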
TOOLS USED:
GRO-seq (global run-on sequencing) is used in this study to
locate paused genes throughout the human genome. GRO-seq is a
relatively new methodology that documents transcribed regions in
the human genome by isolating nascent RNAs and sequencing them on
a large scale. Polymerase locations can thus be identified
precisely, along with active promoters and the direction of
transcription (4)
TRANSCRIPTION FACTORS:
The transcription factors that act on these genes may also have an
effect on the pausing of the polymerase. DBD is a database of
predicted transcription factors in completely sequenced genomes.
All of the predicted transcription factors contain assignments to
sequence-specific DNA-binding domain families. The predictions are
based on domain assignments from the SUPERFAMILY and Pfam hidden
Markov model libraries. Benchmarks of the DBD predictions are
reported to show that they are accurate and have wide coverage on a
genomic scale (9)
The DBD consists of predicted transcription factor repertoires for
930 completely sequenced genomes. The transcription factors for the
human genome were taken, and the transcription factors for the
sampled genes were listed.
RESULTS:
The results for the correlation of tandem repeats with the pause sites
TABLE 1:
Pausing status   Tandem repeats present   No tandem repeats   Total
Positive         322                      134                 456
Negative         703                      196                 899
Total            1025                     330                 1355
The results for the correlation of R loops with the pause sites
TABLE 2:
Pausing status   R loop present   R loop absent   Total
Positive         286              61              347
Negative         1256             597             1853
Total            1542             658             2200
The transcription factors were also not found to have any correlation
with the pause sites of the polymerase.
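One common way to check whether pausing status is independent of a sequence feature is Pearson's chi-square test on the 2x2 counts. The sketch below applies it to the counts from Table 1 and Table 2; the report does not state which statistical test, if any, was used, so the choice of test and the function name are assumptions.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test of independence for the table [[a, b], [c, d]].

    Returns (statistic, p_value) with 1 degree of freedom and no
    continuity correction.
    """
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))

    stat = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed[i][j] - expected) ** 2 / expected

    # For 1 degree of freedom the upper tail of the chi-square
    # distribution equals erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Rows: pausing positive / negative. Columns: feature present / absent.
tandem_stat, tandem_p = chi_square_2x2(322, 134, 703, 196)   # Table 1
rloop_stat, rloop_p = chi_square_2x2(286, 61, 1256, 597)     # Table 2
print(f"Tandem repeats: chi2 = {tandem_stat:.3f}, p = {tandem_p:.3g}")
print(f"R loops:        chi2 = {rloop_stat:.3f}, p = {rloop_p:.3g}")
```

A small p-value would indicate that the counts deviate from what independence predicts; it says nothing about the direction or cause of the association.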
DISCUSSION:
1. The presence of R loops, tandem repeats and the transcription
factors was not found to have any influence on the pausing status
of the polymerase. This is taken to hold for the whole genome, as
the data set was chosen randomly.
2. Though R loops and sequence repeats may play a role in
determining pausing, no conclusive evidence was obtained from
this result.
3. While the DNA sequence must contain all the information needed
for the regulation of genes, our understanding of how the protein
machinery interprets this information is limited. In particular, it
is unclear what sequence features specify the location and duration
of promoter-proximal Pol II pausing on a given gene.
4. The transcription factors appear to be randomly distributed and
show no consistent pattern across the pause sites; hence they do
not appear to have any effect on the pausing of the polymerase.
BIBLIOGRAPHY:
1) Lochan, M. Rajeeva. Sequence features of RNA polymerase pause
sites.
2) Nechaev, S. & Adelman, K. Pol II waiting in the starting gates:
regulating the transition from transcription initiation into
productive elongation. Biochim Biophys Acta 1809, 34-45 (2011).
3) Wenjian Gan, Jie Liu, Keng Shen. R-loop-mediated genomic
instability is caused by impairment of replication fork progression.
4) Core, L. J., Waterfall, J. J. & Lis, J. T. Nascent RNA sequencing
reveals widespread pausing and divergent initiation at human
promoters. Science 322, 1845-1848 (2008).
5) http://bowtie-bio.sourceforge.net/index.shtml
6) http://broadinstitute.org/software/igv/download
7) Wongsurawat, T. et al. Quantitative model of R-loop forming
structures reveals a novel level of RNA-DNA interactome complexity.
Nucleic Acids Research (2011). doi:10.1093/nar/gkr1075
8) McIvor, E. I., Polak, U. & Napierala, M. New insights into
repeat instability. RNA Biology 7, no. 5 (2010): 551-558.