The genetic code

As it became evident that genes controlled the structure of polypeptides,
attention focused on how the sequence of the four base-pairs in DNA could
control the sequence of the 20 amino acids found in proteins. With the discovery of
the mRNA intermediary, the question became one of how the sequence of the four
bases present in mRNA molecules could specify the amino acid sequence of a
polypeptide. What is the nature of the genetic code relating mRNA base sequences
(or DNA base-pair sequences) to amino acid sequences? Clearly, the symbols or
letters used in the code must be the bases; but what comprises a codon, the
unit or word specifying one amino acid (or, actually, one aminoacyl-tRNA
complex)?
Figure 10.29
Probable secondary structure of the human snRNA U1 and its potential base-pairing
to the 5' consensus sequence of introns in nuclear pre-mRNAs. The pre-mRNA
sequence is shown in red with the 5' consensus sequence (5'-GUAAGU-3') of the
intron written 3' to 5' to align it with the complementary sequence near the 5' end
of the U1 snRNA. The m symbols refer to methyl side groups. The U1 snRNA has a
trimethyl G cap at the 5' terminus linked to the subterminal Am by a 5'-5'
triphosphate linkage, like that of most eukaryotic mRNAs (see Fig. 10.21). (After P. Epstein,
R. Reddy, and H. Busch, Site-specific cleavage by T1 RNase of U-1 RNA in U-1
ribonucleoprotein particles, Proc. Natl. Acad. Sci. U.S.A. 78:1562-1566, 1981.)
Three Nucleotides per Codon
Twenty different amino acids are incorporated during translation. Thus, at least 20
different codons must be formed using the four symbols (bases) available in the
message (mRNA). Two bases per codon would yield only 4^2, or 16, possible codons,
clearly not enough. Three bases per codon yield 4^3, or 64, possible codons, an
apparent excess.
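
The counting argument above can be checked by enumeration. The following is a minimal Python sketch, not part of the original text; the names `BASES`, `AMINO_ACIDS`, and `possible_codons` are chosen here purely for illustration.

```python
from itertools import product

BASES = "UCAG"      # the four mRNA bases
AMINO_ACIDS = 20    # number of amino acids that must be encoded

def possible_codons(length):
    """Return every distinct word of the given length over the four bases."""
    return ["".join(word) for word in product(BASES, repeat=length)]

for n in (1, 2, 3):
    count = len(possible_codons(n))
    verdict = "enough" if count >= AMINO_ACIDS else "too few"
    print(f"{n} base(s) per codon: {count} possible codons ({verdict})")
```

Only a codon length of three or more gives at least 20 distinct words, which is why the triplet is the smallest workable codon size.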
The first strong evidence that the genetic code was in fact a triplet code (three
nucleotides per codon) resulted from a genetic analysis of proflavin-induced
mutations in the rII locus of phage T4 carried out by F. H. C. Crick and colleagues in
1961. Crick and colleagues isolated proflavin-induced revertants of a proflavin-induced
mutation. (Proflavin, an acridine dye, induces single base-pair additions and
deletions; see Chapter 11, pp. 310-311.) These revertants were shown (by
backcrosses to wild type; see Fig. 11.9) to result from the occurrence of suppressor
mutations rather than from back-mutation at the original site of mutation. Crick and
colleagues reasoned that if the original
mutation was a single base-pair addition or deletion, then the suppressor mutations
must be single base-pair deletions or additions, respectively, occurring at a site or
sites near the original mutation. A single base-pair addition or deletion will alter the
reading frame of the gene and mRNA (the codons in phase during translation). This
is illustrated in Fig. 10.30a. When the suppressor mutations were isolated as single
mutants by screening progeny of backcrosses to wild type, they were found to
produce mutant phenotypes, just like the original mutation. Crick and colleagues
next isolated proflavin-induced suppressor mutations of the original suppressor
mutations, and so on.
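
The effect of a single-base addition on the reading frame, and its suppression by a nearby single-base deletion, can be sketched in a few lines of Python. This is an illustration only: the message `wild_type` is a made-up sequence, and the helper names (`codons`, `insert_base`, `delete_base`) are assumptions introduced here, not anything from Crick's work.

```python
def codons(seq):
    """Split a sequence into successive triplets, dropping any incomplete tail."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]

def insert_base(seq, pos, base):
    """Return seq with one base added before index pos (a 'plus' mutation)."""
    return seq[:pos] + base + seq[pos:]

def delete_base(seq, pos):
    """Return seq with the base at index pos removed (a 'minus' mutation)."""
    return seq[:pos] + seq[pos + 1:]

wild_type = "AUG" + "CAU" * 6                    # hypothetical message
plus_mutant = insert_base(wild_type, 4, "A")     # single-base addition
double_mutant = delete_base(plus_mutant, 9)      # nearby single-base deletion

print(codons(wild_type))      # AUG CAU CAU CAU CAU CAU CAU
print(codons(plus_mutant))    # AUG CAA UCA UCA UCA UCA UCA
print(codons(double_mutant))  # AUG CAA UCA CAU CAU CAU CAU
```

In the single mutant every codon distal to the addition is misread; in the double mutant only the codons between the two sites are altered, and the distal reading frame is back in phase.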
Figure 10.30
Schematic illustration of Crick and coworkers' proof that the genetic
code is a triplet code (three bases per codon). Crick and colleagues studied a series
of suppressor mutations of a mutation at the rII locus of phage T4. The original rII
mutation had been induced with the acridine dye proflavin; it was most likely,
therefore, the result of a single base-pair addition or deletion. In (a), the original
mutation is shown arbitrarily as a single base-pair addition, specifically as an AT
base-pair insertion (wild-type allele → mutant allele). The nucleotide-pair sequence
shown for the wild-type allele (and thus the mRNA base sequence and amino acid
sequence of the polypeptide) is hypothetical. Crick and coworkers selected
phenotypic revertants of this mutant and demonstrated by backcrosses that these
revertants resulted from suppressor mutations, not back-mutations at the original
mutant site. If the original mutation is an addition, gene-product activity might be
restored by a deletion (single base-pair) mutation in a nearby region, for example,
deletion of a CG base-pair as shown (mutant allele → revertant allele). The original
addition mutation will change the reading frame (determining codons in phase for
translation) for all codons distal (relative to the direction of translation) to the site of
the mutation. The subsequent deletion (suppressor mutation) will restore the
reading frame for the distal portion of the gene. If the altered amino acid sequence
is not critical to function, the protein produced by the doubly mutant gene will be active.
When suppressor mutations were isolated in single-mutant strains by backcrosses to
wild type, all were found to yield mutant phenotypes, just like the original mutations
that they suppressed. Crick and colleagues then isolated proflavin-induced
suppressor mutations of the previously isolated suppressor mutations (present in
single mutants recovered from backcrosses). After repeating this process for several
cycles, all the mutations were classified as plus (for single base-pair addition) or
minus (for single base-pair deletion) on the basis that a plus mutation would
suppress a minus mutation, but not another plus mutation, and vice versa. Actually,
Crick and colleagues had no idea whether the plus group of mutations represented
additions or not; they could just as likely have been deletions. The only important
point was that all deletions ended up in one group (be it the plus group or the minus
group), and all additions ended up in the other group.
Having so classified the mutations, Crick and coworkers next performed the critical
experiment. They isolated recombinants carrying various combinations of plus and
minus mutations. When two plus mutations were present in a recombinant, its
phenotype was always mutant. The same was true in the case of recombinants with
two minus mutations. However, recombinants containing three plus mutations
(b) or three minus mutations frequently had wild-type phenotypes. Thus,
the wild-type reading frame for the distal portion of the gene was restored by three
single base-pair additions or three single base-pair deletions, but was altered by
either one or two base-pair additions or deletions. These results are most easily
explained if each codon contains three bases.
All the isolated mutations were then classified into two groups, plus (+) and minus (-)
(for additions and deletions, although Crick et al. had no idea which group was
which), on the basis that a (+) mutation would
suppress a (-) mutation, but not another (+) mutation, and vice versa (see Fig. 10.30 and
legend for additional details). Next, Crick et al. constructed recombinants that
carried various combinations of the (+) and the (-) mutations. Recombinants with two
(+) mutations or two (-) mutations always had mutant phenotypes, just like the
single mutants. Recombinants carrying three (+) mutations (Fig. 10.30b) or three (-)
mutations, however, often had wild-type phenotypes. This indicated that the
addition of three base-pairs or the deletion of three base-pairs left the distal
portion of the gene with the correct (wild-type) reading frame, a result that would
be expected only if each codon contained three nucleotides.
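
The logic of the recombinant experiment reduces to simple modular arithmetic: the distal reading frame is intact whenever the net number of added bases is a multiple of the codon size. The sketch below is illustrative only; `frame_restored` and its `codon_size` parameter are names assumed here, and each mutation is modeled as +1 (addition) or -1 (deletion).

```python
def frame_restored(mutations, codon_size=3):
    """True if the net shift from +1/-1 mutations leaves the distal frame in phase."""
    return sum(mutations) % codon_size == 0

combinations = {
    "one (+)": (+1,),
    "two (+)": (+1, +1),
    "(+) with (-)": (+1, -1),
    "three (+)": (+1, +1, +1),
    "three (-)": (-1, -1, -1),
}

for size in (2, 3, 4):
    print(f"codon size {size}:")
    for label, muts in combinations.items():
        state = "in phase" if frame_restored(muts, size) else "shifted"
        print(f"  {label:13s} {state}")
```

Of the three sizes compared, only a codon size of three reproduces the observed pattern: two mutations of the same sign leave the frame shifted, whereas three of the same sign restore it.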
Confirmation that the coding ratio (nucleotides to amino acids) is indeed three has
come from many sources. Considerable evidence favoring a triplet code emerged
from studies using in vitro translation systems. The following observations were of
major importance. (1) Trinucleotides were found sufficient to stimulate specific
binding of aminoacyl-tRNAs to ribosomes. For example, 5'-UUC-3' stimulates
ribosomal binding of phenylalanyl-tRNA^Phe. (2) Chemically synthesized RNA
molecules containing repeating dinucleotide sequences directed the synthesis of
copolymers with alternating amino acid sequences. Poly(UG)n, for example, when
used as an artificial mRNA in an in vitro system, directed the synthesis of the copolymer
(Cys-Val)n. (3) Molecules with repeating trinucleotide sequences, on the other hand,
directed the synthesis of a mixture of three homopolymers (initiation being random on
such an mRNA in an in vitro system). Poly(UCG)n, for example, directed the
synthesis of a mixture of polyserine, polyarginine, and polyvaline. Again, these
results are consistent only with a triplet code. Ultimately, the triplet nature of the code
was definitively established by the results of correlated nucleic acid and protein
sequencing (e.g., see Fig. 10.31 and Chapter 12, Fig. 12.26).
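
The repeating-polymer results can be reproduced by translating such messages in all three reading frames. The sketch below assumes the standard genetic code (taken here to match Table 10.1, which is not reproduced in this section) written in one-letter amino acid symbols, with `*` for termination; the function names are illustrative.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... GGG
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(mrna, frame=0):
    """Read triplets starting at offset `frame`; stop at a termination codon."""
    peptide = []
    for i in range(frame, len(mrna) - 2, 3):
        aa = CODE[mrna[i:i + 3]]
        if aa == "*":
            break
        peptide.append(aa)
    return "".join(peptide)

def products(repeat_unit, copies=6):
    """Peptides made from a repeating polymer when initiation is in any frame."""
    mrna = repeat_unit * copies
    return {translate(mrna, frame) for frame in range(3)}

print(products("UG"))   # alternating Cys (C) and Val (V) in every frame
print(products("UCG"))  # three homopolymers: Ser (S), Arg (R), Val (V)
```

A repeating dinucleotide reads as two alternating codons whatever the frame, while a repeating trinucleotide reads as a single codon whose identity depends on the frame, which is the pattern expected only for a triplet code.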
Deciphering the Code
The deciphering of the genetic code, that is, determining (1) which codons specify
which amino acids, (2) how many of the 64 possible codons are used, (3) how the
code is punctuated, and (4) whether different species use the same or different
codons, took place during the early 1960s and was one of the most exciting periods
in the history of science. The cracking of the genetic code had an effect on the life
sciences like that of the splitting of the atom on the physical sciences. It opened up a
vast new field of study of gene expression.
The first major breakthrough came in 1961 when M. W. Nirenberg (1968 Nobel Prize
recipient) and J. H. Matthaei, and then S. Ochoa (1959 Nobel Prize recipient) and
coworkers, demonstrated that synthetic RNA molecules could be used as artificial
mRNAs to direct in vitro protein synthesis. That is, when ribosomes, aminoacyl-tRNAs,
and the soluble protein factors required for translation are purified free of
natural mRNAs, these components can be combined in vitro and stimulated to
synthesize polypeptides by the addition of chemically synthesized RNA molecules. If
these synthetic mRNA molecules are of known composition, the composition of the
polypeptides synthesized can be used to deduce which codons specify which amino
acids.
The first codon assignment (UUU for phenylalanine) was made when Nirenberg and
Matthaei demonstrated that polyuridylic acid [poly U = (U)n] directed the synthesis
of polyphenylalanine [(phenylalanine)n]. Ochoa and others continued this approach
using synthetic RNAs with random sequences of known nucleotide composition, such
as 50 percent U and 50 percent G. The frequencies of the different triplets in such a
random copolymer can be easily calculated. For example, the 50 percent U/50
percent G copolymer will contain 12.5 percent (1/2 x 1/2 x 1/2 = 1/8) of each of the
eight possible codons: UUU, UUG, UGU, GUU, UGG, GUG, GGU, and GGG. These can
then be compared with the amino acids incorporated (phenylalanine, leucine,
cysteine, valine, tryptophan, and glycine) when this random copolymer is used in an
in vitro protein-synthesizing system. By varying the composition, for example, to
75 percent U and 25 percent G, one can vary the relative
frequencies of the eight codons and correlate them with the relative frequencies of
the amino acids in the polypeptides synthesized. Such experiments provided a great
deal of information about the nature of the code.
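
The expected triplet frequencies in a random copolymer follow from multiplying the base fractions, as in the 1/2 x 1/2 x 1/2 example. The sketch below works this out for both compositions mentioned in the text. It assumes the standard assignments for the eight U/G codons (taken to match Table 10.1); the function names are illustrative.

```python
from itertools import product
from math import prod

# Standard assignments for the eight codons made only of U and G
UG_CODE = {"UUU": "Phe", "UUG": "Leu", "UGU": "Cys", "GUU": "Val",
           "UGG": "Trp", "GUG": "Val", "GGU": "Gly", "GGG": "Gly"}

def codon_frequencies(composition):
    """Map each triplet to its expected frequency in a random copolymer.

    composition maps each base to its fraction, e.g. {"U": 0.5, "G": 0.5}.
    """
    return {"".join(c): prod(composition[b] for b in c)
            for c in product(composition, repeat=3)}

def amino_acid_frequencies(composition):
    """Sum the codon frequencies for each amino acid they specify."""
    totals = {}
    for codon, freq in codon_frequencies(composition).items():
        aa = UG_CODE[codon]
        totals[aa] = totals.get(aa, 0.0) + freq
    return totals

for comp in ({"U": 0.5, "G": 0.5}, {"U": 0.75, "G": 0.25}):
    print(comp)
    ranked = sorted(amino_acid_frequencies(comp).items(), key=lambda kv: -kv[1])
    for aa, freq in ranked:
        print(f"  {aa}: {freq:.4f}")
```

With equal amounts of U and G every codon is expected at 0.125. At 75 percent U, UUU rises to about 0.42 and GGG falls to about 0.016. An amino acid whose incorporation tracks one of these shifts can then be tied to codons of that base composition.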
More definitive data were later obtained by H. G. Khorana using in vitro systems
that were activated by synthetic mRNAs of known nucleotide sequences. Khorana's
experiments permitted direct comparisons between nucleotide sequences and the
amino acids incorporated in response to these sequences. The ultimate cracking
of the code occurred when trinucleotides were found to function as mini-mRNAs in
directing the specific binding of aminoacyl-tRNAs to ribosomes. By using all 64
possible trinucleotide sequences in such aminoacyl-tRNA binding experiments, it was
possible to verify the codon assignments made from data of earlier experiments.
On the basis of extensive data accumulated over several years, the codon
assignments shown in Table 10.1 became firmly established. Two important
questions remained to be answered. (1) Are the assignments based on in vitro
experiments valid in vivo? (2) Is the code universal, that is, do the codons specify
the same amino acids in all organisms? Several lines of evidence now indicate that
these codon assignments are correct for protein synthesis in vivo for most, if not all,
species. When the amino acid substitutions that result from mutations induced with
chemical mutagens with specific mutagenic effects (see Chapter 11) are determined
by amino acid sequencing, the substitutions are almost always consistent with the
codon assignments given in Table 10.1 and the known effect of the mutagen. More
convincingly, when the nucleotide sequences of genes or of mRNAs are determined
and compared with the amino acid sequences of the polypeptides encoded by those
genes or mRNAs, the observed correlations are always found to be those predicted
from the accepted codon assignments (Table 10.1). This can be illustrated by
comparing the nucleotide sequence of the gene coding for the protein coat or
capsid of bacteriophage MS2 with the amino acid sequence of the capsid
polypeptide (Fig. 10.31). Phage MS2 stores its genetic information in RNA (like TMV;
see Chapter 5, pp. 96-97). Its chromosome is equivalent to an mRNA molecule
in organisms with DNA genomes. (Also see Chapter 12, Fig. 12.26.)
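
The kind of comparison described above, predicting a polypeptide from a nucleotide sequence and checking it against the sequenced protein, can be sketched as follows. The code table is the standard genetic code (assumed to match Table 10.1). The message and the "observed" peptide are short made-up examples, not the MS2 data of Fig. 10.31, and the function names are illustrative.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in one-letter symbols; "*" marks termination codons
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate_orf(mrna):
    """Translate from the first AUG to the first in-frame termination codon."""
    start = mrna.find("AUG")
    if start < 0:
        return ""
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        aa = CODE[mrna[i:i + 3]]
        if aa == "*":
            break
        peptide.append(aa)
    return "".join(peptide)

def agrees_with_protein(mrna, observed, initiator_removed=True):
    """Compare the predicted peptide with an observed amino acid sequence."""
    predicted = translate_orf(mrna)
    if initiator_removed:
        predicted = predicted[1:]  # initiating methionine cleaved off
    return predicted == observed

# Hypothetical message: leader, AUG, four codons, two tandem stops, trailer
mrna = "GG" + "AUGGCAUCCAAAGUU" + "UAAUAG" + "CC"
print(translate_orf(mrna))                # MASKV
print(agrees_with_protein(mrna, "ASKV"))  # True
```

The `initiator_removed` option reflects the removal of the initiating methionine from the finished polypeptide, so the observed sequence starts at the second codon of the predicted one.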
Figure 10.31
Correlated nucleotide sequence of the coat protein gene of the RNA bacteriophage
MS2 and the amino acid sequence of the polypeptide (coat protein) that it
specifies. The initial sequences of the MS2 replicase (RNA polymerase) gene and the
correlated six amino-terminal amino acids are also shown. An untranslated
intergenic sequence separates the genes. Translation proceeds from left to right as
drawn, and from the coat gene to the replicase gene. Both polypeptides are
initiated by f-methionine at the AUG codons indicated. The methionine is cleaved
off following (or during) translation, yielding the alanine terminus on the coat
protein and the serine terminus on the replicase. The coat protein gene has two
tandem periods (two termination codons) at the end, as though to make
absolutely certain that translation terminates at this point. In addition, a third
termination codon is located, in proper reading frame, seven base triplets from the
second tandem termination codon. Each of the three termination codons is present once
between the translated sequence of the coat gene and the translated sequence of the
replicase gene. Note that the amino acid sequence of this protein, synthesized in
vivo, is precisely that predicted from the nucleotide sequence using the codon
assignments presented in Table 10.1. (Data from W. Min Jou, G. Haegeman, M.
Ysebaert, and W. Fiers, Nature 237: 82-88, 1972.)
Degeneracy and Wobble
All the amino acids except methionine and tryptophan are specified by more than one
codon (Table 10.1). Three amino acids, leucine, serine, and arginine, are each
specified by six different codons. Isoleucine has three codons. The other amino
acids each have either two or four codons. The occurrence of more than one codon
per amino acid is called degeneracy (though the usual connotations of the term are
hardly appropriate). The degeneracy in the genetic code is not at random; instead, it is
highly ordered. Usually, the
