THE GENETIC CODE

As it became evident that genes controlled the structure of polypeptides, attention focused on how
the sequence of the four base-pairs in DNA could control the sequence of the 20 amino acids found
in proteins. With the discovery of the mRNA intermediary, the question became one of how the
sequence of the four bases present in mRNA molecules could specify the amino acid sequence of a
polypeptide. What is the nature of the genetic code relating mRNA base sequences (or DNA base-pair
sequences) to amino acid sequences? Clearly, the symbols or "letters" used in the code must be
the bases; but what comprises a codon, the unit or "word" specifying one amino acid (or, actually,
one aminoacyl-tRNA complex)?

Three Nucleotides per Codon

Twenty different amino acids are incorporated during translation. Thus, at least 20 different
codons must be formed using the four symbols (bases) available in the "message" (mRNA). Two bases
per codon would yield only 4^2 or 16 possible codons, clearly not enough. Three bases per codon
yield 4^3 or 64 possible codons, an apparent excess.

The first strong evidence that the genetic code was in fact a triplet code (three nucleotides per
codon) resulted from a genetic analysis of proflavin-induced mutations in the rII locus of phage T4
carried out by F. H. C. Crick and colleagues in 1961. Crick and colleagues isolated
proflavin-induced revertants of a
proflavin-induced mutation. (Proflavin, an acridine dye, induces single base-pair additions and
deletions; see Chapter 11, pp. 310-311.) These revertants were shown (by backcrosses to wild type;
see Fig. 11.9) to result from the occurrence of suppressor mutations rather than from back-mutation
at the original site of mutation. Crick and colleagues reasoned that if the original mutation was a
single base-pair addition or deletion, then the suppressor mutations must be single base-pair
deletions or additions, respectively, occurring at a site or sites near the original mutation. A
single base-pair addition or deletion will alter the reading frame of the gene and mRNA (the codons
in phase during translation) for that portion of the gene distal to the mutation (relative to the
direction of translation). This is illustrated in Fig. 10.30a. When the suppressor mutations were
isolated as single mutants by screening progeny of backcrosses to wild type, they were found to
produce mutant phenotypes, just like the original mutation. Crick and colleagues next isolated
proflavin-induced suppressor mutations of the original suppressor mutations, and so on. All the isolated
mutations were then classified into two groups, plus (+) and minus (-) (for additions and deletions,
although Crick et al. had no idea which group was which), using the reasoning that a (+) mutation
would suppress a (-) mutation, but not another (+) mutation, and vice versa (see Fig. 10.30 and
legend for additional details). Next, Crick et al. constructed recombinants that carried various
combinations of the (+) and the (-) mutations. Recombinants with two (+) mutations or two (-)
mutations always had mutant phenotypes, just like the single mutants. Recombinants carrying three
(+) mutations (Fig. 10.30b) or three (-) mutations, however, often had wild-type phenotypes. This
indicated that the addition of three base-pairs or the deletion of three base-pairs left the distal
portion of the gene with the correct (wild-type) reading frame, a result that would be expected only
if each codon contained three nucleotides.

Confirmation that the coding ratio (nucleotides to amino acids) is indeed three has come from many
sources. Considerable evidence favoring a triplet code emerged from studies using in vitro
translation systems. The following observations were of major
importance. (1) Trinucleotides were found sufficient to stimulate specific binding of aminoacyl-tRNAs
to ribosomes. For example, 5'-UUC-3' stimulates ribosomal binding of phenylalanyl-tRNAPhe. (2)
Chemically synthesized RNA molecules containing repeating dinucleotide sequences directed the
synthesis of copolymers with alternating amino acid sequences. Poly (UG)n, for example, when used
as an artificial mRNA in an in vitro system, directed the synthesis of the repeating copolymer
(cys-val)n. (3) Molecules with repeating trinucleotide sequences, on the other hand, directed the
synthesis of a mixture of three homopolymers (initiation being random on such an mRNA in an in vitro
system). Poly (UCG)n, for example, directed the synthesis of a mixture of polyserine, polyarginine,
and polyvaline. Again, these results are consistent only with a triplet code. Ultimately, the
triplet nature of the code was definitively established by the results of correlated nucleic acid
and protein sequencing (e.g., see Fig. 10.31 and Chapter 12, Fig. 12.26).
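
The reading-frame arguments in this section can be checked with a few lines of Python. The
following is only a minimal sketch under stated assumptions: the wild-type message (eight copies
of CAU), the inserted base G, the mutation positions, and all function names are hypothetical
choices made for illustration, and the small codon dictionary holds just the standard assignments
needed for the two repeating polymers. The repeat UCG is used as the trinucleotide example because
its three frames (UCG, CGU, GUC) give serine, arginine, and valine.

```python
from itertools import product

BASES = "UCAG"

def count_codons(n):
    """Number of distinct codons of length n built from four bases (4**n)."""
    return len(list(product(BASES, repeat=n)))

def codons(rna, frame=0):
    """Split an RNA string into complete triplets, starting at offset `frame`."""
    return [rna[i:i + 3] for i in range(frame, len(rna) - 2, 3)]

def insert(rna, pos, base="G"):
    """A (+) mutation: add one base before index `pos`."""
    return rna[:pos] + base + rna[pos:]

def delete(rna, pos):
    """A (-) mutation: remove the base at index `pos`."""
    return rna[:pos] + rna[pos + 1:]

# Hypothetical wild-type message: one codon repeated, so that any shift of
# the reading frame is easy to see.
WILD = "CAU" * 8

def distal_in_frame(mutant, n=4):
    """True if the last n codons of the mutant are the wild-type codon CAU."""
    return codons(mutant)[-n:] == ["CAU"] * n

plus = insert(WILD, 4)                  # one (+): distal frame shifted
plus_minus = delete(plus, 10)           # (+) suppressed by a nearby (-)
double_plus = insert(plus, 9)           # two (+): still shifted
triple_plus = insert(double_plus, 14)   # three (+): frame restored

# Standard assignments for the codons found in poly(UG) and poly(UCG) only.
AMINO = {"UGU": "Cys", "GUG": "Val", "UCG": "Ser", "CGU": "Arg", "GUC": "Val"}

def translate(rna, frame=0):
    """Translate in the given frame using the small AMINO table above."""
    return "-".join(AMINO[c] for c in codons(rna, frame))

print("codons with 2 bases:", count_codons(2), "| with 3 bases:", count_codons(3))
for name, seq in [("+", plus), ("+ -", plus_minus),
                  ("+ +", double_plus), ("+ + +", triple_plus)]:
    print(f"{name:6s}", " ".join(codons(seq)), "| distal frame restored:", distal_in_frame(seq))
print("poly(UG), frame 0:", translate("UG" * 9))
for frame in range(3):
    print(f"poly(UCG), frame {frame}:", translate("UCG" * 6, frame))
```

In this toy message the single (+) and the double (+) mutants leave every distal codon out of
frame, whereas the (+)(-) and triple (+) combinations alter only the codons between the mutation
sites. A dinucleotide repeat reads as two alternating codons in every frame, and a trinucleotide
repeat reads as a single codon per frame, which is why three homopolymers are expected.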

Deciphering the Code

The deciphering of the genetic code, that is, determining (1) which codons specify which amino
acids, (2) how many of the 64 possible codons are used, (3) how the code is punctuated, and (4)
whether different species use the same or different codons, took place during the early 1960s and
was one of the most exciting periods in the history of science. The "cracking" of the genetic code
had an effect on the life sciences much as the splitting of the atom did on the physical sciences.
It opened up a vast new field of study of gene expression.

The first major breakthrough came in 1961 when M. W. Nirenberg (1968 Nobel Prize recipient) and
J. H. Matthaei, and then S. Ochoa (1959 Nobel Prize recipient) and coworkers, demonstrated that
synthetic RNA molecules could be used as artificial mRNAs to direct in vitro protein synthesis.
That is, when ribosomes, aminoacyl-tRNAs, and the soluble protein factors required for translation
are purified free of natural mRNAs, these components can be combined in vitro and stimulated to
synthesize polypeptides by the addition of chemically synthesized RNA molecules. If these synthetic
mRNA molecules are of known composition, the composition of the polypeptides synthesized can be
used to deduce which
codons specify which amino acids. The first codon assignment (UUU for phenylalanine) was made
when Nirenberg and Matthaei demonstrated that polyuridylic acid [poly U, (U)n] directed the
synthesis of polyphenylalanine [(phenylalanine)n]. Ochoa and others continued this approach using
synthetic RNAs with random sequences of known nucleotide composition, such as 50 percent U and
50 percent G. The frequencies of the different triplets in such a random copolymer can be easily
calculated. For example, the 50 percent U/50 percent G copolymer will contain 12.5 percent
(1/2 × 1/2 × 1/2) of each of the eight possible codons: UUU, UUG, UGU, GUU, UGG, GUG, GGU, and GGG.
These frequencies can then be compared with those of the amino acids incorporated (phenylalanine,
leucine, cysteine, valine, tryptophan, and glycine) when this random copolymer is used in an in
vitro protein-synthesizing system. By varying the composition, for example, to 75 percent U and
25 percent G, one can vary the relative frequencies of the eight codons and correlate them with the
relative frequencies of the amino acids in the polypeptides synthesized. Such experiments provided
a great deal of information about the nature of the code.
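
The triplet frequencies for a random copolymer can be worked out as in the sketch below. It
assumes that each base is incorporated independently with a probability equal to its fraction in
the mixture, which is an idealization of a real synthesis; the function and variable names are
hypothetical. The amino acid assignments for the eight U/G codons are the standard ones.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Standard assignments for the eight codons made of U and G only.
CODON_TO_AA = {
    "UUU": "Phe", "UUG": "Leu", "UGU": "Cys", "GUU": "Val",
    "UGG": "Trp", "GUG": "Val", "GGU": "Gly", "GGG": "Gly",
}

def codon_frequencies(composition):
    """Expected frequency of each triplet in a random copolymer.

    `composition` maps each base to its fraction of the mixture. Bases are
    assumed independent, so a triplet's frequency is the product of the
    fractions of its three bases.
    """
    freqs = {}
    for triplet in product(composition, repeat=3):
        p = Fraction(1)
        for base in triplet:
            p *= composition[base]
        freqs["".join(triplet)] = p
    return freqs

def amino_acid_frequencies(composition):
    """Sum the codon frequencies over the codons for each amino acid."""
    totals = defaultdict(Fraction)
    for codon, p in codon_frequencies(composition).items():
        totals[CODON_TO_AA[codon]] += p
    return dict(totals)

equal = {"U": Fraction(1, 2), "G": Fraction(1, 2)}
u_rich = {"U": Fraction(3, 4), "G": Fraction(1, 4)}

for label, mix in [("50% U / 50% G", equal), ("75% U / 25% G", u_rich)]:
    print(label)
    for codon, p in sorted(codon_frequencies(mix).items()):
        print(f"  {codon}  {float(p):.4f}  {CODON_TO_AA[codon]}")
    print("  amino acids:", {aa: str(p) for aa, p in amino_acid_frequencies(mix).items()})
```

With equal amounts of U and G every codon has frequency 1/8. With 75 percent U, UUU rises to
27/64 while GGG falls to 1/64, so the amino acids that increase with U content can be matched to
the U-rich codons.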

More definitive data were later obtained by H. G. Khorana using in vitro systems that were activated
by synthetic mRNAs of known nucleotide sequences. Khorana's experiments permitted direct
comparisons between nucleotide sequences and the amino acids incorporated in response to these
sequences. The ultimate "cracking" of the code occurred when trinucleotides were found to
function as "mini-mRNAs" in directing the specific binding of aminoacyl-tRNAs to ribosomes. By
using all 64 possible trinucleotide sequences in such aminoacyl-tRNA binding experiments, it
was possible to verify the codon assignments made from data of earlier experiments. On the basis of
extensive data accumulated over several years, the codon assignments shown in Table 10.1 became
firmly established.

Two important questions remained to be answered. (1) Are the assignments based on in vitro
experiments valid in vivo? (2) Is the code universal, that is, do the codons specify the same
amino acids in all organisms? Several lines of evidence now indicate that these codon
assignments are correct for protein synthesis in vivo for most, if not all, species. When the amino
acid substitutions that result from mutations induced with chemical mutagens with specific
mutagenic effects (see Chapter 11) are determined by amino acid sequencing, the substitutions are
almost always consistent with the codon assignments given in Table 10.1 and the known effect of the
mutagen. More convincingly, when the nucleotide sequences of genes or of mRNAs are
determined and compared with the amino acid sequences of the polypeptides encoded by those
genes or mRNAs, the observed correlations are always found to be those predicted from the
accepted codon assignments (Table 10.1). This can be illustrated by comparing the nucleotide
sequence of the gene coding for the protein coat or capsid of bacteriophage MS2 with the amino
acid sequence of the capsid polypeptide (Fig. 10.31). Phage MS2 stores its genetic information in
RNA (like TMV; see Chapter 5, pp. 96-97). Its chromosome is equivalent to an mRNA molecule.

Degeneracy and Wobble

All the amino acids except methionine and tryptophan are specified by
more than one codon (Table 10.1). Three amino acids, leucine, serine, and arginine, are each
specified by six different codons. Isoleucine has three codons. The other amino acids each have
either two or four codons. The occurrence of more than one codon per amino acid is called
degeneracy (though the usual connotations of the term are hardly appropriate). The degeneracy in
the genetic code is not random; instead, it is highly ordered. Usually, the multiple codons
specifying an amino acid differ by only one base, the third or 3' base of the codon. The degeneracy
is primarily of two types. (1) Partial degeneracy occurs when the third base may be either one of
the two pyrimidines (U and C) or, alternatively, either one of the two purines (A and G). With
partial degeneracy, changing the third base from a purine to a pyrimidine, or vice versa, will
change the amino acid specified by the codon. (2) In the case of complete degeneracy, any of the
four bases may be present at the third position in the codon, and the codon will still specify the
same amino acid. For example, valine is specified by GUU, GUC, GUA, and GUG (Table 10.1).

It has been
speculated that the order in the genetic code has evolved as a way of minimizing mutational
lethality. Many base substitutions at the third position of codons do not change the amino acid
specified by the codon. Moreover, amino acids with similar chemical properties (such as leucine,
isoleucine, and valine) have codons that differ from each other by only one base. Thus, many single
base-pair substitutions will result in the substitution of one amino acid for another amino acid
with very similar chemical properties (e.g., valine for isoleucine). In most cases, such
substitutions will not result in inactive gene products; again, this minimizes the effects of
mutations.

Because of the degeneracy of the genetic code, there must either be several different tRNAs that
recognize the different codons specifying a given amino acid, or the anticodon of a given tRNA must
be able to base-pair with several different codons. Actually, both of these occur. Several tRNAs
exist for certain amino acids, and some tRNAs recognize more than one codon. The hydrogen bonding
between the bases in the anticodon of tRNA and the codon of mRNA appears to follow strict
base-pairing rules (i.e., be "tight") only for the first two bases of the codon. The base-pairing
involving the third base of the codon is apparently less stringent, allowing what Crick has called
wobble at this site. On the basis of molecular distances and steric (three-dimensional structure)
considerations, Crick proposed that wobble would allow several types, but not all types, of
base-pairing at the third codon base in the codon-anticodon interaction. His proposal has since
been strongly supported by experimental data.
Table 10.2 shows the base-pairing predicted by the wobble hypothesis. It necessitates that there be
at least two tRNAs for each amino acid whose codons exhibit complete degeneracy at the third
position. This has indeed been found to be true. The wobble hypothesis predicted the occurrence of
three tRNAs for the six serine codons. Three serine tRNAs have been characterized (the anticodons
are written here 3' to 5', so that they align with the codons): (1) tRNASer1 (anticodon AGG) binds
to codons UCU and UCC, (2) tRNASer2 (anticodon AGU) binds to codons UCA and UCG, and (3) tRNASer3
(anticodon UCG) binds to codons AGU and AGC. These specificities were verified by the
trinucleotide-stimulated binding of purified aminoacyl-tRNAs to ribosomes in vitro.

Finally, several tRNAs contain the base inosine (produced by posttranscriptional enzymatic
modification). Crick's wobble hypothesis predicted that inosine could pair (at the wobble position)
with adenine, uracil, or cytosine (in the codon). In fact, purified alanyl-tRNA containing inosine
(I) at the 5' position of the anticodon (Fig. 10.15) binds to ribosomes activated with GCU, GCC, or
GCA trinucleotides. The same result has been obtained with other purified tRNAs with inosine at the
5' position of the anticodon. The wobble hypothesis thus fits several observations; whether it is
entirely accurate remains unknown.
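
The degeneracy counts and the wobble predictions discussed above can be reproduced with the sketch
below. Several points are assumptions made for illustration: the 64-letter string is the standard
code in one-letter amino acid symbols (codons ordered UUU, UUC, UUA, UUG, UCU, and so on, with "*"
marking termination), the wobble dictionary encodes Crick's pairing rules as they are commonly
tabulated (Table 10.2 itself is not reproduced here), the function names are hypothetical, and
anticodons are passed 5' to 3', the reverse of the order used for the serine tRNAs in the text.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code; "*" marks the three termination codons.
AA = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AA)}

# Number of codons per amino acid (termination codons excluded).
DEGENERACY = Counter(aa for aa in CODE.values() if aa != "*")

# Wobble rules: base at the 5' end of the anticodon -> bases it can pair
# with at the third (3') position of the codon. "I" is inosine.
WOBBLE = {"G": "UC", "C": "G", "A": "U", "U": "AG", "I": "UCA"}
# Strict pairing used for the first two codon positions.
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

def codons_read(anticodon_5to3):
    """Codons (5' to 3') recognized by an anticodon written 5' to 3'."""
    wobble, middle, last = anticodon_5to3
    # Pairing is antiparallel: codon base 1 pairs with the anticodon's 3'
    # base, codon base 2 with its middle base, codon base 3 with its 5' base.
    return sorted(PAIR[last] + PAIR[middle] + third for third in WOBBLE[wobble])

print("codons per amino acid:", dict(sorted(DEGENERACY.items())))
# Serine anticodons from the text (AGG, AGU, UCG written 3' to 5') are
# reversed here, plus the inosine-containing alanine anticodon IGC.
for anticodon in ("GGA", "UGA", "GCU", "IGC"):
    read = codons_read(anticodon)
    print(f"5'-{anticodon}-3' reads", read, "->", {CODE[c] for c in read})
```

Under these rules each anticodon reads only codons for a single amino acid, and no single
anticodon covers all four codons of a completely degenerate set, which is why at least two tRNAs
are needed for such amino acids.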

Initiation and Termination Codons

The genetic code also provides for punctuation of genetic information at the level of translation.
Three codons, UAA, UAG, and UGA, specify polypeptide chain termination. These codons are recognized
by protein release factors, rather than by tRNAs. One of these proteins, designated RF-1, is
apparently specific for UAA and UAG. The other, RF-2, causes termination at UAA and UGA codons.
Two codons, AUG and GUG, are recognized by the initiator tRNA, tRNAfMet, but apparently only when
they follow an appropriate nucleotide sequence in the leader segment of an mRNA molecule. At
internal positions, AUG is recognized by tRNAMet and GUG is recognized by a valine tRNA. In the
case of the initiation codons AUG and GUG and tRNAfMet, the wobble base appears to be the first or
5' base of the codon. Since wobble at the first base is unique to initiation, it may be related to
base-pairing at the P site rather than at the A site on the ribosome.
Universality of the Code

A vast amount of data is now available from in vitro studies, from amino acid replacements due to
mutations, and from correlated nucleic acid and polypeptide sequencing, all suggesting that the
genetic code is the same or very nearly the same in all organisms.

These data (e.g., see the human hemoglobin substitutions, Chapter 11, Fig. 11.8; the correlated
nucleotide and amino acid sequences in the overlapping genes of the DNA bacteriophage φX174,
Chapter 12, Fig. 12.26; and those of the RNA bacteriophage MS2, Fig. 10.31) all indicate that the
genetic code is largely universal. The major exception to the universality of the code occurs in
mitochondria of humans, yeast, and several other species, where UGA is a tryptophan codon. UGA is a
termination codon in nonmitochondrial systems. Also, in yeast mitochondria, CUA specifies
threonine instead of the usual leucine, and, in mammalian mitochondria, AUA specifies methionine
instead of the usual isoleucine. Excluding these and a few related exceptions, the code appears to
be universal.
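
One way to picture these exceptions is as a small set of overrides applied to the standard code,
as in the sketch below. It models only the reassignments named above and not the "few related
exceptions"; the test message, the dictionary keys, and the function names are hypothetical, and
the 64-letter string is the standard code in one-letter symbols ("*" marks termination).

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... GGG.
AA = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AA)}

# Only the reassignments named in the text (W = Trp, T = Thr, M = Met).
MITO_VARIANTS = {
    "yeast mitochondria": {"UGA": "W", "CUA": "T"},
    "human mitochondria": {"UGA": "W", "AUA": "M"},
}

def code_for(system="standard"):
    """Return the standard code with any overrides for `system` applied."""
    code = dict(STANDARD)
    code.update(MITO_VARIANTS.get(system, {}))
    return code

def translate(rna, system="standard"):
    """Translate from the first base, stopping at a termination codon."""
    code = code_for(system)
    protein = []
    for i in range(0, len(rna) - 2, 3):
        aa = code[rna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

# Hypothetical message containing each codon whose meaning differs.
message = "AUGAUAUGACUAUAA"   # AUG AUA UGA CUA UAA
for system in ("standard", "yeast mitochondria", "human mitochondria"):
    print(f"{system:20s}", translate(message, system))
```

The same message is cut short at UGA under the standard code but is read through in both
mitochondrial systems, and the two mitochondrial products differ at the AUA and CUA codons.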