272 Chapter 8 Gene Expression: The Flow of Information from DNA to RNA to Protein
that most such amino acids arise when proteins undergo chemical modification after their synthesis. By contrast, amino acids present in most proteins made the list. The question then became: How can four nucleotides encode 20 amino acids?

Figure 8.2 The genetic code: 61 codons represent the 20 amino acids, while 3 codons signify stop. To read the code, find the first letter in the left column, the second letter along the top, and the third letter in the right column; this reading corresponds to the 5′-to-3′ direction along the mRNA.

First    Second letter                                 Third
letter   U          C          A          G            letter
U        UUU Phe    UCU Ser    UAU Tyr    UGU Cys      U
U        UUC Phe    UCC Ser    UAC Tyr    UGC Cys      C
U        UUA Leu    UCA Ser    UAA Stop   UGA Stop     A
U        UUG Leu    UCG Ser    UAG Stop   UGG Trp      G
C        CUU Leu    CCU Pro    CAU His    CGU Arg      U
C        CUC Leu    CCC Pro    CAC His    CGC Arg      C
C        CUA Leu    CCA Pro    CAA Gln    CGA Arg      A
C        CUG Leu    CCG Pro    CAG Gln    CGG Arg      G
A        AUU Ile    ACU Thr    AAU Asn    AGU Ser      U
A        AUC Ile    ACC Thr    AAC Asn    AGC Ser      C
A        AUA Ile    ACA Thr    AAA Lys    AGA Arg      A
A        AUG Met    ACG Thr    AAG Lys    AGG Arg      G
G        GUU Val    GCU Ala    GAU Asp    GGU Gly      U
G        GUC Val    GCC Ala    GAC Asp    GGC Gly      C
G        GUA Val    GCA Ala    GAA Glu    GGA Gly      A
G        GUG Val    GCG Ala    GAG Glu    GGG Gly      G

Like the Morse code, the four nucleotides encode 20 amino acids through specific groupings of A, G, C, and T (in DNA) or A, G, C, and U (in RNA). Researchers initially arrived at the number of letters per grouping by deductive reasoning and later confirmed their guess by experiment. They reasoned that if only one nucleotide represented an amino acid, information would exist for only four amino acids: A would encode one amino acid, G a second amino acid, and so on. If two nucleotides represented each amino acid, 4² = 16 possible combinations of doublets would be possible.

Of course, if the code consisted of groups containing one or two nucleotides, it would have 4 + 16 = 20 groups and could account for all the amino acids, but nothing would be left over to signify the pause required to denote where one group ends and the next begins. Groups of three nucleotides in a row would provide 4³ = 64 different triplet combinations, more than enough to code for all the amino acids. If the code consisted of doublets and triplets, a signal denoting a pause would once again be necessary. But a triplets-only code would require no symbol for pause if the mechanism for counting to three and distinguishing among successive triplets was very reliable.
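
The counting argument above, and the tally given in the Figure 8.2 caption, can be checked with a short Python sketch. It is illustrative only: the names `RNA_BASES`, `count_groupings`, `AMINO_ACIDS`, and `CODON_TABLE` are hypothetical and do not come from the text, and the one-letter amino acid symbols (with `*` for stop) are a common shorthand assumed here in place of the three-letter abbreviations used in the figure.

```python
from itertools import product

# Base order assumed to match the rows and columns of Figure 8.2.
RNA_BASES = "UCAG"


def count_groupings(group_size: int) -> int:
    """Count the distinct ordered groupings of the four bases of a given size."""
    return sum(1 for _ in product(RNA_BASES, repeat=group_size))


# One-letter symbols for the 64 triplets, listed with the first letter varying
# slowest and the third letter fastest (UUU, UUC, UUA, UUG, UCU, ...).
# "*" marks a stop codon. The one-letter shorthand is an assumption of this sketch.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)

# Map each triplet, written as an RNA string, to its one-letter symbol.
CODON_TABLE = {
    "".join(triplet): symbol
    for triplet, symbol in zip(product(RNA_BASES, repeat=3), AMINO_ACIDS)
}

if __name__ == "__main__":
    for size in (1, 2, 3):
        print(f"groups of {size}: {count_groupings(size)} combinations")

    stop_codons = sorted(c for c, s in CODON_TABLE.items() if s == "*")
    sense_codons = len(CODON_TABLE) - len(stop_codons)
    amino_acids = set(AMINO_ACIDS) - {"*"}
    print(f"sense codons: {sense_codons}, stop codons: {stop_codons}")
    print(f"distinct amino acids: {len(amino_acids)}")
```

Singlets and doublets together give exactly 4 + 16 = 20 groupings, which leaves nothing spare for punctuation, whereas the 64 triplets leave room for three stop signals and for several codons per amino acid.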
Although this kind of reasoning generates a hypothesis, it does not prove it. As it turned out, however, the experiments described later in this chapter did indeed demonstrate that groups of three nucleotides represent all 20 amino acids. Each nucleotide triplet is called a codon. Each codon, designated by the bases defining its three nucleotides, specifies one amino acid. For example, GAA is a codon for glutamic acid (Glu), and GUU is a codon for valine (Val). Because the code comes into play only during the translation part of gene expression, that is, during the decoding of messenger RNA to polypeptide, geneticists usually present the code in the RNA dialect of A, G, C, and U, as depicted in Fig. 8.2. When speaking of genes, they can substitute T for U to show the same code in the DNA dialect.

If you knew the sequence of nucleotides in a gene or its transcript as well as the sequence of amino acids in the corresponding polypeptide, you could then deduce the genetic code without understanding how the underlying cellular machinery actually works. Although techniques for determining both nucleotide and amino acid sequence are available today, this was not true when researchers were trying to crack the genetic code in the 1950s and 1960s. At that time, they could establish a polypeptide’s amino acid sequence, but not the nucleotide sequence of DNA or RNA. Because of their inability to read nucleotide sequence, scientists used an assortment of genetic and biochemical techniques to fathom the code. They began by examining how different mutations in a single gene affected the amino acid sequence of the gene’s polypeptide product. In this way, they were able to use the abnormal (specific mutations) to understand the normal (the general relationship between genes and polypeptides).

A Gene’s Nucleotide Sequence Is Colinear with the Amino Acid Sequence of the Encoded Polypeptide

As you know, DNA is a linear molecule with base pairs following one another down the intertwined chains. Proteins, by contrast, have complicated three-dimensional structures. Even so, if unfolded and stretched out from N terminus to C terminus, proteins have a one-dimensional, linear structure: a specific primary sequence of amino acids. If the information in a gene and its corresponding protein are colinear, the consecutive order of bases in the DNA from the beginning to the end of the gene would stipulate the consecutive order of amino acids from one end to the other of the outstretched protein.

In the 1960s, Charles Yanofsky was the first to compare maps of mutations within a gene to the particular amino acid substitutions that resulted. He began by generating a