Page 113 - Genetics_From_Genes_to_Genomes_6th_FULL_Part2

272    Chapter 8    Gene Expression: The Flow of Information from DNA to RNA to Protein


that most such amino acids arise when proteins undergo chemical modification after their synthesis. By contrast, amino acids present in most proteins made the list. The question then became: How can four nucleotides encode 20 amino acids?

Like the Morse code, the four nucleotides encode 20 amino acids through specific groupings of A, G, C, and T (in DNA) or A, G, C, and U (in RNA). Researchers initially arrived at the number of letters per grouping by deductive reasoning and later confirmed their guess by experiment. They reasoned that if only one nucleotide represented an amino acid, information would exist for only four amino acids: A would encode one amino acid, G a second amino acid, and so on. If two nucleotides represented each amino acid, 4² = 16 possible combinations of doublets would be possible.

Of course, if the code consisted of groups containing one or two nucleotides, it would have 4 + 16 = 20 groups and could account for all the amino acids, but nothing would be left over to signify the pause required to denote where one group ends and the next begins. Groups of three nucleotides in a row would provide 4³ = 64 different triplet combinations, more than enough to code for all the amino acids. If the code consisted of doublets and triplets, a signal denoting a pause would once again be necessary. But a triplets-only code would require no symbol for pause if the mechanism for counting to three and distinguishing among successive triplets was very reliable.

Figure 8.2  The genetic code: 61 codons represent the 20 amino acids, while 3 codons signify stop. To read the code, find the first letter in the left column, the second letter along the top, and the third letter in the right column; this reading corresponds to the 5′-to-3′ direction along the mRNA.

                          Second letter
First
letter    U           C           A           G           Third letter

U         UUU Phe     UCU Ser     UAU Tyr     UGU Cys     U
          UUC Phe     UCC Ser     UAC Tyr     UGC Cys     C
          UUA Leu     UCA Ser     UAA Stop    UGA Stop    A
          UUG Leu     UCG Ser     UAG Stop    UGG Trp     G

C         CUU Leu     CCU Pro     CAU His     CGU Arg     U
          CUC Leu     CCC Pro     CAC His     CGC Arg     C
          CUA Leu     CCA Pro     CAA Gln     CGA Arg     A
          CUG Leu     CCG Pro     CAG Gln     CGG Arg     G

A         AUU Ile     ACU Thr     AAU Asn     AGU Ser     U
          AUC Ile     ACC Thr     AAC Asn     AGC Ser     C
          AUA Ile     ACA Thr     AAA Lys     AGA Arg     A
          AUG Met     ACG Thr     AAG Lys     AGG Arg     G

G         GUU Val     GCU Ala     GAU Asp     GGU Gly     U
          GUC Val     GCC Ala     GAC Asp     GGC Gly     C
          GUA Val     GCA Ala     GAA Glu     GGA Gly     A
          GUG Val     GCG Ala     GAG Glu     GGG Gly     G
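
The counting argument and the layout of the codon table in Figure 8.2 can be checked with a short Python sketch. It is only an illustration under stated assumptions: the names `n_groupings`, `CODON_TABLE`, and `translate` are hypothetical helpers that do not come from the text, and the table is filled in from the standard genetic code, ordered U, C, A, G as in the figure.

```python
from itertools import product

BASES = "UCAG"  # same order as the rows and columns of the codon table


def n_groupings(n: int) -> int:
    """Number of distinct groupings of n nucleotides drawn from four bases."""
    return len(BASES) ** n


# Standard genetic code, one letter per codon, listed with the first base
# varying slowest and the third base varying fastest ('*' marks a stop codon).
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)

# Three-letter abbreviations, as used in the figure.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Stop",
}

# Build all 64 triplets and pair each with its meaning.
CODON_TABLE = {
    "".join(codon): THREE_LETTER[aa]
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(sequence: str) -> list[str]:
    """Read a sequence 5'-to-3' in successive, non-overlapping triplets.

    Accepts the RNA dialect (U) or the DNA dialect (T) and stops at the
    first stop codon. Trailing bases that do not fill a triplet are ignored.
    """
    rna = sequence.upper().replace("T", "U")
    peptide = []
    for i in range(0, len(rna) - 2, 3):
        residue = CODON_TABLE[rna[i:i + 3]]
        if residue == "Stop":
            break
        peptide.append(residue)
    return peptide


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, n_groupings(n))          # 4, 16, and 64 groupings
    print(CODON_TABLE["GAA"], CODON_TABLE["GUU"])
    print(translate("AUGGAAGUUUAA"))
```

Singlets and doublets together give 4 + 16 = 20 groupings, while triplets alone give 64, of which the table assigns 61 to amino acids and 3 to stop. Reading the made-up example sequence AUGGAAGUUUAA three bases at a time, with no separator between codons, yields Met, Glu, Val and then a stop.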
Although this kind of reasoning generates a hypothesis, it does not prove it. As it turned out, however, the experiments described later in this chapter did indeed demonstrate that groups of three nucleotides represent all 20 amino acids. Each nucleotide triplet is called a codon. Each codon, designated by the bases defining its three nucleotides, specifies one amino acid. For example, GAA is a codon for glutamic acid (Glu), and GUU is a codon for valine (Val). Because the code comes into play only during the translation part of gene expression, that is, during the decoding of messenger RNA to polypeptide, geneticists usually present the code in the RNA dialect of A, G, C, and U, as depicted in Fig. 8.2. When speaking of genes, they can substitute T for U to show the same code in the DNA dialect.

If you knew the sequence of nucleotides in a gene or its transcript as well as the sequence of amino acids in the corresponding polypeptide, you could then deduce the genetic code without understanding how the underlying cellular machinery actually works. Although techniques for determining both nucleotide and amino acid sequence are available today, this was not true when researchers were trying to crack the genetic code in the 1950s and 1960s. At that time, they could establish a polypeptide’s amino acid sequence, but not the nucleotide sequence of DNA or RNA. Because of their inability to read nucleotide sequence, scientists used an assortment of genetic and biochemical techniques to fathom the code. They began by examining how different mutations in a single gene affected the amino acid sequence of the gene’s polypeptide product. In this way, they were able to use the abnormal (specific mutations) to understand the normal (the general relationship between genes and polypeptides).

A Gene’s Nucleotide Sequence Is Colinear with the Amino Acid Sequence of the Encoded Polypeptide

As you know, DNA is a linear molecule with base pairs following one another down the intertwined chains. Proteins, by contrast, have complicated three-dimensional structures. Even so, if unfolded and stretched out from N terminus to C terminus, proteins have a one-dimensional, linear structure: a specific primary sequence of amino acids. If the information in a gene and its corresponding protein are colinear, the consecutive order of bases in the DNA from the beginning to the end of the gene would stipulate the consecutive order of amino acids from one end to the other of the outstretched protein.

In the 1960s, Charles Yanofsky was the first to compare maps of mutations within a gene to the particular amino acid substitutions that resulted. He began by generating a