Page 50 - Genetics_From_Genes_to_Genomes_6th_FULL_Part3

344    Chapter 10   Genome Annotation


Figure 10.3  Homology map for a 100 kb region of the human genome. Regions in black are homologous between the human genome and the genome of the indicated species. Most DNA sequences conserved between humans and zebrafish are found in protein-coding exons. Some sequences outside of the exons are also constrained evolutionarily, suggesting that they may play functional roles that are currently unknown. (UTR: untranslated region; CDS: protein-coding sequence.)

With a computerized genome visualization tool, it becomes possible to explore DNA sequence conservation directly along the genome, as well as across evolutionary time. An example of cross-species homology analysis is shown in Fig. 10.3 for a 100 kb region containing four genes. The bottom row of the figure displays the locations and exon/intron structures of the four genes in the human genome. Above this row are homology maps for three representative vertebrate species; highly conserved DNA sequences are indicated with dark lines or blocks.

As anticipated from the close relationship between the human and chimpanzee species, nearly complete conservation of human sequences exists across the entire region in the chimp genome. In other mammals, represented here by the mouse, conservation is also apparent across the entire region, but the pattern is choppy, indicating small regions of conservation interspersed with small, nonconserved regions.

As we move farther across the phylogenetic landscape to fish, we can distinguish sequences subject to evolutionary constraints more clearly from those that are not. Note in particular that large parts of the coding regions of three of the four genes are highly conserved in all the species examined (Fig. 10.3). This conservation suggests that the protein products of the three genes are crucial to the survival of all vertebrates. However, a homolog of the fourth gene is not found in zebrafish, indicating that its function is dispensable to fish. Regions of homology between the human and mouse or zebrafish genomes are much less frequent in introns, in the noncoding parts of exons (corresponding to the 5′ and 3′ UTRs of the genes), and in the spaces between genes.

Sequence conservation over long evolutionary periods, such as the time since humans last shared a common ancestor with mice or fish, therefore usually predicts the location of genes. However, exceptions do exist: Conserved DNA sequences are occasionally observed at locations outside of the coding regions. The fact that these features are so well conserved suggests strongly that they have a function that is subject to evolutionary constraints, even if in most cases we do not yet know what these functions may be. Scientists are actively exploring the potential roles of these conserved noncoding sequences; for example, some might represent enhancer elements (see Fig. 8.11) that help determine when and where nearby genes are transcribed into mRNA.

The Most Direct Method to Find Genes Is to Locate Transcribed Regions

Many genes encode proteins while some others, such as the genes for rRNAs and tRNAs, do not. However, all genes are transcribed into RNAs, even if some RNAs are not translated. If you knew the sequence of the RNA produced from a gene, it would be easy to find that gene in genomic DNA simply by looking for the DNA sequence complementary to the RNA. This approach in fact works well for RNAs that can be purified in large amounts, like rRNAs (which can be isolated from other RNAs because they form part of the ribosome).

In contrast, most mRNAs are so rare in cells that they cannot be purified readily. Moreover, although technologies for determining the nucleotide sequence of RNAs do exist, they are less widely available and much more difficult to perform than the methods available for sequencing DNA. As a result, the easiest way to study mRNAs is to copy them into DNA, to clone the resultant DNA molecules, and then to sequence these clones by the same methods already described for genomic DNA.

Making cDNA libraries

To produce DNA clones from mRNA sequences, researchers rely on a series of in vitro reactions that mimics part of the life cycle of viruses known as retroviruses. Retroviruses, which include among their ranks HIV, the virus that causes AIDS, carry their genetic information in molecules of RNA. As part of their gene-transmission kit, retroviruses also contain the unusual enzyme known as RNA-dependent DNA polymerase, or simply reverse transcriptase (review the Genetics and Society Box in Chapter 8 entitled HIV and Reverse Transcription). After infecting a cell, a retrovirus uses reverse transcriptase to copy its single strand of RNA into a strand of complementary DNA, often abbreviated as cDNA. The reverse transcriptase, which can also function as a DNA-dependent DNA polymerase, then makes a second strand of DNA complementary to this first cDNA strand (and equivalent in sequence to the original RNA template). Finally, this double-stranded DNA copy of the retroviral RNA chromosome integrates into the host cell’s genome. Although the designation cDNA originally meant a single strand of DNA complementary to an RNA molecule, it now refers to any DNA (single- or double-stranded) derived from an RNA template.

Let’s see how you could use reverse transcriptase to make cDNA copies of all the mRNAs that are transcribed in a particular cell type such as red blood cell precursors. You would first isolate by simple chemical means the total population of RNA molecules in these cells (Fig. 10.4a).
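
The two ideas in this section, copying an RNA into first- and second-strand cDNA and then locating the matching stretch of genomic DNA, can be illustrated with a minimal Python sketch. It is a toy model rather than a laboratory or bioinformatics protocol. The function names (reverse_transcribe, find_transcribed_region) and the short sequences are hypothetical, chosen only for illustration. The sketch assumes an exact, contiguous match, so it ignores introns, poly-A tails, primers, and sequencing errors; a spliced mRNA would not match genomic DNA in a single uninterrupted block.

```python
from typing import Optional, Tuple

# Base-pairing rules. All sequences are written 5' -> 3'.
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement_dna(dna: str) -> str:
    """Return the complementary DNA strand, written 5' -> 3'."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(dna))


def reverse_transcribe(mrna: str) -> Tuple[str, str]:
    """Model the two copying steps carried out by reverse transcriptase.

    Returns (first_strand, second_strand):
      first_strand  - DNA complementary to the RNA template
      second_strand - DNA complementary to the first strand, and so
                      equivalent to the RNA with T in place of U
    """
    mrna = mrna.upper()
    # Strands pair antiparallel, so read the template backwards.
    first_strand = "".join(RNA_TO_DNA_COMPLEMENT[b] for b in reversed(mrna))
    second_strand = reverse_complement_dna(first_strand)
    return first_strand, second_strand


def find_transcribed_region(genome: str, cdna: str) -> Optional[Tuple[str, int]]:
    """Look for an exact cDNA match on either strand of a genomic sequence.

    `genome` is one strand of the genomic DNA. Returns (strand, start),
    where strand is "+" if the cDNA matches the given strand or "-" if it
    matches the opposite strand, and start is the 0-based position of the
    match on the given strand. Returns None if there is no exact match.
    """
    genome = genome.upper()
    position = genome.find(cdna)
    if position != -1:
        return ("+", position)
    position = genome.find(reverse_complement_dna(cdna))
    if position != -1:
        return ("-", position)
    return None


if __name__ == "__main__":
    toy_mrna = "AUGGCCUAA"            # hypothetical transcript
    toy_genome = "GGGATGGCCTAACCC"    # hypothetical genomic fragment
    first, second = reverse_transcribe(toy_mrna)
    print("first-strand cDNA: ", first)
    print("second-strand cDNA:", second)
    print("genomic location:  ", find_transcribed_region(toy_genome, second))
```

Searching both strands matters because a gene can be transcribed from either strand of the chromosome. Real gene-finding tools align cDNA to genomic DNA while allowing for the gaps that introns introduce, which this exact-match search does not attempt.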