Page 61 - Genetics_From_Genes_to_Genomes_6th_FULL_Part3



and disseminating the data. A RefSeq need not be derived from a single individual, and it need not contain the most common genetic variants found in species members. Rather, it is simply an arbitrary, but well-characterized, example against which all newly obtained sequences from that species can be compared.

Visualizing genes and genomes

Several web-based programs have been developed that allow a user to examine visual representations of genome data. One such program is the UCSC Genome Browser (https://genome.ucsc.edu/) that visualizes RefSeq genes and their associated annotations, showing features such as exon/intron structure and the location of protein-coding regions. Fig. 10.3 showed an example of the Genome Browser output, focusing on a 100 kb region of the human genome containing four genes. The transcription units are indicated at the bottom of the figure with large blue arrows that depict the extent of the gene, the direction of transcription, and each gene’s exon/intron structure (exons represented as wider than the introns). Researchers can adjust their view of the browser to show many additional genomic features of interest, such as alternative splice variants, the location of repetitive DNA sequences, similarities with the genomes of other organisms, and the location of possible transcriptional regulatory elements.

BLAST Searches Automate the Finding of Homologous Sequences

Suppose that you have identified a gene, for example from the fruit fly Drosophila, that is of interest to you. You would like to know whether the human genome contains a homolog of this fly gene. One tool you could use is an NCBI program called BLAST (Basic Local Alignment Search Tool), which allows you to find nucleotide or amino acid sequences related to any given nucleotide or amino acid sequence. Figure 10.18 displays a typical output of a BLAST search, in this case looking for human proteins that share similarity with a Drosophila protein of interest. The Query is the sequence you already know; here, the amino acid sequence of the Drosophila protein written in the one-letter code. The Subject is the homologous sequence found by the BLAST program; in this case, the related human protein. The row between the Query and the Subject indicates the conserved amino acids, with a + symbol denoting conservative amino acid replacements (missense substitutions in which an amino acid is replaced by a different amino acid with similar chemical properties).

To appreciate the power of bioinformatics programs such as the Genome Browser and BLAST search tool, you really need to access and use them yourself. Problems 23 and 24 at the end of this chapter involve some simple exercises that will place a few of these vast genomic databases at your disposal.

Figure 10.18  Output from a BLAST search. The program was asked to find a human protein related to a protein in Drosophila. The Query shows part of the sequence of the fly protein (from amino acids 688–720); the Subject (Sbjct) indicates the corresponding amino acids in the human protein found by the search. Some of these amino acids are identical in the fly and human proteins. Positions marked with a plus (+) are conservative substitutions in which the substituted amino acids have similar chemical properties. At some positions the amino acids are very different, suggesting that the identities of these particular amino acids are not crucial to protein function.

 Query  688  GPLTASYK S EID  KH LIRA LFQ TDDW R AAIK  T QI  720
             GPL A++  S E+K  LIRA LFQ T++  R A  A+ +I
 Sbjct  583  GPLAAAFS S EVS  KA LIRA LFQ TNER R AA AL AKI  615

essential concepts

•  Bioinformatics applications that are freely accessible online provide gateways for the exploration of genomic data.
•  Genome browsers show the arrangement and structure of genes within RefSeq genomes.
•  A BLAST search allows rapid, automated matching of particular DNA or amino acid sequences across multiple species for analysis of evolutionary relationships.

10.4   A Comprehensive Example: The Hemoglobin Genes

learning objectives

1.  Discuss why it is advantageous for humans to produce different hemoglobins at different stages of development.
2.  Explain how the clustering of hemoglobin genes impacts the cellular strategy to regulate their expression.
3.  Predict the phenotypic severity of particular mutations in the α and β clusters.

The vivid red color of our blood arises from its life-sustaining ability to carry oxygen. This ability, in turn, derives from billions of red blood cells, each one packed with close to 280 million molecules of the protein pigment known as hemoglobin (Fig. 10.19a).

A normal adult hemoglobin molecule consists of four polypeptide chains, two alpha (α) and two beta (β) globins, each surrounding an iron-containing small molecular structure