Page 68 - Genetics_From_Genes_to_Genomes_6th_FULL_Part3
362 Chapter 10 Genome Annotation
c. The particular adapters discussed in this problem allow the cDNAs to be ligated efficiently into a vector treated with a commonly used restriction enzyme listed in Table 9.1. Name this restriction enzyme.

Section 10.2

12. Give two different reasons for the much higher ratio of total DNA to protein-encoding DNA in the human genome as compared to bacterial genomes.
13. Using a cDNA library, you isolated two different cDNA clones that have sequences indicating that they both correspond to mRNAs transcribed from the same nerve growth factor gene. The beginning and ending sequences of the clones are the same, but the middle sequence is different. How can you explain the different cDNAs?
14. The figure that follows shows part of a modified screen shot of part of the human genome as displayed on the UCSC Genome Browser. A through G map the sequences within individual cDNA clones to the genome sequence. Refer to the key in Fig. 10.3 if you need help in interpreting this diagram. Pay close attention to the vertical widths of the icons indicating exons.

Source: University of California Genome Project, https://genome.ucsc.edu

a. How many annotated genes do you think are present in this region of the human genome?
b. For all annotated genes in this region, indicate whether they are transcribed in the direction from centromere to telomere or from telomere to centromere.
c. How many promoters are suggested by the data? Approximately where are these promoters located?
d. How many different proteins are encoded by the DNA sequences in this region?
e. What is unusual about this region of the human genome?
15. In Problem 14, cDNAs F and G could not be found in cDNA libraries (from any tissue) prepared using the method shown in Fig. 10.4. The reason is that the corresponding transcripts do not have poly-A tails.
a. Why is the lack of poly-A tails not surprising in light of your answer to part (d) of Problem 14?
b. Why does the lack of poly-A tails present a difficulty for the method diagrammed in Fig. 10.4?
c. Outline how you might adjust the protocol in Fig. 10.4 so as to find the cDNAs F and G annotated in the Genome Browser.
16. Fig. 10.10 presents a model for exon shuffling in which chromosomal fragments broken into introns can be restitched together to make novel genes that did not exist before. However, only a certain fraction of events like those shown on the figure could actually produce genes encoding proteins with functional domains from both original polypeptides. What is this fraction?
17. An interesting phenomenon found in vertebrate DNA is the existence of pseudogenes, nonfunctional copies of a gene found elsewhere in the genome. Some pseudogenes appear to have originated as double-stranded DNA copies of mature mRNA inserted into the chromosome; these copies later underwent mutations to make them into pseudogenes.
a. What sequence information might provide clues that the original source of some of these pseudogenes is cDNA copied in cells from mRNA and then inserted into the genome?
b. Would this mechanism of generating pseudogenes be more likely to have operated if the pseudogene was part of a gene family clustered in one region of the genome, or if it was instead part of a gene family whose members are scattered around the genome? Explain.
18. a. If you found a zinc-finger domain (which facilitates DNA binding) in a newly identified gene, what kinds of hypotheses could you make about the gene’s function?
b. Suppose that this newly identified gene shares a high percentage of similarity throughout its length with a previously characterized gene in the same organism. What does this fact suggest about the origin of the two genes? Would you categorize these genes as being: (i) homologous, (ii) paralogous, or (iii) orthologous? (More than one answer may apply.)
19. You sequence the genomes of four different organisms and compare their sequences over a short region as shown below.
5′ AGGTATATAATTTGCG 3′
5′ CAATATAAAACCCTAC 3′
5′ GCGTATAAAAGAGCTA 3′
5′ TTATATATAAAGAAGT 3′
a. Determine the consensus sequence common to the four regions above.
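
One way to check a hand-derived consensus for the four sequences in Problem 19 is to tally the bases column by column. The Python sketch below is only an illustration and is not part of the original problem. It assumes the four sequences are already aligned position for position exactly as printed, with no gaps. The function names, the `min_fraction` parameter, and the use of `N` to mark a position with no agreed base are choices made here for the example.

```python
from collections import Counter

# The four sequences from Problem 19, written 5' to 3'.
# Assumption: they are aligned position for position, as printed.
sequences = [
    "AGGTATATAATTTGCG",
    "CAATATAAAACCCTAC",
    "GCGTATAAAAGAGCTA",
    "TTATATATAAAGAAGT",
]


def column_counts(seqs):
    """Count how often each base occurs at every aligned position."""
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must all have the same length")
    return [Counter(column) for column in zip(*seqs)]


def consensus(seqs, min_fraction=1.0):
    """Build a consensus string, one symbol per aligned position.

    A position gets its most common base if that base appears in at
    least min_fraction of the sequences; otherwise it is marked 'N'.
    """
    symbols = []
    for counts in column_counts(seqs):
        base, n = counts.most_common(1)[0]
        symbols.append(base if n / len(seqs) >= min_fraction else "N")
    return "".join(symbols)


print(consensus(sequences))
# prints NNNTATANAANNNNNN
```

With the default threshold, only positions where all four sequences agree receive a base. The single `N` inside the conserved block is a position where two sequences carry A and two carry T, which `column_counts(sequences)[7]` shows directly.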