Page 75 - Genetics_From_Genes_to_Genomes_6th_FULL_Part3

11.1 Variation Among Genomes   369


DNA in two different haploid human genomes and the chimp RefSeq genome. Two single-base changes have occurred in this small genomic region since the divergence of the two species. One is shared by all human genomes and is thus not polymorphic in humans. The second base change was from a C in the common chimp-human ancestor to a T (the derived allele) in a chromosome of the ancestor of some people but not others. This means that if you and a friend share a derived allele at an anonymous SNP locus, you both got that allele from the same ancestor who must have lived since the human and chimp lineages diverged from each other. The fact that every random pair of human beings on the planet shares many unlinked, derived SNP alleles indicates common ancestry for all people.

To date, the analysis of thousands of human genomes has led to the identification of more than 50 million SNPs that are catalogued in a SNP database (dbSNP) at the National Center for Biotechnology Information (NCBI). About 15 million of these SNPs are commonly found in human populations. The mutations giving rise to the derived allele of common SNPs must have occurred far enough back in human history to have been disseminated to a significant proportion of current-day people. The SNP database already includes a large fraction of all common SNPs in human populations. However, you should realize that mutational events in the very recent past also create rare SNPs that would be found in only one or a few people of the billions on the earth. Very few of these rare SNPs are yet accounted for in dbSNP because so few human genomes have been analyzed relative to the entire human population.

This brief discussion about the origin of SNPs suggests that genome sequencing provides powerful tools for understanding human ancestry. Chapter 21 presents some of the surprising findings about human history revealed by the analysis of DNA sequences from present-day humans and the fossilized remains of our primate ancestors.

Deletion-insertion polymorphisms (DIPs)
Short insertions or deletions of genetic material represent the second most common form of genetic variation in the human genome. These variants are referred to as deletion-insertion polymorphisms (DIPs) or InDels. While SNP loci occur about once per kilobase in the comparison of any two haploid genomes, DIPs are considerably rarer, occurring about once in every 10 kb of DNA (Table 11.1). DIPs range in length from one base pair to hundreds of base pairs, but their relative frequency declines steeply in relation to their length. As a result, DIPs involving only one or two nucleotides are the most common.

Several biochemical processes appear to contribute to the formation of DIPs. These include problems in DNA replication or recombination, and mistakes that occur when cells try to repair damage such as broken DNA strands. You should remember that in protein-coding regions, DIP variants act as frameshift mutations unless the number of nucleotide pairs inserted or deleted is 3 or a multiple of 3.

Simple sequence repeats (SSRs)
The genomes of humans and higher eukaryotes are loaded with loci defined by simple sequence repeats (SSRs), sometimes also termed microsatellites. SSR loci consist of sequences of one to a few bases that are repeated in tandem less than 10 to more than 100 times. Different alleles of an SSR locus have different numbers of repeating units. The most common repeating units are one-, two-, or three-base sequences. SSRs with larger repeating units are less frequent, and we employ here a relatively arbitrary cutoff in which the largest repeating unit of an SSR has 10 bases (Table 11.1; those with larger repeating units will be classified as CNVs below). Examples of SSRs are AAAAAAAAAAAAAAA (a one-base repeat) or CACACACACACACACACACACA (a two-base repeating unit). SSRs of all types together account for about 3% of the total DNA in the human genome; an SSR locus can be found on average once in every 30 kb of human DNA.

As with all other polymorphisms, most SSRs occur outside the coding regions of genes and have no effect on phenotype. In contrast, SSR variations within genes can have profound phenotypic consequences. For example, we have already discussed in Chapter 7 (review the Fast Forward Box entitled Trinucleotide Repeat Diseases: Huntington Disease and Fragile X Syndrome) that long tracts of trinucleotide repeats are the molecular cause of several severe neurological conditions, including fragile X syndrome and Huntington disease.

SSRs arise spontaneously from rare, random events that initially produce a short repeated sequence with four to five repeat units. Once a short SSR mutates into existence, however, it can expand into a longer sequence by a form of faulty DNA replication called slipped mispairing or stuttering. Figure 7.12 showed in detail how this stuttering mechanism can change the number of repeat units at the SSR locus responsible for Huntington disease.

Because of events such as slipped mispairing, new alleles arise at SSR loci at an average rate of 10⁻³ per locus per gamete (that is, one in every thousand gametes). This frequency is much greater than the single nucleotide mutation rate of 10⁻⁹ and results in a large amount of SSR variation among unrelated individuals within a population. Unlike SNPs, which are biallelic and do not change after the mutational event that gave rise to them, SSRs are therefore highly polymorphic in the number of repeats they carry, often with 10 or more alleles distinguishable at a single SSR locus. The rate of SSR mutation is nonetheless low enough that changes usually do not occur within a few generations of even a large family. SSRs can thus serve as relatively stable, highly polymorphic DNA markers in linkage studies of many organisms, including humans.
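
To make the idea of an SSR allele concrete, the following Python sketch counts the number of tandem repeat units in a sequence and works through the arithmetic behind the quoted SSR mutation rate. It is an illustration only, not a method described in the text: the function names (longest_tandem_run, expected_new_alleles) and the flanking bases in the usage example are hypothetical, and the code assumes a simple model in which an allele is identified by the longest uninterrupted run of one repeat unit.

```python
import re

# Rates as quoted in the text. The SSR rate is per locus per gamete;
# the text gives the single nucleotide rate without stating its units.
SSR_MUTATION_RATE = 1e-3
SINGLE_NUCLEOTIDE_MUTATION_RATE = 1e-9


def longest_tandem_run(sequence: str, unit: str) -> int:
    """Return the largest number of back-to-back copies of `unit` in `sequence`.

    Under the simplifying assumption used here, this count stands in for
    the allele at an SSR locus (alleles differ in number of repeat units).
    """
    if not unit:
        raise ValueError("repeat unit must contain at least one base")
    # Match one or more consecutive copies of the repeat unit.
    pattern = re.compile(f"(?:{re.escape(unit)})+")
    runs = [len(m.group(0)) // len(unit) for m in pattern.finditer(sequence)]
    return max(runs, default=0)


def expected_new_alleles(gametes: int, rate: float = SSR_MUTATION_RATE) -> float:
    """Expected number of gametes carrying a new allele at a single locus."""
    return gametes * rate


if __name__ == "__main__":
    # Two hypothetical alleles of a CA-repeat locus with made-up flanking bases.
    allele_1 = "GGT" + "CA" * 11 + "TTG"
    allele_2 = "GGT" + "CA" * 14 + "TTG"
    print(longest_tandem_run(allele_1, "CA"), longest_tandem_run(allele_2, "CA"))

    # One new SSR allele is expected per thousand gametes at a given locus.
    print(expected_new_alleles(1000))

    # How much larger the quoted SSR rate is than the single nucleotide rate.
    print(round(SSR_MUTATION_RATE / SINGLE_NUCLEOTIDE_MUTATION_RATE))
```

Comparing the two rates as quoted gives a ratio of about a million, which is why many alleles can accumulate at one SSR locus in a population even though a change within a few generations of a single family remains unlikely.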