Page 75 - Genetics_From_Genes_to_Genomes_6th_FULL_Part3
11.1 Variation Among Genomes
DNA in two different haploid human genomes and the chimp RefSeq genome. Two single-base changes have occurred in this small genomic region since the divergence of the two species. One is shared by all human genomes and is thus not polymorphic in humans. The second base change was from a C in the common chimp-human ancestor to a T (the derived allele) in a chromosome of the ancestor of some people but not others. This means that if you and a friend share a derived allele at an anonymous SNP locus, you both got that allele from the same ancestor who must have lived since the human and chimp lineages diverged from each other. The fact that every random pair of human beings on the planet shares many unlinked, derived SNP alleles indicates common ancestry for all people.

To date, the analysis of thousands of human genomes has led to the identification of more than 50 million SNPs that are catalogued in a SNP database (dbSNP) at the National Center for Biotechnology Information (NCBI). About 15 million of these SNPs are commonly found in human populations. The mutations giving rise to the derived allele of common SNPs must have occurred far enough back in human history to have been disseminated to a significant proportion of current-day people. The SNP database already includes a large fraction of all common SNPs in human populations. However, you should realize that mutational events in the very recent past also create rare SNPs that would be found in only one or a few people of the billions on the earth. Very few of these rare SNPs are yet accounted for in dbSNP because so few human genomes have been analyzed relative to the entire human population.

This brief discussion about the origin of SNPs suggests that genome sequencing provides powerful tools for understanding human ancestry. Chapter 21 presents some of the surprising findings about human history revealed by the analysis of DNA sequences from present-day humans and the fossilized remains of our primate ancestors.

Deletion-insertion polymorphisms (DIPs)

Short insertions or deletions of genetic material represent the second most common form of genetic variation in the human genome. These variants are referred to as deletion-insertion polymorphisms (DIPs) or InDels. While SNP loci occur about once per kilobase in the comparison of any two haploid genomes, DIPs are considerably rarer, occurring about once in every 10 kb of DNA (Table 11.1). DIPs range in length from one base pair to hundreds of base pairs, but their relative frequency declines steeply in relation to their length. As a result, DIPs involving only one or two nucleotides are the most common.

Several biochemical processes appear to contribute to the formation of DIPs. These include problems in DNA replication or recombination, and mistakes that occur when cells try to repair damage such as broken DNA strands. You should remember that in protein-coding regions, DIP variants act as frameshift mutations unless the number of nucleotide pairs inserted or deleted is 3 or a multiple of 3.

Simple sequence repeats (SSRs)

The genomes of humans and higher eukaryotes are loaded with loci defined by simple sequence repeats (SSRs), sometimes also termed microsatellites. SSR loci consist of sequences of one to a few bases that are repeated in tandem less than 10 to more than 100 times. Different alleles of an SSR locus have different numbers of repeating units. The most common repeating units are one-, two-, or three-base sequences. SSRs with larger repeating units are less frequent, and we employ here a relatively arbitrary cutoff in which the largest repeating unit of an SSR has 10 bases (Table 11.1; those with larger repeating units will be classified as CNVs below). Examples of SSRs are AAAAAAAAAAAAAAA (a one-base repeat) or CACACACACACACACACACACA (a two-base repeating unit). SSRs of all types together account for about 3% of the total DNA in the human genome; an SSR locus can be found on average once in every 30 kb of human DNA.

As with all other polymorphisms, most SSRs occur outside the coding regions of genes and have no effect on phenotype. In contrast, SSR variations within genes can have profound phenotypic consequences. For example, we have already discussed in Chapter 7 (review the Fast Forward Box entitled Trinucleotide Repeat Diseases: Huntington Disease and Fragile X Syndrome) that long tracts of trinucleotide repeats are the molecular cause of several severe neurological conditions, including fragile X syndrome and Huntington disease.

SSRs arise spontaneously from rare, random events that initially produce a short repeated sequence with four to five repeat units. Once a short SSR mutates into existence, however, it can expand into a longer sequence by a form of faulty DNA replication called slipped mispairing or stuttering. Figure 7.12 showed in detail how this stuttering mechanism can change the number of repeat units at the SSR locus responsible for Huntington disease.

Because of events such as slipped mispairing, new alleles arise at SSR loci at an average rate of 10⁻³ per locus per gamete (that is, one in every thousand gametes). This frequency is much greater than the single nucleotide mutation rate of 10⁻⁹ and results in a large amount of SSR variation among unrelated individuals within a population. Unlike SNPs, which are biallelic and do not change after the mutational event that gave rise to them, SSRs are therefore highly polymorphic in the number of repeats they carry, often with 10 or more alleles distinguishable at a single SSR locus. The rate of SSR mutation is nonetheless low enough that changes usually do not occur within a few generations of even a large family. SSRs can thus serve as relatively stable, highly polymorphic DNA markers in linkage studies of many organisms, including humans.
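
Two of the rules stated in this section are simple enough to sketch in code: an SSR is a tandem repeat of a unit of 1 to 10 bases, and a DIP in a coding region shifts the reading frame unless its length is a multiple of 3. The Python sketch below is only an illustration, not a method from the text. The function names, the regular expression, and the threshold of four repeat units (borrowed from the "four to five repeat units" of a newly formed SSR) are assumptions made for this example.

```python
import re

# Assumptions for this sketch (not definitions from the text):
# - a repeat unit is 1 to 10 bases long (the text's arbitrary SSR cutoff);
# - a tract is reported only if it has at least 4 tandem copies of the unit.
MAX_UNIT = 10
MIN_UNITS = 4

# ([ACGT]{1,10}?) lazily picks the shortest unit that works;
# \1{3,} then requires at least 3 further tandem copies of that unit.
_SSR = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (MAX_UNIT, MIN_UNITS - 1))


def find_ssrs(seq):
    """Return (start, unit, n_units) for each tandem-repeat tract in seq."""
    tracts = []
    for m in _SSR.finditer(seq.upper()):
        unit = m.group(1)
        n_units = len(m.group(0)) // len(unit)
        tracts.append((m.start(), unit, n_units))
    return tracts


def is_frameshift(indel_length):
    """True if an indel of this many base pairs would shift the reading frame."""
    return indel_length % 3 != 0


# The two example SSRs from the text, embedded in arbitrary flanking bases.
example = "GGT" + "CA" * 11 + "TTG" + "A" * 15 + "C"
print(find_ssrs(example))   # [(3, 'CA', 11), (28, 'A', 15)]
print(is_frameshift(2))     # True: a 2-bp DIP shifts the frame
print(is_frameshift(6))     # False: 6 is a multiple of 3
```

Two alleles of the same SSR locus would differ only in the third value of the tuple (the number of repeat units), which is the property that makes SSRs useful as multi-allelic markers.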