Polygenic Inheritance and Gene Mapping

By: Heidi Chial, Ph.D. (Write Science Right) © 2008 Nature Education 
Citation: Chial, H. (2008) Polygenic inheritance and gene mapping. Nature Education 1(1):17
Ever griped about your height? Figuring out its origins hasn't been any easier for geneticists who are turning to high-throughput, genome-wide association studies for clues.

An illustration shows the silhouettes of nine human figures standing side-by-side in order of increasing height, with the shortest figure at the far left, and the tallest figure at the far right. The silhouettes are shown standing in different positions and wearing different types of clothing, including pants, dresses, and a suit coat. The figures are shown on a black background, and the inside of each silhouette has colored lettering on a white background. The colored lettering that fills each silhouette includes lines of text with the letters “A,” “T,” “G,” and “C.” These letters represent the nucleotide bases that form DNA. The letters within the silhouettes are colored in tie-dyed patterns with rainbow stripes that extend across the silhouettes in red, blue, green, and yellow. The illustration shows that human height is likely to involve many genes and factors.
Human height.
There is great variation in human height between different individuals.
Jane Ades/National Human Genome Research Institute.
At first glance, the inheritance of human height appears to be straightforward: tall people often have at least one tall parent. However, human height has long been known to follow a polygenic mode of inheritance. Although human height is easy to measure, geneticists have struggled for decades to define the number of genes that contribute to height. Recently, scientists have employed high-throughput, genome-wide approaches to further their efforts in this area.

Genome-wide association studies (GWAS) are an unbiased method for determining associations between genotypes and phenotypes. Often, GWAS involve scanning the genome to identify single nucleotide polymorphisms (SNPs) associated with a disease or phenotype of interest. SNPs are single-base pair polymorphic regions of DNA that often vary from one person to the next. SNPs occur throughout our genome with an average of one SNP for every 1,000 base pairs, and they have been mapped along the length of every human chromosome. Indeed, the International HapMap Project has determined how often certain SNPs appear to be transmitted together (International HapMap Consortium, 2005).

Positive associations between a SNP and a phenotype may indicate that the associated SNP contributes to the trait or is located in a chromosomal region close to a genetic variant (mutation) that contributes to the trait. Furthermore, more than one SNP (or mutation) can contribute to the same trait. If the SNPs are located in the same gene, this is called allelic heterogeneity. If the SNPs are located in different genes (which may even lie on different chromosomes), this is called locus heterogeneity.
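The distinction between allelic and locus heterogeneity can be made concrete with a small sketch. The function name, the SNP identifiers, and the gene names below are hypothetical placeholders, not data from the studies discussed here; the code only groups trait-associated SNPs by the gene they fall in and reports which kind of heterogeneity is present.

```python
from collections import defaultdict


def classify_heterogeneity(snp_to_gene):
    """Return the kinds of heterogeneity shown by a set of trait-associated SNPs.

    snp_to_gene maps a SNP identifier to the gene that contains it.
    Both labels can apply at once, so a set is returned.
    """
    snps_by_gene = defaultdict(list)
    for snp, gene in snp_to_gene.items():
        snps_by_gene[gene].append(snp)

    labels = set()
    # Two or more associated SNPs inside one gene: allelic heterogeneity.
    if any(len(snps) > 1 for snps in snps_by_gene.values()):
        labels.add("allelic heterogeneity")
    # Associated SNPs spread over more than one gene: locus heterogeneity.
    if len(snps_by_gene) > 1:
        labels.add("locus heterogeneity")
    return labels


# Hypothetical example: two SNPs in GENE_A and one in GENE_B.
example = {"snp_1": "GENE_A", "snp_2": "GENE_A", "snp_3": "GENE_B"}
print(sorted(classify_heterogeneity(example)))
```

In this toy input both labels apply, because GENE_A carries two associated SNPs and a second gene is also involved.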

Research teams from multiple labs have conducted GWAS to identify genetic variation associated with human height (Weedon et al., 2007; Gudbjartsson et al., 2008; Lettre et al., 2008; Visscher, 2008; Weedon et al., 2008). In their studies, the researchers scanned the genomes of approximately 63,000 individuals to identify particular SNPs associated with human height.

From Early Gene Mapping Studies to GWAS Using SNPs

In the early days of gene mapping, researchers studied single-gene diseases and identified markers that co-segregated with disease-associated phenotypes in pedigrees. For instance, in the first example of gene mapping, researcher Roger Donahue tracked the co-segregation of a physical characteristic of one of his copies of chromosome 1 (an "unraveled" region near the centromere) with the Duffy blood group locus. He used his family pedigree, cytogenetics (to follow the "unraveled" chromosome phenotype), and biochemical tests to determine Duffy blood group types among his family members. In other cases, scientists used large pedigrees together with restriction fragment length polymorphisms (RFLPs) to map disease-associated genes. For example, the Huntington's Disease Consortium found that a particular DNA probe hybridized differently with chromosomal DNA from Huntington's disease-affected members of two very large families. With an RFLP-associated DNA probe, researchers were then able to use somatic cell hybrids to map the DNA probe to human chromosome 4 and to eventually isolate the Huntington's disease gene (Htt).

Today, with the sequence of the human genome at their disposal, scientists know the DNA sequence of every human chromosome, and they have used statistical approaches to predict the location of genes along the length of these chromosomes. Scientists are using this knowledge to identify those genes that contribute to polygenic diseases. In fact, rather than using physical changes in chromosome structure and/or DNA probes to search for links to human disease-associated genes, researchers are now conducting GWAS using SNPs.

Array technologies have developed so that a single array chip contains as many as 500,000 SNPs of known identity and chromosomal location, which can be simultaneously probed using chromosomal DNA from a given individual. Computers scan the chip and determine the signal at every position on the chip. The data are then analyzed to determine the SNP genotype at every SNP position for a given individual. Using this approach, scientists can simultaneously determine whether certain SNPs or a particular pattern of SNPs are associated with any given form of human disease.

By generating a bank of SNP genotypes for large populations of individuals, scientists can then use the same data set to study the SNP associations of any human disease, as long as the disease phenotype is represented in the populations under study. GWAS use SNP data in an unbiased manner to scan the entire human genome for associations between particular SNPs and a given phenotype using statistical methods to examine each SNP. After an initial screening, associated SNPs are further examined and validated.
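The article does not say which statistical test each study applied to each SNP, so the following is only a minimal sketch of one common choice: a linear regression of height on allele count (0, 1, or 2 copies), followed by a Bonferroni-corrected significance threshold. All data are simulated; the sample size, allele frequency, effect size, and the function names `snp_association` and `simulate` are assumptions made for illustration.

```python
import math
import random


def snp_association(genotypes, heights):
    """Regress height on allele count for one SNP.

    Returns (slope, two-sided p-value). The p-value uses a normal
    approximation to the t statistic, which is reasonable only for
    large samples.
    """
    n = len(heights)
    mean_g = sum(genotypes) / n
    mean_h = sum(heights) / n
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    if sxx == 0:  # everyone has the same genotype, nothing to test
        return 0.0, 1.0
    sxy = sum((g - mean_g) * (h - mean_h) for g, h in zip(genotypes, heights))
    slope = sxy / sxx
    intercept = mean_h - slope * mean_g
    rss = sum((h - intercept - slope * g) ** 2 for g, h in zip(genotypes, heights))
    se = math.sqrt(rss / (n - 2) / sxx)
    if se == 0:  # perfect fit, only possible in toy data
        return slope, 0.0
    z = slope / se
    return slope, math.erfc(abs(z) / math.sqrt(2))


def simulate(n_people=2000, n_snps=50, causal=7, effect_cm=2.0, seed=1):
    """Simulate genotypes and heights; only SNP index `causal` affects height."""
    rng = random.Random(seed)
    genos = [
        [(rng.random() < 0.3) + (rng.random() < 0.3) for _ in range(n_people)]
        for _ in range(n_snps)
    ]
    heights = [
        170 + effect_cm * genos[causal][i] + rng.gauss(0, 6)
        for i in range(n_people)
    ]
    return genos, heights


genos, heights = simulate()
threshold = 0.05 / len(genos)  # Bonferroni correction for the number of SNPs tested
results = [snp_association(g, heights) for g in genos]
hits = [i for i, (_, p) in enumerate(results) if p < threshold]
print("SNPs passing the corrected threshold:", hits)
```

With 500,000 SNPs on a chip instead of 50, the same correction gives a per-SNP threshold of 0.05 / 500,000 = 1e-7, which is one reason such scans need very large numbers of subjects to detect small effects.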

Each SNP has its own chromosomal address, which allows researchers to interrogate that region of the genome to identify genes that are likely to contribute to disease. Further studies are then carried out to determine whether a candidate gene contains a mutation, and to determine the function of the wild-type and mutant gene products. GWAS using SNPs have been used to study a number of complex diseases, such as asthma, breast cancer, cleft lip and palate, diabetes, and obesity.

In order to identify meaningful connections between SNPs and phenotypes, GWAS rely on the analysis of large populations. The number of test subjects required to achieve significant results depends on the extent to which the phenotype and its associated SNPs are represented within the population. At first glance, human height seems an ideal phenotype for GWAS: it is easily observed and measured, and it seems to be highly heritable. Height is associated with growth and developmental processes, and it is also influenced by environmental factors, such as nutrition. Differences in average height between men and women suggest that hormones may also affect height. However, previous studies have attempted to establish genetic links to human height with little success. For the most part, the earlier studies demonstrated a need for larger populations in order to establish strong genetic ties to human height.

A recent study of SNP variants associated with human height used SNP arrays to analyze 500,000 SNPs in 4,921 individuals (Weedon et al., 2007). Despite the relatively small number of subjects, the researchers identified two SNPs that were associated with height: the first SNP (called rs1042725) mapped in the HMGA2 gene, and the second SNP (called rs7958582) mapped 12 kilobases past the end of the HMGA2 gene. This information, along with that provided by previous studies, indicated that the HMGA2 gene was a strong candidate for a height-associated gene. For example, studies showed that mice homozygous for a deletion in the HMGA2 gene, called pygmy mice, were short in length (Zhou et al., 1995). Further evidence for the gene's role in height regulation came from genetically engineered mice that expressed high levels of a shortened form of the HMGA2 gene; these mice exhibited gigantism (Battista et al., 1999). Finally, a chromosomal inversion in humans that leads to the expression of a truncated form of the HMGA2 gene was associated with a severe overgrowth syndrome in an eight-year-old boy (Battista et al., 1999); this inversion was also associated with benign mesenchymal tumors called lipomas.

Researchers followed up by analyzing the rs1042725 SNP in an additional group of 29,098 individuals. These researchers identified an allele, called "C," that was associated with increased human height. In addition, they examined 11 additional SNPs in this chromosomal region in 9,704 individuals but found that the rs1042725 SNP correlated most strongly with height.

The researchers were also interested in determining correlations between age and the onset of height phenotypes. Thus, they analyzed the rs1042725 SNP in a group of children with established birth measurements and growth charts. The researchers did not detect a correlation between the rs1042725 SNP and length at birth, but they did identify a strong correlation between this SNP and increased height at seven years of age. Overall, they estimated that the rs1042725 SNP accounts for a mere 0.3% of the variation in human height. They also emphasized that future studies would require many thousands of individuals in order to identify statistically significant links between human height and human genes.
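To see how a single SNP can account for so little of the variation, consider a rough calculation under a simple additive model with Hardy-Weinberg genotype proportions. In that model the fraction of trait variance explained is 2p(1 - p)b² / s², where p is the allele frequency, b is the height difference per copy of the allele, and s is the standard deviation of height. The input values below are hypothetical round numbers chosen for illustration; they are not taken from Weedon et al. (2007).

```python
def variance_explained(allele_freq, effect_per_allele, trait_sd):
    """Fraction of trait variance explained by one additive SNP.

    Assumes Hardy-Weinberg proportions, so the variance of the allele
    count (0, 1 or 2) is 2 * p * (1 - p).
    """
    genotype_variance = 2 * allele_freq * (1 - allele_freq)
    return genotype_variance * effect_per_allele ** 2 / trait_sd ** 2


# Hypothetical inputs: allele frequency 0.5, 0.5 cm per allele, height SD 6.5 cm.
fraction = variance_explained(allele_freq=0.5, effect_per_allele=0.5, trait_sd=6.5)
print(f"{fraction:.2%}")  # prints 0.30%
```

Under these assumed values, an allele that shifts height by half a centimeter per copy explains only about 0.3% of the variance, which illustrates why effects of this size are hard to detect without very large samples.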
Thus, in a recent set of collaborative studies, three labs performed GWAS using SNP-based approaches that involved many more test subjects in an effort to identify additional genes that contribute to human height. In these studies, the research teams examined at least 500,000 different SNPs in each of the more than 63,000 study subjects (Weedon et al., 2007; Gudbjartsson et al., 2008; Lettre et al., 2008; Weedon et al., 2008). Collectively, the teams identified 54 SNP variants that were strongly associated with height variation in the general population (Visscher, 2008). Table 1 shows a summary of the three studies. The research teams are referred to as Group 1 (Weedon et al.), Group 2 (Lettre et al.), and Group 3 (Gudbjartsson et al.).

 

| | Group 1 (Weedon et al., 2008) | Group 2 (Lettre et al., 2008) | Group 3 (Gudbjartsson et al., 2008) |
|---|---|---|---|
| Number of subjects in first SNP chip experiment | 13,665 | 15,821 | 33,992 |
| Number of SNPs per chip | 402,951 | 300,000-550,000 | 304,226 |
| Number of SNP variants identified in first chip experiment | 39 | 78 | 27 loci |
| Number of SNP variants analyzed in second pass | 39 | 78 | 40 SNPs corresponding to 27 loci |
| Number of subjects in second pass | 16,482 | 2,189 | 5,517 |
| Number of SNP variants analyzed in third pass | N/A | 29 (for most subjects) | N/A |
| Number of subjects in third pass | N/A | 17,801 | N/A |
| Final number of SNP variant loci identified | 20 loci | 12 loci | 22 loci |
| Combined contribution to human height variation | 3% | 2% | 3.7% |
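The totals quoted in the text can be checked directly against the per-group figures in Table 1. The short calculation below uses only numbers from the table; the variable names are illustrative.

```python
# First-pass subjects and final loci per group, copied from Table 1.
first_pass_subjects = {"Group 1": 13_665, "Group 2": 15_821, "Group 3": 33_992}
final_loci = {"Group 1": 20, "Group 2": 12, "Group 3": 22}

total_subjects = sum(first_pass_subjects.values())
total_loci = sum(final_loci.values())

print(total_subjects)  # 63478, the "more than 63,000" subjects cited in the text
print(total_loci)      # 54, matching the 54 SNP variants cited in the text
```

Note that the sum of 54 counts each team's loci separately, so loci reported by more than one team are counted more than once; it is not a count of distinct loci.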

Finding a Good Candidate: From SNPs to Genes

Although these research teams identified individual SNPs associated with variation in human height, the SNPs themselves do not provide information about specific genes; they simply indicate a chromosomal location that is likely to be closely associated with a given phenotype. However, by knowing the chromosomal address of the SNP, researchers can then search the SNP's "neighborhood" for candidate genes. Candidate genes are chosen based on the following criteria:

  • Gene expression in the right tissues at the right time (based on information from studies of human and/or mouse tissues or cell lines)
  • Animal models that lack a specific gene sequence (or gene product) and exhibit a phenotype of interest
  • Human chromosomal rearrangements or gene mutations that cause the phenotype of interest
  • Previous human linkage and/or linkage disequilibrium/association studies suggesting that specific genes are linked to the phenotype of interest
  • The identification of gene mutations causing syndromes that include the phenotype of interest as part of their clinical presentation
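Searching a SNP's "neighborhood" amounts to comparing its chromosomal coordinate with the coordinates of annotated genes. The sketch below shows one simple way to do this; the gene names, coordinates, the 100-kilobase window, and the function `genes_near_snp` are all hypothetical and chosen for illustration, not drawn from a real annotation or from the studies above.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    chrom: str
    start: int  # first base of the gene
    end: int    # last base of the gene


def genes_near_snp(snp_chrom, snp_pos, genes, window=100_000):
    """Return (distance, gene name) pairs for genes within `window` bases of a SNP.

    Distance is 0 when the SNP lies inside the gene. Results are sorted
    so that the closest gene comes first.
    """
    hits = []
    for gene in genes:
        if gene.chrom != snp_chrom:
            continue
        if gene.start <= snp_pos <= gene.end:
            distance = 0
        else:
            distance = min(abs(snp_pos - gene.start), abs(snp_pos - gene.end))
        if distance <= window:
            hits.append((distance, gene.name))
    return sorted(hits)


# Hypothetical annotation: three genes on two chromosomes.
annotation = [
    Gene("GENE_A", "chr12", 1_000_000, 1_140_000),
    Gene("GENE_B", "chr12", 1_400_000, 1_450_000),
    Gene("GENE_C", "chr4", 1_100_000, 1_200_000),
]

# A SNP 12 kilobases past the end of GENE_A.
print(genes_near_snp("chr12", 1_152_000, annotation))  # [(12000, 'GENE_A')]
```

A positional search like this only produces a shortlist; the criteria listed above are what determine which nearby genes are treated as plausible candidates.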

All three research teams identified the same set of four candidate genes linked to height variation: ZBTB38, HHIP, CDK6, and HMGA2. An additional six candidate genes were shared between two of the three teams (GPR126, ADAMTSL3, GDF5, LCORL, EFEMP1, and HIST1H1D). Finally, the researchers identified at least 40 new SNP variants associated with human height variation. Some of the candidate genes were associated with processes and signaling pathways expected to contribute to height, whereas others were unexpected and suggested novel links to human height.

Future Directions

This massive effort by three research teams to define the number of genes that contribute to variation in human height has set the stage for future studies aimed at defining what "polygenic" means with respect to a wide variety of human diseases. Indeed, if a given disease or phenotype is sufficiently represented within the teams' sample population, other researchers could immediately use the same data set to identify genes associated with their disease or phenotype of interest.

References and Recommended Reading


Battista, S., et al. The expression of a truncated HMGI-C gene induces gigantism associated with lipomatosis. Cancer Research 59, 4793–4797 (1999)

Gudbjartsson, D. F., et al. Many sequence variants affecting diversity of adult human height. Nature Genetics 40, 609–615 (2008) doi:10.1038/ng.122

International HapMap Consortium. A haplotype map of the human genome. Nature 437, 1299–1320 (2005) doi:10.1038/nature04226

Lettre, G., et al. Identification of ten loci associated with height highlights new biological pathways in human growth. Nature Genetics 40, 584–591 (2008) doi:10.1038/ng.125

Visscher, P. M. Sizing up human height variation. Nature Genetics 40, 489–490 (2008) doi:10.1038/ng0508-489

Weedon, M. N., et al. A common variant of HMGA2 is associated with adult and childhood height in the general population. Nature Genetics 39, 1245–1250 (2007) doi:10.1038/ng2121

Weedon, M. N., et al. Genome-wide association analysis identifies 20 loci that influence adult height. Nature Genetics 40, 575–583 (2008) doi:10.1038/ng.121

Zhou, X., et al. Mutation responsible for the mouse pygmy phenotype in the developmentally regulated factor HMGI-C. Nature 376, 771–774 (1995) doi:10.1038/376771a0
