Recurrent mutation at the classical haptoglobin structural polymorphism

Polymorphism of haptoglobin in human serum was first discovered over 60 years ago. A new paper characterizes the complex structural variation at the HP locus in detail and, by using imputation from flanking SNP genotype data, shows that it affects blood cholesterol levels.

The development of gel electrophoresis by Oliver Smithies in 1955 allowed, for the first time, investigation of the molecular basis of genetic variation through identification of protein variants, observation of inheritance in families and measurement of allele frequencies. These first insights into molecular variation led to important advances in medical and population genetics, and gel electrophoresis remained the principal method of analyzing molecular genetic variation within species for over 20 years until the development of DNA-based methods. The first human molecular variation to be discovered using gel electrophoresis was that of haptoglobin, a serum protein that can bind hemoglobin or cholesterol1. Together with colleagues, particularly Norma Ford Walker and Nobuyo Maeda, Smithies began to elucidate the consequences of genetic variation on haptoglobin protein structure and to unravel the evolutionary history of the haptoglobin gene family2,3,4. On page 359 of this issue, Steven McCarroll and colleagues5 present new insights into the evolutionary origin of the alleles of the haptoglobin gene and demonstrate that the variation originally identified by Smithies has an effect on blood cholesterol levels.

Determining functional variants

The human HP (haptoglobin) gene has two common alleles6, HP1 and HP2. The two alleles are copy number variants: HP2 carries a 1.7-kb tandem duplication that adds two extra exons within the gene and therefore encodes a larger protein. The HP1 and HP2 alleles also differ in function. In HP1/HP1 homozygotes, haptoglobin is a dimer, whereas it forms multimers in individuals carrying the HP2 allele. Although genome-wide association studies (GWAS) provide evidence that variation in this genomic region affects blood cholesterol levels, the HP1/HP2 variation is not tagged in a simple fashion by nearby SNP alleles in linkage disequilibrium. However, by examining long tracts of flanking SNP alleles, Boettger et al.5 show that particular long SNP haplotypes carry specific haptoglobin alleles: although any one haptoglobin allele is associated with many flanking SNP haplotypes, each SNP haplotype is associated with only a restricted number of haptoglobin alleles. This observation allowed the authors to reliably impute HP1 and HP2 genotypes in over 22,000 individuals with flanking SNP genotypes and cholesterol measurements and to test for association between haptoglobin genotype and cholesterol levels in blood.
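The logic behind haplotype-based imputation can be illustrated with a deliberately simplified sketch. It is not the method used by Boettger et al.; real imputation relies on statistical models of phased haplotypes, whereas this toy version is a plain lookup table. The haplotype strings, the reference panel and the function names `build_panel` and `impute_allele` are all invented for illustration.

```python
from collections import Counter, defaultdict

def build_panel(reference):
    """Count how often each HP allele is seen on each flanking SNP haplotype."""
    panel = defaultdict(Counter)
    for snp_haplotype, hp_allele in reference:
        panel[snp_haplotype][hp_allele] += 1
    return panel

def impute_allele(panel, snp_haplotype):
    """Return (most frequent HP allele, its fraction of reference copies).

    Returns (None, 0.0) when the haplotype is absent from the panel.
    """
    counts = panel.get(snp_haplotype)
    if not counts:
        return None, 0.0
    allele, n = counts.most_common(1)[0]
    return allele, n / sum(counts.values())

# Invented toy reference panel: (flanking SNP haplotype, HP allele) pairs.
# HP2 sits on more than one haplotype, but most haplotypes carry one allele.
reference = [
    ("AGGTC", "HP1"),
    ("AGGTC", "HP1"),
    ("AGGTC", "HP1"),
    ("CGATC", "HP2"),
    ("CGATC", "HP2"),
    ("CTATT", "HP2"),
    ("AGATT", "HP1"),
    ("AGATT", "HP2"),
]

panel = build_panel(reference)
for haplotype in ("AGGTC", "CTATT", "AGATT", "TTTTT"):
    print(haplotype, impute_allele(panel, haplotype))
```

In this toy panel, the haplotype that carries both alleles is imputed with low confidence. This mirrors the point in the text: imputation works when each flanking haplotype is associated with a restricted set of haptoglobin alleles.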

Boettger et al. find that each extra copy of the HP2 allele increases the level of cholesterol by 2.11 mg/dl, an effect that is clinically modest but consistent with the magnitude of effect sizes seen for SNPs across a range of common complex traits. Although they demonstrate that most of the association signal detected by GWAS is explained by the HP1/HP2 variation, some of the signal appears to remain at the sentinel SNP rs2000999, which is independently associated with an increase in cholesterol level of 1.49 mg/dl per copy of the A allele. Previous studies have shown that the rs2000999[A] allele reduces the expression levels of HP (refs. 7,8). Therefore, Boettger et al. suggest a model in which reduced expression of haptoglobin from the rs2000999[A] allele and the altered protein structure arising from the HP2 allele independently reduce haptoglobin's antioxidant activity on APOE5. Oxidation of APOE is known to impair clearance of plasma lipids9.
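To make the two per-allele effect sizes concrete, the following sketch adds them for a given genotype. It assumes that the two effects combine additively with no interaction, and that the shift is measured relative to an individual carrying no HP2 alleles and no rs2000999[A] alleles. Both assumptions are simplifications made here for illustration, and the function name `expected_shift` is hypothetical.

```python
# Per-allele effect sizes quoted in the text (mg/dl of cholesterol).
HP2_EFFECT_MG_DL = 2.11
RS2000999_A_EFFECT_MG_DL = 1.49

def expected_shift(hp2_copies, a_copies):
    """Expected cholesterol shift (mg/dl) under an assumed additive model.

    hp2_copies: number of HP2 alleles carried (0, 1 or 2)
    a_copies:   number of rs2000999[A] alleles carried (0, 1 or 2)
    """
    for name, copies in (("hp2_copies", hp2_copies), ("a_copies", a_copies)):
        if copies not in (0, 1, 2):
            raise ValueError(f"{name} must be 0, 1 or 2, got {copies!r}")
    return HP2_EFFECT_MG_DL * hp2_copies + RS2000999_A_EFFECT_MG_DL * a_copies

# Tabulate the shift for every combination of the two genotypes.
for hp2 in (0, 1, 2):
    for a in (0, 1, 2):
        print(f"HP2 copies={hp2}, A copies={a}: {expected_shift(hp2, a):.2f} mg/dl")
```

Under these assumptions, the largest shift, for an individual homozygous at both sites, is about 7.2 mg/dl, which illustrates why the text describes the effects as clinically modest.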

The work of Boettger et al. dissecting the relationship of the HP1/HP2 polymorphism to its flanking SNP haplotypes also led to a surprising insight into the evolution of this polymorphism5. Because great apes have the HP1 allele, it has understandably been assumed that the ancestral allele is HP1 and that HP2 is a derived duplication that occurred relatively recently, perhaps within the last 100,000 years10. However, examination of the diversity of SNP haplotypes for the HP1 and HP2 alleles, together with analysis of genomes from ancient hominins, indicates that the duplication generating HP2 occurred much further back in human evolution. Therefore, of the variation observed in populations today, the HP2 allele is ancestral and modern HP1 alleles have been generated by recurrent deletions (Fig. 1).

Figure 1: Origin of the human haptoglobin alleles that modify cholesterol levels in the blood.

Two variants at the HP (haptoglobin) locus independently affect blood cholesterol levels—a structural variant of the HP gene (HP1/HP2) and a SNP in a noncoding region 13 kb downstream of HP (rs2000999). HP2 arose by duplication from the HP1 allele between 0.5 and 7 million years ago (MYA) and became fixed in humans. Modern HP1 alleles have been generated by recurrent mutation of this ancestral HP2 allele.

Lessons for human genetics

Advances in understanding the origin and consequences of haptoglobin gene evolution are of particular interest and relevance to those working on these genes and metabolic traits. In addition, the development of a method to quickly measure variation at this locus in large populations will facilitate investigation of the role of haptoglobin in a range of other traits and diseases. More generally, detection of variants with modest effect sizes, such as those seen for haptoglobin variation on cholesterol levels, requires large sample sizes, and such variants could explain more of the missing heritability for common complex traits. Other copy number–variable sites involving recurrent duplication and deletion may be responsible for other GWAS signals. The same group has shown that recurrent mutation at a copy number variant involving the C4 gene (complement component 4) is responsible for a signal of association identified through GWAS of schizophrenia11. The extent to which these faster mutating multiallelic regions can be imputed is variable12 and will be a consequence of the interplay between their mutation rate, allelic variation and evolutionary history. For haptoglobin, there has been a high enough mutation rate to generate several HP1 alleles on different SNP haplotype backgrounds but not so much mutation that the association between the structural variants and flanking haplotypes has been erased.

Very high-throughput (and, ideally, low-cost) methods for measuring complex structural variation at other loci throughout the genome are needed. Imputation approaches are important, but creating panels of SNPs to impute complex structural variation requires a thorough understanding of the structural variation at a locus, robust genotyping and accurate phasing and so may not be applicable to all complex regions. However, the development of computational approaches to accurately infer complex structural variation from readily available SNP genotypes in very large populations holds promise for deeper understanding of the phenotypic effects of other structurally complex loci across the genome.


1. Smithies, O. Nature 175, 307–308 (1955).
2. Smithies, O. & Walker, N.F. Nature 176, 1265–1266 (1955).
3. Harris, H., Robson, E.B. & Siniscalco, M. Nature 182, 1324–1325 (1958).
4. Maeda, N., Yang, F., Barnett, D.R., Bowman, B.H. & Smithies, O. Nature 309, 131–135 (1984).
5. Boettger, L.M. et al. Nat. Genet. 48, 359–366 (2016).
6. Langlois, M.R. & Delanghe, J.R. Clin. Chem. 42, 1589–1600 (1996).
7. Froguel, P. et al. PLoS One 7, e32327 (2012).
8. Soejima, M. et al. Clin. Chim. Acta 433, 54–57 (2014).
9. Yang, Y., Cao, Z., Tian, L., Garvey, W.T. & Cheng, G. PLoS One 8, e57571 (2013).
10. Rodriguez, S. et al. Ann. Hum. Genet. 76, 352–362 (2012).
11. Sekar, A. et al. Nature 530, 177–183 (2016).
12. Handsaker, R.E. et al. Nat. Genet. 47, 296–303 (2015).

Author information



Corresponding author

Correspondence to Edward J Hollox.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Cite this article

Hollox, E., Wain, L. Recurrent mutation at the classical haptoglobin structural polymorphism. Nat Genet 48, 347–348 (2016).
