A positively selected FBN1 missense variant reduces height in Peruvian individuals

Asgari, Samira; Luo, Yang; Akbari, Ali; Belbin, Gillian M.; Li, Xinyi; Harris, Daniel N.; Selig, Martin; Bartell, Eric; Calderon, Roger; Slowikowski, Kamil; Contreras, Carmen; Yataco, Rosa; Galea, Jerome T.; Jimenez, Judith; Coit, Julia M.; Farroñay, Chandel; Nazarian, Rosalynn M.; O’Connor, Timothy D.; Dietz, Harry C.; Hirschhorn, Joel N.; Guio, Heinner; Lecca, Leonid; Kenny, Eimear E.; Freeman, Esther E.; Murray, Megan B.; Raychaudhuri, Soumya

doi:10.1038/s41586-020-2302-0

Article
Published: 13 May 2020

Samira Asgari^1,2,3,4,5 (ORCID: orcid.org/0000-0002-2347-8985),
Yang Luo^1,2,3,4,5 (ORCID: orcid.org/0000-0001-7385-6166),
Ali Akbari^4,6,
Gillian M. Belbin^7,8,9,
Xinyi Li^1,2,3,4,5,
Daniel N. Harris^10,11,
Martin Selig^12,
Eric Bartell^4,5,13,
Roger Calderon^14 (ORCID: orcid.org/0000-0001-8932-0489),
Kamil Slowikowski^1,2,3,4,5 (ORCID: orcid.org/0000-0002-2843-6370),
Carmen Contreras^14,
Rosa Yataco^14,
Jerome T. Galea^15,
Judith Jimenez^14,
Julia M. Coit^16,
Chandel Farroñay^14 (ORCID: orcid.org/0000-0002-3532-3120),
Rosalynn M. Nazarian^12,
Timothy D. O’Connor^10,17,
Harry C. Dietz^18,19 (ORCID: orcid.org/0000-0002-6856-0165),
Joel N. Hirschhorn^4,6,13,20,
Heinner Guio^21,
Leonid Lecca^14,
Eimear E. Kenny^7,8,9,
Esther E. Freeman^22,
Megan B. Murray^16 &
Soumya Raychaudhuri^1,2,3,4,5,23 (ORCID: orcid.org/0000-0002-1901-8265)

Nature volume 582, pages 234–239 (2020)

Abstract

On average, Peruvian individuals are among the shortest in the world¹. Here we show that Native American ancestry is associated with reduced height in an ethnically diverse group of Peruvian individuals, and identify a population-specific, missense variant in the FBN1 gene (E1297G) that is significantly associated with lower height. Each copy of the minor allele (frequency of 4.7%) reduces height by 2.2 cm (4.4 cm in homozygous individuals). To our knowledge, this is the largest effect size known for a common height-associated variant. FBN1 encodes the extracellular matrix protein fibrillin 1, which is a major structural component of microfibrils. We observed less densely packed fibrillin-1-rich microfibrils with irregular edges in the skin of individuals who were homozygous for G1297 compared with individuals who were homozygous for E1297. Moreover, we show that the E1297G locus is under positive selection in non-African populations, and that the E1297G variant shows subtle evidence of positive selection specifically within the Peruvian population. This variant is also significantly more frequent in coastal Peruvian populations than in populations from the Andes or the Amazon, which suggests that short stature might be the result of adaptation to factors that are associated with the coastal environment in Peru.

**Fig. 1: Genetic architecture of height in the Peruvian population.**

**Fig. 2: rs200342067 is positively selected in the Peruvian population.**

**Fig. 3: Electron microscopy of fibrillin 1 in the skin.**

Data availability

Genotyping data are available through dbGaP, under accession number phs002025.v1.p1.

Code availability

No custom code was used to draw the central conclusions of this work. All the software and packages used in this work are included and referenced in the manuscript.

References

1. NCD Risk Factor Collaboration (NCD-RisC). A century of trends in adult human height. eLife 5, e13410 (2016).
2. Homburger, J. R. et al. Genomic insights into the ancestry and demographic history of South America. PLoS Genet. 11, e1005602 (2015).
3. Harris, D. N. et al. Evolutionary genomic dynamics of Peruvians before, during, and after the Inca Empire. Proc. Natl Acad. Sci. USA 115, E6526–E6535 (2018).
4. Ruiz-Linares, A. et al. Admixture in Latin America: geographic structure, phenotypic diversity and self-perception of ancestry based on 7,342 individuals. PLoS Genet. 10, e1004572 (2014).
5. Browning, B. L. & Browning, S. R. Improving the accuracy and efficiency of identity-by-descent detection in population data. Genetics 194, 459–471 (2013).
6. Marouli, E. et al. Rare and low-frequency coding variants alter human adult height. Nature 542, 186–190 (2017).
7. Wojcik, G. L. et al. Genetic analyses of diverse populations improves discovery for complex traits. Nature 570, 514–518 (2019).
8. Yengo, L. et al. Meta-analysis of genome-wide association studies for height and body mass index in ∼700000 individuals of European ancestry. Hum. Mol. Genet. 27, 3641–3649 (2018).
9. Wood, A. R. et al. Defining the role of common variation in the genomic and biological architecture of adult human height. Nat. Genet. 46, 1173–1186 (2014).
10. Vilhjálmsson, B. J. et al. Modeling linkage disequilibrium increases accuracy of polygenic risk scores. Am. J. Hum. Genet. 97, 576–592 (2015).
11. Martin, A. R. et al. Human demographic history impacts genetic risk prediction across diverse populations. Am. J. Hum. Genet. 100, 635–649 (2017).
12. Martin, A. R. et al. Clinical use of current polygenic risk scores may exacerbate health disparities. Nat. Genet. 51, 584–591 (2019).
13. Kong, A. et al. The nature of nurture: effects of parental genotypes. Science 359, 424–428 (2018).
14. Domingue, B. W. et al. The social genome of friends and schoolmates in the National Longitudinal Study of Adolescent to Adult Health. Proc. Natl Acad. Sci. USA 115, 702–707 (2018).
15. Rask-Andersen, M., Karlsson, T., Ek, W. E. & Johansson, Å. Gene–environment interaction study for BMI reveals interactions between genetic factors and physical activity, alcohol consumption and socioeconomic status. PLoS Genet. 13, e1006977 (2017).
16. Pelova, N. Considerations on the so-called myelolipoma of the adrenals. Nauchni Tr. Vissh. Med. Inst. Sofiia 48, 31–35 (1969).
17. The 1000 Genomes Project Consortium. A global reference for human genetic variation. Nature 526, 68–74 (2015).
18. Voight, B. F., Kudaravalli, S., Wen, X. & Pritchard, J. K. A map of recent positive selection in the human genome. PLoS Biol. 4, e72 (2006).
19. Johnson, K. E. & Voight, B. F. Patterns of shared signatures of recent positive selection across human populations. Nat. Ecol. Evol. 2, 713–720 (2018).
20. Akbari, A. et al. Identifying the favored mutation in a positive selective sweep. Nat. Methods 15, 279–282 (2018).
21. Sabeti, P. C. et al. Detecting recent positive selection in the human genome from haplotype structure. Nature 419, 832–837 (2002).
22. Nei, M. & Li, W. H. Mathematical model for studying genetic variation in terms of restriction endonucleases. Proc. Natl Acad. Sci. USA 76, 5269–5273 (1979).
23. Arbiza, L., Zhong, E. & Keinan, A. NRE: a tool for exploring neutral loci in the human genome. BMC Bioinformatics 13, 301 (2012).
24. Lek, M. et al. Analysis of protein-coding genetic variation in 60,706 humans. Nature 536, 285–291 (2016).
25. Albers, P. K. & McVean, G. Dating genomic variants and shared ancestry in population-scale sequencing data. PLoS Biol. 18, e3000586 (2020).
26. Lamason, R. L. et al. SLC24A5, a putative cation exchanger, affects pigmentation in zebrafish and humans. Science 310, 1782–1786 (2005).
27. Fan, S., Hansen, M. E. B., Lo, Y. & Tishkoff, S. A. Going global by adapting local: a review of recent human adaptation. Science 354, 54–59 (2016).
28. Adhikari, K. et al. A GWAS in Latin Americans highlights the convergent evolution of lighter skin pigmentation in Eurasia. Nat. Commun. 10, 358 (2019).
29. Sturm, R. A. & Duffy, D. L. Human pigmentation genes under environmental selection. Genome Biol. 13, 248 (2012).
30. Günther, T. & Coop, G. Robust identification of local adaptation from allele frequencies. Genetics 195, 205–220 (2013).
31. Lasker, G. W. Differences in anthropometric measurements within and between three communities in Peru. Hum. Biol. 34, 63–70 (1962).
32. Sengle, G. & Sakai, L. Y. The fibrillin microfibril scaffold: a niche for growth factors and mechanosensation? Matrix Biol. 47, 3–12 (2015).
33. Schrenk, S., Cenzi, C., Bertalot, T., Conconi, M. T. & Di Liddo, R. Structural and functional failure of fibrillin-1 in human diseases (review). Int. J. Mol. Med. 41, 1213–1223 (2018).
34. Collod-Béroud, G. et al. Update of the UMD-FBN1 mutation database and creation of an FBN1 polymorphism database. Hum. Mutat. 22, 199–208 (2003).
35. Tiecke, F. et al. Classic, atypically severe and neonatal Marfan syndrome: twelve mutations and genotype-phenotype correlations in FBN1 exons 24–40. Eur. J. Hum. Genet. 9, 13–21 (2001).
36. Smallridge, R. S. et al. Solution structure and dynamics of a calcium binding epidermal growth factor-like domain pair from the neonatal region of human fibrillin-1. J. Biol. Chem. 278, 12199–12206 (2003).
37. Booms, P., Tiecke, F., Rosenberg, T., Hagemeier, C. & Robinson, P. N. Differential effect of FBN1 mutations on in vitro proteolysis of recombinant fibrillin-1 fragments. Hum. Genet. 107, 216–224 (2000).
38. Jensen, S. A., Robertson, I. B. & Handford, P. A. Dissecting the fibrillin microfibril: structural insights into organization and function. Structure 20, 215–225 (2012).
39. Jensen, S. A., Corbett, A. R., Knott, V., Redfield, C. & Handford, P. A. Ca²⁺-dependent interface formation in fibrillin-1. J. Biol. Chem. 280, 14076–14084 (2005).
40. McGettrick, A. J., Knott, V., Willis, A. & Handford, P. A. Molecular effects of calcium binding mutations in Marfan syndrome depend on domain context. Hum. Mol. Genet. 9, 1987–1994 (2000).
41. Zoledziewska, M. et al. Height-reducing variants and selection for short stature in Sardinia. Nat. Genet. 47, 1352–1356 (2015).
42. Fumagalli, M. et al. Greenlandic Inuit show genetic signatures of diet and climate adaptation. Science 349, 1343–1347 (2015).
43. Luo, Y. et al. Early progression to active tuberculosis is a highly heritable trait driven by 3q23 in Peruvians. Nat. Commun. 10, 3765 (2019).
44. Zelner, J. L. et al. Identifying hotspots of multidrug-resistant tuberculosis transmission using spatial and molecular genetic data. J. Infect. Dis. 213, 287–294 (2016).
45. Odone, A. et al. Acquired and transmitted multidrug resistant tuberculosis: the role of social determinants. PLoS ONE 11, e0146642 (2016).
46. Zhou, X. & Stephens, M. Efficient multivariate linear mixed model algorithms for genome-wide association studies. Nat. Methods 11, 407–409 (2014).
47. Price, A. L. et al. Long-range LD can confound genome scans in admixed populations. Am. J. Hum. Genet. 83, 132–135 (2008).
48. Manichaikul, A. et al. Robust relationship inference in genome-wide association studies. Bioinformatics 26, 2867–2873 (2010).
49. Conomos, M. P., Reiner, A. P., Weir, B. S. & Thornton, T. A. Model-free estimation of recent genetic relatedness. Am. J. Hum. Genet. 98, 127–148 (2016).
50. Gusev, A. et al. Whole population, genome-wide mapping of hidden relatedness. Genome Res. 19, 318–326 (2009).
51. Chang, C. C. et al. Second-generation PLINK: rising to the challenge of larger and richer datasets. Gigascience 4, 7 (2015).
52. Reich, D. et al. Reconstructing Native American population history. Nature 488, 370–374 (2012).
53. Yang, J., Lee, S. H., Goddard, M. E. & Visscher, P. M. GCTA: a tool for genome-wide complex trait analysis. Am. J. Hum. Genet. 88, 76–82 (2011).
54. Price, A. L. et al. Principal components analysis corrects for stratification in genome-wide association studies. Nat. Genet. 38, 904–909 (2006).
55. Chen, C.-Y. et al. Improved ancestry inference using weights from external reference panels. Bioinformatics 29, 1399–1406 (2013).
56. Ziyatdinov, A. et al. lme4qtl: linear mixed models with flexible covariance structure for genetic studies of related individuals. BMC Bioinformatics 19, 68 (2018).
57. Alexander, D. H., Novembre, J. & Lange, K. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 19, 1655–1664 (2009).
58. Schick, U. M. et al. Genome-wide association study of platelet count identifies ancestry-specific loci in Hispanic/Latino Americans. Am. J. Hum. Genet. 98, 229–242 (2016).
59. Balduzzi, S., Rücker, G. & Schwarzer, G. How to perform a meta-analysis with R: a practical tutorial. Evid. Based Ment. Health 22, 153–160 (2019).
60. Yang, J. et al. Common SNPs explain a large proportion of the heritability for human height. Nat. Genet. 42, 565–569 (2010).
61. Wu, M. C. et al. Rare-variant association testing for sequencing data with the sequence kernel association test. Am. J. Hum. Genet. 89, 82–93 (2011).
62. Bakshi, A. et al. Fast set-based association analysis using summary data from GWAS identifies novel gene loci for human complex traits. Sci. Rep. 6, 32894 (2016).
63. Szpiech, Z. A. & Hernandez, R. D. selscan: an efficient multithreaded program to perform EHH-based scans for positive selection. Mol. Biol. Evol. 31, 2824–2827 (2014).
64. Marcus, J. H. & Novembre, J. Visualizing the geography of genetic variants. Bioinformatics 33, 594–595 (2017).
65. Kelleher, J., Etheridge, A. M. & McVean, G. Efficient coalescent simulation and genealogical analysis for large sample sizes. PLoS Comput. Biol. 12, e1004842 (2016).
66. International HapMap Consortium. The International HapMap Project. Nature 426, 789–796 (2003).
67. Gravel, S. et al. Demographic history and rare allele sharing among human populations. Proc. Natl Acad. Sci. USA 108, 11983–11988 (2011).
68. Rao, S. S. P. et al. A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell 159, 1665–1680 (2014).
69. Lin, D. et al. Digestion-ligation-only Hi-C is an efficient and cost-effective method for chromosome conformation capture. Nat. Genet. 50, 754–763 (2018).
70. Dixon, J. R. et al. Chromatin architecture reorganization during stem cell differentiation. Nature 518, 331–336 (2015).

Acknowledgements

We thank D. B. Moody for discussions, T. Horn for his feedback on optimizing skin immunohistochemistry and J. N. Katz for advising us on a structured clinical assessment of the musculoskeletal system. The study was supported by the National Institutes of Health (NIH) TB Research Unit Network, grants U19-AI111224-01 and U01-HG009088. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH. S.A. was supported by the Swiss National Science Foundation (SNSF) postdoctoral mobility fellowships P2ELP3_172101 and P400PB_183823.

Author information

Authors and Affiliations

Center for Data Sciences, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA
Samira Asgari, Yang Luo, Xinyi Li, Kamil Slowikowski & Soumya Raychaudhuri
Division of Rheumatology, Inflammation, and Immunity, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA
Samira Asgari, Yang Luo, Xinyi Li, Kamil Slowikowski & Soumya Raychaudhuri
Division of Genetics, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA
Samira Asgari, Yang Luo, Xinyi Li, Kamil Slowikowski & Soumya Raychaudhuri
Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
Samira Asgari, Yang Luo, Ali Akbari, Xinyi Li, Eric Bartell, Kamil Slowikowski, Joel N. Hirschhorn & Soumya Raychaudhuri
Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
Samira Asgari, Yang Luo, Xinyi Li, Eric Bartell, Kamil Slowikowski & Soumya Raychaudhuri
Department of Genetics, Harvard Medical School, Boston, MA, USA
Ali Akbari & Joel N. Hirschhorn
The Institute for Genomic Health, Icahn School of Medicine at Mount Sinai, New York, NY, USA
Gillian M. Belbin & Eimear E. Kenny
Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
Gillian M. Belbin & Eimear E. Kenny
Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
Gillian M. Belbin & Eimear E. Kenny
Program in Personalized and Genomic Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
Daniel N. Harris & Timothy D. O’Connor
Department of Genetics, University of Pennsylvania, Philadelphia, PA, USA
Daniel N. Harris
Pathology Service, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
Martin Selig & Rosalynn M. Nazarian
Division of Endocrinology and Center for Basic and Translational Obesity Research, Boston Children’s Hospital, Boston, MA, USA
Eric Bartell & Joel N. Hirschhorn
Socios En Salud, Lima, Peru
Roger Calderon, Carmen Contreras, Rosa Yataco, Judith Jimenez, Chandel Farroñay & Leonid Lecca
School of Social Work, University of South Florida, Tampa, FL, USA
Jerome T. Galea
Department of Global Health and Social Medicine, and Division of Global Health Equity, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA
Julia M. Coit & Megan B. Murray
Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
Timothy D. O’Connor
Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
Harry C. Dietz
Howard Hughes Medical Institute, Johns Hopkins University School of Medicine, Baltimore, MD, USA
Harry C. Dietz
Department of Pediatrics, Harvard Medical School, Boston, MA, USA
Joel N. Hirschhorn
Instituto Nacional de Salud, Lima, Peru
Heinner Guio
Department of Dermatology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
Esther E. Freeman
Centre for Genetics and Genomics Versus Arthritis, Manchester Academic Health Science Centre, The University of Manchester, Manchester, UK
Soumya Raychaudhuri

Contributions

S.R. and M.B.M. designed the study. S.A. analysed and interpreted the data. S.A. and S.R. drafted the manuscript. Y.L., G.M.B., E.E.K., J.N.H., E.B., K.S., H.G., T.D.O., A.A., D.N.H. and X.L. performed statistical analysis. M.B.M., L.L., R.C., J.M.C., C.C., R.Y., J.T.G., J.J. and C.F. recruited patients and obtained samples for this study. S.R., E.E.F., H.C.D., R.M.N. and M.S. conducted clinical assessment. All authors discussed the results and commented on the manuscript.

Corresponding author

Correspondence to Soumya Raychaudhuri.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Nature thanks Guillaume Lettre, Ben Voight and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Peruvian population structure.

a, b, PCA of genotyping data from Peruvian individuals included in this study (n = 3,134 individuals) merged with the data from continental populations from phase 3 of the 1000 Genomes Project (n = 3,469 individuals) as well as the data from Siberian and Native American populations from the previously published study⁵² (n = 738 individuals), which were used as a reference panel (number of variants, 34,936). Dots, individuals; colour, populations (AFR, African; AMR, South American; EAS, east Asian; SAS, south Asian; EUR, European; SIB, Siberian; NAT, Native American). c, Global ancestry analysis using ADMIXTURE (K = 4). We observed varying levels of European, African and Asian admixture in our cohort (n = 3,134 individuals) with a median proportion of Native American, European, African and Asian ancestry per individual of 0.83 (IQR = 0.72–0.91), 0.14 (0.08–0.21), 0.01 (0.003–0.03) and 0.003 (10⁻⁵–0.01), respectively. Vertical lines, individuals; colours, genomic proportion of a given ancestry in the genome of each individual. ADMIXTURE analysis (K = 4) is done using all populations in phase 3 of the 1000 Genomes Project as well as the Siberian and Native American populations from the previously published study⁵², which were used as a reference. African (AFR) ancestry includes Yoruba in Ibadan, Nigeria, Luhya in Webuye, Kenya, Gambian in Western Divisions in the Gambia, Mende in Sierra Leone, Esan in Nigeria, Americans of African Ancestry in southwest United States. European (EUR) ancestry includes central European, Utah residents (CEPH) with northern and western European ancestry (USA), Toscani in Italy, Finnish in Finland, British in England and Scotland, Iberian population in Spain. East Asian (EAS) ancestry includes Han Chinese in Beijing, China, Japanese in Tokyo, Japan, Southern Han Chinese, Chinese Dai in Xishuangbanna, China, Kinh in Ho Chi Minh City, Vietnam. 
South Asian (SAS) ancestry includes Gujarati Indian from Houston, Texas (USA), Punjabi from Lahore, Pakistan, Bengali from Bangladesh, Sri Lankan Tamil from the United Kingdom, Indian Telugu from the United Kingdom. Puerto Ricans (PUR) from Puerto Rico. Colombians (CLM) from Medellin, Colombia. Mexicans (MXL) from Los Angeles, California (USA). Peruvian individuals (PEL) from Lima, Peru. Altic, Altaic language family, which includes Yakut, Buryat, Evenki, Tuvinians, Altaian, Mongolian, Dolgan. North Amerind, northern Amerindian language family, which includes Maya, Mixe, Kaqchikel, Algonquin, Ojibwa and Cree. Central Amerind, central Amerindian language family, which includes Pima, Chorotega, Tepehuano, Zapotec, Mixtec and Yaqui. Andean, Andean language family, which includes Quechua, Aymara, Inga, Chilote, Diaguita, Chono, Hulliche and Yaghan. A full list of all populations in all language groups has been published previously⁵².

Extended Data Fig. 2 Association of rs200342067 and height.

a, Single-variant association analysis (n = 3,134 individuals and 7,756,401 variants). Dotted red line, genome-wide significance threshold of 5 × 10⁻⁸. Five SNPs that overlap the coding sequence of FBN1 passed the genome-wide significance threshold. We did not observe any inflation in test statistics (λ = 1.02). Association P values are from two-sided Wald tests. b, rs200342067 reduces height by 2.2 cm in heterozygous individuals (4.4 cm in homozygous individuals; the cohort includes 11 individuals with the C/C genotype, 275 with the C/T genotype and 2,848 with the T/T genotype) and could explain 0.9% of the phenotypic variance in height in our cohort (n = 3,134 individuals). The x axis shows the rs200342067 genotype; the y axis shows the height residuals after adjustments for age, sex and a GRM as random effect.
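
The genotype counts in b are enough to recover the minor allele frequency quoted in the abstract and to check the additive arithmetic. The following Python sketch is illustrative and is not code from the study. It assumes a purely additive model with Hardy-Weinberg genotype proportions, under which a single variant contributes a variance of 2p(1 − p)β²; the function names are hypothetical.

```python
def minor_allele_frequency(n_hom_minor, n_het, n_hom_major):
    """Minor allele frequency from diploid genotype counts."""
    n_individuals = n_hom_minor + n_het + n_hom_major
    minor_alleles = 2 * n_hom_minor + n_het
    return minor_alleles / (2 * n_individuals)


def additive_genetic_variance(p, beta):
    """Variance contributed by one biallelic variant under an additive model
    with Hardy-Weinberg genotype proportions: 2 * p * (1 - p) * beta**2."""
    return 2 * p * (1 - p) * beta ** 2


# Genotype counts reported in the caption: C/C, C/T, T/T.
p = minor_allele_frequency(11, 275, 2848)
beta_cm = 2.2  # per-allele effect on height reported in the abstract

var_variant = additive_genetic_variance(p, beta_cm)
# Assumption: if the variant explains 0.9% of the variance, the implied
# variance of the height residuals is var_variant / 0.009.
implied_total_var = var_variant / 0.009

print(f"minor allele frequency: {p:.4f}")
print(f"homozygote effect: {2 * beta_cm:.1f} cm")
print(f"variance from variant: {var_variant:.3f} cm^2")
print(f"implied residual height s.d.: {implied_total_var ** 0.5:.1f} cm")
```

The recovered frequency, 297 minor alleles out of 6,268 chromosomes, rounds to the 4.7% stated in the abstract. The implied residual standard deviation is only a consistency check under the stated assumptions, not a value reported by the authors.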

Extended Data Fig. 3 rs12441775 DAF (rs12441775*G) and extended haplotype structure in the 1000 Genomes Project.

a, The derived allele, rs12441775*G, has a high frequency in all non-African populations in the 1000 Genomes Project (average DAF in non-Africans = 58% (IQR = 51–64) and in Africans = 4% (IQR = 1–5)). The map is generated using the GGV browser⁶⁴ (http://www.popgen.uchicago.edu/ggv). b–h, Haplotypes that carry the rs12441775*G (major/derived) allele are longer than haplotypes that carry the rs12441775*C (minor/ancestral) allele in non-African populations. Horizontal lines, haplotypes; the position of rs12441775 is marked below the haplotype. At any given position, adjacent haplotypes with the same colour carry identical genotypes between the core SNP (rs12441775) and that site; the dashed line separates the haplotypes that carry the derived (above the line) and ancestral (below the line) alleles.

Extended Data Fig. 4 Haplotypes that carry the rs200342067 allele are longer than what is expected under neutral selection.

a, Haplotype decay around rs200342067 in our cohort (n = 3,134 individuals and 6,268 haplotypes). The position of rs200342067 is marked below the haplotypes. Haplotypes above the dashed line carry the rs200342067*C allele (derived/minor, n = 297 haplotypes) and haplotypes below the dashed line carry the rs200342067*T allele (ancestral/major, n = 5,971 haplotypes). b, Integrated EHH of haplotypes carrying the rs200342067*C allele (n = 297 haplotypes) compared with the integrated EHH of haplotypes carrying 2,380 variants with similar DAF (4.7 ± 1%) that overlap the neutral regions of the genome in our cohort (n = 3,134 individuals). Haplotypes that carry the rs200342067*C allele are longer than 99.2% of the haplotypes carrying similar variants in neutral regions of the genome. Vertical red line, integrated EHH of haplotypes carrying the rs200342067*C allele (integrated EHH = 0.115). c, The same as a, but excluding the nine haplotypes that carry both rs200342067*C and rs12441775*G alleles. d, EHH decay curves for haplotypes carrying the rs200342067*C allele excluding the nine haplotypes that carry both rs200342067*C and rs12441775*G alleles (n = 288 haplotypes) compared with haplotypes carrying 2,309 variants that have a similar DAF to the updated frequency of rs200342067*C (4.6 ± 1%) and that overlap the neutral regions of the genome in our cohort (n = 3,134 individuals). Haplotypes with the rs200342067*C allele are longer than 99.7% of the haplotypes carrying similar variants in the neutral genomic regions. e, Integrated EHH for data shown in d. Vertical red line, integrated EHH for haplotypes carrying the rs200342067*C but not the rs12441775*G allele (integrated EHH = 0.124).
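
For readers unfamiliar with the statistic, extended haplotype homozygosity (EHH) at a given distance from a core allele is the fraction of pairs of core-carrying haplotypes that are identical from the core to that point, and the integrated EHH is the area under that decay curve. The Python sketch below illustrates the idea on toy data. It is not the authors' pipeline (the reference list cites selscan for EHH-based scans); the function names, the toy haplotypes and the position units are all hypothetical.

```python
from collections import Counter
from math import comb


def ehh_curve(haplotypes, core_index):
    """EHH to the right of the core site for haplotypes sharing a core allele.

    haplotypes: at least two equal-length allele strings, all carrying the
    same allele at core_index. Returns one value per site from the core
    outward: the fraction of haplotype pairs identical from core to that site.
    """
    total_pairs = comb(len(haplotypes), 2)
    curve = []
    for end in range(core_index, len(haplotypes[0])):
        # Group haplotypes by their sequence between the core and this site.
        groups = Counter(h[core_index:end + 1] for h in haplotypes)
        identical_pairs = sum(comb(count, 2) for count in groups.values())
        curve.append(identical_pairs / total_pairs)
    return curve


def integrated_ehh(curve, positions):
    """Trapezoidal area under the EHH curve over the given positions."""
    area = 0.0
    for i in range(1, len(curve)):
        area += 0.5 * (curve[i] + curve[i - 1]) * (positions[i] - positions[i - 1])
    return area


def empirical_percentile(value, background):
    """Percentage of background values strictly smaller than value."""
    return 100.0 * sum(b < value for b in background) / len(background)


# Toy data (hypothetical): four haplotypes carrying the core allele at index 0.
toy = ["CAAG", "CAAG", "CAAT", "CGTT"]
curve = ehh_curve(toy, core_index=0)
positions = [0.0, 1.0, 2.0, 3.0]  # arbitrary units, for illustration only
print(curve)
print(integrated_ehh(curve, positions))
```

Comparing the integrated EHH of the focal allele with the values obtained for frequency-matched variants in neutral regions, as `empirical_percentile` does, gives the kind of percentile ranking quoted in the caption.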

Extended Data Fig. 5 Simulation of haplotypes under the neutral demographic model.

a, PCA plot of principal component (PC)2 versus PC1 for simulated individuals (n = 1,000 simulated individuals and 2,000 simulated haplotypes). Individuals were simulated using a demographic model matching the population history of Peru and under neutral selection. Red dots, simulated individuals; other dots, reference populations from the 1000 Genomes Project. b, PCA plot of PC3 versus PC1 as described for a. c, We compared the integrated EHH of rs200342067*C with the integrated EHH of 1,000 variants that had a similar DAF to rs200342067 (DAF = 4.7 ± 1%) and that overlapped the same genomic region as rs200342067 on a simulated chromosome 15 (physical position, 48,773,926 ± 20 kb). The integrated EHH of rs200342067 is more extreme than the integrated EHH observed for any of the variants in the simulated data. The x axis shows the integrated EHH; the distribution is the integrated EHH of variants in simulated haplotypes (n = 2,000 haplotypes); the vertical red line shows the integrated EHH value of rs200342067 in our cohort (n = 6,268 haplotypes, integrated EHH = 0.115). d, e, Similar to c for two different neutral regions on chromosome 15. Vertical red lines, integrated EHH of rs17580697 (d; integrated EHH = 0.012, 76th percentile) and rs305008 (e; integrated EHH = 0.010, 74th percentile) in our cohort (n = 6,268 haplotypes).

Extended Data Fig. 6 Comparison of different selection statistics for rs200342067 and other variants with a similar DAF and recombination rate.

a, Distribution of iHS for 2,062 independent variants (that are at least 1 Mb apart) matched in DAF and local recombination rate to rs200342067. iHS values are calculated for Peruvian individuals in the 1000 Genomes Project (n = 85 individuals) and were obtained from a previously published study¹⁹. Red line, iHS of rs200342067 (iHS = −1.5; 4.7th percentile); green and blue lines, fifth and first percentile of the iHS distribution. b, EHH decay curves for rs200342067 (red line) as well as haplotypes that carry 2,062 independent variants (at least 1 Mb apart) matched in DAF and local recombination rate to rs200342067 in our cohort (n = 6,268 haplotypes; grey lines). c, Distribution of integrated EHH for haplotypes shown in b. Haplotypes carrying the rs200342067*C allele are longer than 97.5% of haplotypes that carry similar variants. The x axis shows the integrated EHH; the red line indicates the integrated EHH of the rs200342067*C allele (integrated EHH = 0.115). d, Histogram of Fisher’s exact test results comparing the extent of allele frequency differences between coastal (n = 46 individuals) and non-coastal (n = 104 individuals) regions in Peru for 2,062 independent variants that were matched in DAF and local recombination rate to rs200342067. The x axis shows the −log₁₀-transformed P values from the two-sided Fisher’s exact test; the dashed blue and green vertical lines show the 99th and 95th percentiles, respectively; the solid red line indicates the −log₁₀-transformed P value of the two-sided Fisher’s exact test (P = 0.0005) for rs200342067 (1.1% percentile). e, Bayenv2 XTX statistics, a measure of deviation from neutral patterns of population structure, for 2,062 independent variants that were matched in DAF and local recombination rate to rs200342067. The x axis shows the XTX statistics; the red line indicates the XTX value for rs200342067 (XTX = 2.13; 8.3th percentile); the green and blue lines show the fifth and first percentile of the XTX distribution, respectively.
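The comparisons in this legend share one pattern: a statistic is computed for the focal variant and then ranked against the same statistic for matched background variants. The sketch below illustrates that pattern with a two-sided Fisher’s exact test on a 2 × 2 table of allele counts and an empirical percentile. It is a generic standard-library illustration, not the authors' code; the function names and every count and background value are hypothetical placeholders rather than data from the study.

```python
from math import comb, log10


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the 2 x 2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)  # tolerance for float ties
    )


def empirical_percentile(focal, background):
    """Percentage of background values strictly below the focal value."""
    below = sum(1 for value in background if value < focal)
    return 100.0 * below / len(background)


# Hypothetical allele counts: rows are groups, columns are alleles.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
focal_score = -log10(p_value)

# Hypothetical -log10(P) values for matched background variants.
background_scores = [0.05, 0.10, 0.20, 0.50]
focal_rank = empirical_percentile(focal_score, background_scores)
```

Whether a low or a high percentile counts as extreme depends on the statistic: for −log₁₀(P) values large values are extreme, whereas the legend reports iHS and XTX against the lower tail.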

Extended Data Fig. 7 Genomic context of rs200342067 FBN1(E1297G).

a, Schematic of FBN1; exons are shown as black bars. Exon 31 (ENSE00001753582) is shown in red. b, The FBN1 exon 31 sequence and PhyloP per-nucleotide conservation score based on multiple sequence alignment of 100 vertebrate species (obtained using the GRCh37 assembly conservation track of the UCSC genome browser). The T>C change due to rs200342067 occurs in a conserved nucleotide. c, Schematic of fibrillin 1 (ENST00000316623.5). Fibrillin 1 consists of the following domains: N- and C-terminal domains (black rectangles), EGF-like domains (striped rectangles), hybrid domains (black pentagons), TGFβ-binding domains (grey ovals), a proline-rich domain (white hexagon) and 43 calcium-binding cbEGF-like domains (white rectangles). cbEGF domain 17, which is affected by rs200342067 FBN1(E1297G), is shown in red; E1297G is located between a conserved cysteine FBN1(C1296) involved in forming a disulfide bond with FBN1(C1284) and a conserved asparagine FBN1(N1298) involved in calcium binding. d, The sequence of FBN1(cbEGF) domain 17 of fibrillin 1 and the three-dimensional structures of cbEGF domains 17 and 18 (the three-dimensional structure was obtained based on homology with the previously published³⁶ cbEGF domains 12 and 13 of fibrillin 1 (PDB 1LMJ)). rs200342067 changes the glutamic acid, a large amino acid with a negatively charged side chain, to glycine, the smallest amino acid with no side chain (shown in red). The side chains are shown for rs200342067 (red spheres), as well as the calcium-interacting residues (beige sticks) and the cysteine residues involved in disulfide bonds (yellow sticks). A calcium ion is shown in green.

Extended Data Fig. 8 Immunohistochemical staining of fibrillin 1.

a, b, Fibrillin 1 staining of skin biopsies from two individuals with the rs200342067 C/C genotype. c, d, Fibrillin 1 staining of skin biopsies from two individuals with the T/T genotype matched for age, sex and ancestry proportions. Individuals with the C/C genotype have less fibrillin 1 deposition in the dermal extracellular matrix and shorter microfibrillar projections from the dermal–epidermal junction into the superficial (papillary) dermis (red arrows, 20×) as well as less fibrillin 1 deposition in the deeper dermis. Two magnifications are shown; the red rectangles in the first column (20× magnification) are magnified in the second column (60×).

Extended Data Fig. 9 Electron microscopy of fibrillin 1 in skin.

a, c, Electron microscopy images of the dermal–epidermal junction in samples from two individuals with the rs200342067 T/T genotype. b, d, Electron microscopy images of the dermal–epidermal junction in samples from two individuals with the rs200342067 C/C genotype who are matched for age, sex and ancestry proportions. Individuals with the C/C genotype have short, fragmented and less densely packed microfibrils with irregular edges (red arrows) and their microfibrils are embedded in less dense collagen bundles (yellow arrows) compared with individuals with the T/T genotype. Two magnifications are shown; the white rectangles in the first column (4,400× magnification; green scale bars, 2 μm) are magnified in the second column (11,000× magnification; yellow scale bars, 1 μm).

Extended Data Table 1 SNPs that overlap the 15q15–21.1 locus

Supplementary information

Supplementary Information

This file contains Supplementary sections 1-8, including Supplementary Figures and Tables, and additional references.

Reporting Summary

Source data

Source Data Fig. 1

Source Data Fig. 2

About this article

Cite this article

Asgari, S., Luo, Y., Akbari, A. et al. A positively selected FBN1 missense variant reduces height in Peruvian individuals. Nature 582, 234–239 (2020). https://doi.org/10.1038/s41586-020-2302-0

Received: 28 February 2019
Accepted: 10 March 2020
Published: 13 May 2020
Issue Date: 11 June 2020
DOI: https://doi.org/10.1038/s41586-020-2302-0
