Skip to main content

Thank you for visiting You are using a browser version with limited support for CSS. To obtain the best experience, we recommend you use a more up to date browser (or turn off compatibility mode in Internet Explorer). In the meantime, to ensure continued support, we are displaying the site without styles and JavaScript.

  • Letter
  • Published:

Whole‐genome sequencing identifies EN1 as a determinant of bone density and fracture


The extent to which low‐frequency (minor allele frequency (MAF) between 1–5%) and rare (MAF ≤ 1%) variants contribute to complex traits and disease in the general population is mainly unknown. Bone mineral density (BMD) is highly heritable, a major predictor of osteoporotic fractures, and has been previously associated with common genetic variants1,2,3,4,5,6,7,8, as well as rare, population‐specific, coding variants9. Here we identify novel non‐coding genetic variants with large effects on BMD (ntotal = 53,236) and fracture (ntotal = 508,253) in individuals of European ancestry from the general population. Associations for BMD were derived from whole‐genome sequencing (n = 2,882 from UK10K (ref. 10); a population‐based genome sequencing consortium), whole‐exome sequencing (n = 3,549), deep imputation of genotyped samples using a combined UK10K/1000 Genomes reference panel (n = 26,534), and de novo replication genotyping (n = 20,271). We identified a low‐frequency non‐coding variant near a novel locus, EN1, with an effect size fourfold larger than the mean of previously reported common variants for lumbar spine BMD8 (rs11692564(T), MAF = 1.6%, replication effect size = +0.20 s.d., Pmeta = 2 × 10−14), which was also associated with a decreased risk of fracture (odds ratio = 0.85; P = 2 × 10−11; ncases = 98,742 and ncontrols = 409,511). Using an En1cre/flox mouse model, we observed that conditional loss of En1 results in low bone mass, probably as a consequence of high bone turnover. We also identified a novel low‐frequency non‐coding variant with large effects on BMD near WNT16 (rs148771817(T), MAF = 1.2%, replication effect size = +0.41 s.d., Pmeta = 1 × 10−11). In general, there was an excess of association signals arising from deleterious coding and conserved non‐coding variants. These findings provide evidence that low‐frequency non‐coding variants have large effects on BMD and fracture, thereby providing rationale for whole‐genome sequencing and improved imputation reference panels to study the genetic architecture of complex traits and disease in the general population.

This is a preview of subscription content, access via your institution

Access options

Buy this article

Prices may be subject to local taxes which are calculated during checkout

Figure 1: Association signals near engrailed homeobox-1 for lumbar spine BMD.
Figure 2: Genome‐wide features of association signals.
Figure 3: Mouse En1 functional experiments.

Similar content being viewed by others


  1. Richards, J. B. et al. Bone mineral density, osteoporosis, and osteoporotic fractures: a genome‐wide association study. Lancet 371, 1505–1512 (2008)

    Article  CAS  Google Scholar 

  2. Styrkarsdottir, U. et al. Multiple genetic loci for bone mineral density and fractures. N. Engl. J. Med. 358, 2355–2365 (2008)

    Article  CAS  Google Scholar 

  3. Styrkarsdottir, U. et al. New sequence variants associated with bone mineral density. Nature Genet. 41, 15–17 (2009)

    Article  CAS  Google Scholar 

  4. Rivadeneira, F. et al. Twenty bone‐mineral‐density loci identified by large‐scale meta‐analysis of genome‐wide association studies. Nature Genet. 41, 1199–1206 (2009)

    Article  CAS  Google Scholar 

  5. Duncan, E. L. et al. Genome‐wide association study using extreme truncate selection identifies novel genes affecting bone mineral density and fracture risk. PLoS Genet. 7, e1001372 (2011)

    Article  CAS  Google Scholar 

  6. Koller, D. L. et al. Genome‐wide association study of bone mineral density in premenopausal European‐American women and replication in African‐American women. J. Clin. Endocrinol. Metab. 95, 1802–1809 (2010)

    Article  CAS  Google Scholar 

  7. Xiong, D.‐H. et al. Genome‐wide association and follow‐up replication studies identified ADAMTS18 and TGFBR3 as bone mass candidate genes in different ethnic groups. Am. J. Hum. Genet. 84, 388–398 (2009)

    Article  CAS  Google Scholar 

  8. Estrada, K. et al. Genome‐wide meta‐analysis identifies 56 bone mineral density loci and reveals 14 loci associated with risk of fracture. Nature Genet. 44, 491–501 (2012)

    Article  CAS  Google Scholar 

  9. Styrkarsdottir, U. et al. Nonsense mutation in the LGR4 gene is associated with several human diseases and other traits. Nature 497, 517–520 (2013)

    Article  CAS  ADS  Google Scholar 

  10. The UK10K Consortium The UK10K project identifies rare variants in health and disease. Nature (this issue)

  11. Hindorff, L. A. et al. Potential etiologic and functional implications of genome‐wide association loci for human diseases and traits. Proc. Natl Acad. Sci. USA 106, 9362–9367 (2009)

    Article  CAS  ADS  Google Scholar 

  12. Kiezun, A. et al. Exome sequencing and the genetic basis of complex traits. Nature Genet. 44, 623–630 (2012)

    Article  CAS  Google Scholar 

  13. Gudbjartsson, D. F. et al. Large‐scale whole‐genome sequencing of the Icelandic population. Nature Genet. 47, 435–452 (2015)

    Article  CAS  Google Scholar 

  14. Sulem, P. et al. Identification of a large set of rare complete human knockouts. Nature Genet. 47, 448–444 (2015)

    Article  CAS  ADS  Google Scholar 

  15. Abecasis, G. R. et al. An integrated map of genetic variation from 1,092 human genomes. Nature 491, 56–65 (2012)

    Article  ADS  Google Scholar 

  16. Richards, J. B., Zheng, H.‐F. & Spector, T. D. Genetics of osteoporosis from genome‐wide association studies: advances and challenges. Nature Rev. Genet. 13, 576–588 (2012)

    Article  CAS  Google Scholar 

  17. Huang, J. et al. Improved imputation of low‐frequency and rare variants using the UK10K haplotype reference panel. Nature Comm. 6, 8111 (2015)

    Article  CAS  ADS  Google Scholar 

  18. Xu, C., Tachmazidou, I., Walter, K., Ciampi, A., Zeggini, E. & Greenwood, C. M. T. Estimating genome‐wide significance for whole‐genome sequencing studies. Genet. Epidemiol. 38, 281–290 (2014)

    Article  Google Scholar 

  19. Dixon, J. R. et al. Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature 485, 376–380 (2012)

    Article  CAS  ADS  Google Scholar 

  20. Loomis, C. A. et al. The mouse Engrailed‐1 gene and ventral limb patterning. Nature 382, 360–363 (1996)

    Article  CAS  ADS  Google Scholar 

  21. Adamska, M., MacDonald, B. T., Sarmast, Z. H., Oliver, E. R. & Meisler, M. H. En1 and Wnt7a interact with Dkk1 during limb development in the mouse. Dev. Biol. 272, 134–144 (2004)

    Article  CAS  Google Scholar 

  22. Deckelbaum, R. A., Majithia, A., Booker, T., Henderson, J. E. & Loomis, C. A. The homeoprotein engrailed 1 has pleiotropic functions in calvarial intramembranous bone formation and remodeling. Development 133, 63–74 (2006)

    Article  CAS  Google Scholar 

  23. Matise, M. P. & Joyner, A. L. Expression patterns of developmental control genes in normal and Engrailed‐1 mutant mouse spinal cord reveal early diversity in developing interneurons. J. Neurosci. 17, 7805–7816 (1997)

    Article  CAS  Google Scholar 

  24. Sgaier, S. K. et al. Genetic subdivision of the tectum and cerebellum into functionally related regions based on differential sensitivity to engrailed proteins. Development 134, 2325–2335 (2007)

    Article  CAS  Google Scholar 

  25. Ackert‐Bicknell, C. L. et al. Mouse BMD quantitative trait loci show improved concordance with human genome‐wide association loci when recalculated on a new, common mouse genetic map. J. Bone Miner. Res. 25, 1808–1820 (2010)

    Article  Google Scholar 

  26. Zheng, H.‐F. et al. WNT16 influences bone mineral density, cortical bone thickness, bone strength, and osteoporotic fracture risk. PLoS Genet. 8, e1002745 (2012)

    Article  CAS  Google Scholar 

  27. Medina‐Gomez, C. et al. Meta‐analysis of genome‐wide scans for total body BMD in children and adults reveals allelic heterogeneity and age‐specific effects at the WNT16 locus. PLoS Genet. 8, e1002718 (2012)

    Article  Google Scholar 

  28. Movérare‐Skrtic, S. et al. Osteoblast‐derived WNT16 represses osteoclastogenesis and prevents cortical bone fragility fractures. Nature Med. 20, 1279–1288 (2014)

    Article  Google Scholar 

  29. Ladouceur, M., Dastani, Z., Aulchenko, Y. S., Greenwood, C. M. T. & Richards, J. B. The empirical power of rare variant association methods: results from Sanger sequencing in 1,998 individuals. PLoS Genet. 8, e1002496 (2012)

    Article  CAS  Google Scholar 

  30. Tang, H. et al. A large‐scale screen for coding variants predisposing to psoriasis. Nature Genet. 46, 45–50 (2014)

    Article  CAS  Google Scholar 

  31. Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009)

    Article  CAS  Google Scholar 

  32. Mägi, R. & Morris, A. P. GWAMA: software for genome‐wide association meta‐analysis. BMC Bioinformatics 11, 288 (2010)

    Article  Google Scholar 

  33. Yang, J. et al. Conditional and joint multiple‐SNP analysis of GWAS summary statistics identifies additional variants influencing complex traits. Nature Genet. 44, 369–375 (2012)

    Article  CAS  Google Scholar 

  34. Voorman, A. A., Brody, J. & Lumley, T. SkatMeta: an R package for meta‐analyzing region‐based tests of rare DNA variants. ( (2013)

  35. Wu, M. C. et al. Rare‐variant association testing for sequencing data with the sequence kernel association test. Am. J. Hum. Genet. 89, 82–93 (2011)

    Article  CAS  Google Scholar 

  36. Davydov, E. V. et al. Identifying a high fraction of the human genome to be under selective constraint using GERP++. PLOS Comput. Biol. 6, e1001025 (2010)

    Article  Google Scholar 

  37. Chang, C. C. et al. Second‐generation PLINK: rising to the challenge of larger and richer datasets. Gigascience 4, 7 (2015)

    Article  Google Scholar 

  38. McLaren, W. et al. Deriving the consequences of genomic variants with the Ensembl API and SNP effect predictor. Bioinformatics 26, 2069–2070 (2010)

    Article  CAS  Google Scholar 

  39. Hanks, M., Wurst, W., Anson‐Cartwright, L., Auerbach, A. B. & Joyner, A. L. Rescue of the En‐1 mutant phenotype by replacement of En‐1 with En‐2. Science 269, 679–682 (1995)

    Article  CAS  ADS  Google Scholar 

Download references


Full acknowledgements are listed in the Supplementary Information.

Author information

Authors and Affiliations




Principal Investigators: A.H., A.J., A.U., A.X.‐A., B.L., C.A.‐B., Ch.C., C.L., Cl.C., C.L.D., C.M.v.D., C.O., D.S.E., D.Ga., D.Go., D.Gr., D.H., D.Ki., D.M., E.D., E.O., F.Ri., F.Ro., G.D.S., J.B.R., J.D., J.Re., J.Ri., J.‐T.K., J.Tu., K.A., L.A.C., L.L., L.P.G.M.d.G., M.B., M.M.F., N.S., N.T., N.v.d.V., N.vS., P.R., R.D., R.D.J., R.L.P., S.G.W., S.H.R., T.H., T.P., T.S., U.P.‐K.,V.G., X.N.,Y.‐H.H. Genotyping: AOGC Consortium, A.U., B.M., B.W., C.L., C.M.v.D., C.N., C.O., C.W., D.C., D.Gr., E.D., E.O., F.Ri., G.G., G.Tr., J.Er., J.Jv.M., J.Re., J.Ri., J.‐T.K., J.v.R., M.B., M.C.F., M.J., M.Z., N.A., N.G.‐G., N.S., N.T., P.Ar., P.D., P.R., R.K., S.H.R., S.M., S.R., U.P.‐K., X.N. and Y.‐H.H. Phenotyping: A.E., A.H., A.L., AOGC Consortium, A.P.H., A.U., A.X.‐A., B.M., C.G., C.K., C.L., C.L.D., C.M.v.D., C.O., D.Go., D.Ka., D.Ki., D.M., E.D., E.N., E.O., F.E.M., F.K., F.Ri., F.Ro., G.H., J.B., J.C., J.Ei., J.O., J.Re., J.Ri., J.‐T.K., J.To., K.E., K.S., K.T., L.O., L.R., L.V., M.B., M.C.F., M.M.F., M.C.Z., M.K., M.Z., N.A., N.S., N.T., O.L., O.S., R.L.P., S.G.W., S.G., S.H.R., S.K., T.N., T.S. and U.P.‐K. Functional experiments: A.J., A.R.‐D., B.Ge, C.A.‐B., C.H., C.L., C.L.D., C.O., C.U., D.Ga., D.P., E.G., H.Y.P.‐M., J.D., J.F., K.Ch., Ma.M., M.H., N.S., O.S., S.B., S.C., S.‐H.C., St.W., T.K., T.P., U.P.‐K., W.C. and X.J. Data analysis: A.E., A.K., A.S., A.V.S., B.M., C.A.‐B., Ch.C., C.‐H.C., C.K., C.L., C.L.D., C.M.‐G., C.M.T.G., C.O., C.T.L., C.W., D.S.E., D.M.E., D.C., D.Ka., D.M., D.P., E.D., E.G., E.N., F.G., F.Ri., G.H., G.Th., H.‐F.Z., J.B.R., J.D., J.Er., J.F., J.Ha., J.Hu., J.K., J.v.R., K.Ch., K.E., K.W., L.A.C., L.H., L.M., L.O., L.R., L.V., M.B., M.C., M.H., M.K., N.A., N.S., N.T., O.L., P.Au., P.D., P.L., R.L., S.B., S.C., S.G.W., S.K., U.P., U.P.‐K., V.F., W.‐C.C., Y.‐H.H., Y.M. and Y.Z. Meta‐analysis: H.‐F.Z., V.F. and Y.‐H.H. Lead analysts: H.‐F.Z. and V.F. Wrote first draft: J.B.R.

Corresponding author

Correspondence to J. Brent Richards.

Ethics declarations

Competing interests

Authors from deCODE Genetics are employees of deCODE Genetics. Authors from 23andMe are employees of 23andMe. Remaining authors declare no competing financial interests.

Additional information

Source code used in preparation of results is available at BMD discovery meta‐analysis results are available from Information pertaining to UK10K can be obtained from

Lists of participants and their affiliations appear in the Supplementary Information.

Lists of participants and their affiliations appear in the Supplementary Information.

Extended data figures and tables

Extended Data Figure 1 Discovery single variant meta‐analysis.

a, Overall study design. b, From top to bottom, quantile–quantile plots for the sex‐combined single SNV meta‐analysis, sex‐stratified single SNV meta‐analysis (forearm phenotype consists solely of female‐only cohorts), and sex‐combined single SNV conditional meta‐analysis Plots depicts P values prior (blue) and after (red) conditional analysis on genome-wide significant variants (see Supplementary Methods). c, From top to bottom, Manhattan plots for sex‐combined meta‐analysis for lumbar spine BMD, femoral neck BMD, and forearm BMD. Each plot depicts variants from the UK10K/1000G reference panel with MAF > 0.5% across the 22 autosomes (odd, grey; even, black) against the −log10 P value from the meta‐analysis of 7 cohorts (dots). Also depicted are the subset variants from the reference panel that are also present in ref. 8 with P value <5 × 10−6 (diamonds). Variants with MAF < 5% and P < 1.2 × 10−6 are also depicted (red). d, Quantile–quantile plots for the sex‐combined meta‐analysis of lumbar spine, femoral neck, and forearm BMD for SNVs present across both exome‐sequenced and genome-sequenced and imputed cohorts, that is, SNV present only in genome-sequenced or imputed cohorts are not shown. e, Manhattan plot for the meta‐analysis of sex‐combined results for lumbar spine BMD for SNVs present in exome‐sequenced and genome-sequenced and imputed cohorts, that is, SNV present only in genome-sequenced or imputed cohorts are not shown (from left to right: lumbar spine, forearm and femoral neck BMD).

Extended Data Figure 2 Forest plots by cohort for genome‐wide significant loci from discovery meta‐analysis.

Forest plots for three BMD phenotypes are shown. Title of each plot includes gene overlapping the SNV and its genomic position on build hg19. P values are from fixed‐effect meta‐analysis (see Supplementary Information).

Extended Data Figure 3 Gene expression in human and mouse.

a, Quantification of Dock8 expression and its temporal pattern through RNA‐seq in cultured calvarial murine osteoblasts across day 2 through to day 18 of osteoblast development. Shown for comparison is Bglap, which encodes osteocalcin, a critical protein in osteoblasts. b, Quantification of expression of genome‐wide significant genes and their temporal pattern through RNA‐seq in cultured calvarial murine osteoblasts across day 2 through to day 18 of osteoblast development. c, Expression of EN1 mRNA in human cells presented as per cent of GAPDH mRNA. d, Expression of En1 in control and sdEn1 mice in purified osteoblast culture. For osteoblast marker gene expression, total mRNAs were purified from osteoblast cultures at day 10 and measured using quantitative real‐time PCR. mRNA levels were normalized relative to GAPDH mRNA. e. Real‐time PCR expression of control and sdEn1 as compared to 18S mRNA in whole vertebral bone extract. All data are shown as mean ± s.e.m. Significance computed by Student’s unpaired t‐test.

Extended Data Figure 4 Histological assessment of En1cre‐expressing cells in skeletal cells of the vertebra.

a, Lineage history of En1cre‐expressing cells in skeletal cells of the vertebra. The En1cre allele was combined with the R26LSL‐YFP reporter allele and examined using frozen fluorescent immunohistochemistry and alkaline phosphatase (AP) staining. Cell nuclei were detected with DAPI. YFP‐expressing cells have expressed Cre (En1) at some time in their history. In subpanel A, control animals lacking the R26LSL‐YFP reporter show low background YFP signal (green). In subpanel B, En1cre/+; R26LSL‐YFP/+ mice YFP‐expressing cells are detected in the growth plate chondrocytes of the vertebra (asterisk), trabecular bone lining cells (arrow) and osteocytes (arrowhead). Note, high fluorescent background staining in the marrow space. In subpanel C, the same section is shown stained for AP activity using the Fast Red substrate. Strong activity is present in the hypertrophic chondrocytes of the growth plate and trabecular bone lining cells (arrow). In subpanel D, alignment of the AP and YFP images shows that the trabecular lining cells co‐express AP and YFP. b, Co‐localization of En1 and alkaline phosphatase expression. Images of lumbar vertebrae sections (growth plate and trabecular bone regions, ×40 magnification) from two‐month old En1lacZ/+ mice (see Fig. 3b), stained for LacZ and alkaline phosphatase (AP), false‐coloured as indicated. Double‐positive cells are indicated by arrows, single‐positive cells are indicated by arrowheads (LacZ+) or asterisks (AP+). Except for some chondrocytes, most AP+ cells are also LacZ+, that is, express En1. The bone marrow was digitally removed, as it contains no AP+ cells.

Extended Data Figure 5 Micro‐CT results for control (En1flox/+) and self‐deleting En1 knockout (sdEn1, En1cre/flox) animals.

a, Trabecular bone micro‐CT images from lumbar vertebra 5. b, Morphological characteristics at lumbar vertebra 4, 5, and 6 (from bottom to top). c, d, Morphological characteristics of left femur trabecular bone (c) and left femur cortical bone (d). e, Micro‐CT parameter results for the comparison of control and sdEn1 animals at lumbar vertebra 5, femur trabecula, and femur cortical bone. Horizontal lines denote mean of observations. Significance between control and sdEn1 is calculated using an unpaired t‐test.

Extended Data Figure 6 Novel association from 7q31.3.

a, Chromatin interaction data from Hi‐C performed in H1 embryonic stem cells23 of a 2 Mb region encompassing rs148771817 (red and identified by arrow) and WNT16b, The left axis denotes the association P value (red and green lines at P = 1.2 × 10−5 and 1.2 × 10−8, respectively). The novel genome‐wide significant SNV, rs148771817, within an intron of CPED1, and the lead genome‐wide significant SNV rs7776725 upstream to WNT16 (within FAM3C) are in low LD with each other. c, Allele frequency versus absolute effect size (in standard deviations) for forearm BMD of all previously identified genome‐wide significant variants (blue)8 and the novel variant within CPED1 (red), rs148771817 from replication meta‐analysis. The blue line denotes the mean of effect sizes for previously reported forearm BMD variants. d, Meta‐analysis summary statistics of rs148771817 conditioned on rs7776725.

Extended Data Figure 7 Regional plots of genome‐wide significant loci from single‐SNV association tests for forearm and femoral neck BMD.

Each regional plot depicts SNVs within 1 Mb of a locus’ lead SNV (x axis) and their associated meta‐analysis P value (−log10). SNVs are colour-coded according to r2 with the lead SNV (labelled, r2 calculated from UK10K whole‐genome sequencing data set). Recombination rate (blue line), and the position of genes, their exons and the direction of transcription are also displayed (below plot).

Extended Data Figure 8 Regional plots of genome‐wide significant loci from single‐SNV association tests from lumbar spine BMD.

Each regional plot depicts SNVs within 1 Mb of a locus’ lead SNV (x axis) and their associated meta‐analysis P value (−log10). SNVs are colour coded according to r2 with the lead SNV (labelled, r2 calculated from UK10K whole genome sequencing data set). Recombination rate (blue line), and the position of genes, their exons and the direction of transcription are also displayed (below plot).

Extended Data Figure 9 Region‐based association tests using skatMeta for windows of 30 SNVs and window step of 20 SNVs.

a, Left, quantile–quantile plots for forearm (FA) BMD, femoral neck (FN) BMD, and lumbar spine (LS) BMD. For each MAF range considered (<5% or < 1%), analysis was conducted across all variants, variant overlapping coding exons, and variants with GERP++ score >1. b, Right, Manhattan plots forearm BMD, femoral neck BMD, and lumbar spine BMD. For each MAF range considered (<5% or < 1%), analysis was conducted across all variants, variant overlapping coding exons, and variants with GERP++ score >1. Blue lines indicate genome‐wide suggestive (P = 1.2 × 10−6) thresholds and red lines indicate genome‐wide significant (P = 1.2 × 10−8) thresholds.

Extended Data Figure 10 Single variant analysis of signals from region‐based tests.

a, Drop‐one SNV (left) and drop‐one cohort (right) for genome‐wide significant 30 SNV windows for femoral neck and forearm BMD from skatMeta analysis. On left, for a given 30 SNV window, the −log10P of skatMeta test for 29 SNVs, excluding (that is, dropping) the SNV at position labelled on the x axis. On right, for given 30 SNV window on left, the −log10P of skatMeta test for all cohorts, excluding (that is, dropping) cohort labelled on x axis. b, Regional view of CPED1/WNT16 locus for forearm BMD. Significant SNVs from single variant meta‐analysis (rs148771817 and rs79162867, in blue) overlap significant regions found using region‐based test (red bars).

Supplementary information

Supplementary Information

This file contains Supplementary Text and Supplementary References. (PDF 1093 kb)

Supplementary Tables

This file contains Supplementary Tables 1-19. (XLSX 757 kb)

PowerPoint slides

Source data

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Zheng, H., Forgetta, V., Hsu, Y. et al. Whole‐genome sequencing identifies EN1 as a determinant of bone density and fracture. Nature 526, 112–117 (2015).

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI:

This article is cited by


By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.


Quick links

Nature Briefing

Sign up for the Nature Briefing newsletter — what matters in science, free to your inbox daily.

Get the most important science stories of the day, free in your inbox. Sign up for Nature Briefing