  • Article
Calibrated prediction intervals for polygenic scores across diverse contexts

Abstract

Polygenic scores (PGS) have emerged as the tool of choice for genomic prediction in a wide range of fields. We show that PGS performance varies broadly across contexts and biobanks. Contexts such as age, sex and income can impact PGS accuracy with a magnitude similar to that of genetic ancestry. Here we introduce an approach (CalPred) that models all contexts jointly to produce prediction intervals that vary across contexts to achieve calibration (that is, they include the trait value with 90% probability), whereas existing methods are miscalibrated. In analyses of 72 traits across large and diverse biobanks (All of Us and UK Biobank), we find that prediction intervals required adjustment by up to 80% for quantitative traits. For disease traits, PGS-based predictions were miscalibrated across socioeconomic contexts such as annual household income levels, further highlighting the need to account for context information in PGS-based prediction across diverse populations.
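
One way to write down context-specific prediction intervals of this kind is a normal model whose mean and variance both depend on an individual's contexts, in the spirit of the notation y ~ N(PGS + PGSxC, VbyC) used in the extended data captions. The log-linear form of the variance and the symbols below (c_i for the vector of PGS, contexts and intercept; β_μ and β_σ for the coefficients) are assumptions made for illustration and are not stated on this page.

```latex
\begin{aligned}
y_i \mid \mathbf{c}_i &\sim \mathcal{N}\!\left(\mu_i,\ \sigma_i^2\right), \\
\mu_i &= \mathbf{c}_i^{\top}\boldsymbol{\beta}_{\mu}, \qquad
\log \sigma_i^2 = \mathbf{c}_i^{\top}\boldsymbol{\beta}_{\sigma}, \\
\text{90\% interval:}\quad & \mu_i \pm z_{0.95}\,\sigma_i, \qquad z_{0.95} \approx 1.645 .
\end{aligned}
```

Under this sketch, calibration means that the interval contains y_i with probability 0.9 within every context stratum, not only on average across the cohort.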

Fig. 1: Calibrated and context-specific prediction intervals via CalPred.
Fig. 2: Widespread context-specific PGS prediction accuracy in UK Biobank.
Fig. 3: Widespread context-specific PGS prediction accuracy in All of Us.
Fig. 4: Simulation studies with gene–context interactions.
Fig. 5: Simulation studies with multiple contexts.
Fig. 6: CalPred PGS calibration of LDL in UK Biobank.
Fig. 7: Variation of prediction s.d. accounting for all contexts.
Fig. 8: Calibration of T2D risk prediction across income groups.

Data availability

UK Biobank individual-level genotype and phenotype data are available through application at http://www.ukbiobank.ac.uk. All of Us individual-level genotype and phenotype data are available through application at https://www.researchallofus.org.

Code availability

Software implementing CalPred and code for data processing and the main analyses are available via GitHub at https://github.com/KangchengHou/calpred (ref. 62) and https://github.com/KangchengHou/calpred-manuscript (ref. 63).

References

  1. Chatterjee, N., Shi, J. & García-Closas, M. Developing and evaluating polygenic risk prediction models for stratified disease prevention. Nat. Rev. Genet. 17, 392–406 (2016).

  2. Torkamani, A., Wineinger, N. E. & Topol, E. J. The personal and clinical utility of polygenic risk scores. Nat. Rev. Genet. 19, 581–590 (2018).

  3. Li, R., Chen, Y., Ritchie, M. D. & Moore, J. H. Electronic health records and polygenic risk scores for predicting disease risk. Nat. Rev. Genet. 21, 493–502 (2020).

  4. Kullo, I. J. et al. Polygenic scores in biomedical research. Nat. Rev. Genet. 23, 524–532 (2022).

  5. Martin, A. R. et al. Clinical use of current polygenic risk scores may exacerbate health disparities. Nat. Genet. 51, 584–591 (2019).

  6. Ding, Y. et al. Large uncertainty in individual polygenic risk score estimation impacts PRS-based risk stratification. Nat. Genet. 54, 30–39 (2022).

  7. Privé, F. et al. Portability of 245 polygenic scores when derived from the UK Biobank and applied to 9 ancestry groups from the same cohort. Am. J. Hum. Genet. 109, 12–23 (2022).

  8. Weissbrod, O. et al. Leveraging fine-mapping and multipopulation training data to improve cross-population polygenic risk scores. Nat. Genet. 54, 450–458 (2022).

  9. Ruan, Y. et al. Improving polygenic prediction in ancestrally diverse populations. Nat. Genet. 54, 573–580 (2022).

  10. Bitarello, B. D. & Mathieson, I. Polygenic scores for height in admixed populations. G3 10, 4027–4036 (2020).

  11. Mostafavi, H. et al. Variable prediction accuracy of polygenic scores within an ancestry group. eLife 9, e48376 (2020).

  12. Jiang, X., Holmes, C. & McVean, G. The impact of age on genetic risk for common diseases. PLoS Genet. 17, e1009723 (2021).

  13. Hui, D. et al. Quantifying factors that affect polygenic risk score performance across diverse ancestries and age groups for body mass index. Pac. Symp. Biocomput. 28, 437–448 (2023).

  14. Wray, N. R. et al. Pitfalls of predicting complex traits from SNPs. Nat. Rev. Genet. 14, 507–515 (2013).

  15. Ge, T., Chen, C.-Y., Neale, B. M., Sabuncu, M. R. & Smoller, J. W. Phenome-wide heritability analysis of the UK Biobank. PLoS Genet. 13, e1006711 (2017).

  16. Zhu, C. et al. Amplification is the primary mode of gene-by-sex interaction in complex human traits. Cell Genom. 3, 100297 (2023).

  17. Brown, B. C., Ye, C. J., Price, A. L. & Zaitlen, N. Transethnic genetic-correlation estimates from summary statistics. Am. J. Hum. Genet. 99, 76–88 (2016).

  18. Shi, H. et al. Population-specific causal disease effect sizes in functionally important regions impacted by selection. Nat. Commun. 12, 1098 (2021).

  19. Patel, R. A. et al. Genetic interactions drive heterogeneity in causal variant effect sizes for gene expression and complex traits. Am. J. Hum. Genet. 109, 1286–1297 (2022).

  20. Weine, E., Smith, S. P., Knowlton, R. K. & Harpak, A. Tradeoffs in modeling context dependency in complex trait genetics. Preprint at bioRxiv https://doi.org/10.1101/2023.06.21.545998 (2023).

  21. Wang, Y. et al. Theoretical and empirical quantification of the accuracy of polygenic scores in ancestry divergent populations. Nat. Commun. 11, 3865 (2020).

  22. Lambert, S. A., Abraham, G. & Inouye, M. Towards clinical utility of polygenic risk scores. Hum. Mol. Genet. 28, R133–R142 (2019).

  23. Ding, Y. et al. Polygenic scoring accuracy varies across the genetic ancestry continuum. Nature 618, 774–781 (2023).

  24. Johnson, R. et al. Leveraging genomic diversity for discovery in an electronic health record linked biobank: the UCLA ATLAS Community Health Initiative. Genome Med. 14, 104 (2022).

  25. Wiley, L. K. et al. Building a vertically integrated genomic learning health system: the biobank at the Colorado Center for Personalized Medicine. Am. J. Hum. Genet. 111, 11–23 (2024).

  26. Belbin, G. M. et al. Toward a fine-scale population health monitoring system. Cell 184, 2068–2083.e11 (2021).

  27. Abul-Husn, N. S. & Kenny, E. E. Personalized medicine and the power of electronic health records. Cell 177, 58–69 (2019).

  28. Bycroft, C. et al. The UK Biobank resource with deep phenotyping and genomic data. Nature 562, 203–209 (2018).

  29. The All of Us Research Program Genomics Investigators et al. Genomic data in the All of Us Research Program. Nature 627, 340–346 (2024).

  30. Wand, H. et al. Improving reporting standards for polygenic scores in risk prediction studies. Nature 591, 211–219 (2021).

  31. Wei, J. et al. Calibration of polygenic risk scores is required prior to clinical implementation: results of three common cancers in UKB. J. Med. Genet. 59, 243–247 (2022).

  32. van Houwelingen, H. C. Validation, calibration, revision and combination of prognostic survival models. Stat. Med. 19, 3401–3415 (2000).

  33. Van Calster, B. et al. Calibration: the Achilles heel of predictive analytics. BMC Med. 17, 230 (2019).

  34. Sun, J. et al. Translating polygenic risk scores for clinical use by estimating the confidence bounds of risk prediction. Nat. Commun. 12, 5276 (2021).

  35. Smyth, G. K. Generalized linear models with varying dispersion. J. R. Stat. Soc. 51, 47–60 (1989).

  36. Koenker, R. Quantile Regression (Cambridge Univ. Press, 2005).

  37. Rigby, R. A. & Stasinopoulos, D. M. Generalized additive models for location, scale and shape. J. R. Stat. Soc. Ser. C 54, 507–554 (2005).

  38. Romano, Y., Patterson, E. & Candès, E. J. Conformalized quantile regression. Advances in Neural Information Processing Systems 32 (2019).

  39. Gneiting, T. & Katzfuss, M. Probabilistic forecasting. Annu. Rev. Stat. Appl. 1, 125–151 (2014).

  40. Yang, J. et al. FTO genotype is associated with phenotypic variability of body mass index. Nature 490, 267–272 (2012).

  41. Young, A. I., Wauthier, F. L. & Donnelly, P. Identifying loci affecting trait variability and detecting interactions in genome-wide association studies. Nat. Genet. 50, 1608–1614 (2018).

  42. Miao, J. et al. A quantile integral linear model to quantify genetic effects on phenotypic variability. Proc. Natl Acad. Sci. USA 119, e2212959119 (2022).

  43. Schoeler, T. et al. Participation bias in the UK Biobank distorts genetic associations and downstream analyses. Nat. Hum. Behav. https://doi.org/10.1038/s41562-023-01579-9 (2023).

  44. Selzam, S. et al. Comparing within- and between-family polygenic score prediction. Am. J. Hum. Genet. 105, 351–363 (2019).

  45. Okbay, A. et al. Polygenic prediction of educational attainment within and between families from genome-wide association analyses in 3 million individuals. Nat. Genet. 54, 437–449 (2022).

  46. Yengo, L. et al. A saturated map of common genetic variants associated with human height. Nature 610, 704–712 (2022).

  47. Graham, S. E. et al. The power of genetic diversity in genome-wide association studies of lipids. Nature 600, 675–679 (2021).

  48. Lambert, S. A. et al. The polygenic score catalog as an open database for reproducibility and systematic evaluation. Nat. Genet. 53, 420–425 (2021).

  49. Durvasula, A. & Price, A. L. Distinct explanations underlie gene–environment interactions in the UK Biobank. Preprint at medRxiv https://doi.org/10.1101/2023.09.22.23295969 (2023).

  50. Mahajan, A. et al. Fine-mapping type 2 diabetes loci to single-variant resolution using high-density imputation and islet-specific epigenome maps. Nat. Genet. 50, 1505–1513 (2018).

  51. Patel, A. P. et al. A multi-ancestry polygenic risk score improves risk prediction for coronary artery disease. Nat. Med. 29, 1793–1803 (2023).

  52. Schumacher, F. R. et al. Association analyses of more than 140,000 men identify 63 new prostate cancer susceptibility loci. Nat. Genet. 50, 928–936 (2018).

  53. Zhang, H. et al. Genome-wide association study identifies 32 novel breast cancer susceptibility loci from overall and subtype-specific analyses. Nat. Genet. 52, 572–581 (2020).

  54. Martin, A. R. et al. Human demographic history impacts genetic risk prediction across diverse populations. Am. J. Hum. Genet. 107, 788–789 (2020).

  55. Kachuri, L. et al. Genetically adjusted PSA levels for prostate cancer screening. Nat. Med. 29, 1412–1423 (2023).

  56. Smyth, G. K. An efficient algorithm for REML in heteroscedastic regression. J. Comput. Graph. Stat. 11, 836–847 (2002).

  57. Giner, G. & Smyth, G. K. statmod: probability calculations for the inverse Gaussian distribution. The R Journal 8, 339–351 (2016).

  58. Yousefi, P. D. et al. DNA methylation-based predictors of health: applications and statistical considerations. Nat. Rev. Genet. 23, 369–383 (2022).

  59. The International HapMap 3 Consortium. Integrating common and rare genetic variation in diverse human populations. Nature 467, 52–58 (2010).

  60. Privé, F., Arbel, J. & Vilhjálmsson, B. J. LDpred2: better, faster, stronger. Bioinformatics 36, 5424–5431 (2020).

  61. Szczerbinski, L. et al. Algorithms for the identification of prevalent diabetes in the All of Us Research Program validated using polygenic scores—a new resource for diabetes precision medicine. Preprint at bioRxiv https://doi.org/10.1101/2023.09.05.23295061 (2023).

  62. Hou, K. KangchengHou/calpred. Zenodo https://doi.org/10.5281/zenodo.10962189 (2024).

  63. Hou, K. KangchengHou/calpred-manuscript. Zenodo https://doi.org/10.5281/zenodo.11094535 (2024).

Acknowledgements

We thank M. Przeworski, H. Zhang, T. Chen, Y. Wang, A. Martin and J. Hirbo for helpful suggestions. This research was funded in part by the National Institutes of Health under awards R01HG009120 (B.P.), R01MH115676 (B.P.), U01HG011715 (B.P.) and R35GM151108 (A.H.). This research was conducted using the UK Biobank Resource under application 33127. We thank the participants of UK Biobank for making this work possible. The All of Us Research Program is supported by the National Institutes of Health, Office of the Director: Regional Medical Centers: 1 OT2 OD026549; 1 OT2 OD026554; 1 OT2 OD026557; 1 OT2 OD026556; 1 OT2 OD026550; 1 OT2 OD 026552; 1 OT2 OD026553; 1 OT2 OD026548; 1 OT2 OD026551; 1 OT2 OD026555; IAA #: AOD 16037; Federally Qualified Health Centers: HHSN 263201600085U; Data and Research Center: 5 U2C OD023196; Biobank: 1 U24 OD023121; The Participant Center: U24 OD023176; Participant Technology Systems Center: 1 U24 OD023163; Communications and Engagement: 3 OT2 OD023205; 3 OT2 OD023206; and Community Partners: 1 OT2 OD025277; 3 OT2 OD025315; 1 OT2 OD025337; and 1 OT2 OD025276. In addition, the All of Us Research Program would not be possible without the partnership of its participants.

Author information

Contributions

K.H. and B.P. conceived and designed the experiments. K.H., Z.X. and Y.D. performed the experiments and statistical analyses with assistance from R.M., Z.S., K.B., A.H. and B.P. K.H. and B.P. wrote the paper with feedback from all authors.

Corresponding authors

Correspondence to Kangcheng Hou or Bogdan Pasaniuc.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Genetics thanks Iftikhar Kullo and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Pearson’s correlation between context variables in UK Biobank and All of Us datasets.

Pearson correlations were calculated separately within individuals annotated as “white British” in UK Biobank and within individuals with self-identified race/ethnicity (SIRE) “white” in All of Us (a,c) and across all individuals (b,d).

Extended Data Fig. 2 Distribution of context variables in UK Biobank.

We show the distribution of each context variable separately for “white British” individuals and for the remaining individuals in UK Biobank.

Extended Data Fig. 3 Distribution of context variables in All of Us.

We show the distribution of each context variable separately for individuals with SIRE “white” and for the remaining individuals in All of Us.

Extended Data Fig. 4 \(R^2\) between covariate-adjusted height and PGS across education and income levels in All of Us.

\(R^2\) values were calculated across all individuals, and within individuals of European and African genetic ancestry (estimated admixture proportion of the corresponding ancestry > 90%), across education levels (a) and income levels (b). Error bars denote the mean ± standard deviation of \(R^2\) across 30 bootstrap samples.
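
The bootstrap error bars described in this caption can be illustrated with a minimal sketch. It assumes that R² is the squared Pearson correlation between the PGS and the covariate-adjusted trait and that individuals are resampled with replacement. The function names and the toy data are hypothetical and are not taken from the CalPred software.

```python
import random
import statistics


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


def bootstrap_r2(pgs, trait, n_boot=30, seed=0):
    """Return (mean, s.d.) of R^2 over n_boot resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(pgs)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(r_squared([pgs[i] for i in idx],
                                   [trait[i] for i in idx]))
    return statistics.fmean(estimates), statistics.stdev(estimates)


# Toy data (hypothetical): trait = PGS + unit-variance noise, so the
# population R^2 is 0.5.
toy_rng = random.Random(1)
pgs = [toy_rng.gauss(0, 1) for _ in range(2000)]
trait = [g + toy_rng.gauss(0, 1) for g in pgs]
mean_r2, sd_r2 = bootstrap_r2(pgs, trait)
print(f"R^2 = {mean_r2:.3f} +/- {sd_r2:.3f}")
```

Running the same function within each education or income stratum would give one mean and one error bar per stratum.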

Extended Data Fig. 5 Quantitative trait simulations with gene-context interactions.

We simulated three scenarios of gene-context interactions for quantitative traits and evaluated the calibration of prediction intervals. The scenarios were: (a) imperfect genetic correlation: Var[G] = 0.5, Var[E] = 0.5 in both contexts; genetic correlation = 0.5 across contexts. (b) Varying heritability: Var[G] = 0.5, Var[E] = 0.5 in context 1 and Var[G] = 0.1, Var[E] = 0.9 in context 2; genetic correlation = 1. (c) Jointly amplified G and E: Var[G] = 0.25, Var[E] = 0.75 in context 1 and Var[G] = 0.25*1.5, Var[E] = 0.75*1.5 in context 2; genetic correlation = 1. In all three scenarios, PGS weights derived in the first context were applied to individuals in both contexts. We show results for individuals in context 2 using four modeling approaches. “PGS”: the PGS and prediction variance calculated with individuals from context 1 were applied to individuals in context 2; “PGS; VbyC”: fit y ~ N(PGS, VbyC); “PGS+PGSxC”: fit y ~ N(PGS + PGSxC, prediction variance derived in context 1); “PGS+PGSxC; VbyC”: fit y ~ N(PGS + PGSxC, VbyC). Here PGSxC denotes a PGS-by-context interaction term and VbyC a prediction variance that varies by context. The blue dashed line denotes the best fit to the data; the red dashed line denotes model predictions; the red error bar denotes the prediction interval for an individual at the top 5% quantile of PGS. Prediction interval coverage was evaluated within the top PGS decile. We note that these three scenarios do not cover all possible modes of gene-context interaction: they assume that gene-context interactions act similarly across all causal variants, and they model gene-context interactions using PGSxC and VbyC.
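
Scenario (b) can be reproduced in miniature with the sketch below. It makes simplifying assumptions that are not in the caption: the PGS is taken to equal the true genetic value in context 1, and refitting the slope and residual variance within context 2 stands in for the “PGS+PGSxC; VbyC” model when there are only two contexts. All function names are hypothetical.

```python
import random
import statistics

Z90 = statistics.NormalDist().inv_cdf(0.95)  # ~1.645, two-sided 90% interval


def simulate(n, var_g, var_e, rng, pgs_var=0.5):
    """Draw (pgs, y) pairs. The PGS has variance pgs_var in every context and
    is rescaled so that Var[G] = var_g (genetic correlation of 1)."""
    scale = (var_g / pgs_var) ** 0.5
    data = []
    for _ in range(n):
        pgs = rng.gauss(0, pgs_var ** 0.5)
        data.append((pgs, scale * pgs + rng.gauss(0, var_e ** 0.5)))
    return data


def fit_slope_sd(data):
    """Least-squares slope through the origin and the residual s.d."""
    slope = sum(p * y for p, y in data) / sum(p * p for p, _ in data)
    resid_sd = (sum((y - slope * p) ** 2 for p, y in data) / len(data)) ** 0.5
    return slope, resid_sd


def coverage_top_decile(data, slope, sd):
    """Share of top-PGS-decile individuals with y inside slope*pgs +/- Z90*sd."""
    k = len(data) // 10
    top = sorted(data)[-k:]  # tuples sort by PGS first
    return sum(abs(y - slope * p) <= Z90 * sd for p, y in top) / len(top)


rng = random.Random(0)
ctx1 = simulate(20000, 0.5, 0.5, rng)       # context 1: Var[G]=0.5, Var[E]=0.5
ctx2_cal = simulate(20000, 0.1, 0.9, rng)   # context 2 calibration data
ctx2_test = simulate(20000, 0.1, 0.9, rng)  # context 2 test data

slope1, sd1 = fit_slope_sd(ctx1)      # "PGS": mean and variance from context 1
slope2, sd2 = fit_slope_sd(ctx2_cal)  # refit within context 2

cov_naive = coverage_top_decile(ctx2_test, slope1, sd1)
cov_calibrated = coverage_top_decile(ctx2_test, slope2, sd2)
print(f"context-1 model: {cov_naive:.2f}, context-2 model: {cov_calibrated:.2f}")
```

In this toy setting the interval carried over from context 1 covers far fewer than 90% of top-decile individuals in context 2, because both the slope and the residual variance differ, whereas the refitted interval is close to the nominal level.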

Extended Data Fig. 6 Disease trait simulations with gene-context interactions.

We simulated three scenarios of gene-context interactions for disease traits using a liability threshold model and evaluated the calibration of predicted probabilities. The scenarios were: (a) imperfect genetic correlation: Var[G] = 0.5, Var[E] = 0.5, disease prevalence = 10% in both contexts; genetic correlation = 0.5 across the two contexts. (b) Varying heritability: Var[G] = 0.5, Var[E] = 0.5 in context 1 and Var[G] = 0.1, Var[E] = 0.9 in context 2, disease prevalence = 10% in both contexts; genetic correlation = 1 across the two contexts. (c) Varying disease prevalence: Var[G] = 0.5, Var[E] = 0.5 in both contexts; disease prevalence = 10%/20% in context 1/2. In all three scenarios, PGS weights derived in the first context were applied to individuals in both contexts. We fit four models using different sets of predictors in logistic regression across individuals in the two contexts (probit regression led to similar results): “PGS”: fit y ~ PGS; “PGS + C”: fit y ~ PGS + Context; “PGS+PGSxC”: fit y ~ PGS + PGSxContext; “PGS+PGSxC+C”: fit y ~ PGS + PGSxContext + Context. Error bars denote observed disease proportions and their 95% confidence intervals for each predicted probability bin (n = 2,000 individuals for each error bar).
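
The liability threshold model behind scenario (c) can be sketched as follows. Instead of fitting logistic regression as in the caption, this example uses the exact model probability P(disease | G) so that it stays within the Python standard library. It also assumes, for illustration only, that the PGS equals the true genetic value G. The function names are hypothetical.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def threshold(prevalence):
    """Liability threshold t with P(liability > t) = prevalence, assuming the
    total liability has unit variance."""
    return STD_NORMAL.inv_cdf(1.0 - prevalence)


def risk_given_g(g, prevalence, var_e=0.5):
    """P(disease | genetic value g) under the liability threshold model."""
    return 1.0 - STD_NORMAL.cdf((threshold(prevalence) - g) / var_e ** 0.5)


def simulate_cases(n, prevalence, rng, var_g=0.5, var_e=0.5):
    """Return (g, is_case) pairs; liability = g + e with Var = var_g + var_e = 1."""
    t = threshold(prevalence)
    out = []
    for _ in range(n):
        g = rng.gauss(0, var_g ** 0.5)
        out.append((g, g + rng.gauss(0, var_e ** 0.5) > t))
    return out


rng = random.Random(0)
ctx2 = simulate_cases(50000, 0.20, rng)  # context 2: 20% prevalence

observed = sum(case for _, case in ctx2) / len(ctx2)
# Mean predicted risk when the context-1 prevalence (10%) is wrongly assumed.
pred_ignoring_context = sum(risk_given_g(g, 0.10) for g, _ in ctx2) / len(ctx2)
# Mean predicted risk when the context-2 prevalence (20%) is used.
pred_with_context = sum(risk_given_g(g, 0.20) for g, _ in ctx2) / len(ctx2)
print(f"observed {observed:.3f}, ignoring context {pred_ignoring_context:.3f}, "
      f"with context {pred_with_context:.3f}")
```

A prediction that ignores the context-specific prevalence underestimates risk in context 2, which is the kind of miscalibration that adding a Context term to the regression is meant to remove.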

Extended Data Fig. 7 Simulations with varying number of individuals, unmeasured contexts, excessive dummy contexts.

We performed simulations to investigate factors that influence the coverage of prediction intervals. We compared coverage in these alternative scenarios with a default scenario (marked ‘Default’ in the figure) in which calibration used age, PC1 and sex with 5,000 individuals as calibration data (same as Fig. 5). (a) Coverage of prediction intervals with varying numbers of individuals used in calibration (Ncal = 100, 500, 2,500, 5,000). We evaluated coverage both overall and within each group (groups are denoted by colors) using 5,000 testing individuals. Different box plots with the same color denote different strata of each context (quintiles for age and PC1; male/female for sex). Coverage showed more downward bias and higher variance when fewer individuals were used in calibration. (b) Coverage of prediction intervals when certain context variables were not measured. To simulate an unmeasured covariate, we performed calibration using PC1 and sex only (excluding age). In this scenario, prediction intervals were miscalibrated along the unmeasured context of age. (c) Coverage of prediction intervals when excessive dummy contexts were included in calibration. We simulated dummy variables with no effect on phenotype variance (number of dummy covariates Ndummy = 5, 25, 50; drawn from N(0,1)) and included them in calibration to investigate the effect of excessive covariates on coverage. Coverage showed more downward bias and higher variance when more dummy variables were used in calibration. For (a-c), each box plot contains results across 100 simulations (each box contains n = 100 points). For box plots, the center corresponds to the median; the box represents the first and third quartiles of the points; the whiskers represent the minimum and maximum points located within 1.5\(\times\) the interquartile range from the first and third quartiles, respectively.
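
The box plot convention described at the end of this caption can be made concrete with a short sketch. The quartile interpolation rule (the "inclusive" method of Python's statistics module) is an assumption, since plotting libraries differ on this detail, and the coverage values are hypothetical.

```python
import statistics


def box_plot_stats(values):
    """Median, quartiles and whisker ends under the 1.5 x IQR convention."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    # Whiskers end at the most extreme points that lie within 1.5 x IQR
    # of the first and third quartiles.
    inside = [v for v in values if q1 - 1.5 * iqr <= v <= q3 + 1.5 * iqr]
    return {"median": median, "q1": q1, "q3": q3,
            "whisker_low": min(inside), "whisker_high": max(inside)}


# Hypothetical coverage values from repeated simulations, with one outlier.
coverages = [0.88, 0.89, 0.90, 0.90, 0.91, 0.91, 0.92, 0.70]
summary = box_plot_stats(coverages)
print(summary)
```

Here the value 0.70 lies more than 1.5 times the interquartile range below the first quartile, so the lower whisker stops at 0.88 and 0.70 would be drawn as an individual outlier point.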

Extended Data Fig. 8 Standardized effects of PGS, contexts, and PGSxC interaction terms in quantitative trait prediction in All of Us.

We display the standardized effects of all predictors; each predictor was standardized to mean 0 and variance 1 in the regression analysis. We note that the left panel, containing the effects of PGS and contexts, has a different color scale from the right panel, containing the PGSxC interaction terms.

Extended Data Fig. 9 Contribution of PGS to inter-individual variation of prediction SDs in All of Us.

We compared inter-individual variation of prediction SDs in two models: (1) a baseline model in which the prediction mean is a function of all contexts without PGS; (2) the baseline model with PGS added to the prediction mean. Prediction SD is modeled as a function of all contexts in both models. By comparing prediction SDs in these two models, we found that including PGS substantially impacted inter-individual variation in prediction SD.

Extended Data Fig. 10 Standardized effects of PGS, contexts, and PGSxC interaction terms in disease trait prediction in All of Us.

We show standardized effects; all predictor variables were standardized to mean 0 and variance 1 in the regression analysis within all individuals. The left panel, containing PGS and contexts, has a different color scale from the right panel, containing the PGSxC interaction terms.

Supplementary information

Supplementary Information

Supplementary Figs. 1–19, Note and table captions.

Reporting Summary

Peer Review File

Supplementary Tables

Supplementary Tables 1–4.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Hou, K., Xu, Z., Ding, Y. et al. Calibrated prediction intervals for polygenic scores across diverse contexts. Nat Genet (2024). https://doi.org/10.1038/s41588-024-01792-w

  • DOI: https://doi.org/10.1038/s41588-024-01792-w
