
  • Review Article

Harnessing deep learning for population genetic inference

Abstract

In population genetics, the emergence of large-scale genomic data for various species and populations has provided new opportunities to understand the evolutionary forces that drive genetic diversity using statistical inference. However, the era of population genomics presents new challenges in analysing massive numbers of genomes and variants. Deep learning has demonstrated state-of-the-art performance for numerous applications involving large-scale data. Recently, deep learning approaches have gained popularity in population genetics; facilitated by the advent of massive genomic data sets, powerful computational hardware and complex deep learning architectures, they have been used to identify population structure, infer demographic history and investigate natural selection. Here, we introduce common deep learning architectures and provide comprehensive guidelines for implementing deep learning models for population genetic inference. We also discuss current challenges and future directions for applying deep learning in population genetics, focusing on efficiency, robustness and interpretability.


Fig. 1: Workflow for traditional and machine learning approaches in population genetic inference.
Fig. 2: Common architectures and layers for artificial neural networks.
Fig. 3: Deep generative models.
Fig. 4: Novel architectures and models.
Fig. 5: The implementation workflow.

References

  1. Wakeley, J. The limits of theoretical population genetics. Genetics 169, 1–7 (2005).

  2. Lewontin, R. C. Population genetics. Annu. Rev. Genet. 1, 37–70 (1967).

  3. Fu, Y.-X. Variances and covariances of linear summary statistics of segregating sites. Theor. Popul. Biol. 145, 95–108 (2022).

  4. Bradburd, G. S. & Ralph, P. L. Spatial population genetics: it’s about time. Annu. Rev. Ecol. Evol. Syst. 50, 427–449 (2019).

  5. Ewens, W. J. Mathematical Population Genetics I: Theoretical Introduction 2nd edn (Springer, 2004). This classic textbook covers theoretical population genetics ranging from the diffusion theory to the coalescent theory.

  6. Crow, J. F. & Kimura, M. An Introduction to Population Genetics Theory (Blackburn Press, 2009). This classic textbook introduces the fundamentals of theoretical population genetics.

  7. Pool, J. E., Hellmann, I., Jensen, J. D. & Nielsen, R. Population genetic inference from genomic sequence variation. Genome Res. 20, 291–300 (2010).

  8. Charlesworth, B. & Charlesworth, D. Population genetics from 1966 to 2016. Heredity 118, 2–9 (2017).

  9. Johri, P. et al. Recommendations for improving statistical inference in population genomics. PLoS Biol. 20, e3001669 (2022).

  10. The 1000 Genomes Project Consortium. A global reference for human genetic variation. Nature 526, 68–74 (2015).

  11. Mallick, S. et al. The Allen Ancient DNA Resource (AADR): a curated compendium of ancient human genomes. Preprint at bioRxiv https://doi.org/10.1101/2023.04.06.535797 (2023).

  12. The 1001 Genomes Consortium. 1,135 genomes reveal the global pattern of polymorphism in Arabidopsis thaliana. Cell 166, 481–491 (2016).

  13. Sudlow, C. et al. UK Biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS Med. 12, e1001779 (2015).

  14. Walters, R. G. et al. Genotyping and population characteristics of the China Kadoorie Biobank. Cell Genom. 3, 100361 (2023).

  15. Schrider, D. R. & Kern, A. D. Supervised machine learning for population genetics: a new paradigm. Trends Genet. 34, 301–312 (2018). This review covers the applications of supervised learning in population genetic inference.

  16. LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015).

  17. Gao, H. et al. The landscape of tolerated genetic variation in humans and primates. Science 380, eabn8153 (2023).

  18. van Hilten, A. et al. GenNet framework: interpretable deep learning for predicting phenotypes from genetic data. Commun. Biol. 4, 1094 (2021).

  19. Vaswani, A. et al. Attention is all you need. In Proc. Advances in Neural Information Processing Systems 30, NIPS 2017 (eds Guyon, I. et al.) 5999–6009 (NIPS, 2017). This study proposes the vanilla transformer architecture, which has become the basis of novel architectures that achieve state-of-the-art performance in different machine learning tasks.

  20. Sohl-Dickstein, J., Weiss, E., Maheswaranathan, N. & Ganguli, S. Deep unsupervised learning using nonequilibrium thermodynamics. In Proc. 32nd International Conference on Machine Learning Vol. 37 (eds Bach, F. & Blei, D.) 2256–2265 (PMLR, 2015).

  21. Nei, M. in Molecular Evolutionary Genetics 327–403 (Columbia Univ. Press, 1987).

  22. Hamilton, M. B. in Population Genetics 53–67 (Wiley-Blackwell, 2009).

  23. Kimura, M. Diffusion models in population genetics. J. Appl. Probab. 1, 177–232 (1964).

  24. Kingman, J. F. C. On the genealogy of large populations. J. Appl. Probab. 19, 27–43 (1982).

  25. Rosenberg, N. A. & Nordborg, M. Genealogical trees, coalescent theory and the analysis of genetic polymorphisms. Nat. Rev. Genet. 3, 380–390 (2002).

  26. Fu, Y.-X. & Li, W.-H. Maximum likelihood estimation of population parameters. Genetics 134, 1261–1270 (1993).

  27. Griffiths, R. C. & Tavaré, S. Monte Carlo inference methods in population genetics. Math. Comput. Model. 23, 141–158 (1996).

  28. Tavaré, S., Balding, D. J., Griffiths, R. C. & Donnelly, P. Inferring coalescence times from DNA sequence data. Genetics 145, 505–518 (1997).

  29. Marjoram, P. & Tavaré, S. Modern computational approaches for analysing molecular genetic variation data. Nat. Rev. Genet. 7, 759–770 (2006).

  30. Williamson, S. H. et al. Simultaneous inference of selection and population growth from patterns of variation in the human genome. Proc. Natl Acad. Sci. USA 102, 7882–7887 (2005).

  31. Wang, M. et al. Detecting recent positive selection with high accuracy and reliability by conditional coalescent tree. Mol. Biol. Evol. 31, 3068–3080 (2014).

  32. Szpiech, Z. A. & Hernandez, R. D. selscan: an efficient multithreaded program to perform EHH-based scans for positive selection. Mol. Biol. Evol. 31, 2824–2827 (2014).

  33. Maclean, C. A., Hong, N. P. C. & Prendergast, J. G. D. hapbin: an efficient program for performing haplotype-based scans for positive selection in large genomic datasets. Mol. Biol. Evol. 32, 3027–3029 (2015).

  34. Huang, X., Kruisz, P. & Kuhlwilm, M. sstar: a Python package for detecting archaic introgression from population genetic data with S*. Mol. Biol. Evol. 39, msac212 (2022).

  35. Borowiec, M. L. et al. Deep learning as a tool for ecology and evolution. Methods Ecol. Evol. 13, 1640–1660 (2022).

  36. Korfmann, K., Gaggiotti, O. E. & Fumagalli, M. Deep learning in population genetics. Genome Biol. Evol. 15, evad008 (2023).

  37. Alpaydin, E. in Introduction to Machine Learning 3rd edn (eds Dietterich, T. et al.) 1–20 (MIT Press, 2014).

  38. Bengio, Y., LeCun, Y. & Hinton, G. Deep learning for AI. Commun. ACM 64, 58–65 (2021).

  39. Sapoval, N. et al. Current progress and open challenges for applying deep learning across the biosciences. Nat. Commun. 13, 1728 (2022).

  40. Bishop, C. M. Model-based machine learning. Philos. Trans. R. Soc. A 371, 20120222 (2013).

  41. Lee, C., Abdool, A. & Huang, C. PCA-based population structure inference with generic clustering algorithms. BMC Bioinform. 10, S73 (2009).

  42. Li, H. & Durbin, R. Inference of human population history from individual whole-genome sequences. Nature 475, 493–496 (2011).

  43. Skov, L. et al. Detecting archaic introgression using an unadmixed outgroup. PLoS Genet. 14, e1007641 (2018).

  44. Chen, H., Hey, J. & Slatkin, M. A hidden Markov model for investigating recent positive selection through haplotype structure. Theor. Popul. Biol. 99, 18–30 (2015).

  45. Lin, K., Li, H., Schlötterer, C. & Futschik, A. Distinguishing positive selection from neutral evolution: boosting the performance of summary statistics. Genetics 187, 229–244 (2011).

  46. Schrider, D. R., Ayroles, J., Matute, D. R. & Kern, A. D. Supervised machine learning reveals introgressed loci in the genomes of Drosophila simulans and D. sechellia. PLoS Genet. 14, e1007341 (2018).

  47. Durvasula, A. & Sankararaman, S. A statistical model for reference-free inference of archaic local ancestry. PLoS Genet. 15, e1008175 (2019).

  48. Goodfellow, I., Bengio, Y. & Courville, A. Deep Learning (MIT Press, 2016). This classic textbook introduces the fundamentals of deep learning.

  49. Eraslan, G., Avsec, Z., Gagneur, J. & Theis, F. J. Deep learning: new computational modelling techniques for genomics. Nat. Rev. Genet. 20, 389–403 (2019).

  50. Villanea, F. A. & Schraiber, J. G. Multiple episodes of interbreeding between Neanderthals and modern humans. Nat. Ecol. Evol. 3, 39–44 (2019).

  51. Unadkat, S. B., Ciocoiu, M. M. & Medsker, L. R. in Recurrent Neural Networks: Design and Applications (eds Medsker, L. R. & Jain, L. C.) 1–12 (CRC, 1999).

  52. Géron, A. Neural Networks and Deep Learning (O’Reilly Media Inc., 2018).

  53. Sheehan, S. & Song, Y. S. Deep learning for population genetic inference. PLoS Comput. Biol. 12, e1004845 (2016).

  54. Mondal, M., Bertranpetit, J. & Lao, O. Approximate Bayesian computation with deep learning supports a third archaic introgression in Asia and Oceania. Nat. Commun. 10, 246 (2019).

  55. Sanchez, T., Curry, J., Charpiat, G. & Jay, F. Deep learning for population size history inference: design, comparison and combination with approximate Bayesian computation. Mol. Ecol. Resour. 21, 2645–2660 (2021).

  56. Tran, L. N., Sun, C. K., Struck, T. J., Sajan, M. & Gutenkunst, R. N. Computationally efficient demographic history inference from allele frequencies with supervised machine learning. Preprint at bioRxiv https://doi.org/10.1101/2023.05.24.542158 (2023).

  57. Romero, A. et al. Diet networks: thin parameters for fat genomics. In Proc. 5th International Conference on Learning Representations, ICLR 2017 (OpenReview.net, 2017).

  58. Isildak, U., Stella, A. & Fumagalli, M. Distinguishing between recent balancing selection and incomplete sweep using deep neural networks. Mol. Ecol. Resour. 21, 2706–2718 (2021).

  59. Qin, X., Chiang, C. W. K. & Gaggiotti, O. E. Deciphering signatures of natural selection via deep learning. Brief. Bioinform. 23, bbac354 (2022).

  60. Burger, K. E., Pfaffelhuber, P. & Baumdicker, F. Neural networks for self-adjusting mutation rate estimation when the recombination rate is unknown. PLoS Comput. Biol. 18, e1010407 (2022).

  61. Fang, Y., Deng, S. & Li, C. A generalizable deep learning framework for inferring fine-scale germline mutation rate maps. Nat. Mach. Intell. 4, 1209–1223 (2022).

  62. Battey, C. J., Ralph, P. L. & Kern, A. D. Predicting geographic location from genetic variation with deep neural networks. eLife 9, e54507 (2020).

  63. Flagel, L., Brandvain, Y. & Schrider, D. R. The unreasonable effectiveness of convolutional neural networks in population genetic inference. Mol. Biol. Evol. 36, 220–238 (2019). This study experiments with CNNs for various tasks in population genetic inference.

  64. Wang, Z. et al. Automatic inference of demographic parameters using generative adversarial network. Mol. Ecol. Resour. 21, 2689–2705 (2021). This study develops a generative adversarial framework aimed at inferring demographic parameters from data in an unsupervised manner.

  65. Silver, D. et al. Mastering the game of Go with deep neural networks and tree search. Nature 529, 484–489 (2016).

  66. Montserrat, D. M., Bustamante, C. & Ioannidis, A. LAI-Net: local-ancestry inference with neural networks. In Proc. ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing 1314–1318 (IEEE, 2020).

  67. Sabat, B. O., Montserrat, D. M., Giró-i-Nieto, X. & Ioannidis, A. G. SALAI-Net: species-agnostic local ancestry inference network. Bioinformatics 38, ii27–ii33 (2022).

  68. Kern, A. D. & Schrider, D. R. diploS/HIC: an updated approach to classifying selective sweeps. G3 8, 1959–1970 (2018).

  69. Torada, L. et al. ImaGene: a convolutional neural network to quantify natural selection from population genomic data. BMC Bioinform. 20, 337 (2019).

  70. Deelder, W. et al. Using deep learning to identify recent positive selection in malaria parasite sequence data. Malar. J. 20, 270 (2021).

  71. Xue, A. T., Schrider, D. R. & Kern, A. D., Ag1000g Consortium. Discovery of ongoing selective sweeps within Anopheles mosquito populations using deep learning. Mol. Biol. Evol. 38, 1168–1183 (2021).

  72. Caldas, I. V., Clark, A. G. & Messer, P. W. Inference of selective sweep parameters through supervised learning. Preprint at bioRxiv https://doi.org/10.1101/2022.07.19.500702 (2022).

  73. Hamid, I., Korunes, K. L., Schrider, D. R. & Goldberg, A. Localizing post-admixture adaptive variants with object detection on ancestry-painted chromosomes. Mol. Biol. Evol. 40, msad074 (2023).

  74. Whitehouse, L. S. & Schrider, D. R. Timesweeper: accurately identifying selective sweeps using population genomic time series. Genetics 224, iyad084 (2023).

  75. Cecil, R. M. & Sugden, L. A. On convolutional neural networks for selection inference: revealing the lurking role of preprocessing, and the surprising effectiveness of summary statistics. Preprint at bioRxiv https://doi.org/10.1101/2023.02.26.530156 (2023).

  76. Arnab, S. P., Amin, M. R. & DeGiorgio, M. Uncovering footprints of natural selection through time-frequency analysis of genomic summary statistics. Mol. Biol. Evol. 40, msad157 (2023).

  77. Lauterbur, M. E., Munch, K. & Enard, D. Versatile detection of diverse selective sweeps with Flex-sweep. Mol. Biol. Evol. 40, msad139 (2023).

  78. Blischak, P. D., Barker, M. S. & Gutenkunst, R. N. Chromosome-scale inference of hybrid speciation and admixture with convolutional neural networks. Mol. Ecol. Resour. 21, 2676–2688 (2021).

  79. Gower, G., Picazo, P. I., Fumagalli, M. & Racimo, F. Detecting adaptive introgression in human evolution using convolutional neural networks. eLife 10, e64669 (2021).

  80. Ray, D. D., Flagel, L. & Schrider, D. R. introUNET: identifying introgressed alleles via semantic segmentation. Preprint at bioRxiv https://doi.org/10.1101/2023.02.07.527435 (2023).

  81. Zhang, Y. et al. Inferring historical introgression with deep learning. Syst. Biol. https://doi.org/10.1093/sysbio/syad033 (2023).

  82. Smith, C. C. R., Tittes, S., Ralph, P. L. & Kern, A. D. Dispersal inference from population genetic variation using a convolutional neural network. Genetics 224, iyad068 (2023).

  83. Battey, C. J., Coffing, G. C. & Kern, A. D. Visualizing population structure with variational autoencoders. G3 11, jkaa036 (2021).

  84. Booker, W. W., Ray, D. D. & Schrider, D. R. This population doesn’t exist: learning the distribution of evolutionary histories with generative adversarial networks. Genetics 224, iyad063 (2023).

  85. Meisner, J. & Albrechtsen, A. Haplotype and population structure inference using neural networks in whole-genome sequencing data. Genome Res. 32, 1542–1552 (2022). This study develops a variational autoencoder scalable on the UK Biobank data set for estimating ancestry proportions across the genome without training from simulated data.

  86. Yelmen, B. et al. Deep convolutional and conditional neural networks for large-scale genomic data generation. Preprint at bioRxiv https://doi.org/10.1101/2023.03.07.530442 (2023).

  87. Chan, J. et al. A likelihood-free inference framework for population genetic data using exchangeable neural networks. In Proc. Advances in Neural Information Processing Systems 31, NeurIPS 2018 (eds Bengio, S. et al.) 8594–8605 (NeurIPS, 2018).

  88. Graves, A. Supervised Sequence Labelling with Recurrent Neural Networks 1st edn (Springer, 2012).

  89. Adrion, J. R., Galloway, J. G. & Kern, A. D. Predicting the landscape of recombination using deep learning. Mol. Biol. Evol. 37, 1790–1808 (2020).

  90. Hejase, H. A., Mo, Z., Campagna, L. & Siepel, A. A deep-learning approach for inference of selective sweeps from the ancestral recombination graph. Mol. Biol. Evol. 39, msab332 (2022).

  91. Sanchez-Lengeling, B., Reif, E., Pearce, A. & Wiltschko, A. B. A gentle introduction to graph neural networks. Distill https://doi.org/10.23915/distill.00033 (2021).

  92. Daigavane, A., Ravindran, B. & Aggarwal, G. Understanding convolutions on graphs. Distill https://doi.org/10.23915/distill.00032 (2021).

  93. Veličković, P. et al. Graph attention networks. In Proc. 6th International Conference on Learning Representations, ICLR 2018 (OpenReview.net, 2018).

  94. Griffiths, R. C. & Marjoram, P. Ancestral inference from samples of DNA sequences with recombination. J. Comput. Biol. 3, 479–502 (1996).

  95. Paradis, E. Analysis of haplotype networks: the randomized minimum spanning tree method. Methods Ecol. Evol. 9, 1308–1317 (2018).

  96. Korfmann, K., Sellinger, T., Freund, F., Fumagalli, M. & Tellier, A. Simultaneous inference of past demography and selection from the ancestral recombination graph under the beta coalescent. Preprint at bioRxiv https://doi.org/10.1101/2022.09.28.508873 (2022).

  97. Hinton, G. et al. Deep neural networks for acoustic modeling in speech recognition: the shared views of four research groups. IEEE Signal Process. Mag. 29, 82–97 (2012).

  98. Bond-Taylor, S., Leach, A., Long, Y. & Willcocks, C. G. Deep generative modelling: a comparative review of VAEs, GANs, normalizing flows, energy-based and autoregressive models. IEEE Trans. Pattern Anal. Mach. Intell. 44, 7327–7347 (2022).

  99. Yelmen, B. et al. Creating artificial human genomes using generative neural networks. PLoS Genet. 17, e1009303 (2021). This study utilizes restricted Boltzmann machines and generative adversarial networks for synthesizing realistic human genomes.

  100. Goodfellow, I. J. et al. Generative adversarial nets. In Proc. Advances in Neural Information Processing Systems 27, NIPS 2014 (eds Ghahramani, Z. et al.) 2672–2680 (NIPS, 2014).

  101. Saxena, D. & Cao, J. Generative adversarial networks (GANs): challenges, solutions, and future directions. ACM Comput. Surv. 54, 63 (2021).

  102. Mantes, A. D., Montserrat, D. M., Bustamante, C. D., Giró-i-Nieto, X. & Ioannidis, A. G. Neural ADMIXTURE: rapid population clustering with autoencoders. Nat. Comput. Sci. 3, 621–629 (2023).

  103. Alexander, D. H., Novembre, J. & Lange, K. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 19, 1655–1664 (2009).

  104. Lawson, D. J., Hellenthal, G., Myers, S. & Falush, D. Inference of population structure using dense haplotype data. PLoS Genet. 8, e1002453 (2012).

  105. Ausmees, K. & Nettelblad, C. A deep learning framework for characterization of genotype data. G3 12, jkac020 (2022).

  106. Riley, R., Mathieson, I. & Mathieson, S. Interpreting generative adversarial networks to infer natural selection from genetic data. Preprint at bioRxiv https://doi.org/10.1101/2023.03.07.531546 (2023).

  107. Gower, G., Picazo, P. I., Lindgren, F. & Racimo, F. Inference of population genetics parameters using discriminator neural networks: an adversarial Monte Carlo approach. Preprint at bioRxiv https://doi.org/10.1101/2023.04.27.538386 (2023).

  108. Montserrat, D. M., Bustamante, C. & Ioannidis, A. Class-conditional VAE-GAN for local-ancestry simulation. In Proc. 14th Machine Learning in Computational Biology meeting (MLCB, 2019).

  109. Borji, A. Pros and cons of GAN evaluation measures. Comput. Vis. Image Underst. 179, 41–65 (2019).

  110. Phuong, M. & Hutter, M. Formal algorithms for transformers. Preprint at arXiv https://doi.org/10.48550/arXiv.2207.09238 (2022).

  111. Katharopoulos, A., Vyas, A., Pappas, N. & Fleuret, F. Transformers are RNNs: fast autoregressive transformers with linear attention. In Proc. 37th International Conference on Machine Learning Vol. 119 (eds Daumé, H. III & Singh, A.) 5156–5165 (PMLR, 2020).

  112. Cordonnier, J., Loukas, A. & Jaggi, M. On the relationship between self-attention and convolutional layers. In Proc. 8th International Conference on Learning Representations, ICLR 2020 (OpenReview.net, 2020).

  113. Lakew, S. M., Cettolo, M. & Federico, M. A comparison of transformer and recurrent neural networks on multilingual neural machine translation. In Proc. 27th International Conference on Computational Linguistics (eds Bender, E. et al.) 641–652 (ACL, 2018).

  114. Ramachandran, P. et al. Stand-alone self-attention in vision models. In Proc. Advances in Neural Information Processing Systems 32, NeurIPS 2019 (eds Wallach, H. et al.) 68–80 (NeurIPS, 2019).

  115. Liu, Y. X. et al. Learning virus genotype-fitness landscape in embedding space. Preprint at bioRxiv https://doi.org/10.1101/2023.02.09.527693 (2023).

  116. Devlin, J., Chang, M., Lee, K. & Toutanova, K. BERT: pre-training of deep bidirectional transformers for language understanding. In Proc. 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies Vol. 1 (eds Burstein, J. et al.) 4171–4186 (ACL, 2019).

  117. Brown, T. B. et al. Language models are few-shot learners. In Proc. Advances in Neural Information Processing Systems 33, NeurIPS 2020 (eds Larochelle, H. et al.) 1877–1901 (NeurIPS, 2020).

  118. Dosovitskiy, A. et al. An image is worth 16x16 words: transformers for image recognition at scale. In Proc. 9th International Conference on Learning Representations, ICLR 2021 (OpenReview.net, 2021).

  119. Zaheer, M. et al. Big Bird: transformers for longer sequences. In Proc. Advances in Neural Information Processing Systems 33, NeurIPS 2020 (eds Larochelle, H. et al.) 17283–17297 (NeurIPS, 2020).

  120. Dhariwal, P. & Nichol, A. Q. Diffusion models beat GANs on image synthesis. In Proc. Advances in Neural Information Processing Systems 34, NeurIPS 2021 (eds Ranzato, M. et al.) 8780–8794 (NeurIPS, 2021).

  121. Croitoru, F.-A., Hondru, V., Ionescu, R. T. & Shah, M. Diffusion models in vision: a survey. IEEE Trans. Pattern Anal. Mach. Intell. https://doi.org/10.1109/TPAMI.2023.3261988 (2023).

  122. Huang, Y.-F. & Siepel, A. Estimation of allele-specific fitness effects across human protein-coding sequences and implications for disease. Genome Res. 29, 1310–1321 (2019).

  123. Bishop, C. M. Pattern Recognition and Machine Learning 1st edn (Springer, 2006). This classic textbook covers a range of machine learning algorithms and statistical inference approaches, which are also widely used in population genetic inference.

  124. Bengio, Y. in Neural Networks: Tricks of the Trade (eds Montavon, G. et al.) 437–478 (Springer, 2012).

  125. Tieleman, T. & Hinton, G. Lecture 6.5-RmsProp: divide the gradient by a running average of its recent magnitude. Coursera: Neural Networks for Machine Learning 4, 26–31 (Coursera, 2012).

  126. Kingma, D. P. & Ba, J. Adam: a method for stochastic optimization. In Proc. 3rd International Conference on Learning Representations (ICLR, 2015).

  127. Jospin, L. V., Laga, H., Boussaid, F., Buntine, W. & Bennamoun, M. Hands-on Bayesian neural networks—a tutorial for deep learning users. IEEE Comput. Intell. Mag. 17, 29–48 (2022).

  128. Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I. & Salakhutdinov, R. Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15, 1929–1958 (2014).

  129. Prechelt, L. in Neural Networks: Tricks of the Trade (eds Montavon, G. et al.) 53–67 (Springer, 2012).

  130. Arlot, S. & Celisse, A. A survey of cross-validation procedures for model selection. Stat. Surv. 4, 40–79 (2010).

  131. Ioffe, S. & Szegedy, C. Batch normalization: accelerating deep network training by reducing internal covariate shift. In Proc. 32nd International Conference on Machine Learning Vol. 37 (eds Bach, F. & Blei, D.) 448–456 (PMLR, 2015).

  132. Luo, P., Wang, X., Shao, W. & Peng, Z. Towards understanding regularization in batch normalization. In Proc. 7th International Conference on Learning Representations, ICLR 2019 (OpenReview.net, 2019).

  133. Green, R. E. et al. A draft sequence of the Neanderthal genome. Science 328, 710–722 (2010).

  134. Borji, A. Pros and cons of GAN evaluation measures: new developments. Comput. Vis. Image Underst. 215, 103329 (2022).

  135. Theis, L., van den Oord, A. & Bethge, M. A note on the evaluation of generative models. In Proc. 4th International Conference on Learning Representations (ICLR, 2016).

  136. Sajjadi, M. S. M., Bachem, O., Lucic, M., Bousquet, O. & Gelly, S. Assessing generative models via precision and recall. In Proc. Advances in Neural Information Processing Systems 31, NeurIPS 2018 (eds Bengio, S. et al.) 5228–5237 (NeurIPS, 2018).

  137. Naeem, M. F., Oh, S. J., Uh, Y., Choi, Y. & Yoo, J. Reliable fidelity and diversity metrics for generative models. In Proc. 37th International Conference on Machine Learning Vol. 119 (eds Daumé, H. III & Singh, A.) 7176–7185 (PMLR, 2020).

  138. Perera, M. et al. Generative moment matching networks for genotype simulation. In Proc. 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC) 1379–1383 (IEEE, 2022).

  139. Kynkäänniemi, T., Karras, T., Laine, S., Lehtinen, J. & Aila, T. Improved precision and recall metric for assessing generative models. In Proc. Advances in Neural Information Processing Systems 32, NeurIPS 2019 (eds Wallach, H. et al.) 3904–3913 (NeurIPS, 2019).

  140. Cornuet, J. M., Aulagnier, S., Lek, S., Franck, S. & Solignac, M. Classifying individuals among infra-specific taxa using microsatellite data and neural networks. C. R. Acad. Sci. III 319, 1167–1177 (1996).

  141. Guinand, B. et al. Comparisons of likelihood and machine learning methods of individual classification. J. Hered. 93, 260–269 (2002).

  142. Sengupta, S. et al. A review of deep learning with special emphasis on architectures, applications and recent trends. Knowl. Based Syst. 194, 105596 (2020).

  143. Hornik, K., Stinchcombe, M. & White, H. Multilayer feedforward networks are universal approximators. Neural Netw. 2, 359–366 (1989).

  144. Schäfer, A. M. & Zimmermann, H. G. Recurrent neural networks are universal approximators. In Proc. 16th International Conference Artificial Neural Networks-ICANN 2006, Part I (eds Kollias, S. D. et al.) 632–640 (Springer, 2006).

  145. Browning, S. R., Browning, B. L., Zhou, Y., Tucci, S. & Akey, J. M. Analysis of human sequence data reveals two pulses of archaic Denisovan admixture. Cell 173, 53–61 (2018).

  146. Frolov, S., Hinz, T., Raue, F., Hees, J. & Dengel, A. Adversarial text-to-image synthesis: a review. Neural Netw. 144, 187–209 (2021).

  147. Abrantes, J. P., Abrantes, A. J. & Oliehoek, F. A. Mimicking evolution with reinforcement learning. Preprint at arXiv https://doi.org/10.48550/arXiv.2004.00048 (2020).

  148. Fawzi, A. et al. Discovering faster matrix multiplication algorithms with reinforcement learning. Nature 610, 47–53 (2022).

  149. Mankowitz, D. J. et al. Faster sorting algorithms discovered using deep reinforcement learning. Nature 618, 257–263 (2023).

  150. Hui, Z., Li, J., Wang, X. & Gao, X. Learning the non-differentiable optimization for blind super-resolution. In Proc. 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2093–2102 (IEEE, 2021).

  151. Silver, D. et al. Mastering the game of Go without human knowledge. Nature 550, 354–359 (2017).

  152. Ibnu, C. R. M., Santoso, J. & Surendro, K. Determining the neural network topology: a review. In Proc. 2019 8th International Conference on Software and Computer Applications 357–362 (ACM, 2019).

  153. Menghani, G. Efficient deep learning: a survey on making deep learning models smaller, faster, and better. ACM Comput. Surv. 55, 259 (2023).

  154. He, K., Zhang, X., Ren, S. & Sun, J. Identity mappings in deep residual networks. In Proc. 14th European Conference Computer Vision — ECCV 2016, Part IV (eds Leibe, B. et al.) 630–645 (Springer, 2016).

  155. Ouyang, L. et al. Training language models to follow instructions with human feedback. In Proc. Advances in Neural Information Processing Systems 35, NeurIPS 2022 (eds Koyejo, S. et al.) 27730–27744 (NeurIPS, 2022).

  156. Touvron, H. et al. LLaMA: open and efficient foundation language models. Preprint at arXiv https://doi.org/10.48550/arXiv.2302.13971 (2023).

  157. Kang, M. et al. Scaling up GANs for text-to-image synthesis. In Proc. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 10124–10134 (IEEE, 2023).

  158. Kao, W.-T. & Lee, H.-Y. Is BERT a cross-disciplinary knowledge learner? A surprising finding of pre-trained models’ transferability. In Findings of the Association for Computational Linguistics: EMNLP 2021 (eds Moens, M.-F. et al.) 2195–2208 (ACL, 2021).

  159. Marinó, G. C., Petrini, A., Malchiodi, D. & Frasca, M. Deep neural networks compression: a comparative survey and choice recommendations. Neurocomputing 520, 152–170 (2023).

  160. Finn, C., Abbeel, P. & Levine, S. Model-agnostic meta-learning for fast adaptation of deep networks. In Proc. 34th International Conference on Machine Learning Vol. 70 (eds Precup, D. & Teh, Y. W.) 1126–1135 (PMLR, 2017).

  161. Wei, Y., Zhao, P. & Huang, J. Meta-learning hyperparameter performance prediction with neural processes. In Proc. 38th International Conference on Machine Learning Vol. 139 (eds Meila, M. & Zhang, T.) 11058–11067 (PMLR, 2021).

  162. Hospedales, T., Antoniou, A., Micaelli, P. & Storkey, A. Meta-learning in neural networks: a survey. IEEE Trans. Pattern Anal. Mach. Intell. 44, 5149–5169 (2022).

  163. Kaveh, M. & Mesgari, M. S. Application of meta-heuristic algorithms for training neural networks and deep learning architectures: a comprehensive review. Neural Process. Lett. https://doi.org/10.1007/s11063-022-11055-6 (2022).

  164. Tirumala, S. S., Ali, S. & Ramesh, C. P. Evolving deep neural networks: a new prospect. In Proc. 2016 12th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD) 69–74 (IEEE, 2016).

  165. Stanley, K. O., Clune, J., Lehman, J. & Miikkulainen, R. Designing neural networks through neuroevolution. Nat. Mach. Intell. 1, 24–35 (2019).

  166. Juan, D., Santpere, G., Kelley, J. L., Cornejo, O. E. & Marques-Bonet, T. Current advances in primate genomics: novel approaches for understanding evolution and disease. Nat. Rev. Genet. 24, 314–331 (2023).

  167. Wang, Y., Yao, Q., Kwok, J. T. & Ni, L. M. Generalizing from a few examples: a survey on few-shot learning. ACM Comput. Surv. 53, 63 (2020).

  168. Wang, W., Zheng, V. W., Yu, H. & Miao, C. A survey of zero-shot learning: settings, methods, and applications. ACM Trans. Intell. Syst. Technol. 10, 13 (2019).

  169. Saada, J. N., Hu, A. & Palamara, P. F. in Workshop on Learning Meaningful Representations of Life at 35th Conf. Neural Information Processing Systems. LMRL https://www.lmrl.org/papers2021 (2021).

  170. Lauterbur, M. E. et al. Expanding the stdpopsim species catalog, and lessons learned for realistic genome simulations. eLife 12, RP84874 (2023).

  171. Hudson, R. R. Generating samples under a Wright–Fisher neutral model of genetic variation. Bioinformatics 18, 337–338 (2002).

  172. Baumdicker, F. et al. Efficient ancestry and mutation simulation with msprime 1.0. Genetics 220, iyab229 (2022).

  173. Haller, B. C. & Messer, P. W. SLiM 3: forward genetic simulations beyond the Wright–Fisher model. Mol. Biol. Evol. 36, 632–637 (2019).

  174. Huang, X. et al. Inferring genome-wide correlation of mutation fitness effects between populations. Mol. Biol. Evol. 38, 4588–4602 (2021).

  175. Ewing, G. B. & Jensen, J. D. The consequences of not accounting for background selection in demographic inference. Mol. Ecol. 25, 135–141 (2016).

  176. Mo, Z. & Siepel, A. Domain-adaptive neural networks improve supervised machine learning based on simulated population genetic data. Preprint at bioRxiv https://doi.org/10.1101/2023.03.01.529396 (2023).

  177. Hendrycks, D., Lee, K. & Mazeika, M. Using pre-training can improve model robustness and uncertainty. In Proc. 36th International Conference on Machine Learning Vol. 97 (eds Chaudhuri, K. & Salakhutdinov, R.) 2712–2721 (PMLR, 2019).

  178. Hendrycks, D., Mazeika, M., Kadavath, S. & Song, D. Using self-supervised learning can improve model robustness and uncertainty. In Proc. Advances in Neural Information Processing Systems 32, NeurIPS 2019 (eds Wallach, H. et al.) 15584–15595 (NeurIPS, 2019).

  179. Abadi, M. et al. TensorFlow: large-scale machine learning on heterogeneous systems. TensorFlow https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45166.pdf (2015).

  180. Paszke, A. et al. PyTorch: an imperative style, high-performance deep learning library. In Proc. Advances in Neural Information Processing Systems 32, NeurIPS 2019 (eds Wallach, H. M. et al.) 7994–8005 (NeurIPS, 2019).

  181. Chen, B. et al. Towards training reproducible deep learning models. In Proc. 2022 ACM/IEEE 44th International Conference on Software Engineering, ICSE 2022 2202–2214 (ACM, 2022).

  182. Walsh, I. et al. DOME: recommendations for supervised machine learning validation in biology. Nat. Methods 18, 1122–1127 (2021).

  183. Sanchez, T. et al. dnadna: a deep learning framework for population genetics inference. Bioinformatics 39, btac765 (2023).

  184. Montserrat, D. M. & Ioannidis, A. G. Adversarial attacks on genotype sequences. In Proc. ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech, and Signal Processing (IEEE, 2023).

  185. Ren, K., Zheng, T., Qin, Z. & Liu, X. Adversarial attacks and defenses in deep learning. Engineering 6, 346–360 (2020).

  186. Montavon, G., Samek, W. & Müller, K.-R. Methods for interpreting and understanding deep neural networks. Digit. Signal Process. 73, 1–15 (2018).

  187. Azodi, C. B., Tang, J. & Shiu, S.-H. Opening the black box: interpretable machine learning for geneticists. Trends Genet. 36, 442–455 (2020).

  188. Novakovsky, G., Dexter, N., Libbrecht, M. W., Wasserman, W. W. & Mostafavi, S. Obtaining genetics insights from deep learning via explainable artificial intelligence. Nat. Rev. Genet. 24, 125–137 (2023).

  189. Liang, Y., Li, S., Yan, C., Li, M. & Jiang, C. Explaining the black-box model: a survey of local interpretation methods for deep neural networks. Neurocomputing 419, 168–182 (2021).

  190. Saleem, R., Yuan, B., Kurugollu, F., Anjum, A. & Liu, L. Explaining deep neural networks: a survey on the global interpretation methods. Neurocomputing 513, 165–180 (2022).

  191. Ribeiro, M. T., Singh, S. & Guestrin, C. “Why should I trust you?”: explaining the predictions of any classifier. In Proc. 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 1135–1144 (ACM, 2016).

  192. Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. In Proc. Advances in Neural Information Processing Systems 30, NIPS 2017 (eds Guyon, I. et al.) 4768–4777 (NIPS, 2017).

  193. Simonyan, K., Vedaldi, A. & Zisserman, A. Deep inside convolutional networks: visualising image classification models and saliency maps. Preprint at arXiv https://doi.org/10.48550/arXiv.1312.6034 (2013).

  194. McVean, G. A genealogical interpretation of principal components analysis. PLoS Genet. 5, e1000686 (2009).

  195. Peter, B. M. A geometric relationship of F2, F3 and F4-statistics with principal component analysis. Philos. Trans. R. Soc. B 377, 20200413 (2022).

  196. Tenachi, W., Ibata, R. & Diakogiannis, F. I. Deep symbolic regression for physics guided by units constraints: toward the automated discovery of physical laws. Preprint at arXiv https://doi.org/10.48550/arXiv.2303.03192 (2023).

  197. OpenAI. GPT-4 technical report. Preprint at arXiv https://doi.org/10.48550/arXiv.2303.08774 (2023).

  198. Bubeck, S. et al. Sparks of artificial general intelligence: early experiments with GPT-4. Preprint at arXiv https://doi.org/10.48550/arXiv.2303.12712 (2023).

  199. Pearson, K. Notes on the history of correlation. Biometrika 13, 25–45 (1920).

  200. Denis, D. J. The origins of correlation and regression: Francis Galton or Auguste Bravais and the error theorists? Hist. Philos. Psychol. Bull. 13, 36–44 (2001).

  201. Ho, J., Jain, A. & Abbeel, P. Denoising diffusion probabilistic models. In Proc. Advances in Neural Information Processing Systems 33, NeurIPS 2020 (eds Larochelle, H. et al.) 6840–6851 (NeurIPS, 2020).

  202. Patel, A., Montserrat, D. M., Bustamante, C. & Ioannidis, A. Hyperbolic geometry-based deep learning methods to produce population trees from genotype data. Preprint at bioRxiv https://doi.org/10.1101/2022.03.28.484797 (2022).

Acknowledgements

The authors are grateful for support from the Life Science Compute Cluster (LiSC) of the University of Vienna and for feedback from R. N. Gutenkunst. X.H. thanks H.-T. Lin and H.-Y. Lee for their extraordinary open online courses on machine learning. O.D. and O.L. acknowledge support from the John Templeton Foundation (ID: 62178). O.L. acknowledges CSIC support (Proyecto Intramural I3-2022). M.K. is funded by the Vienna Science and Technology Fund (WWTF) (10.47379/VRG20001). The authors used ChatGPT with GPT-4 from OpenAI for language editing of the initial draft manuscript.

Author information

Contributions

X.H., M.K. and O.L. researched the literature and wrote the article. All authors substantially contributed to discussions of the content and reviewed and/or edited the manuscript before submission.

Corresponding authors

Correspondence to Xin Huang, Oscar Lao or Martin Kuhlwilm.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Reviews Genetics thanks Alexander Ioannidis and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Related links

Open Neural Network Exchange: https://onnx.ai/

Glossary

Accuracy

The ratio of the number of correct predictions to the total number of predictions in a data set.
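
In symbols, using standard binary confusion-matrix counts (notation added here for illustration, not taken from the article):

    \mathrm{Accuracy} = \frac{\mathrm{TP} + \mathrm{TN}}{\mathrm{TP} + \mathrm{TN} + \mathrm{FP} + \mathrm{FN}}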

Admixture

The process in which genetic material from multiple populations merges into a single population.

Adversarial attacks

Techniques that slightly perturb input data and cause machine learning models to make wrong predictions on such manipulated inputs with high confidence.

Approximate Bayesian computation

A statistical inference approach that uses simulation to estimate the posterior distribution of model parameters based on observed data when the exact posterior distribution is intractable, as motivated by Bayes’ theorem.
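
As a toy illustration of the idea (not a method from this article), the following Python sketch applies ABC rejection sampling to a coin-flip model; the observed count, uniform prior and tolerance are all illustrative choices.

    import random

    # Toy ABC rejection sampling: infer the success probability p of a
    # Bernoulli process from an observed count of successes (made-up data).
    observed_successes, n_trials, tolerance = 62, 100, 3
    accepted = []

    for _ in range(20_000):
        p = random.uniform(0.0, 1.0)                 # draw a candidate from the prior
        simulated = sum(random.random() < p for _ in range(n_trials))
        if abs(simulated - observed_successes) <= tolerance:
            accepted.append(p)                       # keep: close enough to the data

    # The accepted values approximate the posterior distribution of p.
    print(f"posterior mean of p ~ {sum(accepted) / len(accepted):.2f}")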

Backpropagation

An algorithm that computes the gradient of the loss with respect to every parameter of an artificial neural network (ANN) by recursively applying the chain rule from calculus, propagating gradients from the output layer back to the input layer so that the parameters can be updated.
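
A minimal NumPy sketch of the idea, assuming a one-hidden-layer network with a tanh activation and a squared-error loss (both chosen here purely for illustration):

    import numpy as np

    # One hidden layer, one training example; gradients via the chain rule.
    rng = np.random.default_rng(0)
    x, y = rng.normal(size=3), 1.0                   # input vector and target
    W1, W2 = rng.normal(size=(4, 3)), rng.normal(size=4)

    h = np.tanh(W1 @ x)                              # forward pass
    y_hat = W2 @ h
    loss = 0.5 * (y_hat - y) ** 2

    d_yhat = y_hat - y                               # backward pass: output -> input
    dW2 = d_yhat * h
    dh = d_yhat * W2
    dW1 = ((1 - h ** 2) * dh)[:, None] * x[None, :]  # tanh'(z) = 1 - tanh(z)^2

    W1 -= 0.1 * dW1                                  # gradient-descent update
    W2 -= 0.1 * dW2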

Cross-entropy

A metric that measures the performance of machine learning models for classification by comparing the similarity of two probability distributions.
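
For a true distribution p and a predicted distribution q over K classes, this is commonly written as

    H(p, q) = -\sum_{k=1}^{K} p_k \log q_k

which for a one-hot label reduces to the negative log probability assigned to the correct class.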

Decision trees

A class of supervised learning algorithms that make predictions by learning and organizing rules into a binary tree-like structure from data.

Domain adaptation

A technique that enables machine learning models trained with data from one domain (source domain) to be adapted and make accurate predictions on data from a different but related domain (target domain).

Dropout

A technique to regularize artificial neural networks (ANNs) by randomly deactivating neurons during training.
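
A minimal Python sketch of one common variant, 'inverted' dropout (the rescaling keeps the expected activation unchanged, so nothing needs adjusting at test time):

    import numpy as np

    def dropout(activations: np.ndarray, rate: float, training: bool) -> np.ndarray:
        # Zero each unit with probability `rate` during training and rescale
        # the survivors; at test time the input passes through unchanged.
        if not training or rate == 0.0:
            return activations
        mask = np.random.rand(*activations.shape) >= rate
        return activations * mask / (1.0 - rate)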

Early stopping

A technique to regularize artificial neural networks (ANNs) by stopping training before reaching the minimum loss on the validation set.
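
A toy Python sketch of the bookkeeping, using a made-up validation-loss trace and an illustrative patience of three epochs:

    # Stop when the validation loss has not improved for `patience` epochs.
    val_losses = [0.90, 0.71, 0.60, 0.55, 0.56, 0.57, 0.58, 0.59]
    patience, best, bad = 3, float("inf"), 0

    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, bad = loss, 0          # improvement: reset the counter
        else:
            bad += 1                     # one more epoch without improvement
            if bad >= patience:
                print(f"stopping at epoch {epoch}; best validation loss {best:.2f}")
                break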

F1 score

A metric that measures the performance of machine learning models for binary classification by calculating the harmonic mean of a given pair of precision and recall.
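
A worked example in Python from illustrative confusion-matrix counts:

    # Precision, recall and F1 from made-up binary confusion-matrix counts.
    tp, fp, fn = 40, 10, 20
    precision = tp / (tp + fp)                          # 0.800
    recall = tp / (tp + fn)                             # 0.667
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    print(f"precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")  # F1=0.727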

Game theory

A mathematics discipline that studies strategies of interaction among rational players.

Gated recurrent units

Units similar to long short-term memory but with fewer parameters that improve the performance of recurrent neural networks (RNNs) on long sequences.

Grid search

A technique that optimizes hyperparameters by training and evaluating the model performance on combinations of a predefined set of hyperparameters.
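
A minimal Python sketch; the toy scoring function here stands in for training a model and measuring its validation performance:

    from itertools import product

    # Exhaustive search over a predefined hyperparameter grid.
    def train_and_score(lr: float, layers: int) -> float:
        return -(lr - 0.01) ** 2 - 0.1 * (layers - 3) ** 2  # toy objective

    grid = {"lr": [0.001, 0.01, 0.1], "layers": [1, 2, 3]}
    best = max(product(grid["lr"], grid["layers"]),
               key=lambda combo: train_and_score(*combo))
    print("best (lr, layers):", best)                       # (0.01, 3)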

Hidden Markov models

A class of generative models for sequential data that assume the observed sequence is generated by a sequence of unobserved (hidden) states, each of which depends only on the immediately preceding state rather than on all earlier states.

Logistic regression

A type of regression for binary outcomes that outputs a probability between zero and one; it can be viewed as a special type of artificial neural network (ANN) composed of a single neuron with a sigmoid activation function.
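
In symbols, for input features x, weights w and bias b (standard notation, added here for illustration):

    \hat{y} = \sigma(\mathbf{w}^{\top}\mathbf{x} + b), \qquad \sigma(z) = \frac{1}{1 + e^{-z}}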

Long short-term memory

A unit that improves the performance of recurrent neural networks (RNNs) on long sequences by deciding whether information should be remembered or forgotten in the neural network along the sequence.

Loss function

A mathematical function that quantifies the difference (that is, loss) between the actual data and the predictions made by a machine learning model.

Markov chain

A sequence of random variables where the future state of a given step only depends on the current state and remains unaffected by all previous states.

Markov chain Monte Carlo

A statistical inference approach that estimates statistical model parameters by simulating a Markov chain of parameter values whose long-run distribution approximates the probability distribution of the parameters.
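
A minimal Metropolis sampler in Python, targeting a standard normal distribution purely for illustration (the target, proposal scale and chain length are all toy choices):

    import math
    import random

    def log_target(x: float) -> float:
        return -0.5 * x * x                    # log density of N(0, 1), up to a constant

    samples, x = [], 0.0
    for _ in range(10_000):
        proposal = x + random.gauss(0.0, 1.0)  # symmetric random-walk proposal
        log_ratio = log_target(proposal) - log_target(x)
        if random.random() < math.exp(min(0.0, log_ratio)):
            x = proposal                       # accept; otherwise keep the current value
        samples.append(x)

    print(f"sample mean ~ {sum(samples) / len(samples):.2f}")  # close to 0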

Maximum likelihood estimation

A statistical inference approach for estimating statistical model parameters by finding parameters that can maximize the probability of observed data.
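
In symbols, for independent observations x_1, ..., x_n with density f (standard notation, added here for illustration):

    \hat{\theta} = \arg\max_{\theta} \prod_{i=1}^{n} f(x_i \mid \theta)

For example, for n independent Bernoulli trials with k successes, the maximizer is \hat{p} = k/n.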

Meta-learning

A class of machine learning algorithms that automates the learning process of machine learning algorithms for different tasks.

Mixture models

Probabilistic models that represent data as a weighted combination of several component probability distributions.

Precision

The ratio of the number of instances correctly predicted as belonging to a class to the total number of instances that the model predicted as belonging to that class.

Principal component analysis

A technique that reduces the dimensionality of high-dimensional continuous data while keeping most of the information from the data by exploiting linear relationships among the features.
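
A minimal NumPy sketch via singular value decomposition of the centred data matrix (random data used purely for illustration):

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 5))               # 100 samples, 5 features
    Xc = X - X.mean(axis=0)                     # centre each feature
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    pcs = Xc @ Vt[:2].T                         # coordinates on the top two components
    explained = S[:2] ** 2 / np.sum(S ** 2)     # fraction of variance explained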

Random search

A technique that optimizes hyperparameters by training and evaluating the model performance on random combinations of hyperparameters from a predefined search space.

Recall

The ratio of the number of instances correctly predicted as belonging to a class to the total number of instances that actually belong to that class in a data set.

Rectified linear unit

A common activation function that returns the input value if the input value is larger than zero, or returns zero otherwise.
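
In symbols: \mathrm{ReLU}(x) = \max(0, x); for example, ReLU(-2) = 0 and ReLU(3) = 3.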

Regression

A supervised learning task that makes quantitative predictions from input data.

Saliency maps

Graphical representations that visualize the contributions of each pixel in an image to the predictions made by an artificial neural network (ANN), revealing the regions of the image that significantly influence the decision-making process of the network.

Sequence-to-sequence learning

A machine learning task that involves training models to convert input sequences into corresponding output sequences, which is often used in natural language processing for applications such as machine translation and speech recognition.

Simulated annealing approaches

Optimization methods that iteratively propose new solutions and accept worse ones with a probability that decreases as the number of iterations grows and as the quality of the proposed solution worsens, allowing the search to escape local optima and approach the global optimum.

Stochastic gradient descent

An optimization algorithm that iteratively updates parameters in machine learning models by randomly choosing data to calculate the gradients of the loss function.
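
A minimal Python sketch fitting a one-parameter linear model to toy data, using one randomly chosen example per update (data, learning rate and step count are illustrative):

    import random

    # Fit y = w * x by SGD on noisy synthetic data.
    data = [(x, 2.0 * x + random.gauss(0.0, 0.1)) for x in range(-10, 11)]
    w, lr = 0.0, 0.001

    for _ in range(5_000):
        x, y = random.choice(data)          # random example -> noisy gradient estimate
        grad = 2.0 * (w * x - y) * x        # derivative of the squared error w.r.t. w
        w -= lr * grad                      # step against the stochastic gradient

    print(f"w ~ {w:.2f}")                   # approaches the true slope 2.0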

Structured prediction

A supervised learning task that, unlike traditional classification or regression tasks that predict a single output value, produces complex output structures such as sequences, trees and graphs.

Summary statistics

Metrics computed from genetic variants that are informative for an evolutionary parameter of interest.

Support

A set of values of a random variable for which the probabilities are greater than zero with a given probability distribution.

Tensors

Multidimensional arrays that are generalized from vectors and matrices for organizing high-dimensional data.

Transfer learning

A technique that allows reusing previously trained successful models for one task in other similar tasks.

Weight decay

A technique to regularize machine learning algorithms by penalizing large values in the model parameters through adding a term to the loss function that is proportional to the square of the magnitude of the parameters.
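
In symbols, for the common L2 form (standard notation, added here for illustration):

    L'(\mathbf{w}) = L(\mathbf{w}) + \lambda \lVert \mathbf{w} \rVert_2^2

where λ controls the strength of the penalty.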

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Huang, X., Rymbekova, A., Dolgova, O. et al. Harnessing deep learning for population genetic inference. Nat Rev Genet 25, 61–78 (2024). https://doi.org/10.1038/s41576-023-00636-3
