The past two decades have witnessed extraordinary technological and computational advances in nucleic acid sequencing. This Milestone timeline provides a perspective of major genomic sequencing-related developments in the 21st century — from the first human reference genome, through methodological breakthroughs, to the impact of sequencing on fields as diverse as microbiology, cancer and palaeogenetics.

Although single reference genomes are valuable resources, they do not capture genetic diversity among individuals. Sherman and Salzberg discuss the concept of ‘pan-genomes’, which are reference genomes that encompass the genetic variation within a given species. Focusing particularly on large eukaryotic pan-genomes, they describe the latest progress, the varied methodological approaches and computational challenges, as well as applications in fields such as agriculture and human disease.

Long-read sequencing is becoming more accessible and more accurate. In this Review, Logsdon et al. discuss the currently available platforms, how the technologies are being applied to assemble and phase human genomes, and their impact on improving our understanding of human genetic variation.

Whole-genome sequencing of monozygotic twins, along with their parents, spouses and children, identifies postzygotic mutations present in the somatic tissue of one twin, but not the other, and characterizes differences in the number and timing of these mutations.

Ancient DNA reveals genetic differences between stone-tool users and people associated with ceramic technology in the Caribbean and provides substantially lower estimates of population sizes in the region before European contact.

Further reading: Research and Reviews articles

Thirty years on from the launch of the Human Genome Project, Richard Gibbs reflects on the promises that this voyage of discovery bore. Its success should be measured by how this project transformed the rules of research, the way of practising biological discovery and the ubiquitous digitization of biological science.

In this Review, Lisa Waylen and colleagues provide an overview of techniques used for spatial resolution of gene expression in a tissue or organ. They discuss the advantages, disadvantages and future directions of current methods and illustrate how spatial transcriptomics has impacted our understanding of biology.

The authors summarize the history of the ENCODE Project, the achievements of ENCODE 1 and ENCODE 2, and how the new data generated and analysed in ENCODE 3 complement the previous phases.

Genomes are partitioned into topologically associating domains (TADs). Here the authors present single-nucleus Hi-C maps in Drosophila at 10 kb resolution, demonstrating the presence of chromatin compartments in individual nuclei, and partitioning of the genome into non-hierarchical TADs at the scale of 100 kb, which resembles population TAD profiles.

A chromosome-quality genome of the lungfish Neoceratodus forsteri sheds light on the development of obligate air-breathing and the gain of limb-like gene expression in lobed fins, providing insights into the water-to-land transition in vertebrate evolution.

Genome-wide detection of inversions in great ape genomes by using long-read sequencing and single-cell DNA template strand sequencing (Strand-seq) expands the number of known ape inversions and identifies several regions that have recurrently toggled between a direct and an inverted state during primate evolution.

Whole-genome bisulfite sequencing along with whole-genome and transcriptome sequencing of 100 prostate cancer metastases identifies genomic regions that are differentially methylated during disease progression and a novel epigenomic subtype.

Whole-genome sequencing analyses of African populations provide insights into continental migration, gene flow and the response to human disease, highlighting the importance of including diverse populations in genomic analyses to understand human ancestry and improve health.

Maria Teschler-Nicola et al. use ancient DNA sequencing to report the earliest known case of human monozygotic twins found in a previously discovered Upper Palaeolithic burial site. Using bioanthropological and archaeological techniques, they also find that the twins were full-term newborns and that ancient mortuary behaviour included re-opening of grave sites to bury related individuals together.

How Indigenous populations in the southern tip of South America have changed over time has been unclear. Here the authors generate genome-wide data for 20 ancient individuals and examine how past migrations and admixture events correlate with geography and shifts in the archaeological record.

Leonard Goldstein et al. use high-throughput single-cell B-cell receptor sequencing on thousands of individual B cells from rat, mouse and human repertoires. They obtain paired full-length heavy- and light-chain variable regions and show that this approach is a powerful tool for antibody discovery.

Nanopore sequencing technology generates longer reads than current technologies, but with more errors. Here, the authors develop new analytical tools to improve accuracy and evaluate the potential of nanopore sequencing for clinical human genomics.

Birch pitch is thought to have been used in prehistoric times as hafting material or antiseptic, and tooth imprints suggest that it was chewed. Here, the authors report a 5,700-year-old piece of chewed birch pitch from Denmark from which they successfully recovered a complete ancient human genome and oral microbiome DNA.

RNA velocity, estimated in single cells by comparison of spliced and unspliced mRNA, is a good indicator of transcriptome dynamics and will provide a useful tool for analysis of developmental lineage.

