Recent Regulatory Changes Shaped Human Facial and Vocal Anatomy

Jerusalem, Edmond J. Safra Campus, Givat Ram, Jerusalem 91904, Israel. 2 School of Human Evolution and Social Change, Arizona State University, Tempe, AZ 85281, USA. 3 Department of Genetics, Texas Biomedical Research Institute, San Antonio, Texas 85287, USA. 4 Broad Institute, Cambridge MA 02138, USA. 5 Institute of Evolutionary Biology (UPF-CSIC), 08003 Barcelona, Spain. 6 Center for Bioarchaeological Research, Arizona State University, Tempe, AZ 85287, USA. 7 Institute of Human Origins, Arizona State University, Tempe, AZ 85287, USA. 8 Department of Evolutionary Anthropology , Duke University , Durham, NC 27708, USA. 9 Orthopaedic Department, Hadassah – Hebrew University Medical Center, Jerusalem, Israel. 10 I.E.S.O. ‘Los Salados’. Junta de Castilla y León, Spain 11 Junta de Castilla y León, Servicio de Cultura de León, Spain 12 Max Planck Institute for the Science of Human History, 07745 Jena, Germany. 13 Department of Statistics, The Hebrew University of Jerusalem, Jerusalem 91905, Israel. 14 Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig D-04103, Germany. 15 Department of Genetics, Harvard Medical School, Boston, MA 02115, USA. 16 Howard Hughes Medical Institute, Harvard Medical School, Boston, MA 02115, USA. 17 Catalan Institution of Research and Advanced Studies (ICREA), 08010 Barcelona, Spain. 18 Centro Nacional de Análisis Genómico (CRG-CNAG), 08028 Barcelona, Spain. 19 The Edmond and Lily Safra Center for Brain Sciences (ELSC), The Hebrew University of Jerusalem, Edmond J. Safra


2
Identifying changes in gene regulation that shaped human-specific traits is critical to understanding human adaptation. Here, we use dozens of ancient and present-day DNA methylation maps to detect regulatory changes that emerged in modern humans. We show that genes affecting vocalization and facial features went through particularly extensive changes in methylation. Especially, we identify expansive changes in a network of genes regulating skeletal development (SOX9, ACAN and COL2A1), and in NFIX, which controls facial projection and voice box (larynx) development. We propose that these changes played a key role in shaping the human face, and in forming the human 1:1 vocal tract configuration that is considered optimal for speech. Our results provide insights into the molecular mechanisms that shaped the modern human face and voice, and suggest that they arose after the split from Neanderthals and Denisovans.
The advent of high-quality ancient genomes of archaic humans (the Neanderthal and Denisovan) and anatomically modern humans (AMHs) opened up the possibility to identify genomic changes that are unique to AMHs, and may underlie some of our defining traits 1 . Although the biological impact of mutations in protein-coding regions is easier to predict, these mutations are rare 2 . The overwhelming majority of the ~30,000 fixed substitutions between archaic and present-day humans occurred at noncoding regions 2 . While most of them are probably nearly neutral, a sizeable minority likely affects gene regulation, especially those mutations that reside in such regions as promoters and enhancers. In fact, regulatory changes are thought to account for much of the phenotypic variation that differentiates between closely-related groups 3 . Today, our ability to understand the regulatory function and phenotypic effect of a noncoding substitution is very limited. Thus the importance in looking at regulatory layers such as DNA methylation, which provide direct information on gene regulation.
To this end, we have previously developed a method to reconstruct DNA methylation maps along ancient genomes 4 . This method is based on a predominant degradation process in ancient DNA, where methylated cytosines are deaminated with time into thymines, while unmethylated cytosines are deaminated into uracils [4][5][6] . The subsequent removal of uracils during ancient DNA library preparation creates a signal that enables the distinction between pre-mortem methylated and unmethylated positions [4][5][6] . Using this method, we have reconstructed the methylomes of a Neanderthal and a Denisovan, and compared them to a present-day osteoblast methylation map 4 .
However, the ability to identify differentially methylated regions (DMRs) between the human groups was confined by the incomplete reference map, the differences in sequencing technologies, the lack of an outgroup and the restricted set of skeletal samples (see Methods). Therefore, we have sought to establish a comprehensive assembly of skeletal DNA methylation maps. To the previously reconstructed Denisovan and Altai Neanderthal maps, we added the methylome of the ~40,000 years old (yo) Vindija Neanderthal, and three methylomes of ancient AMHsthe ~45,000 yo Ust'-Ishim indvidual 7 , the ~8,000 yo Loschbour individual 8 , and the ~7,000 yo Stuttgart individual 8 . In addition, we sequenced to high-coverage the ancient genome of the ~7,000 yo La Braña 1 AMH individual, which was previously sequenced only to lowcoverage 9 . To obtain a full modern methylation map, we produced whole-genome bisulfite sequencing (WGBS) methylomes from the bones of two present-day individuals (hereinafter bone1 and bone2). To this we added 53 publically available bone methylation maps from present-day individuals, produced using reduced-representation bisulfite-sequencing (RRBS) 10 and 450K methylation arrays 11,12 . The ancient and modern samples come from a variety of bones and teeth, from both sexes, from individuals of various ages and of different ancestries (Extended Data Table 1). Therefore, DMRs that are identified between the human groups are unlikely to stem from variability that is driven by any of these factors. As an outgroup, we produced six chimpanzee methylomes (WGBS, RRBS, and four 850K methylation arrays, Extended Data Table 1). Together, these data establish a unique and comprehensive platform to study DNA methylation dynamics at the population level in recent human history.
In order to minimize artifacts that might arise from comparing maps produced through different technologies, we used the reconstructed Ust'-Ishim methylome as the AMH reference, to which we compared the Altai Neanderthal and the Denisovan (see Methods). Over 99% of the genome showed no significant variation between the three samples, but 18,080 loci showed methylation differences separating these individuals. Notably, these DMRs do not necessarily represent differences between the human groups. Rather, many of them could be attributed to variability within the AMH population, within archaic humans, or between extant and extinct humans. To account for this, we used our large sample collection to filter out regions where such variability is detected, leaving us with a set of 6,371 DMRs that discriminate between human groups.
Finally, using the chimpanzee samples, we were able to polarize 3,869 of these DMRs into AMH-derived (1,667), archaic-derived (1,103), Neanderthal-derived (597), and Denisovanderived (502). These DMRs were ranked according to their significance level (Extended Data To gain insight into the function of these DNA methylation changes, we first analyzed the gene ontology (GO) of the AMH-derived differentially methylated genes (DMGs). As expected from a comparison between skeletal tissues, there are many enriched terms associated with the skeleton (e.g., chondrocyte differentiation, proteoglycan biosynthetic process, cartilage development, embryonic skeletal system development and ossification, FDR < 0.05). Also notable are terms associated with the skeletal muscle, cardiovascular and nervous system (Extended Data Table 3). To get a more precise picture of the possible functional consequences of these DMGs, we used Gene ORGANizer (geneorganizer.huji.ac.il, paper submitted). This is a phenotype-based tool that links genes to the organs where their phenotypes are observed, and allowed us to identify organs that are significantly over-represented. We tested our three hierarchies of AMH-derived DMGs in Gene ORGANizer and found that, regardless of the hierarchy level, genes that affect the voice are the most enriched (Fig. 1a, Extended Data Table   4). For example, when running the list of 881 skeletal AMH-derived DMRs, we identify 14 enriched body parts, with the strongest enrichment in the vocal cords (x2.18, FDR = 0.01), followed by the voice box (larynx, x1.73, FDR = 0.029) and then by body parts belonging primarily to the face, spine and pelvis (Fig. 1a, Extended Data Table 4). Interestingly, these parts are considered some of the most morphologically derived regions between Neanderthals and AMHs 13 . The voice-affecting DMGs (Table 1) were shown to shape voice production mainly through skeletal alterations of the larynx (where voice is produced) and vocal tract (the pharyngeal, oral and nasal cavities, where sound is filtered to specific frequencies). The phenotypes associated with these genes range from slight changes of the pitch and hoarseness of the voice, to a complete loss of speech ability (Table 1). When taking the top quartile of the most significant DMGs, the over-representation of voice-affecting genes becomes even more pronounced, with the vocal cords being enriched over 3-fold (FDR = 4.2x10 -3 ), and the larynx over 2-fold (FDR = 6.1x10 -3 , Fig. 1b, Extended Data Table 4). The enrichment of the larynx and the vocal tract within AMH-derived DMGs is also apparent when examining patterns of gene expression. The tissues where these DMGs are most enriched are the pharynx and larynx (x1.78, 4.9x10 -6 , x1.64, 5.3x10 -7 , respectively, Extended Data Table 5). Importantly, we ruled out the possibility that the enrichment of the larynx within AMH-derived DMGs is attributed to potential biases driven by inherent characteristics of genes affecting the voice (e.g., potential differences in gene length or genomic distribution). We found no enrichment when simulating DNA degradation and subsequent reconstruction of methylation maps (see Methods), nor do any of the other hominin branches show enrichment of voice-affecting genes (Extended Data Fig. 1, Extended Data Table 4). Finally, when comparing the chimpanzee samples to all human samples (archaic and AMH) to identify DMRs that separate chimpanzees from humans, we did not find any enrichment of genes that affect the voice, larynx or any parts of the vocal tract, supporting the notion that this trend emerged exclusively along the AMH lineage.
The ancient DNA methylation maps we use for DMR detection come from limb bones. At first glance, it may seem that the ability to make inferences about other body parts, such as the larynx, is limited. However, the laryngeal skeleton, and specifically the cricoid and arytenoid cartilages, which are central in vocalization, are very close developmentally to limbs, as both derive from the somatic layer of the lateral plate mesoderm. Furthermore, skeletal DMRs are largely devoid of regions that are variable across bones, and between bones and teeth, and are thus likely to be relevant to all skeletal parts. We conclude that voice-affecting genes are the most overrepresented DMGs along the AMH lineage, regardless of inter-skeletal variability, coverage by methylation array probes, the extent to which a DMR is variable across human or chimpanzee populations, or the significance level of the DMRs (see Methods).
To date, it is still unclear what enabled humans to evolve such exceptional speech and language abilities, and why other primates did not develop similar capabilities. The capacity of humans to communicate vocally is attributed not only to neural changes, but also to structural alterations to the vocal tract 14,15 . The relative role of anatomy in our speech skills is still under debate 14,16 , but it is nevertheless widely accepted that even with a human brain, other apes probably could not reach the human level of articulation 14,15 . Chimpanzees, for example, are restricted not only in their linguistic capacity (e.g., they can hardly learn grammar 15 ), but also in their ability to produce the phonetic range that humans can. As a result, chimpanzees communicate through sign language and symbols much better than they do vocally, even after being raised in an entirely human environment 15 . Phonetic range is determined by the different conformations that the vocal tract can produce. These conformations are largely shaped by the position of the larynx, tongue, lips and mandible. Modern humans have a 1:1 proportion between the horizontal and vertical dimensions of the vocal tract, which is unique among primates 14,17 (Fig. 1c). The extent to which this configuration is a prerequisite for speech is still debated, but it was nonetheless shown to be optimal for speech 14,[17][18][19] . It enables the tongue to move both vertically and horizontally within   the vocal tract, thus increasing the range of vocal tract shapes, and expanding the discriminable   phonetic repertoire 14,15,19 . The 1:1 proportion was reached through a relative shortening of the human face, together with the descent of the larynx 20 (Fig. 1c). In order to test whether this anatomy is shared by archaic humans, researchers turned to analyze Neanderthal archaeological remains 21 . However, cartilage and soft tissues do not survive long after death and the only remnant from the laryngeal region is the hyoid bone 15 . Based on this single bone, on computer simulations or on tentative vocal tract reconstruction, it is difficult to characterize the full anatomy of the Neanderthal vocal apparatus.
Thus, opinions remain split as to whether the Neanderthal, and even more so, the Denisovan, had similar vocal anatomy 15,21,22 . Thus, investigating archaic methylomes opens an opportunity to directly study the regulatory mechanisms that may underlie these changes.
To further examine DNA methylation changes in genes affecting vocal anatomy, we quantified the expanse of methylation change along the genome. To do so, we scanned the genome in windows of 100 kb and computed the fraction of CpGs which are differentially methylated in AMHs (hereinafter, AMH-derived CpGs). We found that this fraction is more than twice as high within genes affecting the voice compared to other genes (0.142 vs. 0.055, P = 3.7x10 -5 , t-test).
In fact, three out of the six body parts associated with genes with the highest fraction of AMHderived CpGs are in the laryngeal region (vocal cords: 0.142, FDR = 2.9x10 -4 ; epiglottis: 0.140, FDR = 4.4x10 -3 ; larynx: 0.133, FDR = 2.6x10 -4 , Extended Data Table 6). Moreover, three of the top five DMGs with the highest fraction of AMH-derived CpGs affect the laryngeal skeleton (ACAN, SOX9 and COL2A1, Fig. 2a,b). The fact that so many of the most derived genes affect the larynx is particularly surprising considering that only ~2% of the genome (502 genes) are known to affect it. Interestingly, the extra-cellular matrix genes ACAN and COL2A1, and their key regulator SOX9, form a network of genes that regulate skeletal growth, pre-ossification processes, and spatio-temporal patterning of skeletal development 23,24 . Although mainly involved in chondrogenesis, these genes were also shown to be active in osteogenic tissues of facial membranous bones. In late stages of development, SOX9 activity was shown to persist in chondrocytes of the larynx and in osteoblasts of the lower face 25 . Hypermethylation of the SOX9 promoter was shown to down-regulate its activity, and consequently, its targets 26 . SOX9 is also regulated by a series of upstream enhancers 27 . We show that its promoter and proximal (20kb upstream) enhancer 27 (covered by DMR #26), as well as its targets -ACAN (DMR #200) and COL2A1 (the most significant AMH-derived DMR, DMR #1)have all become hypermethylated in AMHs (Fig. 2c). Additionally, a more distant putative enhancer, located ~350kb upstream of SOX9, was shown to bear strong active histone modification marks in chimpanzee craniofacial progenitor cells, while in humans these marks are almost absent (~x10 stronger in chimpanzee) 28 . These epigenetic changes suggest that SOX9 became down-regulated along the AMH lineage, followed by hypermethylation and possibly down-regulation of its targets -ACAN and COL2A1 (Fig. 2c).
While this group of genes shapes many skeletal parts, the flattening of the face is the most common phenotype associated with their reduced activity. Heterozygous loss-of-function mutations in SOX9, which result in a reduction of ~50% in its activity, were shown to cause a retracted lower face, as well as affect the pitch of the voice 23,24 . ACAN was shown to affect facial prognathism and the hoarseness of the voice 29 . COL2A1 is key for proper laryngeal skeletal development 30 , and its decreased activity results in a retracted face 31 . Given that these genes have likely become down-regulated along the AMH lineage, and that they are key players in skeletal, and particularly facial development, we turned to investigate AMH-derived facial features. One of the main features separating archaic from modern humans is facial retraction. It was shown that the lower and midface of AMH is markedly retracted compared to apes, Australopithecines, and other Homo groups 20 . The developmental alterations that underlie the flattened ontogeny of the human face are still under investigation. Cranial base length and flexion were shown to play a role in the retracted face 20 , but reduced growth rate, and heterochrony of spatio-temporal switches are thought to be involved as well 32 . Importantly, SOX9 and COL2A1 were implemented in the elongation and ossification of the basicranium 33,34 , and SOX9 is a key regulator of skeletal growth rate, and the developmental switch to ossification 23,24 .
In order to further explore expression changes driven by changes in methylation, we focused on DMRs where methylation levels are strongly correlated with expression. Particularly noteworthy is NFIX, one of the most derived genes in AMH (Fig. 2b), which controls the balance between lower and upper projection of the face 35 . NFIX harbors two AMH-derived DMRs, where DNA methylation alone explains 73.9% and 81.7% of variation in its expression (FDR = 6.2x10 -3 and 7.5x10 -4 , Fig. 3a-c, see Methods). This relationship between NFIX methylation and expression was also shown previously 36 , and suggests that it became down-regulated through hypermethylation along the AMH lineage. Interestingly, NFI proteins were shown to bind the upstream enhancers of SOX9 37 , suggesting a possible mechanism to the simultaneous change in the voice-and face-affecting genes. In order to test whether these changes could explain morphological changes in AMHs, we examined the skeletal phenotypes that are driven by NFIX.
Mutations in NFIX were shown to be behind the Marshall-Smith and Malan syndromes, which include various skeletal alterations such as hypoplasia of the midface, retracted lower jaw, and depressed nasal bridge 35 , as well as limited speech capabilities 38 . In many cases, the syndromes are driven by nonsense-mediated decay, deletions or loss-of-function nonsense mutations, suggesting that the phenotypes associated with NFIX are dosage-dependent 35 . Given that reduced activity of NFIX drives these symptoms, a simplistic hypothesis would be that increased NFIX activity in the Neanderthal would result in changes in the opposite direction. Indeed, we found that in 18 out of 22 Marshall-Smith syndrome skeletal phenotypes, and in 8 out of 9 Malan syndrome skeletal phenotypes, the observed morphology is a mirror image of the Neanderthal. In other words, from the NFIX-related syndromes, through healthy AMHs, to the Neanderthal, the level of phenotype manifestation corresponds to the level of NFIX activity (Fig. 3c, Extended Data Table 7). Interestingly, many cases of laryngeal malformations in the Marshall-Smith syndrome have been reported 39 . Some of the patients exhibit positional changes of the larynx, changes in its width, and structural alterations to the arytenoid cartilagethe anchor point of the vocal cords, which controls their movement 39 . In fact, these laryngeal and facial anatomical changes are thought to underlie the limited speech capabilities observed in some patients 38 .

Discussion
We have shown here that genes affecting vocal and facial anatomy have gone through particularly extensive regulatory changes in recent AMH evolution. These changes are evident both in the number of genes that changed and in the extent of changes within each gene.
Interestingly, we found that it is not only at the DNA methylation level that such changes are detected. When using Gene ORGANizer to analyze genes associated with human accelerated regions (HARs, i.e., regions that are conserved across vertebrates but are significantly derived in humans 40 ), we found that the epiglottis and vocal cords are the most over-represented body parts (although the epiglottis enrichment is not significant; x2.28 and x1.91, P = 0.052 and P = 0.033, respectively). When examining primate accelerated regions, on the other hand, the representation of these body parts does not deviate from expectation (x1.03 and x1.04, P = 0.445 and P = 0.447, respectively).
We report here AMH-specific regulatory changes in genes affecting the larynx and face.
Importantly, most of the studies that linked these genes to the organs they affect were based on sequence, rather than epigenetic, changes. While heterozygous loss-of-function mutations could be paralleled to partial silencing of a gene and thus provide more direct evidence of expression changes in the genes, further study is nevertheless required in order to fully characterize the phenotypic effect of the changes we report.
Our analysis account for the fact that methylation in bone tissues changes throughout development and across bone types. First, both adult and juvenile AMHs have similar methylation patterns in the DMGs we report, and both adult and non-adult archaic and chimpanzee samples are differentially methylated compared to AMH samples. Second, these differences hold throughout a wide range of bone types (Extended Data Table 1). This suggests that the observed DMRs probably reflect a true AMH-specific evolutionary shift, rather than variability related to age or bone type. This is also supported by the phenotypic observation that facial prognathism in general, and facial growth rates in particular, are derived and reduced in AMH 41 .
SOX9, ACAN and COL2A1 are active mainly in early stages of osteochondrogenesis, making the observation of differential methylation in mature bones puzzling at first glance. This could be explained by two factors: i. The DMRs might reflect earlier changes in the mesenchymal progenitors of these cells that are carried on to later stages of osteogenesis. ii. Although these genes are downregulated with the progress towards skeletal maturation, they were shown to persist into later skeletal developmental stages in the larynx, vertebras, limbs, and jaws, including in their osteoblasts 25,42,43 . Interestingly, these are also the organs that are most affected by mutations in these genes 23,24,29-31 . 14.

Skeletal Methylation Maps
Previously, our ability to identify differentially methylated regions (DMRs) that discriminate between human groups was confined by three main factors: i. We had a single DNA methylation map from a present-day human bone, which was produced using a reduced representation bisulfite sequencing (RRBS) protocol, which provides information for only ~10% of the CpG positions in the genome. Moreover, the fact that the archaic and present-day methylomes were produced using different technologiescomputational reconstruction versus RRBSpotentially introduces a bias. ii. The analyses included only one bone methylation map from each of the human groups, which limited our ability to identify fixed differences between the groups. While dozens of maps from additional tissues in present-day humans were included in the analyses, this narrowed the DMRs to represent only human-specific changes that are invariable between tissues. iii. The work did not include a great ape outgroup. Thus, when an AMH-specific change was identified, it was impossible to determine whether it happened on the AMH lineage, or in the ancestor of Neanderthals and Denisovans 4 .
To overcome these obstacles, one of the goals of the current study was to significantly extend the span of our skeletal methylome collection, covering as many individuals, sexes, and bone types as we could. This included the generation of many new samples, including the high-coverage sequencing of additional ancient genome, as listed below.

Present-day human bone DNA methylation maps
We generated full DNA methylation maps in two healthy bones from present-day human using whole-genome bisulfite sequencing (WGBS). In addition, we collected more than 50 publically available partial skeletal methylation maps.

DNA Extraction
DNA was extracted from bones using QIAamp® DNA Investigator kit (56504, Qiagen). In brief, bones were cut to thin slices (0.2-0.5 mm) and then thoroughly washed (X5) with PBS, to clean samples from blood. Bones were crushed with mortar and pestle in liquid nitrogen and 100 mg bone powder was taken to extract DNA according to the protocol "Isolation of Total DNA from Bones and Teeth" of the DNA Investigator kit.WGBS

Partial skeletal DNA Methylation maps of modern humans
Osteoblast RRBS map 10 , extracted from the limb and rib bones of a 12 year-old female, was downloaded from GEO accession GSE27584. 48 450K methylation array maps, extracted from the femora of adult males and females 11 , were downloaded from GEO accession GSE64490.
Four 450K methylation array maps, extracted from unspecified bones of adult males and females 12 , were downloaded from GEO accession GSE50192.

Chimpanzee bone DNA methylation maps
Overall, we produced six methylation maps from bones of six common chimpanzee (Pan troglodytes) individuals. They include one WGBS of a wild chimpanzee, one RRBS of a captive chimpanzee, and four 850K methylation arrays of captive chimpanzees.

Ethics Statement
Chimpanzee tissue samples included in this study were opportunistically collected at routine necropsy of these animals. No animals were sacrificed for this study, and no living animals were used in this study.

Sample collection
We used two unidentified long bone fragments that belonged to a wild chimpanzee infant who Sampling and DNA extractions were conducted at the ASU Ancient DNA Laboratory, a Class 10,000 clean-room facility in a separate building from the Molecular Anthropology Laboratory.
Precautions taken to avoid contamination included bleach decontamination and UV irradiation of tools and work area before and between uses, and use of full body coverings for all researchers.
The bone sample was pulverized in December 2012 using a SPEX CertiPrep Freezer Mill. Three DNA extractions were conducted using 50-100 mg of bone powder (Extended Data Table 8) and

Samples collection
Four chimpanzee cadavers from captive colonies at the Southwest National Primate Research Center in Texas were used. Femora were opportunistically collected at routine necropsy of these animals and stored in -20°C freezers at the Texas Biomedical Research Institute after dissection.
These preparation and storage conditions ensured the preservation of skeletal DNA methylation patterns.

DNA extraction
DNA was extracted from the femoral trabecular bone using a phenol-chloroform protocol optimized for skeletal tissues 47 . From the distal femoral condyles, trabecular bone was collected using coring devices and pulverized into bone dust using a SPEX SamplePrep Freezer/Mill.
Specifically, bone cores were obtained from a transverse plane through the center of the medial condyle on the right distal femur, such that the articular surface remained preserved. Cortical bone was removed from these cores using a Dremel (Extended Data Table 9).

Genome-Wide DNA Methylation Profiling
Genome-wide DNA methylation was assessed using Illumina Infinium MethylationEPIC

Methylation Data Processing
Raw fluorescent data were normalized to account for the noise inherent within and between the arrays themselves. Specifically, we performed a normal-exponential out-of-band (Noob) background correction method with dye-bias normalization 48 to adjust for background fluorescence and dye-based biases and followed this with a between-array normalization method The probes on the arrays were designed to specifically hybridize with human DNA, so our use of chimpanzee DNA required that probes non-specific to the chimpanzee genome, which could produce biased methylation measurements, be computationally filtered out and excluded from downstream analyses. This was accomplished using methods modified from 53,54 . Briefly, we used blastn 55 to map the 866,837 50bp probes onto the chimpanzee genome (Assembly: Pan_tro_3.0, Accession: GCF_000001515.7) using an e-value threshold of e -10 . We only retained probes that successfully mapped to the genome, had only 1 unique BLAST hit, targeted CpG sites, had 0 mismatches in 5bp closest to and including the CpG site, and had 0-2 mismatches in 45bp not including the CpG site. This filtering retained 622,819 probes.
Additionally, β values associated with cross-reactive probes 56 , probes containing SNPs at the CpG site, probes detecting SNP information, probes detecting methylation at non-CpG sites, and probes targeting sites within the sex chromosomes were removed using the minfi package in R 50,51 . This filtering retained a final set of 576,804 probes.

Reconstructing of ancient DNA methylation maps
The Reconstruction procedure  Table 1).
Additional human ancient genomes have been published to date, however, these were sequenced to a relatively low coverage (<5x), and thus, only crude methylation maps could be reconstructed from them. CT ratio was computed for every CpG position along the hg19 (GRCh37) human genome assembly, for each of the samples, as previously described 4 .
In order to exclude from the analyses positions that potentially represent pre-mortem CT mutations, rather than post-mortem deamination, the following filters were applied: i. Positions where the sum of A and G reads was greater than the sum of C and T reads were excluded. ii. For the Vindija Neanderthal, this threshold was raised to 0.5, due to its relatively low coverage (~7x). iv. Finally, a maximum coverage threshold of 100 reads was used to filter out regions that are suspected to be PCR duplicates.
In all genomes, excluding Vindija, a fixed sliding window of 25 CpGs was used smoothing of the CT ratio. This allowed for an unbiased scanning of differentially methylated regions (DMRs) that is not affected by the size of the window. Due to its relatively low coverage, we extended the sliding window used on the Vindija genome to 50 CpGs. This extended window is not expected to introduce a bias, as this genome was not used for DMR detection, but only for subsequent filtering that was applied equally to all genomes (see later).
As previously described, CT ratio was translated to methylation percentage using linear transformation determined from two points: mean CT ratio in completely unmethylated (0% methylation) CpG positions in modern human bone reference (hereinafter μ0) was set to the value 0% methylation, and mean CT ratio in completely methylated (100% methylation) CpG positions in modern human bone reference (hereinafter μ100) was set to the value 100% methylation. Positions where CT ratio > μ100 were set to 100% methylation, and positions where CT ratio < μ0 were set to 0% methylation. For genomes that were extracted from bones, the modern osteoblast RRBS map was used as reference. For genomes that were extracted from teeth, there was no available modern reference methylation map, and therefore, we transformed the CT ratio into methylation percentage based on the assumption that the genome-wide mean methylation is similar to bone tissue. Thus, the genome-wide mean CT ratio represents 75% methylation, which is the genome-wide mean of measured methylation in the bone references.
This was accomplished by setting μ0 to 0, and setting μ100 to 1.33 x mean genome-wide CT ratio.

DMR-detection algorithm
We developed an algorithm specifically designed to identify DMRs between a deamination map and a full methylome reference. Let enumerate the CpG positions in the genome. In the deamination map, let be the number of T's at the C position + the number of A's in the opposite strand at the G position, i.e., it counts the total number of T's that appear in a position that is originally C, in the context of a CpG dinucleotide. We similarly use to count the total number of C's that appear in a position that is originally C, in the context of a CpG dinucleotide.
The CT ratio is defined as / , where = + . Let and (both between zero and one) be the methylation of this position in the reference genome and in the reconstructed one, respectively. If we denote by the deamination rate, assumed to be constant throughout the genome, and if we assume that deamination of C into T is a binomial process with probability of success , we get Our null hypothesis is that the th CpG is not part of a DMR, namely that = . The alternative hypothesis states that this CpG is part of a DMR. The definition of this statement is that | − | ≥ Δ, where Δ is some pre-specified threshold. In other words, under the alternative hypothesis we get that ≥ + Δ if the site has low methylation in the reference genome, and ≤ − Δ if it has high methylation in the reference genome.

Per-site statistic
Let us start with the first option, testing whether ≥ + Δ when is low. A log-likelihoodratio statistic would be ℓ + = ln Pr( | , ( + Δ)) Pr( | , ) Similarly, we can test whether ≤ − Δ when is high using the log-likelihood-ratio We used the value Δ = 0.5 for all samples. The value of , the deamination rate, was estimated using the overall CT ratio in CpG positions whose methylation level is 1 in the modern human RRBS methylation map, after exclusion of putative pre-mortem substitutions, as described in the DNA methylation reconstruction chapter (Extended Data Table 1).

Detecting DMRs
The statistics ℓ + and ℓ − quantify how strongly the estimated methylation in position deviates from . Next, we use these values to identify DMRs using the cumulative-sum procedure explained below. The process is repeated twice: on the statistic ℓ + to identify DMRs where the sample has elevated methylation with respect to the reference, and on the statistic ℓ − to identify DMRs where the sample has reduced methylation with respect to the reference.
For convenience, we explain the cumulative-sum procedure in the context of ℓ + , but an essentially identical procedure is used for ℓ − . We define a new vector + by the recursion 0 + = 0, + = max( −1 + + ℓ + , 0).
Under the null hypothesis, ℓ + has a negative expectation which produces a negative drift that keeps + at zero, or close to zero, levels. Under the alternative hypothesis the expectation is positive, hence the drift over a DMR is positive, leading to an elevation in the values of + .
Therefore, our next step is to find all intervals [ , ] such that −1 + = 0, +1 + = 0, and + > 0 for ≤ ≤ . Let + be the maximum value of + in this interval, where is the position of the maximum. Then, the interval [ , ] would be called a putative DMR.

Filtering DMRs
Of course, + may increase locally due to randomness, and thus a putative DMR may not reflect a true DMR. To filter out such intervals, we used two strategies. First, we applied a set of filters to assure that the putative DMRs have reasonable biological properties. Second, we cleaned the remaining putative DMRs by applying a false discovery rate (FDR) procedure. In the first strategy, we applied three filters: (i) Putative DMRs whose length was less than a 100 nt were removed. (ii) Putative DMRs that harbor less than 50 CpG positions, thus are shorter than twice the smoothing window size, were removed. (iii) To avoid situations where two consecutive CpG sites whose genomic locations are remote appear on the same DMR, we modify the vector + as follows. Let , be the distance along the genome (in nucleotides) between CpG sites and .
Then, for every site such that , −1 > we set + = 0. We used = 1000 nt for all samples.
To further remove putative DMRs that are unlikely to reflect true DMRs, we eliminated all putative DMRs where + < + . Here, + is the maximum value of + in the interval as defined earlier, and + is a threshold determined using a false discovery rate (FDR) procedure, see section "filtering out noise" below.

Testing the algorithm
To verify that the approach above results in a low number of false positives, we applied the procedure for deamination maps, when compared to themselves in the form of reconstructed methylomes. As expected, we obtained a negligible number of DMRs, ranging between 0.4% and 1% of the number of DMRs detected between the humans.

Two-way DMR detection
In order to avoid artifacts that could potentially be introduced by comparing DNA methylation maps that were produced using different technologies, our core analysis relied on the comparison of the three reconstructed maps of the Altai Neanderthal, Denisovan and Ust'-Ishim. These are all high-resolution maps that were derived from genomes sequenced to high coverage (Extended Data Table 1). In particular, the Ust'-Ishim methylome is of exceptional quality due to its high coverage and deamination rate (Extended Data Table 1). Also, going through the same postmortem degradation processes, the Ust'-Ishim cellular composition is likely to be similar to that of the Neanderthal and Denisovan.
In order for a deamination map to serve as a reference in the comparison, we have transformed its CT ratio values into methylation values (see "the reconstruction procedure" section above).
Therefore, the comparison of three genomes required a total of six two-way comparisons: Ust'-Ishim versus an Altai Neanderthal reference, Ust'-Ishim versus a Denisovan reference, Altai Neanderthal versus an Ust'-Ishim reference, Altai Neanderthal versus a Denisovan reference, Denisovan versus Ust'-Ishim reference, and Denisovan versus Altai Neanderthal reference. In order to remove potential bias that could be introduced through the comparison of a reconstructed methylation map to a deamination map, we ran each two-way comparison twice: once with the methylation map of sample 1 against the deamination map of sample 2, and once with the deamination map of sample 1 against the methylation map of sample 2. Overall, this resulted in twelve two-way comparisons. Because the DNA of these three individuals was extracted from both sexes, the DMR-detection algorithm was only applied to autosomal chromosomes.

Three-way DMR detection
In order to identify DMRs where one group of humans (hereinafter, hominin 1) differs from the other two human groups (hereinafter, hominin 2 and hominin 3), we set out to find those DMRs that were detected both between hominin 1 and 2, and between hominin 1 and 3. To this end, we compare the two lists (hominin 1 vs. hominin 2 and hominin 1 vs. hominin 3) and look for overlapping DMRs, as previously described 4 . An overlapping DMR exists when a DMR from one list partially (or fully) overlaps a DMR from the second list, and is constructed as follows (Extended Data Figure 4). The region of overlap between the two DMRs is taken as the core of the overlapping DMR. For a region that is included within the first DMR (hominin 1 vs. hominin 2) but not within the second DMR (hominin 1 vs. hominin 3), we used t-test to check whether methylation in hominin 3 clusters significantly closer to the hominin 2. If it does, the overlapping DMR was extended to include this region. An analogous test was used for regions that are included in the second DMR but not in the first. P-values were adjusted using FDR, and only regions with FDR < 0.05 were taken as part of the overlapping DMR.

Filtering out noise
There are different factors that potentially introduce noise into the reconstruction process. These include the stochasticity of the deamination process, the use of a sliding window to smooth the CT signal, and variations in read depth. In order to account for these factors and estimate noise levels, we ran simulations that mimic the post-mortem degradation processes of ancient DNA, then reconstructed methylation maps from the simulated deamination maps and finally compared them to the original map and identified DMRs.
The simulation process starts with a methylation map, where the measured or reconstructed methylation at position is and assumed the true methylation. Given that is the coverage at this position, we use the binomial distribution (1) to randomly draw -the number of C's that had become T's through deamination. The resulting 's, and their complement 's (where = − ) were then used to compute the CT ratios for each position, smoothed and filtered using the same sliding window and thresholds used in the original analysis, and linearly transformed to methylation percentages as explained above (hereinafter, simulated methylation map, Extended Data Fig. 2). Any differences in methylation levels between the simulated map and the original reference map stem from noise. Thus, running the same DMR-detection algorithm described above on the simulated map vs. the reference map, enables an estimation of the false discovery rate. We ran these simulations 100 times for each of the three genomes (Altai Neanderthal, Denisovan, Ust'-Ishim) and determined the value of the + and − thresholds (see section "filtering DMRs" above) such that the mean number of DMRs that are detected in the simulations is < 0.05 the number of real DMRs detected (i.e., FDR < 0.05).

Polarizing DMRs
DMRs where Ust'-Ishim differs from the Neanderthal and the Denisovan could either arose on the AMH branch, or in the ancestor of Neanderthals and Denisovans. In order to polarize the DMRs, i.e., allocating them to the branch in which the change occurred, we turned to the chimpanzee DNA methylation data. were placed on the Denisovan branch, 851 were deemed inconclusive, and for 107 we had no data in the chimpanzee WGBS map.
We next developed a second, stricter, polarization scheme by also using the chimp 850K DNA methylation arrays datasets. As the probes cover just part of the CpGs in a DMR, we need to adjust the DMR methylation level in order to allow a meaningful comparison of 850K methylation data to full methylation maps. If we mark by the CpGs in a DMR that are covered (2) This procedure was applied to DMRs covered by at least one probe (~65% of DMRs whereas in analyses where it is more important to minimize variability, or where we look at specific DMRs, we use the stricter approach. The chimpanzee RRBS data was adjusted using the same technique. However, it was not used for polarization, but rather only as a source for additional information on DMRs. This is because this protocol particularly targets unmethylated CpGs, and is therefore too biased for polarization.

Removing DMRs with high within-group variability
Our three-way DMR detection algorithm above produces a list of DMRs where one of the three The second approach adds to this the 52 450K methylation array samples. As described above, using also methylation probes for filtering DMRs provides more power, but can also introduce biases. Thus, this filtering was used for most analyses, except those where unbiased genomic distribution of DMRs is critical. Probe methylation data was corrected as described in equation (2) A general concern is working with DNA methylation data is that DMRs that are specific to one group do not necessarily represent an evolutionary change, but rather reflect a characteristic such as tissue, sex or age that is shared by individuals in this group and not by others. In this regard, it is important to mention that such a scenario is unlikely in our study: the chimpanzee samples, as well as the modern human samples, include both males and females, juveniles and adults, and samples that come from limbs, skull, rib and teeth. Thus, it is unlikely that the DMRs that differentiate these groups reflect variability that stems from these parameters 57 .

Comparison to previous reports
We have previously reported that compared to present-day humans, the HOXD cluster of genes is significantly hypermethylated in the Neanderthal and Denisovan 4 . Using the new methylation maps, we show that this observation holds (Extended Data Fig. 3). Adding chimpanzee data, we see that similarly to AMHs, chimpanzee samples are also hypomethylated compared to archaic humans. This suggests that the hypermethylation arose along the archaic-human lineage.
However, we find that the Ust'-Ishim individual is an outlier among modern humans, and that his methylation levels are closer to the Neanderthal than to modern humans, as was also shown by Hanghøj et al 58 . The Neanderthal and Ust'-Ishim individuals are found >2 standard deviations from the mean observed methylation in modern humans. This suggests that although the Neanderthal is hypermethylated compared to most modern humans, she is not found completely outside modern human variation. The Denisovan, on the other hand, is found even further away, and significantly outside the other populations. Given this, the HOXD DMR was classified as Denisovan-derived (Extended Data Table 2).
Compared to the previously reported DMRs 4 , in this study we found four times more AMH-and archaic-derived DMRs (3,442 compared to 891) and roughly twice as many Neanderthal-and Denisovan-derived DMRs (502 and 597 compared to 295 and 307 in the Denisovan and Neanderthal, respectively). We also found that ~20% of the previous list was identified here too.
There are several key factors that could underlie the differences in the reported DMRs: i. the current study used stricter thresholds for DMR detection, including a minimum of 50 CpGs in each DMR (compared to 10 CpGs previously), and a requirement for physical overlap in the three-way DMR detection procedure. ii. While in this study the AMH reference is a reconstructed ancient map, in the previous study the AMH reference, as well as the other tissues used for filtering out noise, were mainly cultured cell lines. iii. The previous study focused on DMRs that are invariable across tissues. In this study we focused on DMRs in skeletal tissues. In the previous study we were therefore able to extrapolate and find trends that extend beyond the skeletal system, such as neurological diseases. In this paper, we focus on the skeletal system, hence the different look of the body map (Figure 1b,c).

Computing correlation between methylation and expression
In order to identify regions where DNA methylation is tightly linked with expression levels, we scanned each DMR in overlapping windows of 25 CpGs (the window used for smoothing the deamination signal). In each window we computed the correlation between DNA methylation levels and expression levels across 21 tissues 59 . For each DMR, we picked the window with the best correlation (in absolute value) and computed regression FDR-adjusted P-value. DMRs that overlap windows with FDR < 0.05 were considered to be regions where methylation levels are significantly correlated with expression levels. 70 such DMRs were found among the skeletal AMH-derived DMRs, and 57 among the archaic human-derived DMRs, 22 among Neanderthalderived DMRs, and 12 among Denisovan-derived DMRs.

Studying the function of DMGs
Similarly to sequence mutations, changes in regulation are likely to be unequally distributed across different body systems, owing to negative and positive selection, as well as inherent traits of the genes affecting each organ. Thus, we turned to investigate which body parts are affected by the DMGs. To this end, we ran the lists of DMGs in Gene ORGANizer (geneorganizer.huji.ac.il, paper submitted), which is a tool that links genes to the organs they affect, through known disease and normal phenotypes. Thus, it allows to investigate directly the phenotypic function of genes, to identify their shared targets and to statistically test the significance of such enrichments. We ran the lists of DMGs in the ORGANize option using the default parameters (i.e., based on confident and typical gene-phenotype associations).  Table 4). For archaic-derived DMGs, endocrine glands such as the thyroid and hypothalamus were consistently the most enriched across analyses. The skeleton was enriched as well, particularly the limbs and face (Extended Data Fig. 1, Extended Data Table 4).
The Neanderthal-derived and Denisovan-derived DMG lists did not produce any significantly enriched organs.
In order to examine whether such trends could arise randomly from the reconstruction method, we repeated the analysis on the previously described 100 simulations. We ran all simulated DMGs (4,153) in Gene ORGANizer and found that no enrichment was detected, neither for voice-related organs (vocal cords: x0.99, FDR = 0.731, larynx: x.1.02, FDR = 0.966, epiglottis: x1.00, FDR = 0.966), nor for any other organ.
We next wanted to check the possibility that genes affecting the larynx and face tend to be longer than other genes, and are thus more likely to contain DMRs. We found that length of genes could not be a factor explaining the enrichment within genes affecting the larynx, as these genes tend to be shorter than other genes in the genome (mean: 62.5 kb vs. 73.2 kb, P = 0.001, t-test). Genes affecting the face, on the other hand, tend to be longer than other genes (mean: 77.1 kb vs. 65.6 kb, P = 4.6x10 -5 , t-test). To examine if this factor may lie behind the enrichment we observe, we repeated the analysis using only DMRs that are found within promoter regions (5 kb upstream to 1 kb downstream of TSS), thus eliminating the gene length factor. We found that the genes where such DMRs occur are still significantly associated with the face (P = 0.036, Fisher's exact test).
Gene ontology and expression analyses were conducted using Biological Process and UNIGENE expression tools in DAVID.

Computing the density of changes along the genome
To gain insight into the distribution of methylation changes, we computed the density of derived CpG positions along the genome in two ways. First, we used a 100 kb window centered in the middle of each DMR, and computed the fraction of CpGs in that window which are differentially methylated (i.e., are found within a DMR). Second, for the chromosome density plots, we did not center the window around each DMR, but rather used a non-overlapping sliding 100 kb window starting at position 1 and running the length of the chromosome. In order to avoid biases in genomic distribution of DMRs, in this analysis we used full bone AMH-derived DMRs, see "polarizing DMRs" and "Removing DMRs with high within-group variability" sections.

NFIX
Skeletal phenotypes that are associated with the Marshall-Smith syndrome were extracted from the Human Phenotype Ontology (HPO) 60 . Non-directional phenotypes (e.g., irregular dentition), and phenotypes that are expressed in both directions (e.g., tall stature and short stature) were removed.
Mutations in NFIX have also been linked to the Sotos syndrome. However, NFIX is not the only gene that was linked to this syndrome; mutations in NSD1 were also shown to drive similar phenotypes 35 . Therefore, it is less relevant in assessing the functional consequences of general shifts in the activity levels of NFIX. Nevertheless, it is noteworthy that in the Sotos syndrome