The genomic origins of the Bronze Age Tarim Basin mummies

The identity of the earliest inhabitants of Xinjiang, in the heart of Inner Asia, and the languages that they spoke have long been debated and remain contentious 1. Here we present genomic data from 5 individuals dating to around 3000–2800 BC from the Dzungarian Basin and 13 individuals dating to around 2100–1700 BC from the Tarim Basin, representing the earliest yet discovered human remains from North and South Xinjiang, respectively. We find that the Early Bronze Age Dzungarian individuals exhibit a predominantly Afanasievo ancestry with an additional local contribution, and the Early–Middle Bronze Age Tarim individuals contain only a local ancestry. The Tarim individuals from the site of Xiaohe further exhibit strong evidence of milk proteins in their dental calculus, indicating a reliance on dairy pastoralism at the site since its founding. Our results do not support previous hypotheses for the origin of the Tarim mummies, who were argued to be Proto-Tocharian-speaking pastoralists descended from the Afanasievo 1,2 or to have originated among the Bactria–Margiana Archaeological Complex 3 or Inner Asian Mountain Corridor cultures 4. Instead, although Tocharian may plausibly have been introduced to the Dzungarian Basin by Afanasievo migrants during the Early Bronze Age, we find that the earliest Tarim Basin cultures appear to have arisen from a genetically isolated local population that adopted neighbouring pastoralist and agriculturalist practices, which allowed them to settle and thrive along the shifting riverine oases of the Taklamakan Desert.


Environmental setting of Xinjiang
Xinjiang is situated in the westernmost part of China and has a dry continental climate and a typical mountain-oasis-desert ecosystem. It is among the world's harshest inhabited terrains, and its annual mean temperature is 9-12 °C. It consists of the Tarim Basin, dominated by the world's second largest sand desert (the Taklamakan Desert), and Dzungaria, an area of mixed desert, steppe and forest. Dzungaria and the Tarim Basin are separated by the Tianshan mountains, a range that bisects Xinjiang and forms part of the Inner Asian Mountain Corridor. The Tarim Basin is characterized by high temperatures and low rainfall, whereas the climate of Dzungaria is wetter and colder.
Xinjiang is surrounded by three high mountain ranges: the Altai mountains to the north, which border Mongolia and Russia; the Kunlun Mountains to the south, rising from the Tarim Basin's southern edge to the Tibetan plateau; and the Pamir mountains to the west, which wall off Central Asia, namely Kazakhstan, Kyrgyzstan, Tajikistan, Afghanistan and Pakistan. Only in the east is Xinjiang topographically contiguous with other regions, primarily through the Hexi Corridor, which has long served to link Xinjiang to agropastoral communities farther east in the Gansu-Qinghai region and beyond. This geographic position has historically made Xinjiang a critical bridge connecting east and west Eurasia, most famously during the height of the "Silk Road" (114 BCE to 1450 CE). However, Xinjiang's role in facilitating cultural contact and the transmission of ideas across Eurasia can be traced much further back into prehistory 1. Although there is growing evidence (e.g., surface stone tools) that humans may have inhabited the region as early as 40,000 BP at sites such as Tongtiandong and Jilintai in northern Xinjiang 2,3, the earliest human remains found thus far date only to the Early Bronze Age.

The Dzungarian Basin
The Dzungarian (Junggar) Basin is located in north Xinjiang. At its center is the Gurbantunggut desert, which lies at an elevation of 200 meters above sea level (masl), while the surrounding mountains rise steeply to around 3,000 masl. The climate is hot and dry in summer and cold in winter, with an annual mean temperature of 5-8 °C in the lowlands and desert and 0-2 °C in the mountainous areas. The basin has relatively abundant water sources, with annual precipitation of 300 mm in the mountains that can reach up to 600 mm in some places. Snow and ice melt feed rivers that flow into Dzungaria and support oases and large grasslands along the mountain foothills 4. Large numbers of graves dating to the Bronze Age and later that resemble those of the nomadic cultures of the Eurasian steppe (e.g., Afanasievo and Chemurchek) have been discovered around the Gurbantunggut desert, and longstanding nomadic and transhumant pastoralist economies (sheep, goat, cattle and horse herding) in the region suggest that it has deep connections with prehistoric Eurasian steppe populations 4. In this study, we analyzed individuals from three Dzungarian sites, each associated with the Afanasievo culture: Ayituohan, Songshugou, and Nileke. These are currently the oldest human skeletons excavated in the Dzungarian Basin.

Ayituohan
The Ayituohan cemetery is located in Habahe County, in the far north of the Xinjiang Uyghur Autonomous Region of China. It was discovered and excavated by the Xinjiang Institute of Cultural Relics and Archaeology in 2014 during the exploration of copper ores in the Ayituohan District of Habahe County. A total of 27 burials were excavated; the individuals were dated to the first half of the third millennium BCE and are associated with the Afanasievo culture 5. Two individuals (one male and one female) from the Ayituohan cemetery were subjected to shotgun sequencing, and the male individual was directly radiocarbon dated (in bold). The female individual was dated based on stratigraphy and archaeological context.

The Tarim Basin
The Tarim Basin is located in south Xinjiang and is enclosed by the Tianshan Mountains to the north, the Pamirs to the west and the Kunlun Mountains to the south. At its eastern end lies the former salt lake and marshland of Lop Nur. In the center of the basin is the Taklamakan Desert, the second largest shifting sand desert in the world. The basin has an elevation of 780 meters above sea level, and the climate is extremely dry because the mountains block moist air from the sea. The annual rainfall is less than 100 mm. Together, these factors create the extremely arid desert climate of the Tarim Basin, which is now largely inhospitable 8. However, rivers running down from the surrounding mountains support small oases around the perimeter of the Tarim Basin, and some rivers that run deep into the Taklamakan Desert enable small-scale farming and animal herding in this otherwise very hostile environment. Owing to the arid environment of the Tarim Basin, a wide diversity of organic materials has been preserved there, including Buddhist Tocharian manuscripts (500-800 CE) found in the northern and eastern reaches of the Tarim Basin, well-preserved plant, animal, and even microbial (e.g., kefir) remains 9,10, and naturally mummified human remains, including the so-called "Beauty of Loulan", an Early Bronze Age mummy excavated at the site of Xiaohe in the eastern Lop Nur region of the Tarim Basin 11. Recent archaeological studies in Xinjiang have found that several other cemeteries in the Tarim Basin, such as Gumugou near Xiaohe and Beifang in the Taklamakan hinterland, share common features (e.g., boat-shaped burials) with Xiaohe 12. These sites have recently been classified as belonging to the Xiaohe archaeological horizon. In this study, we analyzed individuals from three Tarim Basin sites: Xiaohe, Gumugou, and Beifang 12,13,14.

Xiaohe
The Xiaohe cemetery is located in the Lop Nur region of the Tarim Basin and was constructed on a salt-rich sand hill. The cemetery was discovered in 1910 by a local peasant, and in 1934 the Swedish archaeologist Folke Bergman visited the cemetery and surveyed twelve graves 15. The complete excavation of the Xiaohe cemetery was carried out by the Institute of Cultural Relics and Archaeology of Xinjiang between 2002 and 2005, during which a total of 167 graves were excavated. Five stratigraphic layers were identified, spanning a period of 500 years. Two types of burials were encountered 12. The majority of burials consisted of boat-shaped wooden coffins that lacked bottoms but were covered with wooden lids and cattle hides. Many of the human remains associated with this burial type were naturally mummified, and the graves also included well-preserved textiles and organic objects 9,16,17.
The second type of burial consists of wooden coffins covered with clay lids 11. This burial type was small in number (fewer than ten have been found so far) and was found only in the lowest (fourth and fifth) layers. In front of each coffin a wooden pole was erected, which was frequently decorated with a round wooden post for female burials and with an oar-shaped plank post for male burials. Many archaeologists consider that these posts may have had a fertility significance. A wide array of burial goods has been excavated from the cemetery, including grass baskets (instead of pottery), arrows in male graves, and domesticated cereal grains of both eastern (e.g., millet) and western (e.g., wheat) Eurasian origins 17. To date, archaeological reports for the uppermost three layers have been published, but little has been made publicly available on the older fourth and fifth layers. Cattle, goat and sheep remains were excavated from all five layers, and the Xiaohe people appear to have had a preference for cattle, whose remains were observed across the entire cemetery. For example, only cattle hides were used to cover coffins, and the hearts of cattle were used to make cosmetics to paint the dead. Mummies with characteristic "European" cranial features were found across the five layers. All 11 samples analyzed in this study were from the fourth and fifth layers of the Xiaohe cemetery. A total of 20 individuals were shotgun sequenced from Xiaohe, of which

Gumugou
Like Xiaohe, the Gumugou cemetery is located in the Lop Nur region of the Tarim Basin. It was excavated in 1979 under the direction of Binghua Wang from the Institute of Cultural Relics and Archaeology of Xinjiang. A total of forty-two graves were excavated in the Gumugou cemetery, and two types of burials were identified: a sun-radiation-spoke pattern and a main pattern similar to the boat-shaped burials found in Xiaohe 12. Only six burials belonged to the sun-radiation-spoke pattern, and based on the stratigraphic information, this pattern is slightly later than the boat-shaped burials. The sun-radiation-spoke burials are located in the northern part of the cemetery: at the center is a rectangular wooden coffin containing a single male, which is surrounded by seven concentric ellipses of wooden poles. As at Xiaohe, burial goods such as grass baskets with wheat and millet, cattle, sheep and goat remains, and ephedra twigs were discovered. A total of three individuals were shotgun sequenced from Gumugou, of which one individual from a boat-shaped burial yielded high-quality data. GMGM1: 1884-1736 calBCE

Beifang
The Beifang cemetery is located in the heart of the Taklamakan desert and situated downstream from the ancient riverbed of the Keriya river.It is more than 500 kilometers southwest of the Xiaohe cemetery, and it is an even more isolated settlement than Xiaohe.
The cemetery was discovered in 2008 and has features resembling those found at Xiaohe. The Beifang cemetery was also constructed on a mound and consists of boat-shaped coffins with oar-shaped posts erected at the head of male burials and multi-angled posts erected at the head of female burials. Some organically preserved goods such as woolen caps, wooden sculptures of human figures, woven baskets, and food made from wheat and millet, as well as animal remains (e.g., cattle), were discovered 18. The mummies excavated from the Beifang cemetery have a strong physical resemblance to the Xiaohe mummies, including fair-colored hair, long and high-bridged noses, and deep-set eyes 19. A total of four individuals were shotgun sequenced from Beifang, of which one yielded high-quality data. 11KBM1: 1785-1664 calBCE

Linguistic background of the population history in Xinjiang
Although many different languages have been spoken in the region through the ages, only one group is considered autochthonous: the Tocharian languages, composed of Tocharian A and B. These languages are attested along the northern rim of the Tarim Basin during the middle of the first millennium CE. Using comparative methods, the common ancestor of these two languages, Proto-Tocharian, is estimated to have diverged at around 500 BCE, presumably also in the northern Tarim Basin. Comparison of its central vocabulary (e.g., numerals, pronouns, and central grammar) shows Proto-Tocharian to be related to all other Indo-European branches (including Greek, Latin, Germanic, Slavic, and Indo-Iranian) through their ancestral language Proto-Indo-European, which was spoken no later than 3000 BCE in the Pontic-Caspian Steppe. This suggests that the speakers of the Tocharian language branch undertook a significant eastward migration during prehistory 20. Parts of this prehistory may be gleaned from contacts with other languages. For example, a number of rather distinctive features of the Tocharian branch (in both its sound and grammatical systems) suggest extensive and intimate contacts with a Uralic language (likely related or similar to Finnish, Hungarian, Samoyedic, and several other minor languages in Russia 21,22), while, on the other hand, certain Uralic lexical items (denoting, for example, numerals and metallurgy) appear to be of Tocharian provenance 23. There are mutual borrowings between Proto-Tocharian and early Turkic (the predecessor of all modern Turkic languages, including Turkish, Chuvash, and Uighur), while Chinese elements within Tocharian 24 appear to postdate Tocharian borrowings into Chinese 25. A small number of words have tentatively been associated with an extinct substrate language, possibly affiliated with the BMAC culture of Central Asia 26. Indo-Iranian, the other Indo-European branch to migrate into Central Asia, similarly wielded influence on the Tocharian languages. Its mainly lexical contributions are as complex as they are sustained, beginning with an unattested Old Iranian dialect already present in the Steppe Zone before Tocharian entered the Tarim Basin and continuing through the political, mercantile, and religious regimes of the Bactrians, Sogdians, Khotanese, and the Indic Prakrits that successively entered the region from the west 27.

Detailed description of genetic isolation of the Tarim group
The Tarim_EMBA1 and Tarim_EMBA2 groups, although geographically separated by over 600 km of desert, cluster closely in PC space (Figs. 1-2) and show greatly reduced inter-individual genetic differences, as measured by extremely high outgroup-f3 values and low pairwise mismatch rate ("pmr") values of pseudo-diploid genotypes. The reduced genetic difference among the Tarim individuals is comparable to the level of first-degree relatives in other published Bronze Age populations from the Eurasian steppe 28 (Extended Data Fig. 4A, C). However, the lack of the long runs of homozygosity (ROH) segments expected for such close relatives suggests that a population bottleneck, rather than close kinship or recent inbreeding, is the likely explanation for the reduced genetic diversity (Extended Data Fig. 4B). These observations are further supported by the fact that 12 out of 13 Tarim Basin individuals belong to a single mitochondrial haplogroup, C4 (Extended Data Table 1). Likewise, although limited in number, the two Xiaohe males belong to the Y-chromosome haplogroup R1b1c, which falls outside of the R1b1a clade representative of the Yamnaya and Afanasievo individuals (Extended Data Table 1; Extended Data Fig. 4D; Supplementary Data S1B). The Y chromosome of the Beifang male belongs to a basal R1 or R1b haplogroup but shares no derived allele with R1a or any sublineage of R1b, similar to that of MA-1 (R*, xR1, xR2) 29 (Extended Data Table 1; Extended Data Fig. 4D).
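
To make the pairwise mismatch rate concrete, the following is a minimal Python sketch rather than the pipeline used in the study. It assumes each individual is represented by one sampled allele per site (coded 0 or 1, with `None` for sites without coverage); the function name and the toy genotypes are hypothetical.

```python
from typing import Optional, Sequence


def pairwise_mismatch_rate(
    ind_a: Sequence[Optional[int]],
    ind_b: Sequence[Optional[int]],
) -> float:
    """Fraction of jointly covered sites at which the sampled alleles differ."""
    if len(ind_a) != len(ind_b):
        raise ValueError("both individuals must be typed on the same site list")
    compared = 0
    mismatched = 0
    for allele_a, allele_b in zip(ind_a, ind_b):
        # Skip sites that are missing in either individual.
        if allele_a is None or allele_b is None:
            continue
        compared += 1
        if allele_a != allele_b:
            mismatched += 1
    if compared == 0:
        raise ValueError("no sites are covered in both individuals")
    return mismatched / compared


# Toy data (illustrative only): four jointly covered sites, one mismatch.
individual_1 = [0, 1, None, 1, 0, 1]
individual_2 = [0, 0, 1, 1, None, 1]
print(pairwise_mismatch_rate(individual_1, individual_2))
```

Lower values indicate more similar genomes. Within a single population, pairs of close relatives are expected to show lower values than unrelated pairs, which is why unusually low values across all pairs point to reduced diversity in the group as a whole.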

Tarim mummies and the pre-pastoralist Central Asian genetic substratum
The Tarim mummies are among only a few known Holocene populations that derive the majority of their ancestry from Pleistocene Ancient North Eurasian (ANE) groups, who once made up the hunter-gatherer populations of southern Siberia and are represented by individual genomes from the archaeological sites of Mal'ta (MA-1) 29 and Afontova Gora (AG3) 30. Interestingly, we observe that most Bronze Age and pre-Bronze Age populations with substantial ANE ancestry, such as Botai_CA from Eneolithic northern Kazakhstan, Kumsay_EBA and Mereke_MBA from western Kazakhstan, West_Siberia_N from Neolithic southern Russia, Okunevo_EMBA from the Minusinsk Basin, Chemurchek from the EMBA Altai mountains, and Aigyrzhal_BA, Dali_EBA, and Kanai_MBA from the IAMC region, show their highest outgroup-f3 value with Tarim_EMBA1. This suggests that the Tarim mummies are currently the best representative of the pre-pastoralist ANE-related population that once inhabited Central Asia and southern Siberia (Extended Data Fig. 2A), even though Tarim_EMBA1 postdates these populations. This calls for a revision of previous admixture models of these populations, which were based on Botai_CA or West_Siberia_N 31,32,33, to include Tarim_EMBA1 as their ANE source (Supplementary Data S1F-H,I).
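
As a rough illustration of the outgroup-f3 statistic, the sketch below averages the product of allele-frequency differences, (o - a)(o - b), across sites, where o, a and b are the frequencies in the outgroup and the two test populations. It is a simplified sketch that assumes known population allele frequencies and omits the finite-sample corrections and standard errors that dedicated software provides; the function name and the toy frequencies are hypothetical.

```python
from typing import Sequence


def outgroup_f3(
    outgroup: Sequence[float],
    pop_a: Sequence[float],
    pop_b: Sequence[float],
) -> float:
    """Mean over sites of (o - a) * (o - b): drift shared by A and B since the outgroup."""
    if not (len(outgroup) == len(pop_a) == len(pop_b)):
        raise ValueError("all populations must be typed on the same site list")
    if len(outgroup) == 0:
        raise ValueError("at least one site is required")
    terms = [(o - a) * (o - b) for o, a, b in zip(outgroup, pop_a, pop_b)]
    return sum(terms) / len(terms)


# Toy allele frequencies (illustrative only).
out = [0.0, 0.0, 1.0, 0.0]
pop_x = [0.5, 1.0, 0.0, 0.0]
pop_y = [0.5, 1.0, 0.0, 1.0]  # tracks pop_x at most sites
pop_z = [0.0, 0.5, 1.0, 0.0]  # stays close to the outgroup

print(outgroup_f3(out, pop_x, pop_y))  # larger: more drift shared with pop_x
print(outgroup_f3(out, pop_x, pop_z))  # smaller: less drift shared with pop_x
```

A higher value means that the two test populations share more genetic drift relative to the outgroup, so ranking candidate populations by this value identifies the closest available relative of a given group.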
Applying qpAdm, we successfully modeled the high-ANE group West_Siberia_N as a mixture of Tarim_EMBA1 (67%) and Eastern European Hunter-Gatherers (EHG) (Supplementary Data S1I). Botai_CA shows a similar profile but requires an additional Eastern Eurasian contribution (5-12%) (Supplementary Data S1I; Extended Data Table 3). Chemurchek, Aigyrzhal_BA, and Dali_EBA fit a three-way mixture of Tarim_EMBA1 (14-54%), Dzungaria_EBA1 (30-67%) and Geoksyur_EN (9-48%; a pre-BMAC proxy), whereas Chemurchek and Aigyrzhal_BA do not fit Tarim_EMBA1+Afanasievo+Geoksyur_EN (Supplementary Data S1F-G). Mereke_MBA from western Kazakhstan also fits Tarim_EMBA1+Dzungaria_EBA1 but not Tarim_EMBA1+Afanasievo (Supplementary Data S1G). In contrast, Kumsay_EBA from western Kazakhstan fits Tarim_EMBA1+Afanasievo+Geoksyur_EN but not Tarim_EMBA1+Dzungaria_EBA1+Geoksyur_EN (Supplementary Data S1G; Extended Data Table 3). These results confirm that the genetic profile represented by the Tarim mummies is a good proxy for the ANE substratum that was once more widely distributed across pre-pastoralist Central Asia (Fig. 3C), and suggest that Dzungaria_EBA locally relayed Afanasievo ancestry into northern Xinjiang and its neighboring regions, and that IAMC/BMAC-related ancestry spread into Xinjiang independently of the dispersal of the Afanasievo herders. We find that the Chemurchek, an EBA pastoralist culture that succeeds the Afanasievo in both the Dzungarian Basin and the Altai mountains, derive from these three ancestry streams, which helps to explain both the IAMC/BMAC-related ancestry previously noted in Chemurchek individuals 31 and their reported cultural and genetic affiliations to Afanasievo groups 33. Importantly, however, none of these admixed groups was the source population for the Tarim Basin Xiaohe culture, who we instead find were a highly isolated autochthonous population. The extreme genetic bottleneck specific to Tarim_EMBA may have been related to the EMBA colonization of this challenging desert environment.
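
The idea behind estimating mixture proportions can be illustrated with a deliberately simplified Python sketch. This is not the qpAdm algorithm, which additionally estimates standard errors and tests whether a model fits. The sketch assumes that a vector of statistics for the target (for example, f4 statistics against a fixed set of reference populations) is a weighted average of the corresponding vectors for two sources, and solves for the weight by least squares. The function name and all numbers are hypothetical.

```python
from typing import Sequence, Tuple


def two_way_mixture(
    target: Sequence[float],
    source1: Sequence[float],
    source2: Sequence[float],
) -> Tuple[float, float]:
    """Least-squares weight alpha in target ~ alpha*source1 + (1 - alpha)*source2.

    Returns (alpha, residual sum of squares). alpha is not clipped to [0, 1];
    a value outside that range signals that the two-source model is a poor fit.
    """
    if not (len(target) == len(source1) == len(source2)):
        raise ValueError("all vectors must have the same length")
    diffs = [s1 - s2 for s1, s2 in zip(source1, source2)]
    denom = sum(d * d for d in diffs)
    if denom == 0:
        raise ValueError("the two sources are indistinguishable on these statistics")
    alpha = sum((t - s2) * d for t, s2, d in zip(target, source2, diffs)) / denom
    residual = sum(
        (t - (alpha * s1 + (1 - alpha) * s2)) ** 2
        for t, s1, s2 in zip(target, source1, source2)
    )
    return alpha, residual


# Toy statistic vectors (illustrative only); the target is built as 75% / 25%.
src1 = [0.5, 0.25, -0.25]
src2 = [0.0, 0.5, 0.25]
tgt = [0.75 * a + 0.25 * b for a, b in zip(src1, src2)]

alpha, rss = two_way_mixture(tgt, src1, src2)
print(f"alpha = {alpha:.2f}, residual = {rss:.3g}")
```

A large residual, or a weight outside the range 0 to 1, would play the role of a rejected model in this toy setting, analogous to the source combinations reported above as not fitting.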