Evolutionary origin of genomic structural variations in domestic yaks

Yak has been subject to natural selection, human domestication and interspecific introgression during its evolution. However, genetic variants favored by each of these processes have not been distinguished previously. We constructed a graph-genome for 47 genomes of 7 cross-fertile bovine species. This allowed detection of 57,432 high-resolution structural variants (SVs) within and across the species, which were genotyped in 386 individuals. We distinguished the evolutionary origins of diverse SVs in domestic yaks by phylogenetic analyses. We further identified 334 genes overlapping with SVs in domestic yaks that bore potential signals of selection from wild yaks, plus an additional 686 genes introgressed from cattle. Nearly 90% of the domestic yaks were introgressed by cattle. Introgression of an SV spanning the KIT gene triggered the breeding of white domestic yaks. We validated a significant association of the selected stratified SVs with gene expression, which contributes to phenotypic variations. Our results highlight that SVs of different origins contribute to the phenotypic diversity of domestic yaks.

1. The authors start by recognizing that genes related to multifactorial traits are of particular interest to understanding the genetic bases of similar traits in humans (see lines 176-177), but this is quickly ignored, as the whole paper focuses on Mendelian or near-Mendelian traits (1-2 genes explain the variance of the entire trait). I do not appreciate this sort of long-shot aim in a work that per se has the potential to provide more interesting findings regarding the evolutionary history of domesticated animals. This statement just belittles their work, as they could not produce a single piece of evidence on the genetic architecture of such complex traits, and it can therefore be interpreted as not providing enough "novelty" to deserve publication. This is even more anecdotal in the following statement, at line 191: "a 277 bp insertion in an intron of the Disrupted-in-Schizophrenia-1 (DISC1) gene is frequent in domesticated yaks but rare in wild yaks". What is the meaning of this finding? Does it contribute to the message of the manuscript?
2. Line 45. The authors state that they have produced 28 de novo genomes, among which 7 were Asian cattle (4 taurine, 2 zebus, and 1 hybrid). I understand that the hybrid is between cattle. But why did they not rather sequence cattle x yak hybrids? It would be more interesting to understand the recombination patterns of interspecific crosses with yak, as the cattle hybrids are well covered. This would help to understand whether post-mating selection or gametic incompatibility plays a role in the frequency of some SVs among yak x cattle crossbreds. As far as I know, the yak x cattle hybrids are more limited in terms of altitude adaptation but do better than pure cattle. Further down, at line 126, some conclusions were drawn regarding the absence, in the other analyzed species, of some SV haplotypes that were fixed in wild yaks. But the uneven number of samples on which this conclusion is based precludes this kind of "hard" statement (i.e., 18 yaks, 2 bison, 7 wisents, 11 European cattle). How can it be speculated that two bison genomes are enough to rule out the presence of an SV?
3. Judging by the title and abstract, I was expecting more work and findings regarding major differences between domestic and wild yaks. Indeed, there is a growing body of uncertainty among the scientific community on whether or how many pure wild yaks still exist. This work could offer a very good clarification of this issue if the authors focused on analyzing the SV differences between both. Often, these differences are pointed to as the consequence of cattle introgression, and not just of different selective pressures. I understand that finding differences between such closely related taxa is way harder than finding differences between yak and cattle, but the findings would also be much more rewarding than the ones reported in this study, which, by the way, are not very novel.
Minor suggestions:
Refrain from abbreviating terms or words that are mentioned only once, e.g., line 178, "large difference (FDR < 0.05) in allele frequencies (AFs)".
Reviewer #2 (Remarks to the Author):
The authors investigate the evolutionary origins of yak traits linked to structural variants (SVs) through multi-species phylogenetic and pangenome analysis. To achieve this, they de novo assemble multiple wild and domestic yak as well as Asian cattle. They combine their data with publicly available assemblies from diverse members of the Bovini tribe to build a multi-species pangenome. SVs were genotyped using short-read data from a larger panel of individuals. SV associations and impacts were validated with orthogonal datasets.

Major comments:
The manuscript is well designed and thorough; however, the writing requires major revisions for clarity. It is not clear whether all of the data used in this manuscript are publicly accessible. The manuscript needs a data availability section and should provide clear links to the data used. For instance, Supplementary Tables 1 and 2 should have a column with assembly accessions, SRA/ERA or equivalent numbers, or a link to the data source instead of referencing manuscripts or simply stating "this study".

Specific comments:
Line 60: Supplementary Table 4 should indicate which assemblies were from HiFi data.
Line 62: "resulted in a panel of". A panel of what?
Line 65: What is the origin of these SNPs? Methods are unclear.
Line 66: Typo; should be "form", not "for".
Line 74: "Soft core" has a connotation in English that might be distracting. I suggest replacing it with "near-core" or something similar.
Line 95: What do you mean by "genotyped"?
Line 105: Why "obviously"?
Line 117: What is the description of the light line in Figure 2e?
Line 219: Since this section introduces introgressed SVs, I suggest you move it before the previous section, "Contribution of SVs to yak domestication", so that the reader has a better understanding of the introgressed SVs mentioned there.

This work reports 28 de novo genomes from five bovine species (mostly yak), which were analyzed together with published WGS data. The data and findings of the work are very interesting and novel, as structural variant differences have seldom been analyzed to infer the selection and hybridization signatures left in the genomes of domestic species by the process of environmental adaptation or human-guided crossings. In general, this MS reads relatively well and is worthy of publication, but not before moderate revision. Please find below my main suggestions and concerns:
1. The authors start by recognizing that genes related to multifactorial traits are of particular interest to understanding the genetic bases of similar traits in humans (see lines 176-177), but this is quickly ignored, as the whole paper focuses on Mendelian or near-Mendelian traits (1-2 genes explain the variance of the entire trait). I do not appreciate this sort of long-shot aim in a work that per se has the potential to provide more interesting findings regarding the evolutionary history of domesticated animals. This statement just belittles their work, as they could not produce a single piece of evidence on the genetic architecture of such complex traits, and it can therefore be interpreted as not providing enough "novelty" to deserve publication. This is even more anecdotal in the following statement, at line 191: "a 277 bp insertion in an intron of the Disrupted-in-Schizophrenia-1 (DISC1) gene is frequent in domesticated yaks but rare in wild yaks". What is the meaning of this finding? Does it contribute to the message of the manuscript?
[Response]: We realize that the sentence on the previous lines 176-177, "Genes controlling aggression, tameness and sociability in domestic yaks are of particular interest and may shed light on the genetic control of similar traits in humans", may create expectations that are not fulfilled in the rest of the manuscript. However, we did not mention "multifactorial traits". To prevent a misunderstanding, we have now removed any reference to the multifactorial or complex nature of the traits, because it is not essential for our message. In addition, we added an introductory sentence (Lines 234-238) and moved this section on early yak domestication (which in the previous version started with the sentence on lines 176-177) so that it now follows the section on SVs of domestic yaks from cattle and the origin of white yaks. DISC1 is a marker gene for major psychiatric disorders in humans, and its mutations are associated with depression, schizophrenia, and bipolar disorder. Of course, it is entirely unclear whether these disorders exist in species other than humans, but we did find that a 277 bp insertion in the DISC1 intron is frequent in domestic yaks but rare in wild yaks. Combining our results with those of previous studies, we propose that mutations in the introns of DISC1 may have modulated the behavior of yaks during the early stages of domestication. This has been clarified in the revised version (Lines 250-267).
2. Line 45. The authors state that they have produced 28 de novo genomes, among which 7 were Asian cattle (4 taurine, 2 zebus, and 1 hybrid). I understand that the hybrid is between cattle. But why did they not rather sequence cattle x yak hybrids? It would be more interesting to understand the recombination patterns of interspecific crosses with yak, as the cattle hybrids are well covered. This would help to understand whether post-mating selection or gametic incompatibility plays a role in the frequency of some SVs among yak x cattle crossbreds. As far as I know, the yak x cattle hybrids are more limited in terms of altitude adaptation but do better than pure cattle. Further down, at line 126, some conclusions were drawn regarding the absence, in the other analyzed species, of some SV haplotypes that were fixed in wild yaks. But the uneven number of samples on which this conclusion is based precludes this kind of "hard" statement (i.e., 18 yaks, 2 bison, 7 wisents, 11 European cattle). How can it be speculated that two bison genomes are enough to rule out the presence of an SV?
[Response]: Thank you for this thoughtful comment. We agree that the recombination pattern of interspecific crosses and post-mating selection or gametic incompatibility are indeed very interesting. For example, we discovered that the whitening of the yak's coat is primarily caused by chromosome recombination resulting from genetic introgression from cattle. In addition, the yak has lower productivity than cattle. Crossing of yaks and cattle is popular in regions of intermediate altitude and results in unique, highly productive offspring: the sterile male dzo and the fertile female dzomo. However, the pattern of genomic recombination during interspecific hybridization is complex. For example, it is not clear whether the sterility of the dzo is due to abnormal sex chromosome recombination. In subsequent studies we will follow up on these valuable comments on the basis of long-range whole-genome sequences for the yak-cattle hybrids. Indeed, the yak x cattle hybrids have a better altitude adaptation than pure cattle but are kept at lower altitudes than yaks. In this and previous studies, we have found that genes involved in the hypoxia response pathway (including EPAS1, EGLN1, EGLN2 and HIF3a) have been introgressed from yak to Tibetan cattle, which may have facilitated the adaptation of Tibetan cattle to high-altitude environments (https://doi.org/10.1038/s41559-018-0562-y; https://doi.org/10.1038/s41467-018-04737-0).
Regarding the reviewer's comment on the uneven number of samples, we agree that our formulation caused a misunderstanding. As clarified in the revised text, the absence of the haplotypes in two American bison is supported by their absence in other lowland bovine species, including 7 European bison (Lines 124-126).
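The sample-size concern raised in this point can be made concrete with a short calculation. The sketch below is not taken from the manuscript or the response; it assumes that sampled chromosomes are independent random draws from a population in which the SV allele has frequency `freq`, and it computes the chance that none of them carries the allele. The function name and the example frequencies are illustrative assumptions only.

```python
def prob_allele_unobserved(freq: float, n_chromosomes: int) -> float:
    """Probability that an allele at population frequency `freq` is absent
    from `n_chromosomes` independently sampled chromosomes."""
    if not 0.0 <= freq <= 1.0:
        raise ValueError("freq must lie in [0, 1]")
    if n_chromosomes < 0:
        raise ValueError("n_chromosomes must be non-negative")
    # Each chromosome lacks the allele with probability (1 - freq).
    return (1.0 - freq) ** n_chromosomes


if __name__ == "__main__":
    # Hypothetical allele frequencies; 2 and 7 diploid genomes
    # correspond to 4 and 14 sampled chromosomes.
    for freq in (0.05, 0.10, 0.25):
        for n_genomes in (2, 7):
            p_miss = prob_allele_unobserved(freq, 2 * n_genomes)
            print(f"freq={freq:.2f}, genomes={n_genomes}: "
                  f"P(unobserved)={p_miss:.3f}")
```

Under these assumptions, an allele at 10% frequency would go unobserved in four chromosomes about 66% of the time, which is why absence in two genomes alone is weak evidence and why support from additional species matters.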

3. Judging by the title and abstract, I was expecting more work and findings regarding major differences between domestic and wild yaks. Indeed, there is a growing body of uncertainty among the scientific community on whether or how many pure wild yaks still exist. This work could offer a very good clarification of this issue if the authors focused on analyzing the SV differences between both. Often, these differences are pointed to as the consequence of cattle introgression, and not just of different selective pressures. I understand that finding differences between such closely related taxa is way harder than finding differences between yak and cattle, but the findings would also be much more rewarding than the ones reported in this study, which, by the way, are not very novel.
[Response]: Thank you for this most interesting comment. Animal conservationists have estimated that there are only about 15,000 wild yaks in the world. To investigate whether pure wild yaks still exist, we carried out additional Admixture, TreeMix and D-statistic analyses, which indicate that wild yaks did not receive genetic introgression from domestic yaks or QTP cattle (Supplementary Fig. 4, 5, ...). This is also supported by previously reported SNP-based clustering (https://doi.org/10.1038/ncomms10283; https://doi.org/10.1186/s12862-020-01702-8).
In addition, in Supplementary Fig. 8e, we did not find any genetic introgression between wild yaks and Tibetan plateau cattle. In fact, breeders often cross male wild yaks with domestic yaks in order to produce yak breeds. Wild yaks live in isolated areas, and domestic yaks cannot approach them due to the presence of predators. In conclusion, we believe that pure wild yaks exist, although a larger sample of wild yaks is needed to verify this.
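For readers unfamiliar with the D-statistic mentioned in this response, the following minimal sketch shows the textbook ABBA-BABA form, D = (nABBA - nBABA) / (nABBA + nBABA), for a four-taxon arrangement ((P1, P2), P3, outgroup). It is an illustration only: the response does not state which software or population assignments were used, the function name is hypothetical, and the input is toy data in which each site is coded 0 (ancestral) or 1 (derived) with one allele per taxon.

```python
from typing import Iterable, Tuple

# One site: alleles for (P1, P2, P3, outgroup); 0 = ancestral, 1 = derived.
Site = Tuple[int, int, int, int]


def d_statistic(sites: Iterable[Site]) -> float:
    """Return D = (ABBA - BABA) / (ABBA + BABA) over the given sites."""
    abba = 0
    baba = 0
    for site in sites:
        if site == (0, 1, 1, 0):    # P2 and P3 share the derived allele
            abba += 1
        elif site == (1, 0, 1, 0):  # P1 and P3 share the derived allele
            baba += 1
    if abba + baba == 0:
        raise ValueError("no informative ABBA/BABA sites")
    return (abba - baba) / (abba + baba)


if __name__ == "__main__":
    # Toy data: three ABBA sites, one BABA site, one uninformative site.
    toy_sites = [(0, 1, 1, 0)] * 3 + [(1, 0, 1, 0)] + [(1, 1, 0, 0)]
    print(d_statistic(toy_sites))
```

A value of D close to zero is consistent with no excess allele sharing between P3 and either P1 or P2. In practice, significance is assessed with a block jackknife over the genome, which this sketch omits.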
We agree that "Often, these differences between wild and domestic yak are pointed to as the consequence of cattle introgression, and not just because of different selective pressures". In fact, this has been investigated in this study by selecting SV loci with different frequencies in the wild and domestic groups (Fig. 3i). This method has mostly been used to identify variable loci that differ between wild ancestors and domestic breeds (https://doi.org/10.1186/s13059-020-02169-y; https://doi.org/10.1038/s41467-022-33366-x). Fewer than half of the selected SVs (1100 out of 2354) were introgressed from cattle, and we considered the other 1254 to be associated with domestication.
"I understand that finding differences between such closely related taxa is way harder than finding differences between yak and cattle, …". Indeed, the domestication of the yak around 7000 years ago took place later than the domestication of several other

I hold numerous concerns regarding the utilisation of elementary F-statistics (FST) at the interspecies level. Although they are founded on F-statistics, derived parameters already exist that aim to incorporate the genealogy of alleles (SNPs), such as ΦST. These derived parameters represent a middle ground in terms of their application of statistical measures at the inter-species level. While I comprehend the line of argumentation put forth by the authors, which essentially rests upon the precedent principle, I am uncertain whether the respective distinctions, mutatis mutandis, have been considered thoroughly enough to justify the use of simple Wright's FST in this particular case.
Regrettably, I am unable to offer any alternative suggestions at this juncture, and the editor is left to make a decision unaided.
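To make the quantity under discussion explicit, the sketch below computes simple Wright's FST for one biallelic locus in two populations from heterozygosities, FST = (H_T - H_S) / H_T. It is an illustration under stated assumptions, not the authors' pipeline: the two populations are weighted equally, allele frequencies are treated as known without sampling error, and the function name is hypothetical. Because it uses allele frequencies only, it ignores the genealogical distance between alleles, which is the information that ΦST-type measures add.

```python
def wright_fst(p1: float, p2: float) -> float:
    """Wright's FST for one biallelic locus, given the frequency of the
    same allele in two equally weighted populations."""
    for p in (p1, p2):
        if not 0.0 <= p <= 1.0:
            raise ValueError("allele frequencies must lie in [0, 1]")
    p_bar = (p1 + p2) / 2.0
    # Expected heterozygosity of the pooled population.
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    if h_total == 0.0:
        # Locus is monomorphic overall; FST is undefined, and 0.0 is
        # returned here purely as a convention of this sketch.
        return 0.0
    # Mean expected heterozygosity within populations.
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    return (h_total - h_within) / h_total


if __name__ == "__main__":
    # Hypothetical allele frequencies in two groups.
    print(round(wright_fst(0.2, 0.8), 4))
```

Real analyses usually rely on estimators that correct for sample size, such as those of Weir and Cockerham or Hudson, rather than this plug-in form.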
Lines 243-258: I'm confused about the role of Cs29. It doesn't appear to matter which allele of Cs29 is present.
Line 284: Link does not work.
Non-exhaustive examples of text to rephrase for clarity: Lines 23

Thank you for your comments concerning our manuscript entitled "Evolutionary origin of structural variations in domestic yaks" (ID: NCOMMS-23-01736-T). All comments were valuable and very helpful for revising and improving our paper, and for allowing us to better convey the significance of our study. We have studied the comments carefully and have made several corrections, which we hope meet with your approval. We highlight the major revisions in red font and indicate below the line numbers in the new version of the manuscript. The main corrections to the paper and our responses to the reviewers' comments are as follows:

Reviewer #1 (Remarks to the Author):