Association of TLR2 haplotypes encoding Q650 with reduced susceptibility to ovine Johne’s disease in Turkish sheep

Ovine Johne’s disease (OJD) is caused by Mycobacterium avium subsp. paratuberculosis (MAP) and carries a potential zoonotic risk for humans. Selective breeding strategies for reduced OJD susceptibility would be welcome tools in disease eradication efforts, if available. The Toll-like receptor 2 gene (TLR2) plays an important signaling role in immune response to MAP, and missense variants are associated with mycobacterial infections in mammals. Our aim was to identify and evaluate ovine TLR2 missense variants for association with OJD in Turkish sheep. Eleven TLR2 missense variants and 17 haplotype configurations were identified in genomic sequences of 221 sheep from 61 globally-distributed breeds. The five most frequent haplotypes were tested for OJD association in 102 matched pairs of infected and uninfected ewes identified in 2257 Turkish sheep. Ewes with one or two copies of TLR2 haplotypes encoding glutamine (Q) at position 650 (Q650) in the Tir domain were 6.6-fold more likely to be uninfected compared to ewes with arginine (R650) at that position (CI95 = 2.6 to 16.9, p-value = 3.7 × 10–6). The protective TLR2 Q650 allele was present in at least 25% of breeds tested and thus may facilitate selective breeding for sheep with reduced susceptibility to OJD.

Our interest in TLR2 was based on three observations: (1) its increased mRNA abundance soon after MAP infection in sheep 22 , (2) its actively involvement in the innate immune response against tuberculosis 23,24 and paratuberculosis diseases 16,17 , and (3) its missense variant association with susceptibility to mycobacterial infections in humans and farm animals 25 .
In the present report, our aim was to identify and phase missense variants encoded by TLR2 in globallydiverse breeds of sheep, and evaluate those present in Turkish sheep for association with OJD. We identified 11 TLR2 missense variants in whole genome sequence (WGS) data in global populations of 221 sheep representing 61 breeds. PCR-sanger sequencing was used to focus on the four most relevant variants in a Turkish sheep cohort of six native, two crossbred, and three composite breeds from 11 previously described flocks 26 . The study design was a retrospective matched case-control with an indirect-enzyme-linked immunoassay (ELISA) to establish OJD serostatus. PCR-based Sanger sequencing was used to genotype the predominant four missense variants (K137R, D225A, R650Q, F670L) and analyzed by predicted propeptide haplotype with McNemar's test for association with OJD. Analyses indicated that ewes with one or two copies of the Q650 variant had a 6.6-fold reduced risk for MAP infections. This raises the possibility for use in selective breeding strategies designed to reduce OJD prevalence for disease eradication.

Results
OJD seroprevalence of Turkish native and composite sheep, and assembly of matched casecontrol pairs. Indirect-ELISA results revealed that OJD was wide-spread among all but the most isolated flocks of Turkish sheep. The mean seroprevalence was 7.3% among 2257 ewes from 11 flocks (Tables 1 and S1). The Cine Capari (flock 10) was the only flock free from the MAP infection, and the only flock were reared alone in an isolated upland region. The range of seroprevalence among infected flocks was 1.3% (Karakacan) to 29.4% (Chios) with the two milk breeds (Chios and Awassi) having the highest seroprevalence. From the above 2257 ewes, 120 pairs of seropositive and seronegative animals were identified, where each pair was matched for breed, sex, flock, and age as described in the "Materials and methods". Of these 120 matched pairs, 18 were eliminated from genetic testing due to difficulties in amplifying their DNA. The remaining 102 ewe pairs (Table 2) were tested for genetic association with TLR2 missense variants.
Identification of TLR2 missense variants in WGS from 221 sheep. The TLR2 gene is composed of two exons, with the second exon encoding the entire 784 amino acid propeptide sequence. Eleven TLR2 missense variants were identified in silico by aligning WGS from 221 sheep from 61 breeds around the world ( Fig. 1A and Table S2). Haplotype-phased propeptide variants (diplotypes) were unambiguously assigned for animals that were homozygous, or only had one heterozygous missense variant. In the 221 reference animals, 68% of them had unambiguous phased diplotypes (Table S3). Of the remaining 71 ewes with ambiguous diplotypes, 52 (73%) were heterozygous at only two positions and consistent with that of an animal heterozygous for the two most common haplotype alleles (TLR2 haplotypes "1" and "2" , Fig. 1B). These two haplotype alleles accounted for 83% of the total. The remaining 15 haplotypes were observed with frequencies between 0.03 and www.nature.com/scientificreports/ 0.003 (Table S4). A comparison of propeptide haplotypes encoded by TLR2 from closely related bovidae species indicated the ancestral root lies close to haplotypes "1" or "6" (Fig. 1B, Table S5). The presence of loop structures in the ovine TLR2 phylogenetic tree indicates the occurrence of either recombinant haplotypes or convergent mutations. For example, TLR2 haplotype "2" (A225, L670) has at least three evolutionary paths: meiotic recombination between haplotypes "5" (L670) and "6" (A225), an A225 mutation arising on haplotype "5", or a L670 mutation arising on haplotype "6". Although imperfect, organization of the predicted ovine propeptide variants encoded by TLR2 into a maximum parsimony phylogenetic tree, provides an initial framework for analyzing and interpreting TLR2 association testing for OJD disease susceptibility.

Identification of ovine TLR2 missense variants in 102 matched case-control Turkish sheep.
Resources were not available for PCR-based Sanger sequencing of the entire propeptide sequence encoded by TLR2 in the 120 matched pairs of Turkish sheep. Thus, the focus was on two regions containing the most polymorphic missense SNPs: K137R, D225A in the "L domain-like" region, and R650Q, F670L in the Tir domain region (PCR regions 1 and 2, Fig. 1A). Missense SNPs R563H and R697C were contained in these regions, but not polymorphic in the 120 matched pairs. DNA sequencing was successful for 102 of the 120 matched pairs and haplotype assignment was unambiguous for 115 of 204 ewes (56% , Tables S6 and S8). The remaining haplotypes were inferred by using the maximum parsimony tree in Fig. 1 as a framework for interpreting the most probable diplotypes and their frequencies. The resulting maximum parsimony tree for the 102 matched pairs of Turkish sheep was collapsed to seven nodes with propeptide haplotypes "1" and "2" comprising 85% of the total observed (Fig. 2). Based on the frequencies of the propeptide alleles, TLR2 haplotypes "1", "2", and "13" were tested individually as potential risk factors for OJD in 1-copy, 1-or 2-copy, and 2-copy allelic models. Haplotypes "1" and "2" did not reach the target criteria for highly significant results (Methods) in any allelic model, failing on either odds ratio (OR) or number of informative matched pairs (Table 3). Conversely, TLR2 haplotype "13" with Q650 was strongly associated with significance as a "protective factor" when one copy was present (OR = 0.095, p-value = 6.0 × 10 -5 (Table 3). However, the number of informative pairs for this test was 23, and landed short of the 25 informative pairs recommended for a robust McNemar's test. Nevertheless, a ewe with one copy of haplotype "13" was 10.5-fold less likely to be infected with MAP than it's matched seropositive flock mate (CI 95 2.4 to 45.4). The effect size of one copy of haplotype "13" was considered large as measured by Cohen's g (0.41). Since haplotype "13" contained the novel Q650 missense mutation, the association of all haplotypes containing Q650 (i.e., "13", "15", and "17") was also tested as a group (Fig. 2, gray shaded area). When these haplotypes together, the significance of the association increased when one or two copies of any of these Q650-containing haplotypes were present (OR 0.15, p-value = 3.7 × 10 -06 , Table 3). This result exceeded the 95% power expectations for detecting an effect with 37% discordant pairs (38/102) with a two-sided McNemar test at a significance level of 0.01. Thus, a ewe with one or two copies of Q650 from derived from any of haplotypes "13", "15", and "17", was 6.7-fold less likely to be infected with MAP than it's matched seropositive flock mate (CI 95 2.6 to 16.9). Like that of TLR2 haplotype "13" alone, the effect size of one or two copies of Q650 was considered large as measured by Cohen's g (0.37). The distribution of TLR2 Q650 haplotypes in domestic sheep was limited mostly to breeds native to the Fertile Crescent region. The threeTLR2 Q650 haplotypes associated with OJD in Turkish sheep ("13", "15", and "17") were detected in a total of eight breed groups: Awassi, Bandirma, Chios, Imroz, Karakacan, Kivircik, Karacabey Merino (crossbred of Kivircik), Santa Inês (Table S4). These haplotypes were the most prevalent in native Turkish breeds and these breeds were the likely source of Q650 alleles in improved or composite breeds such as Karacabey Merino, Ramlic, Hampshire crosses, and SBA which were developed backcrossing by European originated terminal breeds. There were three additional TLR2 Q650 haplotypes ("3", "10", and "11") that were not present in the matched case-control pairs but detected in nine additional breeds: Bangledeshi, Castellana, Changthangi, Dorper, Eastern Tibetan, Garole, Santa Inês, Sumatra, and White Dorper. The effect of these Q650-containing TLR2 haplotypes is unknown, however it is reasonable to hypothesize they may be associated with reduced susceptibility to OJD.  7  14 11 13 4 3 ----52   Bandirma  2  -6  1  1 3 1 ---14   Kivircik  3  1  2  4 1 -----11 www.nature.com/scientificreports/

Discussion
This report describes the association of TLR2 haplotypes encoding Q650 with reduced susceptibility to ovine Johne's disease in Turkish sheep. We combined a candidate gene approach with a retrospective matched case-control design to evaluate propeptide haplotypes encoded by TLR2 for association with OJD in Turkish sheep. The    www.nature.com/scientificreports/ sheep and did not have the Q650 allele. Isolation from other infectious sheep is the most likely explanation of the negative MAP status of the Cine Capari and the purpose of this flock's isolation. The vast majority of seropositive ewes in this study appeared to be clinically normal which is typical of MAP infections. The serological results suggest that all but the most isolated Turkish flocks are infected with MAP and at risk for OJD. This is consistent with other reports of OJD prevalence around the world [27][28][29] . A critical step in evaluating candidate genes for association with traits is defining the spectrum of protein variants 30,31 . Gene function in livestock may be affected by a wide range of large and small scale genomic sequence differences 32,33 . However, variants that alter amino acid sequences via missense, nonsense, frameshift, and splice site variants, are readily detectable in sheep WGS and among those most likely to affect function 34 . Thus, defining the prevalence and diversity of the predicted propeptide haplotypes encoded by TLR in global sheep populations was important for understanding function and identifying which breeds are affected. This report identified 11 missense variants comprising 17 predicted propeptide haplotypes encoded by TLR2 in 221 sheep from 61 breeds from around the world and in Turkish sheep. Frameshift, splice site, and nonsense mutations were not detected in this study, nor were the R315W or the R723H variants reported in dorper sheep 35 . Organizing the phased propeptides encoded by TLR2 into a rooted phylogenetic tree in global sheep populations was important for inferring phased diplotypes in Turkish sheep and gaining insight into the evolutionary history of the coding sequence. The phylogenetic tree structure had multiple loops involving infrequent haplotypes and was suggestive of recombination within the TLR2 exon 2 and consistent with directional selection away from the root, and towards more recent haplotypes.
A retrospective matched case-control design, combined with McNemar's test for correlated proportions, has been a successful approach for genetic association studies with other chronic infectious diseases in sheep 26,36 . The pairwise identification of affected and unaffected sheep matched for age, year, sex, breed, flock, and location appears to effectively control for confounding factors like pathogen exposure, pathogen strain, exposure duration, and admixture. The present study had good statistical power to evaluate the three most frequent TLR2 propeptide haplotypes in 102 matched pairs of Turkish sheep (haplotypes "1", "2", and "13", Fig. 2). Only TLR2 haplotype "13", with its Q650 variant was significantly associated with OJD. This observation was reinforced in combined analyses with the other TLR2 haplotype harboring the Q650 variant (haplotypes "15" and "17"). This suggests that selective breeding for TLR2 haplotypes "13", "15" and "17" in Turkish flocks may reduce the genetic susceptibility of animals to OJD. Moreover, increasing the frequency of TLR2 Q650 variants in Turkish flocks may reduce the incidence of OJD in affected flocks.
The impact of the Q650 variant on the TLR2 protein function is unknown. The missense variant is located in the cytoplasmic TIR domain and reduces the net negative charge by one compared to the R650 residue. TIR domain interactions between host cellular receptors and adaptors are important for activating conserved cellular signal transduction pathways in response to bacterial and viral pathogens, cytokines and host growth factors 37 . However, it is also possible that the Q650 variant is linked to a nearby, causal mutation that does not affect the primary sequence of the TLR2 protein. There are a number of related species that suffer from Johne's disease, yet have Q650 as their predominant residue at that position, including bighorn sheep 38 , goats, and water buffalo 39 (Table S7). Moreover, sequencing the TIR region of TLR2 in 14 saanen goats with OJD showed all were homozygous Q650 (data not shown). The most recent common ancestor (MRCA) between goats and sheep is approximately 10 million years ago 40 and thus the Q650 allele may be rather ancient. Species alignments with TLR2 propeptide sequences show the Q650 residue present in ruminants, cetacea, ungulates, primates, rodents, amphibians and jawless vertebrates such as lamprey. The latter shares a MRCA approximately 615 million years ago 40 with sheep (Tables S5 and S7). Furthermore, the TLR2 Q650 variant is present in the human 1000 genomes data set with < 0.01 frequency (rs200483398) and listed as "neutral" and "well tolerated" 41 . Thus, if the presence of the ovine Q650 residue is responsible for the association of TLR2 with OJD in Turkish sheep, perhaps it is also dependent on a limited number of combinations of the other 783 residues encoded by TLR2.
The occurrence of Q650 variants was previously reported in Djallonke, Dorset, and Red Maasai 35 , however their frequencies and phased diplotypes were not provided for these sheep, and thus could not be compared here. Although, breeds like Bangledeshi, Castellana, Changthangi, Dorper, Eastern Tibetan, Garole, Santa Inês, Sumatra, and, White Dorper may provide a potential source for TLR2 Q650 alleles, it is unknown whether haplotypes "13", "15", and "17" are associated with reduced susceptibility to OJD in those breeds, and whether other haplotypes containing Q650 (e.g., "3", "10", and "11") are also associated with infection. Thus, more studies are needed to determine whether the results observed in the present report may be extended to other populations and production environments.
In conclusion, TLR2 haplotypes encoding Q650 were associated with reduced susceptibility to ovine Johne's disease in Turkish sheep. Ewes with one or two copies of the Q650 variant on haplotypes "13", "15", and "17" had a 6.6-fold reduced risk for MAP infections. This suggests that selection for TLR2 Q650 alleles in Turkish sheep may be useful for reducing OJD prevalence in Turkish sheep. Moreover, it raises the possibility that TLR2 haplotypes encoding Q650 in other breeds or species may affect susceptibility to MAP infections and Johne's disease. However, further research is needed to replicate the present results in other affected flocks segregating these TLR2 haplotypes encoding Q650 alleles. Pairwise ewe matching. Controlling for pathogen exposure is essential for successful genetic association studies involving susceptibility to infectious disease. Retrospective matched case-control designs can help control for exposure intensity and duration to persistent infectious diseases like OJD. MAP exposure was controlled by using ewes older that two years and matching for sex, age, breed, flock, and location. This also accounted for possible population stratification. Where possible, the age criterion included having the seronegative ewe be the same age or older than the seropositive pair mate to allow for equivalent or increased exposure. Thus, the aim was to match a seropositive ewe with a seronegative ewe from the same flock, same breed type, and same or older age (Table S9). Ewe age was not available for some private flocks, and thus approximately 10% of the pairs were not ideally matched for MAP exposure duration.

PCR-based Sanger sequence genotyping.
To investigate variable regions of the ovine TLR2 coding sequence, two separate PCR was carried out using the designed primers; TLR2_F_1: GGG GGC CAA TGA AAT TCA CAC, TLR2_R_1: GTC AGT GCT GTA AAA TCG CCA and TLR2-F_2: ACC ACT CGC TCC TCA CAA AG, TLR2-R_2: GAC TTC CTG TCC TTC GCA CA, thus, 1168 bp of the 2355 bp coding region of ovine TLR2 was amplified in total. The TIR region of caprine TLR2 was amplified with the same primers. PCR products were sequenced with following protocols; exo-sap incubation, chain termination, ethanol purification, and capillary electrophoresis (Applied Biosystems ABI 3500).
Assigning haplotype phase and assembling a phylogenetic tree. Haplotype-phased propeptide variants predicted to be encoded by TLR2 were unambiguously assigned in individuals that were either: (1) homozygous for all 11 missense variants, or (2) had exactly one heterozygous missense variant. A maximum parsimony phylogenetic tree was manually constructed from the unambiguously phased protein variants. This tree was used, together with simple maximum parsimony assumptions, to infer TLR2 haplotype phase in ewes where two or more heterozygous missense variant sites occurred in. The protein phylogenetic trees were rooted by comparing the 11 missense variant sites in sheep to the equivalent 11 sites in other species, starting with those most closely related. The ovine peptide sequence encoded by TLR2 was used to search NCBI's refseq_protein database with BLASTP (version 2.6.123 43 ). Aligned protein sequences from a representative subset of 33 species throughout the vertebrata subphylum were used for the comparison.
McNemar's test for correlated proportions. Propeptide isoforms predicted to be encoded by TLR2 (i.e., haplotype alleles) were tested as possible disease risk factors in three different models: 1-copy, 1-or 2-copy, and 2-copy settings. The presence (+) or absence (−) of a possible genetic risk factor was assigned to each animal and the matched pair was then classified by one of four possible binomial outcomes (Table 4).
In McNemar's test, only the discordant pairs are informative (i.e., quadrants b and c). A genetic risk allele was assigned when quadrant b was greater than quadrant c (i.e., OR > 0). A protective allele was assigned when quadrant c was greater than quadrant b (i.e., OR < 0). A power analysis was performed using G*Power v3.1.9.4 software 44 to evaluate the statistical strength of the study. The target criteria for highly significant results were an OR greater than or equal to 3 and alpha of 0.05, and at least 25 total informative pairs (i.e., (b + c)/n). Although not all haplotypes tested had 25 informative pairs, the mid-p was used since it performs well in these situations compared with the asymptotic, asymptotic with continuity correction, and exact conditional tests 45 . Cohen's P (c/[b + c]) and Cohen's g (Cohen's P-0.5) were used to roughly estimate the effect size 46  www.nature.com/scientificreports/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http:// creat iveco mmons. org/ licen ses/ by/4. 0/.