Scanning the genomes of parents for imprinted loci acting in their un-genotyped progeny

Depending on their parental origin, alleles at imprinted loci are fully or partially inactivated through epigenetic mechanisms. Their effects contribute to the broader class of parent-of-origin effects. Standard methodology for mapping imprinted quantitative trait loci in association studies requires phenotypes and parental origin of marker alleles (ordered genotypes) to be simultaneously known for each individual. As such, many phenotypes are known from un-genotyped offspring in ongoing breeding programmes (e.g. meat animals), while their parents have known genotypes but no phenotypes. By theoretical considerations and simulations, we showed that the limitations of standard methodology can be overcome in such situations. This is achieved by first estimating parent-of-origin effects, which then serve as dependent variables in association analyses, in which only imprinted loci give a signal. As a theoretical foundation, the regression of parent-of-origin effects on the number of B-alleles at a biallelic locus — representing the un-ordered genotype — equals the imprinting effect. The applicability to real data was demonstrated for about 1800 genotyped Brown Swiss bulls and their un-genotyped fattening progeny. Thus, this approach unlocks vast data resources in various species for imprinting analyses and offers valuable clues as to what extent imprinted loci contribute to genetic variability.

ScIeNtIFIc REPORTS | (2019) 9:654 | DOI: 10.1038/s41598-018-36939-3 model and regression 33 ). Ordered genotypes also form the basis for genomic relationship matrices that account for imprinting 34 . To determine the ordered genotype of an offspring, conclusions on the parental allele origin can usually be drawn from the genotypes of its parents. However, if both parents are heterozygous, it is not possible to determine the offspring's ordered genotype 32 . Then, if available, markers adjacent to the considered locus (haplotype information) may indicate the parental allele origin and thus the offspring's ordered genotype. Furthermore, to detect iQTLs, phenotypic information is necessary in addition to ordered genotypes. This is a common issue in livestock, where the traits of interest are often measured in progeny which are often ungenotyped (e.g. milk production traits 35 ) in contrast to genetically influential parents, such as bulls in dairy cattle. Therefore, there is a large amount of existing data that could be used for imprinting analyses.
To overcome the limitations of the currently applied methods of analysis for the detection of iQTLs, we propose a novel approach where estimated POEs (ePOEs) of parents are exploited to map imprinted genes expressed in their progeny. In the first step, phenotypic information can be summarised by applying a mixed imprinting model (first described in Blunk et al. 27 ) that describes POEs as differences between two sex-specific transmitting abilities (TAs) for each parent. In the second step, the ePOEs are used as dependent variables in a model, where un-ordered genotypes of parents constitute the explanatory variable related to a particular fixed marker effect and a random polygenic effect accounts for the unconsidered genetic variability in the POEs. This novel approach does not require ordered genotypes or both genotypes and phenotypes from the same individuals. This principle is demonstrated in simulated data and is applied to empirical data consisting of genotyped Brown Swiss sires with ePOEs for slaughter traits. First, we provide theoretical justification by investigating the regression of POEs on the number of B-alleles at an imprinted locus in a random mating population.

Theory
Here, we consider the regression of POEs on the number of B-alleles as a population parameter for a biallelic imprinted locus in a random mating (Fisher-Wright) population without selection and mutation. The alleles B and b have respective frequencies of p and q = 1 − p. Differences between the phenotypic means of the ordered genotypes (BB, Bb, bB, bb) are described by the additive effect a (BB − bb = 2a), the dominance effect d (0.5(Bb + bB) = d; assuming a centering of genotypes at the midpoint of the two homozygotes) and the imprinting effect i (Bb − bB = 2i), as described in Knott et al. 36 , Mantey et al. 37 and the review by Lawson et al. 1 . Note that a somewhat different parameterization has been given by Spencer 38 . Parent-specific allele-substitution effects 39 are given for male parents as α α = + + = + a q p d i i ( ) p and for female parents as α α The quantity α = +a q p d ( ) is known as the allele-substitution effect in a standard Mendelian case 40 . In the presence of genomic imprinting, a locus exerts its effect on the offspring either under a male or a female expression pattern. In the former, the average phenotype of a parent's offspring equals half of its breeding value 'as father' and in the latter, half of its breeding value 'as mother' 24,25 . For the ordered genotypes BB, Bb, bB and bb, where the first allele is paternally derived, these breeding values can be expressed as functions of parent-specific substitution effects and allele-frequencies (Table 1). By following the definition in previous studies on variance components (e.g. Blunk et al. 27 ), we define genotype-specific POEs as differences between the two sex-specific TAs for each genotype or, equivalently, as half the difference between both kinds of breeding values. For the four ordered genotypes, the POEs are 2qi, (q − p)i, (q − p)i, and −2pi, in the same order as above (Table 1).
For our purposes, we need to know the expectation and variance of the POEs in the population. The expectation is E(POE) = p 2 2qi + 2pq(q − p)i − q 2 2pi = 0, and the variance is given by Var(POE) = p 2 4q 2 i 2 + 2pq(q − p) 2 i 2 + q 2 4p 2 i 2 = 2pqi 2 , which is equal to the imprinting variance at the locus, as derived by de Koning et al. 39 . Moreover, we are interested in the relationship between the number of B-alleles (gene count, gc), which has the expectation 2p and variance 2pq, and POEs. Their covariance can be derived as: which translates into a correlation of r = 4p 2 q 2 i 2 . In summary, for a single imprinted locus, we have the covariance matrix: Finally, we arrive at the regression of the POEs on gene count, which is: genotype frequency gene count BV as father BV as mother POE This final result shows that the regression of the POEs on the number of B-alleles of the same locus equals the imprinting effect. Next, we demonstrate how estimates of this population parameter can serve for marker-based imprinting analyses of simulated and real data.

Data Background
Simulated data. Importantly, POEs can be estimated for genotyped parents even when their phenotypes are not known, as it is often the case in livestock. In practice, ePOEs can first be obtained 24,25,27,28 and, after a deregression step, be regressed on the number of B-alleles at a marker locus. In the light of the results for the single locus model, the resulting regression coefficient can then be interpreted as the imprinting effect of an iQTL in linkage and/or associated with the marker. Reliabilities of ePOEs are a prerequisite for deregression 41 . Only recently, their efficient calculation using mixed-model methodology has been described 27,28 .
We simulated two types of populations; the first comprised two generations, while the second contained three generations. Phenotypes were always available for the last generation only, while parents and grandparents remained without known phenotypes. The two-generation population was simulated to prove the applicability of the principle. One hundred unrelated parents were drawn from a base population. These parents were then inter-mated in a cross-classified manner, resulting in 100 × 100 full-sib families of size three. In this way, each parent was mated as father to all 100 parents (including itself), which acted as mothers. Vice versa, each parent was mated as mother to all 100 parents (including itself), which acted as father. This gave rise to 100 × 3 maternal half-sibs per parent. Thus, imprinted alleles were passed from each parent to progeny with both of the two possible parental expression patterns.
In the three-generation pedigree, sources of family information contributed to the ePOEs of the individuals in generations one and two. These sources were the parent-average (PA), records of the final progeny and records of the progeny's final progeny. The naïve utilization of ePOEs as dependent variables would negatively influence the outcome of association studies as shown for breeding values in Ekine et al. 42 . Thus, deregressed and weighted ePOEs, free from the influence of the PA, were needed. The three-generation population was simulated to illustrate the effects of the PA-correction, deregression and weighting. Ten grandparents were randomly chosen from the base population and produced 100 full-sib families, whose variable size was chosen from a Poisson distribution with a mean of 5.0. This led to 514 parents in the second generation, which were again inter-mated in a cross-classified manner. The number of full-sibs per family in the third generation was also Poisson distributed with a mean of 5.0. Only a fraction of all possible progeny (3%) with phenotypes was retained so that 41,273 records remained for the analysis. The full-sib families contained one to four phenotyped offspring each.
In both population types, 15 mutually unlinked marker genotypes were simulated, where five were in linkage disequilibrium (LD) with QTLs. The LD was created by first drawing the QTL alleles from a binomial distribution depending on their allele frequencies. The alleles of the adjacent markers were then drawn from a conditional distribution, where an initial LD of 0.1 among founders was assumed. The first QTL at marker locus two (minor allele frequency [MAF] = 0.4) was chosen to be biparentally expressed (Mendelian QTL) contributing 30% to a total additive genetic variance of 1.35. A purely paternally and a purely maternally expressed QTL each contributed 5% to the total additive genetic variance and 25% to the total imprinting variance of 0.27. These were linked to the marker loci five (MAF = 0.5) and eight (MAF = 0.5). The marker loci 11 (MAF = 0.5) and 14 (MAF = 0.5) were both linked to partially imprinted QTLs with opposite POEs, which contributed 30% to the total additive genetic variance and 25% to the total imprinting variance. A residual variance of 2 generated a heritability of 0.40 for the simulated trait, and the imprinting variance overall accounted for 20% of the total additive genetic variance. The trait was created by adding additive and imprinting effects according to the simulated QTL genotypes. All genetic distances between associated markers and QTLs were set to 10 centiMorgans in the base population. For each of the two populations, 100 repetitions were simulated by drawing new genotypes and residual effects without altering family sizes or pedigree structures.
Next, we compared the results for imprinted and un-imprinted (Mendelian) QTLs under both our new and the traditional approach. Therefore, four different kinds of analyses, labelled 1A, 1B, 2A and 2B, were applied to the two-generation data. First, in analyses 1A and 1B, we made use only of the un-ordered genotypes and ePOEs and TAs of the 100 parents. Second, in two consecutive association analyses (2A, 2B), we assumed that the phenotypes and ordered genotypes of the 30,000 offspring were available. To find associations with imprinted loci, in analysis 2A an adapted version of a measured genotype model 43 was applied. Analysis 2B considered additive effects only by neglecting the imprinting effect.
Finally, to illustrate the effects of the PA-correction, deregression and weighting, the ePOEs from the three-generation data were used for association analyses. In analysis 3A, ePOEs were left completely untreated, while in 3B they were deregressed and PA-corrected, and in 3C, the ePOEs were additionally weighted. A summary of the analyses is provided in Table 2.
Brown Swiss data. The ePOEs and parental TAs as well as their reliabilities were derived from an imprinting analysis of Brown Swiss cattle slaughterhouse data published in detail in Blunk et al. 27 . They were derived for up to 428,710 sires and dams based on the routinely recorded performance of their progeny, which were up to 173,051 fattening bulls (exact numbers vary from trait to trait). The imprinting variance contributed a significant proportion to the total genetic variance in the net body weight (BW) gain (carcass weight divided by age [g/d]), carcass muscularity and carcass fatness. While muscularity was described using five monetary grades reflecting price differences, fatness was categorised by scores ranging from 1 (lean) to 5 (very fat). Assuming these traits to be normally distributed, ePOEs were first generated applying a linear imprinting model (these traits are subsequently indicated with the subscript L). Then, ePOEs were generated applying a threshold imprinting model assuming the traits to be binomially distributed (subscript T). As data were routinely obtained, animal care guidelines were not obtained.
Upon agreement of all involved organisations, genotypes were made available from the central genome database, which is maintained for breeding purposes at LKV Bayern in Munich. Data retrieval was assisted by the Institute for Animal Breeding at the Bavarian State Research Centre for Agriculture. After quality control, 37,443 genotypes remained for 1,857 sires for the net BW gain and 37,433 genotypes remained for 1,831 sires for the muscularity and fatness traits.
The genome was scanned via single marker regression, where ePOEs, defined as deviations of the TAs as dam from the TAs as sire, were regressed on their marker gene counts individually. With respect to the TAs (derived from the reduced imprinting model), both kinds (TAs as sire and TAs as dam) were used as dependent variables. Both ePOEs and TAs were PA-corrected, deregressed and weighted adapting the approximate method published in Garrick et al. 41 for breeding values. When their deregressed reliabilities were <3%, they were discarded. With regard to ePOEs, 1,793 remained for the net BW gain with an average reliability of 25.86%; 1,033 remained for muscularity (L) with an average reliability of 9.73%; 1,720 remained for muscularity (T) with an average reliability of 14.01%; 1,277 remained for fatness (L) with an average reliability of 10.63%; and 1,649 remained for fatness (T) with an average reliability of 12.51%. As discussed in detail later, the sensitivity to c, which was chosen to calculate the weighting of ePOEs and which defines the proportion of the imprinting variance not captured by markers 41 , was not particularly high in the simulation study. Therefore, c was limited to 0.1, 0.5 and 0.8. These values were only applied during the genome scan for ePOEs on the net BW gain to investigate the effect of changing c within a real data framework. A c of 0.1 was chosen to analyse the ePOEs on all other traits.

Results and Discussion
Simulated data. To investigate whether ePOEs are suitable to detect iQTLs, they were used as dependent variables regressed on the gene counts of 100 unrelated parents in the simulated two-generation data (analysis 1A). At the fully imprinted marker loci (five and eight), the average p-values ranged from p = 0.012 to p = 0.025, indicating the presence of imprinted loci (1A in Fig. 1; Table 3). The signals at the marker positions 11 and 14 (linked to partially expressed iQTLs) were visible with average p-values of 0.010, respectively. In contrast, the mean estimated effect at marker locus two (linked to a purely Mendelian QTL) did not significantly deviate from zero. As expected, the same pattern of signals was observed when phenotypes and ordered marker genotypes of offspring were used in a measured genotype model to detect iQTLs in analysis 2A (2A in Fig. 1; Table 3). Together, these results provide proof of principle that associations with iQTLs can be detected by regressing ePOEs on un-ordered genotype information. In comparison to ePOE-analysis 1A, higher average signals (in terms of −log 10 p-values) indicated the positions of iQTLs (2A in Fig. 1) when phenotypes of offspring served as the dependent variable (analysis 2A). However, ordered marker genotypes of 30,000 individuals with records were necessary to observe this result, whereas un-ordered genotypes of only 100 parents without records were used in analysis 1A. Therefore, using ePOEs as dependent variables constitutes a suitable alternative to detect iQTLs especially when ordered marker genotypes are not available for individuals with phenotypes or phenotypes are not observed for the individuals with marker genotypes. The ePOEs and reliabilities can be directly estimated for parents by applying the reduced imprinting model 27 . This model is also available in a sire-maternal grandsire version, which is particularly helpful when there are large data sets. This was demonstrated in Blunk et al. 28 , where more than 1.3 Mio records of Simmental fattening bulls were analysed.
When, in contrast to ePOEs, estimated TAs were used as dependent variables (analysis 1B), mean p-values ranging from 0.002 to 0.011 indicated QTLs at all loci with a Mendelian component, i.e. purely Mendelian and partially imprinted loci (1B in Fig. 1; Table 3). The mean effects estimated for the two fully imprinted markers did not significantly deviate from zero, with mean p-values of 0.328 and 0.369. This demonstrates that TAs, or, equivalently, breeding values, are only suitable to detect QTLs contributing to Mendelian genetic variation but not to detect fully imprinted loci, nor do such analyses offer any clue as to the imprinted nature of partially imprinted analysis model dependent variable (y) genotypic information 1A y = μ + bx a + e ePOEs un-ordered observed phenotype un-ordered and ordered Table 2. Overview of analyses using different models and genotypic information. The dependent variable y is either the parent-of-origin effect of a parent (ePOE), the transmitting ability of a parent (TA) or the observed phenotype of an offspring. In analysis 3, the ePOEs were deregressed, parent-average (PA) corrected (3B), and weighted (3C); x a is the gene content of an un-ordered genotype (parental allele origin unknown) at a marker locus; x p has the values of 0, 1, −1, and 0 for the ordered genotypes BB, Bb, bB and bb (parental allele origin known); b, b a , and b p are the regression coefficients; the random variables u, g s (effect of the father's gamete) and g d (effect of the mother's gamete) account for the stratification of the population into families; μ is the general mean and e is the residual.
ScIeNtIFIc REPORTS | (2019) 9:654 | DOI:10.1038/s41598-018-36939-3 QTLs. The same is true for the model for observed phenotypes in analysis 2B, that did not consider imprinting, since only an additive effect was included. At all marker loci, the mean test statistics indicated the existence of QTLs without the possibility to distinguish Mendelian and imprinted loci, exactly as was the case with TAs (Table 3). With regard to the estimation of marker effects, estimates were of similar magnitude at all marker loci when ePOEs were used as dependent variables in analysis 1A and when the model for phenotypes included an imprinting effect in analysis 2A ( Fig. 2; Supplementary Table S1). Note, however, that the sign was reversed. The sign of ePOEs depends on their definition in the imprinting model 27 , which can be either the genetic effect as father minus the genetic effect as mother or the reverse. Neither the estimation of the imprinting variance nor the detection of iQTLs depends on the sign of that difference as it has no effect on the absolute value of the marker effect and its estimate. When, in contrast to ePOEs, estimated TAs were used as dependent variables (analysis 1B), the estimates were about half the magnitude of those observed when the model for phenotypes included an additive effect in analyses 2A and 2B ( Fig. 2; Supplementary Table S1). Since a TA equals half the breeding values, this was expected.
To illustrate the effects of the PA-correction, deregression and weighting, the ePOEs from the three-generation data were used in analyses 3A, 3B and 3C. In analysis 3A, 'untreated' ePOEs of parents were regressed on their own un-ordered marker genotypes. Again, the mean p-values indicated that ePOEs can be used to detect imprinted loci, although a further loss of power must be noted in comparison to analysis 1A ( Table 3). The loss of power may be explained by a loss of LD between the QTLs and their markers due to more recombination events since the base generation until alleles are transmitted to the last (third) generation. With an initial LD of 0.1 among founders, their distance was large (10 centiMorgans), so the LD decreased rapidly from one generation to the next. As the ePOEs of individuals in generation two showed an undesired regression to their PA with an average reliability of 0.80, the ePOEs were deregressed and PA-corrected in analysis 3B. In comparison to the results in analysis 3A, only minor changes in the mean F-values and p-values were observed (Table 3). In contrast, the deregression and PA-correction increased the mean estimates of marker effects by almost one-third so that they approximated the estimates observed when ePOEs from the two-generation data were used as dependent variables in analysis 1A ( Fig. 2; Supplementary Table S1). Due to their heterogeneous variance, ePOEs were additionally weighted in analysis 3C. With regard to the test statistics, no particular differences could be observed in comparison to analysis 3B (Table 3). Moreover, neither the mean marker effect estimates nor their variation and standard errors differed ( Fig. 2; Supplementary Table S1). To calculate the weighting of ePOEs, a c-parameter was needed. An examination of the mean c-parameters across all replications demonstrated that a grid-search, which was performed to choose a suitable c-parameter, did not favour c-values in a certain range because no particular differences could be observed between markers. However, the standard deviations reflected strong fluctuations among replications. The log-likelihoods changed little in relation to the development of c (Table 3). This indicates a flat log-likelihood function and thus a low sensitivity to c. According to Garrick et al. 41 , this sensitivity depends on the heterogeneity of the information content in the data. Thus, due to the straightforward simulation design, a low sensitivity could have been expected from the outset. To summarise, no particular changes in the test statistics could be observed due to deregression, PA-correction or weighting of ePOEs. With regard to the effect estimates, deregression and PA-correction did have an impact, whereas weighting had only a minor role ( Fig. 2; Supplementary Table S1).
Brown Swiss data. Parent-of-origin effects. Of 37,443 single nucleotide polymorphisms (SNPs), one SNP (ARS-BFGL-NGS-101636) on the Bos taurus autosome (BTA) 11 was associated with ePOEs estimated in the net BW gain assuming a genome-wide false discovery rate (FDR) of 5% ( Fig. 3; Table 4). As shown in Supplementary  Fig. S1, SNP ARS-BFGL-NGS-101636 can be assigned to an intron of REEP1. The imprinting status of REEP1 is unknown. This applies also to its human and mouse orthologs. Imumorin et al. 44 identified bovine iQTLs in regions containing orthologs of imprinted genes in mouse and human. One bovine ortholog (HADHB) was located on BTA11 and was found in a region harbouring iQTLs with an effect on weaning weight. However, this gene is located about 24.75 mega base pairs (Mb) away from the marker locus ARS-BFGL-NGS-101636. Furthermore, ARS-BFGL-NGS-101636 displayed low LD with its surrounding markers (r 2 < 0.20). The highest LD (r 2 = 0.34) was calculated for the SNP BTA-97072-no-rs, located in a non-coding region at 47.52 Mb (Supplementary Fig. S1). Thus, a causal variant may be expected within or very close to REEP1. Mutations in the human ortholog of the bovine REEP1 are related to neurodegenerative disorders such as hereditary spastic paraplegia, a syndrome characterised by progressive lower-limb spastic paralysis 45 . Furthermore, Reep1 expression has been suggested to regulate the adipogenesis in white adipose tissue in mice. In Reep1-null mice, the expression of pro-adipogenesis markers was reduced, whereas the expression of anti-adipogenesis markers was upregulated. Thereby, Reep1-null mice were observed to be thinner, albeit not lighter, than their wild-type counterparts. Males showed a significant decrease in the proportion of total adipose tissue 46 .
To investigate the effect of c within a real data framework, we scanned the genome for ePOEs on the net BW gain, varying this parameter. With a change from c = 0.1 to c = 0.8, the estimated effects increased, whereas the  p-values decreased (Table 4). Thus, as expected, real data with an increasing amount of heterogeneous information leads to a higher sensitivity to c.
In addition to ARS-BFGL-NGS-101636, an association significant assuming a chromosome-wide FDR of 5% was found for BTA-97072-no-rs nearby on BTA11 when c equalled 0.5. When c equalled 0.8, further significant markers (chromosome-wide FDR of 5%) were detected on BTA24 ( Fig. 3; Table 4). Except for two SNPs, they are closely located in an area containing the genes MYOM1 and YES1. The first encodes the Myomesin-1 protein, which was found, among others, to provide a scaffold of myosin filaments 47 . The latter is involved in cytokinesis and cell cycle mechanisms 48 . The imprinting status of these genes is not known in cattle or in mice. However, with regard to the YES1 ortholog in mice, an adjacent gene (Il6) on chromosome 5 at 30.01 Mb is imprinted 49 .
With regard to muscularity (L) , two SNPs significant assuming a chromosome-wide FDR of 5% ( Fig. 3; Table 4) were found in non-coding areas on BTA7 (74.53 Mb) and BTA18 (35.58 Mb). While the existence of imprinted genes on BTA7 is unknown, four maternally imprinted genes on BTA18 have been reported 49 . However, these genes are clustered in a region from 64.26 Mb to 64.54 Mb. SNPs within this cluster and the SNP found on BTA18 were independent, with r 2 < 0.02. Furthermore, the significance for the SNP found on BTA18 could not be reproduced when the ePOEs estimated for muscularity (T) were analysed. Instead, a significant SNP (chromosome-wide FDR of 5%) was found in a non-coding area on BTA7. As this marker is distantly located at 38.41 Mb, no relationship can be assumed to the SNP found on BTA7 when ePOEs on muscularity (L) were analysed. With regard to fatness (L) , no significant SNPs were found. However, for fatness (L) the smallest number of records was available because 798 ePOEs were discarded from the analysis due to their low reliabilities. This might have considerably reduced the power to detect imprinted loci. With regard to fatness (T) , 45 SNPs significant assuming a 5% chromosome-wide FDR were found on BTA5. As described in detail later, a series of these SNPs were also observed to be significantly associated with the parental TAs in fatness (L) and fatness (T) . The strongest signals were displayed by SNPs surrounding TROAP and FAIM2. These genes are located in the regions from 30.65 Mb to 30.66 Mb and 30.15 Mb to 30.18 Mb, respectively. Some SNPs, being significantly associated with ePOEs, are located within or proximal to these genes (Table 4). So far, no imprinted loci on BTA5 are known in cattle. However, the gene Slc38a4 is known to be imprinted in mice 49 . The bovine ortholog is located on BTA5 close to the genes TROAP and FAIM2 in a region from 33.61 Mb to 33.65 Mb. In a comparative expression Figure 3. Marker loci in relation to their −log 10 p-values generated by regressing estimated parent-of-origin effects (ePOEs) on marker genotypes. Net body weight (BW) gain was analysed with varying c-parameters of 0.1, 0.5 and 0.8. The ePOEs for carcass muscularity and carcass fatness were generated using a linear (subscript L) and threshold model (subscript T). Red line = 5% genome-wide false discovery rate (FDR); dashed line = 10% genome-wide FDR; blue diamonds = markers significant assuming a chromosome-wide FDR of 5%.   The LD to SNPs proximal to PLAGL1 (r 2 ≤ 0.03) and to SNPs within IGF2R (r 2 ≤ 0.02), however, suggests that ARS-BFGL-NGS-36861 is a fully independent signal. Another significant SNP (chromosome-wide FDR of 5%) was found in a non-coding region at 13.95 Mb on BTA29. An imprinting cluster is located on BTA29 incorporating the genes IGF2 and H19. However, this cluster is distantly located in the region from 49.32 Mb to 50.15 Mb. To conclude, our results identify potentially imprinted genes on BTA5 and BTA11, affecting the carcass fatness and net BW gain, respectively. Their imprinting status cannot be established from the results of this study alone. Moreover, as ePOEs were used as dependent variables, it remains unclear whether their effects actually arose from genomic imprinting or another parent-specific genetic phenomenon (e.g. maternal genetic effects). To answer this question, follow-up studies are necessary to gain deeper insights.
Transmitting abilities. Due to strong similarities between the results for the TAs as sire and the TAs as dam, findings for the latter are subsequently omitted from Tables. The same holds true for the results found in traits analysed via the threshold model. Moreover, only markers found to be significant assuming a 5% and 10% genome-wide FDR are listed (Table 5) Table S2; Supplementary Table S3).
With regard to net BW gain, only one SNP, located in a non-coding region on BTA28, was found to be significantly associated with the parental TAs assuming a chromosome-wide FDR of 5% ( Fig. 4; Supplementary Fig. S4; Supplementary Table S2; Supplementary Table S3).
With regard to muscularity (L) , six SNPs on BTA10 were associated with the parental TAs assuming a 5% genome-wide FDR ( Fig. 4; Table 5). Except for one SNP, these findings were reproduced for both parental TAs estimated in muscularity (T) . Three SNPs can be assigned to the gene UNC13C and one SNP to the gene AQP9. Their effects are of similar magnitude (Table 5), and slightly higher effects were observed when the TAs as dam were analysed (Supplementary Table S3). This holds true for muscularity (L) as well as for muscularity (T) . Both genes are protein coding; however, their functions are still unclear in cattle. They are located in the regions from 52.15 Mb to 52.19 Mb (AQP9) and from 55.71 Mb to 56.41 Mb (UNC13C). Moderate lambdas (up to 1.042 for the TAs as sire) indicate almost no genome-wide inflation of p-values 51 , as shown for the muscularity traits in the Q-Q plot in Fig. 4. However, SNPs located in UNC13C and AQP9 are not fully independent, with r 2 > 0.24 (for LD patterns and gene assignments on BTA10 see Supplementary Fig. S3). Therefore, whether causal variants are located in either UNC13C, AQP9 or in both cannot clearly be determined. According to the CattleQTLdb (Release 32 52 ), a QTL associated with lean meat yield was found in Holstein-Friesian on BTA10 at 56.10 Mb, a region which is located within UNC13C 53 . Apart from the findings on BTA10, further markers significantly associated with the TAs estimated in the muscularity traits were found on BTA20, BTA23 and BTA28. Associations on BTA20 were only found for the TAs as dam assuming a chromosome-wide FDR of 5%. They cannot be assigned to genes. However, Doran et al. 53 detected SNPs on BTA20, which were strongly associated with carcass conformation and were <1 Mb away from the gene GHR. In the current study, one of the SNPs on BTA20 is located only 41.75 Kb away from GHR.  Table 4. Single nucleotide polymorphisms (SNPs) significantly associated with parent-of-origin effects (ePOEs) estimated for net body weight (BW) gain, carcass muscularity and carcass fatness. Net BW gain was analysed with varying c-parameters of 0.1, 0.5 and 0.8. The ePOEs for carcass muscularity and carcass fatness were generated using a linear (subscript L) and threshold model (subscript T). The estimated SNP effects are provided with standard errors (se) and p-values. Their base pair (bp) positions on chromosomes (chr) and known genes are indicated. ***5% genome-wide false discovery rate; *5% chromosome-wide false discovery rate.
ScIeNtIFIc REPORTS | (2019) 9:654 | DOI:10.1038/s41598-018-36939-3 A SNP on BTA23 was found to be significantly associated with both parental TAs estimated in muscularity (L) and muscularity (T) with a 5% genome-wide FDR. However, it is located within a non-coding region, and markers found to be significantly associated with carcass conformation in Doran et al. 53 Table 5. Single nucleotide polymorphisms (SNPs) significantly associated with transmitting abilities (TAs) as sire estimated for net body weight (BW) gain, carcass muscularity and carcass fatness. The TAs for carcass muscularity and carcass fatness were generated using a linear (subscript L) and a threshold model (subscript T). The estimated SNP effects are provided with standard errors (se) and p-values. Their base pair (bp) positions on chromosomes (chr) and known genes are indicated. ***5% genome-wide false discovery rate; **10% genomewide false discovery rate. With respect to fatness (L) , 22 SNPs on BTA5 were found to be associated with the TAs as sire assuming a genome-wide FDR of 5%. The smallest p-values (p < 2.6e-08) were displayed by markers surrounding TROAP located in a region from 30.65 Mb to 30.66 Mb. Overall, 41 further SNPs on BTA5 were significant assuming a genome-wide FDR of 10% or a chromosome-wide FDR of 5%. They were replicated for the TAs as dam as well as for the parental TAs estimated in fatness (T) . Some SNPs are located within genes other than TROAP (Table 5). However, due to their moderate to high LD to SNPs located within TROAP, most of them may not be causal but indicators for causal variants within or proximal to TROAP (Supplementary Fig. S2). According to the CattleQTLdb (Release 32 52 ), haplotypes with an effect on backfat breeding values are closely located to TROAP in a region from 32.6 Mb to 34.2 Mb 54 . Further significant SNPs (genome-wide FDR of 5% and 10%) were found on BTA2, BTA13, BTA16 and BTA19 when the TAs as sire estimated in fatness (L) were used as dependent variables ( Fig. 4; Table 5). The SNP on BTA13 is located in the gene DZANK1, which is positioned in an area from 38.78 Mb to 38.83 Mb. McClure et al. 55 found a QTL with an effect on fat thickness within this area (CattleQTLdb; Release 32 52 ). With regard to SNPs solely found when TAs as dam were analysed, a SNP significant assuming a 5% chromosome-wide FDR was revealed for fatness (T) on BTA29 (Supplementary Fig. S4; Supplementary Table S3). The SNP (13.59 Mb) is located far from IGF2 (50.04 Mb to 50.06 Mb). Therefore, a connection seems unlikely.
To conclude, the strongest indications for markers associated with TAs observed in the muscularity and fatness traits were found on BTA10 and BTA5, respectively. These signals mainly pointed to variants located within or proximal to the genes UNC13C (BTA10) and TROAP (BTA5). Scanning the genome for effects on TAs, which were estimated in the muscularity and fatness traits via linear and threshold models, led to results of high consistency. This suggests -at least for the categorical traits under consideration -that the linear and threshold models perform equally well at capturing the underlying genetic variation.

General Discussion
From the single-locus population model with imprinting it was derived that the regression of true POEs on allele counts equals the imprinting effect. This was the foundation for our simulation study, by which estimates of POEs were related to marker alleles. In a two-step procedure, ePOEs and their reliabilities were first obtained by employing appropriate mixed models. In this way, information on parent-of-origin-specific genetic effects was extracted from un-genotyped offspring and summarised for genotyped parents. In a second step, the deregressed ePOEs then served as dependent variables to be regressed on allele counts of marker alleles, indicating the association of an iQTL with the marker. Most importantly, in this way, the complete lack of any genotype information for progeny and the phenotypes of the parents could be overcome. This is a substantial step forward compared to previously published methods for association mapping of imprinted loci, as these require data where phenotypes and ordered genotypes are known for the same individuals (GRAMMAR 32,33 ; measured-genotype approach 43 ). Available marker information could, in principle, already be integrated in the estimation of ePOEs by replacing the numerator relationship matrix in the mixed model through a combined marker and pedigree-based relationship matrix H 56 . Following Wang et al. 57 , ePOEs of genotyped parents could then be transformed into marker effects and potential QTL-regions be made visible. However, significance testing of such marker effects has, to our best knowledge, not yet been described in the literature. The Brown Swiss beef traits data set is a meat animal example, where phenotype-providing progeny never become parents and remain un-genotyped. In livestock genetics, other candidate data show the same characteristics, e.g. in dual-purpose Simmental 28 or Gelbvieh cattle 58 . In other cases, however, accumulating data from ongoing breeding programmes may often also comprise a certain proportion of phenotyped progeny that also has ordered genotypes available (e.g. Guo et al. 59 ) and, to a certain fraction, may also be parents themselves (e.g. Jiang et al. 60 ). This calls for analyses of mixed data, where the vector of dependent variables contains ePOEs for genotyped parents, as a summary of information from un-genotyped progeny, and the phenotypes of genotyped progeny. Their respective counterparts on the explanatory side of the model are marker allele counts and ordered genotype information. Thus, including ePOEs in marker-based imprinting analyses enables researchers to make full use of available information which otherwise would remain un-accessible.
How power and cost effectiveness of the measured-genotype approach 43 (with genotypes on progeny only) and our new approach compare depends on several factors. For the use of ePOEs sufficient data of adequate structure (sires and dams have to be related) are required to estimate variance components. Another important point is the average reliability of ePOEs. With genotyped progeny we also need to assign resources to the genotyping of their parents, otherwise we cannot derive the parental origin of marker alleles. In general, our new approach is obviously well suited for further studies of the importance of POEs in species where large progeny groups can be produced or are available and their phenotypic information can be combined into reliable estimates of parental POEs. This is especially the case in plants and some animal species with large half-sib families, such as horses, cattle or birds.
Some authors have stressed the fact that POEs may be caused by other phenomena than imprinting alone, e.g. by maternal genetic effects 61 . Accordingly, ePOEs may pick up such effects, especially when not explicitly accounted for in the model for POE-estimation (e.g. Blunk et al. 27,28 ). Other potential confounders are Y-chromosomal and mitochondrial effects, which, however, seem to be negligible at least for beef traits 24,25,62 . However, if present, the latter would not show any signal of association with purely autosomal markers, as investigated in the Brown Swiss analysis. Moreover, model-specification is also important for the ability to find associations between different kinds of imprinted loci and markers. Models that account for all kinds of genomic imprinting simultaneously, no matter if paternal, maternal, fully or partial, have not been described in the literature before 24,25 . When such models are used for POE-estimation, all types of imprinted loci can, in principle, be located in the genome, as was successfully demonstrated in the simulation study. Nevertheless, follow-up studies are necessary, as for other approaches like mapping in line-cross experiments 39 , in order to confirm both genomic imprinting and its exact type.
As shown for breeding values in Ekine et al. 42 , preprocessing the ePOEs is necessary to achieve reliable marker effect estimates and to avoid large false-positive rates. With regard to the weighting, c was limited to 0.1 to analyse the muscularity and fatness traits. When the ePOEs on net BW gain were analysed, c was varied, which led to some changes in the p-values and estimated effects. As this indicates a degree of sensitivity to c, varying this parameter with regard to the muscularity and fatness traits might have resulted in the detection of additional associations. However, as no effect of the weighting was observed in the simulation study, it remains questionable whether changing c actually results in true associations or in an inflation of effects. Thus, whether the associations found on BTA24 for ePOEs on the net BW gain constitute true associations or false-positives remained unclear when c was set to 0.8. Analyses with greater amounts of data would help investigating the true imprinting status of loci, especially regarding the SNPs found on BTA24.
Note that the approximate method of deregression and PA-correction according to Garrick et al. 41 did not allow the deregression of two genetic effects simultaneously, which would, however, be appropriate using models that consider imprinting by including two genetic effects. In an article on the international genetic evaluation of beef cattle weaning weight 63 , the separate deregression of direct genetic effects and maternal genetic effects was described. A correction was presented, in which the contributions of the second correlated effect to the variance of the first are eliminated, when the first genetic effect is deregressed. With the help of a λ-value, corrected heritability was computed both for the direct and maternal effect, which can be interpreted as equivalent heritability under a single trait model 63 . The same principle could be applied to the reduced imprinting model (for details see the Supplementary Material). However, using an adapted λ-value in the approximation method by Garrick et al. 41 did not have any impact as the λ-value was cancelled out (proof is provided in the Supplementary Material).
In summary, the possibility to detect imprinted loci by regressing ePOEs of parents on the gene counts of their un-ordered marker genotypes has not been recognised before. Fields of application are, in principle, all kinds of pedigreed populations, be it animals or plants, where genotyping of parents has become part of routine breeding procedures, or, possibly even in human populations with available detailed genealogies 64 . Apart from whole-genome scans, targeted analyses may also concentrate on associations in and around known imprinted genes. Further research will aim at more general analyses that include phenotypes from individuals also providing ordered genotype information. In any case, vast repositories of already collected phenotypes will now become accessible for imprinting analyses by applying our new approach.

Methods
Simulated data. To estimate the ePOEs (needed for the analyses 1A, 3A, 3B, and 3C) for each parent together with their respective reliabilities, the phenotypic information from the progeny in the last generations were summarised by applying a reduced imprinting model (as described in Blunk et al. 27 ). Estimates for the TAs (needed for analysis 1B) were derived from a reduced animal model that assumes no imprinting (as described in Neugebauer et al. 24,25 ). Then, the model: numerator relationship matrix A was required. Matrix A only contained the relationships between the animals with genotypes and was generated using the kinship2 R-package version 1.6.4 75 in R 76 . Except otherwise specified, all positions of SNPs and further genetic information were obtained from the UCSC Genome Browser (http://genome.ucsc.edu/) based on the Bos Taurus UMD3.1.1/bosTau8 (assembly date: Dec. 2009). Pairwise LD between SNPs was specified as r 2 estimated using Haploview (version 4.2 77 ). Information about the known imprinting status of genes was derived from the geneimprint database 49 .

Data Availability
Simulated data. The code used to generate the simulated data is freely available from the corresponding author upon request. The simulation results as well as corresponding programmes to visualise these results (F-values, p-values, estimated marker effects and standard errors for each marker and replication) are available at: https://doi. org/10.22000/81. Brown Swiss data. The Brown Swiss data merely served as an application example for our new method. The datasets are not publicly accessible as they were made available on a confidential basis due to commercial sensivity. The data would be available from our co-author Henning Hamman upon reasonable request but restrictions apply to the availability of these data (Material Transfer Agreement). Programmes to visualise the results as well as locations of SNPs and their p-values are, however, available for all traits at the above mentioned link.