Single-step genome-wide association study for social genetic effects and direct genetic effects on growth in Landrace pigs

In livestock social interactions, social genetic effects (SGE) represent associations between phenotype of one individual and genotype of another. Such associations occur when the trait of interest is affected by transmissible phenotypes of social partners. The aim of this study was to estimate SGE and direct genetic effects (DGE, genetic effects of an individual on its own phenotype) on average daily gain (ADG) in Landrace pigs, and to conduct single-step genome-wide association study using SGE and DGE as dependent variables to identify quantitative trait loci (QTLs) and their positional candidate genes. A total of 1,041 Landrace pigs were genotyped using the Porcine SNP 60K BeadChip. Estimates of the two effects were obtained using an extended animal model. The SGE contributed 16% of the total heritable variation of ADG. The total heritability estimated by the extended animal model including both SGE and DGE was 0.52. The single-step genome-wide association study identified a total of 23 QTL windows for the SGE on ADG distributed across three chromosomes (i.e., SSC1, SSC2, and SSC6). Positional candidate genes within these QTL regions included PRDM13, MAP3K7, CNR1, HTR1E, IL4, IL5, IL13, KIF3A, EFHD2, SLC38A7, mTOR, CNOT1, PLCB2, GABRR1, and GABRR2, which have biological roles in neuropsychiatric processes. The results of biological pathway and gene network analyses also support the association of the neuropsychiatric processes with SGE on ADG in pigs. Additionally, a total of 11 QTL windows for DGE on ADG in SSC2, 3, 6, 9, 10, 12, 14, 16, and 17 were detected with positional candidate genes such as ARL15. We found a putative pleotropic QTL for both SGE and DGE on ADG on SSC6. Our results in this study provide important insights that can help facilitate a better understanding of the molecular basis of SGE for socially affected traits.


Results and discussion
Variance components and genetic parameters of SGE and DGE. The (co)variances and parameters obtained from the studied model for Landrace pigs are presented in Table 1. The total heritability ( T 2 ) estimated by the extended BLUP model including both DGE and SGE was 0.52 in our Landrace population. This value was greater than classical heritability ( h 2 ) in this pig breed (0.36). Moreover, the SGE, (n−1) 2 σ 2 a S , contributed 16% of the total heritable variation ( σ 2 TBV ) of ADG, which is available for response to selection in Landrace pigs. The correlation coefficient between DGE and SGE was moderate ( r a DS = 0.23). The h 2 and T 2 estimates in our study were higher than those of the previous studies 7,8 . An explanation of the observed differences is due to the fact that the breeder of these maternal lines for pig selection have focused on reproductive traits, such as litter size, there has been little chance of genetic improvement for growth. Bergsma et al. suggested that the absence of competition between mate growth and an individual's own growth might be a result of neutral or marginally cooperative social interactions 8 . According to Bijma et al., the positive covariance between direct variance and indirect (i.e., social genetic) variance is likely to increase the σ 2 TBV , which well corresponds to the findings of Table 1. Estimates of variance components and genetic parameters for average daily gain in Landrace pigs. a σ 2 a D , direct genetic variance; σ 2 a S , social genetic variance; σ 2 p , phenotypic variance; σ 2 TBV , total heritable variance; h 2 = σ 2 a D / σ 2 P , classical heritability; T 2 = σ 2 TBV / σ 2 P , total heritability for the extended BLUP model including SGE; r a DS , correlation between direct and social genetic effects; r lk , correlation between birth litter and early-life environmental effects; σ 2 c , random pen variance; σ 2 g , random group variance; σ 2 l , random litter variance; σ 2 k , random early-life environmental variance; σ 2 pe , random permanent variance; σ 2 e , residual variance; σ a DS , covariance between direct and social genetic effects; σ lk , covariance between birth litter and early-life environmental effects.

GWAS of SGE and DGE on ADG. SGE on ADG.
Non-overlapping 1-Mb windows that explained more than 0.5% of the additive genetic variance were determined to be QTLs for SGE on ADG in this study. The Manhattan plot of the single-step GWAS (ssGWAS) for SGE is shown in Fig. 1A. Following this criterion, we identified a total of 23 QTL windows for SGE on ADG distributed across three chromosomes (i.e., SSC1, SSC2, and SSC6) (Supplementary Table S1). The top QTL, which explained 5.36% of the additive genetic variance, was mapped on SSC1 at the 61-62 Mb region. However, this window did not harbour any known gene. Among the 22 other QTLs, sixteen were also mapped on SSC1. The QTL window on SSC1 at 66-67 Mb explained the 2nd highest percentage of additive genetic variance (4.96%) for SGE on ADG and included a positional candidate gene, PRDM13. It has been known that GABA (gamma-aminobutyric acid) plays a critical role in the control of neurotransmitters as an inhibitor of neuronal activities related with neuropsychiatric processes 20 . PRDM13 is required to generate the precise number of GABA (gamma-aminobutyric acid) associated neurons (i.e., GABAegic neurons) 21 . There is a report that a microdeletion of PRDM13 locus has been implicated in autism and intellectual disability in human 22 . The QTL www.nature.com/scientificreports/ window on SSC1 at 58-59 Mb includes MAP3K7, which is known to be involved in autophagy. Autophagy is known to be related to psychiatric diseases such as schizophrenia 23 . CNR1, which encodes the type 1 cannabinoid receptor protein, is localised to the QTL window on SSC1 at 55.6-56.6 Mb. This window accounts for 3.52% of additive genetic variance for SGE on ADG. The endocannabinoid system plays a crucial role in the regulation of neurological activity throughout the central nervous system 24 . This gene is known to be associated with neurological phenotypes in mice 25 and humans 26 . Also on SSC1, another QTL window at 56.9-57.9 Mb, accounting for 1.21% of additive genetic variance, harboured GABRR1 and GABRR2, which encode receptors for GABA. The GABA makes use of its effects via their receptors. Hence, GABA receptors influence neurological and behavioural phenotypes in many animal species [27][28][29] . It was reported that GABRR2 is a plausible positional candidate gene on SGE of feed conversion rate in Yorkshire pigs 19 . The QTL window on SSC1 at 54.4-55.4 Mb, accounting for 1.09% of additive genetic variance, harboured HTR1E, which encodes 5-hydroxytryptamine 1E receptor protein.
The 5-hydroxytryptamine, which is also known as serotonin, plays an important role in mood control 30 , HTR1E has been implicated in autism spectrum disorders 31 . On SSC1, the QTL window at 51.5-52.5 Mb, explaining 0.58% of additive genetic variance, contained RIMS1. The protein encoded by RIMS1 is a RAS gene superfamily member that controls synaptic vesicle exocytosis 32 and influences behavioural traits linked to schizophrenia 33 .
On SSC2, only one QTL at the 134-135 Mb region was identified. This QTL region accounted for 0.55% of additive genetic variance for SGE on ADG. Three genes are located in this region that encode IL4, IL5, and IL13. These interleukins are involved in T cell receptor signaling pathway 34 . It is well documented that T cells are linked to the homeostasis of the central nervous system and therefore may be related to neurological or psychological traits 35,36 . KIF3A was also located in this region; this gene encodes the kinesin-like protein 3A, which has been implicated in a disorder involving motor neuron degeneration 37 . Furthermore, this gene was identified as a component of gene networks associated with conditional fear phenotypes in mice 38 .
On SSC6, five windows were identified as QTLs. The most significant QTL on SSC6 was located in the 73.7-74.7 Mb region, and explained 0.95% of additive genetic variance for SGE on ADG. EFHD2, which encodes EF hand domain containing 2 protein, is located in this QTL. EFHD2 is expressed in human and mice brains. EFHD2 protein is a Ca 2+ sensor protein which is likely involved in synaptic plasticity and affects various behavioural traits in mice 39 . Another QTL window at 19.9-20.9 Mb, which accounted for 0.87% of additive genetic variance, contained SLC38A7, which encodes solute carrier family 38 member 7 (also known as the SNAT7 protein). Hägglund et al. reported that SLC38A7 is a sodium-coupled amino acid transporter in glutamatergic neurons in the brain 40 , and it has been implicated as a risk-conferring glutamatergic gene in schizophrenia 41 . In the QTL region of SSC6:70.8-71.8 Mb, the mammalian target of rapamycin (mTOR) gene was found to be associated with SGE on ADG. mTOR is a broadly expressed serine-threonine kinase that coordinates major cellular processes, including synaptic and autophagy activation. Altered mTOR signal transduction is known to be associated with neuropsychiatric disorders such as schizophrenia 42 .

DGE on ADG.
Eleven QTL windows that affected DGE on ADG were identified by ssGWAS. These QTLs were located on SSC2, 3, 6, 9, 10, 12, 14, 16, and 17 ( Fig. 1B, Supplementary Table S2). Among the identified QTLs, the top QTL on SSC9 (128.2-129.2 Mb) accounted for 1.10% of the additive genetic variance of DGE on ADG. No obvious positional candidate gene for growth-related traits was found in this QTL region. The QTL that explained the 2nd highest percentage of additive genetic variance (1.05%) for DGE on ADG was located at 33-34 Mb of the SSC16, and included a positional candidate gene, ARL15 encoding a small GTP-binding protein. In humans, the ARL15 locus was found to be associated with plasma insulin, HDL cholesterol, adiponectin levels, and obesity. Moreover, a functional characterisation revealed that ARL15 influences adiponectin secretion and adipocyte differentiation 43 .
To evaluate the presence of putative pleiotropic QTL regions associated with SGE and DGE on ADG, we compared the two GWAS results; only one QTL (i.e., SSC6: 19.9-20.9 Mb) was co-localised for SGE and DGE in this study. The previously mentioned SLC38A7 located in this QTL window is expressed in GABAergic and other neurons in brain, and in liver and skeletal muscle. In addition to neuropsychiatric traits, this gene has been implicated in energy metabolism and cell growth 44 . Therefore, SLC38A7 is considered as a potential candidate gene for both SGE and DGE on ADG.
We previously reported the identification of several QTLs for SGE and DGE using a standard GWAS approach 45 . In the previous study, the mean phenotypic value of ADG for unrelated pen partners and the phenotypic value of individual ADG were used as the naïve SGE and DGE for the GWAS, respectively. Therefore, several other fixed and random effects were not previously corrected compared with the current study. However, an extended animal model was used to compute BLUP estimates of SGE and DGE on ADG in this study. Subsequently, the two BLUP estimates were used as dependent variables for the ssGWAS. We found that there is a lack of concordance between our previous and current GWAS results. No overlapping QTL was identified in SSC1 and SSC2. The previous study identified significant SNP markers in SSC6. However, these markers were not co-localised with the QTL identified in the current study. This lack of concordance may not be surprising, because we used two different types of dependent variables for the two different GWAS approaches. Nevertheless, we argue that the current ssGWAS results are more plausible, because the extend BLUP model in this study is likely to be better fit for the social interactions between pigs. Moreover, we identified a list of positional candidate genes involved in neuropsychiatric processes that are more relevant to SGE in this study.

Biological pathway and network analyses of SGE and DGE on ADG in pigs. SGE on ADG.
To evaluate the association of a curated set of genes (i.e., biological pathways) with the SGE on ADG, a total of 199 positional candidate genes, which are located within the QTL windows, were uploaded to the Enrichr database 46 . Functional annotation of the positional candidate genes to biological processes is presented in  GABRR1, GABRR2, CNR1, and PLCB2) were overrepresented in this pathway. As a complementary method to biological pathway analysis, a network-based analysis was conducted to establish gene network associated with the SGE on ADG using Ingenuity Pathway Analysis (IPA) 47 . The IPA identified a gene-gene interaction network with a score of 31. This network contained 18 of the positional candidate genes detected by ssGWAS (Fig. 2). Among these 18 genes, four (CNOT1, HYDIN, RIMS1, and PLCB2) of them were listed in the SZDB, which is a database for schizophrenia genetic research 48 . Moreover, the IPA revealed that this network is associated with cell morphology, developmental disorder, and nervous system development and function.

DGE on ADG.
To conduct biological pathway analyses for the DGE on ADG, a total of 108 positional candidate genes were used ( Table 3). The top biological pathway for the DGE on ADG was related to cysteine and methionine metabolism. Four genes (LDHB, SDS, SDSL, and GOT2) involved in this amino acid metabolism were overrepresented in this pathway. Castellano et al. reported that cysteine and methionine levels can affect preadipocytes proliferation and differentiation which can be relevant to fatness traits in pigs 49 . The Enricher database identified a signaling pathway that regulates pluripotency of stem cells for the DGE on ADG. Three genes (WNT9B, LHX5, and WNT3) were overrepresented in this signaling pathway. It is well documented that WNT protein-mediated signaling plays an important role in the pathogenesis of obesity 50,51 . Additionally, the mTOR signaling pathway was also detected by this analysis; this pathway included three genes (WNT9B, LAM-TOR1, and WNT3). The mTOR pathway orchestrates multiple major cellular processes, such as cell growth 52,53 . The IPA detected a gene-gene interaction network with score of 49. This network contained 23 of the positional candidate genes detected by ssGWAS for DGE on ADG (Fig. 3). The IPA elucidated that the functions of this network included cardiovascular system development and function, cellular assembly and organisation as well as cellular function and maintenance.

conclusion
Using an extended BLUP model, we found that genetic variation in pen mates (i.e., SGE) influences the variation in ADG in Landrace pigs. We also provide a list of QTLs and positional candidates associated with SGE and DGE on ADG using ssGWAS. The identified positional candidate genes for SGE on ADG (PRDM13, MAP3K7, CNR1, HTR1E, IL4, IL5, IL13, KIF3A, EFHD2, SLC38A7, mTOR, CNOT1, PLCB2, GABRR1, and GABRR2) have biological roles that are strongly associated with neuropsychiatric processes. Furthermore, the post-GWAS pathway and network analyses also supported the association of neuropsychiatric processes with SGE on ADG. This study contributes to our understanding of the molecular basis of SGE.

Materials and methods
Animals and phenotype data. The total numbers of animals in the pedigree of Landrace pigs were 23,152.
These pigs were born and raised in a closed nucleus (breeding) farm in the Republic of Korea. Both parents were known for a total of 22,983 individuals. The average inbreeding coefficient was 0.064. The range of inbreeding coefficients was 0.00009-0.304. The observed average family size was 4.12 with ranges of 2-23. The population structures of these breeds were determined using the CFC v1.0 software 54 . At the same closed nucleus farm, the phenotypic dataset for ADG was obtained by performance tests of Landrace (N = 21,554) pigs between years 2001 and 2015. A total of 4 − 13 pigs of the same sex were placed in each pen to form the groups of pigs. The average group size was 6.8 ± 1.9. As Hong et al. described, the performance evaluations of ADG of pigs started soon after each animal reached a live body weight of 30 kg, and continued until a target weight of 90 kg was attained; pigs were fed ad libitum, and water was constantly accessible through   55 . On average, fewer than 160 days were required to attain this target weight. The mean and standard deviation of ADG were 804 ± 91 g/day.
Genotype data. The genomic DNA of pigs was extracted from blood samples collected from jugular veins using a standard protocol. A total of 1,041 Landrace pigs were genotyped using the Illumina PorcineSNP60 v2 BeadChip panel, which included 61,565 SNP markers 56 . This population was previously used in GWAS for the naïve SGE 45 . The quality control process of the genotype data included the removal of individuals with pedigree errors, omission of monomorphic SNP genotypes, SNPs on sex chromosomes or SNPs with minor allele frequencies (< 0.95), genotype call rate of < 0.90, animal missing rate of > 0.90, Hardy-Weinberg equilibrium of 0.15, and the SNPs with displaced segregation distortion 57,58 . After quality control, the final dataset included genotypes from a total of 1,029 pigs. The total number of autosomal SNPs was reduced to 39,136 after 36.4% of SNPs from the original Illumina marker panel were removed.
Quantitative genetic analysis of SGE. The ADG trait was analysed by the extended animal model below. Bayesian inference using Gibbs sampling procedure was used to estimate (co)variance components of the studied trait. The effects of birth year-month (168 levels), sex (male or female), and group size (10 levels) were fitted as fixed effects. In the model, age at target weight was fitted as a covariate. The models also included the random effects of the physical pen (112 levels), group identity (3,635 levels), litter of birth (6,098 levels), and permanent environment of the mother (1,888 levels). Animals were fitted as a random effect in the model. where y is the vector of observations (ADG), b is the vector of the fixed effects, a D is the vector of the random additive DGE, a S is the vector of the random additive SGE, c is the vector for the random pen with c ∼ N 0, Iσ 2 c , g is the vector of the random group with g ∼ N 0, Iσ 2 g , pe is the vector for the random nongenetic permanent effects of the mother with pe ∼ N 0, Iσ 2 pe , l is the vector for the random birth litter, k is the vector for the random early-life environment (birth litter of its group mates), and e is the vector of the residuals with y = Xb + Z D a D + Z S a S + Wc + Vg + Tpe + Ul + Qk + e, Figure 3. Gene network of interactions between GWAS positional candidate genes using ingenuity pathway analysis (IPA). Inter-relationship among molecules were determined using information stored in the IPA repository. The blue label indicates the positional candidate genes from QTL windows for DGE on ADG.
Scientific RepoRtS | (2020) 10:14958 | https://doi.org/10.1038/s41598-020-71647-x www.nature.com/scientificreports/ e ∼ N 0, Iσ 2 e . X, Z D , Z S , W, V, U, T, and Q are the corresponding incidence matrices. I is an identity matrix of appropriate dimensions. To take into account differences in group size, as suggested by Canario et al. 7 , an additional covariate term known as dilution Average group size −1 Group size −1 was added to the SGE and early-life environmental effects.
DGE and SGE had the following multivariate normal (MVN) distribution: a D a S ~ MVN (0, C ⊗ A), in which C is defined by the matrix σ 2 a D σ a D a S σ a D a S σ 2 a S , σ 2 a S is the variance of social genetic effects, σ a D a S is the covariance between DGE and SGE, and C ⊗ H denotes the Kronecker product of two matrices. The relationship matrix H, in single-step evaluation, defines the relationship among genotyped and non-genotyped animals. The inverse of the H matrix is rather simple in structure 59,60 and can be given as: where A 22 is the matrix for only genotyped animals (a submatrix derived from the pedigree-based relationship matrix, A and G is the relationship matrix among individuals based on genomic information. In addition, birth litter and early-life environmental effects had the following multivariate normal (MVN) distribution: l is the variance of birth litter effects, σ kl is the covariance between birth litter and early-life environment effects, and σ k is the variance of early-life environment effects.
To fit a SGE model, GIBBS2F90 with Bayesian inference using Gibbs sampling was used 61 , and the Gibbs samplers were run as single chains 550,000 rounds. The first 50,000 rounds were discarded as burn-in with thinning every 50 samples. This resulted in a total of 10,000 samples used for post-Gibbs analyses, which were completed using POSTGIBBSF90 61 .
According to Bijma 62 , for traits affected by heritable social effects, the variance of total breeding value (TBV) represents the total heritable variation that is exploitable for selection. The TBV of the ith animal was defined as follows: where n indicates the average size of social groups. The TBV is the heritable effect of an individual on trait values in the population, which is the sum of its DGE ( a D,i ) on its own phenotype and its SGE ( a S,i ) on the phenotypes of its n − 1 group mates. Bijma (2011) also stated that the total heritable variance determines the population's potential in response to selection and can be expressed as 62 : According to Canario et al., the phenotypic variance for such a model can be calculated as follows 7 : The total heritable variance can be expressed relative to phenotypic variance (Bergma et al. 2008) as follows: ssGWAS. A ssGWAS, which jointly uses genotype, phenotype, and pedigree information in one step, was conducted on the vector of random additive SGE (i.e., a S ). In this ssGWAS approach, greater power and more precise estimates of variance components can be achieved by including non-genotyped animals if the number of genotyped animals is limited 63 . The G matrix was constructed with method 1 in VanRaden (2008) 64 : where Z is a matrix of gene content that contains genotype adjusted for allele frequencies; D is a diagonal matrix of weights for variances of SNPs (initially D = I); and q is a weighting factor. This factor can be derived by ensuring that the average diagonal in G is close to that of A 22 65 . The SNP effects and weights were derived as follows: 1. Let D = I in the first step. 2. Calculate breeding values 3. Convert breeding values to SNP effects û = DZ ′ ZDZ ′ −1â g , where â g is the breeding values of the animals which were also genotyped. 4. Calculate the weight for each SNP: d i = û 2 2p i 1 − p i , where i is the i-th SNP 5. Normalise SNP weight to retain the total genetic variance constant.
The breeding values, the D matrix and the SNP effects were iteratively recalculated over two iterations, as suggested by Wang et al. 63 .
The percentage of additive genetic variance explained by i-th region was computed as below:
Search for positional candidate genes. The windows (chromosomal segments) that accounted for equal to or greater than 0.5% of the additive genetic variance from GWAS were determined to be the QTL regions [66][67][68] . The QTL windows were identified and located for candidate genes using the Sus scrofa Build 11.1 assembly and tools available in NCBI.
Biological pathway and network analyses. The selected candidate genes were uploaded into the Enricher database to investigate relevant biological pathways 46 . The gene interaction network analysis was conducted using IPA tools as a complementary analysis to the Enricher biological pathway analysis 47 . The same list of candidate genes used for the pathway analysis were uploaded into the IPA for network analysis. These analyses were conducted with their default settings. ethical approval. All experimental protocols were approved by the Institutional Animal Care and Use Committee (IACUC) at the National Institute of Animal Science (NIAS), Republic of Korea. All methods in this study were carried out in accordance with relevant guidelines and regulations. Necessary approval was obtained from the IACUC of the NIAS, Republic of Korea (Approval number: NIAS20191709).