Combining transcriptomics and genetic linkage based information to identify candidate genes associated with Heterobasidion-resistance in Norway spruce

The Heterobasidion annosum s.l species complex comprises the most damaging forest pathogens to Norway spruce. We revisited previously identified Quantitative Trait Loci (QTLs) related to Heterobasidion-resistance in Norway spruce to identify candidate genes associated with these QTLs. We identified 329 candidate genes associated with the resistance QTLs using a gene-based composite map for Pinaceae. To evaluate the transcriptional responses of these candidate genes to H. parviporum, we inoculated Norway spruce plants and sequenced the transcriptome of the interaction at 3 and 7 days post inoculation. Out of 298 expressed candidate genes 124 were differentially expressed between inoculation and wounding control treatment. Interestingly, PaNAC04 and two of its paralogs in the subgroup III-3 of the NAC family transcription factors were found to be associated with one of the QTLs and was also highly induced in response to H. parviporum. These genes are possibly involved in the regulation of biosynthesis of flavonoid compounds. Furthermore, several of the differentially expressed candidate genes were associated with the phenylpropanoid pathway including a phenylalanine ammonia-lyase, a cinnamoyl-CoA reductase, a caffeoyl-CoA O-methyltransferase and a PgMYB11-like transcription factor gene. Combining transcriptome and genetic linkage analyses can help identifying candidate genes for functional studies and molecular breeding in non-model species.

Scientific RepoRtS | (2020) 10:12711 | https://doi.org/10.1038/s41598-020-69386-0 www.nature.com/scientificreports/ Genetic variation for resistance to H. annosum s.l. exists in Norway spruce [11][12][13][14][15][16] and there are no adverse correlations between resistance to Heterobasidion infection and growth or wood quality traits 12,13,16 . Hence, selection for resistance to H. annosum s.l. in breeding programs could lead to considerable gain 12 , without compromising other breeding achievements. Resistance to wood rotting pathogens, such as H. annosum s.l. in conifers, presents a challenge to tree breeders as the disease often manifests late in the tree's life, which makes phenotypic selection difficult and time consuming. Marker Assisted Selection (MAS) holds promises for increasing the gain in tree breeding and specifically in resistance breeding 17 . Yet, MAS has not been widely implemented within tree breeding programs mainly due to the difficulty in translating Quantitative Trait Loci (QTLs) analysis into operational MAS i.e. validation of potential markers 17,18 . Recently, one of the marker candidates, PaLAR3 was identified in the first Quantitative Trait Locus (QTL) analysis for resistance against H. parviporum (Fr.) Niemelä & Korhonen 14,15 , which is a member of the H. annosum s.l. complex and lives almost exclusively in Norway spruce 19 . The validation of PaLAR3, a gene that encodes for a leucoanthocyanidin reductase that is involved in the flavonoid-biosynthetic pathway 20 , was done through the integration of information of phenotypic, transcriptional, metabolic, and genetic evidence 15 . Similarly, two other reported markers that are ready to be used for MAS of trees with improved resistance to pathogens and pests in conifers 21,22 ; build on the integration of genetic information and complementing techniques.
The QTL analysis in Norway spruce for resistance to H. parviporum identified 13 QTLs linked with four traits related to host resistance 14 : lesion length at the inoculation site (LL), exclusion of the pathogen from the host after initial infection (E), infection prevention from establishing at all (IP) and fungal spread within the sapwood (SWG). The validated marker, PaLAR3, comprised one of the four QTLs associated with SWG 14,15 . In this study, we aimed to use the high degree of synteny and macrocollinearity within Pinaceae to identify novel candidate resistance genes associated with the QTLs for resistance to H. parviporum 14 . We used the composite map of the Pinaceae family and gene expression patterns in Norway spruce after challenging it with H. parviporum; therefore, we could identify candidate resistance genes for future validation and functional analyses.
First, we used a Pinaceae composite linkage map to identify additional Norway spruce candidate genes associated with already described resistance QTLs. Second, we predicted that the combination of genetic and transcriptional information, would enable us to identify candidate resistance genes of induced response for future analyses; allowing us to hypothesize that the genes in the QTL regions, which are likely to be important for controlling spread of H. parviporum, are also likely to be more strongly regulated by inoculation than by wounding alone. Previous studies have highlighted broad similarities in defence responses to H. parviporum infection and to wounding control compared to naïve material, although inoculated samples showed a heightened response 23,24 .
The composite map of the Pinaceae family 25 integrated published maps of Norway spruce 14 , Picea glauca and Picea mariana 26 and Pinus taeda 27 with genetic maps from multiple crosses of Pinus pinaster. QTL markers from the Norway spruce linkage map 14 were included in the Pinaceae composite map 25 . As the composite map is considerably denser than the Norway spruce map we chose to focus on markers in and around a significant QTLs for resistance identified by Lind et al. 14 to identify candidate genes in the Pinaceae composite map 25 . This summed up to 329 candidate genes associated with 12 genomic regions of Norway spruce. We further determined the transcriptional responses of these candidate genes at three and seven days after inoculation with H. parviporum. We detected 124 differentially expressed candidate resistance genes, including two putative NAC family transcription factor genes in response to H. parviporum inoculation compared to the control treatment of mechanical wounding.

Materials and methods
Identification of conifer candidate genes associated with QTL regions from a composite map. The markers in the Norway spruce QTL map 14 were integrated in the Pinaceae composite map 25 . The markers in the Norway spruce QTL map 14 were used to identify the corresponding QTL regions in the Pinaceae composite map 25 . The markers associated with the QTL regions were identified around a significant QTL LODpeak which contained significant markers (P value < 0.05) according to the Kruskal-Wallis test in Norway spruce linkage map 14 . The candidate genes between significant markers in the confidence interval, in the Pinaceae composite map were identified and selected as confidence interval candidate genes (CCGs) for future analyses (Supplementary file1). The candidate genes in between the next subsequent markers outside the confidence interval were also selected. These genes were referred as putative candidate genes (PCGs) (Supplementary file1). We chose to include PCGs because all the markers used in the Norway spruce QTL map 14 were not included in the Pinaceae composite map 25 . For some QTL regions this discripancy made it difficult to delineate the QTL region, thus the genes in the category represent genes which are suspected to associate with the QTL region. Therefore, differentially expressed PCGs could still be interesting in this experimental set up as a part of an induced defence system.
Identification of Norway spruce candidate resistance genes associated with QTL regions. The FASTA sequences of the unigenes derived markers corresponding to genes in regions of the genome associated with the resistance QTLs were downloaded from the P. pinaster unigene catalogue (https ://www.scbi.uma.es/ susta inpin edb/unige ns) and the most probable Norway spruce orthologues were identified by a blastN query (E-value cutoff:1e-3) in the Norway spruce gene catalogue (Pabies v1.0, www.conge nie.org), excluding low confidence candidate genes with less than 30% coverage. norway spruce materials. For the RNAseq study, six 7-year-old rooted cuttings of each of the genotypes S21K0220126 and S21K0220184, originating from a well-studied full-sib family (S21H9820005) of Norway spruce 11 inoculation experiment. Branches were artificially inoculated with H. parviporum (isolate Rb175) as previously described 23 . The same isolate was used to generate the QTL map in Norway spruce 14 . Briefly, branches were wounded with a five-mm cork borer and wooden plugs covered with mycelium from H. parviporum were attached to the wound with Parafilm®; control branches on the tree were also wounded and a sterile wooden plug was attached and sealed with Parafilm. Based on the difference in necrotic lesion length (LL) extension from the inoculation site after inoculation with H. parviporum the genotypes S21K0220126 and S21K0220184 (short and long, respectively, data not shown) were selected for RNA sequencing. For the RNAseq study, bark and phloem samples were harvested at three and seven days post-inoculation (dpi). At the time of harvest, bark surrounding the wounds and inoculation sites was cut into two sections and samples were collected at the inoculation site (A) 0-0.5 cm around the wound, and distal to the inoculation site (C) 1.0-1.5 cm. We used six ramets per clone and three inoculations per twig were done. The bark samples were frozen separately in liquid nitrogen and stored at − 80˚C until further use. For the qRT-PCR study sampling is described in detail elsewhere 23 , briefly one bark and phloem sample was taken for each treatment and time point from six separate full-sib genotypes.
RNA extraction, transcriptome sequencing and qRT-PCR. RNA extraction. Total RNA was isolated by using a modified CTAB extraction protocol 28 . The samples were treated with DNase I (Sigma-Aldrich) to eliminate contamination of genomic DNA. The RNA integrity was analysed by using the Agilent RNA 6,000 Nano kit (Agilent Technologies Inc.). transcriptome sequencing and bioinformatics analyses. Three biological replicates of clones S21K0220126 and S21K0220184 per treatment were used for Illumina sequencing. Sequencing libraries were prepared at the SNP&SEQ Technology Platform (SciLifeLab, Uppsala) using the TruSeq stranded mRNA sample preparation kit according to the manual TruSeq stranded mRNA sample preparation guide. Sequencing was done using HiSeq 2,500, paired-end 125 bp read length, v4 sequencing chemistry. The raw sequences were submitted to the Sequence Read Archive (SRA) portal (NCBI) under BioProject accession number PRJNA522265.
RNAseq analyses were performed with the Tophat-Cufflinks pipeline as previously described 29 . Briefly, Nesoni clip 0.97 (https ://githu b.com/Victo rian-Bioin forma tics-Conso rtium /neson i) was used to filter adaptors and lowquality bases. Illumina reads were filtered based on phred-scale with a quality score cut-off of 20, minimum adapter length match of 20, with maximum errors of one in the adaptor and reads shorter than 35 were discarded. A Bowtie reference from the 'Pabies1.0-all-cds.fna' was constructed, downloaded from the Norway spruce genome portal (https ://conge nie.org/) using Bowtie2 version 2.2.4 (https ://bowti e-bio.sourc eforg e.net/bowti e2/index .shtml ) to enable alignments to a reference database. The filtered read pairs were aligned to 'Pabies1.0-all-cds' reference gene model with Tophat version 2.0.13 30 . Cufflinks version 2.2.1 was used to assemble all transcripts of each sample with the results of the alignment from TopHat. Cuffmerge included in the cufflinks package was used to merge all assemblies. Cuffquant (https ://colet rapne ll-lab.githu b.io/cuffl inks/manua l/) calculated transcript abundance from the single assembly of the sample, and the aligned read files produced by the Tophat output were run separately for each sample. Cuffdiff was used for differential expression analysis using default settings 30,31 . qRt-pcR. One µg of total RNA was reverse transcribed to cDNA with the iScript cDNA Synthesis Kit (Bio-Rad) in a total reaction volume of 20 μl according to the manufacturer's instructions. A ten-fold dilution of the cDNA was stored at − 20 °C. cDNA equivalent to 25 ng of total RNA worked as template for each PCR reaction, using SSoFast EVAGreen Supermix (Bio-Rad). Primers for candidate genes were designed within the exons of the predicted candidate genes in the P. abies v 1.0 release using Primer3 software 32 with a melting temperature (Tm) between 60 °C and 61 °C. A final concentration of 0.15 μM of each primer (Supplementary Table 1) was used. The thermal-cycling condition parameters, ran on an iQ™5 Multicolor Real-Time PCR Detection system (Bio-rad) using the following cycling parameters: 98 °C for 2 min; 40 cycles of 98 °C for 5 s, 60 °C for 10 s. A melt-curve analysis followed the qRT-PCR reactions, to confirm that the signal was the result of a single product amplification. Primer amplification efficiency was determined by amplification of serial dilutions of cDNA from Norway spruce with PCR conditions described above. The relative expression was calculated from threshold cycle values (Ct) using the 2ΔΔCT-method 33 . Transcript abundance was normalized to the reference genes eukaryotic translation initiation factor 4A (elF4A) 34 and elongation factor 1-α (ELF1α) 23 . The gene expression experiments were done with six biological and two technical replicates. Gene expression data were analysed by analysis of variance (ANOVA) using a general linear model approach implemented in R-program (https :// www.r-proje ct.org/).

Results
Markers and candidate genes associated with resistance QTLs. To identify additional Norway spruce unigene derived SNPs markers in the already described resistance QTLs, we identified 369 P. pinaster unigene derived SNPs markers in the composite map 25 based on markers in the QTLs in Norway spruce 14 in the composite map 25 (Table 1). The P. pinaster unigene derived SNPs markers were used to query the Norway spruce gene catalogue. A total of 329 candidate genes previously not known to associate with the resistance QTLs were successfully identified, of which 83 were CCGs in between the significant markers within the confidence interval and 246 were PCGs in between the subsequent markers outside the confidence interval in Pinaceae composite  Table 2). The fractions of induced and repressed genes were similar in all comparisons. Of the 329 candidate genes selected in the Pinaceae composite map 25 , 298 were expressed (80 in CCG and 218 in PCG categories respectively), in at least one of the treatments in the RNAseq experiment (Supplementary Table 3). The candidate genes showed differentially expression at three and seven dpi after H. parviporum inoculation compared to wounding, of which 41 were differentially expression in CCG and 83 in PCG categories in the treatments in RNAseq experiment ( Fig. 1, Supplementary Table 3).

DEGs in QTLs associated with infection prevention. The QTLs associated with IP positioned on
LG1, LG2 and LG11 14 were identified in the composite map. Twelve expressed and six differentially expressed candidate genes (DEGs) associated with the QTL on LG1 (Fig. 1, Supplementary file 4). Four were upregulated adjacent to the inoculation site and two DEGs were downregulated at three dpi proximal to the inoculation site compared to wounding (Fig. 1). Most expression changes were quite moderate at LG1 with maximum log 2 fold change of 1.75. Fifteen expressed genes were associated with IP on LG2 (Supplementary file 4). Twelve of these were classified as CCGs and three as PCGs. None of PCGs were differentially expressed (Supplementary file 4). Four DEGs were associated with IP on LG2 were identified as CCGs (Fig. 1, Supplementary file 4). Two DEGs were upregulated at three dpi at the inoculation site, and one DEG was still upregulated at seven dpi. Again, most changes in expression patterns were quite moderate on LG2, except a putative glycosyltransferase gene (MA_186971g0010), an ortholog of the Arabidopsis protein UGT85A1 that encodes a UDP-Glycosyltransferase (UGT) protein. This gene has an interesting expression pattern, it is highly upregulated around the inoculation site at three and seven dpi, although its expression level was low ( Fig. 1; Supplementary file 4). It should be mentioned that a second UGT like gene, MA_10436196g0010 was among the identified candidate genes. MA_10436196g0010 correspond to the original marker 0.276-A13-934 significantly associated with SWG on LG9 14 , but this candidate gene was not differentially expressed ( Table 2, Supplementary file 4).
Linkage group 11 associated with a QTL for IP contained a total of 29 expressed genes and 15 DEGs ( Fig. 1; Supplementary file 4). Only five of the expressed genes were categorized as CCGs, three of these were differentially expressed. Interestingly, the PgMYB11-like (R2R3-MYB transcription factor PgMYB11-like) candidate gene MA_24271g0020/BT103501, associating with the QTL in the original study 14 , was highly upregulated at both three and seven dpi adjacent to the pathogen inoculation site (Fig. 1, Supplementary file 4). Likewise, the candidate gene MA_4742g0010 (PabiesFT1-1,251/BT115191 encoding spruce Mother of FT1, MFT1), was also positioned in this QTL in the original study. It was up-regulated in response to the inoculation at seven dpi even  25 . d The number of unique candidate genes in Norway spruce genome these markers correspond to. "CCG" and "PCG" stands for candidate genes within the confidence interval and outside the confidence interval respectively.
LG6 has two separate QTL regions for control of SWG (Fig. 1). The first region included six DEGs and comprises the marker BT105286, which corresponds to the Norway spruce candidate gene MA_10428976g0010 (PLAC8), which was moderately upregulated at three dpi at the inoculation site.
The second QTL region on LG6 included 10 DEGs, all categorized as CCGs (Supplementary file 4). The previously validated marker PaLAR3 (MA_176417g0010/ BT109050) is positioned in this QTL in the original linkage map 14 . However, the homologous marker (sp_v3.0_unigene4600) is positioned nearly 30 cM away from BT116508 in the composite map 25 . PaLAR3 was not expressed in this study. However, the candidate gene MA_10433955g0020, encoding the homolog of the original QTL marker BT116508 also associated with SWG in the original study, is highly but not differentially expressed in this study ( Table 2, Supplementary file 4). The candidate gene MA_853405g0010 (hypothetical protein) upregulated proximal to the inoculation at seven dpi (and three dpi) in response to inoculation, but it was not expressed in the distal samples collected from the inoculation treatment at seven dpi (Fig. 1). MA_122748g0010 (a putative GRX, Glutaredoxin) also showed significant up-regulation proximal to the inoculation in our RNAseq analysis. None of the candidate genes associated with the SWG QTL were upregulated distal to the inoculation site (Fig. 1).
On LG9, we found 15 expressed candidate genes in the SWG QTL region (Supplementary file 4), five of these were also differentially expressed, including MA_19215g0010, the homolog of the original marker BT115393. All the DEGs in this region showed higher expression after wounding than after inoculation with H. parviporum (Fig. 1). identify with the QTL for lesion length in the phloem included in the Pinaceae composite map was located on LG8. We found 13 candidate genes that were expressed in this region, and eight of these were differentially expressed. All of the candidate genes except UDP-glucuronate 4-epimerase (MA_52380g0010), which was induced at seven dpi at site A and C respectively, were downregulated in response to inoculation (Fig. 1, Supplementary file 4). A Leucine-rich repeat protein kinase family protein gene (MA_17691g0010) was down regulated proximal to inoculation site at three and seven dpi. UDP-glucuronate 4-epimerase and Leucine-rich repeat protein kinase family protein gene were categorized as PCGs. It is noteworthy that a cinnamoyl-CoA reductase (MA_10435810g0010) was repressed at seven dpi at both proximal and distal to the inoculation. It is also the only candidate gene in CCG category (Fig. 1, Supplementary file 4).

DEGs in QTLs associated with exclusion of the H. parviporum from the norway spruce.
We could identify the QTLs associated with exclusion of the fungus from the host located on LG1, LG2, LG3 and LG6 14 in the composite map. On LG1 five DEGs were found and two DEGs of which were upregulated at seven dpi one at proximal and other at distal site (Fig. 1, Supplementary file 4). On LG2 we found 16 DEGs associated with the QTL and only one of these is a CCG, which is the homolog of the original QTL marker (BT100742) MA_10431443g0010 a superoxide dismutase (Fig. 1, Table 2, Supplementary file 4). Both MA_10431443g0010 and MA_48816g0010 were downregulated proximal to the inoculation site at seven dpi. A Catalase (MA_10437148g0010) and Esterase family protein (MA_10435680g0010) were induced at seven dpi at proximal site and at distal site respectively (Fig. 1, Supplementary file 4). The candidate gene MA_28209g0010 (NAD(P)-binding Rossmann-fold superfamily protein) was upregulated proximal and distal to the inoculation site at both three and seven dpi. It was notable that candidate gene (MA_10436080g0010) dehydroquinate dehydratase, / shikimate dehydrogenase was induced in response to inoculation in all conditions tested (Fig. 1, Supplementary file 4).
There were three DEGs associated with the exclusion QTL on LG3. Interestingly, phenylalanine ammonialyase (PAL) (MA_15852g0010), an ortholog of the Picea glauca PgPAL3 (Genbank: BT119163) 35 identified as PCG, was upregulated at both proximal and distal site; while the other two were downregulated proximal to inoculation, at both three and seven dpi (Fig. 1, Supplementary file 4).
Like with the second QTL for SWG on LG6, the exclusion QTL on LG6 had a large number of expressed CCGs (19) and 72 PCGs. Forty two of these candidate genes were also differentially expressed in at least one comparison (Fig. 1, Supplementary file 4). The orthologs (MA_14707g0010, MA_16728g0010 and MA_942991g0010) of the three markers associated with the original QTL, all showed differential expression in this study (Fig. 1). Interestingly enough, the QTL region for exclusion on LG6 showed three candidate genes encoding class III-3 NAC transcription factors: MA_264971g0010 (PaNAC04), MA_86256g0010 and MA_103386g0010. The candidate gene PaNAC04 and MA_103386g0010 were classified as CCGs. However, MA_86256g0010 was classified as a PCG. All three candidate genes were relatively highly expressed and showed clear induction, both proximal and distal, to the inoculation compared to wounding. This was especially noticeable at one week after the inoculation (Fig. 1), although, only PaNAC04 was differentially regulated in S21K0220184 distally from the inoculation site at seven dpi (Fig. 1, Supplementary file 4). This expression pattern together with the previously published phylogeny of the sub group III-3 NACs 29 led us to investigate if the MA_103386g0010 candidate gene represents a different gene from the previously described PaNAC04. Additionally, an analysis of PaNAC04 (MA_264971g0010) and MA_103386g0010 expression by qRT-PCR in six well characterized Norway spruce genotypes 23 showed that, on average PaNAC04 was not significantly differentially expressed. However, MA_103386g0010 was differentially expressed at seven dpi in response to inoculation (Table 3). Furthermore, expression of PaNAC04 was only detected in three of the genotypes (Fig. 2b, d) whereas MA_103386g0010 was expressed in all genotypes and treatments (Fig. 2a, c).

Discussion
Throughout evolution, species in the Pinaceae have maintained a high degree of synteny and collinearity in their genomes 25,26,[36][37][38] . In this study we have capitalized on this feature to expand the array of potential candidate genes associated with the resistance QTLs originally reported by Lind and co-workers 14 . As the composite map 25 is considerably more dense than the original map, it may allow for identification of additional candidate genes Table 3. Comparison of log 2 fold change of the genes in RNAseq and qRT-PCR. Comparison of expression patterns of candidate genes in RNAseq and qRT-PCR experiment. The value represent log 2 fold change in inoculation compared to wounding at 3 and 7 days post inoculation. I3 and W3 means inoculation and wounding at 3 dpi and I7 and W7 means inoculation and wounding at 7 dpi. Asterisks (*) indicate significant higher induction level in H. parviporum compared to wounding alone (P < 0.05).

RNA-seq log 2 (fold change)
RNA-seq log 2 (fold change) qPCR log 2 (fold change) qPCR log 2 (fold change) www.nature.com/scientificreports/ associated with the reported QTL regions. Using shared markers associated with the QTL regions, identified around a significant LOD peak inside the confidence interval and subsequent markers outside the LOD peak in the Pinaceae composite map 25 , we found 329 potential Norway spruce candidates associated with 12 out of 13 original QTLs. Two hundred and ninety eight of the candidate genes were expressed, of which 80 were within confidence interval candidate genes (CCGs) and 218 putative candidate genes (PCGs). Out of these 41 CCGs and 83 PCGs were differentially expressed in the RNAseq study. The products of the candidate genes could affect the resistance trait either as part of the constitutive or the induced defence 39,40 . Candidate genes involved in constitutive defence response are important for the host, but were not included in our current study. Compared to induced candidate genes associated with induced defences, whose involvement can be detected by transcriptional, protein or metabolite accumulation the contribution of constitutive defences are very difficult to quantify in short-term experiments. Therefore, we delibrately chose to focus on identifying candidate genes with a potential role in the induced defences in Norway spruce. It is reasonable to hypothesize that candidate genes associated with the resistance QTLs and also showing differential regulation in response to H. parviporum inoculation may be connected to the resistance phenotype 15,24 . In this study we used progenies from one Norway spruce cross and one H. parviporum strain to create expression profiles of the candidate genes associated with the resistance QTLs. Plant materials used in this study was represented by a set of closely-related individuals drawn from the original QTL population 11,14 . This approach may have limitations as the use of more host and pathogen genotypes could have given a broader picture. However, in this study we wanted to study the expression pattern of the candidate genes mapped in the original QTL mapping study 14 in response to same fungal isolate H. parviporum Rb175 that was used to detect the QTLs. Using more H. parviporum isolates in the study would have perhaps generated information on how broad the transcriptional responses of the candidate genes associated with the QTLs were. Even though the choice to use full-sib plants and H. parviporum Rb175 might have the drawback of not being able to identify some of the broader defence responses in Norway spruce, it can help removing "signal noise" as resistance against H. parviporum in Norway spruce is a quantitative trait with multiple genes having small effect where highly resistant individuals might www.nature.com/scientificreports/ rely on a different set of genetic factors. The approach in this study allowed us to work with well-studied plant materials and to use a narrow genetic base to generate initial data 41,42 and identify potential resistance candidates for further work, which was one of the primary objectives of this study. Clearly we missed out on potential candidate genes which are not associated with the QTL regions in the genetic linkage map and also candidates associated with QTLs which are not present in the original study 43,44 . Nevertheless, combining genomic and transcriptomic analysis we identified 124 candidate resistance genes which could be considered as candidates for induced resistance. Fungal sapwood growth (SWG) is a trait that reflects the trees capacity to restrict the spread of the pathogen in its sapwood. The general assumption is that the trees with shorter SWG after inoculation with H. annosum s.l. in the sapwood would also display shorter decay columns after natural infections 45 . We could locate all of the original QTLs for SWG in the composite map 14,25 . Interestingly, the inspection of the QTL for SWG2 on LG6 including the previously validated marker PaLAR3 15 in the composite map, positioned PaLAR3 nearly 30 cM away from the other markers in this QTL region. This could suggest that the markers found around 154 cM on LG6, including BT116508/ MA_10433955g0020 encoding magnesium-protoporphyrin, may not be linked to the original SWG QTL or the original QTL 14 comprises several independent QTLs which could not be separated due to low density of markers in this QTL region. Despite this we chose to include these candidate genes in the subsequent analyses, and 39 expressed candidate genes associated with the SWG QTL were identified. In fact, 16 of the candidate genes differentially expressed (10 CCGs and 6 PCGs) in response to inoculation were found at this position in the map. Among the upregulated candidate genes associated with the SWG QTLs MA_853405g0010 (a hypothetical protein containing a domain of unknown function, DUF4228), and the MA_122748g0010 (a putative GRX, Glutaredoxin) showed significant up-regulation during inoculation compared to wounding at both three and seven dpi in our RNAseq analysis. Both of these DEGs were positioned at 154 cM on LG6. The expression patterns of the candidate genes in this region could possibly indicate that they are involved in the expression of the resistance trait. Clearly, fine mapping the region between BT116508 and BT109050/PaLAR3 would be very helpful to resolve the locus structure.
One of the two QTLs for lesion length (LL) could be identified in the composite map 14,25 . We found two DEGs associated with cell wall modifications and specialized metabolism in this QTL region. The moderately upregulated DEG MA_52380g0010 identified as PCG (log 2 Fold Change of 0.75) with similarity to UDP-glucuronate 4-epimerase that presumably catalyzes the formation of the key building block of pectins, UDP-d-galacturonic acid 46,47 . The second identified as CCG in this QTL that is putatively associated with cell wall modifications encodes a CCR-like protein (cinnamoyl-CoA reductase, MA_10435810g0010) and was repressed both proximally and distally at seven dpi. CCR is the first committed enzyme in the lignin-specific pathway 48,49 , and it is possible that the downregulation of this gene is part of a redirection of resources away from the lignin biosynthesis pathway to other branches of the phenolics metabolism (see also discussion on NAC transcription factors below). The repressed CCR gene would thus be in line with Norway spruce allocating more resources to potential antifungal low molecular weight phenolics following challenges with H. annosum 24 . It has been shown that suppression of CCR gene expression in Norway spruce decreases lignin content 50 . Suppression of CCR gene expression in tobacco has been accompanied by accumulation of phenolic substances 51,52 . Further analyses, eg. with RNAi-or overexpression constructs would shed light on cinnamoyl-CoA reductase's (MA_10435810g0010) role in the interaction with H. parviporum.
There are two expression hotspots associated with the QTL regions for IP and E. The traits IP and E are measures of the ability the host have to stop the fungus from entering the wound upon inoculation or the ability to hem in and exclude an invading pathogen 14 . Thus, these traits could potentially reflect the "true" resistance to infection, and consequently the candidate genes associated with these QTLs were of special interest to us. Two thirds of the expressed genes as well as differentially expressed genes were associated with these QTLs. It is difficult to decide when to best capture gene expression patterns associated with IP and E in the artificial inoculation system we employ, as the read out of the traits takes place after several weeks of interaction 14,53 . Consequently, the expression data gathered for these traits should be seen as a probe and genes not differentially expressed in the current study could still be highly relevant to induced defence responses in H. parviporum resistance.
It is notable that several of the candidate genes associating with the QTL regions for IP are genes or orthologs to genes, which have been shown to respond to biotic stress 29,54 or control other adaptive traits 55,56 in the genus Picea. MA_24271g0020 (PgMYB11-like), which corresponds to the original marker BT103501 in the first QTL for IP on LG11 14 , is one example. This marker has been found to associate with a QTL for bud set in black spruce 55 . The observation that variation in BT103501 associate with two apparently different traits in spruce, indicates pleiotropic effect or tightly linked loci 55,57 . Another of the original markers in this QTL MA_4742g0010 (PabiesFT1-1,251/BT115191) encoding a spruce Mother of FT1-like protein MFT1, suggested to control the formation of resin ducts in male buds 56 . MA_4742g0010 was weakly expressed but significantly upregulated at seven dpi both proximal and distal to the inoculation. Neither PgMYB11-like nor Norway spruce MFT1, have been connected to host defence previously. In contrast, a DEG associated with this QTL that has a recognized role in the induced defences in conifers is MA_6931g0010. This candidate gene encodes a putative caffeoyl-CoA O-methyltransferase (CCoAOMT) and it is highly expressed in all treatments and was weakly upregulated at seven dpi proximal to the inoculation site. CCoAOMT1, an enzyme in the lignin biosynthesis pathway 54 has been implicated in budworm and white pine weevil resistance in white spruce.
The candidate gene Phenylalanine ammonia-lyase (PaPAL3) identified as a PCG (MA_15852g0010), an ortholog of the Picea glauca PgPAL3 (Genbank: BT119163) 35 at the QTL for exclusion on the LG3, was upregulated both proximal and distal to the inoculation site at three and seven dpi. Two other PaPAL1 and PaPAL2 is reported to be induced upon wounding and H. annosum s.l. 24,58,59 . PAL is the first enzyme committed in the phenylpropanoid biosynthesis pathway 60 . The activation of phenylpropanoid biosynthetic pathway which leads to the production of polyphenolics. Flavonoids and stilbene monomers plays a central role in the induced defence Scientific RepoRtS | (2020) 10:12711 | https://doi.org/10.1038/s41598-020-69386-0 www.nature.com/scientificreports/ towards wounding and fungal infection in conifers [61][62][63][64] . Stilbene astringin was negatively correlated with the depth of the hyphal penetration of Heterobasidion annosum in Norway spruce bark 65 . Flavonoids have an antimicrobial effect on H. annosum s.l. 24 and E. polonica 20,[61][62][63] in Norway spruce. Just like the IP QTL on LG11, the QTL region for fungal exclusion from the sapwood on LG6 also involved a number of candidate genes that had been studied previously in conifers. However, the most interesting feature of the QTL region for exclusion on LG6 is that it harbours three of the previously identified seven Norway spruce candidate genes with similarity to subgroup III-3 NAC transcription factors 29 ; PaNAC04, MA_86256g0010 and MA_103386g0010. All three candidate genes were relatively highly expressed and showed clear upregulation in response to inoculation with H. parviporum both proximal and distal to the inoculation (Fig. 1). Their upregulation could possibly be associated with a shifted balance from cell wall use to defense active phenylpropanoids 29 . Naturally, the question arose if these three predicted candidate genes indeed represent distinct genes. The difficulties in assembling the large and repetitive conifer genomes into scaffolds will lead to errors in the assembly 66,67 . Thus, it would not be unlikely that the three highly similar candidate genes 29 , correspond to one single gene located in the Exclusion QTL on LG6. However, based on the expression patterns detected by qRT-PCR, which agrees with the previous phylogenetic analysis 29 , placing PaNAC04 and MA_86256g0010 together on a supported branch separate from MA_103386p0010 in subgroup III-3 of the NAC transcription factor family, we argue that there are at least two NAC genes associated with this QTL. Albeit, tightly linked and not much diverged, but this must be confirmed by resequencing of the region.
This study gives an insight into Norway spruce genome organization with information of position of the candidate genes associated with resistant trait in the genome e.g. the previously described PaNAC04. PaNAC04 was not only upregulated in response to H. parviporum but it was also located in the region important for controlling resistance determined by QTL mapping 14 . Therefore, allelic variation in PaNAC04 needs to be further studied in future experiment to understand the role of PaNAC04 in controlling induce defence response.
In conclusion; this study, combining map-based information and expression analyses have associated previously identified candidate genes, such as PaNAC04 and MA_103386g0010, with genomic regions in Norway spruce harboring resistance QTL, strengthening their predicted role in control of H. annosum s.l. infection. This approach has also allowed us to identify a set of novel candidate genes for future analyses, the most prominent being genes associated with the phenylpropanoid pathway CCR (MA_10435810g0010), the PgMYB11-like gene and PAL (MA_15852g0010).