## Introduction

Genome-wide association studies (GWAS) of biomarkers have been highly successful in identifying novel biological pathways and their impact on health and disease. Biomarkers increase statistical power in GWAS, compared to disease diagnoses, due to their quantitative nature and lack of errors due to subjectivity, such as misclassification. Thus, biomarker GWAS have identified thousands of biomarker-associated loci and elucidated the mechanisms underlying numerous disease associations [1,2,3]. A recent study on 38 biomarkers in the UK Biobank (UKBB) identified over 1,800 independent genetic associations with causal roles in several diseases [4]. Proteomics and metabolomics integrated with genomics have also revealed causal molecular pathways connecting the genome to multiple diseases, e.g., autoimmune disorders and cardiovascular disease [5,6,7,8]. Although biomarkers are more closely related to pathophysiology, a single biomarker is usually an inaccurate estimator of complex disease due to phenotypic heterogeneity and individual variation. Therefore, combinations of biomarkers provide a more robust predictive molecular signature. Studies examining combinations of biomarkers are increasingly feasible given the availability of biobank resources around the globe with deep phenotyping, i.e., precise and comprehensive data on phenotypic variation including quantitative measures such as biomarkers [9, 10].

Multivariate GWAS increases statistical power compared to univariate analysis, especially in the case of complex biological processes and correlated traits [8, 11, 12]. This leads to identifying multivariate associations that are otherwise missed by univariate analysis [8, 13]. Efficient software programs such as metaCCA [14] are available for performing multivariate GWAS, yet multivariate analyses currently have shortcomings in interpreting the arising signals. Follow-up tools for fine-mapping causal variants within the associated loci are lacking, and the subset of tested traits driving each association signal typically goes unidentified. These shortcomings are largely due to the lack of a multivariate counterpart to the univariate regression coefficients (beta estimates). Lack of these necessary follow-up tools has hindered the utilization of multivariate methods.

In this study, we developed a novel computational workflow for multivariate GWAS discovery and follow-up analyses including fine-mapping and identification of driver traits (Fig. 1). Our workflow includes (1) a customized version of the metaCCA software that overcomes the problem of missing beta estimates by turning each multivariate association into its optimal univariate Linear Combination Phenotype (LCP), enabling an LCP-GWAS, (2) fine-mapping, i.e., identifying putative causal variants underlying each association using summary statistics from the LCP-GWAS and a multivariate extension to FINEMAP [15], and (3) determining the traits driving each multivariate association using a newly developed tool, MetaPhat [16] that efficiently decomposes the multivariate associations into a smaller set of underlying driver traits. Taken together, we present to our knowledge the first comprehensive framework to map multivariate associations into individual causal variants and a subset of driver traits. We demonstrate the potential of our workflow in a Finnish population-based cohort with 12 inflammatory biomarkers implicated in the pathogenesis of autoimmune disorders and cancer [17,18,19]. This set of highly correlated biomarkers is particularly advantageous for multivariate analysis as high correlation between traits increases the boost in statistical power achieved by multivariate methods. Using multivariate analysis, we identify additional hits compared to univariate analysis, totaling 11 independent associations. We follow them up in a phenome-wide association study (PheWAS) in the FinnGen study (n = 176,899) across 2367 disease endpoints and in the UKBB (n = 408,910) [10]. We discover multiple disease associations, as well as identify orthogonal evidence for the biological impact of the causal variants through several protein quantitative trait loci (pQTLs) within the multivariate loci.

## Materials and methods

### Study cohort and data

We studied 12 highly correlated inflammatory biomarkers in the population-based national FINRISK Study [20] collected in 1997 (n = 6890) (Table 1, Supplementary Fig. 1). The FINRISK Study is a large Finnish population survey of risk factors for chronic, non-communicable diseases, and it has been collected by independent random population sampling every five years beginning in 1972 with multiple recruiting waves. The 12 inflammatory biomarkers included five interleukins (IL-4, IL-6, IL-10, IL-12p70, IL-17), three growth factors (FGF2, PDGF-BB, VEGF-A), one colony-stimulating factor (G-CSF), one interferon (IFN-γ), one chemokine (SDF-1α), and one tumor necrosis factor (TNF-β) (Table 1, Supplementary Fig. 1). Hierarchical clustering identified the cluster of 12 inflammatory biomarkers out of 66 quantitative traits of cardiometabolic or immunologic relevance (Supplementary Fig. 2, Supplementary Table 1, and Supplementary Methods). The 66 quantitative traits were measured as previously described [1, 20, 21].

### Genotyping, imputation and quality control

Samples were genotyped using multiple different genotyping chips (Supplementary Table 2), for which pre-imputation quality control (QC), phasing and imputation were done in multiple chip-wise batches (Supplementary Methods). Imputation of the genotypes was done utilizing a Finnish population-specific reference panel of 3775 high-coverage whole-genome sequences. Genotype imputation was followed by an additional post-imputation sample QC (Supplementary Methods) and variant QC (imputation INFO > 0.8, minor allele frequency > 0.002 and Hardy–Weinberg equilibrium p value > 1 × 10−6). A total of 26,717 samples and 11,329,225 variants passed this rigorous quality control. All variants are reported based on the human genome reference sequence GRCh38.

### Univariate and multivariate GWAS

Univariate genome-wide association analyses for the biomarkers were performed using a linear mixed model implemented in Hail [22], adjusting for age, sex, genotyping chip, first ten principal components of genetic structure and the genetic relationship matrix (GRM) (Supplementary Methods). The GRM was estimated using 73K independent high-quality genotyped variants (Supplementary Methods). We performed multivariate GWAS on the biomarkers using metaCCA [14], software that performs multivariate analysis by implementing Canonical Correlation Analysis (CCA) for a set of univariate GWAS summary statistics.

The objective of CCA is to find the linear combination of the p predictor variables (X1, X2, …, Xp) that is maximally correlated with a linear combination of the q response variables (Y1, Y2, …, Yq). If we denote the respective linear combinations by

$$X^{\ast} = \boldsymbol{a}^{\prime}\boldsymbol{x} = a_1 x_1 + a_2 x_2 + \ldots + a_p x_p$$

and

$$Y^{\ast} = \mathrm{LCP} = \boldsymbol{b}^{\prime}\boldsymbol{y} = b_1 y_1 + b_2 y_2 + \ldots + b_q y_q$$

then finding the linear combination of the predictor variables that are maximally correlated with the linear combination of the response variables corresponds to finding vectors a and b that maximize

$$r = \frac{(X\boldsymbol{a})^{\prime}(Y\boldsymbol{b})}{\lVert X\boldsymbol{a} \rVert \, \lVert Y\boldsymbol{b} \rVert} = \frac{\boldsymbol{a}^{\prime}\Sigma_{xy}\boldsymbol{b}}{\sqrt{\boldsymbol{a}^{\prime}\Sigma_{xx}\boldsymbol{a}}\,\sqrt{\boldsymbol{b}^{\prime}\Sigma_{yy}\boldsymbol{b}}}$$

where $\Sigma_{xx}$ and $\Sigma_{yy}$ are the variance-covariance matrices of the predictor variables and the response variables, respectively, and $\Sigma_{xy}$ is the cross-covariance matrix between the two sets of variables. The maximized correlation r is the canonical correlation between X and Y. Multivariate GWAS is a special case of CCA with multiple response variables Y, but only one explanatory variable X, the genotypes at the variant tested.
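For the single-variant case (p = 1), the maximization above has a closed form. This is the standard CCA result, written here in the notation of the preceding equations. The symbols $\sigma_x^2$ (the genotype variance) and $\Sigma_{yx} = \Sigma_{xy}^{\prime}$ are introduced only for this illustration. Because $\boldsymbol{a}$ is a scalar, it cancels from the ratio, and the Cauchy-Schwarz inequality gives

$$r^2 = \max_{\boldsymbol{b}} \frac{(\Sigma_{xy}\boldsymbol{b})^2}{\sigma_x^2 \, \boldsymbol{b}^{\prime}\Sigma_{yy}\boldsymbol{b}} = \frac{\Sigma_{xy}\Sigma_{yy}^{-1}\Sigma_{yx}}{\sigma_x^2}, \qquad \boldsymbol{b} \propto \Sigma_{yy}^{-1}\Sigma_{yx}$$

When the genotype and the traits are standardized, $\sigma_x^2 = 1$ and $\Sigma_{xy}$ contains the univariate genotype-trait correlations, so the canonical correlation and the weights $\boldsymbol{b}$ depend only on those correlations and on the trait correlation matrix.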

### Novel multivariate LCP-GWAS method

To enable follow-up analyses of multivariate GWAS results, such as fine-mapping, we developed a novel method to produce linear combination phenotypes (LCP) at the single variant level by extending the functionality of metaCCA. The updated metaCCA is available online at: https://github.com/acichonska/metaCCA.

LCPs were constructed as the weighted sum of the trait residuals, where the weights (b = [b1, b2 …, bq]) were chosen to maximize the correlation between the resulting linear combination of traits and the genotypes at the variant. We determined association regions by adding 1 Mb to each variant reaching genome-wide significance (GWS; p value < 5 × 10−8) in the multivariate analysis and joining overlapping regions. We constructed LCPs for the lead variant, i.e., the variant with the smallest p value, in each of these regions, as a univariate representation of the multivariate association in that region. Next, we performed chromosome-wide LCP-GWAS for the constructed LCPs in a similar manner as for each of the biomarkers.
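The region definition and lead-variant selection described above can be sketched as an interval merge. This is a minimal illustration, not the study's code. It assumes that "adding 1 Mb" means a 1 Mb flank on each side of a variant, and the function name `association_regions` and the toy variants are hypothetical.

```python
from typing import List, Tuple

GWS_THRESHOLD = 5e-8   # genome-wide significance threshold used in the text
FLANK_BP = 1_000_000   # assumption: 1 Mb is added on each side of a variant

Variant = Tuple[str, int, float]         # (chromosome, position, p value)
Region = Tuple[str, int, int, Variant]   # (chromosome, start, end, lead variant)


def association_regions(variants: List[Variant],
                        flank: int = FLANK_BP,
                        threshold: float = GWS_THRESHOLD) -> List[Region]:
    """Merge overlapping windows around significant variants and track leads."""
    # Sorting by (chromosome, position) lets one pass merge neighbouring windows.
    hits = sorted(v for v in variants if v[2] < threshold)
    regions: List[Region] = []
    for chrom, pos, p in hits:
        start, end = max(0, pos - flank), pos + flank
        if regions and regions[-1][0] == chrom and start <= regions[-1][2]:
            r_chrom, r_start, r_end, lead = regions[-1]
            if p < lead[2]:              # keep the smallest p value as the lead
                lead = (chrom, pos, p)
            regions[-1] = (r_chrom, r_start, max(r_end, end), lead)
        else:
            regions.append((chrom, start, end, (chrom, pos, p)))
    return regions


# Hypothetical toy input: four significant variants and one non-significant one.
toy_variants = [
    ("chr1", 1_000_000, 1e-9),
    ("chr1", 2_500_000, 1e-12),
    ("chr1", 10_000_000, 3e-8),
    ("chr2", 500_000, 1e-10),
    ("chr2", 700_000, 0.01),
]
regions = association_regions(toy_variants)
for region in regions:
    print(region)
```

In this toy input the first two chr1 windows overlap and are joined into one region whose lead is the variant with the smaller p value; an LCP would then be constructed for each region's lead variant.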

### Fine-mapping multivariate associations

We used FINEMAP [15, 23] on the LCP-GWAS summary statistics to identify causal variants underlying the multivariate associations. FINEMAP analyses were restricted to a ± 1 Mb region around the GWS variants from the LCP-GWAS.

We assessed variants in the top 95% credible sets, i.e., the sets of variants encompassing at least 95% of the probability of being causal (causal probability) within each causal signal, conditional on other causal signals in the genomic region. We excluded credible sets that did not clearly represent a single signal, as indicated by a low minimum linkage disequilibrium within the set (LD, r2 < 0.1). Within each remaining credible set, the variant with the highest causal probability was chosen as the representative variant.
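The credible-set logic can be illustrated with a short sketch. It is a simplified stand-in for what FINEMAP reports, not its implementation: it takes per-variant causal probabilities for one signal as given. The two F5 probabilities and their r2 are the values quoted in the Results; the third variant, its r2 values, and all function names are hypothetical.

```python
from typing import Dict, FrozenSet, List


def credible_set(posteriors: Dict[str, float], coverage: float = 0.95) -> List[str]:
    """Smallest set of variants whose causal probabilities sum to >= coverage."""
    ranked = sorted(posteriors.items(), key=lambda kv: kv[1], reverse=True)
    chosen, total = [], 0.0
    for variant, prob in ranked:
        chosen.append(variant)
        total += prob
        if total >= coverage:
            break
    return chosen


def represents_single_signal(variants: List[str],
                             r2: Dict[FrozenSet[str], float],
                             min_r2: float = 0.1) -> bool:
    """Keep a set only if its minimum pairwise LD is not below min_r2."""
    pairs = [r2[frozenset((a, b))]
             for i, a in enumerate(variants) for b in variants[i + 1:]]
    return min(pairs, default=1.0) >= min_r2


# F5 values quoted in the Results; "hypothetical_variant" is a placeholder.
posteriors = {"rs61808983": 0.533, "rs9332701": 0.461, "hypothetical_variant": 0.006}
r2 = {
    frozenset(("rs61808983", "rs9332701")): 0.996,
    frozenset(("rs61808983", "hypothetical_variant")): 0.05,
    frozenset(("rs9332701", "hypothetical_variant")): 0.05,
}

cs = credible_set(posteriors)
representative = cs[0]                      # highest causal probability
keep = represents_single_signal(cs, r2)
print(cs, representative, keep)
```

The representative variant is simply the first element of the ranked set, which matches the rule of choosing the variant with the highest causal probability.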

To validate the multivariate fine-mapping results, we also performed conventional stepwise conditional analysis for all fine-mapping regions using LCPs. We iteratively conditioned on the lead variant in the region until the smallest p value in the region exceeded 5 × 10−8.

### Identifying driver traits

We determined the traits driving the multivariate associations for the representative variants of the credible sets identified by fine-mapping using the MetaPhat software developed in-house [16]. MetaPhat determines the set of driver traits for each multivariate association by performing multivariate testing using metaCCA iteratively on subsets of the traits, excluding one trait at a time until a single trait remains. At each iteration, the trait to be excluded is the one whose exclusion leads to the highest p value for the remaining subset of traits. The driver traits are determined as a set of traits that have been removed when the multivariate p value becomes non-significant (p > 5 × 10−8). The interpretation is that the driver traits make the multivariate association significant.
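The iterative exclusion described above can be sketched as a backward elimination over trait subsets. This is not MetaPhat's code. The multivariate test is abstracted as a caller-supplied function `pvalue_fn`, and the toy p value model (additive "signal strengths") is purely hypothetical. The sketch also assumes that all traits count as drivers when the last remaining trait is still significant.

```python
from typing import Callable, Dict, FrozenSet, List, Sequence

PValueFn = Callable[[FrozenSet[str]], float]


def driver_traits(traits: Sequence[str], pvalue_fn: PValueFn,
                  threshold: float = 5e-8) -> List[str]:
    """Remove traits one at a time until the association is non-significant."""
    remaining = list(traits)
    removed: List[str] = []
    if pvalue_fn(frozenset(remaining)) > threshold:
        return []                       # nothing to decompose
    while len(remaining) > 1:
        # Exclude the trait whose removal leaves the highest p value.
        candidate = max(remaining,
                        key=lambda t: pvalue_fn(frozenset(remaining) - {t}))
        remaining.remove(candidate)
        removed.append(candidate)
        if pvalue_fn(frozenset(remaining)) > threshold:
            return removed              # traits removed so far are the drivers
    # Assumption: a still-significant single trait makes every trait a driver.
    return removed + remaining


def toy_pvalue_fn(strengths: Dict[str, float]) -> PValueFn:
    """Hypothetical stand-in for a multivariate test: p = 10 ** -(sum of strengths)."""
    return lambda subset: 10.0 ** -sum(strengths[t] for t in subset)


one_driver = driver_traits(
    ["IL-4", "IL-6", "VEGF-A"],
    toy_pvalue_fn({"IL-4": 9.0, "IL-6": 0.5, "VEGF-A": 0.2}))
two_drivers = driver_traits(
    ["A", "B", "C"],
    toy_pvalue_fn({"A": 9.0, "B": 8.5, "C": 0.1}))
print(one_driver, two_drivers)
```

Passing the test as a function keeps the elimination loop independent of how the multivariate p value is computed, so any summary-statistic-based test could be plugged in.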

### Phenome-wide association testing in FinnGen and UKBB

We performed a PheWAS in the FinnGen study for the representative variants of the credible sets identified by multivariate fine-mapping. FinnGen (https://www.finngen.fi/en) is a large biobank study that aims to genotype 500,000 Finns and combine this data with longitudinal registry data, including national hospital discharge, death, and medication reimbursement registries, using unique national personal identification numbers. FinnGen includes prospective epidemiological and disease-based cohorts as well as hospital biobank samples. A total of 176,899 samples from FinnGen Data Freeze 4 with 2444 disease endpoints were analyzed using Scalable and Accurate Implementation of Generalized mixed model (SAIGE), which uses saddlepoint approximation (SPA) to calibrate unbalanced case-control ratios [24]. Additional details and information on genotyping and imputation are provided in the Supplementary Material and contributors of FinnGen are listed in the Acknowledgements.

FinnGen disease associations with p values < 1 × 10−4 were considered significant. We tested the p value threshold by sampling 1000 allele frequency-matched sets of n variants, where n represents the number of representative variants, from 8.2 million non-coding variants and determining a null distribution of the number of FinnGen associations passing the p value threshold. We confirmed the validity of the p value threshold by comparing the observed number of FinnGen associations passing the p value threshold to the null distribution (Supplementary Fig. 3). We excluded disease endpoints within the ICD-10 (International Statistical Classification of Diseases and Related Health Problems 10th Revision) chapters XXI and XXII from PheWAS analyses, resulting in 2367 disease endpoints analyzed. To assess whether the FinnGen disease associations of the representative variants share a common causal variant with the most significantly associated variant (i.e., variant with smallest p value in FinnGen) within the locus, and thus evaluate their importance for the disease associations, the FinnGen disease associations were conditioned on the most significantly associated variant within the locus (±0.5 Mb of the representative variant). Finally, we assessed replication of the disease associations in the UKBB, where associations with p values < 0.05 were considered replicated provided that the directions of effect were consistent. Phecodes from the UKBB were mapped to ICD-10 diagnosis codes using the PheCode map 1.2 [25]. The NHGRI-EBI GWAS Catalog [26] was used for assessing the novelty of the observed genetic associations.
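The threshold check based on frequency-matched random variant sets can be sketched as follows. The text does not specify how allele frequencies were matched or how the comparison was summarized, so the frequency bins (`AF_BIN_EDGES`), the per-variant hit counts, and the empirical p value are all assumptions made for illustration.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

AF_BIN_EDGES = (0.01, 0.05, 0.2)   # hypothetical allele-frequency bin edges


def af_bin(af: float) -> int:
    """Index of the allele-frequency bin that contains af."""
    return sum(af >= edge for edge in AF_BIN_EDGES)


def null_hit_counts(rep_afs: Sequence[float],
                    background: Sequence[Tuple[float, int]],
                    n_sets: int = 1000, seed: int = 1) -> List[int]:
    """Total associations passing the threshold in each matched random set.

    background holds (allele frequency, number of endpoints with p below the
    threshold) for each candidate null variant; every bin used by rep_afs
    must contain at least one background variant.
    """
    rng = random.Random(seed)
    by_bin: Dict[int, List[int]] = defaultdict(list)
    for af, n_hits in background:
        by_bin[af_bin(af)].append(n_hits)
    return [sum(rng.choice(by_bin[af_bin(af)]) for af in rep_afs)
            for _ in range(n_sets)]


def empirical_p(observed: int, null_counts: Sequence[int]) -> float:
    """Share of null sets with at least as many associations as observed."""
    exceed = sum(count >= observed for count in null_counts)
    return (exceed + 1) / (len(null_counts) + 1)


# Hypothetical background variants and two representative-variant frequencies.
background = [(0.004, 0), (0.006, 1), (0.03, 0), (0.12, 0), (0.3, 2), (0.35, 0)]
rep_afs = [0.0033, 0.25]
null = null_hit_counts(rep_afs, background)
print(max(null), empirical_p(53, null))
```

With real data, `background` would be built from the 8.2 million non-coding variants and `rep_afs` from the representative variants, and the observed count would be compared against the resulting distribution.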

We also explored whether the fine-mapped representative variants or variants in LD with them (r2 > 0.6) had previously been reported as pQTLs in studies by Suhre et al. [5], Sun et al. [6], Emilsson et al. [27] and Sasayama [28]. Regional overlap and architecture were visualized in Target Gene Notebook [29]. To validate the overlap of our pQTL findings, we performed Bayesian colocalization analysis using the COLOC package in R [30], within 200 kb from the representative variant, for all pQTL associations from data sets with full summary statistics available.

## Results

### Comparison of multivariate and univariate GWAS of 12 inflammatory biomarkers

We first tested for genome-wide associations of 12 highly correlated inflammatory biomarkers (Table 1, Supplementary Fig. 1) measured in 6890 FINRISK study participants using both multivariate and univariate methods. Pearson correlations between the biomarkers ranged from 0.64 to 0.93, with a mean of 0.80. Out of the 11,329,225 variants tested, 190 were significantly associated using both univariate and multivariate analyses, 999 only by the multivariate analysis and two only by the univariate analysis using a Bonferroni-corrected p value threshold of 5 × 10−8/12 (Fig. 2). A total of 1189 variants reached the significance threshold in the multivariate analysis compared to only 192 in the univariate analysis, reflecting a considerable increase in statistical power achieved by the multivariate analysis. When the univariate effect sizes were all in the same direction (e.g., GP6 locus, all effects were positive), the gain in power was smaller compared to the situation where the effects were both positive and negative (e.g., F5 locus). This is as expected, as all the 12 traits were positively correlated, and it is known that the gain in power in multivariate analyses is greatest when the correlation matrix and effect sizes differ from each other [31]. Despite the increase in power, the Type I error rate of the multivariate GWAS was preserved as the corresponding genomic inflation factor λ for all variants was 1.036, with no evidence of concerning genomic inflation due to Canonical Correlation Analysis. We also assessed the Type I error rate for three minor allele frequency (maf) bins (maf < 0.01, 0.01 < maf < 0.1, and maf > 0.1) separately, with rare variants not showing noticeably more inflation than more common variants (Supplementary Fig. 4).
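The genomic inflation factor λ quoted above is commonly computed as the median association statistic divided by its expected median under the null. The text does not state how λ was obtained for the multivariate test, so the sketch below assumes one common convention: p values are converted to 1-df chi-square statistics first. The function name is hypothetical.

```python
from statistics import NormalDist, median
from typing import Iterable


def genomic_inflation(p_values: Iterable[float]) -> float:
    """Lambda from p values in (0, 1], via equivalent 1-df chi-square statistics."""
    std_normal = NormalDist()
    # A two-sided p value corresponds to a squared standard normal quantile.
    chi2 = [std_normal.inv_cdf(p / 2) ** 2 for p in p_values]
    expected_median = std_normal.inv_cdf(0.25) ** 2   # about 0.455 for 1 df
    return median(chi2) / expected_median


# Evenly spaced p values mimic a well-calibrated test, so lambda is close to 1.
uniform_p = [(i + 0.5) / 1000 for i in range(1000)]
lam = genomic_inflation(uniform_p)
print(round(lam, 3))
```

The Bonferroni-corrected threshold used in this comparison, 5 × 10−8/12, is approximately 4.2 × 10−9.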

Within the 1189 genome-wide significant variants in the multivariate analysis, we identified 11 independently associated loci (Fig. 3 and Supplementary Fig. 5), four of which (F5, C1orf140, PDGFRB and ABO) were not detected by univariate analyses corrected for multiple testing (Fig. 3). The two variants that were significant only in the univariate analysis were both located in a locus (JMJD1C) that was found to be significant also by the multivariate analysis. Thus, no loci that were significant in the univariate analysis corrected for multiple testing went undetected by multivariate analysis. Eight of the 11 loci had previously been associated with at least one of the 12 biomarkers in the NHGRI-EBI GWAS catalog while three loci (F5, C1orf140 and PDGFRB) were novel.

### Functional coding variants

GWAS hits are generally non-coding, although concentrated in regulatory regions [32], and enrichment of functional coding variants has mainly been seen only after fine-mapping, e.g., in inflammatory bowel disease [33]. We, however, observed enrichment of functional coding variants in the multivariate GWAS hits already prior to fine-mapping. Considering all genome-wide significant variants in the multivariate GWAS, we found 13 nonsynonymous or splice-region variants with at least one such variant in five of the 11 multivariate loci (C1QA, F5, SERPINE2, C6orf223, and GP6) (Fig. 3). Out of the 13 variants, 11 were missense variants, one was a splice-region variant and one a frameshift variant. Only four missense variants at two loci were significantly associated in the univariate analyses. Two of the 11 missense variants led the multivariate association at their respective loci (chr1:g.22637683G>A (rs17887074) and chr19:g.55032292G>A (rs199588110), in the C1QA and GP6 loci, respectively) and were enriched (>1.5-fold) in Finns compared to non-Finnish, Swedish, Estonian Europeans (NFSEE) in the gnomAD genome reference database [34]. A total of six (46.2%) of the 13 variants were enriched in the Finnish population, highlighting the potential of utilizing isolated populations in GWAS.

We studied whether the multivariate genome-wide significant variants were enriched for missense, splice-region and frameshift variants compared to the 11.3 M variants analyzed. P values for enrichment were calculated using the χ2-test for the number of nonsynonymous and splice-region or missense variants within the genome-wide significant variants against the number of the corresponding subset of variants within all variants tested. The multivariate genome-wide significant variants were enriched both for missense variants alone and for missense, splice-region and frameshift variants combined (2.2-fold, p = 0.015, and 1.9-fold, p = 8.8 × 10−4, respectively).
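A fold enrichment and χ2 test of this kind can be sketched as below. The exact contingency layout used in the study is not given, so this sketch assumes a 2 × 2 table of annotated versus other variants among significant versus remaining variants, without continuity correction. The counts are hypothetical and are not the study's data.

```python
from math import erfc, sqrt
from typing import Tuple


def enrichment(hits_in_sig: int, n_sig: int,
               hits_in_all: int, n_all: int) -> Tuple[float, float, float]:
    """Fold enrichment, chi-square statistic (1 df) and its p value."""
    fold = (hits_in_sig / n_sig) / (hits_in_all / n_all)
    # 2 x 2 table: significant vs. remaining variants, annotated vs. other.
    a = hits_in_sig
    b = n_sig - hits_in_sig
    c = hits_in_all - hits_in_sig
    d = (n_all - n_sig) - c
    chi2 = n_all * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = erfc(sqrt(chi2 / 2))   # upper tail of chi-square with 1 df
    return fold, chi2, p_value


# Hypothetical counts: 20 of 1,000 significant variants are coding,
# against 1,000 coding variants among 100,000 variants tested.
fold, chi2, p_value = enrichment(20, 1_000, 1_000, 100_000)
print(fold, chi2, p_value)
```

The fold enrichment is expressed relative to all tested variants, while the test itself compares the significant variants with the remainder so that the two groups do not overlap.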

### Fine-mapping multivariate GWAS results

To identify the causal variants of the multivariate associations, we studied the likelihood of multiple variants contributing to the association signal in the 11 associated loci using FINEMAP [23]. Our novel multivariate LCP-GWAS method based on linear combinations calculated for each locus using multivariate metaCCA results enabled fine-mapping of the multivariate results. The number of credible sets varied from one to four for the multivariate associated loci (Supplementary Table 3), resulting in a total of 19 independent sets of variants considered putatively causal. All 183 variants within the 19 credible sets are available in Supplementary Table 3 and posterior probabilities for different numbers of causal signals for each locus are available in Supplementary Table 4.

Within each of the 19 sets, the variant with the highest causal probability was chosen as the representative variant (Table 2 and Supplementary Fig. 6). The 19 representative variants included all except one (chr15:g.101991748G>C, rs11637184 in the PCSK6 locus) of the 11 lead variants from multivariate GWAS. Highlighting the importance of fine-mapping multivariate GWAS results, one of the four representative variants (chr15:g.101339772G>A, rs111482836) in the PCSK6 locus was associated with disease in FinnGen, whereas the lead variant was not. Additionally, the 19 representative variants were further enriched for both missense variants and missense, splice-region and frameshift variants (37-fold, p = 1.3 × 10−17, and 28-fold, p = 1.4 × 10−17, respectively) compared to multivariate genome-wide significant variants (2.2-fold, p = 0.015, and 1.9-fold, p = 8.8 × 10−4, respectively), as were the 183 variants in the credible sets (3.9-fold, p = 0.050, and 2.9-fold, p = 0.050, respectively). In one of the two credible sets in the F5 locus, a missense variant (chr1:g.169515529A>G, rs9332701), predicted deleterious by SIFT and probably damaging by PolyPhen, was found to be in high LD (r2 = 0.996) with the representative non-coding variant chr1:g.169505159C>T (rs61808983) and had a marginally smaller causal probability (46.1% vs. 53.3%). We assessed whether the causal probabilities changed in the credible set if the LCP was generated for the missense variant rs9332701 rather than the lead variant rs61808983. This had no notable effects on the causal probabilities (46.1% vs. 48.5%, 53.3% vs. 51.5% for rs9332701 and rs61808983, respectively).

To assess the possible bias toward the lead variants more generally, we constructed LCPs for all multivariate genome-wide significant variants in the F5 locus (n = 85). For each of the variants, we compared the p value from LCP-GWAS in which the LCP was constructed for the F5 lead variant to that in which the LCP was constructed for the variant itself (Supplementary Fig. 7). LCP-GWAS results indicated no significant bias toward the lead variant, and thus, no substantial bias in the fine-mapping results, even when the LD between the variants was only moderate. In addition, we assessed how the phenotype weights used to construct LCPs correlated among variants in the same locus, and also compared to them across loci. As expected, the phenotype weights were highly correlated for variants in high LD (e.g., in the same credible set or the same locus), but not across different loci (Supplementary Fig. 8).

Fine-mapping suggested at least as many causal signals as there were conditional rounds in stepwise conditional analysis (n = 16), thus verifying the results from FINEMAP. Further, 13 of the 19 (68.4%) representative variants were also conditioned on in the conditional analysis (Supplementary Table 5). The main benefit of fine-mapping is the probabilistic quantification of possible causal configurations that contain multiple variants. Such metrics are not available in standard implementations of stepwise conditional analysis.

### Identifying driver traits

Next, we studied which traits were driving the multivariate associations in each of the 11 loci using MetaPhat [16]. The number of driver traits for each of the 11 loci varied between one and all 12. The driver traits were largely in line with the univariate results; the most significantly associated biomarkers in the univariate GWAS were typically included among the driver traits (Table 2). In loci with multiple representative variants, driver traits for the variants were generally subsets of the lead variant’s driver traits, and a stronger multivariate association increased the number of driver traits. However, this relationship between multivariate p value and the number of driver traits did not hold across loci. Further, driver traits typically included all or some of the biomarkers that had previously been associated with the locus (Table 2).

### Disease implications of the multivariate loci

Finally, we tested how the 19 representative variants in the 11 loci associated with disease risk among 2367 disease endpoints defined in FinnGen. Altogether, 53 disease associations were observed with seven representative variants. Two of these variants did not lead the multivariate associations at the 11 loci and thus would have gone unnoticed without fine-mapping.

To assess the relevance of the representative variants for their disease associations in FinnGen, the disease associations were conditioned on the variant with the strongest FinnGen disease association within the locus. In 13 of the 53 FinnGen disease associations with the representative variants, the representative variant or a variant in near perfect LD (r2 > 0.95) led the association signal or remained significant after conditioning. We also tested the disease associations in the UKBB, where associations with p values < 0.05 were considered replicated provided that the directions of effect were consistent (Supplementary Table 6).

In addition to disease associations, we explored whether the representative variants or variants in LD with them (r2 > 0.6) had previously been reported as pQTLs. Several reported pQTLs [5, 6, 27, 28] in the 11 loci, most of which colocalized with the multivariate biomarker associations, provided evidence for the biologically relevant functions of the representative variants (Supplementary Table 7).

Here we further discuss results for the three multivariate loci with disease associations (p < 1 × 10−4) in FinnGen that remained significant after conditioning. The variants identified by multivariate testing for which the associations became non-significant after conditioning were regarded as unnecessary for the observed disease association. Full disease association results for the 11 loci are shown in Supplementary Table 8.

### GP6 gene locus

#### Multivariate association and FinnGen disease associations

The Finnish-enriched rare missense variant chr19:g.55032292G>A (rs199588110, AF = 0.33%, 3.7-fold enrichment), predicted deleterious by SIFT [35] and probably damaging by PolyPhen [36], was suggested to be causal in the GP6 locus. In FinnGen it led the association with benign neoplasms of meninges (OR = 6.4, p = 4.9 × 10−5). The association was not replicated in the UKBB, although this may be due to limited power, as the AF of the Finnish-enriched variant in the UKBB (0.036%) was roughly a tenth of its AF in FinnGen, and to an inadequate match of the discovery and replication phenotypes, as UKBB phenotype definitions included all benign neoplasms of the brain and spinal cord and were not restricted to neoplasms of the meninges.

#### Driver traits

All 12 biomarkers were considered driver traits of the multivariate association. Cytokines, including many of the 12 biomarkers studied (e.g., IL-6, IL-4, PDGF-BB and VEGF-A), have been implicated in the autocrine regulation of meningioma cell proliferation and motility [37,38,39,40]. Further, higher expression levels of both PDGF-BB and VEGF occur in atypical and malignant meningiomas than in benign meningiomas [40, 41] and microvascular density regulated by VEGF has been linked with time to recurrence [42]. Several phase II clinical trials have tested therapies targeting VEGF and PDGF-BB signaling pathways as treatments for recurrent or progressive meningiomas [38] with promising results for two multifunctional tyrosine kinase inhibitors, sunitinib and PTK787/ZK 222584 that inhibit both VEGF and PDGF receptors [38, 43].

### SERPINE2 gene locus

#### Multivariate association and FinnGen disease associations

The SERPINE2 locus had the most significant association in the multivariate analysis (p < 1 × 10−324). Fine-mapping identified three independent association signals, represented by three representative variants (chr2:g.224010157G>A (rs13412535), chr2:g.224036001del (rs58116674), and chr2:g.224257750T>A (rs7578029)). One of them, the intronic lead variant rs13412535 from the multivariate analysis, increased the risk of hypertrophic scars (OR = 1.3, p = 7.5 × 10−5) and was in very high LD with the variant that led the disease association in FinnGen (chr2:g.224015781T>C, rs68066031, r2 = 0.99). The association was not replicated in the UKBB, possibly due to differences in case ascertainment, as the prevalence of hypertrophic scars was 6.5 times greater in FinnGen than in the UKBB (0.350% vs. 0.053%, respectively), and it had not previously been reported at the gene level. Nonetheless, the variant in question had an association with another hypertrophic skin disorder, acquired keratoderma (OR = 1.5, p = 0.02), in the UKBB. Another representative variant, the intergenic variant rs7578029, increased the risk of infections of the skin and subcutaneous tissue (OR = 1.1, p = 9.7 × 10−5) and was in very high LD with the variant that led the disease association in FinnGen (chr2:g.224261196C>T, rs13029443, r2 = 0.97). The association did not replicate in the UKBB, which lacked a well-matching replication phenotype.

#### Previous knowledge of gene function and driver traits

The SERPINE2 gene encodes protease nexin-1, a protein in the serpin family of proteins that inhibits serine proteases, especially thrombin, and has therefore been implicated in coagulation and tissue remodeling [44]. The gene has been associated with chronic obstructive pulmonary disease and emphysema [45]. As previously reported, SERPINE2 has been shown to inhibit extracellular matrix degradation [46] and overexpression of SERPINE2 has been shown to contribute to pathological cardiac fibrosis in mice [47]. Additionally, serine protease inhibitor genes including SERPINE2 have been noted to be heavily induced during wound healing [48]. According to GTEx the SERPINE2 gene is most highly expressed in fibroblasts. Further, inflammation plays an important role in hypertrophic scar formation and cytokines including PDGF and VEGF are dysregulated in hypertrophic scars [49]. The lead variant had genome-wide significant associations with 11 of the 12 biomarkers and all 12 were regarded as driver traits of the association.

#### pQTLs

The lead variant (chr2:g.224010157G>A, rs13412535) is a pQTL impacting one of the driver traits, PDGF-BB levels (posterior probability of shared causal variant from colocalization analysis, PP = 5.06 × 10−5), and an intronic variant chr2:g.224015781T>C (rs68066031) in high LD (r2 = 0.99) with the lead variant is a pQTL for SERPINE2 [6, 27] (PP = 0.976). PDGF is considered essential in wound repair [50] and growth factors including PDGF are considered key players in the pathogenesis of hypertrophic scars [51]. PDGF enhances pathologic fibrosis in several tissues such as skin, lung, liver, and kidney by means of mitogenic and chemoattractant actions on the principal collagen-producing cell type, myofibroblasts, as well as stimulation of collagen production [52].

### ABO gene locus

#### Multivariate association and FinnGen disease associations

An association with the ABO locus was only detected by multivariate analysis (minimum univariate p = 2.1 × 10−5 for the lead variant from multivariate analysis). Fine-mapping identified one association signal represented by the intronic lead variant chr9:g.133271182T>C (rs550057, aka rs879055593) from multivariate analysis (p = 8.5 × 10−14). It was associated with 45 endpoints in FinnGen, such as endometriosis, heart failure, and statin usage. Most of these associations resulted from LD with other, stronger regional associations; however, nine remained significant after conditioning on other lead variants within the ABO locus, including a risk-increasing effect on anemias, for which rs550057 led the genome-wide significant association signal (p = 4.7 × 10−8), visual field disturbances (p < 6.5 × 10−5), and diseases of the ear and mastoid process (p = 4.8 × 10−5). Because of poor phenotype matching, replication could be attempted in the UKBB for only two of the nine associations (other anemias and visual field defects), and neither replicated; however, bearing relevance to the genome-wide significant finding in anemia, rs550057 led the association with red blood cell count in the UKBB (p = 1.3 × 10−212) [53].

#### Driver traits

IL-4 was the only driver trait of the multivariate association and has been implicated in the pathogenesis of many of the diseases associated with the locus. Aplastic anemia is considered to result primarily from immune-mediated bone marrow failure and an imbalance in Type I versus Type II T-cells that secrete IL-4 among other cytokines has been reported [54]. In endometriosis, IL-4 levels have been shown to be upregulated and induce the proliferation of endometriotic stromal cells [55, 56].

#### pQTLs

The lead variant chr9:g.133271182T>C (rs550057) is a pQTL impacting the levels of four proteins: ALPI (PP = 0.999), CHST15 (PP = 0.999), FAM177A1 (PP = 0.999), and JAG1 (PP = 0.995) [6]. Two of these proteins, carbohydrate sulfotransferase 15 (CHST15) and Jagged1 (JAG1), have been implicated in the pathogenesis of diseases associated with the locus. A small-interfering RNA targeting CHST15 improved myocardial function and reduced cardiac fibrosis, hypertrophy, and secretion of proinflammatory cytokines in rats with chronic heart failure [57]. Upregulation of JAG1 has been reported in the endometrium of patients with endometriosis compared to controls [58]. Alagille syndrome, which is mainly caused by mutations in the JAG1 gene, is accompanied by congenital heart defects and varying degrees of hypercholesterolemia [59].

## Discussion

We developed a novel method for multivariate GWAS follow-up analyses and demonstrated the considerable boost in power provided by multivariate GWAS using 12 highly correlated inflammatory markers. In total, four out of 11 genome-wide significant loci were detected only by multivariate analysis when adjusting univariate GWAS for multiple testing. Multivariate analysis might also highlight more plausible candidates for causal variants than univariate analyses. For example, in the C1QA locus, the lead variant in the univariate GWAS of the driver trait TNF-β was an intronic variant in the EPHB2 gene, whereas the lead variant for the locus in the multivariate analysis was a Finnish-enriched missense variant located in the C1QA gene which has been previously associated with immunologic diseases [60]. Our multivariate analysis may point toward a plausible mechanism underlying these associations via TNF-β levels.

Although both univariate and multivariate scans have previously been applied to these biomarkers [1, 61], these studies have suffered from the lack of essential follow-up analyses due to the absence of beta estimates in multivariate summary statistics. Our novel method enables two key follow-up analyses for multivariate GWAS: fine-mapping and trait prioritization. Our method solves the problem of missing effect sizes and standard errors required for fine-mapping by an extension of metaCCA followed by LCP-GWAS. This process allows for the transformation of CCA-based multivariate GWAS results into univariate summary statistics and thus extends the use of FINEMAP and other summary statistics-based tools to multivariate GWAS. Fine-mapping complex multivariate associations allows for assessing causality of the variants within the associated loci. This has not been previously feasible. We also further describe the multivariate associations by determining the traits driving the associations using MetaPhat. This workflow allows the identification of both the variants and traits underlying the multivariate associations.
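The transformation described above can be illustrated with a minimal sketch. It is not the implementation used in this study: it reads LCP as a linear combination of the phenotypes, works on simulated individual-level data, and skips the metaCCA step by using the fact that, with a single genotype variable, the canonical weights are proportional to the least-squares fit of the genotype on the phenotypes. The function names `lcp_weights` and `univariate_scan`, the simulated effect sizes, and the LD structure are all hypothetical.

```python
import numpy as np

def lcp_weights(Y, g):
    """Phenotype weights maximizing the correlation between Y @ w and g."""
    Yc = Y - Y.mean(axis=0)
    gc = g - g.mean()
    # With one genotype variable, the CCA weights are proportional to
    # Sigma_YY^{-1} Sigma_Yg, i.e. the least-squares fit of g on Y.
    w, *_ = np.linalg.lstsq(Yc, gc, rcond=None)
    return w

def univariate_scan(y, G):
    """Simple linear regression of y on each column of G (with intercept)."""
    n = len(y)
    yc = y - y.mean()
    Gc = G - G.mean(axis=0)
    sxx = (Gc ** 2).sum(axis=0)
    beta = Gc.T @ yc / sxx
    resid_ss = (yc ** 2).sum() - beta ** 2 * sxx
    se = np.sqrt(resid_ss / (n - 2) / sxx)
    return beta, se

# Simulated region (assumption): variant 0 is the lead, the rest are in LD with it.
rng = np.random.default_rng(0)
n, k, m = 2000, 3, 5
lead = rng.binomial(2, 0.3, size=n).astype(float)
others = [np.where(rng.random(n) < 0.8, lead, rng.binomial(2, 0.3, size=n))
          for _ in range(m - 1)]
G = np.column_stack([lead] + others)

# Three correlated traits with small, hypothetical effects of the lead variant.
shared = rng.normal(size=(n, 1))
Y = 0.7 * shared + rng.normal(size=(n, k)) + lead[:, None] * np.array([0.08, -0.05, 0.0])

# Build the LCP for the lead variant and scale it to unit variance.
w = lcp_weights(Y, G[:, 0])
lcp = (Y - Y.mean(axis=0)) @ w
lcp = lcp / lcp.std()

# Regional LCP-GWAS: betas and standard errors usable by summary-statistics tools.
beta_lcp, se_lcp = univariate_scan(lcp, G)
z_lcp = beta_lcp / se_lcp

# Univariate z-scores of each trait at the lead variant, for comparison.
z_uni = []
for j in range(k):
    b, s = univariate_scan(Y[:, j], G[:, :1])
    z_uni.append(b[0] / s[0])

print("LCP z-scores across the region:", np.round(z_lcp, 2))
print("Univariate z-scores at the lead:", np.round(z_uni, 2))
```

Because each single trait is itself a linear combination of the phenotypes, the in-sample z-score of the LCP at the lead variant is never smaller in magnitude than that of any individual trait. The same property means the weights are tuned to the variant they were built for, so estimates at that variant can be optimistic.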

Our study also elucidates the advantage of multivariate analysis combined with large biobank-based phenome-wide screening by discovering multiple novel disease associations. For example, in the GP6 locus we observe a novel risk-increasing association between the Finnish-enriched rare missense variant chr19:g.55032292G>A (rs199588110) and benign neoplasms of meninges. Altogether, a majority of the observed disease associations were for the ABO locus, which was detected only by multivariate GWAS. All these associations, including a genome-wide significant association with anemia that replicated in the UKBB as an effect on red blood cell count, would have gone undetected had we used univariate GWAS. In addition to disease association discovery, our workflow provides insight into the pathophysiology underlying the associations by identifying the biomarkers driving them. Exploration of biological evidence in the GP6, SERPINE2, and ABO loci, including pQTLs, most of which colocalized with the multivariate biomarker associations, orthogonally supports our evidence of causal variants and driver traits. For example, in the SERPINE2 locus, one of the three representative variants, chr2:g.224010157G>A (rs13412535), increased the risk of hypertrophic skin disorders in FinnGen and was a pQTL for PDGF-BB [6], which is considered a key player in the pathogenesis of hypertrophic scars [51]. This adds to the evidence for the biologically relevant functions of this variant.

These methodological developments and novel findings notwithstanding, our study has some limitations. First, our newly developed workflow for multivariate fine-mapping requires individual-level genotype and phenotype data, which may be problematic in some analysis settings. Additionally, the LCPs are optimized for the lead variants, potentially resulting in overestimation of the causal probability of these variants. We did not, however, see evidence of this in the F5 locus, where we constructed LCPs for each variant reaching genome-wide significance in the multivariate analysis and compared the p values from LCP-GWAS when the LCPs were constructed for either the lead variant or the variant itself. Because LCP-GWAS is regional, its summary statistics cannot be used for genome-wide methods such as heritability estimation. We also acknowledge that the credible sets we chose for follow-up may not encompass all causal signals within the multivariate associations. The credible sets excluded due to low LD may arise from multiple signals included in the same set, resulting in low LD within the set. Further, some disease associations require replication and follow-up analyses.

On the other hand, our study has many strengths. First, a prospective cohort study was used to assess deep phenotype data rarely available at large scale. Second, we are among the first to present phenome-wide results from FinnGen, a very large and well-phenotyped Finnish biobank study, and also make use of the UKBB in disease association follow-up, ensuring enough power for disease association detection. Finland has a public healthcare system and national health registries, which enable the vast and accurate phenotyping in FinnGen. Besides FinnGen, an additional advantage to performing the study in Finns is that deleterious variants are enriched in the Finnish population due to population history [21]. Furthermore, our reference panel for genotype imputation is from the same population as our discovery and follow-up data sets, which, as demonstrated also by others [62, 63], allows us to study variants that are enriched (and often unique) in the study-specific population.

In conclusion, we developed a novel workflow for multivariate GWAS discovery and follow-up analyses, including fine-mapping and identification of driver traits, and thus promote the advancement of powerful multivariate methods in genomic analyses. We demonstrate the benefit of applying this workflow by identifying novel associations and further describing previously reported associations with both biomarkers and diseases using a set of inflammatory markers. We show that compared to univariate analyses, multivariate analysis of biomarker data combined with large biobank-based PheWAS reveals a considerably increased number of novel genetic associations with several diseases.