Deep coverage whole genome sequences and plasma lipoprotein(a) in individuals of European and African ancestries

Zekavat, Seyedeh M.; Ruotsalainen, Sanni; Handsaker, Robert E.; Alver, Maris; Bloom, Jonathan; Poterba, Timothy; Seed, Cotton; Ernst, Jason; Chaffin, Mark; Engreitz, Jesse; Peloso, Gina M.; Manichaikul, Ani; Yang, Chaojie; Ryan, Kathleen A.; Fu, Mao; Johnson, W. Craig; Tsai, Michael; Budoff, Matthew; Vasan, Ramachandran S.; Cupples, L. Adrienne; Rotter, Jerome I.; Rich, Stephen S.; Post, Wendy; Mitchell, Braxton D.; Correa, Adolfo; Metspalu, Andres; Wilson, James G.; Salomaa, Veikko; Kellis, Manolis; Daly, Mark J.; Neale, Benjamin M.; McCarroll, Steven; Surakka, Ida; Esko, Tonu; Ganna, Andrea; Ripatti, Samuli; Kathiresan, Sekar; Natarajan, Pradeep

doi:10.1038/s41467-018-04668-w

Article
Open access
Published: 04 July 2018

Seyedeh M. Zekavat ORCID: orcid.org/0000-0003-4026-8944^1,2,3,
Sanni Ruotsalainen⁴,
Robert E. Handsaker ORCID: orcid.org/0000-0002-3128-3547^1,5,6,
Maris Alver^8,9,
Jonathan Bloom ORCID: orcid.org/0000-0001-8497-1434^1,7,
Timothy Poterba^1,7,
Cotton Seed^1,7,
Jason Ernst ORCID: orcid.org/0000-0003-4026-7853¹⁰,
Mark Chaffin¹,
Jesse Engreitz ORCID: orcid.org/0000-0002-5754-1719¹,
Gina M. Peloso¹¹,
Ani Manichaikul¹²,
Chaojie Yang¹²,
Kathleen A. Ryan¹³,
Mao Fu¹³,
W. Craig Johnson¹⁴,
Michael Tsai¹⁵,
Matthew Budoff¹⁶,
Ramachandran S. Vasan^17,18,
L. Adrienne Cupples^11,17,
Jerome I. Rotter¹⁹,
Stephen S. Rich¹²,
Wendy Post²⁰,
Braxton D. Mitchell²¹,
Adolfo Correa ORCID: orcid.org/0000-0002-9501-600X²²,
Andres Metspalu⁹,
James G. Wilson²²,
Veikko Salomaa²³,
Manolis Kellis^1,24,
Mark J. Daly^1,5,7,
Benjamin M. Neale ORCID: orcid.org/0000-0003-1513-6077^1,5,7,
Steven McCarroll^1,5,6,
Ida Surakka⁴,
Tonu Esko ORCID: orcid.org/0000-0003-1982-6569^1,9^na1,
Andrea Ganna^1,5,7^na1,
Samuli Ripatti^1,4,25^na1,
Sekar Kathiresan ORCID: orcid.org/0000-0002-6724-032X^1,26,27,28^na1,
Pradeep Natarajan ORCID: orcid.org/0000-0001-8402-7435^1,26,27,28^na1 &
NHLBI TOPMed Lipids Working Group

Nature Communications volume 9, Article number: 2606 (2018) Cite this article

A Publisher Correction to this article was published on 01 April 2020

A Publisher Correction to this article was published on 23 August 2018

This article has been updated

Abstract

Lipoprotein(a), Lp(a), is a modified low-density lipoprotein particle that contains apolipoprotein(a), encoded by LPA, and is a highly heritable, causal risk factor for cardiovascular diseases that varies in concentrations across ancestries. Here, we use deep-coverage whole genome sequencing in 8392 individuals of European and African ancestry to discover and interpret both single-nucleotide variants and copy number (CN) variation associated with Lp(a). We observe that the genetic determinants of Lp(a) in Europeans and Africans include several that are unique to each ancestry. The common variant rs12740374 associated with Lp(a) cholesterol is an eQTL for SORT1 and independent of LDL cholesterol. Observed associations of aggregates of rare non-coding variants are largely explained by LPA structural variation, namely the LPA kringle IV 2 (KIV2)-CN. Finally, we find that LPA risk genotypes confer greater relative risk for incident atherosclerotic cardiovascular diseases compared to directly measured Lp(a), and are significantly associated with measures of subclinical atherosclerosis in African Americans.

Introduction

Lipoprotein(a), Lp(a), is a circulating lipoprotein composed of a modified low-density lipoprotein (LDL) particle covalently bonded to apolipoprotein(a), apo(a)^1,2,3. The apo(a) protein contains an inactive protease domain, kringle V domain, and ten kringle IV domains, including an extremely polymorphic kringle IV 2 copy number (KIV2-CN)³, a large region spanning 5.5 kb, which consists of a pair of exons repeating from 5 to over 40 times per chromosome⁴. Increased KIV2-CN results in increased apo(a) size, which is inversely associated with plasma Lp(a) levels due to altered protein folding, transport, and secretion⁵. Twin studies have suggested that Lp(a) is highly heritable, with up to 90% heritability in both African and European populations^6,7,8,9,10. However, the most recent genome-wide association studies have only explained approximately half of the genetic heritability¹¹. Epidemiologic studies and genetic analyses in European and Asian populations have causally linked Lp(a) concentrations with atherosclerotic cardiovascular disease, independent of other plasma lipids including LDL cholesterol^12,13,14,15. As a result, Lp(a) has emerged as a promising therapeutic target for atherosclerotic cardiovascular diseases.

Plasma Lp(a) distributions vary significantly among ethnicities but these differences are not explained by known differential KIV2-CN distributions between the ethnicities and are posited to be related to primary sequence¹⁶. Additionally, studies suggest that apo(a) isoform and Lp(a) concentration may have differential effects on coronary heart disease (CHD) odds¹⁴; however, distinguishing isoform-independent genetic effects on Lp(a) has required separate genotyping strategies, typically qPCR¹⁷, in addition to genotyping single-nucleotide polymorphisms (SNPs). Deep-coverage (>20×) whole genome sequencing (WGS) provides the opportunity to determine the full range of genomic variation that influences Lp(a) concentration and isoform size, across the allele frequency spectrum and variant type among diverse individuals.

Here, we use deep-coverage WGS in 2284 Estonians, 2690 Finnish individuals, and 3418 African Americans to ascertain SNPs and indels across the genome, and structural variants at LPA, including KIV2-CN. We perform: (1) structural variant association analyses; (2) common variant association; (3) rare variant association in coding and non-coding sequence; and (4) Mendelian randomization (MR) analyses. Our goals are three-fold: (1) to understand the full spectrum of genetic variation influencing Lp(a) and Lp(a)-cholesterol (Lp(a)-C); (2) to compare genetic differences between Europeans and African Americans; and (3) to determine the phenotypic consequences of LPA variant classes on incident clinical events and subclinical measures (Fig. 1).

Through WGS, we observe that Lp(a) is substantially heritable in both Europeans and African Americans despite notable inter-ethnic differences in circulating biomarker concentrations. Furthermore, we use WGS to directly genotype LPA structural variation, including KIV2-CN. Through common variant and rare variant analyses, we dissect the genetic architecture of Lp(a), finding novel genetic associations and identifying sources of inter-ethnic genetic differences. Finally, using a new imputation model to estimate KIV2-CN, we show that distinct LPA variant classes differentially influence clinical and subclinical atherosclerosis.

Results

WGS and baseline characteristics

A total of 8392 participants underwent deep-coverage (mean attained 33 × coverage) WGS: 3418 African Americans from the Jackson Heart Study (JHS) as part of the NIH/NHLBI Trans-Omics for Precision Medicine (TOPMed) program, 2284 Europeans from the Estonian Biobank (EST), and 2690 Europeans from the Finland FINRISK study (FIN) (Supplementary Fig. 1). FIN WGS and whole-exome sequences were used to impute into 27,344 Finnish array data for analyses. Following quality control (Supplementary Table 1), a total of 119.4 M SNPs and 7.2 M indels were discovered across EST WGS, JHS WGS, and FIN imputation datasets analyzed (Supplementary Figs. 2, 3, Supplementary Table 2).

We obtained both Lp(a) and Lp(a)-C where available. 4767 individuals from EST and JHS WGS with Lp(a)-C available and 9272 individuals from the JHS WGS and FIN imputation dataset with Lp(a) available were included in analyses requiring these phenotypes. Lp(a)-C values were quantified using the Vertical Autoprofile (VAP) method, which measures cholesterol concentration via densitometry^18,19. Lp(a) values were quantified using two immunoassay-based methods sensitive to the entire mass of the Lp(a) particle. Median Lp(a) levels in JHS (median (IQR) 46 (24–79) mg/dL) were nearly ten times higher than in FIN (5 (2–10) mg/dL), while the Lp(a)-C distribution was similar between EST (7 (5-9) mg/dL) and JHS (7 (5–11) mg/dL) (Supplementary Table 3, Supplementary Fig. 4a, b). Finnish individuals have among the lowest Lp(a) concentrations across European populations²⁰. This may explain why we observe a 10-fold difference between JHS and FIN Lp(a) concentrations versus the 2–3 fold differences previously observed between African and European populations¹⁶. Among JHS individuals with both Lp(a) and Lp(a)-C available, the concentrations between these phenotypes were moderately correlated (Spearman correlation (R_s) = 0.46, P = 2.4 × 10⁻¹⁴³) (Supplementary Fig. 5).

Structural variant discovery and imputation of KIV2-CN

Structural variants, notably KIV2-CN, at LPA have been previously shown to influence apo(a) size and Lp(a) concentration¹⁷. From the WGS data, we used Genome STRiP²¹ to identify and genotype nine structural variants at the LPA locus (Fig. 2a, Supplementary Table 4), all rare except the KIV2-CN repeat. We mapped the reported 6 KIV2 repeats present in the hg19 reference genome²², finding that the KIV2-CN repeat occurs between positions chr6:161032565–161067901 with each repeat copy containing 5534–5546 base pairs and two coding exons (Supplementary Fig. 6a). The KIV2-CN (quantified as the sum of the KIV2 allelic copy number across both chromosomes) distribution is slightly different between African American (mean 38.5 (SD 7.4)) and European (mean 43.7 (SD 6.2)) ethnicities, ranging from 12.0 to 84.6 copies (Supplementary Fig. 6b, Supplementary Table 5). In earlier work, we validated Genome STRiP copy number estimates using ddPCR²³, which establishes general accuracy for the quantified absolute copy number. To evaluate the precision of our KIV2-CN estimates, we utilized 123 pairs of siblings from JHS that were confidently identical-by-descent at both LPA 1 Mb window haplotypes (genotype concordance >99%), and found a very strong and robust correlation between sibling pair KIV2 copy number estimates (r² = 0.989) (Supplementary Fig. 7a–d).
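As a rough illustration of how aggregate read depth can be turned into a summed copy number, the sketch below scales the depth observed over the reference KIV2 region by the depth of flanking diploid sequence. It is not the Genome STRiP algorithm. The function name, the assumption that reads from every KIV2 copy map evenly onto the 6 reference repeats, and the example depths are all hypothetical.

```python
# Number of KIV2 repeats present in the hg19 reference (stated in the text).
REFERENCE_KIV2_COPIES = 6


def estimate_total_kiv2_cn(kiv2_mean_depth, flank_mean_depth,
                           reference_copies=REFERENCE_KIV2_COPIES):
    """Return a summed (both chromosomes) KIV2 copy number from read depth.

    Assumption: flanking sequence is diploid (2 copies), and reads from all
    KIV2 copies in the sample spread evenly over the reference repeats, so
    depth per reference repeat is proportional to total_cn / reference_copies.
    """
    if flank_mean_depth <= 0:
        raise ValueError("flank_mean_depth must be positive")
    depth_ratio = kiv2_mean_depth / flank_mean_depth
    return 2 * reference_copies * depth_ratio


# Hypothetical depths: 100x over the KIV2 region, 30x over flanking sequence.
total_cn = estimate_total_kiv2_cn(100.0, 30.0)  # roughly 40 copies in total
```

Because only the summed depth is used, a result near 40 is compatible with many allele pairs (for example 20 + 20 or 12 + 28), which is why allelic copy number cannot be resolved this way.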

LPA locus variants, namely rs3798220 and rs10455872, have been previously associated with KIV2-CN^14,15. In the FIN WGS, these two SNPs account for 12% of the variance of directly genotyped KIV2-CN. To improve KIV2-CN estimation from SNPs, we developed an imputation model using 2,215 FIN with WGS and applied it to impute KIV2-CN in the 27,344 FIN with array-derived genotypes. In the FIN WGS, we applied the least absolute shrinkage and selection operator (LASSO) across high-quality (imputation quality > 0.8) variants with minor allele frequency (MAF) > 0.1% available in the FIN imputation dataset in a 4 Mb window around LPA, which yielded a 61-variant model to impute KIV2-CN (Supplementary Fig. 8a). To understand the relative importance of each of these 61 variants, a random forest model was applied (Fig. 2b, Supplementary Fig. 8b). Our model ascribed greatest importance to rs10455872, a previously described SNP associated with KIV2-CN^14,15. The full 61-variant model in our validation dataset explained 60% of variation in genotyped KIV2-CN (Supplementary Data 1, Supplementary Fig. 6c, Fig. 2c). While low-frequency loss-of-function variants have been observed by us and others^24,25 within LPA, removal of these carriers did not significantly alter the relationship between KIV2-CN and Lp(a) across all individuals (P = 0.48).
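The variant-selection step can be sketched with a small coordinate-descent LASSO. This is a minimal illustration rather than the pipeline used in the study: the penalty value, the toy dosages, and all function names are assumptions, and the imputation-quality and MAF filters are not reproduced.

```python
def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty (the LASSO proximal step)."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_coordinate_descent(dosages, target, penalty, n_sweeps=200):
    """Minimise (1/2n)*sum(residual^2) + penalty*sum(|beta_j|).

    dosages: one row per sample, one column per variant (0/1/2 or imputed).
    target:  directly genotyped KIV2-CN per sample.
    Returns (intercept, beta); variants with beta == 0.0 are dropped.
    """
    n, p = len(dosages), len(dosages[0])
    col_means = [sum(row[j] for row in dosages) / n for j in range(p)]
    centered = [[row[j] - col_means[j] for j in range(p)] for row in dosages]
    target_mean = sum(target) / n
    residual = [t - target_mean for t in target]
    col_scale = [sum(row[j] ** 2 for row in centered) / n for j in range(p)]
    beta = [0.0] * p
    for _ in range(n_sweeps):
        for j in range(p):
            if col_scale[j] == 0.0:
                continue  # monomorphic variant carries no information
            # Covariance of variant j with the residual that excludes its own fit.
            rho = sum(row[j] * (r + row[j] * beta[j])
                      for row, r in zip(centered, residual)) / n
            updated = soft_threshold(rho, penalty) / col_scale[j]
            change = updated - beta[j]
            if change != 0.0:
                residual = [r - row[j] * change
                            for row, r in zip(centered, residual)]
                beta[j] = updated
    intercept = target_mean - sum(m * b for m, b in zip(col_means, beta))
    return intercept, beta


# Toy data: variant 0 lowers KIV2-CN by 5 copies per allele, variant 1 is noise.
toy_dosages = [[0, 1], [1, 1], [2, 1], [0, 0], [1, 0], [2, 0], [0, 2], [1, 2]]
toy_cn = [40 - 5 * row[0] for row in toy_dosages]
toy_intercept, toy_beta = lasso_coordinate_descent(toy_dosages, toy_cn, penalty=0.05)
selected = [j for j, b in enumerate(toy_beta) if b != 0.0]  # indices kept by the model
```

A larger penalty drops more variants and shrinks the retained coefficients further; in practice the penalty would be chosen by cross-validation rather than fixed as here.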

We confirmed that both directly genotyped and imputed KIV2-CN were negatively associated with Lp(a)-C (−0.05 SD/CN, P < 1 × 10⁻⁶¹) and Lp(a) (−0.07 to −0.08 SD/CN, P < 1 × 10⁻¹⁹⁰), across African American and European ethnicities (Fig. 3). KIV2-CN alone explained 18% (Europeans) to 26% (African Americans) of variation in Lp(a), and for Lp(a)-C explained 14% of variation in both ethnicities. Introduction of 1/KIV2-CN to the multivariable model did not improve model fit for the relationship between KIV2-CN and Lp(a) (P = 0.16).

We also sought to determine whether combinations of KIV2-CN alleles summing to the same total had the same relationship with Lp(a). We observed that the relationship of homozygous KIV2-CN alleles (from 59 FIN individuals 95% homozygous-by-descent at the LPA locus) to Lp(a) was similar to the association observed across all other individuals (P = 0.21).

Common variant associations

To identify additional genomic variants associated with Lp(a) and Lp(a)-C, we performed genome-wide common variant (MAF > 0.1%) association analyses using a linear mixed model, conditioning on KIV2-CN. Association was performed at the cohort level and followed by trans-ethnic meta-analysis. We analyzed a total of 32,695,476 variants for Lp(a)-C and 31,652,301 variants for Lp(a), identifying common variants at 3 loci at conventional genome-wide significance (P < 5 × 10⁻⁸) for Lp(a)-C at LPA (rs140570886, P = 3.3 × 10⁻³⁰), CETP (rs247616, P = 6.1 × 10⁻¹⁰), and SORT1 (rs12740374, P = 1.0 × 10⁻²¹), and 2 genome-wide significant loci for Lp(a) at LPA (rs6938647, P = 4.7 × 10⁻¹²⁹), and APOE (rs7412, P = 1.3 × 10⁻²³) (Supplementary Fig. 9–11; Supplementary Data 2, 3).

The lead SORT1 locus variant, rs12740374, has been previously causally associated with LDL cholesterol²⁶. Here, Lp(a)-C association for rs12740374 was not substantially altered conditioned on either LDL cholesterol (Fig. 4a) or apolipoprotein B (Supplementary Fig. 12). Common variants at CETP are associated with HDL cholesterol²⁷ and the lead CETP locus variant for Lp(a)-C, rs247616, is no longer significant after conditioning on HDL cholesterol (Supplementary Fig. 13). Lp(a)-C is strongly associated with HDL cholesterol (B = 0.41 SD Lp(a)-C/SD HDL, P = 2.9 × 10⁻¹⁹¹); notably, HDL and Lp(a) particles have similar densities potentially influencing Lp(a)-C measurement accuracy²⁸. Finally, rs7412 (APOE p.Arg176Cys), denoting the major APOE2 polymorphism, has been previously associated with LDL cholesterol²⁹ and recently with Lp(a) in a meta-analysis¹¹. The association of rs7412 with Lp(a) is diminished when conditioning on LDL cholesterol but remains strongly associated (before conditioning: B = −0.25 SD, P = 1 × 10⁻²³, after conditioning: B = −0.18 SD, P = 5 × 10⁻¹⁶) (Fig. 4b).

On average, LPA locus genetic variants yielding a 1 SD increase in Lp(a) yield a 0.48 SD increase in Lp(a)-C, similar to the observational correlation between the two phenotypes (Supplementary Fig. 14). Iterative conditional analyses at the LPA locus showed that, for Lp(a)-C there are 2 (JHS) and 3 (EST) independent genome-wide significant variants (Supplementary Table 6a, b), while for Lp(a) there are 13 (JHS) and 30 (FIN) independent genome-wide significant variants (Supplementary Data 4) (Supplementary Fig. 15a, b), similar to the number of independent variants from past studies^11,17,30,31. We replicated Lp(a) associations for two known LPA loss-of-function (LOF) alleles^24,25: splice donor variant rs41272114 (B = −0.7 SD, P = 8 × 10⁻⁷⁷) and splice acceptor variant rs143431368 (B = −0.5 SD, P = 2 × 10⁻²⁶), and also discovered a novel LOF variant, a splice acceptor variant in exon 28 observed only in African Americans in JHS: rs199583644 (MAF = 0.28%, B = −1.5 SD, P = 3 × 10⁻¹³).

Next, we compared inter-ethnic effects of LPA locus variants attaining sub-threshold significance (P < 1 × 10⁻⁴) in either ethnicity for Lp(a) and Lp(a)-C. Spearman rank correlation of genetic effects between the two ethnicities for Lp(a)-C was 0.38 and for Lp(a) 0.16 (Supplementary Fig. 16a, b). Moderately associated (P < 1 × 10⁻²) LPA locus variants largely private in African Americans (FIN MAF < 0.1%) had larger absolute effects across MAFs compared to such variants observed in both ethnicities (P = 3 × 10⁻³²) (Supplementary Fig. 17a, b). In comparing betas from genome-wide significant variants in African Americans with betas from the same variants in Europeans (Fig. 4c), we found the strongest inter-ethnic heterogeneity (HetP = 9.8 × 10⁻⁶⁴) at an LPAL2 intronic variant at the LPA locus (rs192873801, MAF 2.8% in JHS and 2.7% in FIN) with strongly divergent effects between the two ethnicities: +0.80 SD in JHS (P = 3.8 × 10⁻³²) and −0.61 SD in FIN (P = 2.0 × 10⁻³⁵) (Supplementary Fig. 18). We noted these variants to be on separate haplotypes for JHS and FIN (Supplementary Fig. 19). Notably, the LPA loss-of-function variant rs41272114 shows similarly strong effects in both ethnicities (HetP > 0.05).
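A heterogeneity P-value of this kind can be illustrated with an inverse-variance-weighted fixed-effects meta-analysis and Cochran's Q. The text does not state which meta-analysis software or heterogeneity statistic was used, so this is a sketch under that assumption. The function names are hypothetical, and the closed-form P-value below holds only for exactly two cohorts (1 degree of freedom).

```python
import math


def fixed_effects_meta(betas, standard_errors):
    """Inverse-variance-weighted pooled effect, its SE, and Cochran's Q."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    q_statistic = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))
    return pooled, pooled_se, q_statistic


def heterogeneity_p_two_cohorts(q_statistic):
    """Upper tail of a chi-square distribution with 1 degree of freedom.

    P(chi2_1 > q) = P(|Z| > sqrt(q)) = erfc(sqrt(q / 2)).
    """
    return math.erfc(math.sqrt(q_statistic / 2.0))


# Effect sizes from the text (+0.80 SD in JHS, -0.61 SD in FIN); the standard
# errors are hypothetical placeholders, since the text reports only P-values.
pooled, pooled_se, q = fixed_effects_meta([0.80, -0.61], [0.07, 0.05])
het_p = heterogeneity_p_two_cohorts(q)
```

When two cohorts give precise estimates of opposite sign, Q becomes large and the heterogeneity P-value very small, while the pooled effect sits misleadingly near zero. This is why such a variant is better reported per cohort.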

Early family studies in Europeans and Africans have suggested the heritability of Lp(a) to be between 51% and 90%^6,7,8,9,10. A recent array-based genotyping study in KORA estimated 49%¹¹ of variance in Lp(a) from genome-wide heritability analysis of 6,002 Europeans. From WGS, we now estimate genetic heritability in African Americans and Europeans, respectively, to be 85% (SE 5%) and 75% (SE 7%) for Lp(a), and 52% (SE 7%) and 75% (SE 34%) for Lp(a)-C (Fig. 4d).

Common variant association and KIV2-CN modifier analyses

To determine if there are variants that influence the relationship between KIV2-CN and Lp(a)-C or Lp(a) concentrations, we performed variant-by-KIV2-CN interaction analyses at a 4 Mb window around LPA. We identified three independent modifier variants at this locus which influenced the relationship between KIV2-CN and Lp(a)-C (rs13192132, P = 1.73 × 10⁻¹⁵, rs1810126, P = 6.84 × 10⁻¹⁴, rs1740445, P = 6.35 × 10⁻⁹) (Fig. 5) and were consistent across ethnicities (Supplementary Table 7, Supplementary Fig. 20a, b). Sensitivity analyses of interactions were performed to assess for confounding from (1) haplotype effects and (2) single variants tagged through LD^32,33. All three variants show association with Lp(a)-C individually (P < 0.05), but are not correlated with KIV2-CN genotype (Pearson correlation r² < 0.1) (Supplementary Table 8). Furthermore, interaction associations persisted after conditioning on variants independently associated with Lp(a)-C (Supplementary Table 9).
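A modifier analysis of this form tests the coefficient on a genotype × KIV2-CN product term. The sketch below fits phenotype = b0 + bG·G + bCN·CN + bGxCN·(G × CN) by ordinary least squares. This is a simplifying assumption: the covariates, the relatedness adjustment, and the test statistic used in the study are not reproduced, and all names and toy values are hypothetical.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    augmented = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(augmented[r][col]))
        augmented[col], augmented[pivot] = augmented[pivot], augmented[col]
        for r in range(n):
            if r != col:
                factor = augmented[r][col] / augmented[col][col]
                augmented[r] = [a - factor * c
                                for a, c in zip(augmented[r], augmented[col])]
    return [augmented[i][n] / augmented[i][i] for i in range(n)]


def fit_interaction_model(genotype, kiv2_cn, phenotype):
    """Return [b0, bG, bCN, bGxCN] from an OLS fit with a product term."""
    design = [[1.0, g, cn, g * cn] for g, cn in zip(genotype, kiv2_cn)]
    n_terms = 4
    xtx = [[sum(row[a] * row[b] for row in design) for b in range(n_terms)]
           for a in range(n_terms)]
    xty = [sum(row[a] * y for row, y in zip(design, phenotype))
           for a in range(n_terms)]
    return solve_linear_system(xtx, xty)


# Toy data: each modifier allele flattens the KIV2-CN slope by 0.03 SD per copy.
toy_genotype = [0, 1, 2, 0, 1, 2]
toy_cn = [30, 30, 30, 50, 50, 50]
toy_phenotype = [2 + 0.1 * g - 0.05 * cn + 0.03 * g * cn
                 for g, cn in zip(toy_genotype, toy_cn)]
coefficients = fit_interaction_model(toy_genotype, toy_cn, toy_phenotype)
```

The last coefficient has units of SD per copy per allele, matching the "SD Lp(a)-C/CN/allele" scale on which modifier effects are reported.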

Genomic context interrogation using adult liver regulatory annotations from the Roadmap Epigenome Project³⁴ showed that the top modifier variant in EST, a 3-base deletion, rs4063600 (TAGG > T, B = +0.03 SD Lp(a)-C/CN/allele, P = 2.96 × 10⁻¹²), is in strong LD with rs13192132 (r² = 0.88) and overlies significant H3K4me3 and H3K27ac peaks (P < 1 × 10⁻²) 7,508 bases downstream of the LPA transcription start site (TSS) (Supplementary Fig. 21a). We additionally performed variant-by-KIV2-CN modifier analyses for Lp(a) using the JHS WGS (Supplementary Fig. 21b). A complete list of cohort-specific, LD-clumped significant variants is provided in Supplementary Data 5.

Rare variant analysis by coding and non-coding burden tests

Rare and low-frequency disruptive coding variants within LPA have been previously associated with Lp(a)^24,25. Here, we performed two coding rare variant association studies (RVAS) aggregating rare (MAF < 1%) variants which were (1) LOF or missense deleterious by in silico prediction tools³⁵, or (2) non-synonymous, within their respective genes, and performed association with Lp(a)-C, adjusting for KIV2-CN. All analyses were done separately for JHS and EST and meta-analyzed. While no genes reached significance in either analysis after accounting for multiple-hypothesis testing, we observed suggestive evidence for LPA in both coding RVAS tests (P = 7 × 10⁻⁴ for LOF and missense deleterious mutations, 1 × 10⁻⁴ for non-synonymous mutations) (Supplementary Data 6, 7, Supplementary Fig. 22a, b).
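The aggregation step of a gene-based burden test can be sketched as follows. This covers only the collapsing of rare alleles into a per-sample score, and assumes a simple allele-count burden. The specific burden statistic, variant annotation, and KIV2-CN adjustment used in the study are not shown, and the function names and toy genotypes are hypothetical.

```python
def alternate_allele_frequency(dosages):
    """Alternate-allele frequency from per-sample dosages coded 0, 1 or 2."""
    return sum(dosages) / (2 * len(dosages))


def burden_scores(dosages_by_variant, max_frequency=0.01):
    """Sum alternate alleles per sample over variants rarer than max_frequency.

    dosages_by_variant maps a variant ID to a list of dosages, one per sample,
    for the qualifying (for example LOF or non-synonymous) variants in one gene.
    """
    n_samples = len(next(iter(dosages_by_variant.values())))
    scores = [0] * n_samples
    for dosages in dosages_by_variant.values():
        frequency = alternate_allele_frequency(dosages)
        if 0 < frequency < max_frequency:
            for sample_index, dosage in enumerate(dosages):
                scores[sample_index] += dosage
    return scores


# Toy gene in 100 samples: two singleton variants and one common variant.
singleton_a = [1] + [0] * 99
singleton_b = [0] * 5 + [1] + [0] * 94
common = [1, 0] * 50
toy_scores = burden_scores({"var_a": singleton_a, "var_b": singleton_b,
                            "var_c": common})
```

The resulting per-sample score would then be tested against Lp(a)-C in a regression that includes KIV2-CN as a covariate; the common variant is excluded from the score by the frequency filter.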

We also interrogated whether there was evidence of rare, non-coding variants aggregated within regulatory sequences uniquely detected by WGS that influence Lp(a)-C. We performed three non-coding RVAS using the variant groupings described in the Methods along with Roadmap epigenome data³⁴ from adult liver, the main tissue where LPA is expressed (Supplementary Fig. 23, Supplementary Fig. 24). The only genome-wide significant association was for an intron of SLC22A3 at 6:160851000-160854000 with Lp(a)-C (P = 4.5 × 10⁻⁸) (Supplementary Data 8-13). Similarly, rare variants in a putative regulatory domain of SLC22A3 were recently shown to be associated with Lp(a) in a sliding window analysis using low-coverage whole genomes³⁶. However, we found that conditioning on LPA’s KIV2-CN, 128 kb away, mitigated the observed association (P = 4.3 × 10⁻³, Supplementary Data 8, 9). Upon conditioning on KIV2-CN, while no sliding windows reached statistical significance, the top window was 6:160,939,500–160,942,500 (P = 1.6 × 10⁻⁴), 13 kb downstream of the LPA transcription end site and overlapping three annotated ORegAnno³⁷ CTCF binding sites (Fig. 6).

Interrogation of rare enhancer variants predicted to influence LPA expression in liver³⁸ showed nominal evidence of association with Lp(a)-C before (P = 5 × 10⁻⁵) and after (P = 1 × 10⁻³) conditioning on KIV2-CN (Fig. 6, Supplementary Fig. 25). However, other putative gene-linked rare enhancer variants at the LPA locus, including the aforementioned SLC22A3 (Supplementary Fig. 26), also demonstrate nominal associations, highlighting current challenges in both mapping associated regulatory elements to causal genes through in silico approaches and discerning the relative impacts of potentially pleiotropic regulatory elements.

Mendelian randomization

Genetic variation at the LPA locus is an optimal instrument for MR as it strongly and specifically influences circulating Lp(a) levels. Past studies have performed Lp(a) MR across clinical and metabolic traits using genetic risk scores composed of between 1 and 18 variants^14,39,40. Here, we performed MR using three different genetic instruments per cohort to distinguish variant classes influencing Lp(a) phenotypes: (1) an expanded genetic risk score, “GRS,” composed of the sum of the KIV2-CN-adjusted variant effects from LD-pruned variants in a ~4 Mb window around LPA with sub-threshold significance (P < 1 × 10⁻⁴); (2) a “KIV2-CN” score using the directly genotyped or imputed KIV2-CN; and (3) a combined “GRS + KIV2-CN” score combining scores from (1) and (2). Each genetic instrument was normalized such that 1 unit increase in the score was equal to 1 SD increase in Lp(a) (or Lp(a)-C). In African Americans, 235 variants were used towards the Lp(a) GRS and 39 towards the Lp(a)-C GRS (Supplementary Data 14). In Europeans, 399 variants were used towards the Lp(a) GRS and 49 towards the Lp(a)-C GRS (Supplementary Data 14). The GRS + KIV2-CN score explains 45–49% of Lp(a) variance and 20% of Lp(a)-C variance (Supplementary Fig. 27, Supplementary Table 10).
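The construction and normalization of such a score can be sketched as a weighted allele sum that is then rescaled by its regression slope on the phenotype. The rescaling rule is one plausible reading of "1 unit increase in the score was equal to 1 SD increase in Lp(a)", not a confirmed description of the procedure. All function names, variant IDs, and weights below are hypothetical.

```python
def raw_genetic_score(dosages, effect_sizes):
    """Sum of per-variant effect sizes weighted by the sample's allele dosage."""
    return sum(effect * dosages.get(variant, 0.0)
               for variant, effect in effect_sizes.items())


def ols_slope(x_values, y_values):
    """Slope of the least-squares line of y on x."""
    mean_x = sum(x_values) / len(x_values)
    mean_y = sum(y_values) / len(y_values)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(x_values, y_values))
    sxx = sum((x - mean_x) ** 2 for x in x_values)
    return sxy / sxx


def normalized_scores(sample_dosages, effect_sizes, phenotype_in_sd):
    """Rescale raw scores so a 1-unit increase tracks a 1 SD phenotype increase."""
    raw = [raw_genetic_score(d, effect_sizes) for d in sample_dosages]
    slope = ols_slope(raw, phenotype_in_sd)
    return [slope * score for score in raw]


# Hypothetical KIV2-CN-adjusted effects (SD per allele) for two pruned variants.
toy_effects = {"variant_1": 0.5, "variant_2": -0.2}
toy_samples = [
    {"variant_1": 0, "variant_2": 0},
    {"variant_1": 1, "variant_2": 0},
    {"variant_1": 2, "variant_2": 1},
    {"variant_1": 0, "variant_2": 2},
]
toy_phenotype = [1.0, 2.0, 2.6, 0.2]  # hypothetical standardized Lp(a) values
toy_scores = normalized_scores(toy_samples, toy_effects, toy_phenotype)
```

After rescaling, regressing the phenotype on the score returns a slope of 1, so hazard ratios estimated per unit of score can be read as per SD of genetically predicted Lp(a).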

Association of GRS + KIV2-CN with 10 incident clinical phenotypes from the FIN imputation dataset (N = 27,344) (Fig. 7a, Supplementary Table 11) demonstrated anticipated associations for incident cardiovascular diseases (HR 1.18/Lp(a) SD, P = 1 × 10⁻⁵), comprising incident myocardial infarction (HR 1.23/Lp(a) SD, P = 8 × 10⁻⁴), CHD (HR 1.25/Lp(a) SD, P = 7 × 10⁻⁷), and stroke (HR 1.27/Lp(a) SD, P = 1 × 10⁻³). For a given effect on Lp(a), the GRS had a larger effect on incident CHD risk (HR 1.36/Lp(a) SD, P = 7.6 × 10⁻⁸) than KIV2-CN (HR 1.03/Lp(a) SD, P = 0.17). Similar trends were observed for incident myocardial infarction. While the KIV2-CN score alone was not as strongly associated with cardiovascular outcomes (P > 0.05), its estimated effect with incident MI (HR = 1.16) was similar to recent estimations in a MI case-control analysis¹⁴. Thus, power for MR using the KIV2-CN instrument may be hindered due to a limited number of incident MI cases and modest effect conferred by KIV2-CN. These results suggest that knowledge of LPA variant class genotypes may provide additional information on cardiovascular risk beyond circulating Lp(a) levels.

To determine whether LPA genomic variants influence the accumulation of subclinical cardiovascular atherosclerosis, we associated both the Lp(a) and Lp(a)-C genetic instruments with computed tomography-derived measures of atherosclerosis in the coronary arteries (CAC) and abdominal aorta (AAC) in 3221 individuals of African ancestry and 3361 individuals of European ancestry (Supplementary Table 12, Fig. 7b, Supplementary Fig. 28). Among African Americans without prevalent clinical atherosclerotic cardiovascular disease, the comprehensive (GRS + KIV2-CN) genetic instruments for both Lp(a) and Lp(a)-C demonstrated association with subclinical atherosclerosis in two vascular locations (coronary arteries and abdominal aorta): Lp(a) (AAC: B = 0.97, P = 7.38 × 10⁻⁴; CAC: B = 0.052, P = 0.032), and Lp(a)-C (AAC: B = 0.123, P = 6.3 × 10⁻³; CAC: B = 0.074, P = 0.039). Notably, this is the first known demonstration of Lp(a) or LPA genomic variants affecting atherosclerotic risk in African Americans. A prior study of African Americans from the Dallas Heart Study found no association between Lp(a) phenotype and subclinical measures of atherosclerosis, such as CAC⁴¹. With a larger sample size and use of a genetic instrument, our study has greater power for detecting this association among African Americans. Associations were less pronounced for European Americans between both observational and genetic instruments and subclinical atherosclerosis. The strongest association for European Americans was with Lp(a) GRS independent of KIV2-CN (CAC: B = 0.056, P = 0.027).

Discussion

We characterized the genetic architecture of Lp(a) and Lp(a)-C using deep-coverage WGS in 8,392 Europeans and African Americans across allele frequencies and classes. While we observe that Lp(a) is highly heritable in Europeans and African Americans, distinct and common genetic determinants influence concentrations. Using a comprehensive genetic instrument that separately imputes apo(a) isoform, we show that knowledge of LPA genotypes can better inform incident cardiovascular disease risk prediction than just knowledge of Lp(a) biomarker level.

These observations permit several conclusions. First, through whole-genome sequencing and imputation, we observe substantial genetic heritability of Lp(a)—85% (SE 5%) in African Americans and 75% (SE 6%) in Europeans. We leverage this observation to systematically dissect the heritable components of Lp(a) across the two ethnicities. Through single variant analysis, we find a novel locus for Lp(a)-C, SORT1, whereby the top variant (rs12740374) reduces plasma Lp(a)-C concentrations in both ethnicities and is independent of LDL cholesterol levels, thereby providing evidence for the sortilin receptor as a novel component in Lp(a)-C metabolism. Through genetic modifier analysis, we find evidence of three loci which affect the relationship between KIV2-CN and Lp(a)-C similarly across both ethnicities. We replicate evidence supporting rare coding variation at LPA influencing Lp(a); however, observed associations of aggregates of rare non-coding variation appeared to be largely explained by LPA structural variation, namely KIV2-CN.

Second, we observed high heritability in diverse ethnicities despite notable inter-ethnic differences in circulating biomarker concentrations. Upon finding that similar Lp(a) effect sizes are conferred per KIV2 copy in African Americans and Europeans, we delved further into KIV2-independent effects conferred by variants at the LPA locus. Among distinct sequence variation, we notably observed an LPAL2 intronic variant with significant yet opposing effects in each ethnicity, likely indicating influences from haplotype structure or gene-environment interactions. Altogether, LPA locus variants largely private to African Americans (FIN MAF < 0.1%) confer significantly greater absolute effect on standardized Lp(a) levels than variants observed in both ethnicities.

Third, WGS enables the detection of relevant genomic variants for Lp(a) which cannot be detected via WES or genotyping arrays. Furthermore, knowledge of such variants, given differential effects on circulating Lp(a) and differential effects on incident cardiovascular events, provides additional information regarding cardiovascular disease risk beyond circulating Lp(a).

This work has several limitations. First, we estimate total KIV2-CN, but individuals may have different KIV2-CN alleles on each chromosome⁴². Our CNV analysis of next-generation sequencing data relies on aggregate depth of coverage for genotyping, precluding our ability to determine allelic KIV2-CN. Nevertheless, sensitivity analyses suggest that the sum of KIV2-CN alleles may similarly associate with Lp(a) across varied KIV2-CN allele combinations. Additionally, the strongest SNP in our KIV2-CN imputation model is rs10455872, whose association with KIV2-CN has been well-described previously¹⁷, and our KIV2-CN estimate is robustly associated with Lp(a) phenotypes as expected. Second, we only assess one non-European cohort; however, it has been observed that there are distinct Lp(a) distributions in other ethnicities which may uncover additional loci and sources of genetic heterogeneity. Furthermore, given the strong influence of ancestry on Lp(a), adjustment of LPA locus ancestry may improve power for genetic association. Indeed, prior analyses of African Americans suggest that genome-wide estimations of ancestry are correlated with LPA locus ancestry estimations⁴³. Third, while in silico prediction tools for non-coding regions identify putative regulatory sequence, they are limited in their ability to (1) determine disruptive mutations, and (2) link regulatory regions to genes.

In summary, we characterize the shared and unique genetic determinants of Lp(a) using whole genome sequences in African Americans and Europeans. Additional knowledge of the complement of these determinants better informs cardiovascular disease risk prediction than biomarker alone.

Methods

Study participants

Please refer to Supplementary Note 1 for study participant details. All study participants provided written and informed consent in accordance with respective institutional review boards for each of the participating study cohorts.

WGS and variant calling

Sequencing was performed at one of two sequencing centers, with all members within a cohort sequenced at the same center. The JHS WGS individuals were sequenced at University of Washington Northwest Genomics Center (Seattle, WA) as part of the Phase 1 NIH/NHLBI Trans-Omics for Precision Medicine (TOPMed) program. The Finnish and Estonian WGS individuals were sequenced at the Broad Institute of Harvard and MIT (Cambridge, MA). Target coverage was >30× for JHS (mean attained 37.1), >20× for EST (mean attained 30.4), and >20× for FIN (mean attained 29.8).

TOPMed phase 1 BAM files were harmonized by the TOPMed Informatics Research Center (Center for Statistical Genetics, University of Michigan, Hyun Min Kang, Tom Blackwell and Goncalo Abecasis). In brief, sequence data were received from each sequencing center in the form of BAM files mapped to the 1000 Genomes hs37d5 build 37 decoy reference sequence. Processing was coordinated and managed by the ‘GotCloud’ processing pipeline⁴⁴. Samples with DNA contamination >3% (estimated using verifyBamId software⁴⁵) and <95% of the genome covered at least 10× were filtered out. The JHS WGS used for analysis are from the “freeze 3a” genotype callsets of the variant calling pipeline performed using the software tools in the following repository: https://github.com/statgen/topmed_freeze3_calling, with variant detection performed by the vt discover2 software tool⁴⁶.

WGS for FINRISK and the Estonian Biobank was performed using the Illumina HiSeqX platform at the Broad Institute of Harvard and MIT (Cambridge, MA). Libraries were normalized to 1.7 nM, constructed, and sequenced on the Illumina HiSeqX with the use of 151-bp paired-end reads for WGS, and output was processed by Picard to generate aligned BAM files (to hg19)^47,48. Variants were discovered using the Genome Analysis Toolkit (GATK) v3 HaplotypeCaller according to Best Practices⁴⁹. Finland and Estonia WGS samples were jointly called.

Whole-genome sequence sample quality control

The following three approaches were used by the TOPMed Genetic Analysis Center to identify and resolve sample identity issues in JHS: (1) concordance between annotated sex and biological sex inferred from the WGS data, (2) concordance between prior SNP array genotypes and WGS-derived genotypes, and (3) comparisons of observed and expected relatedness from pedigrees.

Additional measures for quality control of JHS, Finland, and Estonia were performed using the Hail software package (https://github.com/hail-is/hail)⁵⁰. Samples were filtered by contamination (>3.0% for JHS, >5.0% for Finland and Estonia), chimeras >5%, GC dropout >4, raw coverage (<30× for JHS, <19× for Finland and Estonia), and indeterminate genotypic sex or genotypic/phenotypic sex mismatch (Supplementary Table 1).

WGS genotype and variant quality control

The variant filtering in JHS was performed by (1) first calculating Mendelian consistency scores using known familial relatedness and duplicates, and (2) training an SVM classifier between the known variant sites (positive labels) and the Mendelian inconsistent variants (negative labels). Two additional hard filters were applied: (1) excess heterozygosity filter (EXHET), if the Hardy–Weinberg disequilibrium P-value was less than 1 × 10⁻⁶ in the direction of excess heterozygosity; (2) Mendelian discordance filter (DISC), with three or more Mendelian inconsistencies or duplicate discordances observed from the samples. Genotypes with a depth <10 were excluded, prior to filtering variants with >5% missingness.

Variants for Finland and Estonia were initially filtered by GATK Variant Quality Score Recalibration. Additionally, genotypes with GQ <20, DP <10 or >200, and poor allele balance (homozygous with <0.90 supportive reads or heterozygous with <0.20 supportive reads) were removed. Variants within low complexity regions were removed across all samples⁵¹. Variants with >20% missing calls, quality by depth <2 (SNPs) or <3 (indels), InbreedingCoeff <−0.3, and pHWE <1 × 10⁻⁹ were filtered out.
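The genotype-level thresholds above can be read as a simple pass/fail rule. The following is a minimal sketch only, not the Hail code used in the study; the function name and the `supporting_fraction` argument (the fraction of reads supporting the called genotype) are hypothetical and reflect one plausible reading of "supportive reads".

```python
def genotype_passes(gq, dp, is_het, supporting_fraction):
    """Return True if a single genotype call survives the stated filters.

    gq: genotype quality; dp: read depth at the site.
    supporting_fraction: assumed to be the fraction of reads supporting
    the called genotype (hypothetical input, not a field named in the text).
    """
    if gq < 20:
        return False
    if dp < 10 or dp > 200:
        return False
    # Allele balance: heterozygous calls need >= 0.20, homozygous calls >= 0.90.
    min_support = 0.20 if is_het else 0.90
    return supporting_fraction >= min_support
```

Variant-level filters (missingness, quality by depth, InbreedingCoeff, pHWE) would then be applied to the genotypes that remain.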

Finnish imputation and quality control

The imputation of the FINRISK samples⁵² was done using a population-specific reference panel of 2690 high-coverage whole-genome and 5093 high-coverage whole-exome sequences with IMPUTE2⁵³, which allows two panels to be used at the same time. Before phasing and imputation, the data were quality-controlled using the following criteria: samples were excluded for ambiguous sex, missingness (>5%), excess heterozygosity (±4 SD), or non-European ancestry, and SNPs were excluded for low call rate (>2% missing), low HWE P-value (<1 × 10⁻⁶), or minor allele count (MAC) <3 (if called with zCall⁵⁴) or MAC <10 (if called only with Illumina GenCall). The haplotypic phase was determined using SHAPEIT2.0⁵⁵ prior to imputation. Because the FINRISK samples were genotyped using multiple different genotyping chips, the QC, phasing, and imputation were done in multiple chip-wise batches.

Lp(a) and Lp(a)-C phenotypes

Serum Lp(a)-C was measured in both EST and JHS via density gradient ultracentrifugation (Vertical Auto Profile [VAP], Atherotech).

Lp(a) was measured in JHS using a DiaSorin nephelometric assay on a Roche Cobas FARA analyzer (Roche Diagnostics Corporation, Indianapolis, IN, USA), which measures Lp(a) mass by immunoprecipitin analysis using the SPQ™ Antibody Reagent System of DiaSorin (DiaSorin Inc., Stillwater, MN 55082-0285). Turbidity produced by the antigen–antibody complexes was measured using the Roche Modular P Chemistry Analyzer. In FIN, Lp(a) was measured from serum stored at –70 °C using a commercially available latex immunoassay on an Architect c8000 system (Quantia Lp(a), Abbott Diagnostics).

Lp(a)-C and Lp(a) were inverse-rank normalized separately by cohort for analysis.
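Inverse-rank normalization replaces each value with the standard-normal quantile of its rank within the cohort. The sketch below is illustrative only: the text does not state the rank offset or tie handling used, so the 0.5 offset and average ranks for ties are assumptions.

```python
from statistics import NormalDist


def inverse_rank_normalize(values, offset=0.5):
    """Map values to standard-normal quantiles by rank.

    Ties receive the average of their ranks. The offset (0.5 here) is an
    assumed convention; other offsets are also in common use.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j across a run of tied values.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    std_normal = NormalDist()
    return [std_normal.inv_cdf((r - offset) / n) for r in ranks]
```

Applying the function separately to each cohort's measurements mirrors the "separately by cohort" normalization described above.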

Conventional lipid phenotypes

Conventional lipoprotein cholesterols (HDL, LDL, TG, total cholesterol) and proteins (ApoB, ApoAI) were measured in EST and JHS by the VAP assay (where LDL refers to directly measured LDL, and not calculated). In FIN, these lipoproteins were measured via NMR as described in the MR methods below. In FIN, LDL cholesterol was either calculated by the Friedewald equation when triglycerides were <400 mg/dl or directly measured. Given the average effect of statins, when statins were present, total cholesterol was adjusted by dividing by 0.8 and LDL cholesterol by dividing by 0.7, as previously done⁵⁶. All lipids were inverse-rank normalized separately by cohort in analysis.
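As a worked illustration of the two adjustments just described, the sketch below applies the standard Friedewald formula (LDL = total cholesterol − HDL − triglycerides/5, all in mg/dl) and the statin divisors from the text. Function names are hypothetical, and returning `None` above the triglyceride cut-off is an assumption standing in for "use the direct measurement instead".

```python
def friedewald_ldl(total_chol, hdl, triglycerides):
    """Calculated LDL cholesterol in mg/dl; None when TG >= 400 mg/dl."""
    if triglycerides >= 400:
        return None  # caller would fall back to directly measured LDL
    return total_chol - hdl - triglycerides / 5.0


def adjust_for_statin(total_chol, ldl, on_statin):
    """Scale treated values back toward untreated levels."""
    if not on_statin:
        return total_chol, ldl
    return total_chol / 0.8, ldl / 0.7
```

For example, a statin-treated participant with total cholesterol 160 mg/dl and LDL 70 mg/dl would be analyzed as roughly 200 mg/dl and 100 mg/dl.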

KIV2-CN estimation from WGS data

Genome STRiP²¹ version 2.00.1710 was used to estimate KIV2-CN in the LPA gene. Specifically, we ran Genome STRiP read-depth genotyping on the hg19 interval 6:161032614–161067851 using the following custom settings to capture an aggregate read-depth signal over every base position: -P depth.minimumMappingQuality:0, without specifying any of the usual genome masks.

After genotyping, we estimated the number of KIV2 protein domains from the raw copy number estimate by dividing the VCF genotype field CNF by the info field GSM1 and then estimating the KIV2 copy number by

$$\mathrm{KIV2\text{-}CN} = \left( \mathrm{CNF}/\mathrm{GSM1} \right) \times 6.354 - 0.708$$

where 6.354 is derived from the number of full copies of the repeating unit represented on the hg19 reference genome and −0.708 is to adjust to the KIV2 units as visualized in Supplementary Fig. 6a, removing the outermost flanking exons that are part of the KIV1 and KIV3 (which are picked up in Genome STRiP due to their homology with the exons within the KIV2 domain).
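The conversion above is a single linear rescaling of the Genome STRiP output. A minimal sketch follows; the function name is hypothetical, and reading `CNF` and `GSM1` out of the VCF is left to the caller.

```python
def kiv2_copy_number(cnf, gsm1, scale=6.354, offset=-0.708):
    """Convert Genome STRiP CNF and GSM1 values into a KIV2 copy number.

    scale and offset are the constants given in the text.
    """
    if gsm1 == 0:
        raise ValueError("GSM1 must be non-zero")
    return (cnf / gsm1) * scale + offset
```

With purely illustrative inputs `cnf=6.0` and `gsm1=1.0`, the estimate is 6 × 6.354 − 0.708 = 37.416 KIV2 units.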

Evaluation of KIV2-CN precision

To evaluate the precision of our measurements of KIV2 copy number, we utilized 123 pairs of siblings from JHS that were confidently IBD2 (identical-by-descent on both haplotypes) at the LPA locus. To identify these sibling pairs, we interrogated the hg19 interval 6:160,450,001–161,590,000 (0.5 Mb upstream and downstream of the LPA gene) and computed the concordance of SNP genotypes in this interval between all sequenced sibling pairs. We classified all sibling pairs with less than 1% genotype discordance as confidently IBD2 at the LPA locus and compared IBD2 sibling KIV2-CNs.
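The IBD2 classification rule can be sketched as follows. This is an illustration under stated assumptions: genotypes are coded as alternate-allele counts (0, 1, 2), missing calls are `None` and are skipped, and the function names are hypothetical.

```python
def genotype_discordance(geno_a, geno_b):
    """Fraction of sites, called in both siblings, where genotypes differ."""
    compared = 0
    mismatched = 0
    for a, b in zip(geno_a, geno_b):
        if a is None or b is None:
            continue  # assumption: sites missing in either sibling are ignored
        compared += 1
        if a != b:
            mismatched += 1
    if compared == 0:
        raise ValueError("no sites called in both samples")
    return mismatched / compared


def is_confident_ibd2(geno_a, geno_b, max_discordance=0.01):
    """Apply the <1% discordance rule over the LPA-flanking interval."""
    return genotype_discordance(geno_a, geno_b) < max_discordance
```

Sibling pairs passing this rule are expected to carry the same two LPA haplotypes, so differences in their KIV2-CN estimates reflect measurement error.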

KIV2-CN Imputation

We split the FIN WGS into a training dataset comprising two thirds of the samples (1477 samples) and a validation dataset (738 samples). We then fit a least absolute shrinkage and selection operator (LASSO) model, a penalized regression method, using LD-pruned variants (--indep-pairwise 50 5 0.25 in PLINK⁵⁷) in a 4MB window around LPA that were imputed with high quality (imputation quality >0.8) and had MAF >0.001 in the FIN dataset. After applying 10-fold cross validation to find the optimal lambda (degree of shrinkage), the LASSO model selected 61 variants which minimized the mean squared error (Supplementary Fig. 8a). These 61 variants were also used in a random forest model to quantify the relative importance of each variant in the model (Supplementary Fig. 8b, Fig. 2b).
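To illustrate how LASSO shrinks some coefficients exactly to zero and thereby selects variants, the sketch below implements a small coordinate-descent solver. It is not the software used in the study (which is not named here); it assumes centred predictor columns, omits the intercept, and takes lambda as given rather than choosing it by cross validation.

```python
def soft_threshold(x, lam):
    """Shrink x toward zero by lam; values within [-lam, lam] become 0."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimise (1/2n)*||y - X b||^2 + lam*||b||_1 by cyclic updates.

    X is a list of rows (samples) of centred variant dosages.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X beta, with beta starting at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # monomorphic column carries no information
            # Covariance of column j with the residual that excludes j.
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j])
                      for i in range(n)) / n
            new_beta = soft_threshold(rho, lam) / col_sq[j]
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new_beta
    return beta
```

Variants whose coefficient remains non-zero at the chosen lambda form the selected predictor set; in practice lambda would be tuned on held-out folds of the training samples.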

Principal component analysis (PCA)

To visualize PCs across all three cohorts against each other, a panel of approximately 16,000 ancestry informative markers⁵⁸ (AIMs) identified across six continental populations⁵⁹ was chosen to derive principal components (PCs) of ancestry for all samples that passed quality control. Principal component analysis was performed using EIGENSTRAT, using suggested quality control criteria⁶⁰ (Supplementary Fig. 3). Separately, within-cohort PCA was performed for use as covariates in analysis.

Variant annotation

Variants were annotated with Hail⁵⁰ using annotations from Ensembl’s Variant Effect Predictor (VEP), ascribing the most severe, canonical consequence and gene to each variant⁶¹. For non-coding regions in adult liver cells (E066), we used the Reg2Map HoneyBadger2-intersect³⁴ at strong (P < 1 × 10⁻¹⁰) DNase I hypersensitive regions (https://personal.broadinstitute.org/meuleman/reg2map/HoneyBadger2-intersect_release/).

Variants overlapping putative enhancers and promoters from the 25-state chromatin model³⁴ at this link were annotated and used in the single variant results annotations (Supplementary Data 2, 3), as well as grouping rare variants in the “sliding window” and “by distance” non-coding rare variant studies. Variants within 1MB of a known locus from the main lipids (LDL, HDL, TG, TC), as listed in Supplementary Data 15, were annotated as “KnownLocus_rsID” and “KnownLocus_Gene” within the single variant summary results files in Supplementary Data 2, 3.

Single variant association

Single variant analysis for EST and JHS WGS was performed using Hail’s linear mixed-model regression⁵⁰ for associating each variant site with inverse normal transformed Lp(a) and Lp(a)-C within each cohort. All analyses were adjusted for KIV2-CN, age, sex, and an empirically derived kinship matrix to account for both familial and more distant relatedness⁶². To create the kinship matrix, regions of high-complexity known to have high LD were removed (as in the EPACTS make-kin --remove-complex flag); these regions included: 5:44000000–52000000, 6:24000000–36000000, 8:8000000–12000000, 11:42000000–58000000, and 17:40000000–43000000. Ten-fold random down-sampling of variants was performed to further reduce variant counts for fast processing-time.
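The region exclusion applied before building the kinship matrix amounts to an interval lookup. A minimal sketch follows, assuming inclusive interval bounds and chromosome names without a "chr" prefix; the names `EXCLUDED_REGIONS` and `in_excluded_region` are hypothetical.

```python
# Intervals listed in the text, keyed by chromosome.
EXCLUDED_REGIONS = {
    "5": [(44000000, 52000000)],
    "6": [(24000000, 36000000)],
    "8": [(8000000, 12000000)],
    "11": [(42000000, 58000000)],
    "17": [(40000000, 43000000)],
}


def in_excluded_region(chrom, pos):
    """True if the variant falls in a region removed from the kinship matrix."""
    intervals = EXCLUDED_REGIONS.get(str(chrom), [])
    return any(start <= pos <= end for start, end in intervals)
```

Note that the LPA locus itself (near 6:161 Mb) lies outside the excluded chromosome 6 interval.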

For the FIN imputation dataset, single variant analysis was performed using SNPTEST (v2.5.2), using KIV2-CN, age, sex, fasting > 10 h, and adding PC1-10 as covariates to account for population structure due to absence of kinship matrix.

To ensure robust results, we only performed single variant analysis for variants with a MAF >0.001 within either cohort. Summary statistics for JHS and FIN for Lp(a) and JHS and EST for Lp(a)-C, for the corresponding inverse-rank normalized phenotypes, were meta-analyzed across cohorts using METAL⁶³, while also calculating heterogeneity statistics. A statistical significance threshold (alpha) of 5 × 10⁻⁸ was used for these analyses.
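For intuition, a two-cohort meta-analysis with heterogeneity statistics can be sketched as an inverse-variance-weighted fixed-effect estimate with Cochran's Q and I². This is an assumption made for illustration: METAL supports both sample-size-weighted and standard-error-weighted schemes, and the text does not state which was used.

```python
import math


def fixed_effect_meta(betas, ses):
    """Inverse-variance fixed-effect meta-analysis of per-cohort estimates."""
    weights = [1.0 / se ** 2 for se in ses]
    w_sum = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / w_sum
    se = math.sqrt(1.0 / w_sum)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal P-value
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (b - beta) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return {"beta": beta, "se": se, "p": p, "q": q, "i2": i2}
```

A variant would be reported as significant when the pooled `p` falls below 5 × 10⁻⁸, with `q` and `i2` indicating whether the two cohorts disagree about the effect.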

Additionally, for the LPA locus, iterative conditional association analysis was performed by cohort. Iterative conditioning was performed until P > 5 × 10⁻⁸ was attained.

Heritability analyses

Heritability analyses in EST WGS (for Lp(a)-C) and JHS WGS (for both Lp(a) and Lp(a)-C) were performed using Hail’s linear mixed-model regression heritability estimate⁵⁰, described here https://hail.is/hail/hail.VariantDataset.html?highlight=lmm#hail.VariantDataset.lmmreg. Several filters were applied before variants were used in the kinship matrix. First, genome-wide variants underwent two-fold LD pruning as previously described via BOLT-REML⁶⁴, using variants with MAF > 0.001 and missingness < 1% with maximum LD r² = 0.9 (PLINK⁵⁷ commands used: --maf 0.001 --geno 0.01 --indep-pairwise 50 5 0.9). Regions of high-complexity were removed as previously described for single variant analysis. Ten-fold random down-sampling of variants was performed to further reduce variant counts for feasible analysis processing-time. For the heritability estimates provided, 6,370,696 variants were used towards the kinship matrix in EST Lp(a)-C analysis, 1,897,407 variants in JHS Lp(a)-C analysis, and 1,894,291 variants in the JHS Lp(a) analysis. Baseline covariates used in the model, performed separately by cohort, included age, sex, fasting >10 h, and for EST, sequencing batch. A separate heritability estimate was also derived additionally conditioning on KIV2-CN.

For the FIN imputation dataset, variants were similarly limited, filtering to variants with MAF > 0.001, imputation quality > 0.8, and applying two-fold LD-pruning and removal of complex regions as described above (though the ten-fold down-sampling was not applied to keep the variant count on the same order of magnitude as in the WGS heritability analyses). A total of 3,088,864 variants were used towards heritability analysis, which was performed using BOLT-REML. Covariates used in the analysis included age, sex, fasting >10 h, and PC1-10. A separate heritability estimate was also derived additionally conditioning on KIV2-CN. For Lp(a), heritability analysis additionally conditioning on both KIV2-CN and the KIV2-CN-independent GRS used in MR was performed. BOLT-REML was also applied towards the Lp(a) heritability analysis in JHS, arriving at the same heritability estimates as Hail (data not shown).

KIV2-CN modifier analysis

Variant-by-KIV2-CN interaction analysis in the WGS was performed at a ~4MB window (6:158532140–162664257) around LPA to identify variants that modify the relationship between directly genotyped KIV2-CN and Lp(a)-C (for EST and JHS) and Lp(a) (for JHS only). Variants with minor allele count >20 (by cohort) were included in analyses. The following interaction model was performed:

$$\mathrm{Lp(a)\text{-}C} \sim \mathrm{KIV2\text{-}CN} + \mathrm{Variant} + \mathrm{KIV2\text{-}CN} \times \mathrm{Variant} + \mathrm{covariates}$$

where the interaction effect and P-value correspond to the term “KIV2-CN × Variant”. Cohort-specific analyses were performed and for Lp(a)-C, EST and JHS interaction results were meta-analyzed using METAL⁶³. Using the full interaction results, three top modifier variants were identified (rs13192132, rs1810126, and rs1740445) that were genome-wide significant upon meta-analysis (P < 5 × 10⁻⁸), in linkage equilibrium (r² < 0.1) across both ethnic backgrounds, and had replicating interaction effect directions in both ethnicities. To determine the cohort-specific Bonferroni significance threshold, LD clumping was performed on the full interaction results separately by cohort using the following PLINK⁵⁷ flags: --clump-kb 500 --clump-p1 1 --clump-p2 1 --clump-r2 0.25. In JHS, 1373 LD-pruned variants were identified, leading to a significance threshold of P = 3.64 × 10⁻⁵. In EST, 566 LD-pruned variants were identified, leading to a significance threshold of P = 8.83 × 10⁻⁵. Clumped variants with interaction P-values surpassing the Bonferroni threshold are provided by cohort and phenotype in Supplementary Data 5. Overlap with methylation and acetylation marks was visualized using data from Roadmap for E066 adult liver cells at http://egg2.wustl.edu/roadmap/data/byFileType/alignments/consolidated/. Liver ATAC-seq data was downloaded from the ENCODE data portal (accession ENCFF893CSN). FASTQ files were adapter-trimmed and aligned to hg19 with bowtie2, and duplicate reads and reads with MAPQ <30 were removed.
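The cohort-specific thresholds quoted above are consistent with dividing a family-wise alpha of 0.05 by the number of LD-clumped variants. The alpha value is an assumption inferred from the reported numbers, and the function name is hypothetical.

```python
def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold for n_tests independent tests."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Clumped variant counts reported in the text.
jhs_threshold = bonferroni_threshold(1373)  # about 3.64e-5
est_threshold = bonferroni_threshold(566)   # about 8.83e-5
```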

Previous publications of variant-by-variant interactions have recommended performing sensitivity analyses to ensure significant interactions identified are not (1) due to the variants being in LD on the same haplotype and (2) mitigated by a separate third variant which explains the entire association^32,33. In particular, the most recent study by Fish et al.²⁸ recommended that variant-by-variant interactions be performed using uncorrelated variants (LD r² < 0.6). Thus, we checked the correlation of each of the three top identified variants with KIV2-CN by cohort (Supplementary Table 8), finding that these variants are indeed not correlated with KIV2-CN (Pearson correlation r² < 0.1). Furthermore, variants not associated (P > 0.05) with the phenotype are suggested to be removed, under the hypothesis that they may represent weak marginal effects from a true underlying interaction. Indeed, our three top Lp(a)-C interaction variants are all individually associated with Lp(a)-C (Supplementary Table 9). Lastly, conditional analysis has been suggested to ensure that the interaction model is not mitigated by a separate third variant that explains the interaction. Thus, we performed conditional analysis on the top three interaction models, conditioning on the previously identified variants from single variant analysis (reported in Supplementary Table 9) found to be conditionally independently associated with Lp(a)-C in each cohort. As seen in Supplementary Table 9, conditional analysis does not fully mitigate any of the identified interaction associations. Details on additional supplementary analysis performed imputing KIV2-CN using variants from the Illumina OmniQuad genotyping array are provided in Supplementary Note 3.

Rare variant coding and non-coding association analyses (RVAS)

Please refer to Supplementary Note 4 for details on the coding and non-coding grouping schemes used. We tested the association of the aggregate of the aforementioned groupings with each lipid trait using the mixed-model Sequence Kernel Association Test (SKAT) implementation in EPACTS to account for bidirectional effects⁶². Analyses were adjusted for age, sex, fasting >10 h, sequencing batch (just used in Estonia), and empiric kinship. Groups with at least two rare variants and combined MAF >0.001 across all aggregated variants in a given cohort were included in meta-analysis. P values were meta-analyzed using Fisher’s method. Statistical significance for each RVAS test was based on the number of groups tested and is provided in the headers of Supplementary Data 6–13.
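Fisher's method combines k independent P-values through the statistic −2 Σ ln(pᵢ), which follows a chi-square distribution with 2k degrees of freedom under the null. The sketch below uses the closed-form chi-square tail for even degrees of freedom so that it needs only the standard library; the function name is hypothetical.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent P-values with Fisher's method."""
    if not p_values or any(p <= 0.0 or p > 1.0 for p in p_values):
        raise ValueError("P-values must lie in (0, 1]")
    k = len(p_values)
    stat = -2.0 * sum(math.log(p) for p in p_values)
    half = stat / 2.0
    # Chi-square survival function with 2k df:
    # exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return math.exp(-half) * total
```

For two cohorts each giving P = 0.05 for the same variant group, the combined P-value is about 0.0175.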

Mendelian randomization

We developed three genetic instruments per cohort. The first instrument used was a genetic risk score, “GRS,” comprised of variants in a ~4MB window around LPA (6:158532140–162664257) with sub-threshold significance (P-value < 1 × 10⁻⁴), using variant effect sizes from the KIV2-CN conditioned single variant analysis and performing LD clumping in PLINK using the following parameters: --clump-kb 500 --clump-p1 0.0001 --clump-p2 1 --clump-r2 0.25. This resulted in 399 variants for Lp(a) GRS in FIN, 235 variants for Lp(a) GRS in JHS, 39 variants for Lp(a)-C GRS in JHS, and 49 variants for Lp(a)-C GRS in EST (Supplementary Data 14). The second instrument used was a “KIV2-CN” score using the directly genotyped or imputed KIV2-CN. The third instrument used was a combined “GRS + KIV2-CN” score combining scores from (1) and (2). Each of the three scores was inverse rank normalized and adjusted such that 1 unit increase in the score is equal to 1 SD increase in Lp(a) (or Lp(a)-C, depending on how the instrument was adjusted). The multiplicative factors used to adjust each score are provided in Supplementary Table 10.
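A genetic risk score of this kind is commonly computed as the effect-size-weighted sum of allele dosages across the clumped variants, then rescaled. The sketch below assumes that additive form, which the text implies but does not spell out; the function names are hypothetical, and the scaling factor stands in for the cohort-specific values in Supplementary Table 10.

```python
def genetic_risk_score(dosages, effect_sizes):
    """Weighted sum of one individual's effect-allele dosages (0 to 2)."""
    if len(dosages) != len(effect_sizes):
        raise ValueError("dosages and effect_sizes must have equal length")
    return sum(d * b for d, b in zip(dosages, effect_sizes))


def scale_score(normalized_score, factor):
    """Rescale a normalized score so 1 unit matches 1 SD of the phenotype."""
    return normalized_score * factor
```

For an individual with dosages (2, 1, 0) at three variants with effects (0.5, −0.2, 0.1), the raw score is 2 × 0.5 − 0.2 = 0.8 before normalization across the cohort.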

Please refer to Supplementary Note 2 for details on additional MESA, FHS, and OOA participants used in subclinical atherosclerosis instrumental variable analyses. The Lp(a) GRS for Europeans in MESA and FHS was based off of the FIN Lp(a) GRS, the Lp(a) GRS for African Americans in MESA and JHS was based off of the JHS Lp(a) GRS, the Lp(a)-C GRS for Europeans in MESA, FHS, and OOA was based off of the EST Lp(a)-C GRS, and the Lp(a)-C GRS for African Americans in MESA and JHS was based off of the JHS Lp(a)-C GRS.

Please refer to Supplementary Note 5 for details on incident events and subclinical measures used. For incident clinical events, a Cox proportional hazards test was performed, finding the association between each incident event and each of the genetic instruments, as well as observational Lp(a). For the quantitative subclinical measures, linear regression was performed, finding the association between each inverse-rank normalized phenotype and each of the genetic instruments, as well as inverse-rank normalized Lp(a) and Lp(a)-C (where available). Covariates used in all analyses included the first five principal components of genetic ancestry, age, sex, and whether the individual was fasting >10 h. Statistical significance for the 10 FIN incident clinical events and two subclinical atherosclerosis traits was defined using a Bonferroni significance threshold based on the number of outcome phenotypes analyzed (P = 0.005 and 0.025, respectively).

Data availability

Individual-level genotype and phenotype information for TOPMed studies are available in dbGAP (JHS: phs000964, FHS: phs000974, MESA: phs001416, OOA: phs000956). Summary-level list of genotypes and genotype counts are available on the BRAVO server (https://bravo.sph.umich.edu/). The Finnish WGS and array genotype data can be accessed through THL Biobank (https://thl.fi/fi/web/thl-biobank). The WGS data at Estonian Genome Center, University of Tartu can be accessed via Estonian Biobank (www.biobank.ee).

Change history

21 July 2018
The HTML version of this Article was updated shortly after publication to remove extraneous text that was inadvertently inserted into the legend for Supplementary Data 8 during the production process. The PDF version was correct from the time of publication.
23 August 2018
The original version of this article contained an error in the name of the author Ramachandran S. Vasan, which was incorrectly given as Vasan S. Ramachandran. This has now been corrected in both the PDF and HTML versions of the article.
01 April 2020
An amendment to this paper has been published and can be accessed via a link at the top of the paper.

References

Tsimikas, S. & Hall, J. L. Lipoprotein(a) as a potential causal genetic risk factor of cardiovascular disease: a rationale for increased efforts to understand its pathophysiology and develop targeted therapies. J. Am. Coll. Cardiol. 60, 716–721 (2012).
Utermann, G. The mysteries of lipoprotein(a). Science 246, 904–910 (1989).
Berglund, L. & Ramakrishnan, R. Lipoprotein(a): an elusive cardiovascular risk factor. Arterioscler. Thromb. Vasc. Biol. 24, 2219–2226 (2004).
Kraft, H. G., Kochl, S., Menzel, H. J., Sandholzer, C. & Utermann, G. The apolipoprotein (a) gene: a transcribed hypervariable locus controlling plasma lipoprotein (a) concentration. Hum. Genet 90, 220–230 (1992).
Lanktree, M. B., Anand, S. S., Yusuf, S., Hegele, R. A. & Investigators, S. Comprehensive analysis of genomic variation in the LPA locus and its relationship to plasma lipoprotein(a) in South Asians, Chinese, and European Caucasians. Circ. Cardiovasc Genet 3, 39–46 (2010).
Lamon-Fava, S. et al. The NHLBI Twin Study: heritability of apolipoprotein A-I, B, and low density lipoprotein subclasses and concordance for lipoprotein(a). Atherosclerosis 91, 97–106 (1991).
Austin, M. A. et al. Lipoprotein(a) in women twins: heritability and relationship to apolipoprotein(a) phenotypes. Am. J. Hum. Genet. 51, 829–840 (1992).
Schmidt, K., Kraft, H. G., Parson, W. & Utermann, G. Genetics of the Lp(a)/apo(a) system in an autochthonous Black African population from the Gabon. Eur. J. Hum. Genet. 14, 190–201 (2006).
Scholz, M. et al. Genetic control of lipoprotein(a) concentrations is different in Africans and Caucasians. Eur. J. Hum. Genet. 7, 169–178 (1999).
Mooser, V. et al. The Apo(a) gene is the major determinant of variation in plasma Lp(a) levels in African Americans. Am. J. Hum. Genet 61, 402–417 (1997).
Mack, S. et al. A genome-wide association meta-analysis on lipoprotein(a) concentrations adjusted for apolipoprotein(a) isoforms. J Lipid Res 58(9), 1834–1844 (2017).
Kraft, H. G. et al. Apolipoprotein(a) kringle IV repeat number predicts risk for coronary heart disease. Arterioscler. Thromb. Vasc. Biol. 16, 713–719 (1996).
Sandholzer, C. et al. Apo(a) isoforms predict risk for coronary heart disease. A study in six populations. Arterioscler. Thromb. 12, 1214–1226 (1992).
Saleheen, D. et al. Apolipoprotein(a) isoform size, lipoprotein(a) concentration, and coronary artery disease: a mendelian randomisation analysis. Lancet Diabetes Endocrinol. 5(7), 524–533 (2017).
Clarke, R. et al. Genetic variants associated with Lp(a) lipoprotein level and coronary disease. N. Engl. J. Med. 361, 2518–2528 (2009).
Kraft, H. G. et al. Frequency distributions of apolipoprotein(a) kringle IV repeat alleles and their effects on lipoprotein(a) levels in Caucasian, Asian, and African populations: the distribution of null alleles is non-random. Eur. J. Hum. Genet. 4, 74–87 (1996).
Lanktree, M. B. et al. Determination of lipoprotein(a) kringle repeat number from genomic DNA: copy number variation genotyping using qPCR. J. Lipid Res. 50, 768–772 (2009).
Kulkarni, K. R., Garber, D. W., Marcovina, S. M. & Segrest, J. P. Quantification of cholesterol in all lipoprotein classes by the VAP-II method. J. Lipid Res. 35, 159–168 (1994).
Kulkarni, K. R. Cholesterol profile measurement by vertical auto profile method. Clin. Lab. Med. 26, 787–802 (2006).
Waldeyer, C. et al. Lipoprotein(a) and the risk of cardiovascular disease in the European population: results from the BiomarCaRE consortium. Eur. Heart J. 38, 2490–2498 (2017).
Handsaker, R. E., Korn, J. M., Nemesh, J. & McCarroll, S. A. Discovery and genotyping of genome structural polymorphism by sequencing on a population scale. Nat. Genet. 43, 269–276 (2011).
Noureen, A., Fresser, F., Utermann, G. & Schmidt, K. Sequence variation within the KIV-2 copy number polymorphism of the human LPA gene in African, Asian, and European populations. PLoS ONE 10, e0121582 (2015).
Handsaker, R. E. et al. Large multiallelic copy number variations in humans. Nat. Genet. 47, 296–303 (2015).
Lim, E. T. et al. Distribution and medical impact of loss-of-function variants in the Finnish founder population. PLoS Genet. 10, e1004494 (2014).
Kyriakou, T. et al. A common LPA null allele associates with lower lipoprotein(a) levels and coronary artery disease risk. Arterioscler. Thromb. Vasc. Biol. 34, 2095–2099 (2014).
Musunuru, K. et al. From noncoding variant to phenotype via SORT1 at the 1p13 cholesterol locus. Nature 466, 714–719 (2010).
Willer, C. J. et al. Discovery and refinement of loci associated with lipid levels. Nat. Genet. 45, 1274–1283 (2013).
Yeang, C., Clopton, P. C. & Tsimikas, S. Lipoprotein(a)-cholesterol levels estimated by vertical auto profile correlate poorly with Lp(a) mass in hyperlipidemic subjects: implications for clinical practice interpretation of Lp(a)-mediated risk. J. Clin. Lipidol. 10, 1389–1396 (2016).
Surakka, I. et al. The impact of low-frequency and rare variants on lipid levels. Nat. Genet. 47, 589–597 (2015).
Li, J. et al. Genome- and exome-wide association study of serum lipoprotein (a) in the Jackson Heart Study. J. Hum. Genet. 60, 755–761 (2015).
Lu, W. et al. Evidence for several independent genetic variants affecting lipoprotein (a) cholesterol levels. Hum. Mol. Genet. 24, 2390–2400 (2015).
Fish, A. E., Capra, J. A. & Bush, W. S. Are interactions between cis-regulatory variants evidence for biological epistasis or statistical artifacts? Am. J. Hum. Genet. 99, 817–830 (2016).
Wood, A. R. et al. Another explanation for apparent epistasis. Nature 514, E3–E5 (2014).
Roadmap Epigenomics, C. et al. Integrative analysis of 111 reference human epigenomes. Nature 518, 317–330 (2015).
Kim, S., Jhong, J. H., Lee, J. & Koo, J. Y. Meta-analytic support vector machine for integrating multiple omics data. BioData Min. 10, 2 (2017).
Morrison, A. C. et al. Practical approaches for whole-genome sequence analysis of heart- and blood-related traits. Am. J. Hum. Genet. 100, 205–215 (2017).
Lesurf, R. et al. ORegAnno 3.0: a community-driven resource for curated regulatory annotation. Nucleic Acids Res. 44, D126–D132 (2016).
Liu, Y., Sarkar, A., Kheradpour, P., Ernst, J. & Kellis, M. Evidence of reduced recombination rate in human regulatory domains. Genome Biol. 18, 193 (2017).
Emdin, C. A. et al. Phenotypic characterization of genetically lowered human lipoprotein(a) levels. J. Am. Coll. Cardiol. 68, 2761–2772 (2016).
Kettunen, J. et al. Genome-wide study for circulating metabolites identifies 62 loci and reveals novel systemic effects of LPA. Nat. Commun. 7, 11122 (2016).
Guerra, R. et al. Lipoprotein(a) and apolipoprotein(a) isoforms: no association with coronary artery calcification in the Dallas Heart Study. Circulation 111, 1471–1479 (2005).
Marcovina, S. M., Hobbs, H. H. & Albers, J. J. Relation between number of apolipoprotein(a) kringle 4 repeats and mobility of isoforms in agarose gel: basis for a standardized isoform nomenclature. Clin. Chem. 42, 436–439 (1996).
Deo, R. C. et al. Single-nucleotide polymorphisms in LPA explain most of the ancestry-specific variation in Lp(a) levels in African Americans. PLoS ONE 6, e14581 (2011).
Jun, G., Wing, M. K., Abecasis, G. R. & Kang, H. M. An efficient and scalable analysis framework for variant extraction and refinement from population-scale DNA sequence data. Genome Res. 25, 918–925 (2015).
Jun, G. et al. Detecting and estimating contamination of human DNA samples in sequencing and array-based genotype data. Am. J. Hum. Genet. 91, 839–848 (2012).
Tan, A., Abecasis, G. R. & Kang, H. M. Unified representation of genetic variants. Bioinformatics 31, 2202–2204 (2015).
Li, B. & Leal, S. M. Methods for detecting associations with rare variants for common diseases: application to analysis of sequence data. Am. J. Hum. Genet. 83, 311–321 (2008).
Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
Van der Auwera, G. A. et al. From FastQ data to high confidence variant calls: the Genome Analysis Toolkit best practices pipeline. Curr. Protoc. Bioinforma. 11, 11 10 1–11 10 33 (2013).
Ganna, A. et al. Ultra-rare disruptive and damaging mutations influence educational attainment in the general population. Nat. Neurosci. 19, 1563–1565 (2016).
Li, H. Toward better understanding of artifacts in variant calling from high-coverage samples. Bioinformatics 30, 2843–2851 (2014).
Vartiainen, E. et al. Thirty-five-year trends in cardiovascular risk factors in Finland. Int. J. Epidemiol. 39, 504–518 (2010).
Howie, B. N., Donnelly, P. & Marchini, J. A flexible and accurate genotype imputation method for the next generation of genome-wide association studies. PLoS Genet. 5, e1000529 (2009).
Goldstein, J. I. et al. zCall: a rare variant caller for array-based genotyping: genetics and population analysis. Bioinformatics 28, 2543–2545 (2012).
Delaneau, O., Zagury, J. F. & Marchini, J. Improved whole-chromosome phasing for disease and population genetic studies. Nat. Methods 10, 5–6 (2013).
Peloso, G. M. et al. Association of low-frequency and rare coding-sequence variants with blood lipids and coronary heart disease in 56,000 whites and blacks. Am. J. Hum. Genet. 94, 223–232 (2014).
Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575 (2007).
Hoggart, C. J. et al. Control of confounding of genetic associations in stratified populations. Am. J. Hum. Genet. 72, 1492–1504 (2003).
Libiger, O. & Schork, N. J. A method for inferring an individual’s genetic ancestry and degree of admixture associated with six major continental populations. Front. Genet. 3, 322 (2012).
Price, A. L. et al. Principal components analysis corrects for stratification in genome-wide association studies. Nat. Genet. 38, 904–909 (2006).
McLaren, W. et al. Deriving the consequences of genomic variants with the Ensembl API and SNP Effect Predictor. Bioinformatics 26, 2069–2070 (2010).
Kang, H. M. et al. Variance component model to account for sample structure in genome-wide association studies. Nat. Genet. 42, 348–354 (2010).
Willer, C. J., Li, Y. & Abecasis, G. R. METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics 26, 2190–2191 (2010).
Loh, P. R. et al. Contrasting genetic architectures of schizophrenia and other complex diseases using fast variance-components analysis. Nat. Genet. 47, 1385–1392 (2015).

Acknowledgements

Please refer to Supplementary Note 6 for Acknowledgements.

Author information

These authors jointly supervised this work: Tonu Esko, Andrea Ganna, Samuli Ripatti, Sekar Kathiresan, Pradeep Natarajan.

Authors and Affiliations

Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
Seyedeh M. Zekavat, Robert E. Handsaker, Jonathan Bloom, Timothy Poterba, Cotton Seed, Mark Chaffin, Jesse Engreitz, Manolis Kellis, Mark J. Daly, Benjamin M. Neale, Steven McCarroll, Tonu Esko, Andrea Ganna, Samuli Ripatti, Sekar Kathiresan & Pradeep Natarajan
Yale School of Medicine, New Haven, CT, 06510, USA
Seyedeh M. Zekavat
Department of Computational Biology & Bioinformatics, Yale University, New Haven, CT, 06510, USA
Seyedeh M. Zekavat
Institute for Molecular Medicine, University of Helsinki, Helsinki, Finland
Sanni Ruotsalainen, Ida Surakka & Samuli Ripatti
Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
Robert E. Handsaker, Mark J. Daly, Benjamin M. Neale, Steven McCarroll & Andrea Ganna
Department of Genetics, Harvard Medical School, Boston, MA, 02115, USA
Robert E. Handsaker & Steven McCarroll
Analytic and Translational Genetics Unit, Boston, MA, 02142, USA
Jonathan Bloom, Timothy Poterba, Cotton Seed, Mark J. Daly, Benjamin M. Neale & Andrea Ganna
Department of Biotechnology, Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia
Maris Alver
Estonian Genome Center, Tallinn, Estonia
Maris Alver, Andres Metspalu & Tonu Esko
Department of Biological Chemistry, University of California, Los Angeles, Los Angeles, CA, 90095, USA
Jason Ernst
Department of Biostatistics, Boston University School of Public Health, Boston, MA, 02118, USA
Gina M. Peloso & L. Adrienne Cupples
Center for Public Health Genomics, University of Virginia, Charlottesville, VA, 22904, USA
Ani Manichaikul, Chaojie Yang & Stephen S. Rich
Program in Personalized and Genomic Medicine, Division of Endocrinology, Diabetes & Nutrition, Department of Medicine, University of Maryland School of Medicine, Baltimore, MD, 21201, USA
Kathleen A. Ryan & Mao Fu
Department of Biostatistics, School of Public Health and Community Medicine, University of Washington, Seattle, WA, 98195, USA
W. Craig Johnson
Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, MN, 55455, USA
Michael Tsai
Division of Cardiology, Harbor-UCLA Medical Center, Los Angeles Biomedical Research Institute, Los Angeles, CA, 90509, USA
Matthew Budoff
NHLBI Framingham Heart Study, Framingham, MA, 20892, USA
Ramachandran S. Vasan & L. Adrienne Cupples
Sections of Preventive Medicine and Epidemiology, and Cardiovascular Medicine, Departments of Medicine and Epidemiology, Boston University Schools of Medicine and Public Health, Boston, MA, 02118, USA
Ramachandran S. Vasan
Departments of Pediatrics and Medicine, The Institute for Translational Genomics and Population Sciences, Los Angeles Biomedical Research Institute, Harbor-UCLA Medical Center, Torrance, CA, 90509, USA
Jerome I. Rotter
Division of Cardiology, Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, 21205, USA
Wendy Post
Department of Medicine, University of Maryland School of Medicine, Baltimore, MD, 21201, USA
Braxton D. Mitchell
Department of Medicine, University of Mississippi Medical Center, Jackson, MS, 39216, USA
Adolfo Correa & James G. Wilson
National Institute for Health and Welfare, Helsinki, Finland
Veikko Salomaa
Computer Science and Artificial Intelligence Lab, Massachusetts Institute of Technology, 32 Vassar St, Cambridge, MA, 02139, USA
Manolis Kellis
Department of Public Health, Faculty of Medicine, University of Helsinki, Helsinki, Finland
Samuli Ripatti
Department of Medicine, Harvard Medical School, Boston, MA, 02115, USA
Sekar Kathiresan & Pradeep Natarajan
Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, 02114, USA
Sekar Kathiresan & Pradeep Natarajan
Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, 02114, USA
Sekar Kathiresan & Pradeep Natarajan
New York Genome Center, New York, NY, 10013, USA
Namiko Abe, Karen Bunting, Bo-Juen Chen, Soren Germer, Tanja Smith & Michael Zody
University of Michigan, Ann Arbor, MI, 48109, USA
Goncalo Abecasis, Larry Bielak, Thomas Blackwell, Jeffrey Curtis, Sayantan Das, Matthew Flickinger, Xiaoqi (Priscilla) Geng, Min A Jhun, Hyun Min Kang, Sharon Kardia, Seunggeun Shawn Lee, Jonathon LeFaive, Keng Han Lin, Patricia Peyser, Christopher Scheller, Ellen Schmidt, Jennifer Smith, Peter VandeHaar, Cristen Willer, Wei Zhao, Xiang Zhou & Sebastian Zoellner
Massachusetts General Hospital, Boston, MA, 02114, USA
Christine Albert, Emelia Benjamin, Patrick Ellinor, Steven Lubitz & Lu-Chen Weng
Wake Forest Baptist Health, Winston-Salem, NC, 27157, USA
Nicholette (Nichole) Palmer Allred, David Herrington, Yongmei Liu & Beverly Snively
Children’s Hospital of Philadelphia, University of Pennsylvania, Philadelphia, PA, 19104, USA
Laura Almasy
University of Pennsylvania, Philadelphia, PA, 19104, USA
Laura Almasy
Emory University, Atlanta, GA, 30322, USA
Alvaro Alonso, Rich Johnston, Larry Phillips & Zhaohui Qin
University of Maryland, Baltimore, MD, 21201, USA
Seth Ament, Amber Beitelshees, Christy Chang, Coleen Damcott, Scott Devine, Da-Wei Gong, Yue Guan, Daniel Harris, Elliott Hong, Michael Kessler, Joshua Lewis, Patrick McArdle, May E. Montasser, Jeff O’Connell, Tim O’Connor, Afshin Parsa, James Perry, Toni Pollin, Robert Reed, Shabnam Salimi, Amol Shetty, Elizabeth Streeten, Carole Sztalryd, Simeon Taylor, Huichun Xu, Rongze Yang & Norann Zaghloul
University of Washington, Seattle, WA, 98195, USA
Peter Anderson, Joshua Bis, Ingrid Borecki, Jennifer Brody, Jai Broome, Colleen Davis, Leslie Emery, Stephanie M. Fullerton, Stephanie Gogarten, Ben Heavner, Susan Heckbert, Deepti Jain, Jill Johnsen, Alyna Khan, Stephanie Krauter, Cathy Laurie, Cecelia Laurie, David Levine, Deborah Nickerson, Ulrike Peters, Sam Phillips, Bruce Psaty, Alex Reiner, Ken Rice, Josh Smith, Nicholas Smith, Nona Sotoodehnia, Adrienne Stilp, Adam Szpiro, Timothy A. Thornton, David Tirschwell, Fei Fei Wang, Bruce Weir, Kayleen Williams, Quenna Wong & Xiuwen Zheng
University of Mississippi, Jackson, MS, 38677, USA
Pramod Anugu, Lynette Ekunwe, Yan Gao, Michael Hall, Hao Mei, Nancy Min, Solomon Musani & Stanford Mwasongwe
National Institutes of Health, Bethesda, MD, 20892, USA
Deborah Applebaum-Bowden, Rebecca Beer, Weiniu Gan, Cashell Jaquish, Julie Mikulla, Mollie Minear, George Papanicolaou & Pankaj Qasba
Johns Hopkins University, Baltimore, MD, 21218, USA
Dan Arking, Dimitrios Avramopoulos, Emily Barron-Casella, Terri Beaty, Diane Becker, Lewis Becker, Ferdouse Begum, James Casella, Kimberly Jones, Barry Make, Rasika Mathias, Rakhi Naik, Ingo Ruczinski, Steven Salzberg, Margaret Taub, Dhananjay Vaidya & Lisa Yanek
University of Kentucky, Lexington, KY, 40506, USA
Donna K Arnett
Duke University, Durham, NC, 27708, USA
Allison Ashley-Koch & Marilyn Telen
University of Alabama, Birmingham, AL, 35487, USA
Stella Aslibekyan, Bertha Hidalgo, Marguerite Ryan Irvin, Merry-Lynn McDonald & Hemant Tiwari
Stanford University, Stanford, CA, 94305, USA
Tim Assimes, Sean David, Chris Gignoux, Marco Perez & Hua Tang
University of Wisconsin Milwaukee, Milwaukee, WI, 53211, USA
Paul Auer
Cleveland Clinic, Cleveland, OH, 44195, USA
John Barnard & Mina Chung
University of Colorado, Denver, CO, 80204, USA
Kathleen Barnes, Jonathan Cardwell, Sameer Chavan, Michelle Daya, John Hokanson, Greg Kinney, Ethan Lange, Leslie Lange, Susan Mathai, Julia Powers Becker, Meher Preethi Boorgula, Nicholas Rafaels, Pamela Russell, David Schwartz, Aniket Shetty, Tarik Walker, Avram Walts & Ivana Yang
Columbia University, New York, NY, 10027, USA
R. Graham Barr
Boston University, Boston, MA, 02215, USA
Emelia Benjamin, Honghuang Lin & Kathryn Lunetta
Fundação de Hematologia e Hemoterapia de Pernambuco - Hemope, Recife, 52011-000, Brazil
Marcos Bezerra
University of Texas Rio Grande Valley School of Medicine, Brownsville, TX, 78520, USA
John Blangero, Joanne Curran & Michael Mahaney
University of Texas Health, Houston, TX, 77225, USA
Eric Boerwinkle, Myriam Fornage, James Hixson & Degui Zhi
National Jewish Health, Denver, CO, 80206, USA
Russell Bowler, James Crapo, Tasha Fingerlin, Sara Penchev, Elizabeth Regan & Snow Xueyan Zhao
Medical College of Wisconsin, Milwaukee, WI, 53226, USA
Ulrich Broeckel
University of California, San Francisco, San Francisco, CA, 94143, USA
Esteban Burchard & Ryan Hernandez
Women’s Health Initiative, Seattle, WA, 98109, USA
Cara Carty, Jeff Haessler, Simin Liu & Lesley Tinker
University of California, Los Angeles, Los Angeles, CA, 90095, USA
Richard Casaburi, Carolyn Crandall & Karol Watson
Brigham & Women’s Hospital, Boston, MA, 02115, USA
Daniel Chasman, Michael Cho, Dawn DeMeo, Leanna Farnam, Craig Hersh, Laura Kaufman, Meryl LeBoff, JoAnn Manson, Margaret Parker, Dandi Qiao, Susan Redline, Phuwanat Sakornsakolpat, Edwin Silverman, Tamar Sofer, Jody Sylvia, Emily Wan, Scott Weiss & Carla Wilson
University of Virginia, Charlottesville, VA, 22903, USA
Wei-Min Chen, Charles Farber, Josyf C Mychaleckyj & Aakrosh Ratan
Los Angeles Biomedical Research Institute, Los Angeles, CA, 90502, USA
Yii-Der Ida Chen, Xiuqing Guo, Kevin Sandow & Kent Taylor
The Broad Institute, Cambridge, MA, 02142, USA
Seung Hoan Choi, Stacey Gabriel, Lauren Margolin & Carolina Roselli
National Taiwan University, Taipei, 10617, Taiwan
Lee-Ming Chuang
University of Vermont, Burlington, VT, 05405, USA
Elaine Cornell, Peter Durda & Russell Tracy
Blood Systems Research Institute UCSF, San Francisco, CA, 94118, USA
Brian Custer & Shannon Kelly
University of Illinois at Chicago, Chicago, IL, 60607, USA
Dawood Darbar
Mayo Clinic, Rochester, MN, 55905, USA
Mariza de Andrade
Vanderbilt University, Nashville, TN, 37235, USA
Michael DeBaun, Dan Roden & M. Benjamin Shoemaker
University of Cincinnati, Cincinnati, OH, 45220, USA
Ranjan Deka
Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
Ron Do, Bruce Gelb, Eimear Kenny, Ruth Loos, Girish Nadkarni & Michael Preuss
University of North Carolina, Chapel Hill, NC, 27599, USA
Qing Duan, Nora Franceschini, Yun Li, Kari North & Laura Raffield
University of Texas Rio Grande Valley School of Medicine, Edinburg, TX, 78539, USA
Ravi Duggirala & Juan Manuel Peralta
Washington University in St Louis, St Louis, MO, 63130, USA
Susan Dutcher, Lucinda Fulton, C. Charles Gu, D. C. Rao, Karen Schwander & Yun Ju Sung
Brown University, Providence, RI, 02912, USA
Charles Eaton, Simin Liu & Stephen McGarvey
Fred Hutchinson Cancer Research Center, Seattle, WA, 98109, USA
Margery Gass, Jeff Haessler, Charles Kooperberg, Ulrike Peters & Alex Reiner
University of Pittsburgh, Pittsburgh, PA, 15260, USA
Mark Gladwin, Ryan L Minster, Frank Sciurba, Daniel E. Weeks & Yingze Zhang
Yale University, New Haven, CT, 06520, USA
David Glahn & Nicola Hawley
University of Texas Rio Grande Valley School of Medicine, San Antonio, TX, 78229, USA
Harald Goring
Tulane University, New Orleans, LA, 70118, USA
Jiang He
University of Iowa, Iowa City, IA, 52242, USA
Karin Hoth & Robert Wallace
National Health Research Institute Taiwan, Zhunan Township, 350, Taiwan
Chao (Agnes) Hsiung
Blood Works Northwest, Seattle, WA, 98105, USA
Haley Huston, Jill Johnsen, Barbara Konkle & Sarah Ruuska
Taichung Veterans General Hospital Taiwan, Taichung City, 407, Taiwan
Chii Min Hwu, Wen-Jane Lee & Wayne Hui-Heng Sheu
Ohio State University Wexner Medical Center, Columbus, OH, 43210, USA
Rebecca Jackson
NIH National Heart, Lung, and Blood Institute, Bethesda, MD, 98106, USA
Andrew Johnson, Dan Levy & James Luo
Albert Einstein College of Medicine, New York, NY, 20892, USA
Robert Kaplan & Sylvia Smoller
Loyola University, Maywood, IL, 10461, USA
Holly Kramer
Harvard School of Public Health, Boston, MA, 98104, USA
Christoph Lange
George Washington University, Washington, 60153, USA
Lisa Martin
Harvard University, Cambridge, MA, 02115, USA
Sean McFarland, Dmitry Prokopenko & Vijay Sankaran
University of Arizona, Tucson, AZ, 20052, USA
Deborah A Meyers
Howard University, Washington, 02138, USA
Sergei Nekhai
University at Buffalo, Buffalo, NY, 85721, USA
Heather Ochs-Balcom
University of Minnesota, Minneapolis, MN, 20059, USA
James Pankow & Scott Vrieze
Northwestern University, Chicago, IL, 14260, USA
Laura Rasmussen-Torvik
Harvard Medical School, Boston, MA, 55455, USA
Christine Seidman
Baylor College of Medicine, Houston, TX, 60208, USA
Vivien Sheehan
UMass Memorial Medical Center, Worcester, MA, 98107, USA
Brian Silver
Baylor College of Medicine, Ann Arbor, MI, 02115, USA
Daniel Taliun
University of Colorado at Boulder, Boulder, CO, 77030, USA
Scott Vrieze
Henry Ford Health System, Detroit, MI, 01655, USA
L. Keoki Williams

Authors

Seyedeh M. Zekavat
Sanni Ruotsalainen
Robert E. Handsaker
Maris Alver
Jonathan Bloom
Timothy Poterba
Cotton Seed
Jason Ernst
Mark Chaffin
Jesse Engreitz
Gina M. Peloso
Ani Manichaikul
Chaojie Yang
Kathleen A. Ryan
Mao Fu
W. Craig Johnson
Michael Tsai
Matthew Budoff
Ramachandran S. Vasan
L. Adrienne Cupples
Jerome I. Rotter
Stephen S. Rich
Wendy Post
Braxton D. Mitchell
Adolfo Correa
Andres Metspalu
James G. Wilson
Veikko Salomaa
Manolis Kellis
Mark J. Daly
Benjamin M. Neale
Steven McCarroll
Ida Surakka
Tonu Esko
Andrea Ganna
Samuli Ripatti
Sekar Kathiresan
Pradeep Natarajan

Consortia

NHLBI TOPMed Lipids Working Group

Namiko Abe, Goncalo Abecasis, Christine Albert, Nicholette (Nichole) Palmer Allred, Laura Almasy, Alvaro Alonso, Seth Ament, Peter Anderson, Pramod Anugu, Deborah Applebaum-Bowden, Dan Arking, Donna K Arnett, Allison Ashley-Koch, Stella Aslibekyan, Tim Assimes, Paul Auer, Dimitrios Avramopoulos, John Barnard, Kathleen Barnes, R. Graham Barr, Emily Barron-Casella, Terri Beaty, Diane Becker, Lewis Becker, Rebecca Beer, Ferdouse Begum, Amber Beitelshees, Emelia Benjamin, Marcos Bezerra, Larry Bielak, Joshua Bis, Thomas Blackwell, John Blangero, Eric Boerwinkle, Ingrid Borecki, Russell Bowler, Jennifer Brody, Ulrich Broeckel, Jai Broome, Karen Bunting, Esteban Burchard, Jonathan Cardwell, Cara Carty, Richard Casaburi, James Casella, Christy Chang, Daniel Chasman, Sameer Chavan, Bo-Juen Chen, Wei-Min Chen, Yii-Der Ida Chen, Michael Cho, Seung Hoan Choi, Lee-Ming Chuang, Mina Chung, Elaine Cornell, Carolyn Crandall, James Crapo, Joanne Curran, Jeffrey Curtis, Brian Custer, Coleen Damcott, Dawood Darbar, Sayantan Das, Sean David, Colleen Davis, Michelle Daya, Mariza de Andrade, Michael DeBaun, Ranjan Deka, Dawn DeMeo, Scott Devine, Ron Do, Qing Duan, Ravi Duggirala, Peter Durda, Susan Dutcher, Charles Eaton, Lynette Ekunwe, Patrick Ellinor, Leslie Emery, Charles Farber, Leanna Farnam, Tasha Fingerlin, Matthew Flickinger, Myriam Fornage, Nora Franceschini, Stephanie M. Fullerton, Lucinda Fulton, Stacey Gabriel, Weiniu Gan, Yan Gao, Margery Gass, Bruce Gelb, Xiaoqi (Priscilla) Geng, Soren Germer, Chris Gignoux, Mark Gladwin, David Glahn, Stephanie Gogarten, Da-Wei Gong, Harald Goring, C. Charles Gu, Yue Guan, Xiuqing Guo, Jeff Haessler, Michael Hall, Daniel Harris, Nicola Hawley, Jiang He, Ben Heavner, Susan Heckbert, Ryan Hernandez, David Herrington, Craig Hersh, Bertha Hidalgo, James Hixson, John Hokanson, Elliott Hong, Karin Hoth, Chao (Agnes) Hsiung, Haley Huston, Chii Min Hwu, Marguerite Ryan Irvin, Rebecca Jackson, Deepti Jain, Cashell Jaquish, Min A Jhun, Jill Johnsen, Andrew Johnson, Rich Johnston, Kimberly Jones, Hyun Min Kang, Robert Kaplan, Sharon Kardia, Laura Kaufman, Shannon Kelly, Eimear Kenny, Michael Kessler, Alyna Khan, Greg Kinney, Barbara Konkle, Charles Kooperberg, Holly Kramer, Stephanie Krauter, Christoph Lange, Ethan Lange, Leslie Lange, Cathy Laurie, Cecelia Laurie, Meryl LeBoff, Seunggeun Shawn Lee, Wen-Jane Lee, Jonathon LeFaive, David Levine, Dan Levy, Joshua Lewis, Yun Li, Honghuang Lin, Keng Han Lin, Simin Liu, Yongmei Liu, Ruth Loos, Steven Lubitz, Kathryn Lunetta, James Luo, Michael Mahaney, Barry Make, JoAnn Manson, Lauren Margolin, Lisa Martin, Susan Mathai, Rasika Mathias, Patrick McArdle, Merry-Lynn McDonald, Sean McFarland, Stephen McGarvey, Hao Mei, Deborah A Meyers, Julie Mikulla, Nancy Min, Mollie Minear, Ryan L Minster, May E. Montasser, Solomon Musani, Stanford Mwasongwe, Josyf C Mychaleckyj, Girish Nadkarni, Rakhi Naik, Sergei Nekhai, Deborah Nickerson, Kari North, Jeff O’Connell, Tim O’Connor, Heather Ochs-Balcom, James Pankow, George Papanicolaou, Margaret Parker, Afshin Parsa, Sara Penchev, Juan Manuel Peralta, Marco Perez, James Perry, Ulrike Peters, Patricia Peyser, Larry Phillips, Sam Phillips, Toni Pollin, Julia Powers Becker, Meher Preethi Boorgula, Michael Preuss, Dmitry Prokopenko, Bruce Psaty, Pankaj Qasba, Dandi Qiao, Zhaohui Qin, Nicholas Rafaels, Laura Raffield, D. C. Rao, Laura Rasmussen-Torvik, Aakrosh Ratan, Susan Redline, Robert Reed, Elizabeth Regan, Alex Reiner, Ken Rice, Dan Roden, Carolina Roselli, Ingo Ruczinski, Pamela Russell, Sarah Ruuska, Phuwanat Sakornsakolpat, Shabnam Salimi, Steven Salzberg, Kevin Sandow, Vijay Sankaran, Christopher Scheller, Ellen Schmidt, Karen Schwander, David Schwartz, Frank Sciurba, Christine Seidman, Vivien Sheehan, Amol Shetty, Aniket Shetty, Wayne Hui-Heng Sheu, M. Benjamin Shoemaker, Brian Silver, Edwin Silverman, Jennifer Smith, Josh Smith, Nicholas Smith, Tanja Smith, Sylvia Smoller, Beverly Snively, Tamar Sofer, Nona Sotoodehnia, Adrienne Stilp, Elizabeth Streeten, Yun Ju Sung, Jody Sylvia, Adam Szpiro, Carole Sztalryd, Daniel Taliun, Hua Tang, Margaret Taub, Kent Taylor, Simeon Taylor, Marilyn Telen, Timothy A. Thornton, Lesley Tinker, David Tirschwell, Hemant Tiwari, Russell Tracy, Dhananjay Vaidya, Peter VandeHaar, Scott Vrieze, Tarik Walker, Robert Wallace, Avram Walts, Emily Wan, Fei Fei Wang, Karol Watson, Daniel E. Weeks, Bruce Weir, Scott Weiss, Lu-Chen Weng, Cristen Willer, Kayleen Williams, L. Keoki Williams, Carla Wilson, Quenna Wong, Huichun Xu, Lisa Yanek, Ivana Yang, Rongze Yang, Norann Zaghloul, Yingze Zhang, Snow Xueyan Zhao, Wei Zhao, Xiuwen Zheng, Degui Zhi, Xiang Zhou, Michael Zody & Sebastian Zoellner

Contributions

S.M.Z., P.N., A.G., I.S., and S.K. designed the study; S.M.Z., S.R., R.E.H., G.M.P., A.M., C.Y., and K.A.R. performed the analyses; S.M.Z., S.R., R.E.H., P.N., S.K., S.R., R.E.H., M.A., M.J.D., I.S., and A.G. interpreted the data; S.M.Z., P.N., A.G., and S.K. drafted the manuscript; S.M.Z., P.N., A.G., S.K., I.S., R.E.H., M.A., V.S., J.G.W., S.R., and M.J.D. revised the manuscript; J.B., T.P., and C.S. provided Hail software support; J.Ernst, J.Engreitz, and M.C. provided other technical support; and A.C., A.M., V.S., M.K., M.J.D., J.G.W., B.M.N., S.M., I.S., T.E., S.R., S.K., M.T., M.B., V.S.R., L.A.C., W.C.J., J.I.R., S.S.R., W.P., B.D.M., M.F., and P.N. provided administrative or material support.

Corresponding authors

Correspondence to Sekar Kathiresan or Pradeep Natarajan.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Supplementary Information

Peer Review File

Description of Additional Supplementary Files

Supplementary Data 1

Supplementary Data 2

Supplementary Data 3

Supplementary Data 4

Supplementary Data 5

Supplementary Data 6

Supplementary Data 7

Supplementary Data 8

Supplementary Data 9

Supplementary Data 10

Supplementary Data 11

Supplementary Data 12

Supplementary Data 13

Supplementary Data 14

Supplementary Data 15

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Zekavat, S.M., Ruotsalainen, S., Handsaker, R.E. et al. Deep coverage whole genome sequences and plasma lipoprotein(a) in individuals of European and African ancestries. Nat Commun 9, 2606 (2018). https://doi.org/10.1038/s41467-018-04668-w

Received: 08 December 2017
Accepted: 15 May 2018
Published: 04 July 2018
DOI: https://doi.org/10.1038/s41467-018-04668-w
