Introduction

Lp(a) is a lipoprotein formed by the assembly of LDL particles and a carbohydrate-rich protein, apolipoprotein(a) (apo(a)), which has a high degree of structural homology with plasminogen.

The existence of a moderately strong association of Lp(a) levels with atherosclerosis and coronary heart disease (CHD), independent of the standard vascular risk factors, is now clearly established.1, 2, 3 A meta-analysis on 5436 CHD cases observed over 10 years found that people with Lp(a) levels in the top third of baseline measurement are at ≈70% increased risk of CHD compared with those in the bottom third.4

Several hypotheses exist about the underlying mechanisms by which Lp(a) contributes to the pathogenesis of atherosclerosis. Recently, Tsimikas et al5 advanced the hypothesis that, in the setting of enhanced oxidative stress, the binding of proinflammatory oxidised phospholipids by Lp(a) may, in part, mediate the atherogenicity of Lp(a).

The distribution of Lp(a) levels varies greatly among populations. In African Americans, levels of Lp(a) are significantly higher than in individuals of European ancestry.6 Levels of Lp(a) are highly heritable and almost entirely genetically controlled; several studies estimated the heritability of this trait to be >0.75,7, 8, 9, 10 and one study >0.9.8

The apolipoprotein(a) (LPA) locus on chromosome 6q26–27 seems to explain most of the trait's variance and almost all the genome-wide scans investigating the Lp(a) trait obtained a very significant result at this locus.6, 8, 10, 11 According to Broeckel,8 in a genome-wide scan on 513 western European families aimed at identifying chromosomal regions linked to myocardial infarction (MI) and related risk factors, 73% of the overall Lp(a) heritability is explained by the LPA locus. In that study, the region including the LPA locus obtained a LOD score of 26.99.8 Similarly, in a genome-wide scan of three different ethnic populations (non-Hispanic whites, African Americans and Hispanics), the LPA locus showed the strongest evidence for linkage with LOD scores of 18.62, 14.27 and 12.97 respectively.6 Thirty-four ‘size polymorphism’ alleles have been resolved at this locus and the length of the apo(a) isoform is inversely correlated with Lp(a) plasma concentrations and the risk of CHD.

Nevertheless, other (unlinked) genes are highly likely to contribute to the variance of Lp(a) levels. Broeckel8 found a novel linkage on chromosome 1q23, which seemed to explain 16% of the variance in Lp(a), but at least two subsequent studies have failed to replicate that finding.6, 11

Our aim was to perform, on a large cohort of European ancestry collected through the PROCARDIS coronary heart disease study, a genome-wide variance component linkage analysis to localise genomic regions influencing Lp(a) levels, taking into account the effect of the LPA locus on chromosome 6.

Methods

Research design

The PROCARDIS study is a European collaborative project to map susceptibility genes for coronary artery disease (CAD). Ascertainment criteria for PROCARDIS probands were MI or symptomatic acute coronary syndrome (SACS; on the assumption that the latter represents a similar pathological process), according to modified WHO diagnostic criteria,12, 13 occurring before the age of 65 years. The study recruited 2036 families with multiple CAD siblings. A detailed description of inclusion criteria as well as of the main findings of the PROCARDIS study has been reported elsewhere.14 DNA was extracted from a 9 ml EDTA-anticoagulated sample of frozen whole blood using a standard technique. Genotyping was performed using the ABI MD10 (Applied Biosystems, Warrington, UK) set of microsatellite markers for genome-wide screening.15 Ethics committee approval was obtained and all participants gave written informed consent. Families were recruited in four European countries, namely Germany, Italy, Sweden and the UK.

Phenotyping

All participants filled out a standardized questionnaire that focused on the patient's baseline characteristics, cardiovascular risk factors, lifestyle and a detailed list of medications. Several biochemical measurements, such as lipids, lipoprotein subclasses, fibrinogen, insulin, pro-insulin and HbA1c have been collected during the study. Lipids and lipoprotein subclasses were determined by an NMR technique (LipoScience, Raleigh, USA).16

Although the main purpose of the PROCARDIS project is the identification of new susceptibility genes for CAD, phenotypes relevant to cardiovascular disease have been examined in our study with the aim of detecting associated QTLs. Serum Lp(a) concentration was analysed as a continuous quantitative trait in a QTL genome-wide linkage analysis. Lp(a) was measured by an immunoturbidimetric assay on a Hitachi 917 Roche automated clinical chemistry analyzer.17 From the PROCARDIS database, we selected all informative sibships in which Lp(a) levels had been measured in multiple siblings. The sibships include individuals with and without a personal history of CAD, as the purpose of this analysis is the identification of loci associated with a quantitative trait. Use of lipid-lowering treatment (statins, fibrates or omega 3 preparations) was determined by participants' responses to questions about regular medication.

Genotyping

For the QTL analysis of Lp(a) levels, 493 markers at a mean spacing of 7 cM for each of the 22 autosomes have been analysed (MD10 set). Marker allele frequencies were computed from the original data and inter-marker distances were based on the Rutgers sex-averaged map.18 Following this initial genome screen, 40 additional markers on chromosomes 3, 11 and 17 were genotyped to supplement the recovery of identity-by-descent (IBD) information over that extracted by the MD10 marker set (as reported in a supplementary table). Genotypic data were checked before statistical analysis using RELPAIR19 to confirm relationships and by PEDCHECK20 for inheritance inconsistencies.

Statistical methods

Descriptive analysis

Summary statistics of the sample under investigation were calculated. Univariate and multivariate linear regression analyses were performed to test for factors associated with levels of Lp(a), using SAS 8.02. Covariates (age, sex, body mass index, systolic and diastolic blood pressure, triglycerides, total cholesterol, diabetes, lipid-lowering treatment) were originally chosen according to their biological influence on the trait and to their significant associations in a preliminary univariate regression analysis (P-value <0.05). Sex, age and lipid-lowering treatment, statistically significant in the multivariate model implemented in SOLAR, were included in the quantitative analysis and retained in the final linkage analysis. In a quantitative analysis of Lp(a), the proportion of the total variance attributable to covariates and the covariate-adjusted heritability were estimated using a procedure implemented by SOLAR.21

Variance component linkage analysis

A variance component linkage analysis was performed as implemented by the SOLAR computer program.21 Briefly, the variance component approach consists of fitting a linear mixed model, which involves estimating the trait mean and three components of variance: (a) an additive monogenic component linked to the region of interest or locus-specific effect; (b) a ‘polygenic’ component that incorporates overall familial effects or residual additive genetic effects; and (c) an ‘environmental’ component that incorporates effects unique to the individual, that is, covariate effects and individual-specific random environmental factors.22 To test the null hypothesis of no linkage to a QTL, the likelihood of a model with the additive genetic variance due to the QTL fixed at zero was compared with that of a model in which the additive variance due to the QTL was estimated by SOLAR. Lp(a) was studied as a continuous variable and was log-transformed (natural logarithm) to adjust for non-normality, because the variance component approach is sensitive to certain distributional assumptions.23 The method requires an estimate of the expected genetic covariances between relatives as a function of the IBD sharing at a given chromosomal location. Although SOLAR can estimate multipoint IBD sharing probabilities, exact IBD probabilities were calculated using MERLIN24 and then imported into SOLAR. Genetic locations for the marker loci are scaled in Haldane centimorgans (cM). Multipoint variance component linkage analysis was carried out at constant increments of 1 cM along all the autosomes.
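The model described above can be illustrated with a minimal sketch. This is not the SOLAR implementation: the function names and argument names are assumptions chosen for illustration. The first function gives the expected trait covariance between two distinct relatives as the sum of a locus-specific term (weighted by the estimated IBD sharing) and a polygenic term (weighted by twice the kinship coefficient); the environmental component contributes only to an individual's own variance. The second function converts the natural-log likelihoods of the linkage and no-linkage models into a LOD score.

```python
import math


def expected_covariance(pi_hat, kinship, var_qtl, var_poly):
    """Expected trait covariance between two distinct relatives.

    pi_hat   : estimated proportion of alleles shared IBD at the tested location
    kinship  : kinship coefficient of the pair (0.25 for full sibs)
    var_qtl  : additive variance attributed to the linked QTL
    var_poly : residual additive (polygenic) variance
    """
    return pi_hat * var_qtl + 2.0 * kinship * var_poly


def total_variance(var_qtl, var_poly, var_env):
    """Trait variance of one individual: all three components add up."""
    return var_qtl + var_poly + var_env


def lod_score(loglik_linkage, loglik_null):
    """LOD score: log10 likelihood ratio computed from natural-log likelihoods."""
    return (loglik_linkage - loglik_null) / math.log(10.0)
```

For a full-sib pair (kinship 0.25) with illustrative variances of 0.7 (QTL) and 0.2 (polygenic), the expected covariance is 0.8 when both alleles are shared IBD at the locus and 0.1 when none are; it is this dependence on IBD sharing that the likelihood comparison exploits.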

Oligogenic linkage analysis

A multipoint oligogenic linkage analysis to identify multiple loci affecting the variation in Lp(a) was carried out.25 In this sequential strategy, the genome was repeatedly scanned for linkage and the chromosomal location that yielded the largest marginal LOD score was retained for further conditional analyses. The conditional modelling approach is analogous to a stepwise forward-selection procedure used in linear regression and other statistical analyses. Initially, a conventional single-locus variance-components analysis is used to identify the strongest linkage signal. At each subsequent step, QTLs identified in the previous steps are fixed in location and the conditional oligogenic QTL method scans for further QTLs. The QTL-specific effects at all loci included in the model are (re)estimated at each step. The stepwise procedure was terminated when no further location reached a LOD score of 1.9.26 Oligogenic analyses are expected to yield more accurate estimates of the relative effect of each locus and to increase the power to detect linkages. This approach allowed other promising regions of linkage to be highlighted after adjustment for the variance attributed to the LPA locus as well as to other linkage peaks.
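The sequential strategy can be summarised in a short sketch of the forward-selection loop. It is a schematic under stated assumptions, not the procedure as coded in SOLAR: `scan_genome` is a hypothetical callback standing in for a full conditional genome scan, and the threshold of 1.9 is read here as the minimum LOD score a location must reach to be added to the model.

```python
def oligogenic_scan(scan_genome, lod_threshold=1.9, max_loci=10):
    """Forward selection of QTL locations.

    scan_genome(fixed_loci) is a hypothetical callback: it rescans the genome
    conditional on the locations already in the model and returns a dict that
    maps candidate locations to their conditional LOD scores.
    Returns the retained (location, LOD) pairs in order of selection.
    """
    retained = []
    while len(retained) < max_loci:
        fixed = tuple(loc for loc, _ in retained)
        lods = scan_genome(fixed)
        # Locations already fixed in the model are not candidates again.
        candidates = {loc: lod for loc, lod in lods.items() if loc not in fixed}
        if not candidates:
            break
        best = max(candidates, key=candidates.get)
        if candidates[best] < lod_threshold:
            break  # stopping criterion: no remaining location reaches the threshold
        retained.append((best, candidates[best]))
    return retained


# Toy callback with made-up LOD scores; a real scan would re-estimate all
# QTL-specific effects at every step, so conditional LODs would change.
def toy_scan(fixed_loci):
    toy_lods = {"A": 12.0, "B": 4.0, "C": 1.2}
    return {loc: lod for loc, lod in toy_lods.items() if loc not in fixed_loci}


selected = oligogenic_scan(toy_scan)
```

In the toy run, locations A and B are retained in that order and the loop stops at C, whose LOD score falls below the threshold.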

Simulation analysis

The pointwise (nominal) significance of the LOD scores was assessed empirically by simulating the conditional linkage analysis with a procedure implemented in SOLAR.21 A total of 10 000 replicates of a fully informative marker, completely unlinked to the QTL influencing the trait, were simulated to obtain the empirical distribution of the LOD scores and to determine the pointwise P-value associated with the original LOD score. To estimate the significance level of the resulting LOD scores, taking into account the number of chromosomes scanned and the density of the markers, we calculated the genome-wide P-values using the Gaussian approximation described by Feingold.27
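As a hedged illustration of how a pointwise P-value follows from such replicates, the sketch below takes the fraction of null LOD scores that equal or exceed the observed one. The function names are assumptions, and the plain fraction is only one common convention (some analyses add one to numerator and denominator). For comparison, a second function gives the usual asymptotic approximation for a single variance component tested on its boundary, in which 2·ln(10)·LOD is treated as a 50:50 mixture of a point mass at zero and a chi-square with one degree of freedom; this approximation is not taken from the analysis described here.

```python
import math


def empirical_pointwise_p(observed_lod, simulated_lods):
    """Fraction of null replicates whose LOD is at least the observed LOD."""
    exceed = sum(1 for lod in simulated_lods if lod >= observed_lod)
    return exceed / len(simulated_lods)


def asymptotic_pointwise_p(lod):
    """Asymptotic P-value under a 50:50 mixture of 0 and chi-square(1).

    For a chi-square(1) variable X, P(X > x) = erfc(sqrt(x / 2)); with
    x = 2 * ln(10) * LOD this becomes erfc(sqrt(ln(10) * LOD)).
    """
    if lod <= 0.0:
        return 1.0
    return 0.5 * math.erfc(math.sqrt(lod * math.log(10.0)))
```

With 10 000 replicates the smallest non-zero empirical P-value that can be resolved is 0.0001, so larger LOD scores can only be reported as falling below that bound.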

Regression-based linkage analysis

To complement the variance components linkage analysis, a model-free regression method was applied, which is generally robust to selection and distributional issues.28 This method regresses the estimated IBD sharing between non-inbred relative pairs on the squared sums and squared differences of their trait values. The Lp(a) trait was log-transformed and then adjusted for sex, age and lipid-lowering treatment by reference to an external epidemiological sample (PROCAM project http://www.chd-taskforce.de/slidekits.htm). An ascertainment correction was implemented by specifying the trait mean and variance in the unselected PROCAM cohort (mean=13.7 mg/dl, s.d.=18.8 mg/dl) in the MERLIN-REGRESS analysis.
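The two predictors used by this kind of regression can be sketched as follows. This is an illustration of the general idea rather than the MERLIN-REGRESS code: the function names are assumptions, and the population mean and standard deviation are taken as arguments that must be on the same scale as the trait values passed in. Centring on the population values, rather than on the mean of the selected sample, is what carries the ascertainment correction.

```python
def standardise(trait_value, pop_mean, pop_sd):
    """Centre and scale a trait value using population (not sample) parameters."""
    return (trait_value - pop_mean) / pop_sd


def pair_predictors(x1, x2):
    """Squared sum and squared difference of two standardised trait values."""
    return (x1 + x2) ** 2, (x1 - x2) ** 2


# Illustrative pair: one relative one unit above the population mean,
# the other one unit below it.
squared_sum, squared_diff = pair_predictors(1.0, -1.0)
```

A discordant pair such as the one above has a squared sum of zero and a large squared difference, whereas a pair that is concordant and far from the population mean shows the opposite pattern.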

Results

Pedigree data extracted from the PROCARDIS database consisted of 1812 families with 4012 individuals genotyped and checked for inheritance inconsistencies, having data available for the serum Lp(a) concentration and for the covariates retained in the analysis. Table 1 summarizes the phenotypic characteristics of the individuals included in the study.

Table 1 Phenotypic characteristics of the 4012 individuals

Lp(a) levels ranged from 1 to 361 mg/dl with a mean value of 52.7 mg/dl and a standard deviation of 50.1 mg/dl, whereas the median value was 31.0 (IQR=14.5–82.0) mg/dl. Values of skewness and kurtosis for Lp(a) were 1.4 and 2.0 and decreased respectively to −0.4 and −0.1 for the log-transformed Lp(a) that was used for the linkage scan. The following statistically significant covariates have been used for adjustment of the Lp(a) trait: sex, age and lipid-lowering treatment. In order to show how Lp(a) levels vary with sex, age and treatment, parameter estimates and standard errors obtained from the multivariate quantitative regression analysis are reported in Table 2. The polygenic heritability for Lp(a) was estimated by SOLAR to be 0.91±0.03, whereas the proportion of variance owing to covariates was equal to 0.01.
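The effect of the log transformation on distributional shape can be reproduced with a small sketch. The values below are illustrative only and are not study data; the functions compute moment-based skewness and excess kurtosis, which is an assumption about the definitions behind the figures quoted above.

```python
import math


def central_moment(values, order):
    """Population central moment of the given order."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** order for v in values) / len(values)


def skewness(values):
    """Moment-based skewness: third central moment over variance^1.5."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Fourth central moment over squared variance, minus 3 (normal = 0)."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0


# Illustrative right-skewed values (each double the previous one).
raw = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0]
logged = [math.log(v) for v in raw]
```

Because the illustrative values are evenly spaced on the log scale, `skewness(raw)` is clearly positive while `skewness(logged)` is zero up to rounding error, mirroring the direction of the change reported for Lp(a).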

Table 2 Parameter estimates and Standard Errors (SE) from the multivariate quantitative regression analysis of logarithm of Lp(a) implemented by SOLAR

Multipoint oligogenic linkage analysis to identify multiple loci affecting the variation in Lp(a) was carried out. As expected, the highest LOD score in the initial genome scan was found on chromosome 6 at the same location as the LPA locus, which is known to be positioned in the region 6q26–q27 at a physical distance of 161 Mb (Figure 1). The maximum LOD score observed in our analysis was 108.3 at 189 cM (location expressed using Haldane cM), flanked by markers D6S1581 and D6S1599 at a physical distance of 160.2–162.8 Mb. The total additive genetic variance explained by this locus was 0.74±0.03. After adjusting for the effects of the locus detected on chromosome 6, a high LOD score was detected on chromosome 13 between D13S156 and D13S265 (LOD 7.0, maximum at D13S170). In the third step of the conditional linkage procedure, a significant linkage peak was found on chromosome 11 (LOD 3.5) between markers D11S902 and D11S904. Further regions of tentative linkage were observed on chromosome 15 (LOD 2.9) between D15S131 and D15S205 and on chromosome 19 (LOD 2.7) between D19S571 and D19S418. We also observed a LOD score of 1.5 at 170 cM, flanked by markers D1S498 and D1S484 (148.1–157.6 Mb), close to a new locus influencing Lp(a) levels identified at marker D1S1679 (159.1 Mb) by Broeckel.8 LOD plots of the main findings are reported in Figure 2 with maximum LOD score values highlighted.

Figure 1

Oligogenic linkage analysis: multipoint LOD plot for logarithm of Lp(a) on chromosome 6 (location expressed using Haldane cM).

Figure 2

Oligogenic linkage analysis: multipoint LOD plots for logarithm of Lp(a) on chromosomes 11, 13, 15 and 19. Maximum LOD scores (≥1.9) are reported (location expressed using Haldane cM).

To assess the significance level of our linkage results, we estimated the empirical P-values by using a simulation of 10 000 replicates. Pointwise P-values as well as locus-specific heritabilities are reported in Table 3. Furthermore, the Gaussian approximation proposed by Feingold27 suggests these linkages attain genome-wide significance on chromosome 13 (P<0.0001) and on chromosome 11 (P=0.0139). The regression-based approach implemented in MERLIN-REGRESS detected QTLs on chromosome 6 (LOD=43.4) and chromosome 13 (LOD=1.9).

Table 3 Results of the oligogenic analysis of variation in Lp(a) levels. Physical location, LOD score with associated empirical pointwise P-value and locus specific heritability (H2)

Discussion

Complex traits such as serum Lp(a) are known to be influenced by environmental and genetic factors. We have performed a genome-wide screen on a large cohort of informative families to search for genetic components influencing Lp(a) levels. As expected, we found a high heritability of the trait, about 91%, which supports the existence of a strong genetic component even after environmental covariates are taken into account.

A highly significant linkage was detected on chromosome 6 at 189 cM (LOD 108.3) flanked by markers D6S1581 and D6S1599 (160.2–162.8 Mb) at the same location as the apolipoprotein(a) gene, which is on chromosome 6q26–27 at a physical location of 161 Mb (map positions according to http://genome.ucsc.edu/, May 2004).

Then, we extended the analysis to an oligogenic model taking into account, in the sequential scanning of the genome, the effect of the locus detected on chromosome 6 and of other major loci detected in our analysis. We have estimated genome-wide significance thresholds that are appropriate for this linkage experiment with a Gaussian approximation. Significant conditional linkage was found on chromosome 13q22–31 as well as on chromosome 11p15. Two further tentative linkages were identified on chromosomes 15q23–25 and 19q13.4.

Interestingly, the 13q22.1–13q31.1 locus between D13S156 and D13S265 (73.5–89.2 Mb) found in our genome-wide linkage analysis for Lp(a) overlaps a region that modulates LDL cholesterol in patients with familial hypercholesterolemia mapped to D13S156–D13S158 (73.5–102.8 Mb), as reported by Knoblauch29 and further supported by analysis of healthy twins that produced a maximum LOD score at marker D13S1241 (96.3 Mb) for a QTL influencing LDL-cholesterol.29 In PROCARDIS, pretreatment LDL cholesterol values were not available and the mean cholesterol was lower than in the general population rendering any modulating genetic effects inaccessible (Table 1). In contrast, mean Lp(a) is higher in PROCARDIS and thus the effect of genetic modulation remains visible. It may turn out that the linkage results at this locus are due to the same underlying gene, which could be a candidate target for manipulating LDL-cholesterol and Lp(a).

A study by Barkley6 also reported strong signals on chromosome 19 at 30 and 47 cM (Kosambi), in a clearly different position from that obtained in our study at 97 cM (Haldane). On the other hand, we found a LOD score of 1.5 between markers D1S498 and D1S484 (148.1–157.6 Mb), at approximately the same location as the new locus identified by Broeckel8 and colleagues on chromosome 1q23 at marker D1S1679 (159.1 Mb) in a sample of 513 families comprising 1406 individuals (LOD=3.81). It is reasonable to consider our result as evidence for replication of this finding. Interestingly, the population in Broeckel's study was also ascertained on the basis of premature coronary heart disease.

The population investigated here was previously selected on the basis of a CAD diagnosis, as the primary aim of the PROCARDIS study was the investigation of susceptibility genes for CAD. Eighty percent of the individuals analysed were affected by CAD and Lp(a) levels in our sample are quite high when compared to those of the general population.30 Although ascertainment may not bias the ‘evidence for linkage’, it might decrease the power to detect genetic effects, because it tends to decrease the correlation among relatives and increases the rate of Type II error.31 The ascertainment correction implemented in SOLAR is designed for extended pedigrees recruited through a single proband with an extreme quantitative phenotype. Ascertainment in our affected sib pairs depends on clinical phenotypes in at least two sibs and only indirectly on quantitative intermediate traits such as Lp(a). Consequently, it was not possible to make an appropriate adjustment for ascertainment in the SOLAR analysis. However, an alternative regression-based linkage approach (MERLIN-REGRESS) has been devised to detect linkage robustly in selected samples. By applying this method, epidemiological information regarding the distribution of Lp(a) levels in an unselected population (PROCAM) was incorporated into the linkage analysis. The MERLIN-REGRESS analysis confirmed the same locations on chromosomes 6 and 13, although the LOD scores were considerably smaller. The regression-based linkage analysis takes into account the ascertainment issue, but it does not implement the oligogenic (conditional) analysis model (which can be fitted in SOLAR). Our results suggest that important loci can emerge in an oligogenic analysis that are not apparent in single-locus analyses, a finding that is consistent with experience of applying multivariate regression models in general statistical analysis. We conclude that each analysis can provide complementary insights into QTL architecture and that further methodological developments are needed to efficiently analyse human linkage data with complex ascertainment.

Our findings provide new and important information about genomic regions involved in the quantitative variation of Lp(a). The lack of consistency with loci linked in some previous studies might suggest that the genetic components of the Lp(a) trait are complex to detect, involving many genes with different effects. Moreover, our study population, involving a total of more than 4000 individuals, the majority affected by CAD and having different environmental backgrounds, may differ from those of previous similar genome scans.

Finally, these results provide further insights into the oligogenic control of circulating Lp(a) levels that will encourage candidate gene studies that aim to determine the molecular basis of the underlying QTL.