Introduction

Coronary artery disease (CAD) is the most common cause of death in Western societies, as well as in emerging market economies.1, 2 Epidemiological studies have consistently shown an association between low levels of high-density lipoprotein cholesterol (HDL-C) and the prevalence of CAD.3, 4, 5 HDL-C has a key role in reverse cholesterol transport, mobilizing cholesterol from peripheral tissues to the liver, a mechanism contributing to the cardio-protective effect of HDL-C.

Several studies have shown moderate to high estimates of heritability (0.24–0.83) for plasma HDL-C levels,6, 7 underlining the importance of genetic factors. However, until recently, only a few specific genes had been identified (see Dastani et al6 for review). Rare mutations in the ATP-binding cassette A1 (ABCA1), lecithin:cholesterol acyltransferase (LCAT) and apolipoprotein A-I (APOA1) genes, and other genes involved in HDL metabolism, have been reported in individuals with low levels of HDL-C.8, 9 However, these account for only 26% of familial forms of HDL deficiency in Quebec.10, 11 Identifying novel genes that regulate HDL-C levels may provide further insights into lipoprotein metabolism and potential pathways for pharmacological modulation of HDL-C.

In this study, we examined the genetic determinants of low HDL-C in well-characterized families and individuals of French-Canadian descent. Specifically, we performed genome-wide linkage followed by association analyses of single-nucleotide polymorphisms (SNPs) located on chromosome 16q23-24 with the HDL-C trait, and examined the expression of a number of candidate genes in this region.

Methods

Subjects

Probands for the QUE study sample, with HDL-C levels below the age- and sex-specific 5th percentile based on the Lipid Research Clinics population studies data book, as previously described,12 were selected from the Preventive Cardiology/Lipid Clinic of the McGill University Health Centre, Royal Victoria Hospital (Montreal, Quebec, Canada). The exclusion criteria for probands were the following: severe hypertriglyceridemia (>10 mmol/l), a cellular cholesterol or phospholipid efflux defect in skin fibroblasts,13 or the presence of known ABCA1 or APOA1 mutations based on exonic sequencing. A total of 11 families out of 29 recruited multigenerational French-Canadian families, each exhibiting at least three individuals with low HDL-C in consecutive generations (n=412 subjects), were selected for genome-wide linkage analysis. The size of the families varied between 4 and 109 individuals, with a median of 41. Family members were sampled after a 12-h fast and discontinuation of lipid-modifying medications for at least 2 weeks. A total of 136 cases (HDL-C <5th percentile) and 218 controls (HDL-C >25th percentile) derived from the QUE families and additional CAD patients were selected for an association study. These additional CAD patients consisted of unrelated French-Canadian patients with premature CAD.14 All subjects provided informed consent for plasma and DNA sampling, isolation and storage. The Research Ethics Board of the McGill University Health Centre approved the research protocol.

The Saguenay–Lac St-Jean (SLSJ) sample was composed of 410 phenotyped individuals from 61 nuclear families from the SLSJ region of Quebec, who were ascertained as a part of ongoing family-based studies of CAD or type II diabetes. Both studies required that at least two siblings be affected. Biochemical measurements were determined after a 3-week medication washout period for all hypocholesterolemic and antihypertensive drugs. The SLSJ case–control sample was predominantly derived from the families and consisted of 94 individuals with an HDL-C <5th percentile and 94 individuals with an HDL-C >40th percentile. Table 1 summarizes the subjects for this study.

Table 1 Subjects in this study

Biochemical measurements

High-density lipoprotein and triglycerides were measured using standardized techniques as described previously.13

Genome-scan markers and data quality control

The genotyping for QUE families was performed by DeCode Genetics (Reykjavik, Iceland) with 485 autosomal microsatellite markers from the Decode map (an average marker density of 6.8 cM) on 412 subjects from the QUE families (n=11, mean family size=41). The genome scan of 410 phenotyped individuals from 61 SLSJ families was performed as described in Rioux et al15 using either 312 or 396 microsatellite markers, with an average marker density of 11.9 and 9 cM, respectively. These markers were modified versions of the Cooperative Human Linkage Center Screening Set version 6.0 (http://gai.nci.nih.gov/cgi-bin/CHLCMapsChr?v4.recmin.v6.weberscreen.sexave), which were supplemented with additional Genethon markers to increase the information content.16 Overall, 34 markers were common to the two marker sets used; however, there were no markers in common on chromosome 16.

For both the SLSJ and QUE studies, the software GRR17 (http://www.sph.umich.edu/csg/abecasis/GRR/) was used to identify mispaternities and sample mix-ups. In total, three mispaternities and one sample mix-up were identified and appropriate adjustments were carried out before further analysis.

Sequencing and genotyping

Sequencing of all 39 known genes in the narrowed region was carried out using the Thermo-Sequenase BigDye Direct Cycle-Sequencing kit (Applied Biosystems Inc., Streetsville, ON, Canada). We also selected 200 tagging SNPs (tSNPs) that spanned the region from 55574820 to 77244563 bp (NCBI, build 36.1) corresponding to the linkage peak on chromosome 16q13-23.1. The selected SNPs were based on r2 <0.8 between tSNPs and a minor allele frequency >0.05 from the European samples of the HapMap project in genes that were differentially expressed after cholesterol or 22-hydroxycholesterol stimulation of human fibroblasts, using Affymetrix (Santa Clara, CA, USA) microarray data (Supplementary Table 1). The SNPs were genotyped in the SLSJ study sample using the SNPstream Genotyping System (Beckman Coulter, Inc., Mississauga, ON, Canada). We also genotyped 51 tSNPs in the QUE sample in a parallel study as previously described.18
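The two selection criteria stated above (pairwise r2 <0.8 between tSNPs and a minor allele frequency >0.05) can be illustrated with a simple greedy filter. This is only a sketch: the tagging software actually used is not named in this section, the SNP identifiers and haplotypes below are hypothetical, and r2 is computed here from phased 0/1 haplotypes rather than from HapMap genotype data.

```python
def r_squared(a, b):
    """Pairwise r2 between two biallelic SNPs coded 0/1 on phased haplotypes."""
    n = len(a)
    pa = sum(a) / n
    pb = sum(b) / n
    pab = sum(1 for x, y in zip(a, b) if x == 1 and y == 1) / n
    denom = pa * (1 - pa) * pb * (1 - pb)
    if denom == 0:
        return 0.0  # a monomorphic SNP carries no LD information
    d = pab - pa * pb
    return d * d / denom


def select_tag_snps(haplotypes, maf_min=0.05, r2_max=0.8):
    """Keep SNPs with MAF > maf_min whose r2 with every kept SNP is < r2_max.

    haplotypes: dict mapping a SNP id to a list of 0/1 alleles, one entry
    per phased chromosome (hypothetical input format).
    """
    tags = []
    for snp, alleles in haplotypes.items():
        freq = sum(alleles) / len(alleles)
        maf = min(freq, 1 - freq)
        if maf <= maf_min:
            continue  # too rare to be retained
        if all(r_squared(alleles, haplotypes[t]) < r2_max for t in tags):
            tags.append(snp)
    return tags


# Hypothetical haplotypes for four SNPs on eight chromosomes.
example = {
    "snpA": [0, 0, 1, 1, 0, 1, 0, 1],
    "snpB": [0, 0, 1, 1, 0, 1, 0, 1],  # in perfect LD with snpA
    "snpC": [0, 1, 0, 1, 0, 1, 0, 1],  # weakly correlated with snpA
    "snpD": [0, 0, 0, 0, 0, 0, 0, 0],  # monomorphic, fails the MAF filter
}
print(select_tag_snps(example))
```

A greedy pass like this depends on the order in which SNPs are visited; dedicated tagging programs typically use more careful selection strategies.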

Haplotype construction

Ten families with a positive LOD score (LOD >0) on chromosome 16 were chosen for haplotype analysis. Haplotypes were constructed with the program Merlin19 (http://www.sph.umich.edu/csg/abecasis/Merlin/index.html) using both microsatellite markers and SNPs based on HDL-C <5th percentile as the affected status. The haplotype analysis for SNPs in unrelated subjects from the SLSJ and the QUE study sample was performed using Haploview (version 4.1)18, 20 (http://www.broadinstitute.org/haploview/haploview).

RNA extraction and RT-PCR

Skin fibroblasts were available for three out of four probands from the QUE families with the highest linkage scores. These and three control cell lines were cultured on the basis of a standard protocol.13 Total RNA was prepared from cultured fibroblasts using the RNeasy mini RNA extraction kit (Qiagen, Mississauga, ON, Canada) according to the manufacturer's instructions. Total RNA (100 ng) in a 14-μl reaction was reverse transcribed using the cDNA Qiagen Kit according to the manufacturer's protocols. Real-time quantitative PCR assays were carried out using the Quantitect SYBR Green PCR kit (Qiagen) on an ABI PRISM 7300 Sequence Detection System (Applied Biosystems). Amplifications were carried out in a 96-well plate with 50 μl reaction volumes and 40 amplification cycles (94°C, 15 s; 55°C, 30 s; 72°C, 34 s). Experiments were carried out in triplicate and the mRNA expression was taken as the mean of three different experiments. The expression of each gene was normalized to succinate dehydrogenase expression levels. Fold changes relative to controls were determined using the ΔΔCt method. A mean fold change (2−ΔΔCt) and the SEM value were determined and significance was determined with a two-tailed t-test.
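The ΔΔCt calculation described above can be sketched as follows. The Ct values are invented for illustration, the function names are hypothetical, and the reference gene column stands in for succinate dehydrogenase; this is a minimal sketch of the standard 2−ΔΔCt arithmetic, not the authors' analysis script.

```python
from math import sqrt
from statistics import mean, stdev


def delta_ct(ct_target, ct_reference):
    """ΔCt: target gene Ct minus reference gene Ct for one sample."""
    return ct_target - ct_reference


def fold_changes(case_pairs, control_pairs):
    """Return 2^-ΔΔCt for each case, relative to the mean control ΔCt.

    Each pair is (Ct of target gene, Ct of reference gene).
    """
    control_baseline = mean(delta_ct(t, r) for t, r in control_pairs)
    return [2 ** -(delta_ct(t, r) - control_baseline) for t, r in case_pairs]


def mean_and_sem(values):
    """Mean and standard error of the mean of a list of fold changes."""
    return mean(values), stdev(values) / sqrt(len(values))


# Hypothetical Ct values: (target gene, reference gene) per cell line.
cases = [(24.0, 20.0), (23.5, 20.1), (24.2, 19.9)]
controls = [(25.0, 20.0), (25.2, 20.1), (24.9, 20.0)]

fc = fold_changes(cases, controls)
mean_fc, sem_fc = mean_and_sem(fc)
print(f"mean fold change = {mean_fc:.2f}, SEM = {sem_fc:.2f}")
```

A fold change above 1 indicates higher expression in cases than in controls; a lower Ct in cases (after normalization) therefore corresponds to increased expression.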

Statistical analyses

SIMWALK2 (version 2.91) (http://www.genetics.ucla.edu/software/download?package=2) was used to estimate the multipoint inheritance vectors in QUE families.21, 22 Genome scans were performed using the variance components approach as implemented in SOLAR (version 4.2.0) (http://www.sfbr.org/departments/genetics_detail.aspx?p=37). For multipoint linkage analysis, the HDL-C phenotype was adjusted for triglycerides, sex and age × sex using SOLAR. Association tests in families were conducted using Mendel (version 9.0.0)23 (http://www.genetics.ucla.edu/software/mendel) with the option for association given linkage. Case–control association analyses were carried out with Haploview. Heritability of HDL-C was analyzed with the method implemented in the SOLAR software package with the ‘ascertainment correction’ option. Although we realize that using this correction may still lead to biased results, the heritability we report for our samples is consistent with previous reports (see refs. 6, 7 and results section).
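As an illustration of a single-SNP case–control comparison, an allelic chi-square test on a 2×2 table of allele counts can be written as below. The counts are invented, the function name is hypothetical, and the sketch does not claim to reproduce Haploview's implementation; it assumes all row and column totals are non-zero.

```python
from math import erfc, sqrt


def allelic_chi_square(case_minor, case_major, ctrl_minor, ctrl_major):
    """Pearson chi-square (1 df) on a 2x2 table of allele counts.

    Returns (chi-square statistic, two-sided P value).
    """
    table = [[case_minor, case_major], [ctrl_minor, ctrl_major]]
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    total = sum(row_totals)

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical allele counts: 136 cases (272 alleles), 218 controls (436 alleles).
chi2, p = allelic_chi_square(case_minor=40, case_major=232,
                             ctrl_minor=44, ctrl_major=392)
print(f"chi2 = {chi2:.2f}, P = {p:.3f}")
```

Because each individual contributes two alleles, this test assumes Hardy–Weinberg proportions and unrelated subjects; related individuals call for family-based methods such as those described above.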

Results

In QUE families, heritability of HDL-C was estimated to be 73% (P=9.89 × 10−16) in genotyped individuals. Sex, age × sex and triglycerides were significant contributors (P<0.05) to the variance of HDL-C in this study sample. The heritability estimate for SLSJ families was 49% (P=6.55 × 10−22). Sex and triglycerides were significant covariates in this study sample. The results of the linkage analysis for HDL-C in the SLSJ and QUE studies are shown in Supplementary Figure 1.

In the genome scan for the QUE study sample, we identified four chromosomal regions that may harbor QTL for HDL-C (LOD scores >1). We observed an LOD score of 2.61 at 98 cM (based on the Decode Map24) near the marker D16S515 in chromosome 16q23-24. In the SLSJ study sample, the highest multipoint LOD score (2.96) was observed at 86 cM on chromosome 16, located 1.25 cM away from D16S2624 in chromosome 16q22-23. Chromosomal regions with a multipoint LOD score ≥1 in either of the two genome scans are summarized in Table 2. Figure 1 shows the multipoint LOD scores on chromosome 16 for the SLSJ and QUE study samples, as well as two previously reported findings in the region.
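To put LOD scores of this size in perspective, a LOD score can be converted to an approximate pointwise P value. The sketch below assumes the usual asymptotic result for a variance-components linkage test, in which 2 ln(10) × LOD follows a 50:50 mixture of a point mass at zero and a chi-square distribution with 1 degree of freedom; the function name is hypothetical and the conversion ignores genome-wide multiple testing.

```python
from math import erfc, log, sqrt


def lod_to_pointwise_p(lod):
    """Approximate one-sided pointwise P value for a LOD score.

    Assumes 2*ln(10)*LOD ~ 0.5*chi2(0 df) + 0.5*chi2(1 df) under the null.
    """
    if lod <= 0:
        return 1.0
    chi2 = 2 * log(10) * lod
    # Upper tail of chi-square with 1 df is erfc(sqrt(chi2 / 2)); halve it
    # for the mixture distribution.
    return 0.5 * erfc(sqrt(chi2 / 2))


for label, lod in [("QUE, 98 cM", 2.61), ("SLSJ, 86 cM", 2.96)]:
    print(f"{label}: LOD = {lod}, pointwise P = {lod_to_pointwise_p(lod):.1e}")
```

Under this approximation, a LOD score of about 2.2 corresponds to a pointwise P value near 7 × 10−4, the threshold commonly quoted for suggestive linkage.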

Table 2 Whole genome-scan results with LOD score >1
Figure 1

Multipoint LOD scores for HDL-C on chromosome 16 in SLSJ and QUE study samples, as well as previously reported findings in the region. The solid line corresponds to results from QUE samples near marker D16S515 at 16q23-24, and the dashed line to results from SLSJ samples near marker D16S2624 at 16q22-23. • indicates a previously reported peak in Mexican-American families near markers D16S2624-D16S518 at 16q22.3-23.1.26 ▪ indicates a previously reported peak in Finnish and Dutch families near marker D16S3096 at 16q22.1-24.3.27

Haplotype construction identified a 25.5 cM (18 Mb) region on 16q23-24 that segregated with the low HDL-C phenotype in the four QUE families with the highest HDL-C LOD scores. The segregating region was further restricted by additional microsatellite markers from 25.5 cM to 18.1 cM (7.8 Mb) in the family with the highest LOD score (Figure 2a). Using the UCSC genome browser (http://genome.ucsc.edu/), we identified 39 genes residing within this 7.8-Mb region.

Figure 2

Linked region for HDL-C on chromosome 16q21-24. (a) The positions of microsatellite markers are shown with the interval distances (cM) between them; the physical locations (Mb) are given below the line. Red markers are genome-wide scan markers and the other markers are from subsequent fine mapping. The position of the LOD score peak is shown in the box. Family identification numbers are indicated on the left side and segregating haplotypes for each family are shown. The positions of CETP, LCAT and CHST6 are indicated by arrowheads. The distance between arrows A and B corresponds to the initially identified linked region, which was narrowed down to the region between A and C. (b) Other known genes in the region of CHST6 are shown. Inside the box is the five-SNP haplotype that had an increased frequency. Families indicated in bold carry the R162G mutation in CHST6.

In the SLSJ study sample, we tested for the association between HDL-C affection and 200 tSNPs selected from genes in the region, which were differentially expressed in human fibroblasts treated with cholesterol or 22-hydroxycholesterol and thus seemed to be sterol regulated (microarray data not shown). One SNP, rs11646677, which was located within the shared haplotype from QUE families, showed evidence of association with HDL-C affection (P=0.016) (Supplementary Figure 2).

We sequenced all coding regions of the 39 genes located in this region (Figure 2a). The LCAT gene was excluded from the candidate region by a recombination event (subject no. 407, Supplementary Figure 3a) and by sequencing in all QUE probands. We identified several nonsynonymous variants within the 39 genes. These were genotyped in family members and checked for segregation. We found three nonsynonymous variants within the CHST6 gene and its homolog CHST5, located in the same region. One of the variants in the CHST6 gene, R162G, segregated with HDL-C in the four QUE families that had the highest linkage score (Supplementary Figure 3a, b, c and d). In the association analysis of these families and of additional families recruited later using Mendel with the option for association given linkage, we observed a P value of 0.015 for this variant. However, an association study in 136 cases and 218 controls did not confirm this finding (data not shown). This SNP and four other telomeric SNPs (spanning from 74070744 to 74822026 bp build 36.1) comprise a haplotype (Figure 2b) that is found in the four families but that had a haplotype frequency of only 0.05 in our unrelated samples from Quebec.

The nonsynonymous variants found in all the sequenced genes are listed in Table 3. Segregation analyses with these SNPs narrowed down the region to 6.6 Mb in the family with the highest LOD score.

Table 3 Non-synonymous variants identified by sequencing

As none of the other nonsynonymous variants identified by sequence analysis segregated in our families, we investigated the potential influence of regulatory variants on HDL-C levels using RT-PCR analyses of nine genes within the 6.6-Mb region, between the genes CHST6 and WWOX. Three of those genes (ADAMTS18, CLEC3A and CNTNAP4) were not significantly expressed in fibroblasts. The results of those genes that are expressed in fibroblasts are shown in Figure 3. The expression of CHST6 and KIAA1576 genes was found to be increased in cases when compared with controls (P<0.001). CHST5 and WWOX were not differentially expressed, whereas the NUDT7 gene had a lower expression in cases compared with controls, but this was not statistically significant.

Figure 3

mRNA expression levels of selected candidate genes found in the linked region. The mRNA expression of the genes (normalized to succinate dehydrogenase) from the skin fibroblasts of three probands is compared with the mean mRNA expression from three controls. The mRNA expression level was taken as the mean fold change (2−ΔΔCt). The SEM values of triplicate experiments are shown. * indicates a significant P value (<0.05).

Discussion

The inverse relation between HDL-C plasma levels and CAD has been well described in multiple epidemiological studies. Although environmental influences have an important role in determining serum HDL-C levels, genetics also has a major role. This has been underscored by the discovery and identification of several genes involved in HDL metabolism.6 We performed two genome-wide scans in large families with probands ascertained for type II diabetes, CAD or low HDL-C. We obtained suggestive evidence for linkage25 on chromosome 16 from both of these French-Canadian study samples. The region on chromosome 16q22-24 has been associated with HDL-C in several studies. Specifically, two previous studies, one involving Mexican Americans26 and the other using Finnish and Dutch family samples,27 provided evidence of linkage to this region on chromosome 16, and both peaks were located <12 cM from the peaks observed in our analysis of Quebec families. Indeed, at least eight previous genome-wide scan studies have identified chromosomal loci for HDL-C on chromosome 16 with LOD scores >1.28, 29, 30, 31, 32, 33, 34, 35 In addition, Mehrabian et al34 found a QTL for HDL-C on mouse chromosome 8 at genetic markers D8Mit12 (LOD=3.6) and D8Mit14 (LOD=3.5), which is syntenic to human chromosomal region 16q22.1-16q24. Moreover, in a study of mouse strains C57BL/6J and 129S1/SvImJ, Ishimori et al35 identified a suggestive QTL on chromosome 8 (44 cM, LOD 2.6), which is consistent with the locus identified by Mehrabian et al.34 Thus, these mouse QTLs for HDL-C provide additional support for the involvement of this region of human chromosome 16 in HDL-C metabolism.

The LCAT and CETP genes, two well-known genes involved in HDL metabolism, are located near chromosome 16q23-24. LCAT mediates the formation of cholesteryl esters within HDL particles, which allows the maintenance of a gradient of free cholesterol from the plasma membrane to HDL particles and ensures a net movement of cholesterol from the cell to HDL. We sequenced the LCAT gene in the probands of QUE families and no coding variants were identified. CETP promotes the transfer of cholesteryl esters from HDL particles to triglyceride-rich (VLDL and IDL) particles in exchange for triglycerides.36, 37, 38 Many association studies have revealed a significant association between CETP gene polymorphisms and levels of plasma HDL-C (reviewed in de Grooth et al39 and Thompson et al40). Fine mapping of the 16q23-24 region with 12 additional microsatellite markers excluded both the LCAT and CETP genes from the linked region.

Our previous analysis of common variants in the region showed an association between an intronic SNP in the WWOX gene (rs2548861) and HDL-C in the QUE sample (P=0.02 using a dominant model).18 However, this SNP explains only 1.5% of the variance in HDL-C concentrations at the population level18 and is not likely to be a major determinant of HDL-C in families that show a strong pattern of segregation.

Using both microsatellite markers and SNPs, we were able to identify a minimal haplotype of 6.6 Mb on chromosome 16q23-24 in one linked family. It is noteworthy that a genome-wide association study from the Diabetes Genetics Initiative (http://www.broad.mit.edu/diabetes/) also showed evidence of association (P value <0.0001) with six SNPs (rs8057477, rs4888535, rs6564359, rs7404386, rs3924497 and rs8049365) between positions 75307621–75376819, all residing within the linked region. We sequenced all coding regions and exon–intron boundaries of the 39 genes found within this region. One variant in CHST6 showed evidence of co-segregation with low HDL-C and was investigated further. CHST6 belongs to the sulfotransferase family. It catalyzes the transfer of sulfate to position 6 of nonreducing N-acetylglucosamine (GlcNAc) residues of keratan.41 It is located on the Golgi apparatus membrane and is mainly expressed in the cornea and brain but also in the spinal cord and trachea. Defects in the CHST6 gene are the cause of macular corneal dystrophy (MIM: 217800). In this study, CHST6 was considered a good candidate gene for low HDL-C, as ocular manifestations have been previously associated with other gene defects causing a low HDL-C phenotype (for example, LCAT, APOA1). However, there are currently no published data characterizing the lipid profile of patients with macular corneal dystrophy or directly relating the function of this gene to lipid metabolism.

We resequenced all other genes in the 6.6-Mb region and found no segregating variants but could not rule out variants found in the noncoding regions of these loci, which could have an effect on the transcription of these genes. Therefore, we examined the expression of genes in three probands of families linked to this region. An increase in the level of CHST6 expression was found in these probands and we hypothesize that it might contribute to HDL metabolism. However, our RT-PCR results should be interpreted with caution as fibroblasts may not be the most appropriate cell model. Furthermore, we cannot exclude the possibility that the observed linkage and association at this locus may be caused by another nearby gene.

In a proteomic analysis of HDL by Rezaee et al42, C-type lectin domain family 3 member A (CLEC3A) has been identified as an HDL-associated protein. The gene coding for the CLEC3A protein also resides in the fine-mapped region on chromosome 16q23-24. Sequencing of this gene in probands revealed one nonsynonymous (Q197L) variant in one family. However, this variant did not segregate with low HDL-C in this family and, as CLEC3A is not expressed in fibroblasts, we were unable to investigate the differential expression level of its mRNA.

In summary, we have fine-mapped a region on chromosome 16q23-24 that is likely to harbor a gene for low plasma levels of HDL-C. Recently, genome-wide association studies have been carried out for lipid traits, including HDL-C.43, 44 Although these studies have revealed polymorphisms associated with the variation in HDL-C level in a number of European populations, such variants do not account for a significant fraction of the variance in HDL-C and are not likely to be the cause of extreme levels of HDL-C that segregate in some families. More likely, multiple rare genetic variants that have a large effect lead to the extreme levels of HDL-C observed in families. These may sometimes be found in the same genes as the common variants that affect HDL-C, but perhaps not always. Therefore, the identification of the causal genes for more severe forms of low HDL-C may contribute to our understanding of the regulation of HDL-C levels and subsequently provide new therapeutic targets for CAD prevention. With recent advances in sequencing technology, it should be feasible to resequence this particular region in its entirety to identify the variants that affect HDL-C levels.

Conflict of interest

The authors declare no conflict of interest.