Genome of the Netherlands population-specific imputations identify an ABCA6 variant associated with cholesterol levels

Variants associated with blood lipid levels may be population-specific. To identify low-frequency variants associated with this phenotype, population-specific reference panels may be used. Here we impute nine large Dutch biobanks (~35,000 samples) with the population-specific reference panel created by the Genome of the Netherlands Project and perform association testing with blood lipid levels. We report the discovery of five novel associations at four loci (P value < 6.61 × 10^-4), including a rare missense variant in ABCA6 (rs77542162, p.Cys1359Arg, frequency 0.034), which is predicted to be deleterious. The frequency of this ABCA6 variant is 3.65-fold higher in the Dutch, and its effect (β_LDL-C = 0.135, β_TC = 0.140) is estimated to be very similar to that observed for single variants in well-known lipid genes, such as LDLR.


Supplementary tables
Supplementary Table 1. Baseline characteristics for the discovery cohorts.
[Table row fragment, column headers not recovered: 19; 11,198,502; C; T; 0.134; -0.198]
Supplementary Table 6. Replication of the loci identified by Teslovich et al. [1] in the meta-analysis of the discovery cohorts. β is the effect size of the effect allele in mmol/L.
[...] [2] in the meta-analysis of the discovery cohorts. β is the effect size of the effect allele in mmol/L.
Erasmus Rucphen Family (ERF) Study. The ERF study has been described in detail previously [3]. A total of approximately 3,000 participants descend from 22 couples who lived in the Rucphen region in The Netherlands in the 19th century. The 2,755 individuals with genotype data and lipid measurements were included in the current analysis.
LifeLines. LifeLines [4] is a multidisciplinary, prospective, population-based cohort study that uses a unique three-generation design to examine the health and health-related behaviours of 165,000 persons living in the north-east of The Netherlands. It employs a broad range of investigative procedures to assess the biomedical, socio-demographic, behavioural, physical and psychological factors that contribute to health and disease in the general population, with a special focus on multimorbidity and complex genetics. This study includes only the individuals for whom both genotype data and lipid measurements were available.

Leiden Longevity Study (LLS).
The LLS was designed to investigate biomarkers of healthy ageing and longevity [5] and has been described in detail previously [6]. It is a family-based study consisting of 1,671 offspring of 421 nonagenarian sibling pairs of Dutch descent, and their 744 partners.

Netherlands Twin Register and Netherlands Study of Depression and Anxiety (NTR-NESDA). The sample used in the analyses in this study consisted of 5,764 participants of the Netherlands Twin Register (NTR). NTR participants are ascertained because of the presence of twins or triplets in the family and consist of multiples, their parents, siblings and spouses. Twins are born in all strata of society, and NTR represents a general sample from the Dutch population. Age ranged from 12 to 89 years (median 39), and 62.4% were female [7,8]. The other 1,816 samples originated from the NESDA cohort with available phenotype data. NESDA is a longitudinal study focusing on the course and consequences of depression and anxiety disorders. Subjects for NESDA were recruited from three sources, namely the general population, mental health organizations and general practices. The vast majority of NESDA subjects were selected for depression and anxiety, but the sample also includes healthy controls without lifetime psychiatric disorders. Age ranged from 18 to 65 years in NESDA (median 43), and the proportion of females was 66.1% [9]. For all analyses, we excluded one monozygotic twin per pair. Additional corrections for family resemblance are analysis-specific and are described where appropriate. Lipids were measured from fasting blood samples following standard protocols as described in Willemsen et al. [8,10].
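The exclusion of one monozygotic twin per pair described above can be illustrated with a minimal sketch. The record layout, the field names (`participant_id`, `mz_pair_id`) and the rule of keeping the first-listed twin are assumptions made for illustration only; the source does not state how the retained twin was chosen or how the data were stored.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Participant:
    # Hypothetical fields; the source does not describe its data layout.
    participant_id: str
    mz_pair_id: Optional[str]  # shared by both twins of a monozygotic pair, else None


def drop_one_mz_twin(participants):
    """Keep the first-listed member of each monozygotic pair and everyone else."""
    seen_pairs = set()
    kept = []
    for person in participants:
        if person.mz_pair_id is None:
            kept.append(person)  # not part of an MZ pair: always retained
        elif person.mz_pair_id not in seen_pairs:
            seen_pairs.add(person.mz_pair_id)
            kept.append(person)  # first twin of this pair encountered
        # the second twin of an already-seen pair is skipped
    return kept


# Toy data: one MZ pair (A1, A2), one non-twin (B1), one twin whose co-twin is absent (C1).
sample = [
    Participant("A1", "pair1"),
    Participant("A2", "pair1"),
    Participant("B1", None),
    Participant("C1", "pair2"),
]
kept_ids = [p.participant_id for p in drop_one_mz_twin(sample)]
print(kept_ids)
```

Keeping the first-listed twin is an arbitrary choice in this sketch; an analysis could equally retain the twin with the more complete phenotype data.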

Prevention of Renal and Vascular End stage Disease study (PREVEND). This is an ongoing prospective study investigating the natural course of increased levels of urinary albumin excretion and its relation to renal and cardiovascular disease. Details of the protocol have been described elsewhere [11] (www.prevend.org). Blood samples were obtained in the morning hours. Red blood cell measurements were performed at the second visit (about 4.2 years from baseline).

Prospective Study of Pravastatin in the Elderly at Risk (PROSPER). A detailed description of the study has been published elsewhere [12-14]. PROSPER was a prospective multicenter randomized placebo-controlled trial to assess whether treatment with pravastatin diminishes the risk of major

Rotterdam Study cohort I (RS-I).
The Rotterdam Study is an ongoing prospective population-based cohort study focused on chronic disabling conditions of the elderly. The study comprises an outbred, ethnically homogeneous population of Dutch Caucasian origin. The rationale of the study has been described in detail elsewhere [15]. In summary, 7,983 men and women aged 55 years or older, living in Ommoord, a suburb of Rotterdam, the Netherlands, were invited to participate in the first phase. Fasting blood samples were taken during the participant's third visit to the research center.

Rotterdam Study cohort II (RS-II).
The Rotterdam Study cohort II is a prospective population-based cohort study comprising 3,011 residents aged 55 years and older from the same district of Rotterdam. The rationale and study design of this cohort are similar to those of RS-I [15]. The baseline measurements, including the fasting HDL measurements, took place during the first visit.

Rotterdam Study cohort III (RS-III).
The Rotterdam Study cohort III is a prospective population-based cohort study comprising 3,932 residents aged 45 years and older from the same district of Rotterdam. The rationale and study design of this cohort are similar to those of RS-I [15]. The baseline measurements, including the fasting HDL measurements, took place during the first visit.

Cardiovascular Health Study (CHS).
The CHS is a population-based cohort study of risk factors for coronary heart disease (CHD) and stroke in adults aged 65 years or older, conducted across four field centers [16].
The original predominantly Caucasian cohort of 5,201 persons was recruited in 1989-1990 from random samples of the Medicare eligibility lists; subsequently, an additional predominantly African-American cohort of 687 persons was enrolled, for a total sample of 5,888. DNA was extracted from blood samples drawn on all participants at their baseline examination in 1989-1990. In 2007, genotyping was performed at the General Clinical Research Center's Phenotyping/Genotyping Laboratory at Cedars-Sinai using the Illumina 370CNV BeadChip system on 3,980 CHS participants who were free of CVD at baseline, consented to genetic testing, and had DNA available for genotyping.
CHS was approved by institutional review committees at each site, the subjects gave informed consent, and those included in the present analysis consented to the use of their genetic information for the study of cardiovascular disease. All studies received appropriate ethical approval, and all participants gave informed consent.

CROATIA-Korcula
Family Heart Study (FamHS). The collection of phenotypes and covariates as well as clinical examination have been previously described for the Family Heart Study [17]. In brief, the FamHS began in 1992 with the ascertainment of 1,200 families, half randomly sampled and half selected because of an excess of coronary heart disease (CHD) or risk factor abnormalities as compared with age- and sex-specific population rates. [...] [18]. The Third Generation (N = 4,095) consists mostly of the children of the Offspring cohort and was enrolled from 2002 to 2005 [19]. All participants were examined every 4-8 years. DNA for surviving participants was collected in the late 1990s and early 2000s (1995-2005). Cholesterol and genetic data from 3,464 Offspring subjects and 3,569 Third Generation subjects contribute to this paper.

Generation Scotland (GS). The Generation Scotland: Scottish Family Health Study (GS:SFHS) is a collaboration between the Scottish Universities and the NHS, funded by the Chief Scientist Office of the Scottish Government. GS:SFHS is a family-based genetic epidemiology cohort with DNA, other biological samples (serum, urine and cryopreserved whole blood) and sociodemographic and clinical data from ~24,000 volunteers, aged 18-98 years, in ~7,000 family groups. Participants were recruited across Scotland, with some family members from further afield, between 2006 and 2011. Most (87%) participants were born in Scotland and 96% in the United Kingdom or Ireland. The cohort profile has been published [20]. GS:SFHS operates under appropriate ethical approvals, and all participants gave written informed consent.

Multi-Ethnic Study of Atherosclerosis (MESA Whites).
MESA is a study of the characteristics of subclinical cardiovascular disease (disease detected non-invasively before it has produced clinical signs and symptoms) and the risk factors that predict progression to clinically overt cardiovascular disease or progression of the subclinical disease. MESA researchers study a diverse, population-based sample of 6,814 asymptomatic men and women aged 45-84. Thirty-eight percent of the recruited participants are white, 28% African-American, 22% Hispanic, and 12% Asian, predominantly of Chinese descent. In addition, 2,128 individuals from 594 families were recruited through MESA Family by utilizing the existing MESA framework, yielding 3,026 sibpairs divided between African-Americans and Hispanic-Americans. Participants were recruited from six field centers across the United States: Wake Forest University, Columbia University, Johns Hopkins University, University of Minnesota, Northwestern University and University of California, Los Angeles. For the current investigation, analysis of the MESA cohort was restricted to those participants who self-identified as White.
Orkney Complex Disease Study (ORCADES). ORCADES is a family-based, cross-sectional community study of the genetics of complex traits, based in the Orkney Isles in Scotland [21].

Prospective Study of Pravastatin in the Elderly at Risk (PROSPER).
For the replication, only the Scottish (PROSPER-Scottish) and the Irish (PROSPER-Irish) samples of the PROSPER study were used. The full description of PROSPER can be found in the section on the discovery cohorts.

Genotyping and imputations.
Discovery cohorts. In