Genetic contribution to lipid levels in early life based on 158 loci validated in adults: the FAMILY study

The contribution of polymorphisms associated with adult lipids in early life is unknown. We studied 158 adult lipid polymorphisms in 1440 participants (544 children, 544 mothers and 352 fathers) of the Family Atherosclerosis Monitoring In early life (FAMILY) birth cohort. Total cholesterol (TC), high-density lipoprotein cholesterol (HDL-C), low-density lipoprotein cholesterol (LDL-C) and triglycerides (TG) measurements were collected at birth, 3 and 5 years of age. Polymorphisms were genotyped using the Illumina Cardio-Metabochip array. Genotype scores (GS) were calculated for TC, HDL-C, LDL-C and TG. Linear and mixed-effects regressions adjusted for sex, age and population stratification were performed. The GS was associated with LDL-C level at 3 and 5 years (β = 0.017 ± 0.003, P = 2.9 × 10−8; β = 0.020 ± 0.003, P = 5.7 × 10−9) and from birth to 5 years (β = 0.013 ± 0.003, P = 2.6 × 10−7). The GS was associated with TC level at 3 and 5 years (β = 0.009 ± 0.002, P = 9.1 × 10−7; β = 0.009 ± 0.002, P = 7.7 × 10−6). CETP rs3764261 was associated with the HDL-C level from birth to 5 years (β = 0.064 ± 0.014, P = 7.4 × 10−6). AMPD3 rs2923084 was associated with the HDL-C level at 5 years (β = 0.096 ± 0.024, P = 9.7 × 10−5). Known loci associated with blood lipids in adults are associated with TC, LDL-C and HDL-C, but not TG, in early life. Genetically predisposed children may benefit from early lipid-lowering preventive strategies.


Results
The phenotypic characteristics of the study population are summarized in Table 1. Linear regressions and a linear mixed-effect model were used to assess the effects of the child's SNPs (Supplementary Tables 3-6) on the four lipid levels. No statistically significant associations were found between SNPs/GS and TG levels after Bonferroni correction. All the statistically significant and nominal associations of the child's SNPs/GS with TC, LDL-C, HDL-C and TG are available in Tables 2 and 3.

Age-dependent genetic effect in children.
We compared the beta values across the times of measurement (i.e. birth, 3 years and 5 years of age) for the SNPs/GS that showed statistically significant associations with lipid traits (Supplementary Table 7). The beta values of the TC GS increased between birth and 3 years of age and between birth and 5 years of age (P0-3 = 5.9 × 10−4 and P0-5 = 9.6 × 10−4, respectively). The beta values of the LDL-C GS also increased between birth and 3 years of age and between birth and 5 years of age (P0-3 = 1.9 × 10−2 and P0-5 = 4.1 × 10−3, respectively). In contrast, the CETP rs3764261 SNP did not show any significant increase in its beta values between birth, 3 and 5 years of age (P0-3 = 0.31, P3-5 = 0.34 and P0-5 = 0.21, respectively). The change in the effect of the AMPD3 rs2923084 SNP between birth and 3 years of age was not statistically significant (P0-3 = 6.50 × 10−2), whereas significant differences were observed between 3 and 5 years and between birth and 5 years (P3-5 = 2.30 × 10−2 and P0-5 = 3.71 × 10−5, respectively).
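As an illustration of how two effect estimates can be compared with a Z-test (the test named in the Statistical analyses section), the following minimal sketch uses the LDL-C GS effects at 3 and 5 years reported in the abstract. It assumes that the ± values are standard errors and that the two estimates are independent, which is a simplification because the same children were measured at both ages. The function name is hypothetical and this is not the authors' code.

```python
import math


def z_test_two_betas(beta_a, se_a, beta_b, se_b):
    """Two-sided Z-test for the difference between two regression coefficients.

    Assumption: the two estimates are independent and approximately normal.
    """
    z = (beta_b - beta_a) / math.sqrt(se_a ** 2 + se_b ** 2)
    # Two-sided tail probability of the standard normal distribution
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# LDL-C GS effects at 3 and 5 years, taken from the abstract
# (the +/- values are assumed to be standard errors)
z, p = z_test_two_betas(0.017, 0.003, 0.020, 0.003)
print(f"z = {z:.3f}, p = {p:.3f}")
```

Under these assumptions the difference between the 3-year and 5-year LDL-C GS effects is far from significant, which is consistent with the two estimates being of similar size.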

Comparison of the genetic effects in child and adult populations.
We compared the beta values obtained in children using linear mixed-effect models with those obtained in adults by the Global Lipids Genetics Consortium (Table 4). Of the six SNPs nominally associated with TC in children, adult beta values were significantly higher for rs12292921 (APOA1), and child beta values were higher for rs12452315 (OSBPL7) and rs2277862 (FER1LA). Of the two SNPs associated with TG only, rs7248104 (INSR) showed a significantly larger effect in adults than in children (Table 4). For HDL-C, two of the six SNPs (rs2606736 (ATG7) and rs4148008 (ABCA8)) showed a significant difference in their beta values, the effects being smaller in children than in adults. Lastly, of the eight SNPs nominally associated with LDL-C, rs12292921 (APOA1) and rs12452315 (OSBPL7) showed a significantly smaller and higher effect, respectively, in adults from the Global Lipids Genetics Consortium than in children from FAMILY.

Comparison of the variance explained by the 158 SNPs in child and adult populations.
We computed the variance explained by the SNPs used in the GS at the three times of measurement for the four traits and compared it with the theoretical variance calculated from the summary statistics of the Global Lipids Genetics Consortium data in adults 8. Results are available in Table 5. In adult populations, the 69 SNPs associated with TC explained 6.4% of the theoretical variance, the 73 SNPs associated with HDL-C 7.1%, the 59 SNPs associated with LDL-C 7.4% and the 40 SNPs associated with TG 4.2%. At years 3 and 5, child SNPs explained almost all of the adult variance for TC and LDL-C, whereas SNPs associated with HDL-C and TG explained only a small part of the adult variance. At birth, SNPs associated with all the lipid traits explained only a negligible part of the adult variance.

Discussion
We explored the associations of 158 SNPs that reached the genome-wide level of statistical significance in adults (P < 5 × 10−8) with HDL-C, LDL-C, TC and TG levels in a population of predominantly European ancestry from the FAMILY birth cohort. This is the first report to study adult lipid-associated SNPs in a longitudinal early-childhood birth cohort. We found a statistically significant association of the GS with LDL-C levels at 3 and 5 years of age and, using a mixed-effect model, from birth to 5 years of age. We also found an association of the GS with TC levels at 3 and 5 years of age. A statistically significant association was identified between AMPD3 rs2923084 and HDL-C level at 5 years of age, as well as between CETP rs3764261 and HDL-C from birth to 5 years of age when using the mixed-effect model. The AMP deaminase encoded by AMPD3 interacts with lipids in cytosolic regulation processes 13. CETP acts in cholesterol transport as a central effector between HDL-C and apolipoprotein B 14,15. This role makes CETP a relevant molecular target for novel lipid-modifying drugs 15. Interestingly, the effect sizes of several SNPs and of the GS differ between age windows (i.e. values at birth compared to those at 5 years of age), suggesting that the effect of genetic factors on blood lipid levels varies in early life. The merely nominal evidence of association observed for the GS of HDL-C and TG can be explained by the modest power of our study and by differential heritability of lipid levels in early childhood. Our data are consistent with reports that demonstrate an increase in heritability values of lipid traits across adolescence 16. Hypothesis-generating GWAS for lipid traits in younger populations have yet to be reported. Applying these approaches in diverse ethnic populations may shed further light on the genetic architecture of lipids.

Strengths and Limitations.
Our study has several strengths. First, this is the first study to examine SNPs that affect lipid levels at birth and in early childhood. Second, we presented an exhaustive list of lipid-associated SNPs (n = 158) extracted from the Illumina Cardio-Metabochip. Lastly, the family-based design provided excellent quality control for genotyping data by allowing the assessment of Mendelian inconsistencies. The limitations of this study include the modest sample size, which may have decreased the power to detect associations with small effect sizes and/or low risk allele frequencies (Supplementary Figures 1-4). The power of our study was computed to be 80% using QUANTO software, assessing the full range of allele frequencies (5% to 95%) and four different beta effects. Our study has the power to detect beta effects greater than 0.08 mg/dl for allele frequencies from 10% to 90%. Another limitation of this study is the lack of replication, but there are only a few studies of children in this age range with both lipid level measurements and genetic data. The follow-up duration in this report (birth to 5 years of age) prevents the assessment of the genetic effects of the lipid-associated SNPs later in childhood and adolescence. Because FAMILY is an ongoing longitudinal cohort, the data can be reanalyzed at a later stage of life. The availability of atherosclerosis measurements at 10 years of age through intima media thickness makes this perspective even more relevant. Lastly, we used cord blood lipid measurements at birth since fasting samples from newborns are not available. Use of cord blood may have biased our interpretation of the data since the phenotype is not defined in the same way at all time points.

Conclusion
In conclusion, we demonstrate that 158 loci known to be associated with blood lipids in adults are also associated with TC, LDL-C and HDL-C among children from birth to 5 years of age. Our results suggest that genes predisposing to abnormal lipid levels in adults already have an impact during the first years of life. The discovery that genetic variants modulate LDL-C early in life is striking, as high LDL-C levels have been shown to be causally associated with future coronary artery disease events 17. This is important for public health, especially in the context of earlier CVD prevention strategies that aim to lower blood lipid levels in genetically at-risk subgroups. Those with a genetic risk for CVD may be identified through a family history of dyslipidemia or through individualized genetic medicine approaches, once an exhaustive list of lipid-associated genetic variants is completed. The practice of restricting the administration of lipid-lowering drugs to adults may be challenged, as our findings suggest that childhood preventive strategies would help to prevent CVD more efficiently.

Methods
Study population. The details of the FAMILY study have been described in a previous publication 18.
FAMILY is an ongoing birth cohort study that includes mothers, fathers and children with a planned follow-up of 10 years. Over the last 7 years, 859 families (901 babies, 259 siblings, 857 mothers and 530 fathers) have been enrolled into the FAMILY study and followed longitudinally. In this study, we excluded offspring from multiple births (as twin status has a strong impact on birth weight and postnatal catch-up) and siblings of "index" children because of familial relatedness and phenotypic issues (i.e. absence of phenotypic data at birth). After applying these exclusion criteria, 544 mothers, 352 fathers and 544 children had DNA extracted and were successfully genotyped. Phenotypic characteristics of these individuals are displayed in Table 1.
Genotyping. Genomic DNA was extracted from buffy coats for all the participants. Genotyping was performed using the Illumina Cardio-Metabochip (San Diego, CA, USA). This array was designed by seven consortia studying cardiac, metabolic and anthropometric traits and contains a selection of 196,725 SNPs for 23 different traits. The design and SNP selection of the array are detailed elsewhere 22. We selected SNPs that were available in the Cardio-Metabochip array and that reached the genome-wide level of statistical significance (P < 5 × 10−8) for TC, LDL-C, HDL-C or TG in at least one population of European ancestry. Independent searches by S.C. and S.R.d.P. identified the lipid-associated SNPs in two databases (HuGE Navigator 23 and NHGRI GWAS Catalog 24) or through a manual literature search in the PubMed database using the following keywords: "Genome wide association study", "lipoprotein cholesterol", "high-density lipoprotein cholesterol", "total cholesterol" and "triglycerides".
For SNPs that were not available in the Cardio-Metabochip, we identified proxy SNPs using the Broad Institute website tool SNAP (SNP Annotation and Proxy Search) as well as the Cardio-Metabochip file provided by Illumina, using the chromosome and chromosomal positions of SNPs from the NCBI Human Genome Browser 25,26. We used the following criteria to select proxy SNPs: 1) SNP included in the Cardio-Metabochip; 2) r² > 0.95 in European population data from the 1000 Genomes Project 27; 3) selection of a coding non-synonymous SNP if available in the proxy list, otherwise selection of the SNP located closest to the GWAS lead SNP. Linkage disequilibrium (LD) between the selected SNPs was evaluated using SNAP 25 in European population data of the 1000 Genomes Project 27. We discarded 96 SNPs that displayed r² > 0.2 with another SNP in the list. In total, we selected 158 lipid-associated polymorphisms (Supplementary Table 1 and Supplementary Figure 5). Standard quality control was used to assess the quality of the genotyping. Twenty-six individuals who displayed SNP missing rates >10% for the Metabochip were discarded. All 158 SNPs displayed call rates >97% and were in Hardy-Weinberg equilibrium (P ≥ 1 × 10−6, Supplementary Table 2). As an additional quality control procedure, we analyzed the Mendelian transmission patterns of the 158 SNPs. We found recurrent Mendelian inconsistencies in 5 pedigrees. After excluding the 5 non-biological fathers from the analysis, only two Mendelian distortions were observed across the 158 SNPs, which therefore passed the quality control test. We also verified self-reported ethnicity by principal component analysis using EIGENSTRAT 28. We found that 92.8% of mothers, 89.3% of fathers and 91.1% of children had European ancestry.
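The per-SNP quality control thresholds described above (call rate >97% and Hardy-Weinberg P ≥ 1 × 10−6) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it uses the plain asymptotic chi-square test with 1 degree of freedom, whereas the study combined the chi-square test with permutations and bootstrapping. The function names and the genotype encoding (0/1/2, with None for a missing call) are assumptions made for the example.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Asymptotic chi-square test (1 df) of Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def passes_qc(genotypes, min_call_rate=0.97, hwe_threshold=1e-6):
    """Apply the call-rate and HWE filters to one SNP.

    genotypes: list of allele counts (0, 1 or 2), None for a missing call.
    Assumes at least one genotype was called.
    """
    called = [g for g in genotypes if g is not None]
    call_rate = len(called) / len(genotypes)
    _, p_hwe = hwe_chi_square(called.count(0), called.count(1), called.count(2))
    return call_rate > min_call_rate and p_hwe >= hwe_threshold
```

For example, a SNP with genotype counts 25/50/25 matches its Hardy-Weinberg expectations exactly and passes, whereas a SNP with no heterozygotes among 100 individuals fails the HWE filter.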

Statistical analyses.
We coded genotypes as 0, 1 and 2, depending on the number of copies of the lipid-increasing allele. Four genotype scores (GS) were calculated by summing the alleles of 69, 59, 73 and 40 SNPs for TC, LDL-C, HDL-C and TG, respectively. We used an unweighted GS as recommended by previous studies 29,30. Missing values were imputed by the mean in the calculation of the GS. This imputation was performed for each SNP individually using the arithmetic average of the coded genotypes observed for all the successfully genotyped individuals. Associations between child SNP/GS and lipid measurements were assessed using linear regression at each time of measurement. Our model was adjusted for sex, age and the first 10 principal components. The linear model can be written as: ln(Lipid level) ~ SNP + sex + age + PCAs (+ parental SNP) + residuals.
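The unweighted GS with per-SNP mean imputation described above can be sketched as follows. This is a minimal illustration under assumed data structures (one row per individual, one column per SNP, None for a missing genotype); the function names and the toy values are hypothetical and do not come from the study.

```python
def impute_mean(column):
    """Replace missing genotypes (None) with the mean of the observed coded genotypes."""
    observed = [g for g in column if g is not None]
    mean = sum(observed) / len(observed)
    return [mean if g is None else g for g in column]


def genotype_scores(genotypes):
    """Unweighted genotype score per individual.

    genotypes: one row per individual, one column per SNP; values are the
    number of lipid-increasing alleles (0, 1 or 2) or None if missing.
    """
    # Impute each SNP (column) separately, then sum across SNPs per individual
    columns = [impute_mean(list(col)) for col in zip(*genotypes)]
    return [sum(row) for row in zip(*columns)]


# Toy data: 3 individuals x 3 SNPs (hypothetical values)
toy = [
    [0, 1, 2],
    [1, None, 2],
    [2, 1, 0],
]
scores = genotype_scores(toy)
print(scores)
```

The missing genotype of the second individual is replaced by 1.0, the mean of the two observed genotypes at that SNP, before the alleles are summed.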
A linear mixed-effect regression model was used to account for the longitudinal nature of the data (3 lipid measurements across the follow-up). In addition, the linear mixed model allows an assessment of the association between SNP/GS and lipid levels across the follow-up period. For each trait, the beta effect and its standard deviation are expressed in units of the natural logarithm of the lipid level (mg/dl) per year, which can be read as the mean variation of the trait level across the whole follow-up. We used the intercept as a random effect and sex, age and the first 10 principal components of the EIGENSTRAT analysis as fixed effects 28. The linear mixed model can be simplified to: ln(Lipid level) ~ SNP + sex + age + PCAs (+ parental SNP) + (1|ID) + residual. In both models, we used the first 10 principal components as covariates to account for population structure.
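For readers who prefer explicit notation, the random-intercept model described above can be written out as follows. The symbols are assumptions introduced here for illustration and are not the authors' notation: i indexes children, j indexes the three measurement occasions, G is the coded SNP or GS, u_i is the child-specific random intercept corresponding to the (1|ID) term, and the optional parental SNP term is omitted.

```latex
\ln(y_{ij}) = \beta_0 + \beta_G\, G_i + \beta_{\mathrm{sex}}\, \mathrm{sex}_i
            + \beta_{\mathrm{age}}\, \mathrm{age}_{ij}
            + \sum_{k=1}^{10} \gamma_k\, \mathrm{PC}_{ik}
            + u_i + \varepsilon_{ij},
\qquad u_i \sim \mathcal{N}(0, \sigma_u^2),
\quad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

In this form, the random intercept u_i captures the correlation between the repeated lipid measurements of the same child, and the coefficient of G is the SNP/GS effect estimated over the whole follow-up.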
Because of their skewed distributions, the traits were transformed using a natural logarithm (Supplementary Figure 6). All the regression analyses were performed in the free software R 3.0.1, using the package "lme4" for the linear mixed-effect model.
The Hardy-Weinberg equilibrium was tested using a Chi-square test in combination with permutations and bootstrapping. Mendelian incompatibilities were checked using PLINK. Two-tailed P-values are presented in this manuscript. TC, LDL-C, HDL-C and TG measurements at different times are highly correlated with each other. Similarly, the different statistical tests (linear regression and mixed-model regression) performed in this study are not independent from each other. We therefore accounted only for the number of genetic markers (n = 158) and the number of measurements (n = 3: birth, 3 and 5 years) when applying Bonferroni's correction for multiple testing. We acknowledge the over-conservative nature of the Bonferroni test and the strong prior evidence of association of these SNPs with lipid levels, based on previous GWAS in adult populations. P < 1.05 × 10−4 (0.05/(158 × 3)) was therefore considered statistically significant. The effect sizes for offspring SNPs were compared at birth, 3 years and 5 years of age using a Z-test. To compare the child SNP effect sizes with those in adults, we performed a quartile normalization as in Willer et al. 8. Variances in our dataset were computed using GCTA with the same covariates as the linear models 31. The theoretical variance from the consortium summary data was computed using the following formula: $\beta_1^2 \times \mathrm{Var}(\mathrm{SNP}_1) + \beta_2^2 \times \mathrm{Var}(\mathrm{SNP}_2) + \dots$, where $\beta_i$ is the effect of SNP $i$ and $\mathrm{Var}(\mathrm{SNP}_i)$ is its genotypic variance, $2 \times \mathrm{MAF} \times (1 - \mathrm{MAF})$.
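The two calculations stated above, the Bonferroni threshold and the theoretical variance from summary statistics, can be sketched as follows. The function names and the example (beta, MAF) pairs are hypothetical. The sketch also assumes that the SNPs are independent and that the betas are expressed in standard deviation units of the trait, so that the sum can be read as a proportion of trait variance.

```python
def bonferroni_threshold(alpha, n_markers, n_measurements):
    """Significance threshold after correcting for markers x measurements."""
    return alpha / (n_markers * n_measurements)


def theoretical_variance(snps):
    """Sum of beta^2 * Var(SNP) over SNPs, with Var(SNP) = 2 * MAF * (1 - MAF).

    snps: iterable of (beta, maf) pairs taken from summary statistics.
    Assumption: SNPs are independent (no linkage disequilibrium).
    """
    return sum(beta ** 2 * 2 * maf * (1 - maf) for beta, maf in snps)


# Threshold used in the study: 0.05 / (158 SNPs x 3 measurements)
threshold = bonferroni_threshold(0.05, 158, 3)
print(f"{threshold:.3g}")

# Hypothetical summary statistics for two SNPs: (beta, MAF)
example = theoretical_variance([(0.10, 0.30), (0.05, 0.50)])
print(f"{example:.5f}")
```

The first calculation reproduces the threshold of about 1.05 × 10−4 used in this study. The second shows how each SNP contributes in proportion to both its squared effect and its genotypic variance, so a common SNP with a modest effect can contribute as much as a rarer SNP with a larger one.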