Large meta-analysis of genome-wide association studies identifies five loci for lean body mass

Lean body mass, consisting mostly of skeletal muscle, is important for healthy aging. We performed a genome-wide association study for whole body (20 cohorts of European ancestry with n = 38,292) and appendicular (arms and legs) lean body mass (n = 28,330) measured using dual energy X-ray absorptiometry or bioelectrical impedance analysis, adjusted for sex, age, height, and fat mass. Twenty-one single-nucleotide polymorphisms were significantly associated with lean body mass either genome wide (p < 5 × 10⁻⁸) or suggestively genome wide (p < 2.3 × 10⁻⁶). Replication in 63,475 (47,227 of European ancestry) individuals from 33 cohorts for whole body lean body mass and in 45,090 (42,360 of European ancestry) subjects from 25 cohorts for appendicular lean body mass was successful for five single-nucleotide polymorphisms in/near HSD17B11, VCAN, ADAMTSL3, IRS1, and FTO for total lean body mass and for three single-nucleotide polymorphisms in/near VCAN, ADAMTSL3, and IRS1 for appendicular lean body mass. Our findings provide new insight into the genetics of lean body mass.


Supplementary Tables
Supplementary Table 1: Discovery and replication meta-analyses for all 16 SNPs taken into replication, including replication samples of European ancestry and replication samples of both European and non-European ancestry (bold P-values indicate SNPs successfully replicated).

KBioSciences
The majority of the studies participating in the replication stage had genotyping performed by K-Biosciences (www.kbioscience.co.uk). All SNPs genotyped by K-Biosciences used a competitive allele-specific PCR (KASPar) assay. A Y-chromosome-specific assay was evaluated in all samples. We performed standardized quality control for the K-Biosciences genotyping, including: 1) visual checking of the genotype clustering of the results; 2) exclusion of samples with >20% of genotypes missing (no genotype for 7 or more of the SNPs), which removed 82 samples (4.6%); 3) exclusion of SNPs with a call rate <90%, which removed 0 SNPs; and 4) a Hardy-Weinberg equilibrium (HWE) check to retain only SNPs with a p-value > 0.00147, which all SNPs passed.
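The numeric QC thresholds above (steps 2 to 4) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the pipeline that was actually run: the data layout (a dict mapping SNP id to per-sample calls coded 0, 1, 2 or None), the function names, the order of the filters, and the use of a 1-df chi-square HWE test are all hypothetical, since the text does not say which HWE test was applied.

```python
import math

SAMPLE_MISSING_MAX = 0.20  # exclude samples with >20% of genotypes missing
SNP_CALL_RATE_MIN = 0.90   # exclude SNPs with call rate <90%
HWE_P_MIN = 0.00147        # retain only SNPs with HWE p-value above this


def hwe_chi2_p(n_aa, n_ab, n_bb):
    """HWE p-value from genotype counts via a 1-df chi-square test (assumed test)."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        return 1.0
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 degree of freedom
    return math.erfc(math.sqrt(chi2 / 2))


def qc(genotypes):
    """Return (kept sample indices, kept SNP ids) after the three numeric filters."""
    snps = list(genotypes)
    n_samples = len(genotypes[snps[0]])
    # Step 2: drop samples with too many missing genotypes
    keep_samples = [
        i for i in range(n_samples)
        if sum(genotypes[s][i] is None for s in snps) / len(snps) <= SAMPLE_MISSING_MAX
    ]
    keep_snps = []
    for s in snps:
        calls = [genotypes[s][i] for i in keep_samples]
        if not calls:
            continue
        called = [c for c in calls if c is not None]
        # Step 3: SNP call rate, computed on the retained samples (assumption)
        if len(called) / len(calls) < SNP_CALL_RATE_MIN:
            continue
        # Step 4: Hardy-Weinberg equilibrium check
        counts = [called.count(g) for g in (0, 1, 2)]
        if hwe_chi2_p(*counts) > HWE_P_MIN:
            keep_snps.append(s)
    return keep_samples, keep_snps
```

With 34 SNPs, a sample missing 7 genotypes has 20.6% missingness and is dropped, which matches the "7 or more of the SNPs" rule quoted above.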

Uppsala SNP and Seq Technology Platform - ULSAM/PIVUS
The ULSAM and PIVUS samples (n=2,213) were genotyped for 34 SNPs using the Illumina GoldenGate assay. 3 Allele signal intensities were read out by the Illumina BeadXpress system and converted into genotypes using the software

Kuopio Sequenom - METSIM
Genotyping of SNPs was performed using the Sequenom iPLEX Gold SBE assay at the University of Eastern Finland. The Sequenom iPLEX call rate was 90.2-96.9%, and the discordance rate was 0% among the 4.2% of DNA samples genotyped in duplicate.
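As an illustration of how a duplicate-sample discordance rate like the one reported above can be computed, here is a minimal Python sketch. The function name, the genotype encoding, and the choice to skip pairs with a missing call are assumptions made for the example, not details given in the text.

```python
def discordance_rate(first_run, second_run):
    """Fraction of duplicate genotype pairs whose calls disagree.

    Pairs with a missing call (None) in either run are skipped (assumption).
    """
    compared = [
        (a, b) for a, b in zip(first_run, second_run)
        if a is not None and b is not None
    ]
    if not compared:
        raise ValueError("no comparable duplicate pairs")
    return sum(a != b for a, b in compared) / len(compared)
```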

BioServe Biotechnologies, Ltd. - Women's Health Initiative
The Women's Health Initiative samples (n=2,406 non-Hispanic women of European descent) were genotyped for all replication SNPs using the Sequenom iPLEX platform, which is based on multiplexed PCR followed by a mass extend reaction that produces an allele-specific extension product. After resin clean-up, the extension products are spotted onto chips and struck by a laser in a MALDI-TOF mass spectrometer, which measures their mass. Automated allele calling is carried out in real time using Sequenom's Typer 3.4 software, which converts the mass of each extension product to an allele call. All calling thresholds were automated. Quality control was achieved by typing internal positive control samples of known genotype alongside no-template controls and by QC of replicate samples. Ninety samples (~3% of the total) were repeated. In group 1, there was 0% discordance for 14 SNPs, and in group 2 there was 1% discordance for 8 SNPs. Genotyping plates were reviewed for results from positive- and negative-DNA control wells, which are organized in specific patterns to assist in the QC process and to ensure correct plate orientation during processing and data review. In initial assay development, DNAs from 20 individuals from Coriell's Polymorphism Discovery Resource were used. SNP assays with genotype call rates of <80% were excluded or redesigned. For SNPs with a high minor allele frequency, the distributions of major-allele homozygotes, heterozygotes, and minor-allele homozygotes were examined.

Sex-specific analyses
Given the known sexual dimorphism of lean body mass, and evidence from variance decomposition studies that this might reflect sex-specific genetic effects, 4 we performed sex-specific meta-analyses of all successfully replicated SNPs to examine potential differences in association results between men and women. Furthermore, since two different methods (DXA and BIA) were used to measure lean mass, for successfully replicated associations, we also conducted meta-analysis in DXA and

Technique-specific analyses
Two body composition techniques were employed by the participating studies. The first technique was dual energy x-ray absorptiometry (DXA). DXA uses an x-ray tube that produces two energy peaks. The ratio of soft-tissue attenuation (R_ST) at the two energies is measured. The attenuation ratios of pure fat (R_F) and of bone-free lean tissue (R_L) are known from both theoretical calculations and human experiments. Given a subject's R_ST and the known R values for fat and lean, one can solve two equations (one at each x-ray energy) with two unknowns to calculate the proportion of fat and lean tissue in each pixel.
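The "two equations, two unknowns" step can be illustrated with a short Python sketch. It assumes the usual log-linear attenuation model, in which the measured log-attenuation at each energy is the sum of a fat term and a lean term; the function name and the attenuation coefficients in the usage note are hypothetical values chosen only to make the arithmetic visible, not calibration constants from any scanner.

```python
def solve_fat_lean(att_low, att_high, mu_fat, mu_lean):
    """Solve for fat and lean areal densities (g/cm^2) in one soft-tissue pixel.

    att_low, att_high: measured log-attenuation ln(I0/I) at the two energies.
    mu_fat, mu_lean:   (low, high) mass attenuation coefficients in cm^2/g.

    Model (assumed):  att_low  = mu_fat[0] * m_fat + mu_lean[0] * m_lean
                      att_high = mu_fat[1] * m_fat + mu_lean[1] * m_lean
    """
    a, b = mu_fat[0], mu_lean[0]
    c, d = mu_fat[1], mu_lean[1]
    det = a * d - b * c
    if det == 0:
        raise ValueError("fat and lean have the same attenuation ratio")
    # Cramer's rule for the 2x2 linear system
    m_fat = (att_low * d - b * att_high) / det
    m_lean = (a * att_high - c * att_low) / det
    return m_fat, m_lean


def fat_fraction(m_fat, m_lean):
    """Proportion of fat in the soft tissue of the pixel."""
    return m_fat / (m_fat + m_lean)
```

For example, with illustrative coefficients mu_fat = (0.20, 0.17) and mu_lean = (0.26, 0.18), a pixel with att_low = 4.9 and att_high = 3.55 solves to 5 g/cm^2 of fat and 15 g/cm^2 of lean. In this model the pixel's attenuation ratio att_low / att_high falls between the ratios of pure fat and pure lean, which is why R_ST carries the composition information.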
Bioimpedance analysis (BIA) is another approach to estimating body composition. It relies on the geometrical relationship between:
1. Impedance of the body (composed of the resistance of the fat free mass and the reactance produced by the capacitance of cellular membranes, tissue interfaces and nonionic tissues)
2. Height of the subject
3. Volume of fat free mass
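The geometrical relationship between these three quantities is commonly motivated by treating the conducting tissue as a uniform cylinder. The short derivation below is a sketch under that assumption; the resistivity symbol \rho and the cylinder model are not stated in the text and are introduced here only for illustration.

```latex
% Cylinder model (assumption): conductor of length L, cross-section A, resistivity \rho
\begin{align}
  R &= \rho \,\frac{L}{A}, \qquad V = A\,L \\
  \Rightarrow\quad V &= \rho \,\frac{L^{2}}{R} \\
  % Substituting subject height H for L and measured impedance Z for R:
  V_{\mathrm{FFM}} &\approx \rho \,\frac{H^{2}}{Z}
\end{align}
```

Under this approximation, the volume of fat free mass scales with height squared divided by impedance, which is the term that BIA prediction equations typically build on.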
Further refinements of BIA have produced equations using reactance and resistance to predict appendicular skeletal muscle mass. 5 Cohorts with access to the reactance and resistance data for their BIA measures calculated the appendicular lean mass using the equation developed by Kyle. 6 Genome-wide association analyses with whole body and appendicular lean mass from cohorts with BIA were meta-analyzed separately from cohorts with whole body lean mass and appendicular lean mass derived from DXA. Formal tests for heterogeneity by the technique used to measure whole body and appendicular lean mass were performed using I² statistics implemented in the METAL package.
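For readers unfamiliar with the I² statistic, the following Python sketch shows the standard calculation from Cochran's Q for a set of effect estimates (for example, one DXA and one BIA meta-analysis result for the same SNP). It is an illustration of the textbook formulas with a hypothetical function name, and it is not a description of how METAL is implemented internally.

```python
def heterogeneity(betas, ses):
    """Fixed-effect pooled estimate, Cochran's Q, and I^2 (in percent).

    betas: effect estimates from each stratum or study
    ses:   their standard errors
    """
    weights = [1.0 / se ** 2 for se in ses]  # inverse-variance weights
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    # I^2 = max(0, (Q - df) / Q), expressed as a percentage
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, q, i2
```

An I² near 0% indicates that the DXA and BIA effect estimates differ no more than expected by chance, while values approaching 100% indicate that most of the observed variation reflects real differences between techniques.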
Cohorts participating in this project had measures of body composition obtained using either DXA (up to 21,074 subjects) or BIA (up to 17,218 subjects). Based on previous work showing a high correlation between the two phenotypes, 7, 6 the primary analysis combined all discovery cohorts regardless of technique used to measure lean mass. Nevertheless, we also stratified the discovery meta-analysis by technique of lean mass measurement, and performed secondary technique-specific meta-analyses for the eight successfully replicated associations. In general there was no evidence of heterogeneity, with all effect sizes being in the same direction and of similar magnitude for DXA vs. BIA (Supplementary Table 3).

Replication and meta-analyses with all replication samples
See Methods section of manuscript.

Candidate genes reported previously
We examined the associations of 1,440 SNPs in five candidate genes previously reported to be associated with lean mass (GREM1, CNTF, GLYAT, TRH, and PRDM16) with whole body and appendicular lean mass. After correcting for multiple testing using a false discovery rate of 0.05, none of the SNPs in any of the genes was significantly associated with either whole body or appendicular lean mass.
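A false discovery rate threshold of 0.05 is often applied with the Benjamini-Hochberg step-up procedure. The text does not name the procedure used for the candidate-gene SNPs, so the sketch below assumes Benjamini-Hochberg; the function name is hypothetical.

```python
def benjamini_hochberg(pvalues, fdr=0.05):
    """Return a list of booleans marking which tests are significant at the given FDR."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k such that p_(k) <= (k / m) * fdr
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * fdr:
            max_rank = rank
    # Reject every hypothesis ranked at or below that k
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            significant[idx] = True
    return significant
```

Applied to the 1,440 candidate-gene p-values, a result with no True entries corresponds to the finding reported above that no SNP passed the FDR threshold.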

Expression Analyses
Muscle tissues: was completed on 60 Caucasian subjects from the total study (representing 5 men and 5 women from each of 6 treatment groups) according to manufacturer instructions. Three samples did not pass quality control measures for either cRNA input to microarray or microarray result quality using standard limits. Array data were quantified using the PLIER algorithm in Expression Console (Affymetrix). Genomic DNA was isolated from blood or skeletal muscle using DNeasy kits (Qiagen, Valencia, CA) and amplified using REPLI-g mini kits (Qiagen). Genotypes were generated from HumanOmni5-Quad BeadChips, using to perform the imputation. SNPs with fewer than 5 allele counts and r² hat (an index of imputation quality from the Minimac package) < 0.3 were excluded from eQTL analyses. For cis-eQTL analysis we 1) performed factor analysis via PEER 16
SNPs were obtained using a custom-designed Axiom array (Affymetrix, Santa Clara, CA) designed to capture variants with minor allele frequency ≥0.05 observed in 296 Pima Indians with whole genome sequencing data. Genotypes of 1 SNP (rs4842924) were imputed based on data from the Axiom chip, using the whole-genome sequencing data of 296 Pima Indians as a reference in the program Minimac. 15 The associations between 3 SNPs and 5 transcripts (for a total of 5 SNP-transcript pairs) were tested using linear regression analyses based on additive genetic models, adjusting for effects of age and population admixture. Analyses were modeled using the proc mixed command in SAS (Cary, NC) to include the empirically estimated identity-by-descent as a random effect.
Chest wall muscle biopsies from patients undergoing thoracic surgery for lung and cardiac diseases 18: A total of 75 chest wall muscle biopsies were obtained from patients undergoing thoracic surgery for lung and cardiac diseases. Tissue collection, RNA and DNA isolation, expression profiling, and DNA genotyping have been described previously. 18 A gene expression profile with Agilent custom array (Agilent, Santa Clara, CA) and genome-wide genotyping of Illumina SNP genotyping arrays were available on all 75 patients. All gene expression levels were adjusted for age, sex, race, and study center. We estimated cis-eQTL of the 5 replicated GWAS SNPs or their proxies (SNPs in LD with r² ≥ 0.7) with selected transcripts within 500 kb of the SNP position. The Kruskal-Wallis test was used to determine the associations between adjusted expression levels and genotypes.
were available on 400 children from families recruited through a proband with asthma. The detailed study design was described elsewhere. 20 We also profiled expression levels using the Illumina Human 6 BeadChips on an additional 550 children from the UK (recruited from families with atopic dermatitis probands). These individuals were genotyped using the Illumina HumanHap300 Genotyping Beadchip. Inverse normal transformation was used to normalize the skewed distribution in both samples. MACH
Primary osteoblasts were derived from trabecular bone of proximal femora obtained from donors undergoing total hip replacement. Tissue collection, RNA and DNA isolation, expression profiling, and DNA genotyping have been described in detail. 23 All gene expression levels were adjusted for sex and year of birth. We studied the cis-eQTL of the 5 replicated GWAS SNPs or their proxies (SNPs in LD with r² ≥ 0.7) with selected transcripts within 500 kb of the SNP position. The linear regression model implemented in PLINK 24 was used to determine the association between adjusted expression levels and genotypes.
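Two of the steps named in this section, rank-based inverse normal transformation of expression values and an additive-model regression of expression on genotype, can be sketched in Python as follows. The function names and the Blom rank offset (c = 3/8) are assumptions chosen for illustration; the text does not state which offset was used, and the regression shown is plain least squares rather than the PLINK or SAS implementations cited above.

```python
from statistics import NormalDist


def inverse_normal_transform(values, c=0.375):
    """Rank-based inverse normal transform with average ranks for ties.

    c is the rank offset; 3/8 (Blom) is an assumption, not taken from the text.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    standard_normal = NormalDist()
    return [standard_normal.inv_cdf((r - c) / (n - 2 * c + 1)) for r in ranks]


def additive_slope(dosages, expression):
    """Least-squares slope of expression on allele dosage (0, 1, 2)."""
    n = len(dosages)
    mean_x = sum(dosages) / n
    mean_y = sum(expression) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, expression))
    return sxy / sxx
```

In an additive model the slope is the expected change in (transformed) expression per copy of the coded allele, so a cis-eQTL test amounts to asking whether that slope differs from zero.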

SNP Annotation and Enrichment analysis of human tissue-specific regulatory elements of GWAS loci
For coding variants, we predicted their function by PolyPhen-2. 25 For all variants, we annotated potential regulatory functions of our replicated GWAS SNPs and loci based on experimental epigenetic evidence including DNase hypersensitive sites, histone modifications, and transcription factor binding sites in human cell lines and tissues from the ENCODE Project and the Epigenetic Roadmap Project. We first examined GWAS loci for GWAS lead SNPs and SNPs in high LD (r 2 ≥ 0.8) with GWAS lead SNPs. We then identified potential enhancers and promoters in the GWAS loci across 127 healthy human tissues/normal cell lines available in the ENCODE Project and the Epigenetic Roadmap Project from the HaploReg4 web browser 26 using ChromHMM. 27 To evaluate if replicated GWAS loci were enriched with regulatory elements in skeletal muscle tissue, we performed a hypergeometric test to examine whether estimated tissue-specific promoters and enhancers in each GWAS locus were enriched in 8 relevant skeletal muscle tissues/cell lines vs non-skeletal muscle tissues (119 tissues/cell lines).
A minimum p-value permutation approach (100,000 permutations) was used to correct for multiple testing. Permutation p-values < 0.05 were considered statistically significant. In addition, we performed enrichment analyses in smooth muscle tissues/cells, fat tissue, brain tissues, blood cells, and gastrointestinal tract tissues. The 8 skeletal muscle relevant tissues/cells were excluded when conducting enrichment analyses for other tissue types.
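The hypergeometric enrichment test can be illustrated with a short Python sketch. The mapping of the test onto the data is an assumption made for this example: N is the total number of tissues/cell lines (127), K the number of skeletal muscle tissues (8), n the number of tissues in which a locus carries a predicted enhancer or promoter, and k the number of those that are skeletal muscle. The function name is hypothetical, and the permutation-based multiple-testing correction is not shown.

```python
from math import comb


def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) when n items are drawn without replacement from N, K of which are 'successes'."""
    upper = min(K, n)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / comb(N, n)
```

For instance, hypergeom_upper_tail(6, 8, 20, 127) gives the probability that at least 6 of the 8 skeletal muscle tissues would be among 20 annotated tissues if annotation were unrelated to tissue type.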

RNA-Seq analysis of gene expression in young versus old men, comparison with GWAS
Skeletal muscle biopsies from the vastus lateralis were taken from 19 healthy young (mean age 21 years, range 19-29 years, SD 2.7) and 18 healthy older (mean age 69 years, range 64-79 years, SD 4.6) male individuals using established procedures. 28,29 We used only males to maximize the likelihood of detecting differential gene expression using RNA sequencing (RNA-Seq). None of the subjects performed regular physical exercise or carried out any form of exercise in the 48 hours preceding the biopsy, and none had undergone an orthopedic procedure (i.e., joint arthroplasty) in the leg where the biopsy was taken. All subjects were studied between 0800h and 1200h in the fasted state with no caffeine or alcohol intake for 24 hours before the biopsy, and none of the subjects smoked or had significant medical disorders (diabetes, nerve or muscle disease, hypercholesterolemia requiring statins, or cardiovascular disease other than hypertension requiring at most one medication). Samples were taken under ethical approval from the Hamilton Health Sciences Research Ethics Board (IRB # 03-267, 05-376, 09-148) and all subjects provided informed, written consent.
RNA was extracted from the tissue using standard protocols and subjected to RNA-Seq using 50 bp single reads on a HiSeq2000 using protocols recommended by the manufacturer. Data analysis was performed with the RNA-Seq workflow module of the systemPipeR package 30 available on Bioconductor. 31 Quality reports were generated with the seeFastq function. RNA-Seq reads were mapped with the splice junction aware short read alignment suite Bowtie2/Tophat2 32,33 against the H. sapiens genome sequence from Ensembl (Release 83). For the alignments, we used default parameters of Tophat2 optimized for mammalian genomes. Raw expression values in the form of gene-level read counts were generated with the summarizeOverlaps function. 34 We counted only reads overlapping exonic regions of genes, discarding reads mapping to ambiguous regions of exons from overlapping genes. Given the nonstranded nature of the RNA-Seq libraries, read counting was performed in a non-strand-specific manner. For the 60,670 Ensembl IDs mapped, we then analyzed differential expression using the multtest package, 35 with the Benjamini-Hochberg method to correct for multiple testing. As our sample size was ~20 per group, we permuted the t-statistic 500,000 times to compare gene expression in the young group versus the old group, to minimize the likelihood of identifying false positives.
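The idea of permuting the t-statistic for a single gene can be sketched in Python as below. This is an illustration only: it uses a Welch t-statistic, a two-sided comparison, and a small default number of permutations, all of which are assumptions; the analysis described above used the multtest package in R with 500,000 permutations, and its exact settings are not given in the text.

```python
import random
from statistics import mean, variance


def welch_t(a, b):
    """Welch two-sample t-statistic (unequal variances)."""
    return (mean(a) - mean(b)) / (variance(a) / len(a) + variance(b) / len(b)) ** 0.5


def permutation_p(young, old, n_perm=10_000, seed=0):
    """Two-sided permutation p-value for a difference in expression between groups."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(welch_t(young, old))
    pooled = list(young) + list(old)
    n_young = len(young)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # randomly reassign group labels
        if abs(welch_t(pooled[:n_young], pooled[n_young:])) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero
    return (hits + 1) / (n_perm + 1)
```

Running this per gene and then applying a Benjamini-Hochberg correction across genes mirrors the two-stage logic described above: permutation guards against distributional assumptions with ~20 subjects per group, and the FDR step accounts for testing tens of thousands of genes.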
We examined whether genes in or near SNPs in LD (r² > 0.8) with our replicated SNPs for whole body or appendicular lean mass were differentially expressed in young versus old muscle. Of 1,420 genes differentially expressed with age at a p-value of less than 0.05, none were in or near the five significant GWAS loci. Thus, there was no significant evidence of a relationship between the GWAS loci and the genes differentially expressed with age.

CoLaus:
The design of the CoLaus study has been described previously. 39
2004. 82 The women were all 25 years of age at inclusion. Initially 2,394 women were invited and, after excluding subjects who were pregnant at the time of the baseline investigation or during the previous 12 months, a total of 1,061 women participated in the study. All participants answered a detailed questionnaire regarding their general health; BMD and body composition were assessed by DXA. The data reported in this analysis are based on women for whom genotype and body composition data were available, corresponding to 1,001 women. All participants gave written informed consent and the Lund University Ethics Committee approved the study. The study was performed according to the principles of the Helsinki declaration.
have been performed on average two years after the baseline investigation and is also performed at the latest investigation. The present analysis is based on participants with DNA samples and a total lean mass measurement by DXA at 72 years of age (n=856).

Relationship between Insulin Sensitivity and Cardiovascular Disease (RISC):
The primary objective of the RISC study is to establish whether insulin resistance predicts the development of atherosclerosis as measured by carotid intima-media thickness (cIMT). 84 Other objectives are to determine whether insulin resistance predicts the deterioration of CVD risk markers, onset of diabetes, and obesity. The study has also been designed to determine the genetic and environmental contributions to insulin resistance. Understanding the relative importance of genetic and environmental factors and how they interact is critically important for the development of specific treatments and for the identification of high-risk individuals. Further, the study aims to develop a novel method to more easily identify insulin-resistant subjects in clinical practice. In brief, participants were recruited from the local population at 19 centers in 14 countries in Europe, according to the following inclusion criteria: either sex, age between 30 and 60 years, and clinically healthy, stratified by sex and age according to 10-year age groups. Initial exclusion criteria were: treatment for obesity, hypertension, lipid disorders or diabetes; pregnancy; cardiovascular or chronic lung disease; weight change of 5 kg or more in the last 6 months; cancer (in the last 5 years); and renal failure. Exclusion criteria after screening were: arterial blood pressure 140/90 mm Hg or higher; fasting plasma glucose 7.0 mmol/liter or greater; 2-h plasma glucose [on a 75-g oral glucose tolerance test (OGTT)] 11.0 mmol/liter or greater; total serum