Normal thyroid function is essential for health, but its genetic architecture remains poorly understood. Here, for the heritable thyroid traits thyrotropin (TSH) and free thyroxine (FT4), we analyse whole-genome sequence data from the UK10K project (N=2,287). Using additional whole-genome sequence and deeply imputed data sets, we report meta-analysis results for common variants (MAF≥1%) associated with TSH and FT4 (N=16,335). For TSH, we identify a novel variant in SYN2 (MAF=23.5%, P=6.15 × 10−9) and a new independent variant in PDE8B (MAF=10.4%, P=5.94 × 10−14). For FT4, we report a low-frequency variant near B4GALT6/SLC25A52 (MAF=3.2%, P=1.27 × 10−9) tagging a rare TTR variant (MAF=0.4%, P=2.14 × 10−11). All common variants explain ≥20% of the variance in TSH and FT4. Analysis of rare variants (MAF<1%) using sequence kernel association testing reveals a novel association with FT4 in NRG1. Our results demonstrate that increased coverage in whole-genome sequence association studies identifies novel variants associated with thyroid function.
Thyroid hormones have fundamental but diverse roles in vertebrate physiology, ranging from induction of metamorphosis in amphibians to photoperiodic regulation of seasonal breeding in birds1. In humans, they are essential for adult health and childhood development2,3 and levothyroxine is one of the most commonly prescribed drugs worldwide. Clinically, thyroid function is assessed by measuring circulating concentrations of free thyroxine (FT4) and the pituitary hormone thyrotropin (TSH); the complex inverse relationship between them renders TSH the more sensitive marker of thyroid status4. Even small differences in TSH and FT4, within the normal population reference range, are associated with a wide range of clinical parameters, including blood pressure, lipids and cardiovascular mortality, as well as obesity, bone mineral density and lifetime cancer risk5.
Twin and family studies estimate the heritability of TSH and FT4 as up to 65% (ref. 6). Genome-wide association studies (GWAS) have identified common variants associated with TSH and FT4 (refs 7,8,9); in a recent HapMap-based meta-analysis10, we identified 19 loci associated with TSH and 4 with FT4. However, these accounted for only 5.6% of the variance in TSH and 2.3% in FT4. Therefore, most of the heritability of these important traits remains unexplained.
The unidentified genetic component of variance might be explained by common variants poorly tagged by markers assessed in previous studies, or those with small effects. However, rarer variants within the minor allele frequency (MAF) spectrum might also account for a substantial proportion of the missing heritability as has been proposed for many polygenic traits11. These variants, although individually rare (MAF<1%), are collectively frequent, and while their effects may be insufficient to produce clear familial aggregation, effect sizes for individual variants are potentially much greater than those observed for common variants. In addition, a greater understanding of the relative proportion of thyroid function explained by common variants is now possible with the availability of whole-genome sequencing (WGS) and this is essential to refine future research and analysis strategies when appraising the genetic architecture of thyroid function.
In this study, the first to utilize WGS to examine the genetic architecture of TSH and FT4, we perform single-point association analysis in two discovery cohorts in the UK10K project with WGS data available and a meta-analysis using genome-wide association study (GWAS) data with deep imputation from five additional data sets. We report three new loci associated with thyroid function in healthy individuals, undertake quantitative trait loci and DNA methylation analyses to further study these relationships and undertake genome-wide complex trait analyses (GCTA)12 to assess the contributions of common variants (MAF≥1%) to variance in thyroid function. We also explore whether there is a shared polygenic basis between TSH and FT4. In individuals with WGS data, we perform sequence kernel-based association testing (SKAT) analysis to identify regions of the genome where rare variants have the strongest association with thyroid function and identify a novel locus associated with FT4. The results demonstrate that WGS-based analyses can identify rare functional variants and associations derived from rare aggregates. Larger meta-analyses of studies with WGS data are now required to identify additional common and rare variants, which may explain the missing heritability of thyroid function.
Single-point association analysis
In the discovery study, using a meta-analysis of WGS data from the Avon Longitudinal Study of Parents and Children (ALSPAC) and TwinsUK cohorts (N=2,287) analysing up to 8,816,734 markers (Supplementary Tables 1 and 2; Supplementary Methods), we find associations at two previously described loci for TSH. These are NR3C2 (rs11728154; MAF=21.0%, B=0.21, s.e.=0.037, P=8.21 × 10−9; r2=0.99 with the previously reported rs10028213) and FOXE1 (rs1877431; MAF=39.5%, B=−0.19, s.e.=0.030, P=2.29 × 10−10; r2=0.99 with the previously reported rs965513). We find one borderline signal (between P=5.0 × 10−08 and P=1.17 × 10−08) at a novel locus FAM222A (rs11067829; MAF=18.3%, B=0.210, s.e.=0.038, P=3.73 × 10−8; Supplementary Figs 1a and 2; Supplementary Table 3). No variants show genome-wide significant association for FT4 (Supplementary Figs 1a and 3).
In a meta-analysis of the discovery cohorts and five additional cohorts, we find associations for 13 SNPs at 11 loci for TSH (N=16,335), of which 11 SNPs have been identified previously, and 4 SNPs at 4 loci for FT4 (N=13,651), of which 3 have been identified previously (Table 1; Figs 1a–c,2a,b and 3; Supplementary Figs 1b and 3–6).
To determine whether our identified associations at established loci represented previous association signals, we analysed the linkage disequilibrium (LD) between the strongest associated variants from this study and those from our previous study10 (Supplementary Table 4). The top variants from loci in both studies were in strong LD (r2>0.6), apart from MBIP and FOXE1, although these were in strong LD with variants previously associated with TSH by others8. Two SNPs associated with TSH in our study are novel. One is at SYN2 (rs310763; MAF=23.5%, B=0.082, s.e.=0.014, P=6.15 × 10−9; Fig. 1a–c). SYN2 is a member of a family of neuron-specific phosphoproteins involved in the regulation of neurotransmitter release with expression in the pituitary and hypothalamus (http://biogps.org/#goto=genereport&id=6854). We also identify one novel variant at PDE8B (rs2928167; MAF=10.4%, B=−0.145, s.e.=0.019, P=5.94 × 10−14) in linkage equilibrium (r2=0.002, D′=0.17) with the previously described variant rs6885099 (ref. 10) and independent from our top SNP rs2046045 (P=1.93 × 10−11) after conditional analysis. In the overall meta-analysis, we are unable to replicate the association between FAM222A and TSH observed in the discovery analysis (B=0.014, s.e.=0.015, P=0.378); however, we observe evidence of heterogeneity between cohorts (test for heterogeneity P=4.70 × 10−6; Supplementary Table 5), so this locus may yet find support in future WGS studies.
In our meta-analysis, we also identify four SNPs associated with FT4, three at previously established loci (DIO1, LHX3 and AADAT; Table 1; Fig. 3; Supplementary Figs 1b, 4e and 6; Supplementary Table 4). We find a novel uncommon variant at B4GALT6/SLC25A52 associated with FT4 (rs113107469; MAF=3.20%, B=0.225, s.e.=0.037, P=1.27 × 10−9; Fig. 2a). B4GALT6 is in the ceramide metabolic pathway, which inhibits cyclic AMP production in TSH-stimulated cells. However, the B4GALT6 signal (rs113107469) is in weak LD (r2<0.1, D′=0.66) with the Thr139Met substitution (rs28933981; MAF=0.4%) and it may therefore be a marker for this functional change in TTR. The Thr139Met substitution was associated with FT4 levels in our single-point meta-analysis (P=2.14 × 10−11), but was not originally observed because its MAF was below our 1% threshold. Conditional analysis of the TTR region using rs28933981 as the conditioning marker in the ALSPAC WGS cohort reveals no evidence of association between rs113107469 in B4GALT6 and FT4 (P=0.124; Fig. 2b). Analysis using direct genotyping in the ALSPAC WGS and replication cohorts confirms the effect of the Thr139Met substitution on FT4 levels. Here, 0.79% of children were heterozygous for the Thr139Met substitution, which is positively associated with FT4 (B=1.70, s.e.=0.17, 95% CI 1.37, 2.03, P=3.89 × 10−24). In the ALSPAC replication data set, rs113107469 in B4GALT6 was also positively associated with FT4 (P=0.0002); however, when conditioned on the Thr139Met substitution there was no longer any evidence of association (P=0.20). The Thr139Met substitution also appears to be functional: this mutation has increased protein stability compared with wild-type transthyretin (TTR)13,14 and tighter binding of thyroxine14, resulting in a twofold increase in thyroxine-binding affinity15,16. Further details of the likely genes related to all our observed independent novel signals are shown in Supplementary Table 6.
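The combination of a low r2 and a moderate D′ reported above follows from the mismatch in allele frequencies between the two variants. The sketch below is an illustration rather than the authors' calculation: it converts D′ to r2 for two biallelic variants, assuming the minor alleles are in coupling (D≥0) and using the rounded MAFs quoted in the text (3.2% and 0.4%). Under those assumptions, D′=0.66 gives an r2 of about 0.05, and even complete LD (D′=1) would give an r2 of only about 0.12.

```python
def r2_from_dprime(p_a: float, p_b: float, d_prime: float) -> float:
    """r^2 between two biallelic variants from minor allele frequencies and D'.

    Assumes D >= 0, i.e. the two minor alleles tend to sit on the same haplotype.
    """
    # Largest positive D that the allele frequencies allow.
    d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    d = d_prime * d_max
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


# Rounded MAFs from the text: rs113107469 (3.2%) and rs28933981 (0.4%).
r2_reported = r2_from_dprime(0.032, 0.004, 0.66)
r2_ceiling = r2_from_dprime(0.032, 0.004, 1.0)
print(f"r2 at D'=0.66: {r2_reported:.3f}; r2 at D'=1: {r2_ceiling:.3f}")
```

A rare variant can therefore be tagged by a more frequent neighbour without the pair ever reaching the r2 values normally used to define proxies.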
Expression quantitative trait locus analysis
Expression quantitative trait locus (eQTL) analysis17,18 reveals that our SYN2 variant modulates SYN2 transcription in adipose, skin and whole-blood cells, but not lymphoblastoid cell lines (Supplementary Table 7). Furthermore, bioinformatics analysis suggests that the C-allele at rs310763 attenuates an EGR1 regulatory motif19. EGR1 is expressed in thyrocytes, regulates pituitary development20,21 and may influence thyroid status via LHX3 promoter activity20. Several other variants in the SYN2 gene region are in strong LD (r2>0.8) with rs310763, including the non-synonymous coding variant rs794999. Although predicted to be benign (PolyPhen-2 score=0.002 (ref. 22)), rs794999 is located in a DNase hypersensitivity cluster23, influences four predicted regulatory motifs19 and appears to be under evolutionary constraint24,25. SNPs identified in our study, or those in LD, also showed strong eQTL associations with PDE8B (P=8.69 × 10−27), FOXE1 (P=9.10 × 10−54) and AADAT (P=7.86 × 10−9) gene expression (Supplementary Table 7).
DNA methylation analysis
To further explore cis-regulatory effects of variants identified in our study, we carried out analysis of DNA methylation profiles in whole-blood samples in 279 individuals from the TwinsUK cohort. We find evidence for a methylation quantitative trait locus (meQTL) at the novel TSH-associated variant rs2928167 in PDE8B (P=4.38 × 10−7; Supplementary Table 8), which is also an eQTL in multiple tissues (Supplementary Table 7). Recently, an meQTL analysis using the same probe (cg16418800) in adipose tissue also identified a peak signal at rs2359775 (P=6 × 10−15), which is in LD with rs2928167 (r2=0.5). We find that variants in ABO (P=2.02 × 10−23) and AADAT (P=1.80 × 10−8) also show strong evidence for cis-meQTL effects (Supplementary Table 8). In additional analyses in 745 ALSPAC children, we find strong meQTL associations for rs2359775 in PDE8B (P=3.03 × 10−28) and variants in ABO (P=1.01 × 10−101) and AADAT (P=4.18 × 10−34) (Supplementary Table 8).
Tests of the association between aggregates of rare variants (MAF<1%) in the WGS cohorts were restricted to genes relevant to thyroid function. We find no evidence of association with TSH from SKAT analyses; for FT4, however, we identify one SKAT bin with multiple-testing-corrected evidence for association (P≤1.55 × 10−5) in NRG1 (P=2.53 × 10−6; Fig. 4; Supplementary Table 9). NRG1 is a glycoprotein that interacts with the NEU/ERBB2 receptor tyrosine kinase, and is critical in organ development.
GCTA and polygenic score analysis
SNPs were thinned to a set of 2,203,581 approximately independent SNPs with an LD threshold of r2>0.2, a window size of 5,000 SNPs and step of 1,000 SNPs. A genomic relationship matrix was then generated for unrelated individuals. We fitted linear mixed-effect models and estimate that all assessed common SNPs (MAF>1%) explain 24% (95% CI 19, 29) and 20% (95% CI 14, 26) of TSH and FT4 variance, respectively (P≤0.0001; Supplementary Table 10). Polygenic score analyses21 based on SNPs with P values under a fixed threshold do not detect evidence of a polygenic signal for TSH or FT4, nor of a shared polygenic basis between thyroid function and key metabolic outcomes. However, a genetic score based on 67 SNPs previously associated with thyroid function in GWAS8,10,26 shows strong evidence of association with TSH (P=7.9 × 10−20) and FT4 (P=2.7 × 10−4) and we observe evidence of shared genetic pathways with TSH associated with the FT4 gene score (P=7.0 × 10−4). These 67 SNPs explain 7.1% (95% CI 5.2, 9.0) of the variance in TSH and 1.9% (95% CI 1.1, 3.0) of the variance in FT4. Taken together, this suggests that many loci underlying thyroid function remain unknown.
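To see why individual lead SNPs contribute so little to these totals, a standard approximation can be applied: for an additive variant in Hardy–Weinberg equilibrium and a trait scaled to unit variance, the share of variance explained is roughly 2·MAF·(1−MAF)·B². The sketch below is not the GCTA estimate reported above; it assumes the effect sizes are in s.d. units (the traits were inverse normal transformed) and uses the MAF and B values quoted in the Results for the SYN2 and PDE8B variants.

```python
def variance_explained(maf: float, beta: float) -> float:
    """Approximate share of trait variance explained by one additive SNP.

    Assumes Hardy-Weinberg equilibrium and a trait with unit variance.
    """
    return 2.0 * maf * (1.0 - maf) * beta ** 2


# MAF and B as quoted in the Results for the two novel TSH signals.
syn2 = variance_explained(0.235, 0.082)
pde8b = variance_explained(0.104, -0.145)
print(f"SYN2: {100 * syn2:.2f}% of variance; PDE8B: {100 * pde8b:.2f}% of variance")
```

Each signal accounts for well under 1% of the variance on this approximation, in line with the 67 known SNPs together explaining only a few per cent.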
We undertook a database analysis of differential gene expression in cultured cells in response to hormone stimulation. We find that SYN2 (rank 64 of 22,283 (HL60 cells)) ranks highest among the genes studied in the experiment, supporting a role for this newly discovered locus in thyroid metabolism. Two other genes, NRG1 and CAPZB, also show evidence of levothyroxine responsiveness in at least one cell line27 on the basis of a genome-wide differential expression rank in the top 5th percentile (Supplementary Table 11). Publicly available data on altered SYN2 expression in brain, limb and tail from control and levothyroxine-treated Xenopus laevis during metamorphosis also provide evidence for the relevance of SYN2 in thyroid function28.
In this study, we demonstrate the utility of WGS data (and SNP array data when deeply imputed to WGS reference panels) in appraising the genetic architecture of thyroid function. Using WGS data, we identify a rare functional variant in TTR that appears to drive the observed association between an uncommon novel variant near B4GALT6 and FT4, and we demonstrate a novel association with FT4 arising from rare aggregates in NRG1. We also show that common variants collectively account for over 20% of the variance in TSH and FT4, a substantial advance on using only the ‘top SNPs’ from earlier GWA studies10. Taken together, this work indicates that both common variants with modest effects and rare variants with larger effects might explain a substantial proportion of the missing heritability of thyroid function, but larger studies are required to identify these variants. Studies including individuals with subclinical thyroid disease, particularly those who are negative for thyroid autoantibodies, may be particularly rewarding, as rare genetic variants with large effect sizes may be associated with serum TSH and FT4 concentrations outside the inclusion ranges we used and therefore would not be detected in our analyses.
Such endeavours are clinically relevant, as there has been a dramatic increase in levothyroxine prescribing for borderline TSH levels29. At least three loci identified in this study show evidence of responsiveness to levothyroxine in cell line models, suggesting that borderline TSH levels may in some individuals reflect the influence of genetic variation rather than overt autoimmune thyroid disease, in which case thyroid hormone replacement may not be appropriate. Our results indicate that further investigation of TSH heterogeneity at the population level is necessary.
Seven populations were used in this study. They are known as the TwinsUK WGS cohort, the TwinsUK GWAS cohort, the ALSPAC WGS cohort, the ALSPAC GWAS cohort, the SardiNIA cohort, the ValBorbera cohort and the Busselton Health Study cohort. Summary statistics of each cohort and full descriptions are given in Supplementary Methods, Supplementary Tables 1 and 2. All human research was approved by the relevant institutional ethics committees.
WGS data generation
Low-read depth WGS was performed in the TwinsUK and ALSPAC cohorts as part of the UK10K project. The SardiNIA cohort also had WGS data available (see Supplementary Methods).
An inverse normal transformation was applied to each trait within each cohort. Age, age squared, gender and any other cohort-specific variables (Supplementary Table 1) were applied as covariates. Genotype imputation was performed for relevant cohorts using the IMPUTE30, MaCH31 or Minimac32 software packages, with poorly imputed variants excluded. See Supplementary Table 1 for cohort-specific details.
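A rank-based inverse normal transformation can be sketched as follows. The text does not state which rank offset was used, so the Blom offset of 3/8 and the averaging of tied ranks are assumptions, and the TSH values are hypothetical. Covariate adjustment is not shown.

```python
from statistics import NormalDist


def inverse_normal_transform(values, offset=0.375):
    """Rank-based inverse normal transform; tied values receive their average rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j to the end of the current block of tied values.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie block
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    std_normal = NormalDist()
    # Map each rank to a quantile strictly inside (0, 1), then to a z-score.
    return [std_normal.inv_cdf((r - offset) / (n - 2 * offset + 1)) for r in ranks]


tsh = [0.8, 1.9, 2.4, 1.1, 6.3]  # hypothetical TSH measurements
z = inverse_normal_transform(tsh)
print([round(v, 3) for v in z])
```

The transformation preserves the ordering of individuals while removing the skew typical of TSH, so effect sizes from different cohorts are on a comparable s.d. scale.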
Single-point association analysis
Association analysis within each cohort was performed using the SNPTEST v2 (ref. 33), GEMMA (genome-wide efficient mixed model association)34, EPACTS (efficient and parallelizable association container toolbox) or ProbABEL35 software packages. Cohort-specific quality control filters relating to call rate and Hardy–Weinberg equilibrium were applied (Supplementary Table 1). In our analysis, we assessed the change in standardized thyroid measure by allele using a MAF threshold ≥1% and a genome-wide significance threshold of P=1.17 × 10−08 (ref. 36). Meta-analyses were performed using the GWAMA (genome-wide association meta analysis) software37, which was used to perform fixed-effect meta-analyses using estimates of the allelic effect size and s.e. Two meta-analyses were performed for each phenotype: a meta-analysis of the two UK10K WGS cohorts and a meta-analysis of all seven cohorts. The ValBorbera cohort does not have FT4 phenotype data, so this cohort was not included in the meta-analyses for this phenotype. In the meta-analyses, any variants that were missing from >2 cohorts or with a combined MAF <1% were excluded. However, in the discovery analyses, a MAF of 0.5% in either cohort was accepted to prevent marginal MAF dropouts; the MAF <1% exclusion was then applied during the meta-analysis.
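The fixed-effect combination of per-cohort effect sizes and standard errors can be illustrated with an inverse-variance-weighted sketch. This is a generic implementation of the method, not GWAMA itself, and the per-cohort estimates are hypothetical.

```python
import math


def fixed_effect_meta(betas, ses):
    """Inverse-variance-weighted fixed-effect estimate, its s.e. and a two-sided P value."""
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided tail of the standard normal
    return beta, se, p


# Hypothetical estimates for one variant in three cohorts.
beta, se, p = fixed_effect_meta([0.10, 0.07, 0.09], [0.030, 0.025, 0.040])
print(f"B={beta:.4f}, s.e.={se:.4f}, P={p:.2e}")
```

Cohorts with smaller standard errors dominate the pooled estimate, which is why the larger imputed cohorts can outweigh a strong signal seen only in the smaller WGS discovery set.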
A conditional analysis was performed to identify independent association signals. Each study re-analysed significant loci using the lead SNP identified in the primary analysis (Table 1) as the conditioning marker. In cohorts where the lead SNP was not available, the best proxy was included (r2>0.8). A meta-analysis was then performed on these conditional results, using the same methods and filters as described above. The standard genome-wide significant cut-off (P<5 × 10−8) was used to identify secondary associations.
Estimation of phenotypic variance explained by genetic variants
We undertook GCTA using WGS data in the ALSPAC and TwinsUK discovery cohorts and data from the SardiNIA and Busselton cohorts to estimate the variance explained by all common SNPs (MAF>1%) in the genome for TSH and FT4, using the GCTA method of Yang et al.12 We fitted linear mixed-effect models to estimate the phenotypic variance attributable to the common SNPs (hg2). In these data sets, SNPs were thinned to a set of 2,203,581 approximately independent SNPs using the --indep-pairwise option in PLINK with an LD threshold of r2>0.2, window size of 5,000 SNPs and step of 1,000 SNPs. A genomic relationship matrix was generated for unrelated individuals, namely, those with genomic correlation <0.025. Estimates were calculated on SNPs filtered for Hardy–Weinberg equilibrium P value ≥1 × 10−6 and MAF ≥0.01. The genetic and residual variance components were estimated by the restricted maximum likelihood (REML) procedure for different MAF thresholds and for SNPs within a 250 kb window of known markers of thyroid function.
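The relatedness filter can be sketched with the off-diagonal genomic relationship used in GCTA, which averages (x_ij − 2p_i)(x_ik − 2p_i) / (2p_i(1 − p_i)) over SNPs i for individuals j and k. The greedy selection of unrelated individuals below is an assumption made for illustration (the software's own pruning may differ), and the genotypes and allele frequencies are hypothetical.

```python
def grm_offdiag(geno_j, geno_k, freqs):
    """Genomic relationship between two individuals from 0/1/2 allele counts."""
    total = 0.0
    for xj, xk, p in zip(geno_j, geno_k, freqs):
        total += (xj - 2 * p) * (xk - 2 * p) / (2 * p * (1 - p))
    return total / len(freqs)


def unrelated_subset(genotypes, freqs, cutoff=0.025):
    """Greedy filter: keep an individual only if related below cutoff to all kept so far."""
    kept = []
    for idx, geno in enumerate(genotypes):
        if all(grm_offdiag(geno, genotypes[k], freqs) < cutoff for k in kept):
            kept.append(idx)
    return kept


# Hypothetical data: individuals 0 and 1 are identical, individual 2 is not.
freqs = [0.5, 0.5, 0.5, 0.5]
genotypes = [[2, 0, 2, 0], [2, 0, 2, 0], [0, 2, 0, 2]]
kept = unrelated_subset(genotypes, freqs)
print(kept)
```

Only the individuals retained by such a filter would enter the relationship matrix passed to REML, so that the estimate reflects variance tagged by common SNPs rather than shared family environment.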
Expression quantitative trait loci analysis
Data for this study were available from a large-scale genetic association study of human gene expression traits in multiple disease-targeted tissue samples including subcutaneous fat, lymphoblastoid cell lines and whole skin, derived from 856 monozygotic (MZ) and dizygotic (DZ) female twins from the TwinsUK cohort, as part of the MuTHER project18. We interrogated only lead SNPs (or proxies in LD, r2>0.8) using Genevar software17. For whole-blood eQTL studies, samples were obtained from a large population-based study38. The whole-blood eQTL results were downloaded from the GTEx Browser at the Broad Institute on 26 November 2013 (ref. 39). We identified alias rsIDs for significant index SNPs using JLIN software and UK10K WGS data. Associations at P<1 × 10−3 were considered significant.
DNA methylation analysis
DNA methylation profiles were obtained in whole-blood samples from 279 MZ and DZ twins from the TwinsUK cohort using the Illumina Infinium HumanMethylation450 BeadChip. Illumina beta values were quantile normalized to a standard normal distribution and corrected for chip, order of the sample on the chip, bisulfite-converted DNA concentration and age. The resulting values were used for meQTL analysis, which was performed separately in two samples, first in 149 unrelated individuals from the TwinsUK WGS sample and second in 130 individuals with deeply imputed data from the TwinsUK GWAS sample. MeQTL analysis was performed for each sample in PLINK by fitting an additive model and meta-analysis across the two samples was performed in GWAMA, where we considered results without strong evidence for heterogeneity (Cochran’s Q P>0.05 and I2<0.7). We analysed genotype data at 17 sequence variants (from Table 1), where for each variant meQTL analysis was performed with all DNA methylation array CpG sites located within 50 kb of the variant, resulting in 265 pair-wise tests. MeQTL results (Supplementary Table 8) are presented for variants with nominally significant associations in both the WGS and GWAS samples and a meta-analysis P value below 1 × 10−04. In the PDE8B gene, we also considered meQTL effects at the eQTL rs251429 (Supplementary Table 7) and found nominally significant association with DNA methylation at CpG site cg16461538 (B=−0.18, s.e.=0.08, P=0.02). We assessed the association between DNA methylation levels at the CpG sites identified to harbour meQTLs in our study (Supplementary Table 8) and TSH and FT4 levels. Using the same study design as that adopted in the meQTL analysis, we obtained no nominally significant association between DNA methylation at the 11 CpG sites (Supplementary Table 8) and TSH or FT4 levels.
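The heterogeneity filter applied to the two-sample meQTL meta-analysis can be sketched as below. The function is restricted to two samples, so Cochran's Q has one degree of freedom and its P value can be obtained from the complementary error function. The effect estimates are hypothetical, and the function names are illustrative.

```python
import math


def heterogeneity_two_samples(beta1, se1, beta2, se2):
    """Cochran's Q, its P value (1 d.f.) and I^2 for a two-sample meta-analysis."""
    w1, w2 = 1.0 / se1 ** 2, 1.0 / se2 ** 2
    pooled = (w1 * beta1 + w2 * beta2) / (w1 + w2)
    q = w1 * (beta1 - pooled) ** 2 + w2 * (beta2 - pooled) ** 2
    df = 1
    p_q = math.erfc(math.sqrt(q / 2.0))  # chi-square upper tail with 1 d.f.
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return q, p_q, i2


def passes_filter(beta1, se1, beta2, se2):
    """Apply the thresholds quoted in the text: Q P>0.05 and I^2<0.7."""
    _, p_q, i2 = heterogeneity_two_samples(beta1, se1, beta2, se2)
    return p_q > 0.05 and i2 < 0.7


# Hypothetical meQTL effects in the WGS and GWAS samples.
q, p_q, i2 = heterogeneity_two_samples(-0.20, 0.10, -0.16, 0.11)
print(f"Q={q:.4f}, P={p_q:.3f}, I2={i2:.2f}")
```

Estimates that agree in sign and size pass easily, whereas opposite-signed effects with tight standard errors give a large Q and an I2 close to 1 and are excluded.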
Subsequent replication of meQTL associations observed in TwinsUK was performed in the ALSPAC cohort for which DNA methylation profiles from whole blood were available in 745 individuals. Here, data were rank transformed to follow the normal distribution and then regressed against batch number. Analyses were also performed using PLINK, adjusting for age, sex, the top 10 genetic principal components and Houseman-estimated cell counts (to account for cellular heterogeneity).
Rare variant analysis
We conducted GWAS candidate gene (AADAT, ABO, B4GALT6, CAPNS2, CAPZB, DIO1, DIRC3, ELK3, FBXO15, FGF7, FOXA2, FOXE1, GLIS3, HACE1, IGFBP2, IGFBP5, INSR, ITPK1, LHX3, LOC440389/LOC102467146, LPCAT2, MAF, MBIP, MIR1179, NETO1, NFIA, NKX2-3, NR3C2, NRG1, PDE10A, PDE8B, PRDM11, RAPGEF5, SASH1, SIVA1, SLC25A52, SOX9, SYN2, TMEM196, TPO, TTR, VAV3, VEGFA)-based analyses to test for association of the combined effects of rare variants on TSH and FT4 using SKAT-O software40. This approach maximizes statistical power by combining burden-based tests and SKAT. We used the TwinsUK and ALSPAC WGS data to examine loci with a known association with TSH and FT4. We examined all SNPs within the candidate gene regions, including variants within 50 kb on either side of the gene with MAF <1% down to a MAF of 0.04% (in a cohort), or 0.02% (overall). These analyses used sequential non-overlapping windows each containing 50 SNPs. Association at P<1.55 × 10−5 (Bonferroni corrected) was considered significant. For the meta-analysis of rare variant data from the WGS cohorts, we used SkatMeta41.
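The windowing and multiple-testing correction can be illustrated as follows. The number of windows tested is not stated here; assuming a family-wise error rate of 0.05, the quoted threshold of 1.55 × 10−5 would correspond to roughly 3,226 tests. How a final window with fewer than 50 SNPs was handled is also an assumption in this sketch.

```python
def non_overlapping_windows(variants, size=50):
    """Split an ordered list of variants into sequential non-overlapping windows."""
    return [variants[i:i + size] for i in range(0, len(variants), size)]


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


# Back-calculation from the threshold quoted in the text (alpha=0.05 is assumed).
implied_tests = round(0.05 / 1.55e-5)

# Hypothetical region containing 120 rare variants.
windows = non_overlapping_windows(list(range(120)))
print(implied_tests, [len(w) for w in windows])
```

Each window is then treated as one SKAT-O test, so the correction scales with the number of windows rather than the number of individual rare variants.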
Polygenic score analysis
We conducted polygenic score analyses to test for substantive polygenic effects on TSH and FT4 and for a shared polygenic basis between thyroid traits and a range of related phenotypes including key cardiovascular traits, metabolic, anthropometric, endocrine and bone traits. Polygenic scores have been used to summarize genetic effects for an ensemble of markers that may not individually achieve significance but are relevant to regulation of the trait. The composite score represents an overall genetic signal and can then be used to obtain evidence of a common genetic basis for related disorders42. We ranked SNPs by their marginal association with TSH and FT4 using the meta-analysis data set, with TwinsUK samples excluded (leaving N=13,874 for TSH and N=12,561 for FT4). SNPs were thinned to a set of 2,203,581 approximately independent SNPs using the --indep-pairwise option in PLINK with an LD threshold of r2>0.2, window size of 5,000 SNPs and step of 1,000 SNPs. On the basis of their associations in the meta-analysis data, SNPs were selected for constructing polygenic scores according to a range of P value thresholds. Scores were then constructed for subjects in the TwinsUK data sets by forming the weighted sum of trait-increasing alleles, with the weights taken as the effect size in the meta-analysis data.
To construct polygenic scores, we used 67 SNPs (rs10028213, rs10030849, rs10032216, rs10420008, rs10499559, rs10519227, rs10799824, rs10917469, rs10917477, rs11103377, rs113107469, rs11624776, rs116552240, rs116909374, rs11694732, rs11726248, rs11755845, rs12410532, rs13015993, rs1537424, rs1571583, rs17020124, rs17723470, rs17776563, rs2046045, rs2235544, rs2396084, rs2439302, rs28435578, rs2928167, rs3008034, rs3008043, rs310763, rs334699, rs334725, rs34269820, rs3813582, rs4704397, rs4804416, rs56738967, rs6082762, rs61938844, rs6499766, rs6885099, rs6923866, rs6977660, rs7128207, rs7190187, rs7240777, rs729761, rs73362602, rs73398284, rs737308, rs753760, rs7568039, rs7694879, rs7825175, rs7860634, rs7864322, rs7913135, rs9322817, rs944289, rs9472138, rs9497965, rs965513, rs966423 and rs9915657) that have been shown to be associated with thyroid hormone levels8,10,26. The polygenic score was then tested for association with relevant thyroid and other phenotypes in the TwinsUK sample.
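The weighted sum of trait-increasing alleles can be sketched for a single individual as below. The two effect sizes are the B values quoted in the Results for rs310763 and rs2928167; the allele counts are hypothetical, and it is assumed that each count refers to the same allele as the reported effect.

```python
def polygenic_score(allele_counts, betas):
    """Weighted sum of trait-increasing allele counts for one individual."""
    score = 0.0
    for rsid, beta in betas.items():
        count = allele_counts[rsid]
        if beta < 0:
            # Re-orient to the trait-increasing allele: flip the count and the sign.
            count, beta = 2 - count, -beta
        score += beta * count
    return score


betas = {"rs310763": 0.082, "rs2928167": -0.145}  # B values from the Results
allele_counts = {"rs310763": 1, "rs2928167": 0}   # hypothetical genotypes
score = polygenic_score(allele_counts, betas)
print(round(score, 3))
```

Flipping negative effects keeps every weight positive, so a higher score always corresponds to a genetically higher predicted trait value.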
To identify putative thyroxine-responsive genes among the candidate loci (AADAT, ABO, B4GALT6, CAPZB, DIO1, FOXE1, IGFBP2, LHX3, MAF, MBIP, MFAP3L, NR3C2, NRG1, PDE10A, PDE8B, QSOX2, SLC25A52, SYN2, TTR and VEGFA), gene expression data measured in response to levothyroxine treatment in a range of cell lines were retrieved from the Connectivity Map resource27. We considered a genome-wide differential expression rank in the top 5th percentile among 22,283 probes as evidence of differential expression.
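The percentile criterion can be checked with a short sketch. It assumes that rank 1 denotes the most strongly differentially expressed probe; the function names are illustrative. With 22,283 probes, the SYN2 rank of 64 reported in the Results sits at about the top 0.29%, well inside the 5% cutoff.

```python
def rank_percentile(rank, n_probes):
    """Position of a rank as a percentage of all probes (lower means more extreme)."""
    return 100.0 * rank / n_probes


def is_differentially_expressed(rank, n_probes=22283, top_pct=5.0):
    """Apply the top-5th-percentile criterion described in the text."""
    return rank_percentile(rank, n_probes) <= top_pct


syn2_pct = rank_percentile(64, 22283)
print(f"{syn2_pct:.2f}% -> {is_differentially_expressed(64)}")
```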
How to cite this article: Taylor, P. N. et al. Whole-genome sequence-based analysis of thyroid function. Nat. Commun. 6:5681 doi: 10.1038/ncomms6681 (2015).
Dumont, J. et al. Ontogeny, anatomy, metabolism and physiology of the thyroid. Thyroid Dis. Manag Available at http://www.thyroidmanager.org/chapter/ontogeny-anatomy-metabolismand-physiology-of-the-thyroid (2011).
Haddow, J. E. et al. Maternal thyroid deficiency during pregnancy and subsequent neuropsychological development of the child. New Engl. J. Med. 341, 549–555 (1999).
Vanderpump, M. P. The epidemiology of thyroid disease. Br. Med. Bull. 99, 39–51 (2011).
Hadlow, N. C. et al. The relationship between TSH and free T4 in a large population is complex and nonlinear and differs by age and sex. J. Clin. Endocrinol. Metab. 98, 2936–2943 (2013).
Taylor, P. N., Razvi, S., Pearce, S. H. & Dayan, C. M. A review of the clinical consequences of variation in thyroid function within the reference range. J. Clin. Endocrinol. Metab. 98, 3562–3571 (2013).
Panicker, V. et al. Heritability of serum TSH, free T4 and free T3 concentrations: a study of a large UK twin cohort. Clin. Endocrinol. (Oxf.) 68, 652–659 (2008).
Arnaud-Lopez, L. et al. Phosphodiesterase 8B gene variants are associated with serum TSH levels and thyroid function. Am. J. Hum. Genet. 82, 1270–1280 (2008).
Gudmundsson, J. et al. Discovery of common variants associated with low TSH levels and thyroid cancer risk. Nat. Genet. 44, 319–322 (2012).
Panicker, V. et al. A locus on chromosome 1p36 is associated with thyrotropin and thyroid function as identified by genome-wide association study. Am. J. Hum. Genet. 87, 430–435 (2010).
Porcu, E. et al. A meta-analysis of thyroid-related traits reveals novel loci and gender-specific differences in the regulation of thyroid function. PLoS Genet. 9, e1003266 (2013).
Bodmer, W. & Bonilla, C. Common and rare variants in multifactorial susceptibility to common diseases. Nat. Genet. 40, 695–701 (2008).
Yang, J., Lee, S. H., Goddard, M. E. & Visscher, P. M. GCTA: a tool for genome-wide complex trait analysis. Am. J. Hum. Genet. 88, 76–82 (2011).
Alves, I. L. et al. Thyroxine binding in a TTR Met 119 kindred. J. Clin. Endocrinol. Metab. 77, 484–488 (1993).
Sebastiao, M. P., Lamzin, V., Saraiva, M. J. & Damas, A. M. Transthyretin stability as a key factor in amyloidogenesis: X-ray analysis at atomic resolution. J. Mol. Biol. 306, 733–744 (2001).
Curtis, A. J. et al. Thyroxine binding by human transthyretin variants: mutations at position 119, but not position 54, increase thyroxine binding affinity. J. Clin. Endocrinol. Metab. 78, 459–462 (1994).
Hamilton, J. A. & Benson, M. D. Transthyretin: a review from a structural perspective. Cell. Mol. Life Sci. 58, 1491–1521 (2001).
Yang, T. P. et al. Genevar: a database and Java application for the analysis and visualization of SNP-gene associations in eQTL studies. Bioinformatics 26, 2474–2476 (2010).
Grundberg, E. et al. Mapping cis- and trans-regulatory effects across multiple tissues in twins. Nat. Genet. 44, 1084–1089 (2012).
Ward, L. D. & Kellis, M. HaploReg: a resource for exploring chromatin states, conservation, and regulatory motif alterations within sets of genetically linked variants. Nucleic Acids Res. 40, D930–D934 (2012).
Yaden, B. C., Garcia, M. 3rd, Smith, T. P. & Rhodes, S. J. Two promoters mediate transcription from the human LHX3 gene: involvement of nuclear factor I and specificity protein 1. Endocrinology 147, 324–337 (2006).
Savage, J. J., Yaden, B. C., Kiratipranon, P. & Rhodes, S. J. Transcriptional control during mammalian anterior pituitary development. Gene 319, 1–19 (2003).
Adzhubei, I. A. et al. A method and server for predicting damaging missense mutations. Nat. Methods 7, 248–249 (2010).
Bernstein, B. E. et al. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012).
Davydov, E. V. et al. Identifying a high fraction of the human genome to be under selective constraint using GERP++. PLoS Comput. Biol. 6, e1001025 (2010).
Lindblad-Toh, K. et al. A high-resolution map of human evolutionary constraint using 29 mammals. Nature 478, 476–482 (2011).
Medici, M. et al. A large-scale association analysis of 68 thyroid hormone pathway genes with serum TSH and FT4 levels. Eur. J. Endocrinol. 164, 781–788 (2011).
Lamb, J. et al. The Connectivity Map: using gene-expression signatures to connect small molecules, genes, and disease. Science 313, 1929–1935 (2006).
Das, B. et al. Gene expression changes at metamorphosis induced by thyroid hormone in Xenopus laevis tadpoles. Dev. Biol. 291, 342–355 (2006).
Taylor, P. N. et al. Falling threshold for treatment of borderline elevated thyrotropin levels—balancing benefits and risks: evidence from a large community-based study. JAMA Intern. Med. 174, 32–39 (2013).
Howie, B. N., Donnelly, P. & Marchini, J. A flexible and accurate genotype imputation method for the next generation of genome-wide association studies. PLoS Genet. 5, e1000529 (2009).
Li, Y., Willer, C. J., Ding, J., Scheet, P. & Abecasis, G. R. MaCH: using sequence and genotype data to estimate haplotypes and unobserved genotypes. Genet. Epidemiol. 34, 816–834 (2010).
Howie, B., Fuchsberger, C., Stephens, M., Marchini, J. & Abecasis, G. R. Fast and accurate genotype imputation in genome-wide association studies through pre-phasing. Nat. Genet. 44, 955–959 (2012).
Marchini, J., Howie, B., Myers, S., McVean, G. & Donnelly, P. A new multipoint method for genome-wide association studies by imputation of genotypes. Nat. Genet. 39, 906–913 (2007).
Zhou, X. & Stephens, M. Genome-wide efficient mixed-model analysis for association studies. Nat. Genet. 44, 821–824 (2012).
Aulchenko, Y. S., Struchalin, M. V. & van Duijn, C. M. ProbABEL package for genome-wide association analysis of imputed data. BMC Bioinformatics 11, 134 (2010).
Xu, C. et al. Estimating genome-wide significance for whole-genome sequencing studies. Genet. Epidemiol. 38, 281–290 (2014).
Magi, R. & Morris, A. P. GWAMA: software for genome-wide association meta-analysis. BMC Bioinformatics 11, 288 (2010).
Emilsson, V. et al. Genetics of gene expression and its effect on disease. Nature 452, 423–428 (2008).
GTEx Consortium. The Genotype-Tissue Expression (GTEx) project. Nat. Genet. 45, 580–585 (2013).
Wu, M. C. et al. Rare-variant association testing for sequencing data with the sequence kernel association test. Am. J. Hum. Genet. 89, 82–93 (2011).
Voorman, A., Brody, J. & Lumley, T. SkatMeta: an R package for meta analyzing region-based tests of rare DNA variants. Available at http://cran.r-project.org/web/packages/skatMeta (2013).
Dudbridge, F. Power and predictive accuracy of polygenic risk scores. PLoS Genet. 9, e1003348 (2013).
We are grateful to all the participants in the cohort studies and the staff involved including interviewers, computer and laboratory technicians, clerical workers, research scientists, volunteers, managers, receptionists and nurses. This study makes use of the data generated by the UK10K Consortium. Funding for UK10K was provided by the Wellcome Trust under award WT091310. A full list of the investigators who contributed to the generation of the data is available at www.UK10K.org. Further acknowledgements from all the cohorts and details on cohort and investigator funding can be found in the Supplementary Methods.
The authors declare no competing financial interests.