Analysis of predicted loss-of-function variants in UK Biobank identifies variants protective for disease

Less than 3% of protein-coding genetic variants are predicted to result in loss of protein function through the introduction of a stop codon, a frameshift, or the disruption of an essential splice site; however, such predicted loss-of-function (pLOF) variants provide insight into the effector transcript and the direction of biological effect. In >400,000 UK Biobank participants, we conducted association analyses of 3759 pLOF variants with six metabolic traits, six cardiometabolic diseases, and twelve additional diseases. We identified 18 new low-frequency or rare (allele frequency < 5%) pLOF variant-phenotype associations. pLOF variants in GPR151 protect against obesity and type 2 diabetes, those in IL33 protect against asthma and allergic disease, and those in IFIH1 protect against hypothyroidism. In PDE3B, pLOF variants associate with greater height, improved body fat distribution, and protection from coronary artery disease. Our findings prioritize genes for which pharmacologic mimics of pLOF variants may lower risk for disease.
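To give a sense of the scale of the analysis described above, the following minimal Python sketch computes the number of variant-phenotype tests implied by the counts in the abstract and applies the allele-frequency cutoff quoted there. The Bonferroni-style threshold is an illustrative assumption only; the abstract does not state which significance threshold the authors used, and the function names are hypothetical.

```python
# Counts taken directly from the abstract.
N_VARIANTS = 3759
N_METABOLIC_TRAITS = 6
N_CARDIOMETABOLIC_DISEASES = 6
N_ADDITIONAL_DISEASES = 12


def count_tests(n_variants: int, n_phenotypes: int) -> int:
    """Number of variant-phenotype pairs if every variant is tested against every phenotype."""
    return n_variants * n_phenotypes


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test p-value threshold under a Bonferroni correction.

    ASSUMPTION: used here only for illustration; the source does not say
    that this correction was the one applied.
    """
    return alpha / n_tests


def is_low_frequency_or_rare(allele_frequency: float, cutoff: float = 0.05) -> bool:
    """True if the variant falls under the abstract's 'allele frequency < 5%' definition."""
    return allele_frequency < cutoff


n_phenotypes = N_METABOLIC_TRAITS + N_CARDIOMETABOLIC_DISEASES + N_ADDITIONAL_DISEASES
n_tests = count_tests(N_VARIANTS, n_phenotypes)
threshold = bonferroni_threshold(0.05, n_tests)

print(f"phenotypes: {n_phenotypes}, tests: {n_tests}, illustrative threshold: {threshold:.2e}")
```

Under these assumptions, the 24 phenotypes and 3759 variants give 90,216 variant-phenotype pairs, which illustrates why a stringent per-test threshold is needed in an analysis of this kind.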


Supplementary
HLA-DQB1
HLA-DQB1 encodes the beta 1 subunit of the class II major histocompatibility complex, which is present on antigen-presenting cells and displays antigens. Polymorphisms in this gene are associated with a range of autoimmune disorders. 8

IL33
IL33 encodes interleukin-33, a cytokine that induces T helper 2 cell responses. Injection of IL33 into mice induces eosinophilia. Expression of IL33 is elevated among individuals with severe asthma. 9

GPR151
GPR151 encodes a G-protein coupled receptor of unknown function whose expression seems limited to the ...

DAP
DAP encodes death-associated protein 1, a small cytosolic mediator of interferon-gamma induced cell death. 20 Reduced expression of death-associated protein 1 in tumor tissues has been associated with an increased risk of death in colorectal cancer patients 21 and in breast cancer patients. 22

TRIM40
TRIM40 encodes tripartite motif containing 40, a member of the tripartite motif containing family, whose members have been reported to have roles in ubiquitination. TRIM40 has been reported to prevent inflammation in the gastrointestinal tract by inhibiting nuclear factor-kappaB. 23

MICA
MICA encodes MHC class I polypeptide-related sequence A, a protein that is highly expressed on the surface of intestinal epithelial cells during stress. 24 By binding to receptors on T cells and natural killer cells, MICA promotes a cytolytic response, thus inducing an anti-tumor response when expressed on tumor cells. 24 MICA has also been reported to be expressed by intestinal epithelial cells during cytomegalovirus infection 25 and in the intestinal epithelium of patients with active celiac disease. 26

PDE3B
PDE3B encodes phosphodiesterase 3B, an adipocyte-expressed enzyme that hydrolyzes cyclic adenosine monophosphate (cAMP) and inhibits lipolysis in response to insulin binding to the insulin receptor. 27 PDE3B knockout mice have been reported to have reduced aortic atherosclerosis and reduced markers of inflammation. 28 Furthermore, PDE3B knockout reduced infarct size in a mouse coronary artery ligation model of myocardial infarction. 29 Of note, cilostazol is an approved medicine that is a non-selective pharmacologic inhibitor of both phosphodiesterase 3B and the related isoform phosphodiesterase 3A. 29 In a small randomized trial of 211 participants, cilostazol significantly reduced restenosis after percutaneous coronary balloon angioplasty. 30

APOLD1
APOLD1 encodes apolipoprotein L domain containing 1, a protein of unclear function. Apolipoprotein L domain containing 1 is expressed in vascular endothelial cells and has been reported to be induced in the endothelium by electrical or chemical stimulation. 31

IFIH1
IFIH1 encodes interferon induced with helicase C domain 1, a cytoplasmic receptor that induces interferon signaling upon binding to viral RNA. 32 Gain of function mutations in IFIH1 cause Aicardi-Goutières syndrome, a rare genetic disorder characterized by lymphocytosis in the cerebrospinal fluid and cerebral atrophy. 32 Common variants in the IFIH1 locus have previously been identified as associated with psoriasis 33 and vitiligo 34

EGFL8
EGFL8 encodes epidermal growth factor like 8, a gene of unknown function.

GEM
GEM encodes a GTP binding protein induced in T cells in response to mitogen stimulation. 38 GEM has been reported to downregulate voltage-gated calcium channel activity. 39 The physiologic function of GEM is unclear.

PYGM
PYGM encodes glycogen phosphorylase, an enzyme expressed in skeletal muscle that breaks down glycogen into glucose-1-phosphate for energy during exercise. 40 Homozygous loss of function variants in PYGM cause McArdle disease, which is characterized by muscle pain and weakness upon exercise that are relieved immediately upon stopping, and by the absence of lactate formation during exercise. 41