A genomic data archive from the Network for Pancreatic Organ donors with Diabetes

Perry, Daniel J.; Shapiro, Melanie R.; Chamberlain, Sonya W.; Kusmartseva, Irina; Chamala, Srikar; Balzano-Nogueira, Leandro; Yang, Mingder; Brant, Jason O.; Brusko, Maigan; Williams, MacKenzie D.; McGrail, Kieran M.; McNichols, James; Peters, Leeana D.; Posgai, Amanda L.; Kaddis, John S.; Mathews, Clayton E.; Wasserfall, Clive H.; Webb-Robertson, Bobbie-Jo M.; Campbell-Thompson, Martha; Schatz, Desmond; Evans-Molina, Carmella; Pugliese, Alberto; Concannon, Patrick; Anderson, Mark S.; German, Michael S.; Chamberlain, Chester E.; Atkinson, Mark A.; Brusko, Todd M.

doi:10.1038/s41597-023-02244-6

Data Descriptor
Open access
Published: 26 May 2023

Daniel J. Perry ORCID: orcid.org/0000-0002-0607-2238¹^na1,
Melanie R. Shapiro ORCID: orcid.org/0000-0003-2090-0877¹^na1,
Sonya W. Chamberlain²,
Irina Kusmartseva ORCID: orcid.org/0000-0003-2614-0813¹,
Srikar Chamala¹,
Leandro Balzano-Nogueira¹,
Mingder Yang¹,
Jason O. Brant^1,3,
Maigan Brusko¹,
MacKenzie D. Williams¹,
Kieran M. McGrail¹,
James McNichols¹,
Leeana D. Peters¹,
Amanda L. Posgai ORCID: orcid.org/0000-0002-9491-0958¹,
John S. Kaddis⁴,
Clayton E. Mathews^1,5,
Clive H. Wasserfall¹,
Bobbie-Jo M. Webb-Robertson ORCID: orcid.org/0000-0002-4744-2397^1,6,
Martha Campbell-Thompson^1,7,
Desmond Schatz⁵,
Carmella Evans-Molina⁸,
Alberto Pugliese⁹,
Patrick Concannon^1,10,
Mark S. Anderson²,
Michael S. German²,
Chester E. Chamberlain²,
Mark A. Atkinson^1,5 &
…
Todd M. Brusko ORCID: orcid.org/0000-0003-2878-9296^1,5

Scientific Data volume 10, Article number: 323 (2023)

Abstract

The Network for Pancreatic Organ donors with Diabetes (nPOD) is the largest biorepository of human pancreata and associated immune organs from donors with type 1 diabetes (T1D), maturity-onset diabetes of the young (MODY), cystic fibrosis-related diabetes (CFRD), type 2 diabetes (T2D), gestational diabetes, islet autoantibody positivity (AAb+), and without diabetes. nPOD recovers, processes, analyzes, and distributes high-quality biospecimens, collected using optimized standard operating procedures, and associated de-identified data/metadata to researchers around the world. Herein, we describe the release of high-parameter genotyping data from this collection. 372 donors were genotyped using a custom precision medicine single nucleotide polymorphism (SNP) microarray. Data were technically validated using published algorithms to evaluate donor relatedness, ancestry, imputed HLA, and T1D genetic risk score. Additionally, 207 donors were assessed for rare known and novel coding region variants via whole exome sequencing (WES). These data are publicly available to enable genotype-specific sample requests and the study of novel genotype:phenotype associations, aiding in the mission of nPOD to enhance understanding of diabetes pathogenesis to promote the development of novel therapies.

Background & Summary

Genetic predisposition to risk for or protection from type 1 diabetes (T1D) is highly polygenic, with the total possible set of disease-associated variants yet to be fully defined¹. Genome-wide association studies (GWAS) have identified population-level risk loci (minor allele frequency (MAF) >1%), dominated by Human Leukocyte Antigen (HLA) class II and insulin, and accompanied by 77 additional regions, which in total cover over 3,600 predicted causal moderate effect size variants (odds ratio (OR) <2) associated with genes thought to impact leukocyte and pancreatic β-cell function^2,3. While a combination of many such population-level variants may contribute to the development of “classical” T1D^2,3 and latent autoimmune diabetes in adults (LADA)⁴, we are beginning to appreciate that rare (MAF ≤1%) larger effect size (OR ≥2) variants may explain the “missing heritability” of autoimmune diabetes⁵ (Fig. 1a). In support of this notion, rare variants with large effect size are associated with monogenic autoimmune forms of diabetes including immune dysregulation, polyendocrinopathy, enteropathy, X-linked syndrome (IPEX), signal transducer and activator of transcription 3 (STAT3)-, and cytotoxic T-lymphocyte protein 4 (CTLA4)-associated diabetes⁶, in addition to non-autoimmune forms of diabetes such as maturity-onset diabetes of the young (MODY) and cystic fibrosis-related diabetes (CFRD)⁷.

The Network for Pancreatic Organ donors with Diabetes (nPOD)⁸, founded in 2007, has become the largest biorepository of human pancreata, pancreatic lymph nodes, and spleen from organ donors with T1D, MODY, CFRD, T2D, gestational diabetes, non-diabetic islet autoantibody positive (AAb+) donors, and non-diabetic autoantibody-negative (AAb-) control donors⁹. nPOD provides worldwide distribution of biospecimens to researchers working to elucidate T1D pathogenesis in order to promote the development of new strategies for prevention and treatment. To date (September 2022), nPOD has supplied biosamples to > 280 independent research projects studying β-cell physiology, β-cell differentiation, immunology, T1D biomarkers, technology development, T1D pathology, and diabetes etiology (https://www.jdrfnpod.org/publications/current-npod-projects/, accessed October 21, 2021). A major goal of nPOD, in addition to biosample distribution, is the sharing of de-identified donor data from multiple core laboratories to facilitate discovery efforts. The nPOD Data Portal provides approved investigators with access to donor clinical and demographic information, serum HbA1c and C-peptide levels, islet AAb status (insulin, glutamic acid decarboxylase [GAD], insulinoma-associated antigen-2 [IA-2], zinc transporter 8 [ZnT8])¹⁰, pancreas weights, and histopathology reviews^8,11 (https://portal.jdrfnpod.org/, accessed October 21, 2022). Whole slide scans from hematoxylin and eosin-stained (H&E) sections are available for online viewing via the Online Pathology portal (https://aperioeslide.ahc.ufl.edu/, accessed October 21, 2022) for access to cross-sectional pancreas morphology as well as multiplex immunohistochemistry (IHC)-stained sections for insulin, glucagon, somatostatin, and pancreatic polypeptide (PP) to visualize endocrine β-, α-, δ-, and PP cells, respectively. 
Multiplex IHC staining panels are also available for markers including, but not limited to, Ki67, CD3, insulin, and/or glucagon for quantification of cell proliferation and immune cell infiltration in pancreatic endocrine and exocrine compartments. Histopathology reports summarizing blinded assessment of H&E- and IHC-stained sections from pancreas and other available organs are provided to detail islet parameters and major abnormalities. In terms of genetic data, the standard operating procedures (SOPs) for nPOD donors were previously limited to the collection of high-resolution four-digit HLA genotypes^8,12. The availability of additional high-parameter genotyping data has therefore been a high priority that is now realized with the data release described herein.

Our approach for characterizing nPOD donor genetics was twofold: donors were genotyped with 1) the University of Florida Diabetes Institute (UFDI) custom single nucleotide polymorphism (SNP) microarray (UFDIchip)¹³ and 2) the University of California San Francisco (UCSF) standardized whole exome sequencing (WES) pipeline¹⁴. Specifically, nPOD cases (N = 372), comprising AAb- no diabetes controls (N = 147), AAb+ without T1D (N = 26), T1D (N = 111), T1D medalists^15,16 (N = 2), T1D recipients of pancreas transplant (N = 5), type 2 diabetes (T2D, N = 38), gastric bypass (N = 2), gestational diabetes (N = 4), monogenic diabetes (N = 4), cystic fibrosis (CF, N = 5), other diabetes (N = 12), other no diabetes (N = 12), and pregnant without diabetes (N = 4), were genotyped using the UFDIchip¹³ custom Axiom™ array (Fig. 1b). All nPOD donors with available DNA or tissue were evaluated for population-level variants via UFDIchip. For WES-based characterization of rare diabetes-associated variants that previous GWAS may not have been powered to detect, we prioritized the selection of T1D, AAb+ without T1D, gestational diabetes, monogenic diabetes, and other diabetes donors, and included a few no diabetes donors as controls. Specifically, nPOD donors (N = 207), including AAb- no diabetes controls (N = 13), AAb+ without T1D (N = 34), T1D (N = 135), T1D recipients of pancreas transplant (N = 6), gestational diabetes (N = 4), monogenic diabetes (N = 4), and other diabetes (N = 11), were queried for rare known and novel coding region variants in autoimmune and MODY-associated genes via WES¹⁴ (Fig. 1c). Data emanating from these assays were used to provide individual genotypes, infer relatedness¹⁷ and genetic ancestry¹⁸, impute HLA¹⁹, and calculate a combined T1D genetic risk score (GRS)^20,21,22 per donor.

These genotyping data have been generated and made accessible to enable genotype-selected sample requests and the study of novel genotype:phenotype associations by the international community of nPOD investigators. We anticipate that the diversity of nPOD donor genetics may be partly responsible for inter-donor heterogeneity observed in islet health, insulitis composition, age at T1D onset, islet AAb status, and other endotype-related characteristics^23,24,25. Importantly, beyond explaining diabetes heterogeneity, the findings facilitated by these data are expected to inform precision medicine strategies for prevention or suspension of the pathogenesis of T1D as well as other forms of diabetes.

Methods

Donor tissues

Transplant-quality organs, including pancreas and up to 13 other tissues, were recovered from cadaveric organ donors by United States (U.S.) organ procurement organizations (OPOs, http://www.jdrfnpod.org//for-partners/npod-partners/, accessed October 15, 2021) in accordance with federal guidelines, then processed by the nPOD Organ Processing and Pathology Core (OPPC) according to University of Florida (UF) Institutional Review Board (IRB) approved protocol IRB201600029, as previously described^8,11. Studies conducted using organ donor tissue samples from the nPOD biobank are classified as minimum risk research, as study participants are no longer living. However, informed consent for research participation is obtained from family members via both written and verbal communication prior to organ donation, with the consent processes undertaken by qualified personnel affiliated with the U.S. OPO network. All subject information is de-identified in accordance with HIPAA regulations. For each donor, clinical and demographic information was obtained via medical chart review and OPO-conducted interview with the donor’s family. High-resolution four-digit HLA typing was performed by Next Generation Sequencing (NGS) as previously described^8,12 at the Barbara Davis Center for Childhood Diabetes HLA Core (University of Colorado Anschutz Medical Campus). nPOD donors were categorized by diabetes type, verified by UF endocrinologist review of the de-identified terminal medical records (including diagnosis and duration of diabetes, history or clinical data for diabetic ketoacidosis, medications, and BMI), donor metadata (e.g., age, sex, reported race and ethnicity), and additional data (serum C-peptide levels, islet AAb status¹⁰, hemoglobin A1c [HbA1c], and high-resolution HLA^8,12). Unique research resource identifiers (RRIDs) were assigned to each organ donor to facilitate the provenance and reproducibility of results²⁶.

DNA isolation

DNA was extracted from frozen spleen or, for the limited number of cases in which spleen was unavailable, from frozen pancreas, pancreatic lymph node, or small intestine. DNA isolation was performed using the Qiagen DNeasy Blood and Tissue DNA isolation kit according to the manufacturer’s instructions. Purity and concentration of extracted DNA were assessed with the Epoch Microplate Spectrophotometer (BioTek).

UFDIchip design

372 nPOD donors (Table 1, Phenotype_data.txt²⁷) were genotyped at 985,971 unique loci on a custom SNP array termed the UFDIchip¹³ (Fig. 1b). The base array is the Axiom^TM Precision Medicine Research Array (Thermo Fisher Scientific), to which all content from the ImmunoChip v2²⁸ was added, as well as all previously reported credible T1D risk variants³ (Fig. 2, UFDIchip_library_file.xlsx²⁷). The array also includes dense coverage of the highly polymorphic HLA region, which allows for accurate imputation of HLA haplotypes to 4-digit resolution.

Table 1 nPOD donors genotyped by UFDIchip.


Genotype processing and analysis

UFDIchip plates were processed on an Affymetrix GeneTitan instrument with external sample handling on a BioMek FX dual arm robotic workstation. Axiom™ Analysis Suite software (v3.0, Thermo Fisher Scientific) was used to process raw CEL file data to plink text files. The software includes quality control (QC) procedures at the sample, plate, and SNP levels. These QC threshold parameters were set to Axiom™ Analysis Suite default stringency (“Best Practices Workflow” using “Human.legacy.v5” settings). Under these settings, samples were included in analysis if dish QC (DQC) ≥ 0.82 and if QC call rate ≥ 97%. Plates were considered acceptable for analysis if average QC call rate ≥ 98.5% for passing samples. Best probe set was identified per SNP, with the SNP call rate threshold set to 95%. A screen for discordance from reported sex via X chromosome heterozygosity was then performed using plink v1.9²⁹. All data passed these QC screens and raw CEL files and a binary plink file containing processed data (GRCh37/hg19) from all cases are stored in the database of Genotypes and Phenotypes (dbGaP)²⁷. Subsequent analyses included relatedness estimation using KING³⁰, genetic ancestry imputation using ADMIXTURE¹⁸, HLA imputation using Axiom^TM HLA Analysis Software¹⁹, imputation to 300 M SNPs and indels using the Trans-Omics for Precision Medicine (TOPMed) reference cohort with the Michigan Imputation Server³¹, and calculation of a T1D GRS^21,22.
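The sample-, plate-, and SNP-level thresholds quoted above can be expressed as a small filter. The following is only an illustrative sketch of those cutoffs, not the Axiom™ Analysis Suite implementation; the `Sample` record and its field names are hypothetical, and call rates are assumed to be stored as fractions between 0 and 1.

```python
from dataclasses import dataclass

# Thresholds quoted in the text (call rates expressed as fractions).
DQC_MIN = 0.82
SAMPLE_CALL_RATE_MIN = 0.97
PLATE_MEAN_CALL_RATE_MIN = 0.985
SNP_CALL_RATE_MIN = 0.95


@dataclass
class Sample:
    sample_id: str    # hypothetical identifier
    plate: str        # hypothetical plate label
    dqc: float        # dish QC value
    call_rate: float  # QC call rate, 0-1


def passing_samples(samples):
    """Keep samples that meet both the DQC and the QC call rate thresholds."""
    return [s for s in samples
            if s.dqc >= DQC_MIN and s.call_rate >= SAMPLE_CALL_RATE_MIN]


def acceptable_plates(samples):
    """Return plates whose passing samples average at least 98.5% call rate."""
    rates_by_plate = {}
    for s in passing_samples(samples):
        rates_by_plate.setdefault(s.plate, []).append(s.call_rate)
    return {plate for plate, rates in rates_by_plate.items()
            if sum(rates) / len(rates) >= PLATE_MEAN_CALL_RATE_MIN}


def snp_passes(snp_call_rate):
    """Apply the SNP-level call rate threshold."""
    return snp_call_rate >= SNP_CALL_RATE_MIN
```

Note that the plate-level average is taken over passing samples only, mirroring the wording of the plate criterion.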

Validation of technical replicates

DNA from 24 nPOD donors was run in duplicate on the UFDIchip. SNP call rates were compared between technical replicates using Bland-Altman analysis. Reproducibility of genotype calls between technical replicates was evaluated by kinship coefficient using KING³⁰ (v2.1.2) software.
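For readers unfamiliar with Bland-Altman analysis, the core quantities are the mean of the paired differences (the bias) and its standard deviation. The sketch below assumes the conventional 95% limits of agreement (bias ± 1.96 SD), which are a standard choice and not a detail stated here; the function name and inputs are illustrative.

```python
import statistics


def bland_altman(first_run, second_run):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    bias is the mean of the paired differences; the limits of agreement
    are bias +/- 1.96 sample standard deviations (an assumed convention).
    """
    if len(first_run) != len(second_run) or len(first_run) < 2:
        raise ValueError("need two equally sized series with at least two pairs")
    diffs = [a - b for a, b in zip(first_run, second_run)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd
```

Applied to call rates from replicate pairs, a bias near zero with narrow limits indicates that the two runs of the same DNA agree closely.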

Relatedness

Genotyping data from the nPOD cohort, for which family relationships are unknown, and from the 1000 Genomes phase 3 cohort³², for which family relationships are known, were merged and analyzed for genetic relatedness using KING³⁰ (v2.1.2) software. The integrated relationship inference command was used to infer up to third-degree relatives. Relationships between nPOD case pairs and between 1000 Genomes pairs were represented by plotting estimated kinship coefficients. Kinship coefficients of unrelated 1000 Genomes pairs were randomly downsampled to the number of nPOD subject pairs to allow for data visualization.
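As a rough guide to reading kinship coefficients, the sketch below maps a coefficient to a relationship degree using the cutoffs given in the KING documentation (>0.354 for duplicates or monozygotic twins, then 0.177, 0.0884, and 0.0442 as lower bounds for first-, second-, and third-degree relatives). This is a simplification: KING's integrated inference also uses identity-by-descent sharing, which is not modeled here, and the function name is illustrative.

```python
def degree_from_kinship(kinship):
    """Classify a pair by estimated kinship coefficient alone (simplified)."""
    if kinship > 0.354:
        return "duplicate/MZ twin"
    if kinship >= 0.177:
        return "first-degree"
    if kinship >= 0.0884:
        return "second-degree"
    if kinship >= 0.0442:
        return "third-degree"
    return "unrelated"
```

Under these cutoffs, a technical replicate pair is expected to fall in the duplicate range, with a coefficient close to 0.5.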

Genetic ancestry

Data from unrelated subjects from the 1000 Genomes phase 3 cohort³² were filtered for SNPs that overlap with the UFDIchip array using plink v1.9²⁹. The data were pruned for linkage disequilibrium (LD) by removing SNPs with R² > 0.1, screening within a 50 SNP block and proceeding by steps of 10 SNPs. This yielded 1000 Genomes genotypes for 320,005 SNPs, which were used to run an unsupervised analysis using ADMIXTURE software¹⁸ (v1.3.0) with k set to five populations. Each of the five groups represented a unique continental population from 1000 Genomes and was assigned accordingly: 1) African (AFR), 2) Admixed American (AMR), 3) East Asian (EAS), 4) European (EUR), and 5) South Asian (SAS)³². The 372 nPOD cases were then projected onto the reference population to estimate ancestry proportions. Dimensionality reduction of the resulting Q-values (ancestry proportions) was performed using principal component analysis (PCA) to enable visualization.
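The LD pruning parameters described above (50-SNP window, 10-SNP step, R² threshold of 0.1) correspond to the three arguments of plink 1.9's `--indep-pairwise` option. The sketch below only assembles such a command line and does not run it; the exact invocation used for this dataset is not given, and the file prefixes are hypothetical.

```python
def ld_prune_command(bfile, out, window_snps=50, step_snps=10, r2_max=0.1):
    """Build (but do not run) a plink 1.9 LD-pruning command line.

    bfile and out are hypothetical file prefixes for the input binary
    fileset and the output lists, respectively.
    """
    return [
        "plink",
        "--bfile", bfile,
        "--indep-pairwise", str(window_snps), str(step_snps), str(r2_max),
        "--out", out,
    ]


# Example with hypothetical prefixes; pass the list to subprocess.run to execute.
command = ld_prune_command("kg_ufdichip_overlap", "kg_pruned")
```

plink writes the retained and removed variant IDs to `.prune.in` and `.prune.out` files; a second run with `--extract` on the `.prune.in` list produces the pruned dataset.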

HLA Imputation

Axiom^TM HLA Analysis Software (v1.2.0.38)¹⁹ was used to impute 2-digit and 4-digit HLA genotypes, along with probability scores for the imputed calls. Concordance with nPOD HLA typing results⁸ was assessed at HLA-A, HLA-DRB1, HLA-DQA1, and HLA-DQB1. The typed result was considered ground truth when the imputed result was discordant.

Imputation accuracy for each of these loci [Acc(L)] was calculated as previously reported³³, substituting the dosage for the probability score that is provided by Axiom^TM HLA Analysis Software¹⁹:

$$Acc\left(L\right)=\frac{{\sum }_{i=1}^{n}{P}_{i}\left(A{1}_{i,L}\right)+{P}_{i}\left(A{2}_{i,L}\right)}{2n}$$

where P_i is the probability for imputed alleles A1_i,L and A2_i,L for donor i at locus L. Imputed alleles were considered concordant when they were included in the donor’s set of typed alleles at locus L, and discordant when they were not in the set of typed alleles at locus L. For discordant alleles, P_i was set to 0. The summation of probabilities for the total number of donors assessed, n, was then divided by the total number of alleles tested, 2n. The accuracy score ranges from 0, for no concordant calls, to 1, for complete concordance with probabilities of 1 for all alleles.
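A minimal sketch of the Acc(L) calculation follows, assuming each donor is represented by two imputed (allele, probability) pairs and a set of typed alleles at the locus. The data layout and function name are illustrative only.

```python
def locus_accuracy(donors):
    """Compute Acc(L) for one locus.

    donors: list of (imputed, typed) tuples, where imputed is a list of two
    (allele, probability) pairs and typed is the set of typed alleles.
    Discordant imputed alleles contribute a probability of 0.
    """
    if not donors:
        raise ValueError("at least one donor is required")
    total = 0.0
    for imputed, typed in donors:
        for allele, probability in imputed:
            if allele in typed:
                total += probability
    return total / (2 * len(donors))
```

For example, two donors with imputed probabilities of 1.0 and 0.5 (both concordant) and 1.0 and 1.0 (one discordant) give (1.0 + 0.5 + 1.0 + 0) / 4 = 0.625.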

Concordance was calculated at the 2-digit and 4-digit level for genotypes related to T1D risk or protection, as determined in primarily White cohorts^34,35,36. These included HLA-A*02:01, HLA-A*24:02, HLA-DRB1*03:01 (DR3), HLA-DQA1*05:01–HLA-DQB1*02:01 (DQ2), HLA-DRB1*04:xx (DR4), HLA-DQA1*03:01–HLA-DQB1*03:02 (DQ8), HLA-DRB1*08:01 (DR8), HLA-DQA1*04:01–HLA-DQB1*04:02 (DQ4), HLA-DRB1*15:xx (DR15), HLA-DQA1*01:02–HLA-DQB1*06:02 (DQ6), and HLA-DQA1*03:01–HLA-DQB1*03:01 (DQ7), where xx is any sub-allele. The following formula was used:

$$Concordance=\frac{{\sum }_{i=1}^{n}\,A{1}_{i,L}+A{2}_{i,L}}{{\sum }_{i=1}^{n}\,A{1}_{t,L}+A{2}_{t,L}}$$

where the number of imputed alleles A1_i,L and A2_i,L matching the genotype of interest for donor i at locus L was summed across all donors, n, and divided by the number of typed alleles A1_t,L and A2_t,L matching the genotype of interest for donor i at locus L summed across all donors. The concordance score ranges from 0, for no concordant calls, to 1, for complete concordance.
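The concordance ratio can be sketched as follows. One assumption is made explicit here: per donor, imputed matches are capped at the number of typed matches, so that over-called alleles do not push the score above 1. The cap is an interpretation chosen to match the stated 0 to 1 range, not a detail given in the text. The `matches` predicate is a hypothetical helper that lets "xx" sub-allele groups such as DRB1*04:xx be tested by prefix.

```python
def genotype_concordance(pairs, matches):
    """Ratio of concordant imputed alleles to typed alleles for one genotype.

    pairs: list of (imputed_alleles, typed_alleles), two alleles each.
    matches: function returning True for alleles of the genotype of interest.
    """
    concordant = 0
    typed_total = 0
    for imputed, typed in pairs:
        n_imputed = sum(1 for allele in imputed if matches(allele))
        n_typed = sum(1 for allele in typed if matches(allele))
        concordant += min(n_imputed, n_typed)  # assumed cap, see lead-in
        typed_total += n_typed
    if typed_total == 0:
        raise ValueError("no typed alleles match the genotype of interest")
    return concordant / typed_total
```

With a prefix predicate such as `lambda allele: allele.startswith("04:")`, the same function covers both specific 4-digit alleles and 2-digit groups.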

Donor-level imputation accuracy [Acc(S)] was calculated as:

$$Acc\left(S\right)=\frac{{\sum }_{j=1}^{n}{P}_{j}\left(A{1}_{j,S}\right)+{P}_{j}\left(A{2}_{j,S}\right)}{2n}$$

where P_j is the probability for imputed alleles A1_j,S and A2_j,S at each HLA locus j of donor S. Concordance was determined as described above, and P_j was set to 0 for discordant alleles. The total number of loci tested, n, was 4 per donor (HLA-A, HLA-DRB1, HLA-DQA1, and HLA-DQB1). The accuracy score ranges from 0, for no concordant calls, to 1, for complete concordance with probabilities of 1 for both alleles at each locus for donor S.
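The donor-level score applies the same idea across loci rather than across donors. A brief sketch, assuming per-locus dictionaries keyed by the four typed loci (the layout and names are illustrative):

```python
LOCI = ("HLA-A", "HLA-DRB1", "HLA-DQA1", "HLA-DQB1")


def donor_accuracy(imputed_by_locus, typed_by_locus):
    """Compute Acc(S) for one donor over the four typed loci.

    imputed_by_locus: locus -> list of two (allele, probability) pairs.
    typed_by_locus: locus -> set of typed alleles.
    """
    total = 0.0
    for locus in LOCI:
        typed = typed_by_locus[locus]
        for allele, probability in imputed_by_locus[locus]:
            if allele in typed:  # discordant alleles contribute 0
                total += probability
    return total / (2 * len(LOCI))
```

A donor with one discordant allele and one concordant allele imputed at probability 0.5, all others concordant at 1.0, would score (8 - 1 - 0.5) / 8 = 0.8125.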

T1D GRS calculation

We computed polygenic T1D genetic risk scores, referred to as GRS1^21,22, GRS2³⁷, and African-Ancestry (AA)-GRS³⁸. GRS1 is calculated using dosages of risk genotypes for 30 T1D-associated SNPs²¹. Genotypes were obtained by imputing to the TOPMed (v r2)³¹ panel (R² > 0.97). rs2187668 was not found in TOPMed; thus, a suitable proxy SNP from GRS2³⁷, rs9273369, was used instead. The HLA component of GRS1 was calculated using the Polygenic Risk Score (PRS) Toolkit for HLA (v0.22a) developed by Sharp et al.³⁷. The non-HLA component of GRS1 was then calculated via weighted sum, using odds ratios from Oram et al.²¹. The HLA and non-HLA scores were summed and normalized as described in Oram et al.²¹.

GRS2 is calculated using dosages of risk genotypes for 67 T1D-associated SNPs³⁷. Genotypes were obtained by imputing to the TOPMed (v r2)³¹ panel (R² > 0.97). rs2476601, rs1281934, rs9273342, rs9271346, rs1233320, rs16822632, rs116522341, rs559242105, and rs371250843 were not found in TOPMed; thus, suitable proxy SNPs rs6679677, rs1281943, rs9273032, rs9271347, rs1233320, rs17840116, rs9268500, rs3129197, and rs9266268 were respectively used instead. The HLA component of GRS2 was calculated using the PRS Toolkit for HLA (v0.22a) developed by Sharp et al.³⁷. The non-HLA component of GRS2 was then calculated via weighted sum, using odds ratios from Sharp et al.³⁷, and added to the HLA component.

AA-GRS is calculated using dosages of risk genotypes for 7 T1D-associated SNPs³⁸. Genotypes were obtained by imputing to the TOPMed (v r2)³¹ panel (R² > 0.96). rs2187668 and rs34303755 were not found in TOPMed; thus, suitable proxy SNPs rs9273369 and rs9268838 were respectively used instead. The AA-GRS was then calculated via weighted sum, using odds ratios from Onengut-Gumuscu et al.³⁸.
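The weighted-sum step for the non-HLA component can be illustrated as below. This sketch assumes the common convention of weighting each risk-allele dosage by the natural log of its odds ratio; the exact weighting and normalization follow the cited score publications and are not reproduced here. SNP identifiers, odds ratios, and the proxy mapping in any usage are placeholders, and the sketch further assumes that a proxy's dosage is already oriented to the risk allele.

```python
import math


def weighted_risk_sum(dosages, odds_ratios, proxies=None):
    """Sum ln(OR) x risk-allele dosage over the SNPs of a score.

    dosages: SNP id -> imputed risk-allele dosage (0-2) for one donor.
    odds_ratios: SNP id -> published odds ratio for the risk allele.
    proxies: optional SNP id -> proxy SNP id, used when the score SNP is
             absent from the imputed data.
    """
    proxies = proxies or {}
    score = 0.0
    for snp, odds_ratio in odds_ratios.items():
        source = snp if snp in dosages else proxies.get(snp)
        if source is None or source not in dosages:
            raise KeyError(f"no dosage or proxy available for {snp}")
        score += math.log(odds_ratio) * dosages[source]
    return score
```

Raising an error for a SNP with neither a dosage nor a proxy, rather than silently skipping it, keeps scores comparable across donors.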

WES

For 207 nPOD donors (Table 2, Phenotype_data.txt²⁷), WES libraries were generated as previously described³⁹ (Fig. 1c) using the Agilent SureSelect Human All Exon kit (Agilent Technologies, CA, USA). Procedures and quality control (QC) measures were performed following manufacturer’s recommendations. Briefly, 180–280 bp fragments were generated from genomic DNA by sonication (Covaris), with exonuclease and polymerase subsequently utilized to convert remaining overhangs into blunt ends. The DNA fragments were adenylated on the 3′ ends followed by ligation of adapter oligonucleotides. Successfully ligated DNA fragments were enriched by PCR. Following hybridization with biotin-labelled probes, exons were captured with streptavidin-coated magnetic beads. After a wash, probes were digested. Libraries were enriched and index tags added by PCR. Amplified exon libraries were purified using AMPure XP (Beckman Coulter), quantified by Agilent high sensitivity DNA kit using an Agilent Bioanalyzer 2100, then sequenced via Illumina NovaSeq 6000 (Illumina, CA, USA). Burrows-Wheeler Aligner (BWA, v0.7.17) was utilized to map the paired-end clean reads to the GRCh37/hg19 human reference genome⁴⁰. Genome Analysis Toolkit (GATK, v4.1.2.0) was employed for SNP/InDel detection⁴¹. Annotate Variation (ANNOVAR, v20191024) was used for variant annotation⁴². Other variant annotations were performed using American College of Medical Genetics (ACMG) Classification⁴³, Sorting Intolerant from Tolerant (SIFT) Function Prediction (SIFT4G)⁴⁴, PolyPhen-2 Function Prediction (v 2.2.2)⁴⁵, Combined Annotation Dependent Depletion (CADD, v1.6) Score⁴⁶, Genome Aggregation Database (gnomAD, v2.1.1) frequency⁴⁷, Human Gene Mutation Database (HGMD professional 2020.2)⁴⁸, ClinVar (accessed August 31, 2020)⁴⁹ and Centogene Mutation Database (CentoMD, v5.8)⁵⁰. All data passed these QC screens and are stored in dbGaP²⁷.

Table 2 nPOD donors subjected to WES.


UFDIchip and WES comparison

For 167 nPOD donors, both UFDIchip- (Table 1, Phenotype_data.txt²⁷) and WES-based (Table 2, Phenotype_data.txt²⁷) genotyping were performed. Biallelic autosomal variants detected in both assays and with at least one minor allele count (MAC) in the WES data were filtered using plink v1.9²⁹, resulting in 27,852 variants for comparison. Per-SNP concordance between the two assays was calculated across all subjects.
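A per-SNP concordance between two assays can be sketched as the fraction of donors with identical genotype calls, restricted to donors called in both. The encoding below (alternate-allele counts of 0, 1, or 2, with `None` for a missing call) and the function name are assumptions made for illustration.

```python
def per_snp_concordance(array_calls, wes_calls):
    """Return variant -> fraction of shared donors with identical genotypes.

    Each argument maps variant id -> {donor id -> genotype}, where genotype
    is an alternate-allele count (0, 1, 2) or None when missing. Variants
    absent from either assay, or with no jointly called donors, are skipped.
    """
    concordance = {}
    for variant in array_calls.keys() & wes_calls.keys():
        array_gt = array_calls[variant]
        wes_gt = wes_calls[variant]
        shared = [donor for donor in array_gt.keys() & wes_gt.keys()
                  if array_gt[donor] is not None and wes_gt[donor] is not None]
        if shared:
            agree = sum(array_gt[d] == wes_gt[d] for d in shared)
            concordance[variant] = agree / len(shared)
    return concordance
```

Excluding missing calls from the denominator separates genotype disagreement from call-rate differences between the assays.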

Data Records

UFDIchip array data are stored in dbGaP²⁷ as raw CEL files and compiled processed data from all donors deposited as binary plink files (hg19). All genotyped donors, as well as their age, sex, reported race, diabetes status and duration, are provided in Phenotype_data.txt²⁷ with additional donor information available on the nPOD Data Portal (https://portal.jdrfnpod.org/, accessed October 21, 2022).

WES data are stored in dbGaP²⁷, including raw exome sequencing data files (fastq format) or hg19 aligned exome sequencing data (bam format), in addition to processed variant files (vcf format). A spreadsheet listing variants and associated annotations per donor, as described in the methods, was also submitted (csv format). All donors subjected to WES, as well as their age, sex, reported race, diabetes status and duration, are listed in Phenotype_data.txt²⁷ with additional donor information available on the nPOD Data Portal (https://portal.jdrfnpod.org/, accessed October 21, 2022).

Technical Validation

Quality control assessment of the UFDIchip genotype array

As of this report, 372 nPOD donors have been genotyped on the UFDIchip and the resulting data are accessible on dbGaP (see Data Records). All array results were subjected to basic QC analyses that assessed donor-level DQC; donor-, plate-, and SNP-level call rate; and sex concordance. Donor DQC or call rate failures were re-processed with freshly extracted DNA when necessary. nPOD samples were batch-processed with data from living donors²⁰ to facilitate calling of low-frequency variants⁵¹, resulting in 942,466 high quality genotypes passing the SNP call rate threshold. The nPOD cohort demonstrated SNP call rates of 99.58 [99.19–99.84] (median [interquartile range (IQR)]). All nPOD samples were assessed for concordance between reported and imputed sex according to level of X chromosome heterozygosity using plink v1.9²⁹. For all nPOD cases, imputed sex matched reported sex. Thus, all 372 nPOD samples passed basic QC measures. Additionally, 24 nPOD samples were run in technical replicate to assess assay reproducibility. Call rates between the technical replicates differed minimally, with 0.087 ± 0.640% bias (mean ± standard deviation, Fig. 3a). Importantly, the kinship coefficients between the 24 technical replicates were 0.499 [0.496–0.499] (median [IQR]), suggesting near identical genotype calls (Fig. 3b).

Relatedness estimation

Next, relatedness between donors was assessed. Due to the nature of donor organ procurement, it is highly improbable, although not impossible, that nPOD donors are related. A relatedness analysis of the 372 nPOD donors (69,006 possible pair combinations) using KING software³⁰ inferred all donor pairs to be unrelated (more distant than third-degree relatives). For comparison, we also assessed the relatedness of 2,504 1000 Genomes phase 3 cohort³² subjects. While this set was designed to consist of unrelated individuals, it unintentionally included a few first-, second-, and third-degree relatives³². When relatedness between nPOD donor pairs was compared to relatedness between 1000 Genomes³² subject pairs, nPOD donor pairs showed significantly smaller kinship coefficients than inferred parent-offspring (PO), full sibling (FS), second-degree relative, and third-degree relative pairs from 1000 Genomes (Fig. 3b), suggesting that nPOD donors are not closely related. Note that nPOD donor pairs had significantly larger kinship coefficients than inferred unrelated (UN) pairs from 1000 Genomes³² (Fig. 3b), potentially due to greater similarity in genetic ancestry⁵² among subjects in the nPOD cohort than among those in the 1000 Genomes cohort, which was specifically designed to sample individuals with diverse genetic ancestry. Beyond confirming the expected lack of close relatedness in the nPOD cohort, this validates that users of this resource may employ population-level quantitative trait locus (QTL) analysis methods with these genetic data.

Alignment with genetic ancestry

To further validate the UFDIchip data, we used the 1000 Genomes phase 3 cohort³² to build a reference model for genetic ancestry using ADMIXTURE software¹⁸ (Fig. 4a,b), projected all 372 nPOD donors onto this model to impute ancestry (Fig. 4c), and compared those results with reported race. Using methods modified from Kaddis, et al.⁵³, we plotted PCA results of ancestry proportions and observed that each of the five major continental populations in the 1000 Genomes cohort (AFR, AMR, EAS, EUR, and SAS)³² clustered to occupy distinct PC space (Fig. 4b). This suggested that the five ancestry populations computed by ADMIXTURE were representative of the five continental populations from 1000 Genomes³². The ancestry proportions of 1000 Genomes³² continental populations were almost entirely represented by a single ancestry group, with the exception of admixed populations including Admixed Americans (AMR), as well as the subcontinental populations, Americans of African ancestry in SW USA (ASW) and African Caribbeans in Barbados (ACB, Fig. 4a,b), as previously observed³². Next, the nPOD cohort was projected onto the 1000 Genomes reference, revealing overlap with AFR, AMR, EAS, and EUR groups in PC space (Fig. 4c). Donors were then assessed for agreement between reported race and genetic ancestry, showing that the highest AFR, AMR, EAS, and EUR ancestry proportions were observed in donors reported as Black/African American, Hispanic/Latinx, Asian, and White/Caucasian respectively (UFDIchip_admixture.xlsx²⁷), which is consistent with other U.S.-based admixture studies^54,55. Notably, racial identity is complex and the method of estimating proportions of continental genetic ancestries may not adequately reflect genetic diversity⁵⁶. 
With this limitation in mind, these analyses accomplish the aims of: 1) ADMIXTURE model validation using UFDIchip array data and 2) qualification of the genetic ancestry results as an alternative or additional covariate to reported race for users of this resource (UFDIchip_admixture.xlsx²⁷)⁵⁷.

HLA imputation accuracy and concordance

The nPOD cohort was HLA typed using next generation sequencing (NGS) at HLA-A, HLA-DRB1, HLA-DQA1, and HLA-DQB1⁸ to identify donors with genotypes that are associated with T1D risk or protection^34,35,36. This enables an extra level of QC and validation of the UFDIchip array data by comparing typed to genetically imputed HLA genotypes. Imputation accuracy at each locus, Acc(L), was calculated assuming typed results were correct if discordant with imputed results. Overall, Acc(L) was >0.93 for low-resolution HLA (2-digit) and >0.90 for high-resolution HLA (4-digit) for the four loci tested (UFDIchip_HLA_imputation_accuracy.xlsx²⁷).

Next, we assessed concordance between typed and imputed HLA for T1D risk (A2, A24, DR3, DQ2, DR4, DQ8, DR8, and DQ4) or protective (DR15, DQ6, and DQ7) genotypes^34,35,36 (Table 3). At 2-digit resolution, all tested loci were greater than 93% concordant (median [IQR]: 98.5% [97.4%–99.8%], Table 3). At 4-digit resolution, HLA concordance was predictably lower (median [IQR]: 97.1% [92.5%–99.3%]), with notable discordance in the less common HLA-DRB1*04:xx genotypes (Table 3). Importantly, 4-digit genotypes that convert 2-digit risk to protective genotypes, such as HLA-DRB1*04:03 and HLA-DQB1*03:01, were accurately imputed with greater than 97.9% concordance (Table 3).

Table 3 Imputed HLA concordance with typed HLA for T1D-associated genotypes.


Data validation at the sample level was assessed using a sample imputation accuracy score, Acc(S), for 2-digit HLA at the four typed loci. Acc(S) was 0.984 [0.946–0.998] (median [IQR]), indicating high performance of the imputation methodology per sample (Fig. 5a). HLA imputation may be inaccurate when rare HLA genotypes and ancestrally diverse populations are underrepresented in the reference cohort^33,58,59. In agreement with this notion, a breakdown of nPOD donors by reported race or by top genetic ancestry proportion suggests that imputation accuracy could potentially be improved with greater reference cohort diversity (Fig. 5b,c). Donors with reported race of White had significantly higher HLA imputation accuracy than those reported as Black or Hispanic/Latinx (Fig. 5b). Similarly, donors whose highest genetic ancestry proportion was EUR had higher imputation accuracy than donors whose highest proportion was AFR or AMR (Fig. 5c). Notably, 4-digit HLA imputation showed 100% concordance for the 24 nPOD subjects run in technical replicate on the UFDIchip.

T1D polygenic GRS performance using UFDIchip data

Polygenic risk scores summarize genetic risk for T1D as a continuous value by aggregating estimated risk at HLA and non-HLA loci^21,22,37,38. The original reports of GRS1 described its utility for discerning T1D from other forms of diabetes including T2D^21,60 and MODY²². We previously observed that a similar GRS robustly discriminated living controls from T1D subjects reported as White but was less effective for non-White subjects, highlighting a need for diversity in risk modeling^20,53. Shortly thereafter, GRS2 was developed to incorporate the impact of interactions between HLA haplotypes on T1D risk, showing improved discrimination of European ancestry T1D from control subjects³⁷. Additionally, an AA-GRS was created to account for ancestry-specific T1D risk loci, with enhanced performance at distinguishing T1D from control subjects in AFR populations³⁸. We therefore attempted to validate these previous findings regarding the ability of GRS1, GRS2, and AA-GRS to differentiate controls from T1D subjects by using the 372 nPOD cases subjected to genotyping. Indeed, White T1D donors (0.287 [0.264–0.303], median [IQR]) had significantly higher GRS1 values than White No Diabetes donors (0.231 [0.195–0.256], Fig. 6a). While Hispanic/Latinx T1D donors (0.283 [0.274–0.303]) also showed significantly higher GRS1 values than Hispanic/Latinx No Diabetes donors (0.233 [0.223–0.257]), GRS1 could not distinguish Black T1D donors from Black No Diabetes donors because scores in Black T1D donors were low (0.250 [0.234–0.261], Fig. 6a). In contrast, GRS2 values were significantly higher in Black T1D donors (11.62 [10.05–12.78]) than Black No Diabetes donors (8.83 [6.75–10.36]), although Black T1D donor GRS2 values remained significantly lower than those of White T1D donors (14.38 [12.94–15.16], Fig. 6b). While the AA-GRS likewise succeeded at differentiating Black T1D donors (5.634 [4.061–8.001]) from Black No Diabetes donors (1.751 [0.804–3.964]), no significant differences remained between Black T1D and White T1D donors (Fig. 6c). Taken together, these results indicate that the nPOD cohort UFDIchip array data represent a validated resource for genetic studies of T1D. Additionally, we provide GRS1, GRS2, and AA-GRS genotypes and calculated scores to the community for future studies (GRS1_GRS2_AAGRS_TOPMed_Imputed.xlsx²⁷). Note that these scores differ from those provided in Kaddis, et al.⁵³ because the reference cohort for imputation was updated from the Haplotype Reference Consortium (HRC)⁶¹ cohort, of predominantly European ancestry, to the TOPMed³¹ reference, of diverse genetic ancestry.
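
As a rough illustration of how a polygenic score aggregates risk across loci, the sketch below computes a purely additive score as a weighted sum of risk-allele dosages. This is a simplification under stated assumptions and not the published GRS1, GRS2, or AA-GRS procedure: the variant names and weights are hypothetical placeholders, no scaling is applied, and the HLA haplotype interaction terms that GRS2 incorporates are not modeled.

```python
from typing import Mapping


def additive_score(dosages: Mapping[str, float], weights: Mapping[str, float]) -> float:
    """Weighted sum of risk-allele dosages (0, 1, or 2 copies, or an imputed dosage).

    Raises KeyError if a donor lacks a dosage for any weighted variant, so a
    score is never silently computed from an incomplete genotype set.
    """
    missing = weights.keys() - dosages.keys()
    if missing:
        raise KeyError(f"missing dosages for: {sorted(missing)}")
    return sum(weights[v] * dosages[v] for v in weights)


# Hypothetical per-variant weights (for example, log odds ratios) and one donor.
example_weights = {"variant_a": 0.5, "variant_b": 0.25}
example_donor = {"variant_a": 2, "variant_b": 1}

example_score = additive_score(example_donor, example_weights)  # 0.5*2 + 0.25*1
```

Because the weights are typically estimated in a particular reference population, a score built this way can separate cases from controls well in one ancestry group and poorly in another, which is the pattern the comparison of GRS1, GRS2, and AA-GRS above is probing.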

WES

207 nPOD donors were also queried for rare variants using WES and associated data are accessible on dbGaP (see Data Records). Standard QC measures were performed to minimize adapter contamination, low-quality reads, error rate, and sequencing bias. To further validate data quality, we measured concordance between genotype calls from the UFDIchip and WES (N = 167 donors). Indeed, 27,852 autosomal biallelic variants with at least one minor allele count (MAC) in the WES data showed a concordance of 98.8% [92.2%–99.7%] (median [IQR]) with UFDIchip calls.
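
The array-versus-exome comparison can be pictured with the following minimal Python sketch, which computes a per-donor concordance over variants called on both platforms and then summarizes donors as a median with quartiles. It is an assumption-laden illustration rather than the pipeline used here: the data layout (variant ID mapped to an alternate-allele count), the function names, and the example calls are hypothetical, and the restriction to autosomal biallelic variants with at least one minor allele in the WES data is assumed to have been applied upstream.

```python
from statistics import quantiles
from typing import Dict, List, Optional, Tuple

# Variant ID -> alternate-allele count (0, 1, 2), or None for a missing call.
Calls = Dict[str, Optional[int]]


def donor_concordance(chip: Calls, wes: Calls) -> Optional[float]:
    """Fraction of variants called on both platforms with identical genotypes."""
    compared = [
        v for v in chip.keys() & wes.keys()
        if chip[v] is not None and wes[v] is not None
    ]
    if not compared:
        return None  # nothing to compare for this donor
    agree = sum(chip[v] == wes[v] for v in compared)
    return agree / len(compared)


def median_iqr(values: List[float]) -> Tuple[float, float, float]:
    """Return (median, first quartile, third quartile); needs at least two values."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return q2, q1, q3


# Hypothetical calls for one donor; v4 is missing on the array and is skipped.
chip_calls = {"v1": 0, "v2": 1, "v3": 2, "v4": None}
wes_calls = {"v1": 0, "v2": 1, "v3": 1, "v4": 2}
one_donor = donor_concordance(chip_calls, wes_calls)

# Hypothetical per-donor values, summarized across donors.
summary = median_iqr([0.5, 1.0, 1.0])
```

Excluding missing calls from the denominator is a design choice in this sketch; counting them as discordant would give a stricter figure.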

Six of the nPOD donors were previously reported to have genetic variants with possible clinical impact in KCNJ11, LMNA, HNF1A, GLIS3, INSR, and GATA6 using a custom-designed NGS panel that included 140 genes implicated in monogenic diabetes⁶². These genetic variants were validated with WES (Table 4) and visual exploration of the data using the Integrative Genomics Viewer (IGV)⁶³ confirmed reads for each variant (Fig. 7). WES captures genomic DNA sequence in exons and the intronic sequence adjacent to exons. This enables the discovery of variants that directly alter the protein coding portion of mRNA (missense, nonsense, insertion/deletions) and also some regulatory intronic sequences, such as splice sites. Variants in genes associated with autoimmune diabetes (AIRE, FOXP3, IL2RA, ITCH, LRBA, SKAP2, STAT1, and STAT3) or MODY and neonatal diabetes (GCK, HNF1A, HNF4A, HNF1B, ABCC8, KCNJ11, and INS)^39,64,65,66 were observed in 141 nPOD cases with T1D (Fig. 8a). Monogenic forms of diabetes are rare, and the vast majority of the detected variants are not expected to have functional or clinical consequences.

Table 4 Gene variants previously published.

There are several databases and tools available to help with the identification and interpretation of genetic variants. For example, the frequency of a variant in the general population can be estimated using gnomAD, which contains data from more than 140,000 exomes and genomes from unique, unrelated individuals spanning six global ancestries⁴⁷. Additionally, the CADD score can be used to predict the severity of a variant's impact based on a variety of criteria such as sequence context, evolutionary constraint, and functional predictions⁴⁶. As expected, the variants observed in T1D cases were distributed across a spectrum of functional classes, with only a few predicted to be both rare (frequency < 0.01%) and deleterious (CADD score ≥ 20, Fig. 8b–d). As an example of how these tools can be used, variants in the monogenic diabetes genes HNF1A and STAT1 were analyzed in the nPOD donors classified as T1D. One variant for each gene was predicted to be rare and deleterious based on the thresholds set for the gnomAD frequency and CADD score (Fig. 9, Table 5). The thresholds set for these and other bioinformatic tools are determined by each investigator, and are often informed by the clinical phenotype of the patient and previous knowledge about the gene’s disease association. Other variant annotations from tools including ACMG Classification⁴³, SIFT Function Prediction⁴⁴, PolyPhen-2 Function Prediction⁴⁵, HGMD⁴⁸, ClinVar⁴⁹, and CentoMD⁵⁰ are available for all 207 nPOD donors on dbGaP (see Data Records). A suggested workflow for evaluating genetic variants for potential clinical significance is shown in Fig. 10. Importantly, while computational tools facilitate interpretation, confidence in the functional or clinical relevance of the genetic variants reported herein requires rigorous experimentation.
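
The rare-and-deleterious thresholds described above (gnomAD frequency < 0.01% and CADD score ≥ 20) can be applied to an annotated variant list with a filter along the following lines. This is a minimal sketch, not the workflow of Fig. 10: the `Variant` record, its field names, and the example entries are hypothetical, frequencies are assumed to be stored as fractions (so 0.01% is `1e-4`), and treating a variant that is absent from gnomAD as rare is an assumption an investigator may prefer to change.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Variant:
    gene: str
    label: str                  # placeholder identifier, not a real variant name
    gnomad_af: Optional[float]  # allele frequency as a fraction; None if absent
    cadd_phred: float           # PHRED-scaled CADD score


def is_rare_and_deleterious(
    v: Variant, max_af: float = 1e-4, min_cadd: float = 20.0
) -> bool:
    """Apply both thresholds; absence from gnomAD is treated as rare (assumption)."""
    rare = v.gnomad_af is None or v.gnomad_af < max_af
    return rare and v.cadd_phred >= min_cadd


def filter_variants(variants: Iterable[Variant]) -> List[Variant]:
    """Keep only the variants that pass both thresholds, preserving input order."""
    return [v for v in variants if is_rare_and_deleterious(v)]


# Hypothetical annotations (not taken from the nPOD dataset).
example_variants = [
    Variant("GENE_A", "var1", 2e-5, 25.1),  # rare and high CADD: kept
    Variant("GENE_A", "var2", 0.03, 27.0),  # too common: dropped
    Variant("GENE_B", "var3", None, 12.0),  # low CADD: dropped
    Variant("GENE_B", "var4", None, 22.0),  # absent from gnomAD, high CADD: kept
]
kept_labels = [v.label for v in filter_variants(example_variants)]
```

Exposing the thresholds as parameters reflects the point made above that cutoffs are set by each investigator rather than fixed.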

Table 5 Potential monogenic diabetes variants.

Usage Notes

The associated data are openly available with unrestricted access.

Code availability

No custom code or scripts were used for the curation and validation of the dataset.

References

Grant, S. F. A., Wells, A. D. & Rich, S. S. Next steps in the identification of gene targets for type 1 diabetes. Diabetologia 63, 2260–2269, https://doi.org/10.1007/s00125-020-05248-8 (2020).
Robertson, C. C. et al. Fine-mapping, trans-ancestral and genomic analyses identify causal variants, cells, genes and drug targets for type 1 diabetes. Nat Genet 53, 962–971, https://doi.org/10.1038/s41588-021-00880-5 (2021).
Onengut-Gumuscu, S. et al. Fine mapping of type 1 diabetes susceptibility loci and evidence for colocalization of causal variants with lymphoid gene enhancers. Nat Genet 47, 381–386, https://doi.org/10.1038/ng.3245 (2015).
Chen, Y. & Chen, G. New genetic characteristics of latent autoimmune diabetes in adults (LADA). Ann Transl Med 7, 81, https://doi.org/10.21037/atm.2019.01.01 (2019).
Pang, H. et al. Emerging roles of rare and low-frequency genetic variants in type 1 diabetes mellitus. J Med Genet 58, 289–296, https://doi.org/10.1136/jmedgenet-2020-107350 (2021).
Strakova, V. et al. Screening of monogenic autoimmune diabetes among children with type 1 diabetes and multiple autoimmune diseases: is it worth doing? J Pediatr Endocrinol Metab 32, 1147–1153, https://doi.org/10.1515/jpem-2019-0261 (2019).
Porter, J. R. & Barrett, T. G. Acquired non-type 1 diabetes in childhood: subtypes, diagnosis, and management. Arch Dis Child 89, 1138–1144, https://doi.org/10.1136/adc.2003.036608 (2004).
Campbell-Thompson, M. et al. Network for Pancreatic Organ Donors with Diabetes (nPOD): developing a tissue biobank for type 1 diabetes. Diabetes Metab Res Rev 28, 608–617, https://doi.org/10.1002/dmrr.2316 (2012).
Insel, R. A. et al. Staging presymptomatic type 1 diabetes: a scientific statement of JDRF, the Endocrine Society, and the American Diabetes Association. Diabetes Care 38, 1964–1974, https://doi.org/10.2337/dc15-1419 (2015).
Wasserfall, C. et al. Validation of a rapid type 1 diabetes autoantibody screening assay for community-based screening of organ donors to identify subjects at increased risk for the disease. Clin Exp Immunol 185, 33–41, https://doi.org/10.1111/cei.12797 (2016).
Pugliese, A. et al. The Juvenile Diabetes Research Foundation Network for Pancreatic Organ Donors with Diabetes (nPOD) Program: goals, operational model and emerging findings. Pediatric Diabetes 15, 1–9, https://doi.org/10.1111/pedi.12097 (2014).
Noble, J. A. et al. The role of HLA class II genes in insulin-dependent diabetes mellitus: molecular analysis of 180 Caucasian, multiplex families. Am J Hum Genet 59, 1134–1148 (1996).
Williams, M. D. et al. Genetic Composition and Autoantibody Titers Model the Probability of Detecting C-Peptide Following Type 1 Diabetes Diagnosis. Diabetes 70, 932–943, https://doi.org/10.2337/db20-0937 (2021).
Moore, P. C. et al. Elastase 3B mutation links to familial pancreatitis with diabetes and pancreatic adenocarcinoma. J Clin Invest 129, 4676–4681, https://doi.org/10.1172/JCI129961 (2019).
Yu, M. G. et al. Residual β cell function and monogenic variants in long-duration type 1 diabetes patients. J Clin Invest 129, 3252–3263, https://doi.org/10.1172/JCI127397 (2019).
Keenan, H. A. et al. Residual insulin production and pancreatic ß-cell turnover after 50 years of diabetes: Joslin Medalist Study. Diabetes 59, 2846–2853, https://doi.org/10.2337/db10-0676 (2010).
Manichaikul, A. et al. Robust Relationship Inference in Genome-Wide Association Studies. Bioinformatics (Oxford, England) 26, 2867–2873, https://doi.org/10.1093/bioinformatics/btq559 (2010).
Alexander, D. H., Novembre, J. & Lange, K. Fast model-based estimation of ancestry in unrelated individuals. Genome Res 19, 1655–1664, https://doi.org/10.1101/gr.094052.109 (2009).
Dilthey, A. et al. Multi-population classical HLA type imputation. PLoS Comput Biol 9, e1002877, https://doi.org/10.1371/journal.pcbi.1002877 (2013).
Perry, D. J. et al. Application of a Genetic Risk Score to Racially Diverse Type 1 Diabetes Populations Demonstrates the Need for Diversity in Risk-Modeling. Sci Rep 8, 4529, https://doi.org/10.1038/s41598-018-22574-5 (2018).
Oram, R. A. et al. A Type 1 Diabetes Genetic Risk Score Can Aid Discrimination Between Type 1 and Type 2 Diabetes in Young Adults. Diabetes Care 39, 337–344, https://doi.org/10.2337/dc15-1111 (2016).
Patel, K. A. et al. Type 1 Diabetes Genetic Risk Score: A Novel Tool to Discriminate Monogenic and Type 1 Diabetes. Diabetes 65, 2094–2099, https://doi.org/10.2337/db15-1690 (2016).
Campbell-Thompson, M. et al. Insulitis and beta-Cell Mass in the Natural History of Type 1 Diabetes. Diabetes 65, 719–731, https://doi.org/10.2337/db15-0779 (2016).
Battaglia, M. et al. Introducing the Endotype Concept to Address the Challenge of Disease Heterogeneity in Type 1 Diabetes. Diabetes Care 43, https://doi.org/10.2337/dc19-0880 (2020).
Arif, S. et al. Blood and islet phenotypes indicate immunological heterogeneity in type 1 diabetes. Diabetes 63, 3835–3845, https://doi.org/10.2337/db14-0365 (2014).
Bandrowski, A. et al. The Resource Identification Initiative: A Cultural Shift in Publishing. Neuroinformatics 14, 169–182, https://doi.org/10.1007/s12021-015-9284-3 (2016).
Perry, D. J. et al. dbGaP https://identifiers.org/dbgap:phs002861.v1.p1 (2022).
Cortes, A. & Brown, M. A. Promise and pitfalls of the Immunochip. Arthritis Res Ther 13, 101, https://doi.org/10.1186/ar3204 (2011).
Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. American Journal of Human Genetics 81, 559–575, https://doi.org/10.1086/519795 (2007).
Manichaikul, A. et al. Robust Relationship Inference in Genome-Wide Association Studies. Bioinformatics (Oxford, England) 26, 2867–2873, https://doi.org/10.1093/bioinformatics/btq559 (2010).
Taliun, D. et al. Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program. Nature 590, 290–299, https://doi.org/10.1038/s41586-021-03205-y (2021).
1000 Genomes Project Consortium. et al. A global reference for human genetic variation. Nature 526, 68–74, https://doi.org/10.1038/nature15393 (2015).
Jia, X. et al. Imputing amino acid polymorphisms in human leukocyte antigens. PloS One 8, e64683, https://doi.org/10.1371/journal.pone.0064683 (2013).
Noble, J. A. et al. HLA class I and genetic susceptibility to type 1 diabetes: results from the Type 1 Diabetes Genetics Consortium. Diabetes 59, 2972–2979, https://doi.org/10.2337/db10-0699 (2010).
Erlich, H. et al. HLA DR-DQ haplotypes and genotypes and type 1 diabetes risk: analysis of the type 1 diabetes genetics consortium families. Diabetes 57, 1084–1092, https://doi.org/10.2337/db07-1331 (2008).
Noble, J. A., Johnson, J., Lane, J. A. & Valdes, A. M. HLA class II genotyping of African American type 1 diabetic patients reveals associations unique to African haplotypes. Diabetes 62, 3292–3299, https://doi.org/10.2337/db13-0094 (2013).
Sharp, S. A. et al. Development and Standardization of an Improved Type 1 Diabetes Genetic Risk Score for Use in Newborn Screening and Incident Diagnosis. Diabetes Care 42, 200–207, https://doi.org/10.2337/dc18-1785 (2019).
Onengut-Gumuscu, S. et al. Type 1 Diabetes Risk in African-Ancestry Participants and Utility of an Ancestry-Specific Genetic Risk Score. Diabetes Care 42, 406–415, https://doi.org/10.2337/dc18-1727 (2019).
Rutsch, N. et al. Diabetes With Multiple Autoimmune and Inflammatory Conditions Linked to an Activating SKAP2 Mutation. Diabetes Care 44, 1816–1825, https://doi.org/10.2337/dc20-2317 (2021).
Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754–1760, https://doi.org/10.1093/bioinformatics/btp324 (2009).
Van der Auwera, G. A. et al. From FastQ data to high confidence variant calls: the Genome Analysis Toolkit best practices pipeline. Curr Protoc Bioinformatics 43, 11.10.1–11.10.33, https://doi.org/10.1002/0471250953.bi1110s43 (2013).
Wang, K., Li, M. & Hakonarson, H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res 38, e164, https://doi.org/10.1093/nar/gkq603 (2010).
Richards, S. et al. Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genet Med 17, 405–424, https://doi.org/10.1038/gim.2015.30 (2015).
Ng, P. C. & Henikoff, S. SIFT: Predicting amino acid changes that affect protein function. Nucleic Acids Res 31, 3812–3814, https://doi.org/10.1093/nar/gkg509 (2003).
Adzhubei, I., Jordan, D. M. & Sunyaev, S. R. Predicting functional effect of human missense mutations using PolyPhen-2. Curr Protoc Hum Genet Chapter 7, Unit 7.20, https://doi.org/10.1002/0471142905.hg0720s76 (2013).
Kircher, M. et al. A general framework for estimating the relative pathogenicity of human genetic variants. Nat Genet 46, 310–315, https://doi.org/10.1038/ng.2892 (2014).
Karczewski, K. J. et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature 581, 434–443, https://doi.org/10.1038/s41586-020-2308-7 (2020).
Stenson, P. D. et al. The Human Gene Mutation Database (HGMD®): optimizing its use in a clinical diagnostic or research setting. Hum Genet 139, 1197–1207, https://doi.org/10.1007/s00439-020-02199-3 (2020).
Landrum, M. J. et al. ClinVar: improving access to variant interpretations and supporting evidence. Nucleic Acids Res 46, D1062–D1067, https://doi.org/10.1093/nar/gkx1153 (2018).
Trujillano, D. et al. A comprehensive global genotype-phenotype database for rare diseases. Mol Genet Genomic Med 5, 66–75, https://doi.org/10.1002/mgg3.262 (2017).
Chierici, M., Miclaus, K., Vega, S. & Furlanello, C. An interactive effect of batch size and composition contributes to discordant results in GWAS with the CHIAMO genotyping algorithm. Pharmacogenomics J 10, 355–363, https://doi.org/10.1038/tpj.2010.47 (2010).
Thornton, T. et al. Estimating kinship in admixed populations. Am J Hum Genet 91, 122–138, https://doi.org/10.1016/j.ajhg.2012.05.024 (2012).
Kaddis, J. et al. Improving the Prediction of Type 1 Diabetes Across Ancestries. Diabetes Care, dc211254 https://doi.org/10.2337/dc21-1254 (2022).
Bryc, K., Durand, E., Macpherson, J., Reich, D. & Mountain, J. The genetic ancestry of African Americans, Latinos, and European Americans across the United States. American Journal of Human Genetics 96, 37–53, https://doi.org/10.1016/j.ajhg.2014.11.010 (2015).
Dai, C. L. et al. Population Histories of the United States Revealed through Fine-Scale Migration and Haplotype Analysis. Am J Hum Genet 106, 371–388, https://doi.org/10.1016/j.ajhg.2020.02.002 (2020).
Lewis, A. et al. Getting genetic ancestry right for science and society. Science 376, 250–252, https://doi.org/10.1126/science.abm7530 (2022).
Borrell, L. et al. Race and Genetic Ancestry in Medicine - A Time for Reckoning with Racism. The New England Journal of Medicine 384, 474–480, https://doi.org/10.1056/NEJMms2029562 (2021).
Abi-Rached, L. et al. Immune diversity sheds light on missing variation in worldwide genetic diversity panels. PloS One 13, e0206512, https://doi.org/10.1371/journal.pone.0206512 (2018).
Luo, Y. et al. A high-resolution HLA reference panel capturing global population diversity enables multi-ancestry fine-mapping in HIV host response. Nature Genetics 53, 1504–1516, https://doi.org/10.1038/s41588-021-00935-7 (2021).
Carr, A. L. J. et al. Histological validation of a type 1 diabetes clinical diagnostic model for classification of diabetes. Diabet Med 37, 2160–2168, https://doi.org/10.1111/dme.14361 (2020).
McCarthy, S. et al. A reference panel of 64,976 haplotypes for genotype imputation. Nat Genet 48, 1279–1283, https://doi.org/10.1038/ng.3643 (2016).
Sanyoura, M. et al. Pancreatic Histopathology of Human Monogenic Diabetes Due to Causal Variants in KCNJ11, HNF1A, GATA6, and LMNA. J Clin Endocrinol Metab 103, 35–45, https://doi.org/10.1210/jc.2017-01159 (2018).
Robinson, J. T. et al. Integrative genomics viewer. Nat Biotechnol 29, 24–26, https://doi.org/10.1038/nbt.1754 (2011).
Riddle, M. C. et al. Monogenic Diabetes: From Genetic Insights to Population-Based Precision in Care. Reflections From a Diabetes Care Editors’ Expert Forum. Diabetes Care 43, 3117–3128, https://doi.org/10.2337/dci20-0065 (2020).
Johnson, M. B., Hattersley, A. T. & Flanagan, S. E. Monogenic autoimmune diseases of the endocrine system. Lancet Diabetes Endocrinol 4, 862–872, https://doi.org/10.1016/S2213-8587(16)30095-X (2016).
Husebye, E. S., Anderson, M. S. & Kampe, O. Autoimmune Polyendocrine Syndromes. N Engl J Med 378, 1132–1141, https://doi.org/10.1056/NEJMra1713301 (2018).
Sanyoura, M. et al. Pancreatic Histopathology of Human Monogenic Diabetes Due to Causal Variants in KCNJ11, HNF1A, GATA6, and LMNA. J Clin Endocrinol Metab https://doi.org/10.1210/jc.2017-01159 (2017).
Chen, R., Im, H. & Snyder, M. Whole-Exome Enrichment with the Roche NimbleGen SeqCap EZ Exome Library SR Platform. Cold Spring Harb Protoc 2015, 634–641, https://doi.org/10.1101/pdb.prot084855 (2015).

Acknowledgements

This research was performed with the support of the Network for Pancreatic Organ donors with Diabetes (nPOD; RRID:SCR_014641), a collaborative type 1 diabetes research project supported by JDRF (nPOD: 5-SRA-2018-557-Q-R) and The Leona M. & Harry B. Helmsley Charitable Trust (Grant#2018PG-T1D053, G-2108-04793). The content and views expressed are the responsibility of the authors and do not necessarily reflect the official view of nPOD. Organ Procurement Organizations (OPO) partnering with nPOD to provide research resources are listed at http://www.jdrfnpod.org/for-partners/npod-partners/. Additional funding was provided by the National Institutes of Health (NIH, P01 AI042288, UC4 DK108132, U01 DK112217) and The Leona M. & Harry B. Helmsley Charitable Trust (G-2018PG-T1D018, G-2003-04376). MRS was supported by a JDRF Postdoctoral Fellowship (3-PDF-2022-1137-A-N). LDP was supported by NIH predoctoral training grants (T32 DK108736 and F31 DK129004-01A1).

Author information

These authors contributed equally: Daniel J. Perry, Melanie R. Shapiro.

Authors and Affiliations

Department of Pathology, Immunology and Laboratory Medicine, Diabetes Institute, College of Medicine, University of Florida, Gainesville, FL, 32611, USA
Daniel J. Perry, Melanie R. Shapiro, Irina Kusmartseva, Srikar Chamala, Leandro Balzano-Nogueira, Mingder Yang, Jason O. Brant, Maigan Brusko, MacKenzie D. Williams, Kieran M. McGrail, James McNichols, Leeana D. Peters, Amanda L. Posgai, Clayton E. Mathews, Clive H. Wasserfall, Bobbie-Jo M. Webb-Robertson, Martha Campbell-Thompson, Patrick Concannon, Mark A. Atkinson & Todd M. Brusko
Diabetes Center, School of Medicine, University of California San Francisco, San Francisco, CA, 94143, USA
Sonya W. Chamberlain, Mark S. Anderson, Michael S. German & Chester E. Chamberlain
Department of Biostatistics, College of Public Health and Health Professions, University of Florida, Gainesville, FL, 32610, USA
Jason O. Brant
Department of Diabetes and Cancer Discovery Science, Arthur Riggs Diabetes and Metabolism Research Institute, Beckman Research Institute, City of Hope, Duarte, CA, 91010, USA
John S. Kaddis
Department of Pediatrics, Diabetes Institute, College of Medicine, University of Florida, Gainesville, FL, 32610, USA
Clayton E. Mathews, Desmond Schatz, Mark A. Atkinson & Todd M. Brusko
Biological Sciences Division, Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland, WA, 99352, USA
Bobbie-Jo M. Webb-Robertson
Department of Biomedical Engineering, College of Engineering, University of Florida, Gainesville, FL, 32611, USA
Martha Campbell-Thompson
Center for Diabetes and Metabolic Diseases and the Wells Center for Pediatric Research, Indiana University School of Medicine, Indianapolis, IN, 46202, USA
Carmella Evans-Molina
Diabetes Research Institute, Department of Medicine, Division of Endocrinology, Diabetes and Metabolism, Department of Microbiology and Immunology, Miller School of Medicine, University of Miami, Miami, FL, 33021, USA
Alberto Pugliese
Genetics Institute, University of Florida, Gainesville, FL, 32601, USA
Patrick Concannon

Contributions

Daniel J. Perry: Conceptualization, Methodology, Software, Validation, Formal analysis, Investigation, Data curation, Writing – original draft preparation, Writing – reviewing and editing, Visualization. Melanie R. Shapiro: Software, Validation, Formal analysis, Investigation, Data curation, Writing – original draft preparation, Writing – reviewing and editing, Visualization. Sonya W. Chamberlain: Methodology, Software, Validation, Formal analysis, Investigation, Data curation, Writing – original draft preparation, Writing – reviewing and editing, Visualization. Srikar Chamala: Writing – reviewing and editing. Irina Kusmartseva: Investigation, Writing – reviewing and editing, Supervision, Project administration. Leandro Balzano-Nogueira: Investigation, Data curation, Writing – reviewing and editing. Mingder Yang: Data curation, Writing – reviewing and editing, Supervision, Project administration. Jason O. Brant: Investigation, Writing – reviewing and editing. Maigan Brusko: Investigation, Writing – reviewing and editing. MacKenzie D. Williams: Investigation, Data curation, Writing – reviewing and editing. Kieran M. McGrail: Investigation, Writing – reviewing and editing. James McNichols: Investigation, Writing – reviewing and editing. Leeana D. Peters: Investigation, Writing – reviewing and editing. Amanda L. Posgai: Writing – original draft preparation, Writing – reviewing and editing. John S. Kaddis: Software, Writing – reviewing and editing. Clayton E. Mathews: Writing – reviewing and editing. Clive H. Wasserfall: Writing – reviewing and editing, Supervision, Project administration. Bobbie-Jo M. Webb-Robertson: Writing – reviewing and editing. Martha Campbell-Thompson: Investigation, Writing – reviewing and editing, Supervision, Project administration. Desmond Schatz: Investigation, Writing – reviewing and editing, Supervision, Project administration. Carmella Evans-Molina: Writing – reviewing and editing, Supervision, Project administration. Alberto Pugliese: Writing – reviewing and editing, Supervision, Project administration. Patrick Concannon: Conceptualization, Writing – reviewing and editing, Supervision, Project administration, Funding acquisition. Mark S. Anderson: Conceptualization, Writing – reviewing and editing, Supervision, Project administration. Michael S. German: Conceptualization, Writing – reviewing and editing, Supervision, Project administration. Chester E. Chamberlain: Conceptualization, Writing – reviewing and editing, Supervision, Project administration. Mark A. Atkinson: Conceptualization, Resources, Writing – reviewing and editing, Supervision, Project administration, Funding acquisition. Todd M. Brusko: Conceptualization, Writing – reviewing and editing, Supervision, Project administration.

Corresponding authors

Correspondence to Mark A. Atkinson or Todd M. Brusko.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Perry, D.J., Shapiro, M.R., Chamberlain, S.W. et al. A genomic data archive from the Network for Pancreatic Organ donors with Diabetes. Sci Data 10, 323 (2023). https://doi.org/10.1038/s41597-023-02244-6

Received: 24 December 2022
Accepted: 16 May 2023
Published: 26 May 2023
DOI: https://doi.org/10.1038/s41597-023-02244-6
