Type 1 diabetes genetic risk score is discriminative of diabetes in non-Europeans: evidence from a study in India

Type 1 diabetes (T1D) is a significant problem in Indians and misclassification of T1D and type 2 diabetes (T2D) is a particular problem in young adults in this population due to the high prevalence of early onset T2D at lower BMI. We have previously shown a genetic risk score (GRS) can be used to discriminate T1D from T2D in Europeans. We aimed to test the ability of a T1D GRS to discriminate T1D from T2D and controls in Indians. We studied subjects from Pune, India of Indo-European ancestry; T1D (n = 262 clinically defined, 200 autoantibody positive), T2D (n = 345) and controls (n = 324). We used the 9 SNP T1D GRS generated in Europeans and assessed its ability to discriminate T1D from T2D and controls in Indians. We compared Indians with Europeans from the Wellcome Trust Case Control Consortium study; T1D (n = 1963), T2D (n = 1924) and controls (n = 2938). The T1D GRS was discriminative of T1D from T2D in Indians but slightly less than in Europeans (ROC AUC 0.84 v 0.87, p < 0.0001). HLA SNPs contributed the majority of the discriminative power in Indians. A T1D GRS using SNPs defined in Europeans is discriminative of T1D from T2D and controls in Indians. As with Europeans, the T1D GRS may be useful for classifying diabetes in Indians.

As of 2015 there were 490,000 children <15 years of age living with type 1 diabetes (T1D) globally, with >100,000 in India 1 . Despite this, almost all large genetic, biomarker and phenotype studies focus on T1D in people of European ancestry. The extent of overlapping genetic risk for T1D between different ethnic populations is not well described. Rising obesity rates and the recognition that T1D can occur at any age make the discrimination between T1D and type 2 diabetes (T2D) an increasingly difficult challenge. The discrimination of diabetes subtypes is more challenging in the Indian population due to the higher prevalence of early-onset T2D at a lower body mass index (BMI) than in European populations 2 . It is vital to identify the correct diabetes subtype as optimal treatment differs between T1D and T2D.
We and others have previously shown that a T1D genetic risk score (T1D GRS) comprising between 9 and 67 SNPs can be a useful tool to aid the discrimination between T1D and T2D or controls in Europeans 3-6 . The majority of the discriminative power of the T1D GRS is in the first 9 SNPs when ranked by effect size 3 . These 9 SNPs include SNPs tagging the high-risk HLA DR3-DQ2.5 (DR3)/DR4-DQ8 (DR4) alleles and the highly protective HLA DR15-DQ6.2 (DR15) allele. In this study we aimed to assess whether the T1D GRS developed in Europeans discriminates T1D from T2D and controls in Indians and could aid diabetes classification in this population.

Methods
In order to assess the performance of the T1D GRS we studied a cohort from Pune, India of 305 people with T1D, 352 people with T2D and 334 people without diabetes (

Cohort characteristics
Indian cohort. We studied individuals with and without diabetes from studies in Pune, in the state of Maharashtra, India, who were of self-reported Indo-European ancestry.
T1D. We considered people with T1D eligible for the study if they were attending the outpatients clinic of the Diabetes Unit, King Edward Memorial Hospital and Research Centre, Pune. We defined T1D as diabetes diagnosed <30 years of age, on insulin treatment from diagnosis and with a history of ketoacidosis. We excluded any sample which failed genotyping QC (n = 4). To reduce the risk of including people with monogenic diabetes or T2D we also excluded people diagnosed at age <9 months if islet autoantibody negative and those with random serum C-peptide concentrations >600 pmol/L who were islet autoantibody negative 8 or who had missing data in any of the measures. 262 people met the inclusion criteria and had clinically defined T1D. We performed analyses of this group and of a group defined by a stricter definition of T1D that included only autoantibody positive individuals (n = 200) in case some autoantibody negative individuals had non-T1D.
T2D. People with T2D (n = 352) were recruited as part of the Wellcome Genetic collection (WellGen) of people with diabetes mellitus 9 . For this study we selected people with T2D who were diagnosed >45 years and were on oral anti-diabetic agents for more than 5 years after diagnosis. We excluded those clinically judged to have exocrine pancreatic disease, monogenic diabetes or insulin-dependence (history of ketoacidosis, unresponsiveness to oral hypoglycaemic agents, on continuous insulin treatment since diagnosis). We excluded any samples which failed genotyping QC (n = 7), leaving 345 people with T2D in final analysis.
Control subjects. Control subjects (n = 334) were people without diabetes (75 g oral glucose tolerance test; WHO 1999 criteria) residing in and around Pune, India. These included parents of children from the Pune Children Study 10 , a study on the relationship between a child's birthweight and future risk of T2D. Ten samples did not pass genotype QC, leaving 324 controls.
Informed consent was obtained from all study participants who were above 18 years of age. For those below 18 years of age informed consent was obtained from a parent and/or legal guardian and for those between the ages of 12 to 18 years an additional consent was obtained from the participant.
The collection of clinical data and use of biobanked samples for the biochemical, immunological and genetic measurements was sanctioned by the Institutional Ethics Committee of the KEM Hospital Research Centre, Pune, India (KEMHRC ID No1737 & KEMHRC ID No PhD19) and all methods were performed in accordance with the relevant guidelines and regulations.

WTCCC Cohort.
To compare the results generated in our Indian cohort to results from Europeans we used the WTCCC cohort described previously 7 .
T1D. The WTCCC T1D cohort (n = 1,938) all received a clinical diagnosis of T1D at <17 years of age and were treated with insulin from the time of diagnosis.
T2D. The WTCCC T2D cohort (n = 1,914) were all diagnosed between 25 and 75 years of age, were Glutamic Acid Decarboxylase (GAD) autoantibody negative and were either treated with diet/oral hypoglycaemic agents or had an interval of at least 1 year between diagnosis and the institution of insulin therapy.
Control subjects. The WTCCC control subjects (n = 2938) were made up of people with no known diagnosis of diabetes from two sources: the 1958 British Birth Cohort and the UK Blood Services control cohort.
Genotyping. We genotyped 9 SNPs that capture the majority of discriminative power of the T1D GRS developed using subjects of European ancestry from the WTCCC study 7 (Table S1). Genomic DNA was isolated from all samples using the salt precipitation method and DNA samples were plated in 96-deep-well storage plates at a uniform concentration of 10 ng/µL at the CSIR-Centre for Cellular and Molecular Biology, Hyderabad, India. Each plate included eight repeat samples (∼10%) as a quality control measure. We used Sequenom MassARRAY technology to genotype the 9 SNPs as part of a multiplex pool. We had a >98% genotype success rate and >97% concordance between results of duplicate samples.
Autoantibody and C-peptide measurement. We measured random serum C-peptide concentration by direct electrochemiluminescence immunoassay (Cobas C-peptide kit, Roche Diagnostics GmbH, Germany; lower detection limit: 3.3 pmol/L with CV of 0.6% at 33 pmol/L) on a Cobas e411 analyser 11 .

GRS calculation. The T1D GRS was calculated by:
For the non-DR3/DR4 SNPs: summing the dosages of the risk-increasing allele at each locus (between 0 and 2) multiplied by the weight (ln(odds ratio [OR])) for each of the 7 SNPs (Supplementary table S1).
For the DR3/DR4-DQ8 contribution: the DR3/DR4-DQ8 haplotypes were imputed and the corresponding weights assigned to each individual's score 12 (as in 13 ) (Supplementary table S1).
The sum of these two was then divided by 15. This assumes that the score follows a log-additive model for T1D risk.
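The calculation described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the SNP names, odds ratios and haplotype weights are hypothetical placeholders (the real values are in Supplementary table S1, and the real score uses 7 non-DR3/DR4 SNPs), and the function name `t1d_grs` is ours.

```python
import math

# HYPOTHETICAL odds ratios for two placeholder SNPs. The real score uses
# 7 non-DR3/DR4 SNPs whose weights are in Supplementary table S1.
NON_HLA_ODDS_RATIOS = {"snp_a": 1.5, "snp_b": 2.0}

# HYPOTHETICAL weights for imputed DR3/DR4-DQ8 haplotype combinations
# ("X" stands for any other haplotype).
HAPLOTYPE_WEIGHTS = {"X/X": 0.0, "DR3/X": 1.0, "DR3/DR4": 3.0}


def t1d_grs(dosages, haplotype,
            odds_ratios=NON_HLA_ODDS_RATIOS,
            haplotype_weights=HAPLOTYPE_WEIGHTS,
            divisor=15):
    """Log-additive score: sum of dosage * ln(OR) over the non-HLA SNPs,
    plus the weight of the imputed haplotype pair, divided by `divisor`."""
    non_hla = 0.0
    for snp, odds_ratio in odds_ratios.items():
        dosage = dosages[snp]
        if not 0 <= dosage <= 2:
            raise ValueError(f"dosage for {snp} must be between 0 and 2")
        non_hla += dosage * math.log(odds_ratio)
    return (non_hla + haplotype_weights[haplotype]) / divisor


# One risk allele at snp_a, two at snp_b, DR3/DR4 heterozygote.
score = t1d_grs({"snp_a": 1, "snp_b": 2}, "DR3/DR4")
```

Taking the natural log of each odds ratio before summing is what makes the score log-additive: each extra risk allele multiplies the modelled odds of T1D by that SNP's odds ratio.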
For analysis of discriminative power of HLA alone we created 4 groups defined by imputed haplotypes: low risk (category 0) with no DR3 or DR4, medium risk (category 1) with a single copy of DR3 or DR4, high risk (category 2) with two copies of DR3 or DR4, very high risk (category 3) with one copy of DR3 and one copy of DR4. This was used as an ordinal variable in logistic regression.
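As a minimal sketch of the four-group coding, the function below maps imputed DR3 and DR4 copy numbers to the ordinal category. The function name and the representation of a genotype as two copy counts are our assumptions, not the authors' code.

```python
def hla_risk_category(dr3_copies: int, dr4_copies: int) -> int:
    """Return the ordinal HLA risk category (0-3) from imputed copy numbers.

    0: no DR3 or DR4
    1: a single copy of DR3 or DR4
    2: two copies of DR3 or two copies of DR4
    3: one copy of DR3 and one copy of DR4
    """
    if min(dr3_copies, dr4_copies) < 0 or dr3_copies + dr4_copies > 2:
        raise ValueError("a person carries at most two haplotypes in total")
    if dr3_copies == 1 and dr4_copies == 1:
        return 3  # DR3/DR4 compound heterozygote, the highest-risk group
    # Otherwise the category equals the number of risk haplotypes carried.
    return dr3_copies + dr4_copies
```

The resulting integer can then be entered as a single ordinal predictor in a logistic regression, as the text describes.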

Statistics.
We tested the ability of a 9 SNP T1D GRS to discriminate between T1D and T2D and controls by using the area under the curve (AUC) of the receiver operating characteristic (ROC) statistic. All analyses were carried out using PLINK v1.90 14,15 and STATA 15 (StataCorp LP, College Station, TX). Comparisons of discrimination were made using the roccomp package.
HLA DR3/DR4 status is the major discriminator of T1D. The discriminative power of imputed HLA DR3/4 typing alone accounted for the majority of the power of the T1D GRS to discriminate people with T1D from people with T2D and controls in both Indians and Europeans and was similar between Indians and Europeans. As with the 9 SNP GRS the discriminative power was lower in Indians compared to Europeans.
Odds ratios and risk allele frequencies at DR3 and DR4 are different in Indians compared to Europeans. We assessed the individual odds ratios for each of the variants contributing to the 9 SNP GRS. This showed the DR3 and DR4 combinations to be different between Europeans and Indians (Fig. 2, Table 2 and Supplementary figure 4). The presence of either DR4 or DR3 increased the odds ratio of T1D in the Indian cohort; however, the relative strength of effect was reversed compared to Europeans. The odds ratio for T1D in Indians with DR4 was lower (OR [95% CI] for any DR4: 3.2 [2.1-4.9] in the Indian cohort vs 7.0 [6.2-8.0] in Europeans) and the odds ratio for T1D in Indians with DR3 was higher compared to Europeans (OR [95% CI] for any DR3: 9.6 [6.2-14.9] in the Indian cohort vs 4.0 [3.6-4.5] in Europeans).
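For readers unfamiliar with the two statistics used here, the sketch below shows one common way to compute each. It is not the authors' pipeline, which used PLINK and Stata. The AUC is computed as the proportion of case/comparison pairs in which the case has the higher score (ties count one half). The odds ratio uses a Woolf (log-based) 95% confidence interval, which is our assumption because the text does not state how the intervals were derived. All function names and counts are hypothetical.

```python
import math


def roc_auc(case_scores, comparison_scores):
    """AUC as the probability that a randomly chosen case outscores a
    randomly chosen comparison subject, counting ties as one half."""
    if not case_scores or not comparison_scores:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for case in case_scores:
        for other in comparison_scores:
            if case > other:
                wins += 1.0
            elif case == other:
                wins += 0.5
    return wins / (len(case_scores) * len(comparison_scores))


def odds_ratio_ci(case_carriers, case_noncarriers,
                  control_carriers, control_noncarriers, z=1.96):
    """Odds ratio for carrying an allele, with a Woolf confidence interval.

    Returns (odds_ratio, lower_bound, upper_bound).
    """
    cells = (case_carriers, case_noncarriers,
             control_carriers, control_noncarriers)
    if min(cells) <= 0:
        raise ValueError("all four cells must be positive")
    a, b, c, d = cells
    odds_ratio = (a * d) / (b * c)
    # Standard error of ln(OR) is the root of the summed reciprocal counts.
    se_log_or = math.sqrt(sum(1.0 / n for n in cells))
    log_or = math.log(odds_ratio)
    return (odds_ratio,
            math.exp(log_or - z * se_log_or),
            math.exp(log_or + z * se_log_or))


# Hypothetical scores and counts, for illustration only.
auc = roc_auc([0.6, 0.4], [0.5, 0.3])
or_any_dr3, lower, upper = odds_ratio_ci(20, 80, 10, 90)
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation, which gives a scale for reading values such as 0.84 and 0.87.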

Results
The frequencies of major risk alleles DR3 and DR4 were lower in the Indian controls compared to European controls (Indian controls: any DR3 12% frequency, any DR4 14%; European controls: any DR3 26%, any DR4 21%). The major protective HLA allele in Europeans is DR15-DQ6. As expected this had a very low frequency in both Indians and Europeans (Table 2). The DR15-DQ6 allele had a much lower frequency in Indian controls (1%) than European controls (14%) (Table 2). This finding is consistent with results from Indian individuals in the 1000 Genomes Project (2%) 16 .

Discussion
In this study we demonstrate that a T1D GRS derived from Europeans can still be discriminative of T1D from T2D in Indians. The large numbers of people with T1D in India, the increasing recognition of childhood and early adulthood onset of T2D and recent studies highlighting the frequency of adult onset T1D 17,18 emphasize that tools such as the T1D GRS will be needed to discriminate people with T1D from people with T2D in non-European populations.
The utility of polygenic risk scores for prediction and diagnosis of disease is an area of increasing interest 19,20 . The most effective polygenic risk scores are those derived from large genome-wide association studies in independent datasets of heritable diseases. These studies have most commonly been performed in large datasets of people of European ancestry, as is the case for T1D 20-22 . The effects of population stratification and ethnicity on genetic associations of disease may hinder the utility of polygenic risk scores in non-Europeans 23,24 . This is in part due to differences in underlying risk allele frequencies between populations. A good example in our study is the strongly protective HLA DR15-DQ6 allele which is common in Europeans but virtually absent in the Indian population we studied. This explains some of the differences in baseline T1D risk between Indians and Europeans. Despite these limitations of population stratification, our study confirms that the major T1D risk alleles in Europeans are also key risk alleles in Indians and even in its current form (biased towards a European population) the T1D GRS is still strongly discriminative of T1D from T2D.
The effect of the DR3 allele on T1D risk in Indians is greater than in Europeans. Conversely, the DR4 association with T1D was less strong. These findings strengthen earlier findings by other smaller studies 25-27 which hint at a stronger DR3 association in India. This could be explained by an interaction with the environment. It is possible that a specific pathogen or environmental risk factor that interacts with DR3 is more prevalent in the Indian population or environment compared to the European population and/or a common environmental exposure related to DR4 risk may be less common. Further exploration of this could provide important new insights into the environmental triggers that lead to T1D. The linkage disequilibrium between the tag SNPs (rs2187668, rs7454108) and the DR3 and DR4 alleles could be different across the different populations, which could lower the efficiency of the SNPs to tag the haplotypes. However, the accuracy of the DR3/DR4 haplotype combination classification in the 1000 Genomes Project data set (http://www.internationalgenome.org/home) was 98.4% using the DR3/DR4 tag SNPs.

Limitations
The population of India is very heterogeneous and more data from across India are needed to properly assess how representative this study is of all Indian ethnicities. The subjects in the study were of Indo-European ethnicity and it will be interesting to test the observations on Dravidian people, another major ethnicity in India. We did not perform autoantibody testing in the T2D group and therefore could not actively exclude slowly evolving T1D in this group. However, we used clinical criteria designed to exclude misdiagnosed T1D (by excluding those on insulin within 5 years) and therefore assume this will have a very low prevalence in the T2D group. If the T2D group contained large numbers of people with T1D this would only give a more conservative estimate of the utility of the T1D GRS, and the similarity between the European and Indian results suggests this is unlikely. We used odds ratios derived from Europeans for the T1D GRS. We hypothesize that if key HLA risk allele frequencies are similar and these alleles confer similar risk then the T1D GRS is likely to be similarly discriminative. The use of large genome-wide association studies to generate the weights in the T1D GRS means the odds ratios are precise for a European population. A natural next step is to try to define genetic relationships in a large Indian cohort with T1D to generate an Indian-specific T1D GRS and to test expanded T1D GRSs that include more HLA alleles and recently associated non-HLA loci. However, a critical issue is the power required to do this, and without large sample sizes it is possible that a GRS defined in a small (e.g. <1000 cases) cohort may not improve discrimination of T1D.
It is likely that a combined approach to classification of T1D that uses clinical features, autoantibodies and genetic risk will be optimal for clinical diagnosis. Once larger cohorts with clinical and biomarker information are assembled it will be possible to test this approach.

Conclusion
In conclusion, we show that a 9 SNP T1D GRS can be used to help classify diabetes in Indians. This study suggests the T1D GRS could be an effective tool to aid discrimination of T1D from T2D in Indians. Our study additionally highlights differences in OR of key risk alleles for T1D that warrant further investigation.

Data availability
To access data generated or used in this study that are not contained in the manuscript, please contact the corresponding authors.