Risk factors for type 1 diabetes, including environmental, behavioural and gut microbial factors: a case–control study

Type 1 diabetes (T1D) is a common autoimmune disease that is characterized by insufficient insulin production. The onset of T1D is the result of gene-environment interactions. Sociodemographic and behavioural factors may contribute to T1D, and the gut microbiota is proposed to be a driving factor of T1D. An integrated preventive strategy for T1D is not available at present. This case–control study attempted to estimate the exposures linked to T1D in order to identify significant risk factors in healthy children. Forty children with T1D and 56 healthy controls were included in this study. Anthropometric, socio-economic, nutritional, behavioural, and clinical data were collected. Faecal bacteria were investigated by molecular methods. The findings showed, in a multivariable model, that the risk factors for T1D include higher Firmicutes levels (OR 7.30; CI 2.26–23.54) and higher carbohydrate intake (OR 1.03; CI 1.01–1.05), whereas having a greater amount of Bifidobacterium in the gut (OR 0.13; CI 0.05–0.34) was a protective factor for T1D. These findings may facilitate the development of preventive strategies for T1D, such as performing genetic screening, characterizing the gut microbiota, and managing nutritional and social factors.

migrant children was observed in Europe 6,9,10 . Such a pronounced increase in incidence cannot be attributed to genetic factors alone. Other major risk factors may include the environment, Western lifestyle and nutrition 10 . Other diseases with immune involvement, such as allergies, exhibit a similar trend, suggesting an inducing role for exogenous factors in the increased predisposition to autoimmunity 11 . Preventive measures to reduce the incidence of T1D have not been defined to date. Various factors seem to be involved in modulating the incidence of T1D, including birth delivery mode, feeding, birth weight, infections (especially viral), dietary behaviour, and pharmaceutical use (especially antibiotics). Such factors may contribute to T1D development during the early disease stage 12 ; however, compared with genetic factors, environmental factors are less well characterized 13 . β-Cell vulnerability to stress factors has been discussed as the basis of the overload hypothesis 14 .
Associations among the microbiome, metabolome, and T1D were shown, highlighting a host-microbiota role in the onset of the disease 12,15 . The origin of the disease process was suspected to be gut microbiota dysbiosis (imbalances in the composition and function of intestinal microbes) associated with altered gut permeability and a major vulnerability of the immune system 6 . Accordingly, evidence obtained from both animal models and human studies suggests that the gut microbiota and the immune system interact closely, emphasizing the role of the intestinal microbiota in the maturation and development of immune functions 16 . Recently, mycobiome-bacteriome interactions, as well as intestinal virome and islet autoimmunity, were hypothesized to be drivers of dysbiosis 17 . Several studies have specifically investigated microbiota composition in children with T1D 18-20 , but the results have not been consistent. Interestingly, most studies are in agreement regarding the reduced microbial diversity observed in subjects with T1D compared with controls; moreover, the microbiota structure in T1D subjects was found to be different from that of control subjects 21,22 . To date, a typical T1D-associated microbiota has not been identified 23-26 . The research also determined that T1D clinical management could be improved by in-depth analysis of the partial remission phase 27 ; however, preventive measures are limited and generally focus only on genetic susceptibility 28 and general population screening for islet autoimmunity 29 . The development of an integrated prediction strategy could be useful for increasing early diagnosis while avoiding onset complications by identifying children at risk of T1D to place under observation and, in the future, to treat with preventive methods 10 .
The aim of this study is to identify environmental, behavioural, and microbial risk factors of T1D onset to develop an integrated T1D preventive management strategy that is suitable for paediatricians in the Piedmont region.

Results
Subject description and origin factor analysis. To analyse the origin factor, the study population was subdivided by the children's origins (Italian and migrant, 69 and 27 children, respectively). An analysis of the socio-demographic and behavioural factors examined in the study showed many differences between Italian and migrant children, while other variables appear to be quite homogeneous (Table 1). In the studied cohort, migrant status did not produce a significant increase in T1D onset.
Approximately 79% of the children in the cohort had siblings; approximately 40% of the included children lived with a pet in the house, and more than 65% of the children took antibiotics during the first two years of life. The residency zone was notably different between Italians and migrants: the percentage of migrant children living in urban sites was higher, but the difference was not significant in the adjusted model. Regular sports activities seem to be practised more by Italian children than by migrant children (73.5% vs 51.8%, p = 0.054). A total of 77.9% of Italian children and 55.6% of migrant children underwent regular health check-ups (p = 0.017). A significant difference was confirmed for the ages of the migrant mothers and fathers (Table 1), who were on average 6 years and 4 years younger, respectively, at recruitment than the Italian parents (p = 0.017 and p = 0.0425). The analysis of eating habits and nutritional intake revealed that the majority of the children were breastfed. Moreover, the weaning age was 6 months, as recommended. Migrant children showed higher total carbohydrate intake (+12%, p = 0.044) and simple carbohydrate intake (+24%, p = 0.0045). Moreover, among migrants, the children tended to access food by themselves and to consume meals alone. The percentage of migrant children who ate meals while watching TV was higher, but the difference was not significant. Finally, the one-course meal was more frequent in migrant families (ratio 1:3, p = 0.006).
The analysis of microbiota and bioindicator species displayed no significant differences between Italian and migrant children: the qRT-PCR measurements showed a trend of greater value for the total bacteria (both for the experimental design with and without probe), Bacteroides and M. smithii (both using 16S rDNA and nifH) in migrant children. The DGGE profile and dendrogram analysis did not show a different clustering pattern based on the origin, and the migrant group showed a trend towards greater α-diversity of the faecal microbiota profiles (Shannon index +5%). Additionally, the α-diversity analyses in next generation sequencing (NGS) showed a difference in operational taxonomic units (OTUs), i.e., there were more OTUs in migrants than in Italians; the difference was not significant, though it was close to the limit of significance (p = 0.057). Furthermore, the phylogenetic diversity index (Faith PD) suggested that the origin of the subjects could influence the structure of the microbial community. Although the overall number of OTUs did not change significantly, the phylogenetic distance of the individual OTUs was greater in the migrant group than in the Italian group, as the OTUs occupied a broader ecological niche in the migrant group.
Table 1. Summary of the population anthropometric characteristics, comparing cases and controls: number of children involved, sex, age and anthropometrics as the mean and standard deviation.
T1D risk factors. Previous results indicated that being a migrant child in the Piedmont region is not a significant risk factor for T1D onset 30 . Table 2 shows single logistic regressions performed to estimate the impact of the different variables on the outcome. Notably, the analysis of socio-demographic, behavioural, and nutritional determinants revealed that having parents with at least a high school certificate seems to be a protective factor for T1D onset, although this association was not significant after adjustment for multiple comparisons.
High total caloric intake, as well as high protein intake and consumption of total carbohydrates, are associated with only a slightly increased risk of T1D onset.
The DGGE gel and the results of the cluster analysis are shown in Fig. 1. The Pearson similarity clustering showed macro beta-diversity differences between the T1D patients and healthy children, with the main division being in two different clusters.
Firmicutes and Bacteroidetes followed by Proteobacteria and Actinobacteria (Table 3) predominantly composed the gut microbiota of all children. In the children with diabetes, an increase in the levels of three members of Bacteroidetes (Alistipes senegalensis, Bacteroides timonensis, and Barnesiella intestinihominis) and three members of Firmicutes (Christensenella timonensis, Ruminococcus bromii, and Urmitella timonensis) was observed by sequencing.
Furthermore, other notable results were obtained by NGS analyses. The taxonomic analysis revealed that the gut microbiota of the study participants was composed of nine relevant phyla: Firmicutes, Bacteroidetes, Actinobacteria, Proteobacteria, Verrucomicrobia, Euryarchaeota, Tenericutes, Cyanobacteria, and an unclassified phylum.
Moreover, beta-diversity analyses were carried out to highlight the differences among the samples based on the structures of their microbial communities. The weighted UniFrac metric showed that the samples were not subdivided into clusters. The intragroup and intergroup distances were comparable, and there was no separation between the clusters. These findings were confirmed by the PERMANOVA test. Finally, analyses of the differential abundance were performed to compare the increase or decrease in the abundance of one or more bacteria in the case and control groups. DeSeq2 showed 48 significantly abundant OTUs (p < 0.001). The most abundant OTU was Rikenellaceae followed by Prevotellaceae (Prevotella copri), Barnesiellaceae, Lachnospiraceae, and Ruminococcaceae (Ruminococcus bromii), which were significantly more abundant in children with diabetes. The difference in the results observed between methods is an interesting discussion point. The methods are characterized by different sensitivities; they represent different molecular perspectives regarding the faecal microbiota. When a method with a higher sensitivity is used (NGS), a flattening effect is possible. On the other hand, the major abundance of such genera as Ruminococcus was confirmed by different microbiota study methods, which is in keeping with the qRT-PCR results. A group of 23 samples showed different clustering compared to the others (Fig. 2, left). This small group was not different from the main group regarding any characteristics. The only significant difference was observed for the M. smithii presence and the A. muciniphila levels, both of which were higher in the separated group (Fig. 2, right). A. muciniphila was proposed as a probiotic 31 , while M. smithii has been characterized as the most abundant methanogen in the gut 32 .
The multivariable analysis produced an R² = 0.6259 (p < 0.001). After adjusting for confounding factors, the likelihood of having diabetes is significantly higher in those with a higher amount of Firmicutes, a lower amount of Bifidobacterium spp. and a higher total carbohydrate intake (Table 4).
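As a reading aid for the odds ratios reported for this model, the following minimal Python sketch shows how a logistic regression coefficient and its standard error map onto an odds ratio with a Wald confidence interval, and how the coefficient can be recovered from a reported interval. It assumes a symmetric Wald interval on the log-odds scale at the 95% level; the function names are hypothetical, and the sketch is not the STATA procedure used in the study. The interval used in the example (2.26–23.54 for Firmicutes) is the one given in the abstract.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def odds_ratio_with_ci(beta, se, z=Z_95):
    """Turn a logistic coefficient and its standard error into (OR, lower, upper)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def coefficient_from_ci(lower, upper, z=Z_95):
    """Recover the coefficient and standard error implied by a Wald interval."""
    beta = (math.log(lower) + math.log(upper)) / 2.0
    se = (math.log(upper) - math.log(lower)) / (2.0 * z)
    return beta, se


# Interval reported for Firmicutes in the multivariable model (abstract).
beta, se = coefficient_from_ci(2.26, 23.54)
odds_ratio, ci_low, ci_high = odds_ratio_with_ci(beta, se)
print(f"beta={beta:.3f} SE={se:.3f} OR={odds_ratio:.2f} ({ci_low:.2f}-{ci_high:.2f})")
```

Under these assumptions, the odds ratio recovered from the midpoint of the log-interval agrees with the reported point estimate of 7.30 to within rounding, which is what one would expect if the interval was computed on the log-odds scale.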

Discussion
T1D is an important disease that affects health with onset primarily occurring in childhood. At present, there is no cure for this disease, and only disease management is possible. The disease burden of T1D is immense, especially considering the number of years of life lost due to disability but also the years of life lost due to premature death. The life expectancy for T1D patients is approximately 16 years shorter than that of the comparable healthy population 33 . Even if relevant risk factors are known, to date, such scientific determinants do not include a screening programme for preventive purposes. Of course, preventive action must be considered as a systematic process that focuses on the main risk factors to identify children at higher risk of T1D and to suggest efficacious preventive treatments. In the study, the main T1D onset risk factors seem to be identifiable in the composition of the microbiota and, in particular, the microbiota α-diversity, Firmicutes and Bacteroidetes levels and their ratio, as well as the Bifidobacterium level. Similar evidence was obtained by other studies, which observed both higher Bacteroidetes in T1D patients 34,35 and less abundant anti-inflammatory genera in children with multiple islet autoantibodies 36 . Reduced microbial diversity appears to become significant between seroconversion and overt T1D 15 . A significant difference in the Bifidobacterium level was observed in different studies, including both a small cohort of autoimmune children 37,38 and a larger population associated with such protective factors as breastfeeding 21 . At the genus level, a significant difference in, for example, Blautia (increased in patients), was observed 39 ; however, in other studies, different single species (Bacteroides ovatus) seem to be more abundant in patients than in the controls 18 . 
However, prior studies suggest the presence of duodenal mucosa abnormalities in the inflammatory profile for T1D patients 22,40 and on the T1D-related changes in the gut microbiota, even if proving the causality of these factors has remained challenging 21 .
The characterization of the microbiota is rapidly evolving. Traditional methods that are not as sensitive as PCR-DGGE are still suitable, while NGS methods are expanding. Sophisticated whole-genome sequencing [...] development of a simple method to describe microbiota modulation using validated biomarkers, which could serve as a rapid screening test, may be warranted. Another risk factor is the occurrence of stress due to a traumatic or emotional experience. This stress seems to be able to affect the autoimmunity process. Therefore, particular attention could be paid to such risk factors for T1D risk in children.
A high education level of one or both parents could be also protective, suggesting that socioeconomic factors affect the T1D risk. Other factors, identified as significant risk modulators among behavioural and nutritional factors, had minor effects.
The study has some potential limitations, including susceptibility to bias in recollection about exposure and reverse causality. The exposure recollection could be biased, but this issue can be less influential at the onset, as in this study. Moreover, recruitment at the onset guarantees a temporal coherence of the exposure with respect to the disease onset.
T1D is one of the most frequently diagnosed diseases in children; however, it is not a high-incidence disease. The prospective inclusion of a large number of healthy children, which is needed for the observation of enough cases, requires a very long observation time. Moreover, a restricted age range was necessary because of the rapid changes in behaviour and microbiota that occur in children. This requirement further restricted the number of included subjects. On the other hand, the study of multifactorial diseases with poorly understood pathogenic pathways is imperative, even at the risk of obtaining less conclusive evidence. Of course, such a study alone could not elucidate the causation process, but the evidence obtained could be important for the selection of higher-risk subpopulations, planning of future research, and improving prevention.
Identification of a higher-risk subpopulation is strictly relevant for the subsequent validation of an efficient preventive screening to be produced with a prospective method. Of course, the pathogenesis of type 1 diabetes has not been fully elucidated to date; however, in this study, various factors (associated with both the disease and the microbiota composition) were included, such as the origin of the children, the age of the mother, the age of breastfeeding and the age of weaning. Other possible confounding factors not included in our analysis are viral infections, particularly enteroviruses, and preterm birth; however, there was no clear consensus regarding these novel factors at the beginning of the study.
Concerning the microbiota, the knowledge is still incomplete, and various factors can interact to produce a T1D risk modulation that is not explainable at present. Moreover, the results obtained using different techniques were also dissimilar (for example, clusterization due to β-diversity analysis). This finding is likely due to the different sensitivities of the applied methods 41 . Furthermore, even if the time between the symptom comparison and the diagnosis is very short, there is a danger of biased estimates due to reverse causality.
In conclusion, this study confirmed that T1D onset risk is modulated by compositional changes in the gut microbiota and that such evidence must be employed to devise preventive measures. The results showed that the gut microbial indicators found in children with T1D differ from those found in healthy children. These findings also pave the way for new research attempting to develop strategies to control T1D development by modifying the gut microbiota. However, a better knowledge of gut microbial composition associated with the development of T1D must be obtained to choose the best treatment 10,42-45 .
In brief, direct or indirect manipulations of the intestinal microbiome may provide effective measures for preventing or delaying the disease process leading to the manifestation of clinical T1D. At present, a preventive strategy could be developed that includes the main genetic and microbiome risk factors. Then, this strategy could be applied to healthy children to reduce the burden of T1D.

Methods
Study design and participants. The case-control study began in January 2016 46 . The recruitment included 40 paediatric patients with T1D (cases) and 56 healthy children (controls), who were comparable in terms of age, gender, and ethnicity to avoid bias. The included subjects represent the most convenient sample possible. The inclusion criteria were age (5-10 years), normal weight, and residence in Piedmont. Exclusion criteria were celiac disease, chronic disease diagnosis, eating disorders, active infections, and use of antibiotics and/or probiotics and/or any other medical treatment that influences the intestinal microbiota during the 3 months before recruitment. Children with parents of mixed origins (Italian and migrant) were also excluded to avoid important confounding factors due to mixed genetic and cultural backgrounds 19 .
The T1D children were enrolled in the study at disease onset, with hyperglycaemia, with or without ketoacidosis, polyuria symptoms, a high value of glycated haemoglobin (HbA1c > 42 mmol/mol) and T1D-specific autoantibody positivity. Healthy children were contacted by paediatricians in the territory of the acute care system. The guardians of the enrolled children read, understood, and then signed informed consent forms in accordance with the Declaration of Helsinki. A form was prepared for parents, children, and mature children 47 . All the following methods were carried out following relevant guidelines and regulations when available. A questionnaire was given to the parents containing items and questions to retrieve data on the family context, with particular regard to emotional stressors, such as mourning or separation, as well as anthropometric, socio-demographic, nutritional, and behavioural information. Anthropometric and nutritional data included weight, height, body mass index (BMI), food frequency based on 24-h recall and a food frequency questionnaire (FFQ), neonatal feeding, and age of weaning. The anthropometric parameters (weight and height) were measured according to standard recommendations. The BMI values were interpreted according to the WHO criterion. The 24-h recall technique reconstructed the meals and food intake on a recent "typical" day, estimating the bromatological inputs according to a food composition database for epidemiological studies in Italy (BDA). The FFQ, developed for the study, focused on the consumption of certain food categories (those containing sugars, fibre, omega-3, calcium, vitamin D, condiments, and cereals) and eating habits (e.g., alone or with adults, in front of the TV).
Twenty-eight percent of the included population were migrants (both parents not Italian). Such data are consistent with the percentage of newborns from non-Italian mothers, which is approximately 30% in northern Italy 48 . The migrant group included children coming mainly from northern Africa and Eastern Europe. The migration involved the parents and sometimes the children; on average, the children included as migrants had been resident in Italy for less than 5 years. At the end of recruitment, no significant differences were observed between the case and control groups for age, sex composition, and origins (criteria for pairing) or for height, weight, and BMI (t-test, p > 0.05) (Table 5).

Sample collection and DNA extraction.
A kit for stool collection was delivered to each study participant following a validated procedure 49,50 and using a Fecotainer device (Tag Hemi VOF, Netherlands). Faecal samples were homogenized within 24 h in the laboratory, and five 2 g aliquots were stored at −80 °C until DNA isolation was performed. Total DNA extractions from the stool samples were performed using the QiaAmp PowerFecal DNA Kit (QIAGEN, Hilden, Germany). The nucleic acids were quantified using a NanoQuant Plate (TECAN Trading AG, Switzerland), which allows quantification using a spectrophotometer read at 260 nm. The spectrophotometer used was the TECAN Infinite 200 PRO, and the software was i-Control (version 1.11.10). The extracted DNA concentrations ranged from 1.1 to 155.5 ng/μL (mean 41.35 ± 38.70 ng/μL). Samples were stored at −20 °C until molecular analysis was performed.
PCR-DGGE. The PCR products for denaturing gradient gel electrophoresis (DGGE) were obtained by amplifying the bacterial 16S rRNA genes following a marker gene analysis approach 51 . The primer pairs were 357F-GC and 518R (

NGS.
High-throughput DNA sequencing and analysis were conducted by BMR Genomics s.r.l. The V3-V4 region of 16S rDNA was amplified using the MiSeq 300PE Pro341F and Pro805R primer pair 6 . The sample reads were above 12 × 10⁶. The reaction mixture (25 μl
Data elaboration and statistical analyses. The statistical analysis was performed using STATA version 11.0. Moreover, the data on the included T1D patients and healthy controls were elaborated to highlight the likelihood of having diabetes. A descriptive analysis of the variables was conducted. The data were reported as absolute numbers and percentages for categorical variables and as means and standard deviations for continuous variables. Moreover, the subjects were divided by individual origins into two groups: Italian and migrant, considering the origin of the children and their families, to show differences in the distribution of disease determinants and to assess whether being a migrant could be associated with T1D onset. Differences between Italian and migrant children were assessed using the χ² test with Fisher's correction for categorical variables and Student's t-test for continuous variables. Univariable logistic regression was then performed to estimate the impact of sociodemographic, nutritional, and microbiota-related variables on the outcome. These associations were expressed as odds ratios (OR) with 95% confidence intervals (CI). Moreover, the adjusted p-value for multiple comparisons was calculated using the Benjamini and Hochberg false discovery rate method. We conducted multivariable logistic regression analyses of the risk of type 1 diabetes, including age, gender, Firmicutes, Bifidobacterium spp., and total carbohydrate intake as variables.
Table 6. Multivariable logistic regression model assessing potential risk factors of T1D. *Odds Ratio, adjusted also for age and gender. **Confidence Interval.
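The Benjamini and Hochberg adjustment mentioned above can be illustrated with a short Python sketch. It is a generic implementation of the step-up procedure, not the STATA routine used in the study, and the p-values in the example are hypothetical values chosen only to show how an unadjusted p < 0.05 can lose significance after adjustment.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical unadjusted p-values from five univariable comparisons.
raw = [0.001, 0.008, 0.039, 0.041, 0.20]
adj = benjamini_hochberg(raw)
for p, q in zip(raw, adj):
    print(f"raw p = {p:.3f}  adjusted p = {q:.5f}")
```

In this hypothetical example the third and fourth comparisons fall below 0.05 before adjustment but not after it, the same pattern described in the Results for variables that were "not significant after adjusted comparisons".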
The Spearman rank-order correlation coefficient was also determined to assess the relationships between variables. A p-value < 0.05 was considered significant for all analyses. The DGGE gel analysis was performed with Bionumerics 7.2. The hierarchical classification was performed with a UPGMA system (1% tolerance and optimization level) and Pearson correlation. Simpson's diversity index, Shannon's index, and Margalef index were calculated for each DGGE profile to evaluate alpha diversity.
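The three alpha-diversity indices named above can be sketched in Python as follows. The sketch rests on assumptions not stated in the text: each DGGE band is treated as one taxon, band intensity is used as the abundance measure, Simpson's index is taken in its 1 − Σp² form, and Margalef's index uses the total intensity as the sample size. The exact definitions applied by Bionumerics may differ, and the function name and example values are hypothetical.

```python
import math


def alpha_diversity(intensities):
    """Shannon, Simpson (1 - sum p^2) and Margalef indices for one profile."""
    bands = [x for x in intensities if x > 0]
    if not bands:
        raise ValueError("profile has no bands with positive intensity")
    total = sum(bands)
    proportions = [x / total for x in bands]
    richness = len(bands)
    shannon = -sum(p * math.log(p) for p in proportions)
    simpson = 1.0 - sum(p * p for p in proportions)
    # Margalef: (S - 1) / ln(N); undefined when ln(N) is not positive.
    margalef = (richness - 1) / math.log(total) if total > 1 else float("nan")
    return {"richness": richness, "shannon": shannon,
            "simpson": simpson, "margalef": margalef}


# Hypothetical profile with four bands of equal intensity.
even_profile = alpha_diversity([10, 10, 10, 10])
# Hypothetical profile dominated by a single band.
skewed_profile = alpha_diversity([37, 1, 1, 1])
print(even_profile)
print(skewed_profile)
```

With equal richness, the evenly distributed profile yields higher Shannon and Simpson values than the profile dominated by one band, which is why these indices complement a simple band count.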
NGS bioinformatics analysis was performed with the software pipeline Qiime2. The reads were cleaned up by the primers using the software Cutadapt (version 2018.8.0) and processed with the software DADA2. The sequences were trimmed at the 3′ end (forward: 270 bp; reverse 260 bp), filtered by quality, and merged with default values. Subsequently, the sequences were elaborated to obtain unique sequences. In this phase, the chimaeras (denoised-paired) are also eliminated. The sequences were clustered against unique sequences at 99% similarity. The taxonomies of both GreenGenes (version  and Silva (version 132) were assigned to the OTU sequences. Alpha-diversity analyses were performed on all samples using the observed OTUs, Shannon, Pielou's evenness, and Faith PD indices, and for each index, the Kruskal-Wallis test was used to verify the significance of the comparisons between samples. Beta-diversity analyses were performed on all samples using the Bray-Curtis, Jaccard, and UniFrac metrics (weighted and unweighted). Multivariable statistical analyses were performed using the PERMANOVA, Adonis, and ANOSIM tests; instead, the analysis of the differential abundance was based on the packages of R (MetagenomeSeq, DeSeq2, and ANCOM).
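Two of the beta-diversity metrics listed above, Bray-Curtis and Jaccard, can be computed directly from an OTU count table, as in the minimal Python sketch below. It is an illustration only and does not reproduce the Qiime2 pipeline; the UniFrac metrics are omitted because they additionally require a phylogenetic tree. The function names and count vectors are hypothetical.

```python
def bray_curtis(counts_a, counts_b):
    """Bray-Curtis dissimilarity between two samples (abundance based)."""
    if len(counts_a) != len(counts_b):
        raise ValueError("samples must cover the same OTUs")
    numerator = sum(abs(a - b) for a, b in zip(counts_a, counts_b))
    denominator = sum(counts_a) + sum(counts_b)
    return numerator / denominator if denominator else 0.0


def jaccard_distance(counts_a, counts_b):
    """Jaccard distance between two samples (presence/absence based)."""
    if len(counts_a) != len(counts_b):
        raise ValueError("samples must cover the same OTUs")
    present_a = {i for i, a in enumerate(counts_a) if a > 0}
    present_b = {i for i, b in enumerate(counts_b) if b > 0}
    union = present_a | present_b
    if not union:
        return 0.0
    return 1.0 - len(present_a & present_b) / len(union)


# Hypothetical counts for four OTUs in two samples.
sample_1 = [10, 0, 5, 5]
sample_2 = [5, 5, 0, 10]
bc = bray_curtis(sample_1, sample_2)
jd = jaccard_distance(sample_1, sample_2)
print(f"Bray-Curtis = {bc:.2f}, Jaccard = {jd:.2f}")
```

Bray-Curtis weighs differences in abundance, whereas Jaccard considers only which OTUs are shared, so the two metrics can rank sample pairs differently; this is one reason several metrics are usually reported together.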