Association of SLC11A1 polymorphisms with anthropometric and biochemical parameters describing Type 2 Diabetes Mellitus

Diabetes, a leading cause of death globally, has different types, with Type 2 Diabetes Mellitus (T2DM) being the most prevalent. It has been established that variations in the SLC11A1 gene affect the risk of developing infectious, inflammatory, and endocrine disorders. This study aimed to investigate the association between SLC11A1 gene polymorphisms (rs3731864 G/A, rs3731865 G/C, and rs17235416 +TGTG/−TGTG) and the anthropometric and biochemical parameters describing T2DM. Eight hundred participants (400 cases and 400 controls) were genotyped using the polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP) and amplification-refractory mutation system-PCR (ARMS-PCR) methods. Lipid profile, fasting blood sugar (FBS), hemoglobin A1c level, and anthropometric indices were also recorded for each subject. SLC11A1-rs3731864 G/A and -rs17235416 (+TGTG/−TGTG) were associated with T2DM susceptibility, providing protection against the disease. In contrast, SLC11A1-rs3731865 G/C conferred an increased risk of T2DM. We also noticed a significant association between SLC11A1-rs3731864 G/A and triglyceride levels in patients with T2DM. In silico evaluations demonstrated that the SLC11A2 and ATP7A proteins interact directly with the SLC11A1 protein in Homo sapiens. In addition, allelic substitutions at both intronic variants disrupt or create binding sites for splicing factors and may therefore have a functional effect. Overall, our findings indicate that SLC11A1 gene variations might have positive (rs3731865 G/C) or negative (rs3731864 G/A and rs17235416 +TGTG/−TGTG) associations with a predisposition to T2DM.

To calculate the sample size, we conducted a pilot study in which blood samples were collected from a small population (100 participants, including 50 T2DM patients and 50 healthy subjects) and genotyped for all of the examined SNPs. This allowed us to identify an adequate sample size. The chi-square test was then used to calculate the allelic frequencies of the investigated variants in both groups. The estimated frequencies were then subjected to a sample size analysis using the online sample size calculator server (available at: https://clincalc.com/stats/SampleSize.aspx). In the formula used by the server, P1 represents the frequency of the wild or mutant allele in controls, P2 is the frequency of the wild or mutant allele in cases, Z is the critical Z value for a given α or β, α is the probability of type I error (usually 0.05), and β is the probability of type II error (usually 0.2). The calculator was used to determine the sample size for the tested variations in the studied groups, with study power set to 80%. The sample size threshold was then adjusted to a total of 800 subjects.
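The calculation described above can be illustrated with a minimal sketch. It assumes the common unpooled normal-approximation formula for comparing two independent proportions with equal group sizes; the exact expression used by the online calculator is not reproduced in this text and may differ (for example, by pooling the variance). The function name and the example allele frequencies are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Per-group sample size for detecting a difference between two proportions.

    p1: allele frequency in controls; p2: allele frequency in cases.
    Assumption: unpooled normal approximation with equal group sizes.
    """
    if p1 == p2:
        raise ValueError("p1 and p2 must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided type I error
    z_beta = NormalDist().inv_cdf(power)           # power = 1 - beta
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2
    return ceil(n)


# Hypothetical pilot frequencies: 30% in controls versus 20% in cases.
print(sample_size_two_proportions(0.30, 0.20))
```

Smaller differences between P1 and P2 require larger groups, which is why the pilot allele frequencies drive the final threshold.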
Weight and height were each measured twice for every participant, and the average of the two measurements was used to calculate the BMI. Weight was determined with minimum clothing and without shoes, using a standard scale with an accuracy of 100 g. Height was measured with a standard meter with an accuracy of 0.1 cm while the person stood barefoot against the gauge positioned behind the legs. Moreover, the narrowest waist area between the lowermost rib and the iliac crest above the navel was measured as the waist circumference (WC) with an inelastic measuring tape to within 0.1 cm. The hip circumference was measured at the widest part of the hip, at its maximum bulge, with an accuracy of 0.1 cm. The waist-to-hip ratio (WHR) was calculated by dividing the waist circumference by the hip circumference, both in centimeters. The conicity index (CI) was calculated as previously described by Shidfar et al. 26. Table 1 summarizes the clinical and demographic characteristics of all participants.

Genomic DNA isolation and genotyping. Genomic DNA was extracted from nucleated white blood cells using a simple salting-out technique 27. The purity and concentration of the extracted DNA were determined by calculating the 260/280 optical density ratio using a Nanodrop device (Maestrogen®, Taiwan). Data for selecting variations and designing specific primers were acquired from the National Center for Biotechnology Information (NCBI) database. Specific primers were designed using the Gene Runner® v.6.5.52 Beta software and produced by GenFanAvaran Company in Iran. The studied variations were genotyped by applying the polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP) technique (for the SLC11A1-rs3731864 and -rs3731865 SNPs) and the amplification-refractory mutation system-PCR (ARMS-PCR) technique (for SLC11A1-rs17235416). The reaction mixture had a final volume of 20 μL and contained 0.9 μL of genomic DNA (~60 ng/mL), 0.8 μL of each primer (8 pmol), 10 μL of 2× Taq PreMix (Parstous Biotechnology®, Mashhad, Iran), and 7.5 μL of double-distilled water. The mixture was cycled using a Techne thermal cycler (Techne, US) under the following conditions: initial denaturation at 95 °C for 5 min, followed by 35 cycles of 94 °C for 30 s, the variant-specific annealing temperature (Supplementary Table S1) for 30 s, and extension at 72 °C for 30 s. These stages were followed by a final extension step at 72 °C for 5 min.
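For the anthropometric indices described above (BMI, WHR, and conicity index), a minimal sketch of the arithmetic may be helpful. The conicity index here follows the widely used Valdez formula, which is an assumption: the text cites Shidfar et al. without giving the expression. Function names and the example measurements are hypothetical.

```python
from math import sqrt


def mean_of_two(first, second):
    """Average of duplicate measurements, as done for weight and height."""
    return (first + second) / 2


def bmi(weight_kg, height_m):
    """Body mass index in kg/m^2."""
    return weight_kg / height_m ** 2


def waist_to_hip_ratio(waist_cm, hip_cm):
    """WHR: waist circumference divided by hip circumference (same units)."""
    return waist_cm / hip_cm


def conicity_index(waist_cm, weight_kg, height_m):
    """Conicity index (assumed Valdez form): WC[m] / (0.109 * sqrt(weight[kg] / height[m]))."""
    return (waist_cm / 100) / (0.109 * sqrt(weight_kg / height_m))


# Hypothetical participant measured twice.
weight = mean_of_two(70.2, 69.8)    # kg
height = mean_of_two(1.751, 1.749)  # m
print(round(bmi(weight, height), 1))
print(round(waist_to_hip_ratio(90.0, 100.0), 2))
print(round(conicity_index(90.0, weight, height), 2))
```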
Statistical analyses. SPSS version 22.0 software (SPSS, Inc., Chicago, IL, USA) was used for data analysis. Deviation from Hardy-Weinberg equilibrium (HWE) was assessed via Pearson's chi-square test. Variables were compared between cases and controls using the standard single-sample t statistic, Mann-Whitney-Wilcoxon, and Pearson chi-square tests where appropriate, and continuous variables are expressed as mean ± standard deviation (SD). Odds ratios (ORs) and 95% confidence intervals were calculated to estimate the relative risk of the disease. Binary logistic regression analysis was employed to examine the correlation between the clinical-demographic findings of the studied groups and T2DM risk. In addition, haplotype analysis was conducted using the online SHEsis software 28. A p-value less than 0.05 was considered statistically significant.
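As an illustration of the HWE check mentioned above, the following sketch computes Pearson's chi-square statistic for a biallelic locus from genotype counts. It is not the SPSS procedure used in the study; the function name and the example counts are hypothetical, and the p-value uses the closed form for one degree of freedom.

```python
from math import erfc, sqrt


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Pearson chi-square test for deviation from Hardy-Weinberg equilibrium.

    Takes observed counts of the two homozygotes and the heterozygote.
    Returns (chi2, p_value) with 1 degree of freedom.
    Assumes both alleles are present (no zero expected counts).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical genotype counts in a control group.
chi2, p_value = hwe_chi_square(250, 130, 20)
print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}")
```

A p-value above 0.05 would indicate no significant deviation from HWE in that group.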

Computational analyses.
A complex set of RNA-binding proteins controls the post-transcriptional processing of RNA, such as capping, polyadenylation, splicing, export, and the protein's secondary structure. Allelic substitution in DNA can affect some of these processes, primarily the accuracy and efficiency of splicing, by altering the complex of proteins bound to the pre-mRNAs. Knowing the RNA sequences recognized by each protein involved in post-transcriptional RNA processing is necessary for predicting the effects of mutations at both the RNA and protein levels. For this purpose, we used the SpliceAid database to determine the impact of both studied intronic variants of SLC11A1 on the pattern of splicing processes 29. SpliceAid is a web-based tool that collects all the experimentally assessed target RNA sequences bound by splicing proteins in humans. The user submits sequences to the SpliceAid server, and the server identifies the exact correspondence between the submitted sequences and the sequences in the database, giving accurate and dynamic graphic results.

The WebLogo v.2.8.2 server was employed to identify the conserved regions around all three studied polymorphisms 30. WebLogo generates sequence logos, which represent patterns within multiple sequence alignments. In comparison with consensus sequences, sequence logos provide a more detailed and more accurate description of sequence similarity and can quickly reveal characteristics of an alignment that would otherwise be difficult to detect. A logo consists of stacks of letters, one stack for each position in the sequence. The height of each stack (measured in bits) represents the sequence conservation at that position, and the height of each symbol within a stack reflects the relative frequency of the corresponding amino acid or nucleic acid at that position 30. As defined by Schneider and Stephens, sequence conservation is the difference between the maximum possible entropy and the entropy of the observed symbol distribution 31. In particular, sequence logos offer a richer and more detailed description of, for example, a binding site, as compared with consensus sequences 32. Sequence information of genomic DNA in different formats, including CLUSTALW, FASTA, MSF, NBRF, PIR, NEXUS, PHYLIP, and plain flat-file, can be entered into the WebLogo server (available at https://weblogo.threeplusone.com/create.cgi) for multiple sequence alignments. The symbols at SNP positions are scaled according to how frequently they occur.
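The stack height described above can be made concrete with a short sketch that computes the information content of each alignment column as the maximum entropy (2 bits for DNA) minus the observed Shannon entropy. This is a simplified illustration, not WebLogo's implementation: it omits the small-sample correction that WebLogo applies, and the aligned sequences are hypothetical.

```python
from collections import Counter
from math import log2


def information_content(column, alphabet="ACGT"):
    """Conservation (in bits) of one alignment column, ignoring gaps.

    Computed as log2(|alphabet|) minus the Shannon entropy of the observed
    symbols. Assumption: no small-sample correction is applied.
    """
    symbols = [s for s in column.upper() if s in alphabet]
    if not symbols:
        return 0.0
    total = len(symbols)
    counts = Counter(symbols)
    entropy = -sum((c / total) * log2(c / total) for c in counts.values())
    return log2(len(alphabet)) - entropy


def logo_heights(sequences):
    """Stack height for every column of equally long aligned sequences."""
    return [information_content("".join(col)) for col in zip(*sequences)]


# Hypothetical alignment of the same locus in four species.
alignment = ["ACGA", "ACTA", "ACAG", "ACCG"]
print(logo_heights(alignment))
```

A fully conserved column reaches 2 bits, whereas a column in which all four nucleotides are equally frequent has a height of 0 bits.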
To predict the protein-protein interaction (PPI) network of the SLC11A1 protein, the newest version (01-04-2022) of the web-based inBio Discover™ database (https://inbio-discover.intomics.com/map.html) was employed 33. The database is a comprehensive and accurate PPI resource built from more than six million traceable entries, showing a set of highly trusted interactions between proteins based on experimentally determined databases. We set the network expansion to the "Include neighboring proteins" mode to show close and related proteins in terms of expression, regulation, or function. In this network, pathway interactions are indicated as lines, and the remainder are inBio Map™ high-confidence interactions. Data were generated by entering the UniProt ID of the SLC11A1 protein, P49279, into the server. UniProt is an online repository for proteins that extracts data from the Swiss-Prot, TrEMBL, and PIR-PSD databases 33,34. Expression-, regulation-, and function-related proteins were made available through the network expansion method of this database. To design an interaction network for SLC11A1 as a hub gene, information regarding the known and/or predicted interactions, gene fusion, co-expression, and protein homology was obtained using STRING 34. STRING imports protein association knowledge from databases of physical interactions and databases of curated biological pathway knowledge (MINT, HPRD, BIND, DIP, BioGRID, KEGG, Reactome, IntAct, EcoCyc, NCI-Nature Pathway Interaction Database, and GO). Finally, inBio Discover™ was utilized to analyze the PPI network.

Results
Laboratory and demographic findings. The study population consisted of 400 patients with T2DM (274 women and 126 men; mean age of 54.4 ± 9.7) and 400 healthy control subjects (277 women and 123 men; mean age of 53.4 ± 9.6). As shown in Table 1, no marked difference was noticed between the studied groups concerning age, gender, and WHR (p = 0.058, 0.819, and 0.439, respectively). In contrast, FBS, TG, TC, HDL-c, LDL-c, conicity index, and body mass index (BMI) were significantly different between cases and controls (p < 0.001).
The correlation between the SLC11A1 SNPs and the laboratory findings and demographic characteristics of the studied groups is shown in Table 3. We noticed a significant association between SLC11A1-rs3731864 G/A and TG levels in patients with T2DM (p = 0.048). The SLC11A1-rs3731865 G/C variant was associated with WC and LDL-c levels in the healthy controls (p = 0.046 and 0.006, respectively). Moreover, the SLC11A1-rs17235416 Ins/Del variant was associated with WC, conicity index, and HDL-c levels in controls (p = 0.018, 0.027, and 0.025, respectively).

S2 represents the association between SLC11A1-rs3731864 G/A, -rs3731865 G/C, and -rs17235416 +TGTG/−TGTG haplotypes in T2DM cases and healthy controls. We found a higher frequency of the SLC11A1-rs3731864 G/A, -rs3731865 G/C, and -rs17235416 +TGTG/−TGTG haplotypes in patients with T2DM compared with controls. Compared with the reference haplotype (G/C/+TGTG), the A/C/+TGTG haplotype of rs3731864/rs3731865/rs17235416 significantly diminished T2DM risk in our population (OR 0.84, 95% CI 0.27-0.85, and p = 0.043). The linkage disequilibrium (LD) between the three SLC11A1 polymorphisms was also calculated, and no strong LD was found between them (Supplementary Fig. S1). Table 4 summarizes the interaction analysis of SLC11A1 polymorphisms on T2DM risk. Compared with the reference combination (GG/GC/Ins-Del), the genotype combination of GA/CC/Ins-Ins markedly increased T2DM risk by 1.66-fold (OR 1.66, 95% CI 1.11-2.48, and p = 0.013), whereas the GA/CC/Ins-Ins combination diminished T2DM risk by 57% (OR 0.43, 95% CI 0.26-0.71, and p < 0.001).
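To show how an odds ratio and its 95% confidence interval of the kind reported above are obtained, the sketch below applies the standard log-odds (Woolf) method to a 2 × 2 table. It yields an unadjusted estimate, unlike the BMI-adjusted values from logistic regression reported in the study, and the counts in the example are hypothetical rather than taken from the tables.

```python
from math import exp, log, sqrt


def odds_ratio_ci(case_carrier, case_noncarrier,
                  control_carrier, control_noncarrier, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval (Woolf method).

    All four cell counts must be non-zero. z = 1.96 gives a 95% interval.
    Returns (odds_ratio, lower, upper).
    """
    a, b, c, d = case_carrier, case_noncarrier, control_carrier, control_noncarrier
    odds_ratio = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: carriers and non-carriers of a genotype in cases and controls.
or_value, lower, upper = odds_ratio_ci(120, 280, 90, 310)
print(f"OR {or_value:.2f}, 95% CI {lower:.2f}-{upper:.2f}")
```

An interval that excludes 1 corresponds to a statistically significant association at the 0.05 level; values below 1 indicate a protective effect and values above 1 an increased risk.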

Computational predictions.
The results of the SpliceAid server showed that the G to A substitution at the rs3731864 position disrupts the binding sites of some splicing factors, including SC35, SF2/ASF, hnRNP F, hnRNP H3, hnRNP H1, and hnRNP H2. In contrast, the nucleotide change at the rs3731865 position creates a binding site for the SF2/ASF and hnRNP H3 factors (Fig. 3). Variation analysis using the WebLogo server demonstrated that all three studied polymorphisms, especially SLC11A1-rs3731864 G/A and -rs3731865 G/C, reside in unconserved regions across multiple mammalian species (Fig. 4). Furthermore, the inBio Discover™ databank revealed that the solute carrier family 11 member 2 (SLC11A2) and ATPase copper transporting alpha (ATP7A) proteins interact directly with the SLC11A1 protein in Homo sapiens. According to the known interactions (from curated databases and experimentally determined), ATP7A itself interacts with SLC11A2, solute carrier family 31 member 2 (SLC31A2), and antioxidant 1 copper chaperone (ATOX1) in Homo sapiens (Fig. 5).
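The interactions named in this paragraph can be written as a small undirected graph, which makes it easy to list direct partners and count connections per protein. This is only a sketch: the edge list contains the five interactions stated in the text, not the full network shown in Fig. 5, and the helper names are hypothetical.

```python
from collections import defaultdict

# Only the interactions explicitly named in the text (a subset of Fig. 5).
EDGES = [
    ("SLC11A1", "SLC11A2"),
    ("SLC11A1", "ATP7A"),
    ("ATP7A", "SLC11A2"),
    ("ATP7A", "SLC31A2"),
    ("ATP7A", "ATOX1"),
]


def node_degrees(edges):
    """Number of distinct interaction partners for each protein."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    return {node: len(partners) for node, partners in neighbors.items()}


def direct_partners(edges, protein):
    """Sorted list of proteins that interact directly with `protein`."""
    return sorted({b if a == protein else a
                   for a, b in edges if protein in (a, b)})


print(direct_partners(EDGES, "SLC11A1"))
print(node_degrees(EDGES))
```

Within this reduced edge list ATP7A has the most partners; the degree of each protein in the complete network would need the full interaction data exported from the database.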

Discussion
In the last 30 years, the prevalence of T2DM has doubled, making it one of the most significant global health issues 35 and suggesting an urgent need to identify novel biomarkers for the early diagnosis of this endocrine disease. Genetic variations located in the intronic region 36 or the 3′-untranslated region (UTR) 37,38 of some genes have been found to independently contribute to the predisposition to T2DM, suggesting that there may be other, as-yet-undiscovered functional variations in Homo sapiens. Additional in-depth population genetic research will be required to comprehend the connection between more complicated haplotypes and disease development in various geographical areas 39. In the present work, for the first time, we sought to investigate the correlation between SLC11A1 variants and the risk of T2DM in a sample of the Iranian population. Our findings showed a significant association between SLC11A1 polymorphisms and T2DM risk, where rs3731865 G/C markedly enhanced T2DM risk and both the rs3731864 G/A and rs17235416 +TGTG/−TGTG variants significantly diminished the risk of this endocrine disease under different genetic models.
In terms of SLC11A1, most previous reports focused on the role of SLC11A1 variants in the pathogenesis of autoimmune and infectious diseases 40, such as HIV 41. It was also suggested that the high-expressing allele (iii) and the low-expressing allele (ii) of the (GT)n polymorphism, respectively, might be responsible for susceptibility to these conditions. This has also been confirmed in numerous studies on autoimmune and infectious diseases, including tuberculosis, demonstrating that there may be some balance in the selective factors that maintain both alleles in the population 39. In another study, Ling et al. (2014) reported that the A allele of SLC6A20-rs13062383 increases susceptibility to T2DM in populations with different genetic backgrounds 42. Xu et al. concluded that T2DM is associated with the AA genotype of SLC30A8-rs11558471 in Homo sapiens. Haplotype A/C/A seems to be a risk factor, and haplotype A/C/G may be a protective factor against T2DM in the Han population 43. The results of the 2015 meta-analysis by Chen et al. showed that SLC30A8 rs13266634 might be a crucial genetic contributor to the risk of T2DM among Asians and Europeans but not Africans. It also indicated that people with the CC genotype have a 33.0% and 16.5% higher risk of T2DM compared with those with the TT and CT genotypes, respectively 14. According to research by Zaahl et al., the promoter SNP rs7573065 (−237 C/T) plays a protective role in the contribution of SLC11A1 to inflammatory bowel disease 44. They also revealed that when allele 3 of the 5′ microsatellite was present, the C to T alteration at position −237 (rs7573065) downregulated SLC11A1 to a level comparable to that observed with allele 2 of the microsatellite 45.

Kissler et al. discovered that SLC11A1 downregulation in NOD mice mimicked the protective Idd5.2 T1DM-resistant haplotype and decreased the prevalence of T1DM 46. It was found that this gene affects the ability of dendritic cells (DCs) to process and present pancreatic islet antigens (i.e., glutamic acid decarboxylase GAD65), increasing the stimulation of a diabetogenic T-cell clone 47. Unfortunately, there is a lack of evidence for the involvement of SLC11A1 variants in the etiology of DM. In this regard, Yang and colleagues concluded that the SLC11A1 gene variant rs3731865 (INT4) might be correlated with T1DM risk in a population of European ancestry. Although they found no correlation between the mRNA levels of SLC11A1 and the different genotypes of this SNP in whole blood samples, a possible association in purified cell subsets, particularly monocytes or macrophages, could not be completely ruled out 48. In another cohort study, Takahashi et al. examined the SNPs located in the promoter region of SLC11A1, which might affect transcriptional activity, in 224 controls and 95 Japanese cases of T1DM. Japanese participants were found to carry alleles 2, 3, and 7. They found a significant difference in the subset of Japanese individuals with T1DM; these patients had considerably higher allele 7 frequencies than the healthy subjects, as did those carrying no susceptibility HLA class II haplotypes (DR4-DQ4 or DR9-DQ9). Overall, they concluded that the new promoter variant of SLC11A1 affects the susceptibility of Japanese individuals to T1DM 49. Mycobacterium avium subsp. paratuberculosis (MAP) has been linked to the onset of T1DM; accordingly, Paccagnini et al. examined 59 T1DM cases and 79 healthy individuals for 9 SLC11A1 SNPs and the presence of MAP using the PCR technique. Blood levels of MAP DNA and the 274C/T SLC11A1 polymorphism were found to be linked to T1DM. Because MAP is not degraded by macrophages and is processed by DCs, it is important to determine whether mutant variants of SLC11A1 affect the processing or presentation of MAP antigens, which could lead to an autoimmune disorder and T1DM 50.

In agreement with these reports, we found a negative association between T2DM and two SNPs in SLC11A1 (rs3731864 G/A and rs17235416) and a positive association between the disease and SLC11A1 rs3731865 G/C. Both of the studied intronic variants were located in the splicing sites of the SLC11A1 gene. Interestingly, the results of our web-based analysis showed that the A allele of SLC11A1 rs3731864 disrupts the binding sites of SC35, SF2/ASF, heterogeneous nuclear ribonucleoprotein (hnRNP) F, hnRNP H3, hnRNP H1, and hnRNP H2, whereas the minor allele of SLC11A1 rs3731865 creates a binding site for the splicing factors SF2/ASF and hnRNP H3. This is important because splicing factors are involved in regulating distinct gene expression processes 51. It has been documented that alternative splicing via SF2/ASF 52 and hnRNP F 53,54 can contribute to the pathogenesis of diabetes or cause insulin resistance. Furthermore, overexpression of hnRNP H1 was observed in the nuclei of inflamed islets in fulminant T1DM 55. Distinct binding specificities have been reported for the human splicing factors ASF/SF2 and SC35, and these specificities are functionally important 56.
An interaction network between proteins comprises a few highly connected nodes (known as hubs) and many poorly connected nodes. In genome-wide studies, it has been established that the deletion of a hub protein increases the likelihood of lethality, a phenomenon known as the centrality-lethality rule. A key notion of systems biology lies in the biological significance of network architectures, which are believed to reflect the special role hubs play in organizing networks 57. Since proteins cannot act alone, most cellular functions depend on interactions between them. In the current study, we utilized inBio Discover™ to explore possible interactions between SLC11A1 and other proteins to gain insight into a complex interaction network that may be responsible for the onset of T2DM. This is important since, to the best of our knowledge, the role of this solute carrier protein has not been studied in T2DM patients. Our bioinformatics results showed that some of the genes that directly interact with SLC11A1 can also play important roles in the course of T2DM, making SLC11A1 a hub protein that can regulate different signaling pathways involved in the pathogenesis of T2DM. SLC11A1, a divalent cation transporter, plays an important role in early macrophage activation and exerts multiple pleiotropic effects on macrophage function, including on the expression of chemokines, IL-1β, tumor necrosis factor α (TNF-α), inducible nitric oxide synthase, and MHC class II molecules. These pleiotropic effects suggest that SLC11A1 is a prime candidate gene for T1DM in humans and mice 49.
The main mediator of iron transfer is SLC11A2, and iron is absorbed through this apical transporter in intestinal epithelial cells and macrophages. Previous studies have demonstrated that iron metabolism can affect insulin sensitivity, leading to T2DM 58. Ferroptosis is also associated with diabetic cognitive dysfunction, and a previous study has shown that Slc40a1 mediates ferroptosis in T1DM 59. Accordingly, we found that SLC11A2 and ATPase copper transporting alpha (ATP7A) are the most relevant proteins interacting with SLC11A1. This is important since it has been established that ATP7A 60, fibrinogen beta chain (FGB) 61, fibrinogen alpha chain (FGA) 62, solute carrier family 11 member 2 (SLC11A2) 58, and solute carrier family 40 member 1 (SLC40A1) 63 might have essential roles in the pathogenesis of T2DM. Interestingly, hemostatic dysfunction and subclinical inflammation might play a role in the complex etiopathogenesis of diabetic peripheral neuropathy (DPN). Fibrinogen is involved in both hemostatic and inflammatory pathways, and it is hypothesized that fibrinogen gene polymorphisms might be associated with DPN 64. In general, various studies have shown this gene to be related to the pathogenesis of T2DM and to SLC11A1. This evidence suggests that SLC11A1 may act as a regulatory hub that controls cell metabolism and activity by controlling the activity of other genes. Further investigation of the relationship between these proteins and SLC11A1 is warranted.
Deviations from the normal pattern represent DNA distortion or base flipping in the sequence 30. From a clinical perspective, SNPs are potential diagnostic and therapeutic biomarkers for many types of cancer and metabolic disorders. Those in the promoter region affect gene expression by altering promoter activity, the binding of transcription factors, DNA methylation, and histone structure. Introns comprise approximately half of the human noncoding genome and have critical roles in gene regulation and expression. SNPs in intronic regions (such as rs3731864 G/A and rs3731865 G/C) might cause diseases and alter the genotype-phenotype association by generating splice variants of transcripts and by promoting or disrupting the binding and function of long noncoding RNAs (lncRNAs). SNPs in the 5′-UTR can potentially affect translation, whereas those in the 3′-UTR (i.e., rs17235416 +TGTG/−TGTG) affect the binding of microRNAs (miRNAs) 65,66. Intronic sequences might be conserved, as they contain expression-regulating elements that impose functional constraints on their evolution 67,68. SNPs located in these regions could be pathogenic even if they are conserved 69. Variation analysis revealed that all three studied polymorphisms, specifically SLC11A1 rs3731864 G/A and rs3731865 G/C, reside in unconserved regions across multiple mammalian species. Understanding the mechanisms underlying the effects of SNPs that result in metabolic diseases such as diabetes is critical for elucidating their molecular pathogenesis.
Based on chromosomal position, rs3731864 is located 18 bp before exon 6, rs3731865 is located 14 bp after exon 4, and rs17235416 lies within exon 15 of the SLC11A1 gene. The first and second variants are located in regulatory regions (for example, splicing sites) and could potentially affect the expression of SLC11A1. Accordingly, the presence of a minor allele at these positions can affect post-transcriptional modifications and/or translation. Thus, it is hypothesized that nucleotide substitution at these locations results in a less or more efficient protein. However, the exact mechanism underlying the role of the mutated SLC11A1 protein is not yet understood and requires additional bioinformatics analyses.
Accordingly, rare functional noncoding SNPs identified by large-scale whole genome sequencing have revealed unexplained heritability of T2DM 70 and can thus be considered valuable prognostic markers for the disease. This is crucial because the lack of prevention and healthcare measures accounts for the fast rise in the prevalence of such endocrine disorders and their complications in developing countries. Iran lacks the most recent studies and knowledge necessary for effectively managing and treating T2DM. Our research aims to develop effective and affordable approaches to assess the genetic variation-based risk of T2DM development. We expect that our findings may help clinicians in the management and early detection of this condition and thereby support better treatment options for T2DM. Additionally, creating a T2DM biobank for this cohort will be beneficial since this is the first study testing these SNPs in T2DM patients.

Although SNPs in the intronic and 3′-UTR regions of the SLC11A1 gene were chosen in the current study to examine the relationship between SLC11A1 variations and the risk of T2DM, these variants might not provide a full picture of the genetic activity of the SLC11A1 gene. As a result, a fine-mapping study may be needed subsequently. In addition, T2DM is a complicated metabolic disorder driven by environmental and genetic variables that were not examined in this study, which can be considered a limitation. Furthermore, we did not perform Sanger sequencing to confirm our genotyping results, which can also be considered a limitation of the current study. Lastly, our sample size was relatively small, and this could potentially affect the outcome of such population-based studies. Despite these limitations, we believe that the findings of our study highlight the essential role of SLC11A1 polymorphisms in the predisposition to T2DM in subjects of Iranian ancestry.

Conclusion
Our findings showed that SLC11A1 rs3731865 G/C is associated with an increased risk of T2DM in our population, while the SLC11A1 rs3731864 G/A and rs17235416 +TGTG/−TGTG variants were correlated with a decreased risk of developing this endocrine disease. Further bioinformatics analyses, along with replication studies in different ethnicities, are needed to confirm our findings. Additionally, given the significant effect of these SNPs on the onset of T2DM, it appears likely that additional genetic variants in this gene may contribute to T2DM susceptibility. These findings may facilitate a detailed understanding of the molecular pathogenesis of T2DM and the genetic basis of heterogeneous susceptibility, with potential implications for the development of more effective therapeutic strategies.

Figure 3. Web-based analysis of the impact of the studied intronic variants on the pattern of splicing processes using the SpliceAid database. rs3731864 mutant (A), rs3731864 wild-type (B), rs3731865 mutant (C), rs3731865 wild-type (D). The G to A substitution at the rs3731864 position disrupts the binding sites of some splicing factors, including SC35, SF2/ASF, hnRNP F, hnRNP H3, hnRNP H1, and hnRNP H2. In contrast, the nucleotide change at the rs3731865 position creates a binding site for the SF2/ASF and hnRNP H3 factors.

Figure 4. Illustration of sequence conservation. WebLogo illustrates the conservation of the DNA sequences around the SLC11A1-rs3731864 G/A, -rs3731865 G/C, and -rs17235416 +TGTG/−TGTG variant loci. The red vertical line indicates the positions of the variant loci in humans and the conservation of the wild alleles across multiple mammalian species. Tall nucleotide symbols indicate more conservation, while small and more diverse ones indicate less conservation.

Figure 5. The PPI network of the SLC11A1 protein. Colored lines between proteins indicate evidence of various types of interactions. Regarding "Known Interactions", the blue lines represent interactions based on curated databases and the purple lines highlight experimentally determined interactions. Regarding "Predicted Interactions", the green lines show interactions based on gene neighborhood, the red lines indicate gene fusions, and the dark blue lines represent gene co-occurrence. Regarding "Others", the yellow lines represent interactions based on text mining, the black lines show co-expression, and the light blue lines represent protein homology. The above classification information was obtained using STRING. STRING imports protein association knowledge from databases of physical interactions and databases of curated biological pathway knowledge (MINT, HPRD, BIND, DIP, BioGRID, KEGG, Reactome, IntAct, EcoCyc, NCI-Nature Pathway Interaction Database, and GO). A PPI analysis was conducted using inBio Discover™ to investigate the possible interactions between SLC11A1 and other proteins. Our bioinformatics results showed that some of the genes that interact directly with SLC11A1, including ATP7A, FGB, FGA, SLC11A2, and SLC40A1, can also play important roles in the course of T2DM, which makes SLC11A1 a hub protein that can regulate different signaling pathways involved in the pathogenesis of T2DM. PPI protein-protein interaction, SLC11A1 solute carrier family 11 member 1, SLC11A2 solute carrier family 11 member 2, SLC25A37 solute carrier family 25 member 37, SLC31A2 solute carrier family 31 member 2, SLC34A1 solute carrier family 34 member 1, SLC34A2 solute carrier family 34 member 2, SLC40A1 solute carrier family 40 member 1, ATOX1 antioxidant 1 copper chaperone, ATP7A ATPase copper transporting alpha, HAMP hepcidin antimicrobial peptide, FGA fibrinogen alpha chain, FGB fibrinogen beta chain, FGG fibrinogen gamma chain, F2 coagulation factor II, thrombin, SPL1 squamosa promoter binding protein-like 1, IRF8 interferon regulatory factor 8, GATA2 GATA binding protein 2, TTN titin, NEB nebulin, TRIM63 tripartite motif containing 63, ACTN2 actinin alpha 2, TCAP titin-cap.
https://doi.org/10.1038/s41598-023-33239-3

Table 1. Clinical and demographic data of patients with T2DM and healthy controls. T2DM Type 2 Diabetes Mellitus, n represents the numbers, BMI body mass index, WC waist circumference, WHR waist-to-hip ratio, CI conicity index, FBS fasting blood sugar, TC total cholesterol, TG triglyceride, HDL high-density lipoprotein, LDL low-density lipoprotein, a Mann-Whitney-Wilcoxon test; b Pearson Chi-Square test. Statistically significant parameters (p-value < 0.05) are shown in bold and italics.

Table 2. Allelic and genotypic distribution of SLC11A1 gene polymorphisms. T2DM Type 2 Diabetes Mellitus, n represents the numbers, CI confidence intervals, OR odds ratio, HWE Hardy-Weinberg equilibrium, BMI body mass index, *p-value and OR (95% CI), BMI-adjusted. Codominant 1 and Codominant 2 represent the heterozygous and homozygous codominant models, respectively. p-values < 0.05 were considered statistically significant and are indicated in bold.