Introduction

Arsenic contamination is a severe public health problem. More than seventy countries and regions around the world have high arsenic contents, including Bangladesh, China, Mexico, India, and Argentina1,2,3. In arid and semiarid regions in northwest China, individuals develop a typical drinking water-type endemic arsenic-poisoning lesion, and the arsenic content of the groundwater in the area is higher than the 10 μg/L standard set by the World Health Organization4. As a potential environmental carcinogen, arsenic can affect health, and long-term exposure to drinking water with high levels of arsenic can increase the risk of cancers, including skin, lung, bladder, kidney and liver cancers, and also influence various non-cancer diseases, including cardiovascular and cerebrovascular diseases, diabetes, reproductive and developmental disorders, and neurological and cognitive dysfunction5,6,7,8. In addition, arseniasis, also known as chronic arsenic poisoning, can cause different skin manifestations, including palmoplantar keratosis and pigmentation or depigmentation on the chest and back and in severe cases can cause skin cancer or Bowen’s disease9,10,11. Although more than 100 million people are exposed to arsenic worldwide, many studies have shown that only a portion of them exhibit arsenic-induced skin lesions12,13. This phenomenon may be caused by genetic differences14, as multiple epidemiological studies have revealed that genetic polymorphisms play an important role in susceptibility to arsenic poisoning9,15,16.

The pathogenic mechanism of inorganic arsenic is very complex and is most likely multifactorial17. Recently, many epidemiological studies have reported that inter-individual variations in arsenic metabolism can result in differences in susceptibility to arsenic-induced skin lesions18; such gene polymorphisms influence the proteins encoded and thus affect enzyme structure and function19,20. Two possible metabolic pathways for arsenic have been proposed in mammals5, with most studies favoring the classical metabolic pathway21 in which (1) pentavalent arsenic (AsV) is reduced to trivalent arsenic (AsIII) and (2) s-adenosyl methionine (SAM) provides a methyl donor for oxidative methylation of trivalent arsenic, with trivalent arsenic producing mono-, di- and trimethylated derivatives12,22. Many types of enzymes participate in this metabolic pathway, though arsenic (III) methyltransferase (AS3MT), glutathione S-transferase omega-1 (GSTO1) and omega-2 (GSTO2) and purine nucleoside phosphorylase (PNP) are considered the main enzymes3,19,23. AS3MT, an S-adenosyl methionine (SAM)-dependent enzyme, is essential for oxidative methylation of trivalent arsenic during the arsenic biotransformation process20. GSTO1 and GSTO2 are important rate-limiting enzymes in arsenic metabolism and catalyze reduction of methyl arsenic24. Previous studies have shown that PNP reduces AsV to AsIII in the rat liver and calf spleen25,26. Genetic polymorphisms in the genes encoding these enzymes may cause individual differences in arsenic biotransformation, leading to different individual sensitivities to arsenic.

To explore the correlation between arsenic-induced skin lesions and polymorphisms in arsenic-associated metabolic enzyme genes, we analyzed genetic polymorphisms of 25 polymorphic loci in the four genes mentioned above in samples collected from populations in areas of northwest China with high levels of arsenic in the drinking water. The arsenic concentration in the drinking water in Gansu Province was found to be 969 μg/L, nearly 100 times higher than the highest standard for arsenic in drinking water according to the recommendations of the World Health Organization (10 μg/L). The aim of this study was to determine the existence of a relationship between single-nucleotide polymorphisms (SNPs) of the above-mentioned arsenic-metabolizing genes and the development of arsenic-induced skin lesions in an arsenic-exposed population in northwest China.

Results

Characteristics of the studied populations

The general characteristics of the cases and controls are shown in Table 1. No significant differences were observed between the cases and controls with regard to age, gender, smoking status, drinking status, or ethnicity. There were, however, significant differences in the years of cigarette smoking between the two groups. Consistent with the total arsenic intake in drinking water, concentrations of arsenic species (total (tAs) and inorganic (iAs) as well as monomethylarsonic acid (MMA) and dimethylarsinic acid (DMA)) detected in urine samples showed a statistically significant difference between cases and controls (P < 0.05). The rs11191439 SNP in AS3MT was found not to be in Hardy-Weinberg equilibrium (HWE) and was therefore eliminated.

Table 1 General characteristics and polymorphic genes of cases and controls.

Screening for gene polymorphisms and analyzing allele frequencies

Information for all genes was obtained from an SNP database (https://www.ncbi.nlm.nih.gov/snp/?term), and we selected SNPs with higher allele frequencies in the Chinese population. Only the mutant allele frequency of rs11509438 was less than 10%, whereas the mutant allele frequencies of the other SNPs ranged from 0.1038 to 0.4635. Allele frequencies differed significantly between the cases and controls for GSTO1 SNPs rs11191979, rs2164624, rs2282326 and rs4925, GSTO2 SNPs rs156697 and rs2297235 and the PNP SNP rs3790064 (P < 0.05). In contrast, the allele frequencies of 13 SNPs at the AS3MT locus were not significantly different between the cases and controls.
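As a minimal sketch of how a mutant allele frequency such as those quoted above is derived from genotype counts, the following function counts two alleles per subject. The function name and the genotype counts are hypothetical and are not taken from the study's tables.

```python
def allele_frequencies(n_wild_hom, n_het, n_mut_hom):
    """Return (wild-type, mutant) allele frequencies from genotype counts."""
    total_alleles = 2 * (n_wild_hom + n_het + n_mut_hom)
    if total_alleles == 0:
        raise ValueError("no genotyped subjects")
    # Each mutant homozygote carries two mutant alleles, each heterozygote one.
    mutant = (2 * n_mut_hom + n_het) / total_alleles
    return 1.0 - mutant, mutant

# Hypothetical counts for one SNP (illustration only).
wild, mutant = allele_frequencies(n_wild_hom=60, n_het=30, n_mut_hom=10)
print(f"wild-type allele: {wild:.3f}, mutant allele: {mutant:.3f}")
```

Comparing such frequencies between cases and controls would then call for a test on the 2 × 2 table of allele counts, such as the Chi-square test named in the statistical analysis.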

Association between gene polymorphisms and the risk of arsenic-induced skin lesions

The main effects of each genotype on skin lesions after adjustment for age, gender, smoking status, drinking status and ethnicity are shown in Table 2. In addition to the analysis reported above, overall analysis of genotypes showed that AS3MT rs11191439 was not in HWE; the other 13 SNPs in AS3MT showed no robust association with the risk of arsenic-induced skin lesions. However, genotype distributions for the six SNPs in GSTO1, GSTO2 and PNP were significantly different between the cases and controls. For GSTO1, individuals carrying at least one C allele for the rs11191979 polymorphism, at least one A allele or the AA genotype for rs2164624 or at least one A allele for rs4925 showed a significant risk of arsenic-induced skin lesions [OR = 1.382 (95% CI, 1.031–1.851) for rs11191979; OR = 1.367 (95% CI, 1.002–1.866) and OR = 1.379 (95% CI, 1.021–1.862) for rs2164624; and OR = 1.350 (95% CI, 1.009–1.808) for rs4925] compared with homozygous wild-type individuals. For GSTO2, subjects who carried the AG genotype for rs156697 and the AG genotype or at least one G allele for rs2297235 had an increased risk of arsenic-induced skin lesions [OR = 1.877 (95% CI, 1.109–3.177) for rs156697 and OR = 2.161 (95% CI, 1.035–4.513) and OR = 1.350 (95% CI, 1.031–1.851) for rs2297235] compared with homozygous wild-type individuals. Compared with homozygous wild-type individuals, those carrying the GG genotype or at least one G allele for rs3790064 of PNP had an increased risk of arsenic-induced skin lesions [OR = 1.468 (95% CI, 1.018–2.118) and OR = 1.520 (95% CI, 1.070–2.158)]. With regard to years of cigarette smoking, individuals who smoked for 1–35 years had an increased risk of arsenic-induced skin lesions [OR = 1.735 (95% CI, 1.152–2.611)] compared with nonsmoking individuals.
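As a rough illustration of how a carrier odds ratio and its 95% CI are formed, the sketch below computes a crude (unadjusted) OR with a Wald interval from a 2 × 2 table. The function name and the counts are hypothetical. The ORs reported here come from multivariate logistic regression with covariate adjustment, so this simplified calculation would not be expected to reproduce them exactly.

```python
import math

def crude_odds_ratio(case_carriers, case_noncarriers,
                     control_carriers, control_noncarriers, z=1.96):
    """Crude odds ratio with a Wald 95% CI from a 2 x 2 table."""
    cells = (case_carriers, case_noncarriers,
             control_carriers, control_noncarriers)
    if min(cells) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (case_carriers * control_noncarriers) / (
        case_noncarriers * control_carriers)
    # Standard error of ln(OR): root of the summed reciprocal cell counts.
    se_log_or = math.sqrt(sum(1.0 / c for c in cells))
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts (illustration only, not taken from Table 2).
or_value, ci_low, ci_high = crude_odds_ratio(20, 10, 10, 20)
print(f"OR = {or_value:.3f} (95% CI, {ci_low:.3f}-{ci_high:.3f})")
```

An interval whose lower bound exceeds 1, as in the genotype comparisons above, corresponds to a significant increase in risk at the 0.05 level.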

Table 2 Distribution and risk assessment of genotypes for 24 SNPs between cases and controls.

Stratification analyses of AS3MT, GSTO1, GSTO2 and PNP polymorphisms and risk of arsenic-induced skin lesions

We further evaluated the influence of genotypes on arsenic-induced skin lesion risk after stratifying participants by sex, age, years of cigarette smoking and drinking (Table 3). The analysis stratified by sex revealed an evident increased risk of arsenic-induced skin lesions among female subjects carrying at least one G allele for rs3790064 of PNP (OR = 1.63, 95% CI = 1.01–2.61). When stratified by age (<55 or ≥ 55 years old), older subjects (≥55 years old) carrying at least one C allele (TC + CC) or the CC genotype for rs11191979, at least one A allele (GA + AA) or the AA genotype for rs2164624 or at least one A allele (CA + AA) or the AA genotype for rs4925, the AG genotype or at least one G allele for rs156697, the AG genotype or at least one G allele for rs2297235 and the GG genotype or at least one G allele for rs3790064 had a higher susceptibility to skin lesions than homozygous wild-type individuals. When stratified by years of cigarette smoking (0, 1–35 or >35 years), subjects smoking for 1–35 years and carrying the GG genotype or at least one G allele for the rs3790064 polymorphism had a higher susceptibility to skin lesions than homozygous wild-type individuals.

Table 3 Stratification analyses between polymorphisms and arsenic-induced skin lesion risk by sex, age and years of cigarette smoking.

Analysis of haplotype association with arsenic-induced skin lesions risk

Linkage disequilibrium (LD) patterns among AS3MT polymorphisms in cases and controls were investigated to determine haplotype blocks in the study population. Pairwise LD (D′) values between SNPs are indicated by the graphical overview of LD structure shown in Fig. 1. All 13 SNPs were split into two identified haplotype blocks. Overall distributions of haplotypes were not significantly different between the cases and controls (P > 0.05). Adjusted P values for haplotype blocks were also confirmed by the permutation test (Table 4), which revealed no significant differences. LD patterns among GSTO1 polymorphisms were also assessed, with all 5 SNPs split into two identified haplotype blocks. The distribution of haplotype CT between rs4925 and rs11191979 differed significantly between the case group and the control group (P < 0.05), and haplotype CT appeared to confer a high risk of arsenic-induced skin lesions (P = 0.030, OR = 1.377, 95% CI = 1.03–1.84). The cases and controls were also assessed for LD patterns among GSTO2 polymorphisms; the three SNPs grouped into one identified haplotype block. The distribution of haplotype GCG differed significantly between the cases and controls (P < 0.05), and this haplotype appeared to confer a high risk of arsenic-induced skin lesions (P = 0.029, OR = 2.197, 95% CI = 1.08–4.44).

Figure 1

LD patterns and haplotype blocks of cases and controls were defined according to the ‘spine of LD’, as based on each end marker of a block having a D′ value > 0.8. A standard color scheme is used to display the LD pattern, with black for perfect LD (r² = 1), white for no LD (r² = 0) and shades of gray for intermediate LD (0 < r² < 1).
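The two LD measures named in the caption, D′ and r², can be sketched for a pair of biallelic SNPs as follows. This is a simplified illustration that assumes the haplotype frequency is already known; Haploview estimates haplotype frequencies from unphased genotypes, which this sketch does not attempt. The function name and the input frequencies are hypothetical.

```python
def pairwise_ld(p_ab, p_a, p_b):
    """Return (D', r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and
          allele B at locus 2; p_a and p_b: the two allele frequencies.
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("both loci must be polymorphic")
    d = p_ab - p_a * p_b
    if d == 0.0:
        return 0.0, 0.0
    # The largest attainable |D| depends on the sign of D.
    if d > 0:
        d_max = min(p_a * (1.0 - p_b), (1.0 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1.0 - p_a) * (1.0 - p_b))
    d_prime = abs(d) / d_max
    r_squared = d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))
    return d_prime, r_squared

# Hypothetical frequencies: the two alleles always travel together.
d_prime, r_squared = pairwise_ld(p_ab=0.5, p_a=0.5, p_b=0.5)
print(f"D' = {d_prime:.2f}, r^2 = {r_squared:.2f}")
```

Under the block definition in the caption, the end markers of a block would need D′ above 0.8, whereas the gray scale in the figure reflects r².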

Table 4 Estimated AS3MT, GSTO1 and GSTO2 haplotypes in cases and controls.

Relationship between enzymatic activity and polymorphism

Enzyme-linked immunosorbent assay (ELISA) was used to determine the enzymatic activity of GSTO1, GSTO2 and PNP, as presented in Table 5. Compared with homozygous wild-type individuals, subjects who carried at least one mutant allele for GSTO1 SNPs rs11191979, rs2164624, rs2282326 and rs4925 exhibited a significant decrease in GSTO1 enzymatic activity (P < 0.05); in addition, participants carrying at least one mutant allele for GSTO2 SNPs rs156697, rs157077 and rs2297235 showed significant decreases in GSTO2 activity (P < 0.05), and those carrying at least one mutant allele for PNP SNPs rs1760940, rs1713420 and rs3790064 showed significant decreases in PNP activity (P < 0.05).

Table 5 Relationship between enzyme activity and polymorphism.

Discussion

Genetic polymorphisms affect the risk of certain diseases and alter individual susceptibility to disease. Indeed, epidemiological studies have confirmed that there is a great difference in susceptibility to arsenic exposure in people in the same area due to differences in arsenic metabolism in various populations27. For instance, the incidence of skin lesions, bladder cancer and lung cancer was found to be higher in those with arsenic poisoning in northern Chile than in Taiwan, Mexico and India28. Although high-arsenic water is consumed in many countries and regions, only a fraction of the exposed individuals develop skin lesions or other arsenic-induced diseases. This heterogeneity may be caused by genetic differences23,29,30. In this case-control study in an arsenic-exposed population in northwest China, we found a correlation between arsenic-induced skin lesions and polymorphisms of arsenic-associated metabolic enzyme genes. The results showed that polymorphisms in GSTO1 SNPs rs11191979, rs2164624, rs2282326 and rs4925, GSTO2 SNPs rs156697 and rs2297235, and PNP SNP rs3790064 affect risk to arsenic-induced skin lesions. In addition, Hui Shen et al. reported that some chemicals in cigarettes can influence the enzymes involved in methylation processes, especially those involved in the second methylation phase. Moreover, smoking itself may be a pathway for arsenic exposure if the cigarettes contain trace amounts of arsenic31. In this study, we found that those who smoked for 1–35 years were more susceptible to arsenic-induced skin lesions than non-smokers. Although related gene polymorphisms and smoking interact in skin carcinogenesis due to arseniasis, the specific mechanisms by which smoking leads to changes remain poorly understood. Previous studies have shown that effective control of smoking is an important measure to reduce the incidence of arsenic-induced skin lesions.

GSTO1

GSTO1 is a multifunctional enzyme that is involved in many biological processes and plays a critical role in cellular detoxification systems32,33,34. According to the classical model of arsenic metabolism35, GSTO1 catalyzes the rate-limiting step of arsenic biotransformation in vivo. Many studies have investigated the relationship between the polymorphisms rs4925 and rs156697 and susceptibility to diseases, including Alzheimer’s disease, breast cancer and bladder cancer36,37,38,39,40,41. A C → A transition at position 419 in GSTO1 rs4925 results in alteration of the 140th amino acid from alanine (Ala) to aspartic acid (Asp). This change decreases the enzymatic activity of GSTO1 and thus affects arsenic metabolism, which reduces the arsenic biotransformation ability42. In the present study, we analyzed the relationship between GSTO1 polymorphism and its activity and found that participants who carried at least one mutant allele exhibited a significant decrease in GSTO1 enzymatic activity compared with homozygous wild-type individuals. From this perspective, such a polymorphism in GSTO1 may have an effect on susceptibility to certain diseases. The frequency of the rs4925 A allele in our population was 0.190, which was close to the reported frequency in the Chinese Han population, though great variation in this frequency has been found worldwide43,44. Our results suggest that the risk of arsenic-related skin lesions for subjects who carry at least one A allele for rs4925 is 1.36 times higher than in subjects who are homozygous wild-type; thus, the mutant allele (A) may be a risk factor for arsenic-induced skin lesions. This result is consistent with our previous findings in a urinary arsenic metabolism model30. In previous studies, the %DMA of individuals with the rs4925 CA genotype of GSTO1 was significantly reduced compared with that of individuals with the CC genotype. This finding also shows that the A allele is a risk factor for arsenic-induced skin lesions. 
To date, few studies have investigated rs2164624 and rs11191979 polymorphisms, and our results show that the AA genotype and the mutant A allele of rs2164624 and the mutant C allele of rs11191979 increase the risk of arsenic-related skin lesions. In addition, hierarchical regression indicated that the mutant C allele of rs11191979, the mutant A allele of rs2164624 and the mutant A allele of rs4925 are risk factors for arsenic-induced skin lesions in older individuals (≥55 years old). The meta-analysis by Hui Shen et al. indicated that older people might have poorer methylation capacity and thus be susceptible to arsenic-induced damage. In future research, we will investigate the relationship between age and urinary arsenic metabolites. The distribution of haplotype CT between rs4925 and rs11191979 differed significantly between the case and control groups (P < 0.05). The frequency of the mutant A allele of rs2164624 and that of the mutant C allele of rs11191979 were 0.169 and 0.191, respectively, in our study population, values that are close to the reported 0.1897 and 0.1763 global minor allele frequency (MAF) values, respectively.

GSTO2

Multivariate logistic regression analysis revealed that the AG genotype of rs156697 and the AG genotype and mutant G allele of rs2297235 increased the risk of arsenic-related skin lesions. Moreover, hierarchical regression indicated that the AG genotype of rs156697 and the AG genotype and mutant G allele of rs2297235 are risk factors for arsenic-induced skin lesions in older subjects (≥55 years old). GSTO2 and GSTO1 exhibit 64% amino acid sequence identity, and the dehydroascorbate reductase activity of GSTO2 is 70–100 times higher than that of GSTO145. These results suggest that by recycling ascorbate, GSTO2 may play a significant role in protecting against oxidative stress. The rs156697 mutant G allele frequency was 0.28 in our study of 850 Chinese subjects, which was similar to the frequency reported in Turkish (0.219), Japanese (0.216) and Hong Kong Chinese (0.270) populations43,46. The A → G transition at position 424 of the exon in GSTO2 rs156697 changes an asparagine (Asn) to an aspartic acid (Asp), and in our previous study, we reported that the mutant G allele might be a risk factor for arsenic-induced skin lesions30. Analysis of expression patterns showed that GSTO2 expression is low in human tissues, suggesting that GSTO2 may have a critical role in cellular signal transduction47. Other researchers have also studied rs156697 polymorphisms in GSTO243. We found these polymorphisms to be associated with decreased enzyme activity in arsenic metabolism, consistent with previous studies43,48. Regarding the rs2297235 SNP, Ema G. Rodrigues et al. revealed that homozygous wild-type GSTO2 rs2297235 is significantly associated with higher urinary MMA and DMA concentrations in individuals from an arsenic-exposed region in Bangladesh49. Our research shows that the GCG haplotype is a risk factor for arsenic-induced skin lesions. 
As the effects of haplotype on phenotype include synergistic or antagonistic effects of genes and are predominately influenced by certain genes, further analysis is needed due to such complex interactions among genes.

PNP

The classical pathway suggests that AsV undergoes sequential reduction and oxidative methylation after entering the cell via a phosphate transporter. In the human liver, PNP reduces AsV to AsIII25,50, and many epidemiological studies have reported a positive association between PNP polymorphisms and arsenic-induced skin lesions16,25. For example, De et al. found three exonic polymorphisms of PNP (His20His, Gly51Ser and Pro57Pro) to be significantly associated with arsenism19. In addition, Fen Wu indicated that rs17886095, rs17882804 and rs3790064 in PNP are significantly correlated with urinary arsenic metabolites in a population exposed to high levels of arsenic in drinking water in Bangladesh51. In contrast, another study reported no significant correlation between PNP polymorphisms and urinary arsenic metabolism52. In our study, we evaluated three polymorphic loci of PNP and found that SNPs rs1713420 and rs1760940 were not associated with the risk of arsenic-induced skin lesions; conversely, the mutant G allele of rs3790064 did increase the risk of arsenic-related skin lesions in our study population. We found that individuals with the GG genotype had increased odds of skin lesions compared with individuals harboring the rs3790064 AA genotype (OR = 1.468 (95% CI, 1.018–2.118)); individuals carrying at least one mutant G allele of rs3790064 also displayed increased odds of skin lesions compared with individuals harboring the rs3790064 AA genotype (OR = 1.520 (95% CI, 1.070–2.158)). Moreover, hierarchical regression indicated that the mutant G allele of rs3790064 is a risk factor for arsenic-induced skin lesions in those who have smoked for 1–35 years, with these individuals being more susceptible to arsenic-induced skin lesions than non-smokers. The mutant G allele of SNP rs3790064 is also a risk factor for arsenic-induced skin lesions in older subjects (≥55 years old) and females. 
However, Haploview analysis did not identify a block with significant LD among the three SNPs (rs1713420, rs1760940, rs3790064). The association between PNP polymorphisms and arsenic metabolism is controversial52,53. Therefore, further studies should continue to explore the function of PNP in arsenic metabolism and the relationship between PNP polymorphisms and arsenic-induced skin lesions using a larger arsenic-exposed population.

AS3MT

AS3MT, located in the 10q24.32 region of chromosome 10, encodes the main methyltransferase involved in arsenic metabolism and affects an individual’s efficiency in detoxifying ingested arsenic54. Gordon Gong et al. discovered that the risk of coronary heart disease and hyperlipidemia was higher for subjects with the rs10748835 polymorphism who carry the AG genotype than for subjects who carry the AA genotype55. Several in vitro and in vivo studies have revealed that AS3MT is essential for oxidative methylation of trivalent arsenic during the arsenic biotransformation process, which highlights the importance of the AS3MT enzyme in converting inorganic arsenic metabolites to their corresponding methylated products20,56. Genetic variations in AS3MT can lead to individual differences in inorganic arsenic metabolites, which is one of the main reasons for the different susceptibilities to endemic arsenic poisoning within a population exposed to high arsenic levels57. Regardless, the results of this study showed that the 13 polymorphic loci of an intron in AS3MT (rs1046778, rs10748835, rs10883790, rs11191438, rs11191442, rs11191454, rs12416687, rs3740390, rs3740392, rs3740393, rs7085104, rs7085854 and rs7098825) were not associated with arsenic-induced skin lesions; overall distributions of haplotypes were not significantly different between cases and controls (P > 0.05). In a previous study30, we investigated the relationship between the rs10748835 (intron10-A35991G) polymorphism and urinary arsenic metabolism and found that this locus was not associated with the risk of endemic arsenism from contaminated drinking water. Although several studies have shown that AS3MT gene polymorphisms do not cause differences in disease susceptibility21,36,58, we believe that this result does not reflect the significance of the AS3MT gene in arsenic metabolism, possibly due to differences in populations and races.

In conclusion, three typical arsenism areas in China were selected for study, and the samples were representative. Variants of GSTO1, GSTO2 and PNP render individuals exposed to high-dose inorganic arsenic in northwest China susceptible to developing arsenic-induced skin lesions. We did not find a significant association between the 13 SNPs at the AS3MT locus and arsenic-induced skin lesions. In future research, we will study the effects of the AS3MT gene on individual urinary arsenic metabolism based on urinary arsenic methylation metabolism levels in populations exposed to high arsenic levels.

Limitation

One limitation of our study is that we did not investigate the association between gene polymorphisms and urinary arsenic speciation. However, we suggest that a larger sample size is needed, owing to the various metabolites involved.

Materials and Methods

Study sites and subject selection

According to the Endemic Disease Control Center of the Chinese Center for Disease Control and Prevention and previous knowledge of the situation in areas with endemic arsenism in China, three adjacent districts were selected in northwest China (Gansu Province, Shanxi Province and the Inner Mongolia Autonomous Region). The three provinces are typical arsenism areas in China. Previous research has described the epidemiological investigation and sample collection in Gansu Province in detail30. Hanggin Rear Banner of the Bayannur League in the Inner Mongolia Autonomous Region, Tianzhen County in Shanxi Province, and Tuoketuo County were selected as study areas; water screening for the presence of high levels of arsenic was conducted in these areas in 2005 by a national program, and a complete list of arsenic exposure was generated. In the above regions, the arsenic levels were 0–0.5102 mg/L in hand-pumped well water and 0.1670 mg/L in central well water. In total, 1412 subjects were investigated, and 1124 subjects who had explicit arsenic levels were selected. Of these 1124 subjects, 1061 were exposed to high levels of arsenic in drinking water (>0.01 mg/L). SNPs were assessed in 1040 DNA samples (samples from 21 subjects for whom whole blood samples were missing were excluded). The epidemiological survey and sample collection of the above two groups were described in previous articles59,60.

In the present study, we recruited 850 subjects for sample collection from arsenism areas in Gansu Province, Shanxi Province and the Inner Mongolia Autonomous Region. The subjects included 331 cases and 519 controls. Cases that lacked SNP detection results or did not meet the inclusion criterion were excluded. The inclusion criterion for the cases was the presence of skin lesions, which is the hallmark sign of arsenism, as diagnosed by the national standard (Standard of Diagnosis for Endemic Arsenism, WS/T-211-2001). For the controls, we recruited individuals without arsenic-induced skin lesions. All subjects had similar lifestyles, social backgrounds and eating habits. All subjects provided informed consent before the study, and all protocols in this study were approved by the Ethics Committee of Harbin Medical University. All methods were performed in accordance with relevant guidelines and regulations, and the Institutional Review Board of Harbin Medical University approved all experimental protocols.

Sample collection

Migrant workers who had returned from the city and villagers who had experienced fever, infection or autoimmune diseases or who had recent occupational exposure to arsenic (within 1 month) or X-rays (within 6 months) were excluded from the blood sample collection. For the arsenism areas in Gansu Province, approximately 3 mL of fasting venous blood was drawn in the morning from villagers who were willing to provide blood samples; these samples were collected in ethylenediaminetetraacetic acid (EDTA) anticoagulation tubes and mixed. After collection, the specimens were immediately transported to the laboratory in a −18 °C vehicle refrigerator and then stored at −80 °C. For the arsenism areas in Shanxi Province and the Inner Mongolia Autonomous Region, the blood samples were divided into two groups: for one group, the samples were centrifuged, and serum was collected as a back-up; the samples for the second group consisted of anticoagulated blood. Water samples were collected in 50-mL acid-washed tubes rinsed 3 times with water before sampling and stored at room temperature. Household drinking water samples were collected if a subject’s arsenic exposure history was not clear. Arsenic concentrations in water were determined by atomic absorption spectrophotometry (AA-6800, Shimadzu Co., Kyoto, Japan). Arsenite (iAsIII), arsenate (iAsV), MMAV and DMAV in the 850 urine samples were measured using high-performance liquid chromatography (HPLC) for separation and hydride generation atomic fluorescence methods for detection. Before quantification, the urine samples were thawed naturally and centrifuged for 10 min at 12,000 rpm, and the supernatant was filtered through a 0.45-μm membrane.

Genotyping of SNPs

QIAamp51106 DNA Extraction Kit (Qiagen, Germany) was used to extract DNA from the 1040 whole blood samples. Sample DNA (10 ng) was amplified by polymerase chain reaction (PCR) according to the manufacturer’s recommendations, and SNP genotyping was performed using a custom-by-design 2 × 48-Plex SNPscan™ Kit (Cat#:G0104; Genesky Biotechnologies, Inc., Shanghai, China). This kit was developed according to patented SNP genotyping technology by Genesky Biotechnologies Inc. based on double ligation and multiplex fluorescence PCR, as previously described61,62. To validate the genotyping accuracy using SNPscan™ Kit, eight of 92 SNP loci were analyzed by single-nucleotide extension using Multiplex SNaPshot Kit (Applied Biosystems Inc., Foster City, CA, USA) for 48 samples, with greater than 99% concordance rates. SNP genotyping was performed using the TaqMan SNP genotyping assay (Applied Biosystems Inc). The genotyping success rates were more than 99% in Stage II samples, and the concordance rates were more than 99% based on 5% duplicate samples.

Enzyme Linked Immunosorbent Assay (ELISA)

In every sample, GSTO1, GSTO2, and PNP activities were detected using commercial enzyme-linked immunosorbent assay (ELISA) kits (Kete Biological Technology Co., Ltd, Jiangsu, China) following the manufacturer’s instructions. Briefly, 50 μl of standard solutions or DNA samples was added to 96-well plates and incubated for 30 min at 37 °C. The samples were washed 5 times, and the conjugate reagent was added and incubated for 30 min at 37 °C. The samples were again washed 5 times, and chromogen solutions A and B were added and incubated for 10 min at 37 °C. Absorbance at 450 nm was determined using a Microplate Reader (BioTek, USA).

Statistical analysis

The statistical analysis was performed using SPSS (version 13.01 S; Beijing Stats Data Mining Co. Ltd.). We performed an independent t-test to calculate significant differences in age between the cases and controls. Differences in the distribution of categorical data (gender, smoking status, drinking status and ethnicity) were tested using the Chi-square test. The genotype and allele frequencies of each site in GSTO1, GSTO2, AS3MT and PNP were calculated, and HWE was assessed in the controls using the Chi-square test. We used bivariate unconditional logistic regression analysis to estimate odds ratios (ORs) and 95% confidence intervals (CIs) between the cases and controls for the skin lesion risk with GSTO1, GSTO2, AS3MT and PNP gene polymorphisms. Multivariate unconditional logistic regression analysis with adjustment for age, gender, smoking status, drinking status and ethnicity was performed to calculate adjusted ORs and 95% CIs. The total concentration of the four different forms of arsenic (iAsIII + iAsV + MMAV + DMAV) was considered as the approximate total arsenic (tAs) value. Normality and homogeneity tests of variance were performed for all urinary arsenic determination indices; indices satisfying the described conditions are indicated with an asterisk (*), which marks the indices analyzed with a parametric test. The t-test was used to compare cases and controls, and the t statistic is reported. The statistical tests were two-tailed probability tests with a test level of α = 0.05, and P < 0.05 was considered significant.
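The HWE check described above can be sketched as a Chi-square goodness-of-fit test with one degree of freedom on the genotype counts of the controls. This is a minimal illustration rather than the SPSS procedure used in the study; the function name and the counts are hypothetical.

```python
import math

def hwe_chi_square(n_wild_hom, n_het, n_mut_hom):
    """Chi-square goodness-of-fit test (1 df) for Hardy-Weinberg equilibrium."""
    n = n_wild_hom + n_het + n_mut_hom
    if n == 0:
        raise ValueError("no genotyped subjects")
    p = (2 * n_wild_hom + n_het) / (2 * n)  # wild-type allele frequency
    q = 1.0 - p                             # mutant allele frequency
    observed = (n_wild_hom, n_het, n_mut_hom)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical control genotype counts (illustration only).
chi2, p_value = hwe_chi_square(n_wild_hom=60, n_het=30, n_mut_hom=10)
print(f"chi-square = {chi2:.2f}, P = {p_value:.4f}")
```

With α = 0.05, a SNP whose P value falls below 0.05 in the controls would be treated as deviating from HWE and excluded from further analysis.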

LD analysis and haplotype reconstruction were performed using Haploview 4.2. Common haplotypes with frequencies of >0.01 were compared between cases and controls. The P value was adjusted by a permutation test.
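The idea behind a permutation-adjusted P value can be sketched as follows for a single haplotype: case and control labels are shuffled repeatedly, and the observed frequency difference is compared with the differences obtained by chance. This is a simplified stand-in, and Haploview's own permutation procedure differs in detail; the function name, the seed and the data are hypothetical.

```python
import random

def permutation_p_value(case_flags, control_flags,
                        n_permutations=2000, seed=0):
    """Two-sided permutation P value for a haplotype frequency difference.

    Each flag is 1 if a chromosome carries the haplotype of interest, else 0.
    """
    rng = random.Random(seed)
    n_cases = len(case_flags)
    pooled = list(case_flags) + list(control_flags)

    def freq_diff(values):
        cases, controls = values[:n_cases], values[n_cases:]
        return abs(sum(cases) / len(cases) - sum(controls) / len(controls))

    observed = freq_diff(pooled)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign case/control labels at random
        if freq_diff(pooled) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_permutations + 1)

# Hypothetical carrier flags (illustration only).
cases = [1] * 40 + [0] * 60
controls = [1] * 25 + [0] * 75
print(f"permutation P = {permutation_p_value(cases, controls):.4f}")
```

Because the null distribution is built from the data themselves, this approach does not rely on large-sample approximations, which is one reason it is commonly used for haplotype comparisons.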