Main

Sulphatases are members of a highly conserved gene family and share extensive sequence homology and a high degree of structural similarity.1,2 Sulphated glycosaminoglycans, glycolipids, glycoproteins and hydroxysteroids are hydrolysed by sulphatases, each of which has exquisite specificity towards its individual substrate in vivo.3,4 A subset of sulphatases, each with very different natural substrates, is active in vitro against a common set of small aromatic substrates, hence the name arylsulphatases. This functional correlation reflects a high degree of predicted amino acid sequence similarity along the entire length of the enzymes. This amino acid sequence conservation strongly suggests that the sulphatases are members of an evolutionarily conserved gene family sharing a common ancestor, which has undergone duplication events in several species.1,5 A more recent duplication has generated a cluster of sulphatase genes located on the distal short arm of the human X chromosome, in Xp22.3.6 Twelve human sulphatases have been identified and, on the basis of their subcellular localisation, they can be divided into two main categories: those acting at acidic pH sharing a lysosomal localisation, and those with a neutral pH optimum found in the endoplasmic reticulum and Golgi apparatus.2 Most of these sulphatases have been identified by linking a clinical condition to a deficient function of one of these enzymes.7,8,9 Additional evidence that all sulphatases are functionally related comes from the study of a rare and intriguing genetic disease named multiple sulphatase deficiency (MSD) in which all known sulphatases are deficient.2 A major breakthrough in the understanding of the function of sulphatases has recently emerged from the observation that sulphatases undergo a common and unique co- or post-translational modification. 
This occurs in the endoplasmic reticulum and involves the oxidation of a conserved cysteine residue which is required for catalytic activity and is defective in MSD.10 All these data underline the importance of the sulphatase family and suggest that the identification of additional sulphatase genes creates novel opportunities to study human metabolism and disease mechanisms.

Materials and methods

Plasmids and constructs

ARSG expression vectors were obtained by cloning both the full-length and the coding sequence of ARSG cDNA into EcoRI-digested pcDL and EcoRI/Sal-digested pmt 21 vectors.11 ARSG (Ala) and ARSG (Pro) (see below) were both cloned in the two different vectors.

Cos7 transfection and protein extract

Cos7 cells were grown in Dulbecco's modified Eagle's medium (DMEM) supplemented with 10% FCS, 100 μg penicillin/ml, and 100 μg streptomycin/ml, at 37°C in a 5% CO2 atmosphere. Ten micrograms of wild-type ARSG cDNA expression vector constructs were introduced into Cos7 cells by electroporation using a BioRad Gene Pulser apparatus, and the cells were seeded in 75-cm2 flasks. The cells were harvested 24–72 h after transfection and resuspended in extraction buffer (150 mM NaCl; 100 mM Tris-HCl, pH 7.5; and 1% Triton X-100), and cell lysates were clarified by centrifugation at 13 000 rpm to remove cell debris. The supernatant was mixed with SDS–PAGE sample buffer, boiled, and loaded on the gel.

Production of polyclonal antibody

The region of amino acids 389–525 of ARSG was fused to the six-histidine tag bacterial expression vector pQE (Qiagen) and was produced in DH5α cells, after induction with isopropyl-β-D-thiogalactoside (IPTG). The His-tagged protein was purified on a NiNTA agarose column (Qiagen) and was used to immunise rabbits. ARSG antiserum was precipitated with ammonium sulphate and the antibody was purified on a protein A-Sepharose column.

Immunoblot

Ten micrograms of soluble cellular protein was boiled for 5 min in sample buffer, electrophoresed through 10% SDS–PAGE, and electroblotted onto PVDF membrane (Amersham). To inhibit nonspecific binding, the membrane was treated with 5% dry milk in Tris-buffered saline containing Tween 20 (TTBS: 20 mM Tris-HCl, pH 7; 50 mM NaCl; and 0.1% Tween 20). Anti-ARSG antibody was used at a 1 : 700 dilution in TTBS. Antibody binding was visualised with a secondary anti-rabbit IgG antibody conjugated with peroxidase.

Immunofluorescence

Immunofluorescence was performed on paraformaldehyde (PFA)-fixed transfected Cos7 cells. Cells were permeabilised with 0.2% Triton X-100, blocked with porcine serum, and incubated with anti-ARSG antibody (1 : 700). Staining was obtained after incubation with secondary fluorescein isothiocyanate (FITC)- or tetramethylrhodamine isothiocyanate (TRITC)-conjugated isotype-specific antibody. Specific ER staining was also obtained by double staining with a commercially available anti-ER polyclonal antibody, ERAB (K-20) (Santa Cruz Inc). This antibody recognises a peptide mapping near the carboxy terminus of ERAB (endoplasmic reticulum amyloid beta-peptide-binding protein).

Endoglycosidase H and F treatment

For glycosylation analysis, protein samples derived from transfected Cos7 cells were digested overnight with endoglycosidase H and N-glycosidase F (Boehringer Mannheim) in an appropriate buffer (50 mM phosphate buffer, pH 7.4; 50 mM EDTA; 1% NP40; 0.1% SDS; 1% β-mercaptoethanol; 4 μg of aprotinin per ml; 2 μg of leupeptin per ml; 2 μg of pepstatin per ml) and were analysed by SDS–PAGE.

Enzyme assay

4-MU sulphate assay

Cos7 cells were transfected with 10 μg of either ARSG (Ala) or ARSG (Pro) cDNA constructs by electroporation and harvested by trypsinisation 72 h after transfection for enzyme assay. Cells were resuspended in 0.1 M Tris-HCl (pH 7.5), 1% Triton X-100, and 150 mM NaCl. Incubation mixtures were prepared with buffers of different pH, from pH 4 to 9. The enzymatic assay was run at 37°C for 2 h, the reaction was stopped by adding 1 ml of glycine-carbonate buffer (pH 10.7), and fluorescence was determined on a Hoefer TKO 100 fluorometer.

P-nitrocatechol sulphate assay

Cos7 cells were transfected with 10 μg of ARSG cDNA constructs by electroporation and harvested by trypsinisation 72 h after transfection for enzyme assay. Cells were resuspended in 0.1 M Tris-HCl (pH 7.5), 1% Triton X-100, and 150 mM NaCl. Incubation mixtures were prepared with buffers at two different pH values, pH 4.9 and 7.5. The enzymatic assay was run at 37°C for 2 h, the reaction was stopped by adding 700 μl of NaOH (0.64 N), and absorbance was determined at 515 nm on a spectrophotometer.

mRNA analysis

Northern blots containing human RNAs from several tissues were purchased from Clontech and hybridised with a cDNA fragment of 885 bp as a probe, using the conditions recommended by the manufacturer (Clontech). Washing conditions were 2×SSC, 0.05% SDS at 58°C (Wash Solution I) and 0.1×SSC, 0.1% SDS at room temperature (Wash Solution II). RT–PCR experiments were performed with oligonucleotide primers from ARSG that amplify a 360 bp product on samples from different murine tissues. PCR was carried out for 40 cycles. Annealing temperature was 60°C for ARSG primers.

Results

Identification of the ARSG gene

Bioinformatic searches of the EST and nr (non-redundant) databases identified a novel cDNA sequence that shares a high degree of homology with all sulphatases and in particular with arylsulphatases, hence the name Arylsulfatase G (ARSG). This gene encodes a 525 amino-acid protein that shows the highest level of homology (37% identity and 50% similarity) with Arylsulfatase A (Figure 1). Sequence alignment of this novel sulphatase with ARSA and ARSB revealed a high degree of amino-acid similarity along the entire length of the protein, particularly in the amino-terminal region.12,13 Moreover, the 10 residues that form the catalytic site of the protein are strongly conserved (Figure 1). The ARSG cDNA (KIAA1001) was generated by the large transcript identification project of the Kazusa Institute, Japan. The availability of mapped ESTs and of genomic sequences corresponding to the ARSG transcripts allowed us to map this gene to 17q23-q24 and to define the intron-exon boundaries of the 11 exons that compose the gene (Table 1). The gene spans the genomic region from 72 287 512 to 72 383 112 (95 600 bp). Comparison of the sequence of the KIAA1001 clone with the available database sequences revealed a nucleotide change. Codon 501, GCA (Ala) in the genomic and EST sequences (BG994996, BG744786, BE740712, AV654375, BM708401), is replaced by CCA (Pro) in the KIAA1001 clone. To test whether this change affects the localisation and/or the activity of the protein, we mutagenised the KIAA1001 clone. All the experiments described below were performed with both the original (ARSG Ala) and the mutated (ARSG Pro) clones, and no differences in the results were observed. Moreover, through protein–protein BLAST (blastp) searches, we identified the murine ARSG orthologue, which shows 87% identity with the human protein. In agreement with the human genomic and EST sequences, mouse ARSG has an alanine at the position corresponding to human residue 501.
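The codon 501 difference and the quoted gene span can be checked with a short sketch. The codon table below is only a partial excerpt of the standard genetic code (the alanine and proline codons), and the function and variable names are illustrative assumptions rather than part of the analysis described here.

```python
# Partial excerpt of the standard genetic code: alanine and proline codons only.
CODON_TABLE = {
    "GCA": "Ala", "GCC": "Ala", "GCG": "Ala", "GCT": "Ala",
    "CCA": "Pro", "CCC": "Pro", "CCG": "Pro", "CCT": "Pro",
}


def translate_codon(codon):
    """Return the three-letter amino acid code for an Ala or Pro codon."""
    return CODON_TABLE[codon.upper()]


def span_length(start, end):
    """Span as the difference between the two coordinates, as quoted in the text."""
    return end - start


genomic_codon = "GCA"  # codon 501 in the genomic and EST sequences
clone_codon = "CCA"    # codon 501 in the KIAA1001 clone

# 1-based codon positions at which the two codons differ.
differences = [
    i + 1
    for i, (a, b) in enumerate(zip(genomic_codon, clone_codon))
    if a != b
]

print(translate_codon(genomic_codon), translate_codon(clone_codon), differences)
print(span_length(72_287_512, 72_383_112))
```

A single change at the first codon position (G to C) is enough to switch the encoded residue from alanine to proline, and the difference between the two coordinates reproduces the 95 600 bp figure.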

Figure 1

Multiple alignment of human arylsulphatases (A, B, G). Dark grey boxing indicates residues that are identical in at least two out of the three proteins; light grey indicates residues that are conserved in at least two out of the three proteins. Amino acids that are conserved in the catalytic pocket of all sulphatases are indicated with black arrows. The grey arrow indicates the conserved cysteine, contained in the consensus signature that undergoes conversion to 2-amino-3-oxopropionic acid in all eukaryotic sulphatases.

Table 1 Exon-intron boundaries of ARSG gene

Characterisation of ARSG in transfected Cos7 cells

We raised a polyclonal antibody against the C-terminal portion of the ARSG protein. The region of amino acids 389–525 was fused to a six-histidine tag in a bacterial expression vector, expressed in DH5α cells, and used to immunise rabbits. The antiserum was raised against a region where the sulphatases share the lowest degree of homology, to prevent cross-reaction with other members of the family. This antiserum (anti-ARSG Q57) was used in Western blotting analysis of protein extracts from Cos7 cells transfected with tagged and non-tagged vectors, and a band of 70 kDa was identified (Figure 2). The amino-acid sequence of the predicted ARSG protein reveals the presence of four putative N-glycosylation sites (asparagine residues 117, 215, 356, and 497) (Figure 2). Treatment with endoglycosidase H and F reduced the size of the 70 kDa polypeptide by 8 kDa, suggesting that the increase in molecular size observed in the maturation of ARSG is due to N-glycosylation (Figure 3). Assuming that the average mass of an oligosaccharide is 2 kDa, all four N-glycosylation sites may be utilised, accounting for the 8 kDa shift observed in SDS–PAGE.
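The reasoning behind the putative N-glycosylation sites and the 8 kDa shift can be sketched as follows. The scan uses the common N-X-S/T sequon rule (X being any residue except proline), which is an assumption about how such sites are usually predicted and not a description of the method used here. The toy sequence is hypothetical and is not the ARSG sequence; the 2 kDa per oligosaccharide value is the average assumed in the text.

```python
import re
from typing import List

# N-glycosylation sequon: Asn, then any residue except Pro, then Ser or Thr.
# The lookahead lets overlapping sequons be reported separately.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(protein: str) -> List[int]:
    """Return 1-based positions of Asn residues that start an N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(protein)]


def expected_shift_kda(n_sites: int, kda_per_glycan: float = 2.0) -> float:
    """Mass shift expected if every site carries one average oligosaccharide."""
    return n_sites * kda_per_glycan


# Hypothetical toy sequence, NOT ARSG: N3 and N13 fit the rule, N8 is followed by Pro.
toy = "MANGSLLNPTAANKTQ"
print(find_sequons(toy))

# Four occupied sites at an assumed 2 kDa each, applied to the 70 kDa band.
shift = expected_shift_kda(4)
print(shift, 70 - shift)
```

With four sites occupied, the expected shift is 8 kDa, which takes the 70 kDa band to 62 kDa and is consistent with the deglycosylated form described for Figure 3.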

Figure 2

Western blotting analysis: soluble cellular proteins from cell extracts were analysed by SDS–PAGE followed by immunoblot with anti-ARSG antibody. A band of 70 kDa was detectable in Cos7 cells transfected with myc-ARSG (lane 1) and with ARSG (lane 2).

Figure 3

Glycosylation of the ARSG gene product. Cos7 cells were transfected with myc-ARSG, cell extracts were subjected to endoglycosidase H and N-glycosidase F treatment and were analysed by SDS–PAGE, together with the untreated extracts. After endoglycosidase treatment, the 70 kDa polypeptide was converted into the 62 kDa form.

Intracellular localisation of ARSG

Several sulphatases identified thus far are lysosomal proteins. Non-lysosomal sulphatases include STS, which has a microsomal localisation; ARSD and ARSF, which are located in the endoplasmic reticulum; and ARSE, which is located in the Golgi apparatus.1,2 The subcellular localisation of ARSG (Ala) in transfected Cos7 cells was established by immunofluorescence using a myc-tagged construct, recognised by anti-myc antibody. The use of a specific anti-ARSG antibody confirmed this result and revealed a typical reticular distribution of the protein. The reticular localisation was also confirmed by double staining with a specific anti-ER antibody, ERAB (K-20) (Figure 4). The same results were obtained with ARSG (Pro) (data not shown).

Figure 4

Localisation of ARSG in transfected cells. Cos7 cells were transfected with myc-ARSG (A, B) and with ARSG (C–F). (A, B) Staining using anti-Myc antibody, followed by FITC anti-mouse antibody, and anti-ARSG antiserum, followed by TRITC anti-rabbit antibody, respectively. (C, D) Staining using purified anti-ARSG antibody, followed by FITC anti-rabbit antibody. (E, F) Double staining immunofluorescence using anti-ARSG antibody, followed by FITC anti-rabbit antibody, and anti-ER antibody, followed by TRITC anti-goat antibody, respectively.

Enzymatic analysis

To test for a putative arylsulphatase activity of the protein, ARSG (Ala) and ARSG (Pro) cDNA constructs were transiently transfected into Cos7 cells by electroporation. After 72 h, the total cell extracts were assayed at different pH values (from pH 4 to 9) for their ability to hydrolyse the sulphate group from 4-methylumbelliferyl (4-MU) sulphate. As a control we used Cos7 cells transfected with an ARSE cDNA construct.11 No arylsulphatase activity was detectable in the extracts obtained from cells transfected with the ARSG constructs (Figure 6). These results indicate that, under these conditions, the ARSG gene product is unable to hydrolyse 4-MU sulphate. The enzymatic assays were also performed using p-nitrocatechol sulphate, another artificial substrate. We obtained the same result, with no activity toward this substrate (data not shown).

Figure 6

Biochemical study of ARSG expressed in COS7 cells. Evaluation of ARSG arylsulphatase activity on 4-MUS at different pH values.

ARSG expression pattern

Northern blot analysis revealed a low-level, ubiquitous expression pattern of human ARSG mRNA. Pancreas, kidney and brain showed slightly higher levels of expression (data not shown). Detection of murine ARSG mRNA by RT–PCR allowed us to investigate the expression pattern of this gene in mouse. Reverse-transcribed RNA samples from different tissues were amplified by PCR using oligonucleotide primers generated from the murine cDNA sequence. A ubiquitous expression pattern was observed, similar to that detected in human tissues (Figure 5). In addition to RT–PCR and Northern blot analysis, we performed RNA in situ hybridisation experiments. Analysis of sagittal and coronal sections of E12.5 and E14.5 mouse embryos revealed a low-level, ubiquitous expression pattern, confirming the Northern blotting and RT–PCR results (data not shown).

Figure 5

RT–PCR analysis of murine ARSG gene. Oligonucleotide primers from murine ARSG cDNA were used for the amplification of reverse-transcribed total RNA from different mouse tissues. ARSG primers amplify a 360 bp product.

Discussion

In most instances the identification of the sulphatase genes resulted from the biochemical characterisation of a known human disease.7,8,9 More recently, we have identified a cluster of sulphatase genes by positional cloning.11,14 Here we describe the identification of a novel sulphatase gene, ARSG, by a bioinformatic approach using the DBWatcher tool. This tool allows periodic and systematic screening of the expressed sequence and genomic databases using sequences from genes or proteins of interest as queries. This strategy does not rely on knowledge of the gene product or of its substrate specificity and enzymatic properties. In addition, no genetic disease related to an enzymatic defect has been linked to the region where the gene maps; consequently, it is difficult to predict the function of the protein. The ARSG protein shares a high degree of homology with all sulphatases and in particular with arylsulphatases. Sequence alignment of the putative protein product of this new gene with other sulphatases, in particular with ARSA and ARSB, has revealed a high degree of amino acid similarity. The three sequences are very similar along the entire length of the protein, particularly in the amino-terminal region, where the consensus sulphatase signature is located (Figure 1). The active site of sulphatases has been characterised and shown to display unique features.12,13 A modified cysteine residue and a metal ion are located at the base of a substrate-binding pocket.10 The amino-acid residues conserved throughout the sulphatase family play a role in stabilising the calcium ion and the sulphate ester in the active site.15 To try to understand the physiological role of this new gene we started a biochemical characterisation of the ARSG protein. We detected two molecular forms of ARSG, a 62 kDa and a 70 kDa protein, in cell extracts after transfection.
The 62 kDa polypeptide is converted into a 70 kDa form, which is likely to represent the mature enzyme, with the increase in size due to N-glycosylation. The amino acid sequence contains four potential N-glycosylation sites (asparagine residues 117, 215, 356, and 497). Assuming that the average mass of an oligosaccharide is 2 kDa, all four N-glycosylation sites are potentially utilised, accounting for the 8 kDa shift observed in SDS–PAGE. ARSA and ARSB are able to hydrolyse sulphated artificial substrates containing a phenolic ring (arylsulphatase activity), such as p-nitrocatechol sulphate or 4-methylumbelliferyl sulphate (4-MU sulphate), and are lysosomal in their subcellular localisation.1 Surprisingly, ARSG does not show activity towards these substrates under the conditions tested and localises to the endoplasmic reticulum. Like ARSG, a previously identified sulphatase, ARSD, is highly homologous to other arylsulphatases but does not show arylsulphatase activity.14 In conclusion, the identification of this additional sulphatase gene creates novel opportunities to study human metabolic pathways. Further studies are needed to characterise the biological role of ARSG. To this end we are planning to generate mice deficient for this enzyme by a gene targeting approach. This strategy will allow us to study, in the animal model, the pathological consequences of the enzyme deficiency and to clarify the metabolic role of this new sulphatase.