Introduction

Protein disulfide isomerase-like protein of the testis (PDILT) is a member of the Protein Disulfide Isomerase (PDI) family, which is responsible for catalyzing the formation of native disulfide bonds in the endoplasmic reticulum (ER). This multifunctional and diverse family consists of approximately 20 proteins, each containing at least one thioredoxin-like domain1. PDIA1, the founding member and the most studied protein in the PDI family, displays the abbxa′ domain architecture, where a and a′ are catalytic domains, b and b′ are non-catalytic and x denotes a conserved linker (Fig. 1a). PDILT is a 67-kDa protein that resides exclusively in the ER of male germ cells. Discovered only a few years ago, PDILT is distinguished from other PDI members by its tissue-specific distribution and the absence of typical catalytic motifs2. Full-length PDILT shares 27% sequence identity with PDIA1, and the a′ domain is the most strongly conserved between these proteins, with 48% identity2. However, PDILT differs from PDIA1 and other PDIs in several intriguing aspects.

Figure 1

Crystal structure of the b′ domain of PDILT.

(a) Domain organization of human PDILT and PDIA1. Catalytic motifs are shown in the a and a′ thioredoxin-like domains. (b) The structure of the PDILT b′ domain displays a typical thioredoxin fold with a central five-stranded β-sheet (β1–β5) and four flanking α-helices (α1–α4). An overlay of chains A (green) and C (magenta) shows divergence in the C-terminal region (C-term) of the construct. (c) Structure-based sequence alignment of the b′ domains of several members of the PDI family. Amino acids forming the hydrophobic pocket in the PDILT structure and conserved between aligned proteins are highlighted in gray. Residues of ERp57 involved in the interaction with calreticulin (CRT) and calnexin (CNX) are highlighted in cyan. Secondary structure corresponds to the PDILT structure.

The most striking difference between PDIA1 and PDILT lies in the a and a′ domains. Instead of the classical redox-active CXXC motif, PDILT possesses SKQS and SKKC motifs in the a and a′ domains, respectively, and thus lacks the catalytic cysteines needed for oxidoreduction (Fig. 1a). The SXXC motif, also present in TMX2, is highly conserved in PDILT among different species. This motif is also strictly conserved among bacterial thioredoxin proteins, but its functional implications remain unknown3. The absence of catalytic cysteines suggests that PDILT primarily functions as a chaperone, which is one of the principal functions of PDIA1. PDIA1 directly binds to unfolded proteins via its b′ domain4.

PDILT does not display oxidoreductase activity in vitro and interacts with the Δ-somatostatin peptide and the BPTI protein in non-native conformation, so it is likely that PDILT operates as a folding assistant for spermatid-specific proteins5. PDILT is highly glycosylated, is the only PDI-family member under developmental control6,7,8 and contains a particularly long C-terminal extension of 90 residues with no known homology to other proteins2.

Recent studies have yielded new hints about the role that PDILT plays in the testis. Interactions of PDILT with binding partners such as the lectin chaperones calreticulin-3 (CRT3) and calmegin (CMG), which are also specifically expressed in testis, suggest that it acts in specific complexes that catalyze the folding of spermatogenesis-specific proteins in the late stages of maturation and differentiation of the spermatids5,9. CMG and CRT3, along with PDILT, are only expressed in the most mature germ cells and their absence causes infertility9. Additionally, PDILT has been reported to form a chaperone complex with ADAM3 (a disintegrin and metallopeptidase domain 3). This protein is important in fertility and requires the assistance of at least seven other genes in order to be delivered properly onto the sperm surface9,10,11. This suggests that PDILT plays a role in the folding pathway of specific spermatogenesis proteins and enzymes required for the proper functioning of the sperm. PDILT possesses the b′ domain, which is associated with binding of substrates and protein partners in the PDI family members12, but the molecular details of PDILT interactions with protein substrates remain elusive.

Here, we report the high-resolution crystal structure of the non-catalytic b′ domain of human PDILT determined at 2.0 Å. The structure identifies a hydrophobic pocket that is conserved in a number of PDI family members and suggests a preference for binding aromatic residues in protein substrates.

Results

Structure of the b′ domain

To better understand how PDILT interacts with its substrates, we obtained crystals of the b′ domain of human PDILT (residues 258–386) that diffracted to 2.0 Å using synchrotron radiation. The three-dimensional structure was solved by molecular replacement using the b′ domain of human PDIA1 (PDB accession number 3BJ5), which shares 32% identity with the PDILT b′ domain sequence, as a search model. A summary of data collection and refinement statistics is given in Table 1.

Table 1 Data collection and refinement statistics

The structure of the PDILT b′ domain displays a thioredoxin-like fold comprising a five-stranded β-sheet with the β1β3β2β4β5 topology, flanked by four helices (Fig. 1b). All strands are parallel, except strand β4, which is antiparallel. There are four molecules (chains A–D) in the asymmetric unit, which are very similar to each other: all chains can be superimposed using backbone Cα atoms with an RMSD ranging from 0.1 to 0.8 Å. However, one feature divides them into two groups, namely the position of the x-linker (residues 369–386 of the construct). Chains A and B have the x-linker pointing away from the thioredoxin domain, while the x-linker folds towards the β-sheet in chains C and D (Fig. 1b). The x-linker residues Q375-P381 were not observed in the electron density map of chains C and D, likely due to their intrinsic mobility.

The conserved hydrophobic pocket in PDILT is primed for binding hydrophobic residues

Analysis of the structure reveals a hydrophobic pocket between helices α1 and α3, the location of the principal substrate-binding pocket in PDIA1 (Fig. 2). Interestingly, in the crystal this pocket is occupied by two aromatic residues (Y383 and W384) from the x-linker of another PDILT molecule (Fig. 2a and Supplemental Fig. S1). These residues most likely mimic the hydrophobic residues exposed in misfolded substrates. Specifically, the side chain of Y383 makes hydrophobic interactions with the side chains of V322, F326, L279 and L339 of the b′ domain, while the side chain of W384 interacts with the side chains of L341 and I269 of the b′ domain of PDILT (Fig. 2b). Also, polar interactions help stabilize the binding. In particular, the hydroxyl group of Y383 forms a hydrogen bond with the imidazole ring of H277. A hydrogen bond is also observed between the backbone carbonyl group of W384 and the side chain of K266 in the α1-helix of the b′ domain. Compared with W384, Y383 goes deeper into the hydrophobic pocket and likely plays a more important role in the binding.

Figure 2

A large hydrophobic pocket on the b′ domain of PDILT is able to interact with aromatic side chains.

(a) The residues Y383 and W384 from the C-terminal tail of the crystallized fragment bind to a hydrophobic pocket located between helices α1 and α3 of an adjacent PDILT molecule. (b) The base of the large pocket is lined with hydrophobic residues with the exception of H277, which forms a hydrogen bond with the hydroxyl group of Y383. (c) Overlay of the crystal structures of b′ domain from PDILT (green) and PDIA1 (yellow; PDB accession number 3BJ5) reveals differing orientations of helix α1, resulting in a larger substrate-binding pocket in PDILT. (d) Overlay of the crystal structures of b′ domain from PDILT (green) and the C-terminal domain of ERp27 (orange; PDB accession number 4F9Z).

The structural alignment with the b′ domain of PDIA1 using the Dali database13 gave a Z-score of 17.6 and an RMSD of 1.4 Å over 113 residues, showing that the structures are very similar (Fig. 2c). The major difference is in the orientation of helix α1, which creates a larger and deeper hydrophobic pocket in PDILT. Also, the overall charge surrounding the hydrophobic pocket is negative for PDIA1, while this area is very positively charged in PDILT. Additionally, K266, S270, L339 and L341 in the substrate-binding pocket of PDILT are not conserved between PDI family members (Fig. 1c). These differences may be the key determinants for PDILT specificity towards in vivo substrates. The size of the hydrophobic pocket in the PDILT structure is more similar to that in the C-terminal domain of ERp27 in the polyethylene glycol-bound conformation, which aligns with a Z-score of 18.6 and an RMSD of 1.7 Å over 111 residues (Fig. 2d)14. Overall, the structure-based sequence alignment shows a high conservation of residues forming the hydrophobic pocket in PDILT, PDIA1, PDIp and ERp27, explaining why these PDIs are able to directly interact with unfolded proteins (Fig. 1c). Interestingly, H277 of PDILT is positioned at the base of the hydrophobic pocket and is strictly conserved among these PDIs. Our structure demonstrates that the side chain of this histidine forms a hydrogen bond with the hydroxyl group of the interacting tyrosine residue and could be employed similarly by other PDIs. The corresponding histidine H256 of PDIA1 was predicted to hydrogen bond with the 3-hydroxyl group of 17β-estradiol15.

The small amphipathic peptides Δ-somatostatin (AGCKNFFWKTFTSC) and mastoparan (INLKALAALAKKIL) have been used in several studies of the PDI family proteins to mimic unfolded substrates. Interactions of these peptides with PDIA1 bb′ domains have been characterized previously4,16,17. Interestingly, Δ-somatostatin binds to the hydrophobic pocket of PDIA1 with a higher affinity than mastoparan (Kd of 35 μM and 130 μM, respectively)16. Since PDIA1 and PDILT display similar substrate-binding pockets, NMR titrations of 15N-labeled PDILT b′ domain with mastoparan were expected to show significant shifts of the NMR signals from PDILT residues of the hydrophobic pocket, similar to those observed for PDIA1 bb′ domains. Surprisingly, mastoparan did not interact with the PDILT b′ domain: the NMR spectra remained unchanged after the addition of peptide up to a 1:10 protein/peptide molar ratio (Supplemental Fig. S2). Interestingly, the binding of somatostatin to PDILT was previously reported5 and could be due to the presence of aromatic residues that are absent in mastoparan.
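The difference in aromatic content between the two peptides can be checked directly from the sequences quoted above. The following is a minimal sketch, not part of the original analysis; the function name and the choice to count only Phe, Trp and Tyr as aromatic (leaving out His) are assumptions made for illustration.

```python
# Illustrative sketch: list aromatic residues in the two model peptides.
# Assumption: "aromatic" means Phe (F), Trp (W) and Tyr (Y); His is excluded.
AROMATIC = set("FWY")

# Peptide sequences as given in the text.
PEPTIDES = {
    "delta-somatostatin": "AGCKNFFWKTFTSC",
    "mastoparan": "INLKALAALAKKIL",
}


def aromatic_positions(sequence):
    """Return (1-based position, residue) pairs for aromatic residues."""
    return [(i, aa) for i, aa in enumerate(sequence, start=1) if aa in AROMATIC]


if __name__ == "__main__":
    for name, seq in PEPTIDES.items():
        hits = aromatic_positions(seq)
        print(f"{name}: {len(hits)} aromatic residue(s) {hits}")
```

By this count Δ-somatostatin carries four aromatic side chains (three Phe and one Trp) and mastoparan none, which is consistent with the proposed preference of the pocket for aromatic residues but does not by itself establish it.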

ERp57-CNX recognition mechanism is not conserved in PDILT

Calnexin (CNX) and calreticulin (CRT), which are CMG and CRT3 homologs, use their negatively charged arm-like P-domains to bind to a large positively charged patch in the b′ domain of ERp57, another member of the PDI family18,19,20. To determine whether PDILT interacts with CMG and CRT3 in a similar manner, we performed NMR titrations of the 15N-labeled P-domains of CMG or CRT3 with unlabeled b′ domain of PDILT. Addition of the b′ domain did not result in any chemical shift changes, indicating no interaction (Supplemental Fig. S3). To explain this result, we compared the binding site for CNX and CRT in the ERp57 b′ domain with the PDILT structure. The two domains could be superimposed using the backbone atoms of 111 residues (out of 129 residues in the PDILT b′ domain) with an RMSD of 2.6 Å and a Z-score of 13.4 using the Dali database (Fig. 3). In ERp57, the CNX-binding surface is centered on helix α2 and is characterized by a pronounced positive charge (Fig. 3a). Interestingly, PDILT does not display a positively charged patch in the corresponding surface. This is consistent with the fact that the PDILT and ERp57 b′ domains share only 11% identity and the amino acids implicated in the ERp57/CNX interaction are not conserved in PDILT (Fig. 1c).

Figure 3

Structural comparison between PDILT and ERp57 b′ domains.

Cartoon representations and electrostatic surfaces for ERp57 (a) and PDILT (b) are displayed in the same orientation. The residues K274 and R282 of ERp57 b′ domain essential for binding to CNX and CRT are shown as stick representations. The positively charged CNX-binding site (circled) is not conserved in PDILT.

Discussion

Despite many years of study, the molecular details of substrate recognition by protein disulfide isomerases remain elusive. In particular, structural characterization of substrate-binding sites provides important information about this process. Here, we determined a high-resolution structure of the substrate-binding domain of PDILT, which is also the first structure of any PDILT domain. Importantly, the structure reveals a large hydrophobic pocket that seems to be primed for binding aromatic residues. In PDIA1, this hydrophobic pocket is a binding site for protein substrates and stabilizes hydrophobic residues exposed in unfolded regions of protein substrates16,17,21,22,23. A higher affinity of the b′ domain for aromatic tryptophan and tyrosine residues was previously observed for the other tissue-specific PDI, PDIp. Using a seven amino acid peptide phage-display library, Ruddock and co-workers showed that tyrosine or tryptophan residues within a folding polypeptide trigger the recognition and binding to PDIp24. It is possible that PDILT shares a similar preference for aromatic residues, reflecting its selectivity for spermatogenesis-specific proteins. However, further studies are required to determine how PDILT recognizes its substrates in vivo.

PDILT interacts in vivo with the testis-specific lectin-type chaperones calmegin (CMG) and calreticulin 3 (CRT3)5,10. Our results suggest that the mechanism of binding of PDILT to CMG and CRT3 is different from the one that occurs in ERp57/CNX and that other domains are necessary to mediate binding. For instance, the C-terminal tail of ERp57 is an additional site of interaction with CNX19. While the C-terminus alone does not interact with CNX, the basic stretch (residues 494KPKKKK500K of ERp57) is needed for high-affinity binding. Interestingly, the C-terminal tail of PDILT also contains a positively charged region (residues KKKTSEEEVVVVAKPKGPPVQKKKPK); therefore, it is possible that the C-terminus of PDILT in combination with other domains is required for binding to CMG and CRT3. More studies are needed to determine the mechanism underlying these interactions. Notably, ERp57 is not able to bind to its substrates directly and requires CRT or CNX for substrate recognition25, while the PDILT b′ domain is more PDIA1-like and should be able to bind non-native substrates directly through the hydrophobic pocket.
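The positive character of the two C-terminal stretches can be illustrated with a crude residue count. This is a minimal sketch under stated assumptions: the net charge is approximated as +1 per Lys/Arg and −1 per Asp/Glu, ignoring His, the termini and any pKa shifts; the ERp57 stretch is read here as KPKKKKK (residues 494–500); and the function and dictionary names are hypothetical.

```python
# Illustrative sketch: crude net charge of the two C-terminal stretches.
# Assumptions: +1 per Lys/Arg, -1 per Asp/Glu; His and chain termini ignored.


def net_charge(sequence):
    """Return (positive count, negative count, net charge) for a sequence."""
    positive = sum(sequence.count(aa) for aa in "KR")
    negative = sum(sequence.count(aa) for aa in "DE")
    return positive, negative, positive - negative


TAILS = {
    # Reading of "494KPKKKK500K" as residues 494-500 (an assumption).
    "ERp57 basic stretch": "KPKKKKK",
    # PDILT region as quoted in the text.
    "PDILT C-terminal region": "KKKTSEEEVVVVAKPKGPPVQKKKPK",
}

if __name__ == "__main__":
    for name, seq in TAILS.items():
        pos, neg, net = net_charge(seq)
        print(f"{name}: {pos} basic, {neg} acidic, net {net:+d} over {len(seq)} residues")
```

By this simple count both stretches carry a net charge of +6, although the PDILT region is longer and includes a cluster of three glutamates, so its charge density is lower.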

The specificity of substrate binding by protein disulfide isomerases is still poorly understood. PDIA1 acts on a very broad range of substrates26,27, while PDILT is expected to interact with spermatogenesis-specific protein substrates unique to this cell type. This structure is another step towards understanding how substrate binding is mediated by PDI family members and PDILT in particular. The study of ER protein folding in the testis will lead to a better understanding of pathways associated with male infertility.

Methods

Protein expression and purification

The PDILT b′ domain (V258-Q386) was amplified by PCR and sub-cloned into pGEX-6P-1 (Amersham Pharmacia) using BamHI and XhoI restriction sites. Two modules of the CMG P-domain comprising repeats 3 and 4 (residues 300–369) were cloned into pET15b (Novagen) containing an N-terminal His-tag. For CRT3, the full-length P-domain (residues 198–291, 3 modules) was also cloned into pET15b using NdeI and BamHI restriction sites. Clones were confirmed by sequencing.

Protein was expressed in E. coli BL21 (DE3) cells. The PDILT b′ domain with an N-terminal GST-tag was expressed by growing cells in LB medium at 37°C with shaking until the optical density (OD) at 600 nm reached 0.8; protein production was then induced by adding isopropyl-thiogalactopyranoside (IPTG) to a final concentration of 1 mM. Cells were incubated with shaking at 30°C for another 4 hours. The GST fusion protein was purified by affinity chromatography on glutathione-Sepharose resin (GE Healthcare). Once the protein was eluted from the resin, the tag was removed by cleavage with PreScission protease, leaving an N-terminal GPLGS sequence. The resulting PDILT b′ domain protein, with a molecular weight of 15.5 kDa, was additionally purified by size-exclusion chromatography (Superdex-75) in 20 mM HEPES pH 7.5, 0.15 M NaCl buffer and then concentrated using a Centricon tube (Millipore) with a 3-kDa cut-off. His-tagged proteins were purified by affinity chromatography on Ni-NTA (nickel-nitrilotriacetic acid) resin (QIAGEN). Eluted proteins were then cleaved with thrombin protease to remove the His-tag, leaving an N-terminal GSHM sequence. Proteins were further purified as described above.

For NMR experiments, the 15N-labeled proteins were expressed by growing cells in M9 minimal medium with 15N-ammonium chloride as the sole source of nitrogen. Expression and purification conditions were the same as for unlabeled proteins.

Protein crystallization

Crystallization conditions for the PDILT b′ domain were identified using the commercial PACT suite (QIAGEN). The best PDILT b′ domain crystals were obtained by equilibrating a 0.6 μl drop of the protein solution (7.2 mg/ml) in 20 mM HEPES pH 7.5, 150 mM NaCl, mixed with 0.6 μl of reservoir solution containing 0.2 M sodium formate, 0.1 M Bicine pH 9.0, 18% PEG3350, 0.1 M cobaltous chloride and suspended over 1 ml of reservoir solution. Crystals grew in 4 days at 22°C. For data collection, crystals were cryoprotected by addition of 20% (v/v) of ethylene glycerol and flash-cooled in an N2 cold stream. Crystals of the PDILT b′ domain belong to the primitive monoclinic system, space group P21, with four protein molecules per asymmetric unit, corresponding to a solvent content of 36.9%.
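The relationship between the quoted solvent content and crystal packing can be sketched with the standard Matthews approximation, solvent fraction ≈ 1 − 1.23/V_M, which assumes a typical protein partial specific volume of about 0.74 cm³/g. The snippet below back-calculates V_M from the stated 36.9% solvent content and converts the 7.2 mg/ml protein concentration to molar units using the 15.5 kDa molecular weight given above. The function names are hypothetical, and the derived values are illustrative estimates rather than numbers reported in the paper.

```python
# Illustrative sketch: Matthews coefficient implied by the stated solvent
# content, plus a mg/ml -> mM conversion for the crystallization sample.
# Assumption: solvent fraction ~= 1 - 1.23 / V_M (standard approximation).

MW_KDA = 15.5            # PDILT b' domain after tag removal (from the text)
MOLECULES_PER_ASU = 4    # chains A-D (from the text)
SOLVENT_FRACTION = 0.369
PROTEIN_MG_PER_ML = 7.2


def matthews_from_solvent(solvent_fraction, k=1.23):
    """Return V_M in cubic angstroms per dalton for a given solvent fraction."""
    return k / (1.0 - solvent_fraction)


def mg_per_ml_to_millimolar(conc_mg_per_ml, mw_kda):
    """Convert mg/ml to mM; numerically mg/ml divided by kDa equals mM."""
    return conc_mg_per_ml / mw_kda


if __name__ == "__main__":
    vm = matthews_from_solvent(SOLVENT_FRACTION)
    # Volume of one asymmetric unit implied by V_M and its protein content.
    asu_volume = vm * MOLECULES_PER_ASU * MW_KDA * 1000.0
    print(f"V_M ~ {vm:.2f} A^3/Da")
    print(f"implied asymmetric-unit volume ~ {asu_volume:.0f} A^3")
    conc = mg_per_ml_to_millimolar(PROTEIN_MG_PER_ML, MW_KDA)
    print(f"protein concentration ~ {conc:.2f} mM")
```

A V_M slightly below 2 Å³/Da falls at the tightly packed end of the range usually seen for protein crystals, in line with the relatively low solvent content.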

Structure solution and refinement

Diffraction data from native PDILT b′ domain crystals were collected at a single wavelength of 0.977 Å with an ADSC Quantum-210 CCD detector (Area Detector Systems Corp.) on beamline A1 at the Cornell High-Energy Synchrotron Source (CHESS). Data processing and scaling were performed with HKL-200028. The structure of the human PDIA1 b′x domain (residues 230–368, PDB accession code 3BJ5) was used as the search model for molecular replacement. The initial model obtained with Phaser29 was then completed manually with the program Coot30. The resulting model was improved by several cycles of refinement using the program REFMAC31 and model refitting, followed by TLS (translation-libration-screw) refinement32. Structure figures were made with PyMOL (http://www.pymol.org).

NMR titrations

All NMR experiments were performed at 25°C on a Bruker DRX 600 MHz spectrometer. NMR samples were prepared with 90% NMR buffer (20 mM HEPES pH 7.5 and 150 mM NaCl) and 10% D2O. NMR titrations were carried out by acquiring 1H-15N heteronuclear single quantum correlation (HSQC) spectra of 0.2 mM 15N-labeled protein. Subsequent spectra were taken after the addition of an unlabeled ligand. 15N-labeled CMG P-domain (residues 300–369) was titrated with unlabeled PDILT b′ domain to a final ratio of labeled to unlabeled protein of 1:2. Purified mastoparan (INLKALAALAKKIL) peptide was added to the 15N-labeled PDILT b′ domain to a final protein/peptide ratio of 1:10. Also, 15N-labeled PDILT b′ domain was titrated with unlabeled CRT3 P-domain (residues 198–291) to a ratio of labeled/unlabeled protein of 1:2. Spectra were processed with NMRPipe33 and analyzed with NMRview.
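To put the titration endpoints in context, the expected saturation for a simple 1:1 binding equilibrium can be estimated from the total concentrations and a dissociation constant. The sketch below is illustrative only: it uses the exact quadratic solution for 1:1 binding, takes the Kd values reported for PDIA1 (35 μM and 130 μM) as hypothetical stand-ins for PDILT, ignores dilution on ligand addition, and uses a function name chosen for this example.

```python
import math

# Illustrative sketch: fraction of labeled protein bound at a titration
# endpoint for simple 1:1 binding. Kd values are those reported for PDIA1
# and are used here only as hypothetical stand-ins; dilution is ignored.


def fraction_protein_bound(protein_total, ligand_total, kd):
    """Exact 1:1 binding solution; all concentrations in the same units."""
    s = protein_total + ligand_total + kd
    complex_conc = (s - math.sqrt(s * s - 4.0 * protein_total * ligand_total)) / 2.0
    return complex_conc / protein_total


if __name__ == "__main__":
    protein_mM = 0.2              # 15N-labeled protein (from the text)
    ligand_mM = 10 * protein_mM   # 1:10 protein/peptide endpoint
    for label, kd_mM in [("Kd = 35 uM", 0.035), ("Kd = 130 uM", 0.130)]:
        bound = fraction_protein_bound(protein_mM, ligand_mM, kd_mM)
        print(f"{label}: {100 * bound:.0f}% of protein bound")
```

Under these assumptions more than 90% of the protein would be bound at the 1:10 endpoint for either Kd, so unchanged spectra at that ratio argue against binding of comparable strength.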

Additional information

Database: Coordinates have been deposited in the Protein Data Bank under the accession number 4NWY.