Introduction

L-Citrulline (N5-carbamoyl-L-ornithine) is a non-proteinogenic amino acid that is an intermediate in the Krebs-Henseleit urea cycle in animals1. It is produced by the enterocytes of the small bowel in humans and its accumulation in plasma can cause citrullinemia, an autosomal recessive disorder characterized by increased citrulline secretion in the urine and neuropsychiatric symptoms2. Citrulline is also a biological precursor for nitric oxide and its therapeutic administration has been proposed for the mitochondrial MELAS syndrome3. Citrulline further results from free arginine by citrullination, which entails replacement of the guanidino group with a ureido group through deimination. This removes the positive charge of the arginine side chain and liberates ammonia.

Of greater physiological relevance, however, is the citrullination of arginines in peptides and proteins through post-translational modification4,5. Given the limited number of genes in the genomes of higher organisms, such post-translational modifications increase the structural and functional diversity of the proteomes4. Citrullination may result in changes in fold, function and half-life of proteins and peptides and the reaction is catalyzed in a calcium-dependent manner by peptidylarginine deiminases (PADs). These occur only in vertebrates, where five close paralogs (PAD1-PAD4 and PAD6) have been described6,7,8. Their activity is essential for skin keratinization, neuron insulation and plasticity of the central nervous system as well as histone core-protein regulation5,7. Furthermore, through involvement of PADs in apoptosis, autophagy and NETosis, citrullination plays a major role in the immune system.

However, citrullination also has an established role in pathology, which has lately heightened interest in the reaction, since increased levels of citrullinated proteins are found in several if not all inflammatory diseases9 and have been directly implicated in Alzheimer’s disease, prion diseases, psoriasis, multiple sclerosis and tumorigenesis5. In a specific genetic background, citrullinated proteins act as autoantigens to generate anti-citrullinated protein antibodies, which participate in an abnormal autoimmune response, a hallmark of rheumatoid arthritis (RA10). The latter is a common systemic disease affecting ~1% of the general population in the developed world and is characterized by chronic inflammation of the synovial joints, eventually leading to progressive joint destruction. Despite many years of intensive research, its mechanisms of disease progression are still poorly understood. As to etiology, genetic factors, environmental influences (such as smoking and oral contraceptives) and concomitant microbial infections are risk factors for developing RA11.

Inflammation is also a hallmark of chronic periodontal disease (PD), which is among the most prevalent infectious diseases of mankind12. In its severe form, the disease affects the gums of 10–15% of adults, potentially leading to tissue destruction and tooth loss13. Its major causal agent is Porphyromonas gingivalis, a bacterium that is also implicated in cardiovascular diseases, respiratory diseases, diabetes, osteoporosis and pre-term low birth-weight. More recently, epidemiological studies have further reported an increased prevalence of PD in RA14,15, which is consistent with the ancient claim made by Hippocrates ~2,400 years ago that removal of bad teeth cures arthritis16.

Within the virulence-factor armamentarium of P. gingivalis are several secreted cysteine peptidases such as lysine gingipain (Kgp) and arginine gingipains A and B (RgpA and RgpB). These are cysteine endopeptidases cleaving after lysines and arginines, respectively, and they participate not only in nutrient acquisition but also in host-tissue destruction and defense inactivation17. Uniquely among microbes to date, P. gingivalis also produces a secreted PAD (called PPAD18), which protects P. gingivalis during acidic cleansing in the mouth through ammonia generated during host and endogenous protein citrullination19. PPAD does not require calcium for catalysis20 and is genetically unrelated to animal PADs but, like the latter and cysteine peptidases, its main catalytic residue is a cysteine (C351 in PPAD20).

Host PADs process arginines within polypeptide chains but not at their termini, i.e. they are efficient endodeiminases but poor exodeiminases4. In contrast, PPAD citrullinates C-terminal arginines like those generated by the prior action of Rgps17, which may be facilitated by the surface co-localization of Rgps and PPAD20. In this way, PPAD complements endogenous PADs and creates new exogenous epitopes for autoimmune response, which have been associated with RA disease progression15. Taken together, all these results suggest that the link between RA and P. gingivalis-induced PD may result from PPAD-mediated citrullination15.

To shed light on the molecular aspects of this key enzyme for pathogenicity, we analyzed the structure and function of PPAD in various functional states and proposed a working model for the enzyme based on mutational studies, which places PPAD in a wider context with PADs and functionally more distant enzymes.

Results and Discussion

Molecular structure of PPAD

PPAD was recently reported to belong to a family of secreted P. gingivalis proteins, which includes Kgp and Rgps20. These proteins possess a ~75-residue C-terminal domain (CTD) for maturation and translocation through the outer membrane via the PorSS, PerioGate or Type-IX secretion system, which removes the CTD upon secretion21. Accordingly and similarly to Kgp and Rgps, full-length PPAD would span a pro-peptide, a catalytic domain (CD), an immunoglobulin-superfamily domain (IgSF) and a CTD18. We obtained a fragment of PPAD by homologous overexpression in P. gingivalis that was equivalent to the purified form from P. gingivalis supernatant18 and, thus, lacked the pro-peptide and the CTD (residues A44-A475; see Table 1). We solved three distinct structures to high resolution from different protein preparations, which crystallized in different space groups: a substrate-free form (to 1.5 Å resolution), a substrate-mimic complex (1.4 Å) and a substrate complex (1.8 Å; see Table 2).

Table 1 Primers used for PPAD single-point mutagenesis.
Table 2 Crystallographic data.

The PPAD two-domain moiety (CD plus IgSF; Fig. 1) shows approximate maximal dimensions of 55 Å (height) × 57 Å (width) × 50 Å (depth) according to the orientation of Fig. 1a and lacks any bound calcium ion, which explains why calcium is not needed for activity. Overall, it resembles a tooth, with the 316-residue CD forming the crown and IgSF the root, which is reminiscent of the gross overall shape of Kgp and RgpB despite completely different functions and CD architectures (see Fig. 2b in22 and Fig. 2a in23). The neck is the interface between the two domains and the active site is at the cusp, on the grinding surface (see below). The CD (A44-K359; see Fig. 1a–c) comprises eight helices and 20 β-strands and is a flat cylinder made up by a distorted five-fold α/β-propeller arranged around a central shaft. The PPAD CD cylinder has an upper entry base, which coincides with the tooth cusp, and an opposite lower exit base at the neck (Fig. 1a). Around the shaft, five propeller blades (I to V) spanning between 47 (blade III) and 76 (blade I) residues are sequentially arranged counterclockwise according to Fig. 1b,c. Each blade starts on the entry base with a loop connected to the previous blade and consists of at least a three-stranded twisted β-sheet with an inner, a middle and an outer strand, plus one helix. The inner strand runs across the cylinder to the exit base paralleling the central shaft. A short loop links the inner strand with the antiparallel middle strand, which runs in the opposite direction towards the entry base. This strand is connected through another loop with the helix, which lines the cylinder side wall. Finally, the helix is linked to the outer strand, which parallels the middle strand and likewise lines the cylinder side wall. Into this minimal architecture, found unadorned only in blade V (Fig. 1c), additional structural elements are inserted in the other blades, thus accounting for overall blade asymmetry and differing chain lengths.
In particular, a sodium ion is pinched by the inner strand and the consensus helix of blade II and is bound in an octahedral manner by six oxygens at distances of 2.30–2.63 Å: D148O, D158O and two solvent molecules coplanar with the cation; and apically by D147Oδ1 and D158Oδ1.

Figure 1

Overall structure and topology of PPAD.

(A) Ribbon-type plot of PPAD in a lateral view revealing its tooth-like shape, which consists of regions assignable to cusp, crown, neck and root. The upper N-terminal cylindrical catalytic domain (CD; residues 44–359; top entry base and bottom exit base) is shown with the N-terminal segment in yellow and each of its constituting blades (I to V) in one color (blue, magenta, orange, red and green). The C-terminal IgSF-like domain (residues 360–465) is shown in grey for its β-strands (labeled β22-β29) and white for loops and coils. A sodium ion is shown as a blue sphere and a black arrow pinpoints the Michaelis-loop. (B) Top view onto the entry base of the CD cylinder after a horizontal 90°-rotation of (A). The helices (α1-α8) and strands (β1-β20) of the CD are labeled. Catalytic-triad-residue (C351, H236 and N297) side chains are shown and labeled in red to highlight the active site in the center of the α/β-propeller. A black arrow pinpoints the Michaelis-loop. (C) Topology scheme of the five-bladed PPAD CD with strands as arrows and helices as cylinders with their respective limiting residues; coloring as in panels (A) and (B). The three catalytic residues of (B) are shown as pink asterisks and the Michaelis-loop is denoted by a black arrow.

Figure 2

Active-site architecture.

(A) Stereo image of substrate-free PPAD CD, which actually corresponds to a thiopyridine-modified state, with the Michaelis-loop (V226-V237) shown in red. Selected side chains are displayed with their carbons in light blue and labeled. (B) Same view as in (A) of the substrate-mimic complex, with the Michaelis-loop in green, unmodified cysteines, side-chain carbons in tan and relevant solvent molecules shown as spheres in light blue to illustrate the NH3-exit/H2O-entry and hydroxide channels (see also Fig. 4a). The bound aspartate-glutamine dipeptide is further shown with its carbons in turquoise. D130 and D238 are labeled. (C) Same view as in (A) and (B) of the substrate complex, with the Michaelis-loop in green, unmodified cysteine C239 (C351 is replaced by alanine) and side-chain carbons in pink. The bound methionine-arginine dipeptide is further depicted with its carbons in purple. D130 and D238 are labeled.

Preceding the first blade, an N-terminal extension (A44-R63) is found attached to blade II on the cylinder side wall running from the entry base to the exit base (Fig. 1a–c). Here, the polypeptide undergoes a kink and, paralleling the inner strand of blade II, runs along the exit base between blades II and III until the central shaft. There, it runs upward as the middle strand of blade I. The C-terminal segment after blade V enters blade I and provides an extra helix followed by the inner strand of the consensus topology, thus internally fastening the molecule like a Velcro strip. Thereafter, the polypeptide reaches the exit base of the CD and enters the C-terminal 106-residue IgSF domain.

The IgSF domain (G360-E465) is a distorted 4 + 5-stranded β-sandwich (strands β21-β29) with an antiparallel back sheet (β21↓-β23↑-β26↓-β25↑) and a mixed front sheet (β22↓-β28 + β29↓-β27↑-β24↓-β25↑) whose planes are rotated away by ~25°. The right lateral flank of the domain is closed by strand β25, whose N-terminal and C-terminal halves participate, respectively, in the front and back sheets. The left lateral flank is much wider and open and contains a bulge dividing the second strand from the left of the front sheet in two (β28 and β29). This bulge interacts with the exit base of the CD (Fig. 1a). Overall, the topology and strand-connectivity of PPAD IgSF is strongly reminiscent of that of Kgp and RgpB22,23,24, but while the width (~25 Å) and depth (~20 Å) of the domains are similar, the length—along the strands of the sheets—is much greater in PPAD than in gingipains (~50 Å vs. ~35 Å).

Active site of PPAD

The propeller shaft in PPAD is rather solid, with a shallow cavity on its entry base coinciding with the tooth cusp that contains the active site (Fig. 1a,b). The latter is mainly a narrow funnel-like hole, which accommodates an arginine side chain of a peptidic or protein substrate. It is framed by the main chain and side chains of the loops connecting blades I and II, II and III, III and IV and V and I; segment β7-loop β7α3-α3 of blade II; and helix α8 of blade I (Figs 1b and 2a–c). In the substrate-free structure, which was obtained with DTDP-treated wild-type (wt) protein, catalytic C351, nearby C239 and distal C462 residues are covalently modified by what was conservatively interpreted as a 4-thiopyridyl moiety. The C239 side chain is even found in two alternate conformations, one bound to thiopyridine and the other with the sulfur as sulfoxide (Fig. 2a). This indicates overall flexibility of active site residues in PPAD due to the absence of a bound substrate and suggests that the covalent modifications of the Sγ atoms do not distort the general unbound conformation of PPAD. In addition, segment V226-V237 of the loop connecting blades III and IV, hereafter the “Michaelis loop”, is in an open conformation, thus consistent with a structure that can bind a substrate. In particular, Y233 at the most exposed part of the loop points to bulk solvent (Fig. 2a). We further obtained a substrate-mimic complex of DTDP-untreated wt PPAD with dipeptide aspartate-glutamine and a true substrate complex of DTDP-untreated PPAD–C351A with dipeptide methionine-arginine. The identification of the peptides was based on high-resolution Fourier maps and surrounding binding partners. The complexes were obtained serendipitously and all attempts to obtain complexes with other substrates or products failed. 
We hypothesize that DTDP-treatment precludes substrate binding and, thus, protects the unbound conformation, while lack of such treatment causes the enzyme to trap substrates or mimics during biosynthesis or purification. The complex structures are equivalent, including the backbone of the bound dipeptides (Fig. 2b,c), except for some minimal displacement and the differing side chains (Fig. 2b,c), so the substrate complex is taken hereafter as reference except for issues dealing with C351Sγ, for which the substrate-mimic complex will be referred to.

The complex structures allowed us to identify PPAD elements required for substrate binding and catalysis. Comparison with the substrate-free structure revealed overall coincidence of the complexes except for the rearrangement of the Michaelis loop (maximal displacement 7.5 Å at N230Cα), which adopts a closed conformation that traps the substrate arginine side chain (Fig. 2b). This causes H196 to be rotated ~100° around its χ1 angle toward bulk solvent and Y233 to be displaced by 4.1 Å and slightly reoriented for its side chain to bind the substrate (see below). Michaelis-loop rearrangement further causes ~90° rotation of H236 around its χ2 angle, so that its Nδ1 atom is apical to the guanidinium plane (3.2 Å away from arginine Cζ atom; see below) and may play a role in catalysis (see below). Catalytic C351Sγ, at the bottom of the cleft, occupies the opposite apical position and the atom is further in binding distance from N297Oδ1 (3.3 Å), which could potentially assist in catalysis (see below). N297Nδ2, in turn, is in binding distance of D238Oδ1 (3.3 Å). The guanidinium group is further tightly bound by D238 through a double salt bridge with arginine Nη1 and Nη2 atoms (2.9 Å and 3.0 Å), by the main-chain carbonyl of T346 (3.2 Å away from atom Nη1) and by D130 through a second double salt bridge with arginine Nε and Nη2 atoms (2.8 Å and 2.9 Å). D130 becomes rotated around its χ1 angle by ~60° upon substrate binding, thereby exchanging its tight hydrogen bond with T180Oγ1 (2.6 Å) in the substrate-free structure with binding of the substrate guanidinium group. These five interactions of the guanidinium group occur roughly in the plane of the latter. The aliphatic part of the arginine is bound between the hydrophobic side chains of I234 and W127 (both 3.7 Å apart). 
The latter is held in place by a hydrogen bond between its Nε1 atom and D347Oδ1 (2.9 Å), which also confers to the tryptophan a potential role in overall structure maintenance due to its stabilizing function of the loop connecting blades V and I (see below). Interestingly, two small solvent-accessible channels are found roughly on either side of the guanidinium plane, on the right and the left in Fig. 2b. The left channel, hereafter “NH3-exit/H2O-entry channel,” is framed by segments T290-N297, N230-E232, G345-T346, R252, H258, and, in particular, C239, which is closest to the substrate guanidinium and thus acts as a gatekeeper of the channel. The right channel, in turn, is shallower and does not reach the substrate but rather H236Nε2, which is bound to two solvent molecules (see below). This “hydroxide-entry channel” is framed by Y233-N235, N151-R152, I197 and E201.

On the outer border of the active-site cavity, the main chain of the substrate is tightly bound through six interactions. The C-terminal carboxylate is linked by a double salt bridge with R152Nη2 (3.0 Å) and Nε (2.8 Å). In addition, one of the carboxylate oxygens is further bound by R154Nη1 (2.9 Å) and the other by Y233Oη (2.8 Å). The latter atom also binds the main-chain amido nitrogen (3.4 Å) and the preceding peptide carbonyl is hydrogen-bonded by R154Nη2 (2.7 Å). This interaction seems to be the main factor responsible for the selectivity of PPAD for peptidylarginines over free arginine18. In addition, these interactions draw an intricate network to fix the substrate in the cleft, which makes it difficult to imagine how a substrate with C-terminal extension to the arginine, i.e. an endodeiminase substrate, would be bound by PPAD, as a C-terminally extended peptide would collide with Y233 and R152 side chains (Fig. 2b,c). Finally, lack of specific interactions with atoms upstream of the last peptide bond of the substrate accounts for PPAD’s capacity to non-specifically turn over both peptides and proteins with C-terminal arginines, i.e. as long as the C-terminus is freely accessible.

Peptidylarginine deiminase activity and mutant studies in vitro

PPAD is an efficient deiminase of peptides including bradykinin and benzoylglycylarginine18, EGF and anaphylatoxin C5a20 and Rgp-derived fibrinogen peptides, as well as a large set of bacterial cell-envelope proteins truncated by Rgps. To provide additional data on the endo- and exodeiminase activities of PPAD in vitro, we tested two octapeptides of equivalent charge derived from the physiologically-relevant bradykinin precursor sequence, respectively with an arginine at position six (G-F-S-P-F-R-S-S; Fig. 3a) and at the C-terminus (P-P-G-F-S-P-F-R; Fig. 3b). We found that peptidylarginine exodeiminase activity of PPAD was nearly 5,500 times higher than endodeiminase activity. This supports the structural findings above. In addition, detailed inspection of the final refined Fourier maps and thermal displacement parameters of atoms Nη1, Cζ and Nη2 of all twelve internal arginines of the substrate-mimic complex of PPAD, which was refined with data to very high resolution (1.4 Å; see Table 2), revealed no significant evidence for citrullination, strongly suggesting that PPAD produced by homologous overexpression in P. gingivalis is not endocitrullinated. Taken together, all these findings strongly support that PPAD is an exodeiminase, as already suggested in the initial report in 199918 and that N-terminal arginines of peptides, endosubstrates and standalone arginines are only modified at a much lower rate, if at all18.
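The fold difference quoted above can be reproduced from the two reaction velocities given in the Fig. 3 caption (32,700 and 6 pmol·mU−1·h−1 for the exo- and endo-substrate, respectively). The following Python sketch is only an arithmetic check; the function and variable names are illustrative and not part of the original analysis.

```python
# Reaction velocities taken from the Fig. 3 caption, in pmol per mU per hour.
exo_velocity = 32_700.0   # peptide with C-terminal arginine (P-P-G-F-S-P-F-R)
endo_velocity = 6.0       # peptide with internal arginine (G-F-S-P-F-R-S-S)


def fold_difference(numerator: float, denominator: float) -> float:
    """Return how many times larger `numerator` is than `denominator`."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return numerator / denominator


ratio = fold_difference(exo_velocity, endo_velocity)
print(f"exo/endo velocity ratio: {ratio:,.0f}")
```

The ratio evaluates to 5,450, which is consistent with the "nearly 5,500 times" stated in the text.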

Figure 3

PPAD activity assays.

(A) Endo- and (B) exo-deiminase activity assays in vitro of P. gingivalis W83 wt PPAD against peptides of sequence G-F-S-P-F-R-S-S and P-P-G-F-S-P-F-R, respectively. Peptides are shown before (blue HPLC chromatograms) and after reaction with PPAD (red HPLC chromatograms). Citrullination caused a shift in the retention time of the peptides when compared with the original ones and was confirmed by mass spectrometry. The reaction velocity was calculated for both peptides from peak integration and indicated that peptidylarginine exodeiminase activity of PPAD was nearly 5,500 times higher than endodeiminase activity (32,700 vs. 6 pmol·mU−1·h−1). (C) Stereo image depicting the 11 positions subjected to point mutagenesis and activity measurements (see (D) and (E)). The Michaelis-loop is shown in green for reference. (D) PPAD expression monitoring through Western-blot analysis of whole bacterial cultures resolved on SDS-PAGE and probed with an anti-PPAD antibody. The samples correspond to those of the abscissa of panel (E). (E) Relative deiminase activity against N-acetylarginine of wt W83 strain supernatant (W83), of a PPAD-deletion mutant strain (Δppad), of the latter containing plasmid pTPP for wt PPAD overexpression (pTPP; reference 100%) and a cohort of single point mutants around the active site encoded by pTPP variants.
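The caption states that citrullination was confirmed by mass spectrometry but does not report the observed masses. As a hedged illustration of the shift one would expect, the Python sketch below computes the monoisotopic mass difference between a citrulline and an arginine residue within a peptide, assuming standard monoisotopic atomic masses and neutral residue compositions. The names used are illustrative and the calculation is not taken from the original work.

```python
# Standard monoisotopic atomic masses in daltons.
MONOISOTOPIC_MASS = {
    "H": 1.00782503,
    "C": 12.0,
    "N": 14.00307401,
    "O": 15.99491462,
}

# Elemental compositions of the residues as they occur inside a peptide chain
# (free amino acid minus one water).
ARGININE_RESIDUE = {"C": 6, "H": 12, "N": 4, "O": 1}
CITRULLINE_RESIDUE = {"C": 6, "H": 11, "N": 3, "O": 2}


def monoisotopic_mass(composition: dict) -> float:
    """Sum the monoisotopic masses of all atoms in a composition."""
    return sum(MONOISOTOPIC_MASS[element] * count
               for element, count in composition.items())


def citrullination_shift(n_sites: int = 1) -> float:
    """Expected mass increase (Da) after deimination of `n_sites` arginines."""
    per_site = (monoisotopic_mass(CITRULLINE_RESIDUE)
                - monoisotopic_mass(ARGININE_RESIDUE))
    return n_sites * per_site


print(f"expected shift per arginine: {citrullination_shift():+.4f} Da")
```

Both test peptides contain a single arginine, so under these assumptions each citrullinated product would be about 0.98 Da heavier than its substrate.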

In order to discern the functional role of the distinct residues identified in the structures above, we constructed a cohort of 18 single-point mutants of positions 127, 130, 152, 154, 180, 182, 236, 238, 239, 297 and 351 (Table 1 and Fig. 3c) and assessed the deiminase activity of the respective cell cultures relative to the wt. Difficulties in the production of wt and mutant PPADs, which were obtained from P. gingivalis cultures, precluded more extensive enzymatic analyses with purified protein. Mutant expression levels were equivalent to those of the wt as monitored by Western-blot analysis, thus pointing to properly folded proteins. The sole exception was W127A, which, in accordance with a structural role in addition to a substrate-binding role (see above), was not produced in detectable amounts (Fig. 3d). As expected, activity was completely abolished when mutating catalytic C351 (to either alanine or serine), but also when replacing D238 or H236 (to either alanine or asparagine), which participate in substrate guanidinium Cζ atom pinching (Fig. 3e). N297, in binding distance of C351Sγ, likewise yielded an inactive enzyme when replaced with alanine. D130, which strongly binds the guanidinium, is also indispensable and C239, the gatekeeper of one of the two solvent channels, is also relevant as its alanine and serine mutants were just ~8% active and its glutamate mutant was completely inactive. G182, in turn, is required to lack a side chain as it shuts the bottom of the pocket and is close to H236 and D130. Its replacement with alanine yielded a complete loss of activity. In contrast, T180, which interacts with the two latter residues, is nonessential and its alanine mutant still had ~66% activity. Interestingly, R152, which establishes a double salt bridge with the substrate carboxylate, is absolutely indispensable for activity, while the second carboxylate-binding arginine, R154, is less relevant, with its alanine mutant still showing ~30% activity.
Its glutamate mutant, however, which introduces a negative charge next to the also negatively-charged substrate carboxylate, thus causing repulsion, was less than ~10% active.
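The mutant data described above can be summarized in a small lookup table. The Python sketch below records only those relative activities that the text states explicitly with a single approximate value. It reads "to either alanine or asparagine" as applying to both D238 and H236, and it omits the D130 and R152 mutants (whose exact substitutions are not named here) and R154E (reported only as an upper bound). The classification thresholds are arbitrary assumptions chosen for illustration.

```python
# Relative deiminase activity (% of wt PPAD expressed from pTPP), as stated in
# the text. Approximate values ("~") are stored as plain numbers.
# None marks a mutant that was not produced in detectable amounts.
RELATIVE_ACTIVITY = {
    "W127A": None,
    "T180A": 66.0,
    "G182A": 0.0,
    "R154A": 30.0,
    "H236A": 0.0, "H236N": 0.0,
    "D238A": 0.0, "D238N": 0.0,
    "C239A": 8.0, "C239S": 8.0, "C239E": 0.0,
    "N297A": 0.0,
    "C351A": 0.0, "C351S": 0.0,
}


def classify(activity):
    """Bin a relative activity; the 50% cut-off is an illustrative assumption."""
    if activity is None:
        return "not produced"
    if activity == 0.0:
        return "inactive"
    if activity < 50.0:
        return "impaired"
    return "largely active"


for mutant, activity in sorted(RELATIVE_ACTIVITY.items(),
                               key=lambda item: int(item[0][1:-1])):
    print(f"{mutant}: {classify(activity)}")
```

Sorting by residue number keeps the printout aligned with the order in which positions are listed in the text.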

Mechanism of peptide citrullination by PPAD

We propose the following chemical mechanism of function of PPAD, which includes a catalytic triad (C351-H236-N297) and seven steps proceeding over two tetrahedral and one planar-thiouronium covalent reaction intermediates (Fig. 4a,b).

Figure 4

Proposed peptide citrullinating mechanism of PPAD.

(A) Composite picture in stereo of the active site of PPAD (see also Fig. 2) based on the substrate-mimic complex ribbon plot colored as in Fig. 2b. Only elements engaged in substrate binding and catalysis are depicted. Residue side chains taken from the substrate-mimic complex are shown with carbons in light blue (C351), those from the substrate complex in white (Y233, H236, D238, N297, R152, R154 and W127) and those from the unbound structure in pink (Y233 and H236). The Michaelis loop is shown in the open conformation of the unbound structure in pink and in the occluded conformation of the substrate(-mimic) complexes in red; a purple straight arrow highlights the rearrangement upon substrate binding. The substrate arginine depicted belongs to the substrate complex (carbons in turquoise). Solvent molecules from the substrate-mimic complex in light blue highlight the NH3-exit/H2O-entry channel on the left and those in purple the hydroxide-entry channel on the right. The rotation of the H236 side chain from the substrate-unbound to the bound conformation is pinpointed by a curved purple arrow. (B) Proposed biochemical mechanism of action of an enzymatic activity cycle in seven steps (I to VII). The substrate arginine and product citrulline are shown with bonds in bold; hydrogen bonds are shown as dashed lines.

In the substrate-free state, the Michaelis-loop containing Y233 is in an open conformation, which enables peptides with a C-terminal arginine to be accommodated at the active site. The arginine becomes firmly anchored through electrostatic interactions of the guanidinium group with the side chains of D238 and D130, being positioned in an extended conformation and appropriately oriented for catalysis. In addition, R152 and R154 bind the C-terminal carboxylate of the arginine and the carbonyl of the preceding peptide bond. Moreover, formation of this Michaelis complex (Fig. 4b, I) entails major rearrangement of the Michaelis loop, which occludes the active site and causes Y233 to further bind the C-terminal carboxylate of the substrate. Rearrangement further entails that the side chain of H236 is rotated, as a result of which the plane of the guanidinium group becomes pinched between H236Nδ1 and C351Sγ and H236Nε2 is solvent-bound in the hydroxide-entry channel (Fig. 4a,b). This geometry was decisive for the identification of H236 as the general base/acid of the mechanism and of the guanidinium Nη1 atom as the nitrogen atom of the leaving ammonia product. In addition, C351Sγ is hydrogen-bonded to N297Oδ1, which probably enhances the nucleophilicity of the catalytic sulfur. In the first step of the reaction, C351Sγ performs a nucleophilic attack on the sp2-like planar Cζ atom of the substrate guanidinium (Fig. 4b, I), giving rise to the first neutral tetrahedral reaction intermediate and yielding an sp3-like Cζ atom. Concomitantly, H236, which acts first as a general base, abstracts a proton from Nη1 and the latter captures the proton from the catalytic thiol group. The histidine is now in a diprotonated state (Fig. 4b, II). The tetrahedral intermediate collapses to a positively-charged planar thiouronium covalent intermediate and ammonia, which receives a proton from H236Nδ1, now acting as a general acid (Fig. 4b, II and III).
Ammonia leaves the active site through the NH3-exit/H2O-entry channel (Fig. 4a) and reaches the surface of the enzyme. In the next step, a solvent molecule, probably a water, occupies the former position of ammonia and becomes polarized by the side chain of D238 and H236Nδ1. The latter again acts as a base and abstracts a proton from the water molecule, which performs a nucleophilic attack on the central carbon of the thiouronium (Fig. 4b, IV). This yields the second neutral intermediate centered on sp3-like tetrahedral Cζ and diprotonated H236 (Fig. 4b, V). The intermediate itself collapses to a citrullinated product and the intact catalytic cysteine mercapto group, which becomes hydrogen-bonded to N297Oδ1. The repulsion between D238 and the carbonyl oxygen of the neutral reaction product may provide the driving force for clearance of the latter from the active-site cleft (Fig. 4b, VI). Finally, a hydroxide resulting from the reaction of ammonia with water may enter the active site through the hydroxide-entry channel and replace one of the two solvent molecules bound to H236Nε2. The latter histidine transfers a proton to the hydroxide and a proton shift from Nδ1 to Nε2 restores the functional monoprotonated state of H236, thus leaving the active site poised for a new round of reaction (Fig. 4b, VII).
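The role of H236 as alternating general base and general acid can be followed by simple proton bookkeeping. The Python sketch below condenses the proposed cycle into six events and tracks how many protons the H236 imidazole carries after each one. It is an illustrative check that the proposed cycle returns the histidine to its starting state, not a simulation of the chemistry, and the events are not meant to map one-to-one onto the roman numerals of Fig. 4b.

```python
# Each event: (description, change in the number of protons on the H236 imidazole).
# The descriptions paraphrase the proposed mechanism; the grouping is illustrative.
EVENTS = [
    ("C351 attacks C-zeta; H236 abstracts a proton from N-eta1", +1),
    ("first tetrahedral intermediate collapses; H236 protonates ammonia", -1),
    ("ammonia leaves; a water molecule takes its place", 0),
    ("H236 abstracts a proton from water, which attacks the thiouronium", +1),
    ("second tetrahedral intermediate collapses; citrulline is released", 0),
    ("H236 transfers a proton to the incoming hydroxide", -1),
]


def protonation_trace(start: int = 1) -> list:
    """Return the H236 proton count before the cycle and after each event."""
    trace = [start]
    for description, delta in EVENTS:
        count = trace[-1] + delta
        # The mechanism only invokes mono- and diprotonated histidine.
        if count not in (1, 2):
            raise ValueError(f"unexpected protonation state after: {description}")
        trace.append(count)
    return trace


trace = protonation_trace()
print(trace)
print("cycle closed:", trace[0] == trace[-1])
```

In this bookkeeping the histidine starts and ends monoprotonated and is diprotonated only while a tetrahedral intermediate is present, which matches the description of states II and V in the text.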

Structural similarity of PPAD catalytic domain

PPAD CD conforms to the structural requirements of the guanidino-group modifying enzyme superfamily (GME; see Fig. 5a–c), whose members adopt similar five-fold α/β-propeller folds and catalyze chemical processing of (methylated) guanidine groups. The citrullinating GME members are PADs, PPAD, agmatine deiminases (AgDIs) and arginine deiminases (ADIs), which are all dimers or tetramers with the exception of PPAD25. AgDIs deiminate isolated agmatine (1-[4-aminobutyl]-guanidine) to N-carbamoylputrescine and ammonia as part of mechanisms by which energy is harnessed for growth26 and they are missing in higher eukaryotes25. ADIs, in turn, citrullinate standalone arginine and protect cells from acidic environments. They are found in plants and microorganisms but are likewise absent from animals27. Neither family has extra domains beyond the catalytic α/β-propeller.

Figure 5

Structural similarities.

(A) Superposed ribbon-plots in stereo of PPAD in its substrate-mimic complex (cyan) and human PAD4 (coral; PDB 4DKT54) as found in its covalent thiouronium reaction intermediate mimic complex. The side chains of the respective catalytic triads (labeled for PPAD only), as well as the two calcium ions of PAD4 (red spheres) and the sodium ion of PPAD (blue sphere) are shown, as is the methionine-arginine dipeptide from the PPAD substrate complex (carbons in tan). Most loops connecting the blades and the consensus secondary elements within each blade differ in length and conformation. (B) Close-up of (A). The side chains of the catalytic triad (not labeled) and Y233 (labeled in black) of PPAD are depicted (carbons in cyan), as are several representative residues from human PAD4 (carbons in coral; labeled in blue italics) and the covalently bound intermediate (carbons in goldenrod). The mechanistically-relevant equivalent positions (see Fig. 4a,b) in PPAD/human PAD4 (in italics) are C351/C645, H236/H471, N297/N588, D238/D473, D130/D350, W127/W347, Y233/S468, R152/R372 and R154/R374. A red ellipse highlights the clash an endodeiminase substrate would have with PPAD Y233. The latter is equivalent to S468 in human PAD4, which allows for free space for C-terminally elongated substrates. (C) Same as (A) showing PPAD (cyan) and AgDI from Enterococcus faecalis (purple; PDB 2JER26) as found in a covalent adduct with an agmatine-derived amidine reaction intermediate. The respective catalytic triads are depicted and that of PPAD is also labeled. AgDI main-chain segments diverging from PPAD and mainly accounting for a closed active site are pinpointed. The mechanistically-relevant equivalent positions (see Fig. 4a,b) in PPAD/AgDI (in italics) are C351/C357, H236/H218, N297/N306, D238/D220, D130/D96 and W127/W93.

To date, only the structures of human PAD2 and PAD4 have been determined among PADs8,28 and they comprise a ~375-residue calcium-dependent α/β-propeller domain preceded by two IgG domains (see Fig. 1 in8 and28), which are unrelated to PPAD IgSF apart from being all-β protein domains. Among AgDIs, structures have been reported from Enterococcus faecalis (Ef; Protein Data Bank (PDB) access code 2JER26) and Helicobacter pylori (Hp; PDB 3HVM29) and other potential relatives have been deposited with the PDB but not functionally analyzed or published (PDB 2EWO, 1XKN, 1ZBR, 2CMU, 3H7C and 1VKP). Finally, ADI structures have been reported from Streptococcus pyogenes (Sp; PDB 4BOF30), Pseudomonas aeruginosa (Pa; PDB 1RXX31) and Mycoplasma arginini (Ma; PDB 1S9R32). Among all these, closest structural similarity of PPAD is found with AgDIs (Z-score of 35 according to program DALI33; see Fig. 5c), followed by PADs (Z = 18–21; Fig. 5a,b) and ADIs (Z = 18–19).

Superposition of the PPAD α/β-propeller on that of human PAD4 (Fig. 5a,b), EfAgDI (Fig. 5c) and PaADI, SpADI and MaADI (data not shown) reveals good overall conservation of the five-blade architectures, although several decorations in the distinct blades of each family account for large differences, especially in the loops surrounding the active-site cleft. In particular, PADs evince a large partially helical insertion between β14 and α6 of PPAD blade IV and lack α2 of blade I (Fig. 5a). ADIs, in turn, evince a large helical sub-domain replacing α2 and β4 of PPAD blade I. In common, all propellers are closed by the blade V-blade I Velcro mechanism (see above and25) and the catalytic cysteines and histidines are conserved, as well as the two aspartates anchoring the guanidine group to the bottom of the active site. In addition, PPAD shares with ADIs and PADs the two arginines binding the main chain of the substrate. While these firmly bind the substrate C-terminus in PPAD and ADIs, in PADs they are slightly reoriented and only bind what would be one of the two carboxylate oxygens in addition to the upstream peptide carbonyl (Fig. 5b). This, together with the replacement of PPAD Y233 of the Michaelis loop by serine (S468; PAD4; residue numbering of proteins distinct from PPAD in italics) or threonine (T468; PAD2), provides enough space in PADs to allow for a C-terminal extension of the substrate. Furthermore, calcium-dependence of PADs is characterized by several calcium-binding sites8,28, two of which occur within the propeller domain: one close to the active site with evident implications for function and the other at the domain periphery (Fig. 5a). Interestingly, the latter coincides with the sodium site of PPAD, so a predominantly structural role for both is suggested. In contrast to PADs and PPAD, AgDIs and ADIs, which only process standalone residues, possess completely closed active sites (Fig. 5c).

Most notably, superposition also revealed that all these families possess an equivalent of PPAD asparagine N297, i.e. with a potential role in catalysis (PAD2, N590; PAD4, N588; EfAgDI, N306; HpAgDI, N274; PaADI, N360; MaADI, N352; and SpADI, N355). To our knowledge, this was previously unnoticed since this residue, which is strictly conserved across citrullinating GMEs, was merely recognized as an important residue for proper active-site conformation conserved in the consensus helix of blade V of all families (see Fig. 4 in25). In PADs, this asparagine is also conserved in distant orthologs from zebrafish and chicken within a shared motif (M/L-V-N-M34), which complements the consensus motif encompassing the catalytic cysteine residue (G-E-I/V-H-C-G-T/S). The only notable exception is human PAD6, which lacks both motifs and the calcium sites that are essential for activity in the other paralogs and orthologs34. This absence, together with the lack of direct evidence for activity in vitro with the assays routinely employed for the other PADs, poses the question as to whether PAD6 is an active peptidylarginine deiminase or whether it may require further factors or interacting partners for activity35. In any case, it is likely to follow a different catalytic mechanism.
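The two consensus motifs quoted above can be written as regular expressions for scanning one-letter protein sequences. The following is a minimal sketch only: the function name and the toy sequences are hypothetical and are not real PAD sequences, and a match says nothing by itself about structural equivalence.

```python
import re

# "M/L-V-N-M": Met or Leu, followed by Val, Asn and Met.
ASN_MOTIF = re.compile(r"[ML]VNM")
# "G-E-I/V-H-C-G-T/S": motif encompassing the catalytic cysteine.
CYS_MOTIF = re.compile(r"GE[IV]HCG[TS]")


def find_motifs(sequence: str) -> dict:
    """Return 1-based start positions of both motifs in a one-letter sequence."""
    seq = sequence.upper()
    return {
        "asn_motif": [m.start() + 1 for m in ASN_MOTIF.finditer(seq)],
        "cys_motif": [m.start() + 1 for m in CYS_MOTIF.finditer(seq)],
    }


# Hypothetical toy sequences for illustration (NOT real proteins).
toy_with_both = "AAALVNMAAAGEVHCGTAAA"
toy_with_none = "AAAAAAAAAAAAAAAAAAAA"

hits_both = find_motifs(toy_with_both)
hits_none = find_motifs(toy_with_none)
```

A sequence lacking both motifs, as described for human PAD6, would return two empty lists.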

In all the above structures, the asparagine is at suitable distances and in appropriate orientations to polarize the catalytic cysteine, as found in papain-like cysteine peptidases—in particular, Kgp and RgpB have N510—so we suggest that citrullinating GMEs all have a cysteine-histidine-asparagine catalytic triad as shown for PPAD (see above). However, in contrast to cysteine peptidases, the three residues do not establish a charge-relay system for proton transfer, but rather cysteine-asparagine and histidine act separately on opposite faces of the plane of the guanidinium (Fig. 4a,b).
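For bookkeeping, the equivalent triad positions given in the figure legend and the asparagine equivalents listed in the text can be collected in a small lookup table. This is only an illustrative sketch: the dictionary layout and the helper name are assumptions, while the residue numbers are those quoted above.

```python
# Catalytic-triad positions as quoted in the figure legend
# (PPAD/PAD4 and PPAD/AgDI equivalences).
CATALYTIC_TRIAD = {
    "PPAD": {"Cys": 351, "His": 236, "Asn": 297},
    "PAD4": {"Cys": 645, "His": 471, "Asn": 588},
    "EfAgDI": {"Cys": 357, "His": 218, "Asn": 306},
}

# Asparagine equivalents of PPAD N297 as listed in the text.
ASN_EQUIVALENTS = {
    "PPAD": 297, "PAD2": 590, "PAD4": 588, "EfAgDI": 306,
    "HpAgDI": 274, "PaADI": 360, "MaADI": 352, "SpADI": 355,
}


def triad_label(enzyme: str) -> str:
    """Format the triad of one enzyme, e.g. 'C351-H236-N297'."""
    t = CATALYTIC_TRIAD[enzyme]
    return f"C{t['Cys']}-H{t['His']}-N{t['Asn']}"


# Cross-check: both tables must agree on the asparagine position.
consistent = all(
    CATALYTIC_TRIAD[name]["Asn"] == ASN_EQUIVALENTS[name]
    for name in CATALYTIC_TRIAD
)
```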

Concluding remarks

Structural considerations identified PPAD as a closer relative of AgDIs, which are found across bacteria, than of PADs, which are found only in vertebrates. This, in turn, enables us to hypothesize that PPAD was acquired through horizontal gene transfer of a bacterial single-domain agmatine-citrullinating enzyme. The latter would then have evolved in a different bacterial environment under fusion to two new C-terminal domains like those found in cognate RgpB, to be secreted through a distinct system. This evolution further yielded a unique function among citrullinating enzymes: deimination of peptides with a C-terminal arginine. This activity, which complements that of R-type gingipain virulence factors (gingipain-null mutants are devoid of endogenous citrullination), has been demonstrated for several substrates.

Pathogenic bacteria have evolved sophisticated mechanisms in response to the changing environment and host antimicrobial defense systems. Post-translational modifications are regarded as one of the main means by which pathogens breach immune tolerance. Among these modifications, citrullination of endogenous proteins seems to be a key process in the initiation of autoimmune reactions. To date, P. gingivalis is the only prokaryote known to citrullinate proteins and peptides. It has been proposed as a mechanistic link between PD and RA through its potential capacity of generating citrullinated epitopes distinct from those produced by endogenous PADs, thus contributing to aggravation of RA. This activity is exerted by the sole bacterial peptidylarginine deiminase reported to date, PPAD, which also has a role in the interaction with host cells, so it may be considered as a double target for PD and RA. In contrast, other abundant odontopathogens responsible for PD such as Prevotella intermedia and Fusobacterium nucleatum, which both lack a PPAD ortholog, do not evince a link with RA.

Methods

Protein production, purification and characterization

P. gingivalis PPAD (UniProt database [UP] access code Q9RQJ2 or GenBank entry WP_005873463.1 for NCBI gene tag PG_1424) was obtained through small-scale homologous overexpression as a secreted protein from plasmid-transformed P. gingivalis W83 PPAD-deletion mutant strain Δppad. Briefly, plasmid pT-COW, which confers resistance against tetracycline36, was used as expression vector and plasmid derivatives encoding the wild type (wt) and a total of 18 PPAD point mutants (W127A, D130A, D130N, R152A, R154A, R154E, T180A, G182A, H236A, H236N, D238A, D238N, C239A, C239E, C239S, N297A, C351S and C351A; see Table 1) were generated. For this, the wt gene sequence plus 1081 upstream base pairs and 267 downstream base pairs was amplified from P. gingivalis W83 genomic DNA with primers pTCowPPADf and pTCowPPADr (see Table 1), which contained recognition sequences for restriction endonucleases NheI and SphI, respectively. The PCR fragment obtained was ligated into pT-COW, previously digested with NheI and SphI, to yield plasmid pTPP. Point mutations were thereafter introduced into pTPP by the SLIM method37 using the primers listed in Table 1 and confirmed by DNA sequencing. Plasmid pTPP or its PPAD-mutant variants were introduced into P. gingivalis W83 Δppad by conjugation and bacteria were grown under anaerobic conditions (85% N2, 5% H2 and 10% CO2) in liquid Schaedler broth supplemented with hemin (5 mg/ml), menadione (0.5 mg/ml), L-cysteine (50 mg/ml) and 1 μg/ml tetracycline, in the presence or absence of 4,4’-dithiodipyridine (DTDP). Expression levels were monitored by Western-blot analysis. For this, 30 μl of P. gingivalis liquid cultures at OD600 = 1.0 were separated on SDS-PAGE and transferred to PVDF membranes. Primary PPAD antibodies (kindly provided by Patrick Venables, Oxford) were used at 1:1,000 dilution and secondary HRP-conjugated goat anti-rabbit antibodies (Amersham) at 1:10,000 dilution.
Cell cultures obtained in the absence of DTDP were used for functional tests (see below). In addition, preparations at a somewhat larger scale—limited by the intrinsic difficulties of cultivating P. gingivalis—for structural studies were performed for wt PPAD (DTDP–treated and –untreated) and DTDP-untreated PPAD mutant C351A (PPAD–C351A) and purified according to20.
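As a simple consistency check on the mutant panel listed above, the mutation labels can be parsed into wild-type residue, position and replacement, and grouped by position. This is an illustrative sketch; the helper name and the grouping are assumptions and not part of the published protocol.

```python
import re
from collections import defaultdict

# Point mutants exactly as listed in the text.
MUTANTS = [
    "W127A", "D130A", "D130N", "R152A", "R154A", "R154E",
    "T180A", "G182A", "H236A", "H236N", "D238A", "D238N",
    "C239A", "C239E", "C239S", "N297A", "C351S", "C351A",
]


def parse_mutation(label: str) -> tuple:
    """Split a label such as 'C351A' into ('C', 351, 'A')."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"not a point-mutation label: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)


# Group the replacements tested at each wild-type position.
by_position = defaultdict(list)
for label in MUTANTS:
    wt_residue, position, replacement = parse_mutation(label)
    by_position[(wt_residue, position)].append(replacement)

n_mutants = len(MUTANTS)
n_positions = len(by_position)
```

The count of labels agrees with the 18 point mutants stated in the text.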

Protein identity and purity were assessed by 15% Tricine-SDS-PAGE stained with Coomassie blue, peptide-mass fingerprinting of tryptic protein digests (PMF), N-terminal sequencing through Edman degradation and mass spectrometry (MS). Ultrafiltration steps were performed with Vivaspin 15 and Vivaspin 500 filter devices of 10 kDa cut-off (Sartorius Stedim Biotech). Protein concentrations were estimated applying the respective theoretical extinction coefficients by measuring A280 in a spectrophotometer (NanoDrop). Concentrations were also measured by the BCA Protein Assay Kit (Thermo Scientific) with bovine serum albumin as a standard.

Activity assays

PPAD endo- and exodeiminase activities were determined against kininogen-derived peptides of sequence G-F-S-P-F-R-S-S and P-P-G-F-S-P-F-R, respectively. Briefly, peptides (30 μg) were incubated for 2 h at 37 °C in 100 mM Tris-HCl, pH 7.5 supplemented with 10 mM L-cysteine in the presence of P. gingivalis PPAD (0.12, 1.2, 12 and 120 mU) in 30-μl reaction volumes (final peptide concentration 1 mg/ml). Respective controls were prepared with the same amount of peptide incubated in the reaction buffer alone. Reactions were stopped by addition of 80 μl 0.5% trifluoroacetic acid (TFA) in HPLC-quality water and the samples were further analyzed by HPLC using an ÄKTA Micro chromatography system (GE Healthcare) coupled with an Aeris Peptide XB-C18 4.6/150 column (Phenomenex). Peptides were resolved in 19 column volumes using a 2–80% gradient of phase A (0.1% TFA) and phase B (80% acetonitrile, 0.08% TFA) at 1.5 ml/min flow rate. Eluted peaks were fractionated and citrullination was assessed by MS using a HCT Ultra ETD II ESI Iontrap mass spectrometer (Bruker). To determine the velocity of deimination, peptides were incubated with 0.12 mU (1 h) and 120 mU (2 h) PPAD, respectively, in triplicate. Peak integration data were used to determine the amount of modified peptide in each peak (~11% and ~4.5% for P-P-G-F-S-P-F-Cit and G-F-S-P-F-Cit-S-S, respectively) and estimate the reaction velocity (in pmol·mU−1·h−1±SD). We found that, whereas P-P-G-F-S-P-F-R became completely citrullinated after overnight incubation (0.12–12 mU), G-F-S-P-F-R-S-S was not modified. Only at ten-fold higher PPAD concentration (120 mU) was some time-dependent citrullination of the endosubstrate observed, with ~5% of peptide being modified after 2 h. Comparatively, 11% of the exosubstrate was citrullinated after 1 h at a thousand-fold lower PPAD concentration (0.12 mU; see Fig. 3a,b).
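The arithmetic behind these figures can be sketched as follows. This is a rough illustration only: it assumes a single-time-point linear estimate, uses standard average residue masses (not taken from the article) to convert 30 μg of peptide into pmol, and is not meant to reproduce the velocities reported in Fig. 3, which derive from triplicate peak integration.

```python
# Average residue masses in Da (standard textbook values; an assumption
# of this sketch, not data from the article).
RESIDUE_MASS = {"G": 57.0519, "P": 97.1167, "F": 147.1766,
                "S": 87.0782, "R": 156.1875}
WATER_MASS = 18.0153


def peptide_mass(seq: str) -> float:
    """Average molecular mass (Da) of an unmodified linear peptide."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER_MASS


def velocity(seq, peptide_ug, fraction_modified, enzyme_mU, hours):
    """Single-point estimate of deimination velocity in pmol/(mU*h)."""
    pmol_peptide = peptide_ug * 1e-6 / peptide_mass(seq) * 1e12
    return fraction_modified * pmol_peptide / (enzyme_mU * hours)


# 30 ug of peptide in 30 ul is 1 ug/ul, i.e. 1 mg/ml.
final_conc_mg_per_ml = 30 / 30

# Fractions, enzyme amounts and times as stated in the text.
exo_velocity = velocity("PPGFSPFR", 30, 0.11, 0.12, 1)
endo_velocity = velocity("GFSPFRSS", 30, 0.045, 120, 2)

# Ratio of the highest to the lowest enzyme amount (120 mU vs 0.12 mU).
enzyme_fold = 120 / 0.12
```

Under these assumptions, the exosubstrate velocity exceeds that of the endosubstrate by more than three orders of magnitude, in line with the qualitative conclusion drawn in the text.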

Competence of wt and mutant PPADs was assessed by the amount of citrulline produced according to a sensitive colorimetric assay38. Results obtained from three independent assays were adjusted to OD600 = 1.0 and presented as % of the activity of pTPP-transformed Δppad producing wt PPAD.
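One way to carry out this normalization is sketched below. It assumes that adjusting to OD600 = 1.0 means dividing each citrulline signal by the culture OD600, which the text does not state explicitly; the function names and all readings are hypothetical.

```python
from statistics import mean


def normalize_to_od(signal: float, od600: float) -> float:
    """Scale a citrulline signal to a culture density of OD600 = 1.0
    (assumed here to be a simple division by the measured OD600)."""
    return signal / od600


def percent_of_wt(mutant_runs, wt_runs) -> float:
    """Mean OD-normalized mutant activity as % of mean wt activity.

    Each run is a (signal, od600) pair from one independent assay.
    """
    wt_mean = mean(normalize_to_od(s, od) for s, od in wt_runs)
    mutant_mean = mean(normalize_to_od(s, od) for s, od in mutant_runs)
    return 100.0 * mutant_mean / wt_mean


# Hypothetical readings from three independent assays (illustration only).
wt_runs = [(1.0, 1.0), (2.0, 2.0), (0.5, 0.5)]
mutant_runs = [(0.1, 1.0), (0.4, 2.0), (0.15, 0.5)]

relative_activity = percent_of_wt(mutant_runs, wt_runs)
```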

Crystallization and diffraction data collection

Prior to crystallization, DTDP–treated and –untreated wt PPAD and DTDP–untreated PPAD–C351A were dialyzed overnight against buffer A (20 mM Tris-HCl, 20 mM sodium chloride, pH 7.5) and further purified by ion-exchange chromatography on a TSKgel DEAE-2SW column (TOSOH Bioscience) equilibrated with buffer A. A gradient of 4–60% buffer B (20 mM Tris-HCl, 500 mM sodium chloride, pH 7.5) was applied over 30 ml and samples were collected and pooled. Finally, each pool was concentrated by ultrafiltration and subjected to size-exclusion chromatography on a Superdex 75, 10/300 column (GE Healthcare Life Sciences) equilibrated with buffer C (20 mM Tris-HCl, 150 mM sodium chloride, pH 7.5).
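The sodium chloride concentration spanned by this gradient follows from linear mixing of buffers A and B. The short sketch below assumes that the percentages are volume fractions of buffer B; the function name is hypothetical.

```python
def nacl_mM(percent_b: float, a_mM: float = 20.0, b_mM: float = 500.0) -> float:
    """NaCl concentration (mM) when mixing buffer A and buffer B,
    assuming percent_b is the volume percentage of buffer B."""
    fraction_b = percent_b / 100.0
    return (1.0 - fraction_b) * a_mM + fraction_b * b_mM


# Start and end of the 4-60% buffer B gradient described in the text.
gradient_start_mM = nacl_mM(4)
gradient_end_mM = nacl_mM(60)
```

Under this assumption the gradient runs from about 39 mM to about 308 mM sodium chloride, while Tris-HCl stays at 20 mM throughout because both buffers contain the same amount.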

Crystallization assays were performed by the sitting-drop vapor diffusion method. Reservoir solutions were prepared by a Tecan robot and 100 nL crystallization drops were dispensed on 96 × 2-well MRC plates (Innovadyne) by a Phoenix nanodrop robot (Art Robbins) or a Cartesian Microsys 4000 XL (Genomic Solutions) robot at the joint IBMB/IRB Automated Crystallography Platform at Barcelona Science Park. Plates were stored in Bruker steady-temperature crystal farms at 4 °C and 20 °C. Successful conditions were scaled up to the microliter range in 24-well Cryschem crystallization dishes (Hampton Research).

The best crystals of wt PPAD with 4-thiopyridine but without substrate (PPAD–TP; substrate free) resulting from DTDP treatment during production (see above) were obtained at 20 °C from 1 μl:1 μl drops with protein solution at 20–25 mg/ml concentration in 20 mM Tris-HCl pH 7.4, 100 mM sodium chloride and 100 mM sodium acetate (pH 4.5), 25% [w/v] polyethylene glycol 3,350 as reservoir solution. PPAD mutant C351A in complex with the dipeptide methionine-arginine (PPAD–C351A+M-R; substrate complex) was crystallized similarly but with 100 mM tri-sodium citrate, 20% [w/v] polyethylene glycol 3,000, pH 5.5–6.5 as reservoir solution instead. Finally, wt DTDP-untreated PPAD in complex with the dipeptide aspartate-glutamine (PPAD+D-Q; substrate-mimic complex) was crystallized with 100 mM tri-sodium citrate, 2 M ammonium sulfate, pH 5.5–6.5 as reservoir solution. All crystals contained protein spanning A44-A475 as determined by Edman degradation and MS analysis. Crystals were cryo-protected by rapid passage through drops containing increasing concentrations of glycerol (up to 15% [v/v]). Complete diffraction datasets were collected at 100 K from liquid-N2 flash cryo-cooled crystals (Oxford Cryosystems 700 series cryostream) on a Pilatus 6 M pixel detector (from Dectris) at beam line XALOC of ALBA synchrotron (Barcelona, Spain39). Further data were collected on the same detector type at beam line ID23-1 of ESRF synchrotron (Grenoble, France) within the Block Allocation Group “BAG Barcelona.” Diffraction data were integrated, scaled, merged and reduced with program XDS40. PPAD–TP, PPAD–C351A+M-R and PPAD+D-Q crystals all contained one protein molecule per asymmetric unit (solvent content, respectively, 41%, 44% and 48%), had the symmetry of the space groups P212121, C2 and P212121, respectively and had different cell constants (see Table 2 for data processing statistics).
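The solvent contents quoted above can be related to the Matthews coefficient through the commonly used approximation V_S = 1 − 1.23/V_M, which assumes a typical protein partial specific volume of about 0.74 cm³/g. The article does not state how the solvent content was calculated, so the following is only an illustrative back-calculation with hypothetical function names.

```python
def matthews_from_solvent(solvent_fraction: float) -> float:
    """Matthews coefficient V_M (A^3/Da) from the solvent fraction,
    using the approximation V_S = 1 - 1.23 / V_M."""
    if not 0.0 <= solvent_fraction < 1.0:
        raise ValueError("solvent fraction must be in [0, 1)")
    return 1.23 / (1.0 - solvent_fraction)


# Solvent contents as stated in the text for the three crystal forms.
SOLVENT_CONTENT = {
    "PPAD-TP": 0.41,
    "PPAD-C351A+M-R": 0.44,
    "PPAD+D-Q": 0.48,
}

matthews = {name: matthews_from_solvent(v) for name, v in SOLVENT_CONTENT.items()}
```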

Structure solution and refinement

A similarity search with programs PSI-BLAST and HHPRED identified only low-homology models (PDB 3HVM, 1ZBR, 1XKN, 2JER, 3H7C and 2EWO), which failed to render a solution by conventional molecular replacement and Patterson-search methods. At this point, wt PPAD–TP crystal diffraction data were used for structure solution with ARCIMBOLDO41,42,43. To this end, 16 datasets with resolutions ranging from 3.0 Å to 1.5 Å from different native protein crystals or heavy-ion soaks with similar cell dimensions were merged with program XPREP. A collection of structure fragments was generated from the six aforementioned distant structural relatives and ARCIMBOLDO runs were set up in parallel with these fragments and libraries41,42. These calculations eventually enabled structure solution (see44,45 for details) and the resulting phase set was subjected to density modification and autotracing with SHELXE46, which yielded an improved set of phases and a partial model. These phases and the resulting Fourier map enabled subsequent manual model building with the COOT program47, which alternated with crystallographic refinement with PHENIX48 and BUSTER/TNT49 under inclusion of TLS refinement, until the final refined model of PPAD–TP was obtained. This consisted of residues A44-N464, one structural sodium ion, seven glycerols, 460 solvent molecules and 4-thiopyridine moieties respectively attached to the Sγ atoms of C351, C462 and C239. The final Fourier map indicated that the side chain of the latter residue was present in two alternate conformations, one bound to thiopyridine and the other with the sulfur as sulfoxide. See Table 2 for final refinement and model quality statistics.

The structure of PPAD–C351A+M-R was solved with PHASER within the PHENIX50 package using the refined coordinates of PPAD–TP. The adequately rotated and translated molecule yielded accurate phases, which enabled calculation of an initial Fourier map. Subsequent model completion and refinement proceeded as above. The final model of PPAD–C351A+M-R contained residues A44-M463, one structural sodium cation, a dipeptide of tentative sequence methionine-arginine, five glycerols, one chloride, two azides, 426 solvent molecules and a free cysteine disulfide-bonded to C462. See Table 2 for final refinement and model quality statistics.

The structure of PPAD+D-Q was solved similarly. Model completion and refinement proceeded as above. The final model comprised residues A44-E465, one sodium cation, a dipeptide of tentative sequence aspartate-glutamine (the distinction between aspartate/asparagine and glutamate/glutamine was performed based on surrounding interacting partners), three glycerols, five phosphates, one chloride, one azide and 689 solvent molecules. See Table 2 for final refinement and model quality statistics.

Miscellaneous

Ideal coordinates and parameters for crystallographic refinement of non-standard ligands were obtained from the PRODRG server51. Structural similarity searches were performed with DALI33 and structure figures were prepared with programs COOT and CHIMERA52. Experimental structures were validated with MOLPROBITY53. The final coordinates of P. gingivalis PPAD–TP (substrate free), PPAD–C351A+M-R (substrate complex) and PPAD+D-Q (substrate-mimic complex) are deposited with the PDB at www.pdb.org (access codes 4YT9, 4YTG and 4YTB).

Additional Information

How to cite this article: Goulas, T. et al. Structure and mechanism of a bacterial host-protein citrullinating virulence factor, Porphyromonas gingivalis peptidylarginine deiminase. Sci. Rep. 5, 11969; doi: 10.1038/srep11969 (2015).