Genome analysis of three Pneumocystis species reveals adaptation mechanisms to life exclusively in mammalian hosts

Ma, Liang; Chen, Zehua; Huang, Da Wei; Kutty, Geetha; Ishihara, Mayumi; Wang, Honghui; Abouelleil, Amr; Bishop, Lisa; Davey, Emma; Deng, Rebecca; Deng, Xilong; Fan, Lin; Fantoni, Giovanna; Fitzgerald, Michael; Gogineni, Emile; Goldberg, Jonathan M.; Handley, Grace; Hu, Xiaojun; Huber, Charles; Jiao, Xiaoli; Jones, Kristine; Levin, Joshua Z.; Liu, Yueqin; Macdonald, Pendexter; Melnikov, Alexandre; Raley, Castle; Sassi, Monica; Sherman, Brad T.; Song, Xiaohong; Sykes, Sean; Tran, Bao; Walsh, Laura; Xia, Yun; Yang, Jun; Young, Sarah; Zeng, Qiandong; Zheng, Xin; Stephens, Robert; Nusbaum, Chad; Birren, Bruce W.; Azadi, Parastoo; Lempicki, Richard A.; Cuomo, Christina A.; Kovacs, Joseph A.

doi:10.1038/ncomms10740

Download PDF

Article
Open access
Published: 22 February 2016

Genome analysis of three Pneumocystis species reveals adaptation mechanisms to life exclusively in mammalian hosts

Liang Ma¹^na1,
Zehua Chen²^na1,
Da Wei Huang³^nAff7,
Geetha Kutty¹,
Mayumi Ishihara⁴,
Honghui Wang¹,
Amr Abouelleil²,
Lisa Bishop¹,
Emma Davey¹,
Rebecca Deng¹,
Xilong Deng¹,
Lin Fan²,
Giovanna Fantoni¹,
Michael Fitzgerald²,
Emile Gogineni¹,
Jonathan M. Goldberg²^nAff8,
Grace Handley¹,
Xiaojun Hu³,
Charles Huber¹,
Xiaoli Jiao³,
Kristine Jones³,
Joshua Z. Levin²,
Yueqin Liu¹,
Pendexter Macdonald²,
Alexandre Melnikov²,
Castle Raley³,
Monica Sassi¹,
Brad T. Sherman³,
Xiaohong Song¹,
Sean Sykes²,
Bao Tran³,
Laura Walsh¹,
Yun Xia¹,
Jun Yang³,
Sarah Young²,
Qiandong Zeng²,
Xin Zheng³,
Robert Stephens³,
Chad Nusbaum²,
Bruce W. Birren²,
Parastoo Azadi⁴,
Richard A. Lempicki³,
Christina A. Cuomo ORCID: orcid.org/0000-0002-5778-960X²^na2 &
…
Joseph A. Kovacs¹^na2

Nature Communications volume 7, Article number: 10740 (2016) Cite this article

12k Accesses
115 Citations
92 Altmetric
Metrics details

Subjects

Abstract

Pneumocystis jirovecii is a major cause of life-threatening pneumonia in immunosuppressed patients including transplant recipients and those with HIV/AIDS, yet surprisingly little is known about the biology of this fungal pathogen. Here we report near complete genome assemblies for three Pneumocystis species that infect humans, rats and mice. Pneumocystis genomes are highly compact relative to other fungi, with substantial reductions of ribosomal RNA genes, transporters, transcription factors and many metabolic pathways, but contain expansions of surface proteins, especially a unique and complex surface glycoprotein superfamily, as well as proteases and RNA processing proteins. Unexpectedly, the key fungal cell wall components chitin and outer chain N-mannans are absent, based on genome content and experimental validation. Our findings suggest that Pneumocystis has developed unique mechanisms of adaptation to life exclusively in mammalian hosts, including dependence on the lungs for gas and nutrients and highly efficient strategies to escape both host innate and acquired immune defenses.

Genomic insights into the host specific adaptation of the Pneumocystis genus

Article Open access 08 March 2021

Intracellular parasitism, the driving force of evolution of Legionella pneumophila and the genus Legionella

Article 04 May 2019

Combinatorial selection in amoebal hosts drives the evolution of the human pathogen Legionella pneumophila

Article 27 January 2020

Introduction

Pneumocystis organisms were first described in 1909, but Pneumocystis became a prominent cause of morbidity and mortality only in the late 20th century, as more humans become susceptible due to HIV/AIDS or immunosuppressive therapies developed for cancer, transplants and inflammatory diseases. The genus Pneumocystis comprises a group of highly diversified microorganisms that reside in the lungs of humans and other mammals. Pneumocystis jirovecii, the species that infects humans, causes life-threatening pneumonia (termed Pneumocystis pneumonia or PCP) exclusively in immunosuppressed patients, and has been responsible for many thousands of deaths over the past 30 years. Despite improvements in diagnosis and treatment, PCP still has an estimated mortality rate of 5–30%. Recently, the frequency of PCP has been increasing in non-HIV immunosuppressed patients, especially renal transplant patients, in whom multiple outbreaks have been reported in the past decade^1,2. Moreover, colonization with Pneumocystis is being recognized in an expanding population of patients with underlying pulmonary disease, potentially contributing to accelerated deterioration in pulmonary function³. While a number of drugs, including trimethoprim-sulfamethoxazole and atovaquone, are effective in treating and preventing PCP, there have been rising concerns of drug resistance to the best available therapeutic agents⁴.

Unique among fungal species, Pneumocystis have adapted to and co-evolved with individual mammalian host species such that each Pneumocystis species can infect only a single host species; the basis for this specificity is currently unknown. Pneumocystis has two primary stages: spherical cysts, which have a thick cell wall that is rigid, and the more numerous, pleomorphic (amoeboid) trophic forms, which have a thin, flexible cell wall that is often tightly attached to type I pneumocytes in lung alveoli. Animal studies suggest that the cyst form is responsible for transmission between hosts⁵. Although the life cycle remains undefined, it has been hypothesized that trophic forms can undergo binary fission or, alternatively, conjugate and undergo meiosis, leading to the development of cysts with up to eight intracystic bodies, which can subsequently be released as new trophic forms⁴.

Despite its medical importance, our knowledge of the basic biology of Pneumocystis remains very limited, in large part due to an inability to reproducibly culture the organism in vitro, notwithstanding intense efforts by numerous investigators over the past three decades. Genetic manipulation by transformation is currently untenable. As a consequence, options for developing rational new therapies are currently limited.

To help address these deficiencies, we integrate here whole-genome sequencing and biochemical analysis to gain deeper insights into the biology of P. jirovecii (the human pathogen) and two related species infecting mice (Pneumocystis murina) and rats (Pneumocystis carinii) that serve as important models for studying the pathogenesis and treatment of PCP. While efforts to sequence a Pneumocystis genome began two decades ago⁶, the currently available assemblies and annotations of the P. carinii⁷ and P. jirovecii⁸ genomes remain largely fragmentary (with 4,278 and 358 contigs, respectively) and incomplete; the P. murina genome has not been previously sequenced. Furthermore, accurate assembly has been hampered by the presence of highly repetitive sequences in the Pneumocystis genome, including the very large multi-copy gene families encoding the major surface glycoproteins (Msg) and kexin proteases, which have been almost entirely excluded from the current genome assemblies of both P. carinii⁷ and P. jirovecii⁸. Analysis of these draft genomes has provided insights into the sexual reproductive system⁹ and several metabolic pathways such as the amino acid biosynthesis, nitrogen and sulfur assimilation, glyoxylate cycle, myo-inositol biosynthesis, thiamine biosynthesis and purine degradation^{8,9,10,11,12,13}. However, the fragmentary and incomplete status of both genomes has prevented a comprehensive and precise analysis of chromosome structure and gene gain and loss.

In the present study, we generate high-quality genome assemblies that approach the chromosomal level for these three species, and perform RNA-Seq to improve predicted gene structures and to examine gene expression during infection. We also utilize long-read sequencing to characterize the msg and kexin gene families, and perform biochemical analysis of the key cell wall components chitin and mannan, to experimentally validate genome predictions. We present the major differences in genome content and functional attributes of Pneumocystis in comparison with other fungi and highlight the unique features of Pneumocystis species, including cell wall modifications and metabolic shifts that suggest new mechanisms of adaptation to growth as obligate pulmonary pathogens.

Results

Conservation and content of highly reduced genomes

We generated three high-quality genome assemblies, including the first P. murina genome assembly and new assemblies for P. carinii and P. jirovecii, all of which are at or near the chromosome level and contain the complete gene set except for a small number of msg (in all three species) or kexin genes (in P. carinii alone) in several subtelomeric regions that may remain unidentified. Compared with these assemblies, the previously reported assemblies for P. carinii⁷ and P. jirovecii⁸ are less complete and continuous, with some genes missing entirely or in part, as evidenced by the shorter average gene and protein lengths, the absence of ∼30% of exons in P. jirovecii (Supplementary Table 1), and the lack of both msg and kexin genes in P. carinii and msg genes in P. jirovecii.

The P. murina genome assembly is 7.5 Mb in size across 17 scaffolds, which range in size from 292 to 588 kb (Supplementary Fig. 1a), consistent with the number and size of chromosomes revealed by electrophoretic karyotyping and Southern blotting (Supplementary Figs 2 and 3; Supplementary Note 1). The P. carinii genome assembly is 7.66 Mb in size across 17 scaffolds, which range in size from 268 to 635 kb (Supplementary Fig. 1b), in line with previously reported chromosome number and size from electrophoretic karyotyping experiments¹⁴. The P. jirovecii genome assembly is 8.4 Mb in size across 20 scaffolds, ranging in size from 72 to 635 kb (Supplementary Fig. 1c). Previous studies have estimated the P. jirovecii genome to be 7.0 Mb in size distributed over 12–13 chromosomes¹⁵, though this estimate may be inaccurate given the poor quality of available P. jirovecii samples.

The genomes of P. murina and P. carinii are very similar in total length, chromosome structure and gene organization, and show limited rearrangements while the genome of P. jirovecii is highly rearranged (Fig. 1a; Supplementary Fig. 4; Supplementary Note 2). Between two P. jirovecii strains, there is ∼0.3% genetic variation, with subtelomeric regions showing the highest diversity (Fig. 1b) as observed in some other fungi¹⁶. Despite these rearrangements, conserved syntenic regions include nearly all genes of P. murina compared with either P. carinii (96.1%) or P. jirovecii (92.9%). In contrast, comparison to other phylogenetically related fungi identified only small blocks of three to seven genes with a conserved order in comparison of P. murina to Schizosaccharomyces pombe¹⁷ (8.1%) or Taphrina deformans¹⁸ (3.3%). The high number of inter-chromosomal rearrangements in Pneumocystis contrasts to the primarily intra-chromosomal pattern reported for Saccharomycetes¹⁹ and other ascomycetes.

**Figure 1: Conservation of *Pneumocystis* genome structure.**

Compared with other fungi, all three Pneumocystis species have a small genome size and a reduced gene set; between 3,675 and 3,812 genes (including transfer RNA (tRNA) and ribosomal RNA (rRNA) genes) are predicted (Table 1) compared with an average of 5,044 genes in the four related Schizosaccharomyces genomes¹⁷. Other unusual features of the Pneumocystis genomes are their possession of only a single copy of rRNA genes (Supplementary Fig. 5), a minimal number of tRNA genes and an extremely low GC content, all of which are among the lowest in eukaryotes (Table 1). These features may reflect a slow transcription and translation machinery (Supplementary Note 3), which, together with the loss of many biosynthetic pathways as discussed below, may lead to the slow growth of Pneumocystis organisms as observed in animal models, where doubling times are estimated to be 5–8 days²⁰, as well as to the failure to grow in vitro. The reduced genome size and content likely reflects adaptation to and dependence on human and other mammalian hosts, as has been previously noted for Microsporidia²¹.

Table 1 Comparison of Pneumocystis and related fungal genomes.

Full size table

Gene family expansions and contractions

To examine the predicted functional impact of genome reduction, we examined gene gain and loss in Pneumocystis relative to seven closely related Ascomycete species (Supplementary Fig. 6) using protein homology and subcellular localization analysis tools (Supplementary Methods). To identify common features of reduced genomes, we also compared these changes in gene content with those in two microsporidial species²¹, both of which are intracellular organisms and have the smallest known genomes in the fungal kingdom (Fig. 2). Based on a phylogenetic tree inferred from 413 single copy core orthologues (Supplementary Methods), the three Pneumocystis species cluster with Taphrina deformans¹⁸ as a sister group to the Schizosaccharomyces species (Supplementary Fig. 6). As previously recognized based on mitochondrial genome sequences²², P. murina and P. carinii are more closely related to each other than either is to P. jirovecii, although all three species show substantial divergence.

**Figure 2: Protein domains depleted or enriched in *Pneumocystis*.**

Protein domains enriched in Pneumocystis include host-interacting cell-surface proteins and proteins required for basic cellular functions. The most significantly enriched domains in Pneumocystis are the Msg domains, which are shared among Pneumocystis species but absent in all other sequenced species. Msg is encoded by a large multi-copy gene superfamily (Fig. 3; Supplementary Table 3; Supplementary Data 1; Supplementary Note 4). Msg genes account for about 3–6% of an otherwise highly reduced genome, suggesting that the encoded proteins play an essential role in the organism’s survival. We found extraordinary diversity in genes encoding the Msg superfamily; there were 64 to 179 unique genes per species, which represent the largest surface protein family identified to date in the fungal kingdom²³ and which show conservation among most Msg families or subfamilies across species, but also species-specific amplifications. Based on domain structure and phylogenetic analysis, the Msg superfamily is classified into five families, designated as Msg-A, -B, -C, -D and -E families (Fig. 3). The high diversity of these Msg families therefore could specify different proteins at the cell surface, both between strains and between species. Msg is thought to play an important role in host–pathogen interactions and potentially facilitates evasion of host immune responses through antigenic variation^23,24. The fact that different P. jirovecii strains have unique msg repertoires^2,25 dramatically expands the potential for antigenic variation, possibly against T-cell rather than B-cell host responses²⁶. Msg antigenic variation by gene recombination is supported by the presence of homologoues to all the key genes involved in homologous recombination in Saccharomyces cerevisiae (Supplementary Data 2).

**Figure 3: The Msg superfamily in three *Pneumocystis* species.**

Peptidases of the S8 and M16 families are also highly enriched (Fig. 2; Supplementary Data 3; Supplementary Fig. 7). A large difference between the Pneumocystis species is the expansion of the S8B peptidase subfamily (kexin) in P. carinii, which contains 39 copies compared with 0–1 copy in P. murina, P. jirovecii and the other analysed fungi. Kexin is potentially involved in the processing of Msg proteins in the Golgi^27,28,29, though in P. carinii many of the kexin genes encode proteins predicted to be on the cell surface through a glycosylphosphatidylinositol anchor²⁹ (Supplementary Fig. 7b). The protein encoded by the single-copy kexin gene in both P. murina and P. jirovecii has been predicted to localize to the Golgi apparatus^27,28. Other enriched domains include another cell-surface family, the cysteine-rich CFEM (common in fungal extracellular membrane) domain (Supplementary Fig. 8), as well as proteins involved in regulation of transcription, translation and other cellular activities (histone deacetylase family, ATPase family and the RNA recognition motif (RRM), Supplementary Data 4–6), all of which potentially facilitate the survival of Pneumocystis organisms in the host as discussed below.

By contrast, all three Pneumocystis species show extensive reduction of multiple gene families, as expected from their small genome size (Fig. 2). The most significantly reduced Pfam domains in Pneumocystis comprise the following three major categories: (1) transporters, with <25 transporters in each Pneumocystis species compared with 131–217 in other ascomycetes for the 6 depleted Pfam domains; (2) transcription factors, with only 3 genes in each Pneumocystis species in the 3 depleted Pfam domains, an order of magnitude fewer than other fungi except for Microsporidia; (3) enzymes, including oxidoreductases, hydrolases, transferases and coenzymes. Most of the significantly depleted domains in Pneumocystis are also depleted in Microsporidia (Fig. 2), suggesting common dependencies of these obligate pathogens on their hosts to complement these biologic functions. While additional transcription factor and transporter-associated domains are conserved (Supplementary Data 7 and 8; Supplementary Fig. 9), the total number of transcription factors and transporters in Pneumocystis is among the lowest in fungi.

Substantial reduction and unique features of metabolic pathways

Since the chromosome-level genome assemblies we generated are more complete than previously reported assemblies^7,8, we mapped all major metabolic pathways for all three Pneumocystis species (Figs 4 and 5; Supplementary Figs 10 and 11; Supplementary Data 9–19; Supplementary Note 5). Among the most significantly reduced pathways are those involved in amino acid metabolism. As previously noted^8,12, each Pneumocystis species lacks ∼80% of genes involved in de novo amino acid synthesis in yeast (Supplementary Data 9). In addition, Pneumocystis has impaired capacity for assimilation of inorganic nitrogen and sulfur¹⁰. Consequently, none of the 20 standard amino acids can be synthesized de novo although a few can be synthesized from others. Moreover, in contrast to these earlier reports, we have identified only 1 potential amino acid transporter (Ptr2), which is predicted to localize to the plasma membrane in each species compared with over 20 such transporters in yeasts. Nevertheless, intracellular transport appears highly conserved; nearly half of the 26 mitochondrion- and vacuole-associated amino acid transporters in yeast are preserved. While it is believed that polyamines are ubiquitous in all organisms and serve diverse functions, none of the three Pneumocystis genomes encodes any of the enzymes necessary for de novo synthesis of polyamines, though there is one potential polyamine transporter in each species, consistent with previous in vitro studies of P. carinii³⁰. All three Pneumocystis species have retained the genes required for de novo nucleotide synthesis but are missing nearly all the genes for nucleotide salvage pathways (Supplementary Data 10).

**Figure 4: Reduction of carbohydrate and lipid metabolism in *Pneumocystis*.**

**Figure 5: Summary of mechanisms of adaptation to host lungs by *Pneumocystis*.**

For carbohydrate metabolism, loss of a subset of pathways further highlights nutritional dependency. All three Pneumocystis species have a full complement of genes necessary for uptake and catabolism of glucose via glycolysis and the tricarboxylic acid (TCA) cycle (Fig. 4; Supplementary Data 11). In addition, all three species have all the enzymes necessary to convert fructose and mannose to glucose, and to synthesize and utilize glycogen and trehalose. However, key enzymes that convert galactose and sucrose to glucose are missing. Additional notable losses include two enzymes for glyoxylation⁸, one key enzyme for gluconeogenesis, and all enzymes for pyruvate fermentation. These findings, together with the identification of almost all genes involved in oxidative phosphorylation (Supplementary Data 16), suggest that energy production in Pneumocystis largely relies on glucose through oxidative pathways.

Lipid metabolism genes in Pneumocystis are also greatly reduced in number, thus resulting in distinctive differences in predicted lipid content compared with other fungi (Fig. 4; Supplementary Data 12). All three Pneumocystis species are able to synthesize fecosterol and episterol, but unable to convert them to ergosterol, which potentially accounts for the resistance of Pneumocystis to classical antifungal agents as noted previously¹¹. Moreover, only P. jirovecii but not P. murina or P. carinii can synthesize cholesterol¹³ (Supplementary Fig. 11). These findings support the hypothesis that P. murina and P. carinii, but not necessarily P. jirovecii, may scavenge cholesterol from their hosts³¹, though the genes involved in sterol uptake have not been identified. Utilization of cholesterol rather than ergosterol may be contributing to a less rigid cell wall, thus allowing development of the trophic form of Pneumocystis since cholesterol-containing membranes are more flexible than ergosterol-containing membranes. All three Pneumocystis species lack not only the de novo synthesis pathways for myo-inositol, choline, complex sphingolipids, ether lipids, phosphatidylinositol, phosphatidylcholine and fatty acids (the cytosolic pathway involving fas1 and fas2 genes) but also the transporters for direct uptake of these lipids from external sources (Fig. 4; Supplementary Data 12). Nevertheless, alternative mechanisms could supply cells with phosphatidylinositol, phosphatidylcholine, inositol and choline (Fig. 4; Supplementary Note 5).

Strikingly, most of the genes involved in fatty acid β-oxidation are missing in all three species, suggesting that fatty acids are not an energy source for Pneumocystis, further supporting a high reliance on glucose as the main energy source. Of note, all three Pneumocystis species lack the enzymes for synthesis of glycerol from glycerone-phosphate or monoacylglycerol but encode homologues of yeast proteins (Gup1 and Fps1) responsible for glycerol uptake and export (Fig. 4; Supplementary Fig. 9b). Although glycerol is the only non-sugar carbon source that can enter into the TCA cycle, and is also required for synthesis of glycerolphospholipids, these two transporters may also play an important role in maintaining osmotic balance given the loss of critical cell wall components as discussed below.

Cofactor metabolism is also largely reduced in Pneumocystis (Supplementary Data 13). All three Pneumocystis species are missing almost all enzymes required for de novo synthesis and membrane transport of pantothenate, but retain all enzymes needed to convert pantothenate to CoA, as well as a mitochondrial carrier protein (Leu5) for CoA (Supplementary Fig. 10), implying possible scavenging of pantothenate or its downstream metabolites from the host by other mechanisms (for example, endocytosis. Fig. 5; Supplementary Data 14). All three Pneumocystis species lack enzymes required for de novo synthesis of vitamins B1 and H, as well as ubiquinone and siderophores, but have a potential plasma membrane transporter for each of these cofactors (Supplementary Data 13; Fig. 4; Supplementary Fig. 9c,d). P. jirovecii also lacks enzymes for de novo synthesis of NAD, while retaining one salvage pathway using exogenous nicotinic acid mononucleotide imported by a transporter (Tna1). In contrast, both P. murina and P. carinii have all the enzymes and the transporter needed for de novo synthesis and salvage of NAD. All three Pneumocystis species show a near complete absence of proteins necessary for reductive iron assimilation and siderophore biosynthesis³² (Supplementary Data 13). Since each Pneumocystis species encodes five proteins with cysteine-rich CFEM domains (Supplementary Fig. 8), it is possible that, like Candida albicans³³, Pneumocystis is able to scavenge iron from host haem and haemoglobin using one or more of these proteins as an extracellular haem receptor.

Our analysis suggests that endocytosis may serve as a mechanism for Pneumocystis to obtain nutrients in the absence of multiple different receptors. In support of this hypothesis, we found that each Pneumocystis species encodes nearly all proteins involved in clathrin-dependent endocytosis (Supplementary Data 14). In addition, genes encoding various degradative enzymes (including proteases, lipases, ATPase families and proteasome shown in Supplementary Data 3,5,12 and 15) and transporters localized in mitochondria and vacuoles are expanded or retained, and some of them are highly expressed (Supplementary Data 20), while biosynthetic pathways for many nutrients (including amino acids, lipids and cofactors, as noted above) and plasma-membrane-associated transporter families are completely lost or reduced. Loss of these transporters may reflect low concentrations of their targets in the host lung milieu.

Lack of chitin, outer chain N-mannans and α-glucan in cell wall

Fungal cell walls typically are composed primarily of chitin, chitosan, glucans, mannans and glycoproteins, which are covalently cross-linked together to protect the cell from changes in environmental stresses, while allowing the organism to interact with its environment. The structure and biosynthesis of such cell walls are unique to fungi and thus serve as an excellent target for antifungal agents. The formation and assembly of the Pneumocystis cell wall remain poorly understood, though numerous studies have found it to be rich in glycoproteins and, in cysts only, β-glucans^34,35.

Remarkably, none of the three Pneumocystis species encodes the key enzyme chitin synthase required for chitin synthesis, or chitinases involved in chitin degradation during cell wall remodelling (Fig. 6; Supplementary Data 17). The absence of these two gene families strongly suggests that Pneumocystis does not contain chitin in its cell wall. While each Pneumocystis genome does encode homologues of a few accessory proteins not directly involved in chitin synthesis or degradation (such as Chs5 reported elsewhere³⁶), none of these shares any signature protein domains found in chitin synthase or chitinase (Supplementary Fig. 12). In Saccharomyces cerevisiae, Chs5 functions as a component of the exomer complex involved in export of chitin synthase and other membrane proteins³⁷; in Pneumocystis Chs5 could serve only the latter function, given the absence of chitin synthase. While the detection of chitin in P. carinii has been previously reported^36,38 based on reactivity with non-specific lectins such as wheat germ agglutinin, this staining may be detecting other molecules based on the absence of chitin synthases in the genome.

**Figure 6: Analysis of chitin in *Pneumocystis* and related fungi.**

We assayed for the presence of chitin in the Pneumocystis cell wall using a recombinant chitin-binding domain (CBD). While we found strong reactivity with Candida cell walls by in situ labelling, reactivity with Pneumocystis was totally absent (Fig. 6). In addition, we directly examined the cell wall content of partially purified Pneumocystis organisms and cultured S. cerevisiae cells using mass spectrometric analysis. No chitin-related oligosaccharides were identified in Pneumocystis, while they were detected in S. cerevisiae; glucan-related oligosaccharides were detected in both (Supplementary Note 6). The combination of these genomic and experimental results demonstrates that Pneumocystis is the first identified member of the fungal kingdom that does not have chitin.

β-glucans have been identified in the wall of the cyst form of Pneumocystis, but are absent from the more abundant trophic form^34,39. Pneumocystis genomes encode all the enzymes required for β-1,3- and β-1,6-glucan synthesis and degradation^34,39,40,41 (Supplementary Data 18). However, Pneumocystis species do not have any genes involved in synthesis and degradation of α-glucan, which has been found in many fungi, and which can block innate immune recognition by the β-glucan receptor⁴².

Although all Pneumocystis species have abundant surface glycoproteins, especially Msg, little is known about their glycosylation state and other post-translational modifications. We found that Pneumocystis species encode enzymes required for synthesis of the N- and O-linked glycan core structure (containing up to nine mannose residues), which are localized to the endoplasmic reticulum, but lack genes for enzymes residing in the Golgi apparatus, which add mannose outer chains, including all the enzymes comprising mannan polymerase complex I and complex II, and α-1,6-, α-1,2- and α-1,3-mannosyltransferase (Supplementary Data 19). The absence of these enzyme genes suggests that, unlike other fungi, Pneumocystis cell wall proteins including Msg are not highly mannosylated. N-linked profiling of PNGase-F-released N-glycans showed that M5N2 is the predominant N-linked glycan on P. carinii Msg proteins (Supplementary Fig. 13; Supplementary Note 7). Although trace amounts of M6N2 to M9N2 were detected as minor components, mannan type N-glycans with more than nine mannose residues were not detected. Moreover, glycopeptide mapping of Msg tryptic digest by using liquid chromatography-tandem mass spectrometry (LC-MS/MS) identified 31 N-linked glycans in 15 Msg isoforms, all of which carried M5N2 as the predominant component and only 1 of which carried M6N2 as an additional, minor component (Fig. 7; Supplementary Fig. 13; Supplementary Data 21). The lack of N-linked outer chain mannan may allow the organism to avoid recognition by innate immune responses, since in Candida such mannosylation is required for recognition by dectin-2, DC-SIGN and the macrophage mannose receptor of dendritic cells, while mutants that can synthesize only the core structure are poorly recognized^43,44.

**Figure 7: Lack of hyper-mannose (mannan) glycosylation in *Pneumocystis*.**

Transcription enrichment during infection

The expression level of each annotated gene was estimated using RNA-Seq data from three heavily infected animals each for P. murina and P. carinii. Overall, we find evidence of expression for nearly all genes (99%, see Methods) but variation in expression level over 5 orders of magnitude. Using Gene Set Enrichment Analysis (GSEA⁴⁵), we identified functional categories that were enriched in highly expressed genes in each species (Supplementary Data 20; Supplementary Fig. 14). The most highly enriched categories include Msgs and predicted secreted proteins (24–25 other than Msgs in each species). The enrichment of the latter suggests that additional secreted proteins may play an important role during infection. In addition, many functions involved in general metabolism of RNA and proteins, including the RNA RRM and LSM domain, are enriched among highly expressed genes in Pneumocystis.

Each Pneumocystis genome contains an exceptionally high intron density, with an average of 5 introns per gene, similar to only a few fungal genomes such as Cryptococcus with high levels of splicing and only slightly fewer than that in mammalian genomes (7–9 introns per gene)⁴⁶. This is unusual for highly compacted fungal genomes where intron loss is usually observed, as in S. cerevisiae⁴⁷ and Microsporidia⁴⁸. The transcription and splicing process for genes with many introns requires higher energy and cellular resources; this appears correlated to the relative expansion and high-level expression of RRM domain-containing genes (Fig. 2; Supplementary Fig. 14; Supplementary Data 6) and genes involved in spliceosome and mRNA surveillance in all Pneumocystis genomes (Supplementary Data 22). Using the RNA-Seq data, we identified high-level alternative splicing events in P. murina and P. carinii, with intron retention being the most common (detected for 42–49% of introns) and other types being infrequent (≤3% of introns for each type) (Supplementary Data 23). We assembled full-length alternatively spliced isoforms without premature termination codons for 263 and 275 genes of P. murina and P. carinii, respectively, though there is no functional enrichment of these genes. The high rate of intron retention correlates with the presence of all the components of the nonsense-mediated mRNA decay machinery in each Pneumocystis species (Supplementary Data 22). Alternative splicing of intron-containing genes could increase transcript diversity and regulate gene transcription or mRNA stability in this otherwise reduced genome, as suggested in earlier studies of P. carinii⁴⁹ and other organisms⁵⁰.

Adaption to a host lung environment

Our genome analysis has revealed new insights into the dependence of Pneumocystis on mammalian hosts. Pneumocystis has been identified almost exclusively in the lungs of humans and other mammals, where it remains extracellular but preferentially attaches to type I pneumocytes. Although an environmental reservoir has been hypothesized, there is no convincing evidence of such a reservoir; the current genome data strongly suggest that the entire life cycle occurs in the host. Genes involved in mating and meiosis are present and transcribed in all three Pneumocystis species (Supplementary Data 24), suggesting that sexual reproduction is actively occurring in lung tissue, as postulated in previous studies^9,11. Thus, Pneumocystis presumably must obtain nutrients and proliferate in the lung environment, and at the same time it must withstand host defenses for sufficient periods to allow direct transmission to another susceptible host.

During the process of adaption as an obligate pathogen, the Pneumocystis genome has contracted substantially compared with other fungi, though not to the extent of Microsporidia, which are intracellular organisms, while Pneumocystis are exclusively extracellular. The retention of all components for glucose uptake and catabolism while the glyoxylate and gluconeogenesis pathways were lost strongly suggests a high dependency on glucose from the host as the primary energy and carbon sources. Transcriptional regulation is also greatly simplified, implying that the host lung provides an optimal environment for Pneumocystis, with little need for sensing and responding to stress, as evidenced by the loss of genes involved in the pH sensing, osmotic and oxidative stress responses and cAMP-mediated signalling (Supplementary Data 25). The loss of the pyruvate fermentation pathway and other genes known to be essential for anaerobic growth (Supplementary Data 11 and 26) likely reflects adaptation exclusively to aerobic respiration in the lung environment with a stable oxygen supply. Another very unusual loss in Pneumocystis is the gene encoding carbonic anhydrase (Supplementary Data 9), which catalyses the interconversion between CO₂ and bicarbonate, a key mechanism for providing bicarbonate to cells and for regulation of intracellular pH, and which is widely distributed in all life kingdoms⁵¹. This enzyme may be unnecessary in the CO₂-rich lung environment (5–6%), while its absence suggests Pneumocystis is unable to survive under ambient air conditions (0.03–0.04% CO₂).

Discussion

Our study suggests that Pneumocystis is highly adapted to existence in the host lung, with strict dependence on the mammalian host for nutrients and a stable environment, while utilizing efficient strategies for immune evasion, thus facilitating colonization and persistence in its host. Among the most striking findings in this study is the loss of two important fungal cell wall components, chitin and outer chain N-mannans, both predicted by genome analysis and further supported by biochemical assays. Furthermore, β-1,3-glucan has previously been shown to be absent from the cell wall of the trophic form of Pneumocystis^34,39, which is ∼10- to 20-fold more abundant than the cyst form. While β-glucan is present and exposed in some Pneumocystis cysts, we have found that in most cysts the β-1,3-glucan is masked by Msg and other surface proteins (unpublished observations). Given that chitin, β-glucans and mannan are all known pathogen-associated molecular patterns (PAMPs) that trigger innate host defense responses through interactions with host pattern recognition receptors (PRRs) such as dectin-1 and DC-SIGN^43,52, their simultaneous absence or masking may be beneficial in evading or delaying recognition of Pneumocystis, especially the trophic form, by the innate immune system. Antigenic variation provided by Msg isoforms extends potential immune evasion to include adaptive immune responses. Retention of β-glucan in cysts presumably is critical to organism survival, possibly by providing a rigid spherical cell wall that is aerodynamically efficient for transmitting infection to other hosts, while protecting the organism from the harsher environmental conditions outside the lung⁵. The absence of chitin and β-glucan on the cell wall (Fig. 6) suggests that trophic forms are more fragile than other fungi. Greater malleability of the trophic cell wall could facilitate intimate contact with host cell membranes, and improve the ability of Pneumocystis to obtain nutrients by direct cell-to-cell contact or by endocytosis. The evolutionary adaptation by Pneumocystis is in sharp contrast to other pathogenic fungal species that normally live outside the mammalian host while being able to infect different host species and different tissues.

The availability of three high-quality Pneumocystis genomes provides important information about the biology of these species, and potentially provides insights that will guide developing in vitro culturing of this organism, a critical goal of the Pneumocystis research community. Moreover, given the loss of many biosynthetic and metabolic pathways, the remaining pathways, including the limited number of transporters (Supplementary Table 4), provide attractive targets for anti-Pneumocystis drug development.

Methods

Pneumocystis organisms from rodents and humans

P. murina-infected lung samples were obtained from CD40 ligand knock-out female mice²⁶. P. carinii-infected lung samples were obtained from immunosuppressed Sprague-Dawley male rats²⁴. P. jirovecii-infected autopsy lung samples were from one patient with AIDS, who was infected with only one P. jirovecii strain based on genotyping at five different genomic loci⁵³. Animal and human subject experimentation guidelines of the National Institutes of Health were followed in the conduct of these studies.

Preparation of genomic DNA samples

Pneumocystis-infected lung tissues were cut into small pieces, homogenized in Qiagen Tissuelyser (5 times for 20 s at 1/30 frequency), then centrifuged at 15,000g for 6 min. The pellet was resuspended in Trypsin/EDTA Solution (Lonza), incubated at 37 °C for 30 min and centrifuged at 15,000g for 6 min. The pellet was washed once in PBS, resuspended in lysis buffer containing 2.9% (w/v) collagenase type I (Gibco) and 0.1% (w/v) DNase I (Sigma), incubated at 37 °C for 30 min, and centrifuged at 15,000g for 6 min. This sequential enzyme digestion was expected to remove most of the host cells and DNA as well as potentially some of Pneumocystis trophic forms. The pellet was washed three times in PBS, resuspended in 100 μl of 0.5% (w/v) Zymolyase solution (Zymo Research) and incubated at 37 °C for 1 h. Genomic DNA was extracted using the MasterPure Yeast DNA Purification Kit (Epicentre). RNA was removed using DNase-free RNAase (Epicentre).

All DNA samples were analysed by quantitative real-time PCR (qPCR) assays using fluorescence resonance energy transfer probes⁵⁴ to measure the fraction of Pneumocystis DNA compared with host DNA in each sample. In qPCR for both P. murina²⁰ and P. carinii⁵⁴, the target was the single-copy dihydrofolate reductase (dhfr) gene of Pneumocystis and the target for the host was a highly conserved region of the single-copy polycystic kidney disease 1 (pkd1) gene²². The targets for P. jirovecii and human qPCR were the msg gene family⁵⁵ and the single-copy β-globin gene⁵⁶, respectively. The purity of each DNA sample was estimated by the ratio of Pneumocystis to host genome copy number and based on an estimated genome size of 8 Mb for Pneumocystis and 2.7–3.3 Gb for the host. The qPCR results were validated by 454 or Illumina MiSeq sequencing of selected samples. DNA samples extracted from enriched P. murina or P. carinii preparations contained up to 90% Pneumocystis DNA, and from enriched P. jirovecii preparations, 10–25% P. jirovecii DNA, while those extracted directly from heavily infected lung tissues contained <0.4–1% Pneumocystis DNA.

P. murina genome sequencing and assembly

To assess the quality of the P. murina DNA preparations for whole-genome sequencing, small-scale sequencing was first conducted using a 454 GS FLX Titanium Sequencer (Roche Applied Science) at Leidos, Inc. (Frederick, MD) according to the 454 standard shotgun sequencing protocols. A total of 4 million reads (with a mean length of 344 bases) were generated, with ∼40% of them having blast hits with P. carinii⁷. Subsequently, deep sequencing was performed using Illumina HiSeq technology at the Broad Institute (Cambridge, MA). Seven different P. murina DNA preparations (each with 60–90% purity for P. murina by qPCR) were used to construct small insert libraries; each was sequenced, and the library (preparation B123 from a single mouse, center project G11228) with the lowest percent of contaminating host mouse DNA was used for assembly. From this library, a total of 34 million 101 base paired-end reads with a mean insert size of 153 bases were generated. A second larger insert library (preparation C2 from another mouse, center project G11230) with a mean insert size of 1,247 bases was prepared, and a total of 83 million 101 base paired-end reads were generated. After removal of mouse sequences (version mm9) and P. murina mitochondrial sequences²², these reads were assembled with Allpaths (version R37380) with default parameters⁵⁷; the resulting assembly was screened for contaminating sequences (mouse and bacteria) and to remove P. murina mitochondrial sequences²² by aligning to the non-redundant nucleotide database from NCBI and by removing the scaffolds with high coverage that matched non-fungal organisms. The draft assembly of 24 scaffolds was further improved using long 454 reads and PCR to support scaffold joins. The final assembly contained 17 scaffolds. All internal contig gaps were closed by PCR and Sanger sequencing. The ends of scaffolds without msg genes or telomere repeats were extended by PCR and/or PacBio sequencing (Supplementary Methods).

P. carinii genome sequencing and assembly

After preliminary 454 sequencing demonstrated a purity of 63%, 1 P. carinii DNA preparation (no. B80 from 2 heavily infected rats, 80% purity for P. carinii by qPCR) was used to construct 2 libraries with a mean insert size of 180 bases and 793 bases, respectively. Each was sequenced using 101 base paired-end reads on the Illumina MiSeq platform at the Broad Institute, and, after removal of host rat sequences and P. carinii mitochondrial sequences²², assembled using AllPaths-LG (version R47825) with default parameters, resulting in a total of 53 scaffolds. After joining scaffolds using long 454 reads and confirming by PCR, the final genome assembly contained 17 scaffolds. All internal contig gaps were closed by PCR and Sanger sequencing. The ends of six scaffolds overlapped with the seven telomere sequences reported elsewhere⁵⁸; they were merged and confirmed by PCR and/or aligning merged scaffolds with Illumina and 454 raw reads. The ends of several scaffolds without msg genes, kexin genes or telomere repeats were extended by PCR and/or PacBio sequencing (Supplementary Methods).

P. jirovecii genome sequencing and assembly

Three enriched DNA samples (RU7, RU12 and RU817) from a single autopsy lung sample RU (1.4–2 μg each, with 10–25% P. jirovecii DNA by qPCR) were used to construct three libraries (156 base average insert size); each was sequenced on the Illumina HiSeq platform at the Broad Institute. To improve on initial assemblies with high numbers of contigs, additional sequence was generated from hybrid-selected DNA samples as previously described⁵⁹; 120 base oligo bait probes were designed to target regions present in the genome assembly as well as to pull in uncovered regions and transcripts; probes were designed for: existing assemblies⁸ including higher probe density at contig ends, unassembled reads⁸ that did not match the human genome, assembled RNA-Seq transcripts⁸ that did not match existing assemblies or the human genome and unanchored msg sequences²⁵. Host sequences were removed by aligning to human genome assembly 19 (GRCh37/hg19), and P. jirovecii mitochondrial sequences²² were removed. The remaining 101 base paired-end reads were assembled separately with Spades 2.5.1 (at the Broad Institute), Spades 3.0 (at Leidos) and Abyss (Biowulf at NIH), which resulted in 400, 149 and 312 contigs, respectively. All these contigs were merged with Sequencher (Gene Codes Co., Ann Arbour, MI), resulting in a total of 20 scaffolds. All internal gaps were closed by PCR and Sanger sequencing. Scaffolds were validated by examining raw reads aligned by Seqman Pro (DNAStar, Madison, WI) to ensure there were no false joins. To improve the coverage of msg genes in the assembly, PacBio sequencing was performed as described in Supplementary Methods.

Chromosome electrophoresis and Southern hybridization

To compare the P. murina assembly with chromosome number and length, we used contour clamped homogeneous electrical field (CHEF) electrophoresis to separate chromosomes. P. murina organisms were partially purified from fresh mouse lungs by Ficoll-Hypaque density gradient centrifugation⁶⁰, and then processed for CHEF⁶¹ using the CHEF Yeast Genomic DNA Plug Kit (Bio-Rad). Briefly, partially purified P. murina organisms were embedded in 0.8% CleanCut agarose, treated with 24 U ml⁻¹ proteinase K overnight at 50 °C, then washed four times in 1 × Wash Buffer and stored at 4 °C. Electrophoresis was performed in 1% agarose gels (14 × 21 cm) using the CHEF DR II apparatus (Bio-Rad). Gels were run in 0.5 × TBE buffer for 144 h at 135 V and 12.5 °C, with a 50 s initial pulse that was gradually increased to 100 s. The gel was stained with ethidium bromide (Sigma), then DNA was transferred to Nytran membranes (Schleicher & Schuell) under neutral conditions⁶¹. We used two blots prepared from the same DNA plug, and each blot was hybridized consecutively to different probes, with stripping of the blot between hybridizations. All probes were PCR-amplified double-stranded DNA fragments labelled using the PCR DIG Probe Synthesis Kit (Roche Applied Science) except for the probe for telomere repeats, which was a synthesized oligonucleotide labelled using the DIG Oligonucleotide Tailing kit (Roche Applied Science). All primer and probe sequences are provided in Supplementary Data 27. Hybridization and signal detection were performed by using the DIG Probe Hybridization system (Roche Applied Science) as described previously²².

RNA-Seq, expression levels and gene prediction

Three independent samples of P. murina and P. carinii organisms were partially purified from heavily infected lungs of three mice and rats, respectively, by Ficoll-Hypaque density gradient centrifugation⁶⁰. The partially purified pellets were expected to contain both cyst and trophic forms. Total RNA was isolated using the RNeasy Mini Kit (Qiagen). The ratio of Pneumocystis to host RNA was estimated to be from 1:4 to 8:1 based on the density of RNA bands in agarose gels for the host 28S rRNA and Pneumocystis 26S rRNA. Strand-specific libraries were constructed using the dUTP second strand marking method^62,63, except as noted below. For all samples, poly(A) RNA was purified using the Dynabeads mRNA Purification kit (Life Technologies), with two rounds of selection (with bead regeneration), resulting in <5% rRNA as measured by the Bioanalyzer RNA 6000 Pico chip (Agilent) program. The resulting mRNA (200 ng per sample) was treated with the Turbo DNA-free kit (Ambion), checked by qPCR for the absence of detectable genomic DNA (Supplementary Table 5), and fragmented in 1 × RNA fragmentation buffer (Affymetrix) for 4 min at 80 °C. After first-strand complementary DNA (cDNA) synthesis a 2.0 × volume of RNAClean SPRI beads (Beckman Coulter Genomics) clean-up was used instead of phenol/chloroform/isoamyl alcohol (25:24:1) extraction and ethanol precipitation. Size selection after adapter ligation was done with two 0.7 × Ampure beads (Beckman Coulter Genomics) clean-up steps and eight cycles of PCR were used to generate the library. Libraries were sequenced on the HiSeq2000 platform generating an average of 26.1 million paired 76 base reads for each of the 6 samples.

To examine expression levels, RNA-Seq reads from each of the three samples for each organism were aligned to the extracted coding DNA sequences (CDSs) using bowtie⁶⁵. The alignment bam files were then used to quantify transcript abundances by RSEM⁶⁶. We selected the top 5% of expression levels to examine functional enrichment by GSEA⁴⁵. For each organism, we examined GSEA of gene sets classified based on Pfam, TIGRFAM, KEGG and SignalP functional annotations. We also ran GSEA on some manually curated gene sets such as msg genes.

The relative expression level for each gene was expressed as fragments per kb of exon per million fragments mapped (FPKM). For both P. murina and P. carinii, 99% of all annotated genes were expressed (FPKM>2), with a median coverage of 151 FPKM in both species. Further, expressed genes showed good alignment depth across the transcript, with 98% of all genes in each species having at least fivefold RNA-Seq depth. All genes without detectable expression (26 in P. murina and 23 in P. carinii) contain short (average size of ∼250 bp, potentially pseudogenes) or incomplete CDSs.

The P. murina and P. carinii genomes were annotated using a combination of expression data (RNA-Seq), homology information and ab initio gene finding methods (Supplementary Methods) as previously described⁶⁷. These methods were also used to annotate the RU assembly of P. jirovecii, with the exception that RNA-Seq data was not used (Supplementary Methods). Gene sets were compared based on functional annotation and used as the basis for syntenic and phylogenetic analysis (Supplementary Methods).

Analysis of strain variation in P. jirovecii

To examine genome-wide single nucleotide polymorphisms (SNPs) between sequenced isolates, the RU7 assembly generated in this study was used as the reference compared with the SE8 raw reads⁸. SE8 reads were retrieved from NCBI (ERR135854 to ERR135863) and converted to fastq format. Individual fastq files were concatenated, and the full read set aligned to the RU7 assembly using BWA-MEM⁶⁸ and converted to sorted bam using SAMtools⁶⁴. This resulted in very high alignment depth across all assembly bases, with a median alignment depth of 1,046.

The Genome Analysis Toolkit⁶⁹ (GATK) v2.7–4-g6f46d11 was used to call both variant and reference bases from the alignments. Briefly, the Picard tools (http://picard.sourceforge.net) AddOrReplaceReadGroups, MarkDuplicates, CreateSequenceDictionary and ReorderSamwere were used to preprocess the alignments, followed by GATK RealignerTargetCreator and IndelRealigner for resolving misaligned reads close to indels. Next, GATK UnifiedGenotyper (with the haploid genotype likelihood model (GLM)) was run with both SNP and INDEL GLM. We additionally ran BaseRecalibrator and PrintReads for base quality score recalibration on sites called using GLM SNP and re-called variants with UnifiedGenotyper emitting all sites. A final filtering step was used to remove any position that was called by both GLMs (that is, incompatible indels and SNPs) and to remove positions with low read depth support (<10). The 24,902 SNPs were then mapped to the gene set predicted on the SE8 assembly; a total of 14,135 SNPs are located in CDS regions, with substitutions affecting protein coding regions as follows: 7,410 nonsynonymous, 6,690 synonymous, 26 nonsense and 9 extensions. SNP density across the SE8 assembly was calculated for 5-kb windows.

Detection of chitin by fluorescence labelling

The cDNA sequence encoding the CBD of Bacillus circulans chitinase A1 gene⁷⁰ (2,183–2,338 bp, GenBank accession code M57601.1) was optimized for bacterial expression (GenScript USA Inc.) and modified by the addition of ATG and 3 HA tags. The modified sequence was synthesized (GenScript USA Inc., Supplementary Fig. 15) and cloned into pET-28 vector (EMD Biosciences). The cDNA construct was transformed into Escherichia coli strain BL21 (DE3) RIL (Agilent technologies) and expressed as a His tag fusion protein. The expressed protein was purified in two steps, first using anti-CBD antibody (New England Biolabs) that was immobilized using AminoLink Plus Immobilization Kit (Thermo Scientific), and then the Hispur cobalt purification kit (Thermo Scientific). The purified protein was biotinylated using Lightning-Link Biotin conjugation kit (Type A) (Innova Biosciences Ltd) and used for fluorescence labelling (Histoserv, Inc., Germantown, MD) of P. murina-infected lung tissue sections fixed in Histochoice (Amresco, Inc.). Alexafluor 488-conjugated streptavidin was used for the detection of CBD. Pneumocystis organisms were detected by an anti-Msg antibody²⁶ and cysts by a dectin-Fc recombinant protein (kindly provided by Dr Chad Steele, the University of Alabama at Birmingham, Alabama)⁷¹. Mouse kidney sections infected with Candida (kindly provided by Dr Michail Lionakis, the National Institute of Allergy and Infectious Diseases, Bethesda, Maryland)⁷² were used as a positive control for chitin staining (Fig. 5).

Chitin glycosyl linkage analysis

To prepare cell wall fraction, P. carinii organisms were partially purified from infected rat lungs by Ficoll-Hypaque density gradient centrifugation⁶⁰; S. cerevisiae organisms were obtained from Stratagene (strain YPH499) and grown in standard yeast culture. Cells were resuspended in denaturing buffer (2% SDS, 1 mM EDTA in 50 mM Tris pH 8.0) and heated at 100 °C for 10 min. The cell wall pellet was recovered by centrifugation at 5,000g for 5 min., washed twice with PBS, and dried in a speed vacuum concentrator.

For chitin analysis, cell wall pellets were incubated with 0.1% (w/v) chitinase (Sigma) in 50 mM sodium acetate buffer, pH 5.6, at 37 °C for 72 h. Insoluble materials after the initial digestions were further treated sequentially with pronase (Roche), lyticase (Sigma) and chitinase to look for residual chitin. Protein digestion was performed using 0.01% (w/v) pronase in Tris-HCl buffer pH 8.2 with 10 mM CaCl₂ at 37 °C for 48 h. Glucan digestion was performed with 0.04% (w/v) lyticase in 50 mM sodium phosphate buffer, pH 7.5 at 37 °C for 24 h. Chitinase digestion was performed as above. After each digestion, the enzyme was inactivated by heating at 100 °C for 5 min.

Digested cell wall components were analysed by matrix-assisted laser-desorption ionization time-of-flight MS (MALDI/TOF-MS) and nanospray ionization MS (NSI-MSn). An aliquot of supernatant after each enzyme digestion was analysed by MS to determine oligosaccharide content. Prior to MS analysis, enzymes were removed by passing through a C18 Sep-PAK cartridge. The flow-through was collected in 5% acetic acid and lyophilized. The dried samples were permethylated using previously published methods⁷³ and profiled by MS. MALDI/TOF-MS was performed in the reflector positive ion mode using DHBA (α-dihyroxybenzoic acid, 20 mg ml⁻¹ solution in 50% methanol/water) as a matrix. The spectrum was obtained by using an AB SCIEX TOF/TOF 5800 (AB SCIEX). NSI-MSn analysis was performed on an LTQ Orbitrap XL mass spectrometer (Thermo Fisher) equipped with a nanospray ion source. Permethylated sample was dissolved in 1 mM NaOH in 50% methanol and infused directly into the instrument at a constant flow rate of 0.5 μl min⁻¹. A full Fourier transform MS spectrum was collected at 30,000 resolution. The capillary temperature was set at 210 °C and MS analysis was performed in the positive ion mode. For total ion mapping (automated MS/MS analysis), an m/z range of 200–2,000 was scanned with ion trap MS mode in successive 2.8 mass unit windows that overlapped the preceding window by 2 MU.

For determination of glycosyl linkages, partially methylated alditol acetates (PMAA) were prepared from fully permethylated glycans. The permethylated glycans were hydrolyzed with HCl/water/acetic acid (0.5:1.5:8, by vol.) at 80 °C for 18 h, followed by reduction with 1% NaBD₄ in 30 mM NaOH overnight, then acetylation with acetic anhydride/pyridine (1:1, v/v) at 100 °C for 15 min and analysis by gas chromatograph-MS (GC–MS) on an Agilent 7890A GC interfaced to a 5975C MSD (mass selective detector, electron impact ionization mode). The separation of the PMAA was performed on a 30-m SP2331 fused silica capillary column (Supelco) for the neutral sugar derivatives, and a DB-1 (Agilent) column for amino sugar derivatives.

Msg protein glycosylation identification

To purify P. carinii Msg proteins, partially purified P. carinii organisms⁶⁰ were resuspended in 2% SDS in 50 mM Tris-HCl buffer, pH 8.0, boiled for 10 min and centrifuged. The supernatant was collected. The extraction procedure was repeated two more times, the supernatants were pooled and SDS was removed using the SDS-Out SDS Precipitation Kit (Thermo Scientific). Solubilized Msg was affinity purified using an anti-Msg monoclonal antibody (RA-E7, gift of Drs Peter Walzer and Michael Linke⁷⁴, the University of Cincinnati, Cincinnati, Ohio) immobilized on to AminoLink plus column (AminoLink plus immobilization kit) following manufacturer’s instructions. The protein extract was diluted with an equal volume of Tris buffered saline and incubated with the immobilized antibody in the column by end-over-end mixing overnight at 4 °C. The column was then drained and washed with 0.1 M sodium phosphate buffer pH 6.9. The Msg was eluted with 150 mM ammonium hydroxide and fractions were neutralized with 1 M NaH₂PO₄. The fractions containing Msg were pooled and concentrated using Microcon-30 kDa centrifugal filter unit (Millipore) and the buffer was exchanged to 0.1 M sodium phosphate buffer pH 6.9. Aliquots were stored at −80 °C.

Purified Msg proteins were subjected to glycopeptide detection by LC-MS. Glycoproteomic analysis was carried out as previously described⁷⁵. Briefly, the tryptic digested peptides of purified Msg proteins were separated with a Magic C18 column (15 cm × 75 μm, Bruker-Michrom, CA) and analysed with an Orbitrap Fusion (Thermo Scientific) mass spectrometer after Msg was denatured, reduced, alkylated and desalted. Gradient elution was performed with a 30 min linear gradient of 5–35% acetonitrile in 0.1% formic acid at a flow rate of 300 nl min⁻¹. The MS data were processed automatically by a data-dependent MS scan pipeline, which integrated three dissociation methods, including higher energy collisional dissociation (HCD), electron transfer dissociation (ETD) and collision-induced dissociation (CID). In this pipeline, the full MS spectrum was first acquired from most abundant ions between m/z 350 and m/z 1,550 with a 3-second cycle time at 120,000 resolutions in FT mode. Subsequently the acquired MS data were assessed by MS/MS with HCD-product-dependent ETD or HCD-product-dependent ETD/CID mode with most abundant ions within 3-s cycle time. For the HCD-product ion trigger MS/MS, glycan oxonium ions at m/z 204.0867 (HexNAc), m/z 138.0545 (HexNAc fragment) or m/z 366.1396 (Hexose-HexNAc) in the HCD spectra were used to trigger ETD or CID acquisition. The MS/MS were measured at a resolution of 15,000.

For the mapping of glycopeptides from P. carinii Msg digests, LC-MS data were analysed using Byonic software (Protein Metrics) and the data annotation by the software was manually validated. Byonic parameters were set to allow 3.0 p.p.m. of precursor ion mass tolerance and 3.0 p.p.m. of fragment ion tolerance with monoisotopic mass. Digested peptides were allowed with up to three missed internal cleavage sites, and the differential modifications of 57.02146 and 15.9949 Da were allowed for alkylated cysteine and oxidation of methionine, respectively. Protein database used for protein blast included the protein set of the rat genome at NCBI (version 104) and of the P. carinii genome B80 generated in this study. Glycan database used was 309 mammalian N-glycan database (Default database in the Byonic software). Released N-linked glycan analysis of Msg was performed prior to the glycopeptide mapping.

Data annotation by the software was further filtered as follows. Any protein identification with FDR >1%, log probability <4 or best Byonic score <500 was excluded. Moreover, any glycopeptide spectra annotations with Byonic score <400 were excluded. Glycopeptide identifications that remained after these strict filters were manually validated by looking at fragment ions matched to the theoretical glycopeptide fragmentation, the presence of a series of expected glycan oxonium ions and neutral loss of the glycan moiety for MS/MS-HCD data.

Additional information

Accession codes: Illumina raw reads have been deposited in the NCBI Sequence Read Archive for the genome sequencing of P. murina (accession codes SRX248091 and SRX1435036), P. carinii (accession codes SRX387635 and SRX387636) and P. jirovecii (accession code SRX387638). Illumina raw reads for RNA-Seq of P. murina and P. carinii have been deposited in the NCBI Sequence Read Archive under accession code PRJNA292577. Genome assemblies and annotations have been deposited in the NCBI BioProject database with accession codes PRJNA70803 (P. murina), PRJNA223511 (P. carinii) and PRJNA223510 (P. jirovecii). All sequence data has been linked to NCBI Umbrella project PRJNA223519.

How to cite this article: Ma, L. et al. Genome analysis of three Pneumocystis species reveals adaptation mechanisms to life exclusively in mammalian hosts. Nat. Commun. 7:10740 doi: 10.1038/ncomms10740 (2016).

Accession codes

Accessions

GenBank/EMBL/DDBJ

M57601.1

Sequence Read Archive

References

Rabodonirina, M. et al. Molecular evidence of interhuman transmission of Pneumocystis pneumonia among renal transplant recipients hospitalized with HIV-infected patients. Emerg. Infect. Dis. 10, 1766–1773 (2004).
Article PubMed PubMed Central Google Scholar
Sassi, M. et al. Outbreaks of Pneumocystis pneumonia in 2 renal transplant centers linked to a single strain of Pneumocystis: implications for transmission and virulence. Clin. Infect. Dis. 54, 1437–1444 (2012).
Article CAS PubMed PubMed Central Google Scholar
Morris, A. & Norris, K. A. Colonization by Pneumocystis jirovecii and its role in disease. Clin. Microbiol. Rev. 25, 297–317 (2012).
Article CAS PubMed PubMed Central Google Scholar
Thomas, C. F. Jr & Limper, A. H. Current insights into the biology and pathogenesis of Pneumocystis pneumonia. Nat. Rev. Microbiol. 5, 298–308 (2007).
Article CAS PubMed Google Scholar
Cushion, M. T. et al. Echinocandin treatment of Pneumocystis pneumonia in rodent models depletes cysts leaving trophic burdens that cannot transmit the infection. PLoS ONE 5, e8524 (2010).
Article ADS PubMed PubMed Central Google Scholar
Cushion, M. T. & Arnold, J. Proposal for a Pneumocystis genome project. J. Eukaryot. Microbiol. 44, 7S (1997).
Article CAS PubMed Google Scholar
Slaven, B. E. et al. Draft assembly and annotation of the Pneumocystis carinii genome. J. Eukaryot. Microbiol. 53, (Suppl 1): S89–S91 (2006).
Article PubMed Google Scholar
Cisse, O. H., Pagni, M. & Hauser, P. M. De novo assembly of the Pneumocystis jirovecii genome from a single bronchoalveolar lavage fluid specimen from a patient. MBio 4, e00428–00412 (2012).
Article PubMed PubMed Central Google Scholar
Almeida, J. M., Cisse, O. H., Fonseca, A., Pagni, M. & Hauser, P. M. Comparative genomics suggests primary homothallism of Pneumocystis species. MBio 6, e02250–02214 (2015).
Article CAS PubMed PubMed Central Google Scholar
Cisse, O. H., Pagni, M. & Hauser, P. M. Comparative genomics suggests that the human pathogenic fungus Pneumocystis jirovecii acquired obligate biotrophy through gene loss. Genome Biol. Evol. 6, 1938–1948 (2014).
Article PubMed PubMed Central Google Scholar
Cushion, M. T. et al. Transcriptome of Pneumocystis carinii during fulminate infection: carbohydrate metabolism and the concept of a compatible parasite. PLoS ONE 2, e423 (2007).
Article ADS PubMed PubMed Central Google Scholar
Hauser, P. M. et al. Comparative genomics suggests that the fungal pathogen Pneumocystis is an obligate parasite scavenging amino acids from its host's lungs. PLoS ONE 5, e15152 (2010).
Article ADS CAS PubMed PubMed Central Google Scholar
Porollo, A., Sesterhenn, T. M., Collins, M. S., Welge, J. A. & Cushion, M. T. Comparative genomics of Pneumocystis species suggests the absence of genes for myo-inositol synthesis and reliance on inositol transport and metabolism. MBio 5, e01834–01814 (2014).
Article CAS PubMed PubMed Central Google Scholar
Hong, S. T. et al. Pneumocystis carinii karyotypes. J. Clin. Microbiol. 28, 1785–1795 (1990).
CAS PubMed PubMed Central Google Scholar
Stringer, J. R. et al. Molecular genetic distinction of Pneumocystis carinii from rats and humans. J. Eukaryot. Microbiol. 40, 733–741 (1993).
Article CAS PubMed Google Scholar
Moran, G. P., Coleman, D. C. & Sullivan, D. J. Comparative genomics and the evolution of pathogenicity in human pathogenic fungi. Eukaryot. Cell 10, 34–42 (2011).
Article CAS PubMed PubMed Central Google Scholar
Rhind, N. et al. Comparative functional genomics of the fission yeasts. Science 332, 930–936 (2011).
Article ADS CAS PubMed PubMed Central Google Scholar
Cisse, O. H. et al. Genome sequencing of the plant pathogen Taphrina deformans, the causal agent of peach leaf curl. MBio 4, e00055–00013 (2013).
Article CAS PubMed PubMed Central Google Scholar
Seoighe, C. et al. Prevalence of small inversions in yeast gene order evolution. Proc. Natl Acad. Sci. USA 97, 14433–14437 (2000).
Article ADS CAS PubMed PubMed Central Google Scholar
Vestereng, V. H. et al. Quantitative real-time polymerase chain-reaction assay allows characterization of Pneumocystis infection in immunocompetent mice. J. Infect. Dis. 189, 1540–1544 (2004).
Article CAS PubMed Google Scholar
Katinka, M. D. et al. Genome sequence and gene compaction of the eukaryote parasite Encephalitozoon cuniculi. Nature 414, 450–453 (2001).
Article ADS CAS PubMed Google Scholar
Ma, L. et al. Sequencing and characterization of the complete mitochondrial genomes of three Pneumocystis species provide new insights into divergence between human and rodent Pneumocystis. FASEB J. 27, 1962–1972 (2013).
Article CAS PubMed PubMed Central Google Scholar
Deitsch, K. W., Lukehart, S. A. & Stringer, J. R. Common strategies for antigenic variation by bacterial, fungal and protozoan pathogens. Nat. Rev. Microbiol. 7, 493–503 (2009).
Article CAS PubMed PubMed Central Google Scholar
Angus, C. W., Tu, A., Vogel, P., Qin, M. & Kovacs, J. A. Expression of variants of the major surface glycoprotein of Pneumocystis carinii. J. Exp. Med. 183, 1229–1234 (1996).
Article CAS PubMed Google Scholar
Kutty, G., Maldarelli, F., Achaz, G. & Kovacs, J. A. Variation in the major surface glycoprotein genes in Pneumocystis jirovecii. J. Infect. Dis. 198, 741–749 (2008).
Article CAS PubMed Google Scholar
Bishop, L. R., Helman, D. & Kovacs, J. A. Discordant antibody and cellular responses to Pneumocystis major surface glycoprotein variants in mice. BMC Immunol. 13, 39 (2012).
Article CAS PubMed PubMed Central Google Scholar
Kutty, G. & Kovacs, J. A. A single-copy gene encodes Kex1, a serine endoprotease of Pneumocystis jiroveci. Infect. Immun. 71, 571–574 (2003).
Article CAS PubMed PubMed Central Google Scholar
Lee, L. H. et al. Molecular characterization of KEX1, a kexin-like protease in mouse Pneumocystis carinii. Gene 242, 141–150 (2000).
Article CAS PubMed Google Scholar
Lugli, E. B., Bampton, E. T., Ferguson, D. J. & Wakefield, A. E. Cell surface protease PRT1 identified in the fungal pathogen Pneumocystis carinii. Mol. Microbiol. 31, 1723–1733 (1999).
Article CAS PubMed Google Scholar
Lipschik, G. Y., Masur, H. & Kovacs, J. A. Polyamine metabolism in Pneumocystis carinii. J. Infect. Dis. 163, 1121–1127 (1991).
Article CAS PubMed Google Scholar
Furlong, S. T. et al. Lipid transfer from human epithelial cells to Pneumocystis carinii in vitro. J. Infect. Dis. 175, 661–668 (1997).
Article CAS PubMed Google Scholar
Kornitzer, D. Fungal mechanisms for host iron acquisition. Curr. Opin. Microbiol. 12, 377–383 (2009).
Article CAS PubMed Google Scholar
Kuznets, G. et al. A relay network of extracellular haem-binding proteins drives C. albicans iron acquisition from haemoglobin. PLoS. Pathog. 10, e1004407 (2014).
Article PubMed PubMed Central Google Scholar
Kottom, T. J. & Limper, A. H. Cell wall assembly by Pneumocystis carinii. Evidence for a unique gsc-1 subunit mediating beta -1,3-glucan deposition. J. Biol. Chem. 275, 40628–40634 (2000).
Article CAS PubMed Google Scholar
Linke, M. J., Cushion, M. T. & Walzer, P. D. Properties of the major antigens of rat and human Pneumocystis carinii. Infect. Immun. 57, 1547–1555 (1989).
CAS PubMed PubMed Central Google Scholar
Villegas, L. R., Kottom, T. J. & Limper, A. H. Chitinases in Pneumocystis carinii pneumonia. Med. Microbiol. Immunol. 201, 337–348 (2012).
Article CAS PubMed PubMed Central Google Scholar
Sanchatjate, S. & Schekman, R. Chs5/6 complex: a multiprotein complex that interacts with and conveys chitin synthase III from the trans-Golgi network to the cell surface. Mol. Biol. Cell. 17, 4157–4166 (2006).
Article CAS PubMed PubMed Central Google Scholar
Walker, A. N., Garner, R. E. & Horst, M. N. Immunocytochemical detection of chitin in Pneumocystis carinii. Infect. Immun. 58, 412–415 (1990).
CAS PubMed PubMed Central Google Scholar
Kutty, G., Davis, A. S., Ma, L., Taubenberger, J. K. & Kovacs, J. A. Pneumocystis encodes a functional endo-beta-1,3-glucanase that is expressed exclusively in cysts. J. Infect. Dis. 211, 719–728 (2015).
Article CAS PubMed Google Scholar
Kottom, T. J., Hebrink, D. M., Jenson, P. E., Gudmundsson, G. & Limper, A. H. Evidence for pro-inflammatory beta-1,6 glucans in the Pneumocystis cell wall. Infect. Immun. 83, 2816–2826 (2015).
Article CAS PubMed PubMed Central Google Scholar
Villegas, L. R., Kottom, T. J. & Limper, A. H. Characterization of PCEng2, a {beta}-1,3-endoglucanase homologue in Pneumocystis carinii with activity in cell wall regulation. Am. J. Respir. Cell Mol. Biol. 43, 192–200 (2010).
Article CAS PubMed Google Scholar
Rappleye, C. A., Eissenberg, L. G. & Goldman, W. E. Histoplasma capsulatum alpha-(1,3)-glucan blocks innate immune recognition by the beta-glucan receptor. Proc. Natl Acad. Sci. USA 104, 1366–1370 (2007).
Article ADS CAS PubMed PubMed Central Google Scholar
Cambi, A. et al. Dendritic cell interaction with Candida albicans critically depends on N-linked mannan. J. Biol. Chem. 283, 20590–20599 (2008).
Article CAS PubMed PubMed Central Google Scholar
Saijo, S. et al. Dectin-2 recognition of alpha-mannans and induction of Th17 cell differentiation is essential for host defense against Candida albicans. Immunity 32, 681–691 (2010).
Article CAS PubMed Google Scholar
Subramanian, A. et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc. Natl Acad. Sci. USA 102, 15545–15550 (2005).
Article ADS CAS PubMed PubMed Central Google Scholar
Irimia, M., Rukov, J. L., Penny, D. & Roy, S. W. Functional and evolutionary analysis of alternatively spliced genes is consistent with an early eukaryotic origin of alternative splicing. BMC Evol. Biol. 7, 188 (2007).
Article PubMed PubMed Central Google Scholar
Goffeau, A. et al. Life with 6000 genes. Science 274, 563–547 (1996).
Article Google Scholar
Keeling, P. J. et al. The reduced genome of the parasitic microsporidian Enterocytozoon bieneusi lacks genes for core carbon metabolism. Genome Biol. Evol. 2, 304–309 (2010).
Article PubMed PubMed Central Google Scholar
Ye, D., Lee, C. H. & Queener, S. F. Differential splicing of Pneumocystis carinii f. sp. carinii inosine 5'-monophosphate dehydrogenase pre-mRNA. Gene 263, 151–158 (2001).
Article CAS PubMed Google Scholar
Kornblihtt, A. R. CTCF: from insulators to alternative splicing regulation. Cell Res. 22, 450–452 (2012).
Article CAS PubMed PubMed Central Google Scholar
Elleuche, S. & Poggeler, S. Carbonic anhydrases in fungi. Microbiology. 156, 23–29 (2010).
Article CAS PubMed Google Scholar
Amarsaikhan, N. & Templeton, S. P. Co-recognition of beta-glucan and chitin and programming of adaptive immunity to Aspergillus fumigatus. Front. Microbiol. 6, 344 (2015).
Article PubMed PubMed Central Google Scholar
Ma, L. et al. Analysis of variation in tandem repeats in the intron of the major surface glycoprotein expression site of the human form of Pneumocystis carinii. J. Infect. Dis. 186, 1647–1654 (2002).
Article CAS PubMed Google Scholar
Larsen, H. H. et al. Development of a rapid real-time PCR assay for quantitation of Pneumocystis carinii f. sp. carinii. J. Clin. Microbiol. 40, 2989–2993 (2002).
Article CAS PubMed PubMed Central Google Scholar
Larsen, H. H. et al. Development and evaluation of a quantitative, touch-down, real-time PCR assay for diagnosing Pneumocystis carinii pneumonia. J. Clin. Microbiol. 40, 490–494 (2002).
Article CAS PubMed PubMed Central Google Scholar
Morse, C. G. et al. HIV infection and antiretroviral therapy have divergent effects on mitochondria in adipose tissue. J. Infect. Dis. 205, 1778–1787 (2012).
Article CAS PubMed PubMed Central Google Scholar
Maccallum, I. et al. ALLPATHS 2: small genomes assembled accurately and with high continuity from short paired reads. Genome Biol. 10, R103 (2009).
Article PubMed PubMed Central Google Scholar
Keely, S. P. et al. Gene arrays at Pneumocystis carinii telomeres. Genetics 170, 1589–1600 (2005).
Article CAS PubMed PubMed Central Google Scholar
Melnikov, A. et al. Hybrid selection for sequencing pathogen genomes from clinical samples. Genome Biol. 12, R73 (2011).
Article CAS PubMed PubMed Central Google Scholar
Kovacs, J. A. et al. Identification of antigens and antibodies specific for Pneumocystis carinii. J. Immunol. 140, 2023–2031 (1988).
CAS PubMed Google Scholar
Lundgren, B., Cotton, R., Lundgren, J. D., Edman, J. C. & Kovacs, J. A. Identification of Pneumocystis carinii chromosomes and mapping of five genes. Infect. Immun. 58, 1705–1710 (1990).
CAS PubMed PubMed Central Google Scholar
Levin, J. Z. et al. Comprehensive comparative analysis of strand-specific RNA sequencing methods. Nat. Methods 7, 709–715 (2010).
Article CAS PubMed PubMed Central Google Scholar
Parkhomchuk, D. et al. Transcriptome analysis by strand-specific sequencing of complementary DNA. Nucleic Acids Res. 37, e123 (2009).
Article PubMed PubMed Central Google Scholar
Li, H. et al. The sequence alignment/map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).
Article PubMed PubMed Central Google Scholar
Langmead, B., Trapnell, C., Pop, M. & Salzberg, S. L. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10, R25 (2009).
Article PubMed PubMed Central Google Scholar
Li, B. & Dewey, C. N. RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome. BMC Bioinformatics 12, 323 (2011).
Article CAS PubMed PubMed Central Google Scholar
Haas, B. J., Zeng, Q., Pearson, M. D., Cuomo, C. A. & Wortman, J. R. Approaches to fungal genome annotation. Mycology 2, 118–141 (2011).
CAS PubMed Google Scholar
Li, H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. Preprint at http://arxiv.org/abs/1303.3997 (2013).
McKenna, A. et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20, 1297–1303 (2010).
Article CAS PubMed PubMed Central Google Scholar
Watanabe, T. et al. The roles of the C-terminal domain and type III domains of chitinase A1 from Bacillus circulans WL-12 in chitin degradation. J. Bacteriol. 176, 4465–4472 (1994).
Article CAS PubMed PubMed Central Google Scholar
Rapaka, R. R. et al. Enhanced defense against Pneumocystis carinii mediated by a novel dectin-1 receptor Fc fusion protein. J. Immunol. 178, 3702–3712 (2007).
Article CAS PubMed Google Scholar
Lionakis, M. S. et al. CX3CR1-dependent renal macrophage survival promotes Candida control and host survival. J. Clin. Invest. 123, 5035–5051 (2013).
Article CAS PubMed PubMed Central Google Scholar
Anumula, K. R. & Taylor, P. B. A comprehensive procedure for preparation of partially methylated alditol acetates from glycoprotein carbohydrates. Anal. Biochem. 203, 101–108 (1992).
Article CAS PubMed Google Scholar
Linke, M. J., Sunkin, S. M., Andrews, R. P., Stringer, J. R. & Walzer, P. D. Expression, structure, and location of epitopes of the major surface glycoprotein of Pneumocystis carinii f. sp. carinii. Clin. Diagn. Lab. Immunol. 5, 50–57 (1998).
CAS PubMed PubMed Central Google Scholar
Yin, X. et al. Glycoproteomic analysis of the secretome of human endothelial cells. Mol. Cell. Proteomics 12, 956–978 (2013).
Article CAS PubMed PubMed Central Google Scholar

Download references

Acknowledgements

This project has been funded in whole or in part with federal funds from the Intramural Research Program of the US National Institutes of Health Clinical Center; the National Institute of Allergy and Infectious Diseases; the National Cancer Institute, the National Institutes of Health, under Contract No. HHSN261200800001E; the National Human Genome Research Institute (grant U54HG003067 to the Broad Institute); and the National Institutes of Health (NIH/NCRR)-funded grant entitled ‘Integrated Technology Resource for Biomedical Glycomics’ (grant 1 P41 RR018502-01) to the Complex Carbohydrate Research Center. Y.X. was supported by a scholarship from the Chongqing Medical University (Chongqing, China). X.D was supported by the Public Health Bureau of Guangzhou City (grant 2009-YB-089009) and the Department of Public Health, Guangdong Province (grant 2012513), China. X.S. was supported by the China Scholarship Council. We thank Rene Costello and Howard Mostowski for their support of the animal studies; Michail Lionakis for providing Candida-infected mouse tissue; Chad Steele for providing the dectin-Fc construct; Peter Walzer and Michael Linke for providing antibody RA-E7; the Broad Institute Genomics Platform for Illumina sequencing of DNA and RNA samples; Carsten Russ and Sinéad Chapman for coordination of Illumina sequencing at the Broad Institute; Jennifer Troyer for coordination of Pacific Bioscience sequencing at Leidos Inc.; Zuoming Deng and Biswajit Das for help with genome assembly; Alexander Li and Lily Lin for assistance with rDNA PCR; Jing Zhang for assistance with DNA hybridization; Margaret Priest for assistance with annotation; Lucia Alvarado-Balderrama for assistance with genome data submission and release; Rhys Farrer for assistance with displaying synteny; and Henry Masur, June Kwon-Chung, Peter Williamson, John Bennett and Christopher Desjardins for reviewing the manuscript.

Author information

Da Wei Huang
Present address: Present address: Laboratory of Molecular Biology, National Cancer Institute, Bethesda, Maryland 20892, USA,
Jonathan M. Goldberg
Present address: Present address: Department of Infectious and Immunological Diseases, Harvard T. H. Chan School of Public Health, Boston, Massachusetts 02115, USA,
Liang Ma and Zehua Chen: These authors contributed equally to this work
Christina A. Cuomo and Joseph A. Kovacs: These authors jointly supervised this work

Authors and Affiliations

Critical Care Medicine Department, NIH Clinical Center, National Institutes of Health, Building 10, Room 2C145, 10 Center Drive, Bethesda, Maryland 20892, USA,
Liang Ma, Geetha Kutty, Honghui Wang, Lisa Bishop, Emma Davey, Rebecca Deng, Xilong Deng, Giovanna Fantoni, Emile Gogineni, Grace Handley, Charles Huber, Yueqin Liu, Monica Sassi, Xiaohong Song, Laura Walsh, Yun Xia & Joseph A. Kovacs
Genome Sequencing and Analysis Program, Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, 02142, Massachusetts, USA
Zehua Chen, Amr Abouelleil, Lin Fan, Michael Fitzgerald, Jonathan M. Goldberg, Joshua Z. Levin, Pendexter Macdonald, Alexandre Melnikov, Sean Sykes, Sarah Young, Qiandong Zeng, Chad Nusbaum, Bruce W. Birren & Christina A. Cuomo
Leidos BioMedical Research, Inc., Frederick National Laboratory for Cancer Research, Frederick, 21701, Maryland, USA
Da Wei Huang, Xiaojun Hu, Xiaoli Jiao, Kristine Jones, Castle Raley, Brad T. Sherman, Bao Tran, Jun Yang, Xin Zheng, Robert Stephens & Richard A. Lempicki
Complex Carbohydrate Research Center, University of Georgia, Athens, 30602, Georgia, USA
Mayumi Ishihara & Parastoo Azadi

Authors

Liang Ma
View author publications
You can also search for this author in PubMed Google Scholar
Zehua Chen
View author publications
You can also search for this author in PubMed Google Scholar
Da Wei Huang
View author publications
You can also search for this author in PubMed Google Scholar
Geetha Kutty
View author publications
You can also search for this author in PubMed Google Scholar
Mayumi Ishihara
View author publications
You can also search for this author in PubMed Google Scholar
Honghui Wang
View author publications
You can also search for this author in PubMed Google Scholar
Amr Abouelleil
View author publications
You can also search for this author in PubMed Google Scholar
Lisa Bishop
View author publications
You can also search for this author in PubMed Google Scholar
Emma Davey
View author publications
You can also search for this author in PubMed Google Scholar
Rebecca Deng
View author publications
You can also search for this author in PubMed Google Scholar
Xilong Deng
View author publications
You can also search for this author in PubMed Google Scholar
Lin Fan
View author publications
You can also search for this author in PubMed Google Scholar
Giovanna Fantoni
View author publications
You can also search for this author in PubMed Google Scholar
Michael Fitzgerald
View author publications
You can also search for this author in PubMed Google Scholar
Emile Gogineni
View author publications
You can also search for this author in PubMed Google Scholar
Jonathan M. Goldberg
View author publications
You can also search for this author in PubMed Google Scholar
Grace Handley
View author publications
You can also search for this author in PubMed Google Scholar
Xiaojun Hu
View author publications
You can also search for this author in PubMed Google Scholar
Charles Huber
View author publications
You can also search for this author in PubMed Google Scholar
Xiaoli Jiao
View author publications
You can also search for this author in PubMed Google Scholar
Kristine Jones
View author publications
You can also search for this author in PubMed Google Scholar
Joshua Z. Levin
View author publications
You can also search for this author in PubMed Google Scholar
Yueqin Liu
View author publications
You can also search for this author in PubMed Google Scholar
Pendexter Macdonald
View author publications
You can also search for this author in PubMed Google Scholar
Alexandre Melnikov
View author publications
You can also search for this author in PubMed Google Scholar
Castle Raley
View author publications
You can also search for this author in PubMed Google Scholar
Monica Sassi
View author publications
You can also search for this author in PubMed Google Scholar
Brad T. Sherman
View author publications
You can also search for this author in PubMed Google Scholar
Xiaohong Song
View author publications
You can also search for this author in PubMed Google Scholar
Sean Sykes
View author publications
You can also search for this author in PubMed Google Scholar
Bao Tran
View author publications
You can also search for this author in PubMed Google Scholar
Laura Walsh
View author publications
You can also search for this author in PubMed Google Scholar
Yun Xia
View author publications
You can also search for this author in PubMed Google Scholar
Jun Yang
View author publications
You can also search for this author in PubMed Google Scholar
Sarah Young
View author publications
You can also search for this author in PubMed Google Scholar
Qiandong Zeng
View author publications
You can also search for this author in PubMed Google Scholar
Xin Zheng
View author publications
You can also search for this author in PubMed Google Scholar
Robert Stephens
View author publications
You can also search for this author in PubMed Google Scholar
Chad Nusbaum
View author publications
You can also search for this author in PubMed Google Scholar
Bruce W. Birren
View author publications
You can also search for this author in PubMed Google Scholar
Parastoo Azadi
View author publications
You can also search for this author in PubMed Google Scholar
Richard A. Lempicki
View author publications
You can also search for this author in PubMed Google Scholar
Christina A. Cuomo
View author publications
You can also search for this author in PubMed Google Scholar
Joseph A. Kovacs
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

L.M., Z.C., D.W.H., R.S., C.N., B.W.B., P.A., R.A.L., C.A.C. and J.A.K. were involved in study design. L.M., G.K., L.B., G.F., E.G. and M.S. were involved in sample preparation. A.A., M.F., J.M.G., P.M., A.M., S.Y. and Q.Z. were involved in Illumina sequencing. L.F. and J.Z.L. were involved in Illumina RNA sequencing. D.W.H., X.H., X.J., B.T.S., J.Y. and X.Z were involved in 454 sequencing. D.W.H., K.J., C.R., B.T. and R.S. were involved in PacBio sequencing. E.D., R.D., X.D., E.G., G.H., Y.L., X.S., L.W. and Y.X. were involved in PCR and Sanger sequencing. C.H., X.D., Y.X. and X.S. performed Southern blot. A.A., S.S., S.Y., P.M., M.F., J.M.G., Q.Z., Z.C., D.W.H., X.H., B.T.S. and X.Z. were involved in genome assembly and annotation. Z.C., C.A.C., L.M. and J.A.K. were involved in comparative genomic analysis. G.K., M.I. and P.A. were involved in chitin analysis. G.K., M.I., H.W. and P.A. were involved in msg and glycan MS. L.M., Z.C., C.A.C. and J.A.K. were involved in manuscript writing. All authors reviewed the manuscript. The content of this publication does not necessarily reflect the views or policies of the Department of Health and Human Services, nor does mention of trade names, commercial products or organizations imply endorsement by the US Government.

Corresponding authors

Correspondence to Liang Ma, Christina A. Cuomo or Joseph A. Kovacs.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Information

Supplementary Figures 1-15, Supplementary Tables 1-5, Supplementary Notes 1-7, Supplementary Methods and Supplementary References (PDF 13732 kb)

Supplementary Data 1

Members of the msg superfamily in three Pneumocystis species. (XLSX 55 kb)

Supplementary Data 2

Pneumocystis genes involved in DNA recombination. (XLSX 13 kb)

Supplementary Data 3

Pneumocystis peptidase family. (XLSX 19 kb)

Supplementary Data 4

Pneumocystis genes encoding histone deacetylase. (XLSX 11 kb)

Supplementary Data 5

Pneumocystis genes encoding ATPase. (XLSX 17 kb)

Supplementary Data 6

Pneumocystis RRM-domain-containing proteins. (XLSX 23 kb)

Supplementary Data 7

Transcription factors in Pneumocystis and other fungi. (XLSX 13 kb)

Supplementary Data 8

Pneumocystis transporters. (XLSX 19 kb)

Supplementary Data 9

Pneumocystis genes involved in amino acid metabolism. (XLSX 30 kb)

Supplementary Data 10

Pneumocystis genes involved in nucleotide metabolism. (XLSX 15 kb)

Supplementary Data 11

Pneumocystis genes involved in carbohydrate metabolism. (XLSX 32 kb)

Supplementary Data 12

Pneumocystis genes involved in lipid metabolism. (XLSX 33 kb)

Supplementary Data 13

Pneumocystis genes involved in cofactor metabolism. (XLSX 40 kb)

Supplementary Data 14

Pneumocystis genes involved in clathrin-dependent endocytosis. (XLSX 19 kb)

Supplementary Data 15

Pneumocystis genes involved in proteasome. (XLSX 21 kb)

Supplementary Data 16

Pneumocystis genes involved in oxidative phophorylation. (XLSX 19 kb)

Supplementary Data 17

Fungal genes involved in chitin metabolism. (XLSX 12 kb)

Supplementary Data 18

Pneumocystis genes involved in glucan metabolism. (XLSX 13 kb)

Supplementary Data 19

Pneumocystis genes involved in glycan metabolism. (XLSX 21 kb)

Supplementary Data 20

GSEA of categories enriched in genes highly expressed during Pneumocystis infection. (XLSX 13 kb)

Supplementary Data 21

Glycopeptide mapping of P. carinii Msg protein digests. (XLSX 13 kb)

Supplementary Data 22

Pneumocystis genes involved in spliceosome. (XLSX 33 kb)

Supplementary Data 23

Pneumocystis genes with alternative splicing. (XLSX 21 kb)

Supplementary Data 24

Pneumocystis genes involved in mating and meiosis. (XLSX 23 kb)

Supplementary Data 25

Pneumocystis genes involved in cell signaling and stress responses. (XLSX 31 kb)

Supplementary Data 26

Pneumocystis genes involved in anaerobic growth. (XLSX 12 kb)

Supplementary Data 27

Oligonucleotide primer and probe sequences used in PCR and DNA hybridization. (XLSX 15 kb)

Rights and permissions

This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/

Reprints and permissions

About this article

Cite this article

Ma, L., Chen, Z., Huang, D. et al. Genome analysis of three Pneumocystis species reveals adaptation mechanisms to life exclusively in mammalian hosts. Nat Commun 7, 10740 (2016). https://doi.org/10.1038/ncomms10740

Download citation

Received: 25 August 2015
Accepted: 13 January 2016
Published: 22 February 2016
DOI: https://doi.org/10.1038/ncomms10740

This article is cited by

Architecture of the dynamic fungal cell wall
- Neil A. R. Gow
- Megan D. Lenardon
Nature Reviews Microbiology (2023)
Pneumocystis jirovecii colonization and its association with pulmonary diseases: a multicenter study based on a modified loop-mediated isothermal amplification assay
- Ting Xue
- Zhuang Ma
- Chunli An
BMC Pulmonary Medicine (2020)
Genome analysis provides insight about pathogenesis of Indian strains of Rhizoctonia solani in rice
- Srayan Ghosh
- Neelofar Mirza
- Gopaljee Jha
Functional & Integrative Genomics (2019)
Outbreak-Causing Fungi: Pneumocystis jirovecii
- Sarah Dellière
- Maud Gits-Muselli
- Alexandre Alanio
Mycopathologia (2019)

Comments

By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.