Parkinson’s disease (PD) represents the second most common human neurodegenerative disorder. So far, no treatment to stop the neurodegenerative process is available1. PD patients not only suffer from motor-associated and rare cognitive symptoms, but also from gastrointestinal (GI) symptoms2. GI-associated symptoms include constipation, prolonged intestinal transit time or defecation-associated dysfunctions and can precede the classical motor symptoms by several years2,3,4,5,6. Additionally, there is rising evidence that even the enteric nervous system (ENS) becomes compromised before the central nervous system7,8,9. These findings, alongside the growing numbers of published reports in this field, support the hypothesis that PD might begin, at least in a subgroup of patients, in the GI tract and propagate to the central nervous system10,11. The ENS is connected to the central nervous system via the sympathetic nervous system and the vagus nerve which form the so called gut–brain axis12. Experimental data suggest that this neuronal chain between ENS and the brain allows pathological peptides to propagate in a prion-like fashion between the gut and the brain, thereby modulating the course of this neurological disease13,14,15.

Gut-associated PD symptoms, such as constipation, are discussed to be dependent on changes of the intestinal microbiota composition and their metabolic activity16. Recent publications suggest that especially bacterial dysbiosis might play an important role in PD pathogenesis17,18,19,20. Other studies even indicate that the pathological process of PD alongside the gut–brain axis might be modulated or initiated by the gut microbiota21,22. Increased inflammation in PD, which is indicated by increased levels of faecal markers for inflammation and gut permeability, is suggested to be linked to dysbiosis in the gut environment23,24. Additionally, elevated levels of pathogenic or opportunistic pathogenic bacteria were found to be independent of medication25. The changes in relative abundance of some bacterial genera regarded as health promoting was found to be at least partially due to the medication25. To better understand the influence of the gut microbiota on the pathobiology of PD, and the influence of the disease on the microbiota itself, a deeper understanding of compositional changes in the gut microbiota is needed. So far, most PD microbiota studies focused on intestinal bacteria, and there is indeed growing evidence that bacterial dysbiosis may play an important role in PD pathogenesis17,18,19. However, the cause–effect relationship between PD pathogenesis and the structure and function of the gut microbiota is still obscure.

Besides bacteria, the gut microbiota also comprises archaea, viruses and various eukaryotic taxa including fungi. For bacteriophages, associations with shifts in the phagobiota in PD patients were reported, suggesting a closer look at the non-bacterial microbiota associated with this disease26. With estimated 0.01–0.1% of genes in stool samples, the gut mycobiome represents only a small fraction when compared to the faecal bacterial community27. Nevertheless, Candida species were reported to be present in about 70% of healthy adults28. The majority of intestinal eukaryotes still await cultivation and are difficult to identify29. However, the gut mycobiome was also reported to influence the gut–brain axis by secretion of neurotransmitters, modulation of cytokine production and production of short-chain fatty acids30. Additionally, reports showed a connection between gut eukaryotes and anorexia nervosa, a central nervous disease31, as well as autism spectrum disorders32.

In this study, we used next-generation sequencing to study differences in the faecal eukaryotic microbiota composition between a previously well studied group of PD patients and matched controls. To create a starting point and as only approximately 0.01% of human gut metagenomic sequences generated can be mapped to fungal genomes, a targeted, amplicon-based approach was chosen27. We analysed the same set of samples previously used by Unger et al. (2016) and Weis et al. (2019). Here, we used Illumina MiSeq-based next-generation sequencing to sequence the V6 and V7 region of the eukaryotic 18S rRNA gene20,33. Monitoring differences in the composition of faecal eukaryotes between PD patients and suitable healthy controls is a first step to elucidate whether gut eukaryotes mighty play any role in PD. Our study might help identifying organisms showing an association with PD and thereby increase our knowledge on the etiopathogenesis of PD.


Sequencing and bioinformatics

In all, 18 (53%) of the PD patient samples (6 females and 12 males) and 18 (72%) of the control group samples (10 females and 8 males) yielded sufficient 18S rRNA gene amplicons for downstream applications. Additional information about those samples is provided in Supplementary Table 1. The sequencing dataset comprised 493,979 partial 18S rRNA gene sequences after quality control, denoising, paired-end merging and chimera removal. The mean number of sequences per sample was 13,350 (min: 3895; max: 23,080). Assignment of taxonomy was done with the seven taxonomic ranks provided by SILVA, where the sixth rank equals genera for most listed taxa, the fifth lists mainly families, the fourth mainly orders, and so forth. Following rarefication to even depth of sequences per sample, 1612 Amplicon Sequence Variants (ASV) affiliated with 122 genera-equivalent ranks in the SILVA database, 95 family-equivalent ranks and 24 order equivalent ranks were assigned. ASV rarefaction curves (Fig. 1) showed that the sequencing depth was sufficient to detect the vast majority of taxa in all samples.

Fig. 1: Alpha rarefaction curves to visualize the dependence of eukaryotic richness and sequencing depth.
figure 1

Shown are amplicon sequence variation (ASV) richness rarefaction curves for the sequencing sample depth for all samples. Blue: PD samples, orange: controls. Lines represent the mean of 50 sampling steps, each, with the opaque outlines as standard errors.

Structural diversity measures

The three applied indices for alpha diversity (Fig. 2a), Observed, Shannon and Simpson, revealed a significantly lower eukaryotic diversity in PD patients compared to the controls (Ctrl) (pObserved < 0.002, pShannon < 0.001 pSimpson < 0.002).

Fig. 2: Alpha and beta-diversity plots for visualization of the eukaryotic community structure and according to differences between the control group and PD patients.
figure 2

PD samples are displayed in blue and control samples in orange. With Observed, Shannon and Simpson, the three most common used indices for alpha diversity are shown in (a). Alpha-diversity boxplots consist of the median and lower and upper quartiles while the whiskers represent the mininmal and maximal spread of individual samples. Significant differences between the groups are indicated with asterisks (***p < 0.001, **p < 0.01). For beta-diversity, PCoA plots of the weighted (b) and unweighted (c) UniFrac measures are shown. Both beta-diversity measures show significant differences between controls and PD.

For beta-diversity, non-parametric multivariate analysis of variance (ADONIS) was calculated for the different groups (Fig. 2b, c). Between the PD and the control group, unity of fraction (UniFrac) measures revealed significant differences regarding the weighted (pwUniFrac = 0.0073) and the unweighted (pUniFrac < 0.0128) UniFrac measure. For the unweighted UniFrac measure, a significant difference was found between controls and PD patients treated with entacapone (pUniFrac < 0.0483) or l-dopa (pUniFrac < 0.0051) as well as between controls and PD patients without entacapone (pUniFrac < 0.0012) or l-dopa (pUniFrac < 0.0068) treatment. For the weighted UniFrac measure, a significant difference was found between PD patients without entacapone treatment and the controls (pwUniFrac < 0.0003) as well as PD patients with entacapone treatment (pwUniFrac < 0.0129). For the medication with l-Dopa, significant differences in the beta diversity were found between controls and both PD patients without treatment (pwUniFrac < 0.0006) and those with treatment (pwUniFrac < 0.0353).

Compositional differences

Hypothesis testing revealed seven taxa on the genus-equivalent rank, assigned using the SILVA database, to be significantly different in relative abundance between PD and controls. The genera Aspergillus (pFDR < 0.031), Cercomonas (pFDR < 0.037) and Heteromita (pFDR < 0.022), as well as three unknown genus-equivalent features of the phylum division Charophyta (pFDR < 0.019), the order Chromulinales (pFDR < 0.013) and the clade Opisthokonta (pFDR < 0.013), were found to be significantly lower in relative abundance in the group of PD patients when compared to the controls. In contrast, ASVs affiliated with the genus Geotrichum (pFDR < 0.017) were significantly higher in relative abundance in the group of PD patients. As displayed in Table 1, the taxonomy assignment of the representative sequence with nucleotide-based basic local alignment search tool (BLASTN) for the nucleotide database provided by the National Center for Biotechnology Information (NCBI) revealed different species for those features. The genera Geotrichum, Aspergillus, Cercomonas and Heteromita were identified as being closely related (sequence identity >99%) to Geotrichum candidum, Penicillium roqueforti, Paracercomonas sp. and Cercozoa sp. B134, respectively. The highest degree of sequence identity (97–99%) for the unknown Charophyta taxon was found with Linum usitatissimum and with Poterioochromonas malhamensis in case of the unknown Chromulinales taxon. For Opisthokonta no sequences with identity >90% were found.

Table 1 Taxa differing significantly between PD samples and controls.

Using analysis of composition of microbiomes (ANCOM-II34) with the non-condensed ASV table, all 12 ASVs assigned to Geotrichum were found to be significantly different between controls and PD patients (W > 0.8, pFDR < 0.05). All of those ASV were then identified as Geotrichum candidum (sequence identity >98.7%) using BLASTN with the NCBI database. However, no other ASV was found to be significantly differing between PD patients and controls according to ANCOM-II, even with a W-cut-off at 0.6.

Samples from PD patients with elevated levels of faecal calprotectin (n = 6) showed no significant differences in community composition when compared to the controls or PD patients with normal calprotectin levels (n = 12). PD patients treated with l-dopa (n = 12) showed significantly higher relative abundances of Geotrichum (pFDR = 0.0138). However, there were no significant differences between PD patients who were not treated with l-dopa (n = 6) and the control group or the PD patients treated with l-dopa. PD patients treated with Entacapone in addition to l-dopa (n = 7) also showed significant higher shares of Geotrichum when compared to the controls (pFDR = 0.0069). Finally, no significant differences were found between PD patients without Entacapone treatment (n = 11) and the control group, or between PD patients without Entacapone treatment and PD patients treated with Entacapone.

Correlation analysis using Spearman rho and Pearson rho did not reveal any significant correlation between the count data of the Geotrichum-affiliated sequences and the count data of any bacterial genera from the same samples, investigated in our previous study20.


Several publications addressed the composition of the intestinal bacterial microbiota in PD patients by means of qPCR33 or sequencing of different regions of 16S rRNA gene amplicons, using pyrosequencing21,35, Ion Torrent20 and Illumina MiSeq technology20,36,37. However, only little is known about associations and potential functional implications of the eukaryotic gut microbiota with PD. Recently, sequencing of fungal-specific internal transcribed spacer (ITS)-2 amplicons, derived from faecal samples of a cohort of PD patients, did not reveal any genetic differences in relative abundance or diversity between PD and controls38. The authors also reported a very low fungal load for their samples and significantly lower amounts of fungal DNA in PD patients38. This finding might partly explain why 28% (controls) and 47% (PD) of our samples did not yield sufficient amplicons for downstream analyses. However, we found significant differences in relative abundance for several eukaryotic taxa, including fungi, between the remaining PD and control samples.

Our data indicate a significantly decreased richness of eukaryotes according to the observed richness, Simpson metric and Shannon metric (Fig. 2a). This is in contrast to Cirstea et al.38 who did not report changes in richness, which might be due to fundamental differences in the sample groups, such as dietary habits, geographic origin or medical history and status. However, our finding matches previously published decreases in bacterial richness20,36,39. The beta diversity measures also showed significant differences between PD and control samples for the unweighted and weighted UniFrac metric (Fig. 2b, c), suggesting substantial differences in community composition between the eukaryotic microbiota of PD patients and the healthy control group. Therefore, our expectation was to find significant differences in the relative abundance of several taxa.

For the bacterial community of this study cohort, we previously reported a significantly lower observed alpha diversity under PD, too20. However, in contrast to the eukaryotic data presented here, indicators assuming uniformity of the distribution (Shannon and Simpson) and all beta-diversity parameters did not show significant differences between controls and PD patients20, suggesting that eukaryotic and prokaryotic diversity might behave differently under PD. However, these differences might also be due to the smaller sample size analysed here or the use of different sequencing methods.

Even though we found seven genus-equivalent taxa being significantly different in their relative abundance between control and PD samples, only Geotrichum was found to be significantly different when using ANCOM-II, i.e., when using a compositional approach34. It might be speculated that the observed depletion of the other, minor abundant groups is just due to the overgrowth of Geotrichum in the PD samples rather than actual depletion. Further studies might reveal whether there are significant differences in other groups, such as protozoa too. Our data at least suggest that besides moulds like Geotrichum also other eukaryotic genera might differ between PD patients and controls. Although bacterivorous gliding zooflaellates such as Cercomonas (Paracercomonas), Heteromita (Cercozoa sp. B134) and Poterioochromonas are not regarded as typical representatives of the human gut microbiota, grazing of protists is often selective and migth influence the virulence, metabolism and morphology of intestinal bacteria40,41,42,43. Penicillium and Aspergillus are commonly found in the healthy human gut44. Moulds are known to produce various mycotoxins and to influence the mucosal cytokine response, which affect GI homoeostasis and the composition of the bacterial gut microbiota45,46. Finally, some differences in relative abundance of eukaryotic taxa such as Linum usitatissimum or Penicillium roqueforti might just result from differential nutrition.

In contrast to the other genera, the family Dipodascaceae and its affiliated genus Geotrichum showed a markedly higher relative abundance in the PD samples compared to the controls. Dipodascaceae are commonly found in human stool samples and are potentially able to colonize the gut44. Few species are regarded as a health issue and capable of causing geotrichosis, including Geotrichum candidum, which produces several toxins47. However, this species is also used in dairy production48, where it is also regarded as spoilage organism48 causing opportunistic infections in immunocompromised patients49. An increased abundance of lactococcal phages, which are also present in dairy products, was suggested to cause depletion of Lactococcus in PD patients26. Together with our data, such observations suggest a special influence of dietary habits, in particular dairy products, on the intestinal microbiota of PD patients. It might be also be speculated that PD patients offer more suitable intestinal conditions for this species than healthy patients do. Due to its marked and significant differences in relative abundance between controls and PD patients, Geotrichum was used for a correlation analysis with our previously published bacterial 16S rRNA gene data20. However, after false discovery rate correction no significant correlation with any bacterial genus was found, suggesting no direct fungus–bacterium associations here. However, also technological reasons (e.g. the differing sequencing technologies) or the time gap between the studies might at least partially explain these missing correlations.

Clearly, the respective subsample sizes in our study were small, also due to the discarded samples, so that all hypotheses need to be verified. Additionally, some findings might also be caused by different food preferences of the investigated patients, although all of them reported an omnivorous European diet. Clearly, further analyses are needed to verify whether the observed differences in community composition might be of biological/medical relevance in the case of PD. To do so, more complex analyses using multiple variable regions of the 18S rRNA gene sequence and ITS region sequences or even a metagenomics approach will be needed50. Since only 0.01% of metagenomics sequences generated from human gut samples can be aligned to fungal genomes, such metagenomics approaches require sufficient sequencing depth27.

We were able to show that PD faecal samples contained a eukaryotic microbial community of lower diversity and that several eukaryotic taxa differed in relative abundance between PD and control faecal samples, suggesting an association of PD and the eukaryotic microbiota. Within the PD group, we particularly observed a very high relative abundance of Geotrichum, a fungal genus commonly found in the human gut and usually not regarded as an intestinal pathogen. In addition, difference in a few minor abundant groups were detected. Future studies will have to show, whether these findings are reproducible, of functional relevance or diagnostic value. Despite sampling issues (sample age, small sample size), environmental factors, such as geographic and cultural background of the studied cohort and the used sequencing techniques might explain differences to previous studies. Clearly, studies with larger patient and control cohorts are needed for deeper insights into the potential links between gut eukaryotes and PD, ideally combing metagenomic, metatranscriptomic and/or metabolomic approaches.


Cohort, sample collection and DNA isolation

All 34 PD patients (10 females, 24 males), whose faecal samples were used in this work, were diagnosed according to the UK PD Society Brain Bank Clinical Diagnostic Criteria51. The 25 control persons (14 females, 11 males) were age matched and did not report any pre-existing medical conditions or medication. All subjects followed an omnivorous diet without special dietary habits, and did not report on acute or chronic gastrointestinal disorders or intake of antibiotics, pro- or prebiotics up to 3 months prior to sampling. For more sample details, sample collection and DNA isolation, the reader is referred to our previous publication on the bacterial community composition of the same study cohort20. In short, subjects were provided with sterile faecal collector sets (MED AUXIL, Süsse Labortechnik, Gudensberg, Germany) for home sampling. Samples were stored and transported at least at −20 °C and DNA was isolated using the FastDNA SPIN kit for faeces (MP Biomedicals, Heidelberg, Germany). The study was approved by the ethics committee of the medical association of the German federal state of Saarland and is recorded therein with the identification number 111/12. All enrolled subjects provided written informed consent for their participation.

Library preparation and sequencing

Sequencing library preparation of the V6 and V7 region of eukaryotic 18S rRNA genes were performed using the 18S-specific primers 1152F (5′- TGAAACTTRAAGRAATTGACGGA-3′) and 1428R (5′-GGRCATMACDGACCTGYTAT-3′)27 with additional adapter sequences for the Illumina Nextera indexing to produce amplicons of approximately 250–380 bp length. Replicate PCR amplification per sample, amplicon verification and purification were performed twice using the KAPA HiFi Polymerase-Kit (Roche, Mannheim, Germany) as described elsewhere52. Fifty microliters PCRs were set up with a 5 µl DNA template, 30 amplification cycles (initial denaturation: 3 min, 95 °C; 30 cycles: 30 s, 95 °C, 30 s, 55 °C, 45 s, 72 °C; final elongation: 5 min, 72 °C), Saccharomyces cerevisiae DNA as positive controls, and water-template controls. Verification of amplification by agarose gel electrophoresis using Midori Green as DNA-dye (Biozym, Olderndorf, Germany) showed that 18 control samples and 18 PD samples produced sufficient amplicons for downstream analyses. Metadata for these remaining samples is provided in Supplementary Table 1. Successful replicate PCR reactions of the same samples were pooled and purified using Agencourt AMPure beads (Beckman Coulter, Krefeld, Germany).

Unique index barcodes with sequencing adaptors were added to the amplicon targets as described previously52 using the unique indexing primer combinations of the Nextera XT index kit v2 set D (Illumina), 5 µl of purified template, and the PCR profile detailed above, albeit with 8 cycles and a total volume of 25 µl. After another purification step, quality checks, checks for library size and DNA concentration using Agilent DNA 1000 chips in the Agilent bionalyzer (Agilent Technologies, Waldbronn, Germany) and the Qubit dsDNA HS Assay Kit (Thermo Fisher Scientific, Schwerte, Germany) were carried out as described before52; libraries were normalized to 4 nM and pooled for sequencing. By following the manufacturer’s instructions, the pooled libraries were sequenced in a final concentration of 6 pM on an Illumina MiSeq platform (Illumina). For the targeted amplicon sequencing procedure, a 600 cycle paired-end format (2 × 300 bp + 2 × 8 bp index cycles) was chosen using the MiSeq Reagent Kit v3 with 20% phiX spike in in the pooled library.

Bioinformatics and statistics

Sequence data were processed using QIIME 2 version 2019.7 (ref. 53). Quality cut-offs were performed with the median at Q ≥ 25. Minimum and maximum sequence lengths of 200 and 1000 bp were used (QIIME2 default). Denoising, merging paired-ends and removal of chimeras were performed using the DADA2 pipeline of QIIME2. Amplicon sequence variations (ASV) were chosen within 99% sequence identity. A primer-fitted taxonomy classifier, trained with the SILVA database release 132 using the classify-sklearn algorithm in QIIME2, was used to assign taxonomy and align sequences54.

Using a QIIME2-generated ASV table, normalized by rarefaction to even depth, alpha-diversity indices for Observed, Shannon and Simpson metrics, as well as beta-diversity indices for weighted and unweighted UniFrac were calculated with the according QIIME2 plugins. PCoA plots were also produced with QIIME2 using the previously calculated distance matrices. Hypothesis testing of the alpha- and beta-diversity measures were also performed in QIIME2 using Kruskal–Wallis statistics for alpha diversity and PERMANOVA with 10,000-fold permutations for beta-diversity measures.

Further statistical analyses were calculated within R (version 4.0.2). The phyloseq package (version 1.32.0) was used as interface between QIIME2 and R to prepare the previously rarefied table provided by QIIME2 by condensing ASVs onto the genus-equivalent rank and calculating alpha diversity boxplots with the QIIME2 provided rarefied table55. To find significant differences in feature counts between PD and controls, a two-sided Wilcoxon–Mann–Whitney test for unpaired and non-normally distributed samples in a 10,000-fold Monte-Carlo simulation within the R package coin (version 1.3.1) was used56. For compositional data analysis, the statistical framework “Analysis of Composition of Microbiomes” (ANCOM57) in the zero inflation avoiding version ANCOM-II34 was used in an ASV-wise manner with the non-condensed ASV table to compare compositional differences in the the Control and PD groups. Zero- and out- cut-off were kept at the default of 0.90 and 0.05, respectively. The FDR adjusted alpha-value was set as 0.05 and for the W-statistic cut-off the arbitrary numbers 0.7 and 0.8 were used. All p values from hypothesis testing were corrected for the false discovery rate (FDR) in their corresponding group58.

Correlation analysis for Spearman correlation index and Pearson correlation of the merged count data of the Geotrichum genus with all bacterial genera published before20 of the same samples successfully sequenced in this work. Correlations were calculated using the cor.test function of the stats package in R59. Additionally, correlations between Geotrichum and bacterial genera were calculated for the PD group only.

For all ASVs or ASV clusters differing significantly in relative abundance in the performed hypothesis testing, the representative sequence was used for further comparison with the National Center for Biotechnology Information (NCBI) nucleotide database. Here, the nucleotide-based basic local alignment search tool (BLAST) with the megablast option without further exclusions or limitations against the standard databases provided by NCBI was used.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.