Recent developments in Next Generation Sequencing (NGS) technologies have allowed culture-independent and deep molecular analysis of the microbial diversity in faecal samples, and have provided new insights into the bacterial composition of the distal gut microbiota. Studies of the microbiome in different patient groups using metagenomics or 16S rRNA gene sequencing are increasing our knowledge of how the microbiota influences health and disease. The majority of recent advances in our understanding of human microbiota structure and dynamic changes in disease were made through phylogenetic interrogation of small subunit (SSU) rRNA (Paliy and Agans, 2012). However, until recently such studies have generally failed to include data on common eukaryotic, endobiotic organisms such as single-celled parasites and yeasts (‘micro-eukaryotes’). This deficiency may strongly bias the interpretation of results, and ignoring an entire kingdom of organisms is a major limitation of human microbiome studies.

It has been known for many decades that yeasts constitute part of the intestinal flora. Recently, overgrowth of Candida spp. was found to be negatively associated with Clostridium difficile infection (Manian and Bryant, 2012). Iliev et al. (2012) reported on the eukaryotic fungal community in the gut (the ‘mycobiome’) that coexists with bacteria, and emphasized the substantial expansion of the repertoire of organisms interacting with the intestinal immune system to influence health and disease. Recent data from our lab show that up to 50% of individuals in some Danish cohorts may be colonized by other micro-eukaryotes, such as the protists Blastocystis and Dientamoeba (unpublished observations). By contrast, these parasites are practically absent in patients with active inflammatory bowel disease, including ulcerative colitis and Crohn’s disease (unpublished observations). Although efforts continue to identify relevant genetic markers for phylogenetic interrogation studies, the very role and impact of these micro-eukaryotes on the host and surrounding microbiota remain an enigma.

Although the genome of Dientamoeba fragilis is still not available, studies of the Blastocystis genome have led to the identification of genes encoding a variety of effector proteins, including proteases, hydrolases, polyketide synthases and protease inhibitors (Poirier et al., 2012). However, in the absence of data from transcriptome analysis, little is known about the potential virulence of Blastocystis. To our knowledge, phagocytosis of bacteria has not been described for this parasite, so it is unlikely that Blastocystis preys on the intestinal microbiota. Both Blastocystis and Dientamoeba are parasites of the colon, and are comparable in size to protists such as Entamoeba (5–20 μm). Future research should include the development of models that can calculate the relative contributions of these parasites to the intestinal biomass, and investigations into the enzymes and metabolites that they release, including their impact on the bacterial environment and vice versa.

Meanwhile, stable intestinal micro-eukaryotic communities are widespread in both healthy individuals and many groups of patients with functional and infectious bowel disease (Scanlan and Marchesi, 2008; Engsbro and Stensvold, 2012). Many questions remain to be addressed. For example, what factors are responsible for the fact that many individuals are colonized by Blastocystis and/or Dientamoeba, while others are not? Do these parasites require certain members of the intestinal microbiota to establish? If these parasites are linked, for instance, to enterotypes (Arumugam et al., 2011) or certain bacterial phyla, they could potentially serve as cost-effective surrogate markers for these, provided that the enterotypes or bacterial phyla are themselves linked to particular disease or health phenotypes. Or do differences in parasite colonization boil down to differences in exposure or genetic susceptibility? Does immunity have a role? Parasitic helminths, which often lodge in the body for years if left untreated, are known to produce or stimulate the production of immunomodulatory molecules, and the concept of an inverse relationship between parasitic infections and allergic and autoimmune disease prevails (Weinstock, 2012). However, the interaction between parasitic micro-eukaryotes and their hosts, including any immunomodulatory effects, remains to be identified. Finally, the effect of diet on the microbiota, including eukaryotes, has only recently begun to be investigated on a large scale (De Filippo et al., 2010; Wu et al., 2011; Claesson et al., 2012).

To date, only two different methods have been applied to determine microbiome composition: shotgun metagenomics and sequencing of PCR amplicons. While shotgun metagenomics data stemming from faecal DNA do include sequence data from micro-eukaryotes, such data surprisingly remain unpublished. This could be due to incomplete eukaryotic sequence reference databases, although a lack of focus is the more likely explanation. The authors are currently extracting and analysing Blastocystis sequences from metagenomic data (publication in preparation).

So far, the ‘human intestinal eukaryotome’ has been studied to a very limited extent. Scanlan and Marchesi (2008) used broad-specificity primers to amplify the SSU rRNA gene (18S) and the internal transcribed spacer regions of eukaryotes present in faecal samples from 17 individuals, and subsequently used cloning and denaturing gradient gel electrophoresis to analyse their products. They also compared their findings with culture results. Similarly, Pandey et al. (2012) used PCR with general eukaryotic primers followed by cloning to analyse the eukaryotic flora in infants and their respective mothers. Both studies conveyed information on the diversity and stability of eukaryotic species in faecal samples, but to date, no studies have analysed the clinical significance of entire eukaryotic communities by phylogenetic and metagenomic interrogation, nor have we seen the emergence of data on the potential interaction between pro- and eukaryotic communities.

Furthermore, some methodological limitations can be identified. An in silico analysis was performed using the Primer-BLAST web tool of the National Center for Biotechnology Information (NCBI). The universal eukaryotic 18S primers Euk1A and Euk516R (Scanlan and Marchesi, 2008) were BLASTed against eukaryotic DNA sequences available in NCBI’s nr database (previously known as non-redundant: all GenBank + RefSeq Nucleotides + EMBL + DDBJ + PDB sequences, excluding HTGS0,1,2, EST, GSS, STS, PAT and WGS). All Primer-BLAST searches excluded predicted RefSeq transcripts and uncultured/environmental samples. The analysis used a cut-off primer stringency of two mismatches within the last five nucleotides at the 3′ end and a total of five nucleotide mismatches along the entire primer (standard settings). It predicted potential amplification of 3689 different eukaryotic species, with PCR products ranging from 27 to 4828 nt (mean 581 nt, s.d. 223.8 nt).
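
The stringency cut-off described above can be illustrated with a minimal sketch. It is not Primer-BLAST itself: it assumes that the two thresholds are read as tolerated maxima (at most five mismatches over the whole primer and at most two within the last five 3′ nucleotides), that primer and target site are gap-free and of equal length, and that no degenerate bases are present. The function names and the toy sequences are hypothetical; in particular, the example primer is not Euk1A or Euk516R.

```python
def count_mismatches(primer: str, site: str) -> int:
    """Count positions where a primer and an equally long, gap-free site differ."""
    if len(primer) != len(site):
        raise ValueError("primer and site must have the same length")
    return sum(1 for p, s in zip(primer.upper(), site.upper()) if p != s)


def passes_stringency(primer: str, site: str, max_total: int = 5,
                      max_three_prime: int = 2, window: int = 5) -> bool:
    """Return True if the site stays within both mismatch limits.

    Both sequences are written 5'->3' on the primer strand, so the 3' end
    corresponds to the last `window` characters.
    Assumption: the thresholds are tolerated maxima (see lead-in).
    """
    total = count_mismatches(primer, site)
    three_prime = count_mismatches(primer[-window:], site[-window:])
    return total <= max_total and three_prime <= max_three_prime


# Hypothetical 18-nt primer for illustration only (NOT Euk1A or Euk516R).
toy_primer = "ACGTACGTACGTACGTAC"
candidate_sites = {
    "perfect match": "ACGTACGTACGTACGTAC",
    "five mismatches at the 5' end": "TGCATCGTACGTACGTAC",
    "three mismatches at the 3' end": "ACGTACGTACGTACGAGG",
}
for label, site in candidate_sites.items():
    print(label, passes_stringency(toy_primer, site))
```

In this sketch, mismatches near the 3′ end are penalised more heavily than those at the 5′ end, which is why a site with only three mismatches can be rejected while one with five is accepted.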

Pandey et al. (2012) accumulated a total of 158 sequences from 18S PCR, vector insertion and subsequent sequencing, for eukaryotic species determination in two female individuals. Scanlan and Marchesi (2008) accumulated 36 sequences from each of 11 different samples. In NGS terminology, these numbers correspond to the read depth. The read depth required for accurate determination of bacterial taxa in human samples has been proposed to be much higher: 1.3 million reads, with an additional 30 000 reads needed to identify any additional taxa, using Illumina short reads, and 40 000–200 000 reads using 454 sequencing according to Lazarevic et al. (2009). Importantly, our in silico analysis suggests that the Euk1A and Euk516R primers amplify only one species of Candida (C. parapsilosis sensu lato), and not the more common species of Candida that can be found in human stool (including C. albicans, C. glabrata, C. tropicalis and C. krusei) (publication in preparation). Likewise, these primers are not predicted to amplify Geotrichum, which is often found in human faecal samples, some sporozoa (including Cystoisospora and Cyclospora) or many intestinal helminths (including nematodes, trematodes and cestodes), which suggests that they are of limited use in studies aimed at global detection of eukaryotic ribosomal genes in human stool samples. Primers with a broader eukaryotic coverage, in combination with deep sequencing of nonhuman eukaryotic ribosomal genes in stool samples, should therefore be adopted to investigate the structure and function of the fungi and parasites that appear to be common in many individuals. Such analyses should moreover be combined with similar 16S microbiota profiling in order to enable analysis of associations between all three kingdoms.
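
To give a rough sense of why read depth matters, the following sketch estimates how likely a taxon is to be seen at all at a given depth. It rests on a deliberately simple assumption that is not taken from the studies cited above: each sequence is treated as an independent random draw from the template pool, with no primer or cloning bias, so a taxon at relative abundance p is detected in n sequences with probability 1 − (1 − p)^n. The function names and the 1% abundance are illustrative choices.

```python
import math


def detection_probability(relative_abundance: float, n_sequences: int) -> float:
    """Probability of sampling a taxon at least once in n independent draws."""
    if not 0.0 < relative_abundance < 1.0:
        raise ValueError("relative_abundance must lie strictly between 0 and 1")
    if n_sequences < 0:
        raise ValueError("n_sequences must be non-negative")
    return 1.0 - (1.0 - relative_abundance) ** n_sequences


def sequences_needed(relative_abundance: float,
                     target_probability: float = 0.95) -> int:
    """Smallest number of draws giving at least the target detection probability."""
    if not 0.0 < relative_abundance < 1.0:
        raise ValueError("relative_abundance must lie strictly between 0 and 1")
    if not 0.0 < target_probability < 1.0:
        raise ValueError("target_probability must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - target_probability)
                     / math.log(1.0 - relative_abundance))


# Illustrative depths: 36 and 158 are the clone numbers quoted above,
# 40 000 is the lower bound quoted for 454 sequencing.
for depth in (36, 158, 40_000):
    print(depth, round(detection_probability(0.01, depth), 3))
```

Under these assumptions, a taxon contributing 1% of the templates would be recovered in only about 30% of 36-sequence libraries, and roughly 300 sequences would be needed to detect it with 95% probability. Real amplicon data deviate from this idealised model, so the figures indicate orders of magnitude rather than precise requirements.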

Hence, the focus on micro-eukaryotes should be expanded. Although efforts continue to explore their potential roles in health and disease by phylogenetic interrogation and investigations into gene expression, it should also be investigated to what extent these micro-eukaryotes are markers of a given intestinal microbiota composition, and how they interact with bacteria in the gut. Targeted PCR with relevant primers, combined with massively parallel (18S) and metagenomic sequencing techniques, should make it possible to elucidate the structure, function and clinical significance of common micro-eukaryotes and to untangle many of the conundrums surrounding these organisms.