
  • ADVERTISEMENT FEATURE Advertiser retains sole responsibility for the content of this article

Setting standards for reproducibility in gut microbiome research

Coloured scanning electron micrograph (SEM) of Escherichia coli bacteria


The gut is home to a bustling microbial community that aids digestion, regulates metabolic function, and modulates the immune system. As such, numerous studies [1] from the past decade have linked these communities with conditions including obesity, autoimmune disease and cancer.

Research exploring the interplay between the microbiome and health has yielded intriguing insights, but also uncovered unsettling variability in the data obtained by different laboratories [2]. A comparison of the two largest human microbiome profiling projects, Metagenomics of the Human Intestinal Tract (MetaHIT) and the Human Microbiome Project (HMP), concluded that differences in the DNA extraction protocols led to significant changes in the observed ratios of Firmicutes and Bacteroidetes — two of the gut’s most abundant phyla [3]. “It is an exciting time for the field of metagenomics and the microbiome,” says Christopher Mason, associate professor at Weill Cornell Medicine in New York. “However, the limited ability to compare between different research studies greatly hinders the progress of the research.”

There seems to be a systematic problem. “Relatively minor alterations in the DNA extraction procedure or in the bioinformatics analysis can give a distorted view,” explains Raul Cano, chief scientific officer at BioCollective, a microbiome company based in Denver, Colorado. “Before we can make correlations of microorganisms with a condition, we need to know that the microbiome we have is reflective of reality.”

Enthusiasm for microbiome research has outpaced agreement upon experimental best practices. Labs have often cobbled together workflows based on existing molecular techniques and analytical methods. Many experts now believe a reckoning is at hand, and that progress in the clinical application of gut microbiome data depends on researchers weighing the strengths and weaknesses of their methods. “We have to normalize our approach to the science,” says Cano.

Figure 1: Keeping variability in check

From beginning to end, there are numerous opportunities for experimental biases to profoundly alter the results of a gut microbiome study — and recognizing these factors is an essential step towards controlling them.

The weakest links

Microbiome studies have many moving parts (Figure 1). What’s more, research groups often tinker with experimental processes depending on their specific study and specimen types. All this naturally leads to variability between labs, and may cause bias in the resulting microbial profile, says Shuiquan Tang, microbiome research scientist at life-science company Zymo Research in Irvine, California. “A sample’s handling and storage can introduce significant bias or even complete loss of information,” he says. This problem was recently highlighted in a publication from the American Gut Project [4], which grappled with unwanted ‘blooms’ of bacteria that flourished because of the way the fecal samples were collected and transported, and which could compromise the quality of the microbiome analysis. Therefore, says Tang, immediate preservation is critical. “Samples should be preserved in such a way that the profile is static from the time of collection through DNA extraction, regardless of issues such as temperature fluctuations and freeze-thawing.”

Perhaps the most pernicious source of variability occurs when the DNA is extracted from the microbes in a sample. A recent international analysis [5] found that the method of DNA extraction was the most significant variable in metagenomic measurements, with some protocols recovering as much as 100-fold more DNA than alternatives. “This is a direct consequence of the size of the microorganisms, their cellular structure, and the lysis method used,” says Ryan Kemp, director of Nucleic Acid Solutions at Zymo Research. For example, Gram-positive bacteria have much thicker cell walls than their Gram-negative counterparts. If a given extraction method fails to break Gram-positive walls, these species will be underrepresented in the resulting microbial census. Likewise, eukaryotic flora that inhabit the gut, such as yeast, are also difficult to lyse.

Next, the DNA is converted into a library as a prelude to sequencing. Virtually all library preparations require PCR amplification. Unless controlled, PCR can also introduce bias by amplifying some genomic sequences preferentially. For example, many analyses entail sequencing the gene encoding the small ribosomal subunit (16S sequencing) to distinguish prokaryotes. Choice of the primer and the region to sequence are crucial if the PCR is to capture the full microbial diversity. “There are archaea present in almost every gut,” says Tang, “but these organisms are missed by commonly used primer sets that only amplify bacterial species.” However, he adds, newer methodologies address this bias.
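The effect of primer choice can be illustrated with a simple in-silico check. The sketch below is a minimal illustration, not a real primer-design tool: the primer and binding-site sequences are invented for this example, and only the standard IUPAC degenerate-base codes are taken as given.

```python
# IUPAC degenerate-base codes: each primer letter matches a set of template bases
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

def primer_matches(primer: str, site: str) -> bool:
    """True if every primer position is compatible with the template site."""
    return len(primer) == len(site) and all(
        base in IUPAC[p] for p, base in zip(primer, site)
    )

# Hypothetical 8-mer binding sites: a degenerate primer can hit one organism's
# variant of the site yet miss another's with a single incompatible position.
primer = "GTGYCAGC"          # Y matches C or T
bacterial_site = "GTGCCAGC"  # compatible at every position
archaeal_site = "GTGACAGC"   # A at the Y position is not covered

print(primer_matches(primer, bacterial_site))  # True
print(primer_matches(primer, archaeal_site))   # False
```

A real coverage analysis would run such a check against thousands of reference 16S sequences, but the principle is the same: any lineage whose binding site falls outside the primer’s degeneracy is invisible to the assay.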

A wide variety of bioinformatics tools can be used to classify microorganisms on the basis of sequencing data. A recent comparison of 11 tools for interpreting shotgun metagenomics data found that they all arrived at different conclusions, with the number of organisms identified differing by up to three orders of magnitude [6]. “We recommended pairing existing bioinformatic tools with different classification principles,” says Mason, who was one of the study’s authors. “Combining available programs should improve the accuracy of the results by leveraging each tool’s specific strengths.”
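One simple way to combine classifiers along these lines is a consensus filter that keeps only taxa reported by several tools. The sketch below is a bare-bones illustration with invented tool outputs, not any published pipeline’s actual method.

```python
from collections import Counter

def consensus_taxa(tool_calls, min_agreement=2):
    """Keep only taxa reported by at least `min_agreement` of the tools."""
    counts = Counter(taxon for calls in tool_calls for taxon in set(calls))
    return {taxon for taxon, n in counts.items() if n >= min_agreement}

# Hypothetical species calls from three classifiers run on the same sample
tool_a = {"Escherichia coli", "Bacteroides fragilis", "Ralstonia pickettii"}
tool_b = {"Escherichia coli", "Bacteroides fragilis"}
tool_c = {"Escherichia coli", "Saccharomyces cerevisiae"}

print(sorted(consensus_taxa([tool_a, tool_b, tool_c])))
# ['Bacteroides fragilis', 'Escherichia coli']
```

Raising `min_agreement` trades sensitivity for precision: singleton calls (often false positives from one tool’s database quirks) are discarded, at the cost of dropping genuine organisms that only one tool can detect.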

Creating consistency

Researchers can apply a variety of measures to protect against sources of variability. One tool for assessing a sample preparation workflow is a mock microbial community. This is a synthetic collection of microbes, present at well-defined concentrations, containing a diverse range of species — for example, both Gram-positive and Gram-negative bacteria; prokaryotic and eukaryotic organisms; and species with genetic challenges such as atypical guanine-cytosine content or repetitive elements. Mock communities are available from commercial sources such as Zymo Research and the American Type Culture Collection or from individual laboratories.

Researchers can use a mock community that closely reflects the expected composition of their microbial population, but this is not essential. “You’re benchmarking what the analysis is doing, not necessarily what’s in the community,” explains microbiologist Lynn Schriml of the University of Maryland Institute for Genome Sciences in Baltimore. A well-designed mock community offers a powerful control for problems at most steps of the wet-lab process, and can help researchers identify process flaws. Using a combination of methods that can correctly measure well-characterized mock microbial communities will give results that draw closer to the truth — and ensure that methodology is reproducible between labs.
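A workflow can be scored against a mock community by totalling the deviation between the expected and observed relative abundances. The sketch below uses a hypothetical even four-member community and an invented observed profile; the scoring metric (a simple L1 distance) is one common choice among several, not a field standard.

```python
def total_deviation(expected, observed):
    """Sum of absolute abundance differences (0.0 means perfect recovery)."""
    taxa = set(expected) | set(observed)
    return sum(abs(expected.get(t, 0.0) - observed.get(t, 0.0)) for t in taxa)

# Hypothetical even mock community of four organisms (fractions sum to 1)
expected = {"E. coli": 0.25, "B. subtilis": 0.25,
            "S. aureus": 0.25, "S. cerevisiae": 0.25}

# A workflow that under-lyses tough-walled organisms might instead report:
observed = {"E. coli": 0.45, "B. subtilis": 0.30,
            "S. aureus": 0.20, "S. cerevisiae": 0.05}

print(total_deviation(expected, observed))  # 0.5
```

Running the same mock sample through candidate extraction kits, primer sets or classifiers and comparing their deviation scores gives a concrete basis for choosing among them.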

A growing number of research consortia, including MetaSUB — an urban microbiome study spearheaded by Mason — are prioritizing standard-setting. “We have PIs all over the world — at 70-plus locations,” says Schriml, who helps coordinate MetaSUB’s efforts in Baltimore. “We spent close to a year working out how we would standardize every step of the project.” However, researchers remain reluctant about introducing overly strict standards, which could constrain the kinds of studies that can be performed. “Standardization of best practices based on sound scientific principles and rigorous controls, as opposed to a protocol, is a guard against standardizing bad practices,” says Kemp.

From Schriml’s perspective, researchers should share metadata describing experimental procedures to improve reproducibility. The Genomic Standards Consortium, an organization she heads, has devised standards to help microbiome researchers describe their work. In her view, this could include such factors as “what you measured, which kit you used, or when you tweaked your protocol”.

Nevertheless, some researchers remain unaware of the problem or are reluctant to adapt their workflow. Cano suggests that journals should act as gatekeepers against poorly controlled or poorly described studies. “If an editor says that you can’t publish a paper if you didn’t follow the rules, things will change real quick.” But Schriml also hopes the community will soon see standards as an asset rather than a chore. “They have to realize the prize that they will get at the end by doing this,” she says, “and how amazing their analyses will be.”

For more information about Zymo Research’s portfolio of microbiomics tools, visit the company’s website.


  1. Cani, P. D. Gut 67, 1716–1725 (2018).

  2. Costea, P. I. et al. Nature Microbiol. 3, 8–16 (2018).

  3. Wesolowska-Andersen, A. et al. Microbiome 2, 19 (2014).

  4. McDonald, D. et al. mSystems 3, e00031-18 (2018).

  5. Costea, P. I. et al. Nature Biotechnol. 35, 1069–1076 (2017).

  6. McIntyre, A. B. R. et al. Genome Biol. 18, 182 (2017).
