Application of 'next-generation' sequencing technologies to microbial genetics

MacLean, Daniel; Jones, Jonathan D. G.; Studholme, David J.

doi:10.1038/nrmicro2088

Review Article
Published: 23 February 2009

Nature Reviews Microbiology volume 7, pages 96–97 (2009)

Key Points

New sequencing technologies, such as Solexa, 454 pyrosequencing and SOLiD, developed by Illumina, Roche and Applied Biosystems, respectively, are set to revolutionize microbiology by dramatically increasing throughput and reducing costs of DNA sequencing.
These new technologies present new technical and computational challenges, as well as new research opportunities.
Applications include de novo genome sequence assembly, metagenomics, sRNA discovery, detection of polymorphisms, expression profiling and epigenetics.
Many software packages are freely available for dealing with the large datasets generated by these applications.
As well as sequence alignment and assembly, there is a need for downstream processing of data into a form that is accessible to biologists.
Standards are emerging for analysis and archiving of data generated by the new technologies.

Abstract

New sequencing methods generate data that can allow the assembly of microbial genome sequences in days. With such revolutionary advances in technology come new challenges in methodologies and informatics. In this article, we review the capabilities of high-throughput sequencing technologies and discuss the many options for getting useful information from the data.

**Figure 1: High-throughput sequencing technologies.**

**Figure 2: Selecting a technology for an experiment.**

**Figure 3: Road map for planning software solutions for experiments with different data sources and different goals.**

Acknowledgements

We are grateful to S. Kamoun, E. Kemen, S. Foster and M. Pallen for useful discussions and suggestions on the manuscript. This work was supported by Gatsby Foundation core funding to The Sainsbury Laboratory.

Author information

Authors and Affiliations

The Sainsbury Laboratory, Norwich, NR4 7UH, United Kingdom
Daniel MacLean, Jonathan D. G. Jones & David J. Studholme

Corresponding author

Correspondence to David J. Studholme.

Glossary

De novo assembly: Construction of longer sequences, such as contigs or genomes, from shorter sequences, such as sequence reads, without prior knowledge of the order of the reads or reference to a closely related sequence.
Contig: A fragment of genome sequence derived by assembling shorter sequence reads into larger constructs on the basis of overlap between the sequence reads.
Paired-end read: A sequence read known to come from a genomic region within a limited number of nucleotides of another. The extra information puts constraints on how far apart the reads can be placed during assembly or alignment, allowing more accurate placement and construction of contigs.
Epigenetics: The study of inherited changes in gene function that cannot be explained by changes in DNA sequence.
de Bruijn graph: In mathematics, a network structure is properly called a graph. The entities that are connected are called nodes and the connections are called edges. A de Bruijn graph is a graph in which the nodes are strings of symbols (such as the nucleotides in a sequence read) and the edges represent overlaps between those strings. This is a convenient way to represent data such as overlapping sequence reads.
k-mer: A piece of nucleotide sequence of length k. A k-mer is usually used to indicate a computationally selected subsequence of an experimentally derived sequence, such as a read or a genome.
N₅₀: A measure of contig length. If all contigs generated in an assembly are placed end to end in order of length (longest first), then the N₅₀ is the length of the contig that, when added, causes the total length of the chain to exceed half of the length of the genome being sequenced. The longer the contigs in an assembly, the longer the contig that crosses this threshold, so a larger N₅₀ indicates a more contiguous assembly.
BLAST: (Basic local alignment search tool). A computer program for finding sequences in a database that are similar to a query sequence. BLAST has been available since 1990 and is the most widely used sequence search tool.
MIGS: (Minimum information about a genome sequence). A proposed metadata standard that aims to capture essential information about a sequenced organism, including the species, the source of the strain and other phylogenetic and experimental data. Such data collection facilitates the cataloguing and searching of species in large-scale databases.
Finished genome: A genome sequence that has been shotgun sequenced and subjected to post-assembly procedures, such as long PCR, to close the gaps that occur between contigs.
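
The k-mer, de Bruijn graph and N₅₀ entries above can be made concrete with a short sketch. The Python below is illustrative only and is not taken from any assembler discussed in the article; the function names (`kmers`, `de_bruijn_graph`, `n50`) and the toy reads are assumptions introduced here. The graph follows the glossary wording, with k-mers as nodes and an edge wherever two k-mers overlap consecutively in a read, and `n50` follows the glossary definition, which measures against the genome length rather than the total assembly length.

```python
from collections import defaultdict


def kmers(sequence, k):
    """Return every overlapping subsequence of length k, in order."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def de_bruijn_graph(reads, k):
    """Nodes are k-mers; an edge joins two k-mers that are adjacent in a
    read, meaning they overlap by k - 1 symbols."""
    graph = defaultdict(set)
    for read in reads:
        words = kmers(read, k)
        for left, right in zip(words, words[1:]):
            graph[left].add(right)
    return graph


def n50(contig_lengths, genome_length):
    """Walk contigs longest first and return the length of the contig that
    takes the running total past half of the genome length."""
    running_total = 0
    for length in sorted(contig_lengths, reverse=True):
        running_total += length
        if running_total > genome_length / 2:
            return length
    return None  # the contigs cover no more than half of the genome


# Toy data (hypothetical): two overlapping reads and four contig lengths.
graph = de_bruijn_graph(["ATGCA", "TGCAT"], 3)
print(sorted(graph["TGC"]))                      # → ['GCA']
print(n50([50, 30, 20, 10], genome_length=100))  # → 30
```

Because both reads contribute the same `TGC` to `GCA` edge, the shared overlap is stored once, which is why this representation suits large numbers of overlapping short reads.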

About this article

Cite this article

MacLean, D., Jones, J. & Studholme, D. Application of 'next-generation' sequencing technologies to microbial genetics. Nat Rev Microbiol 7, 96–97 (2009). https://doi.org/10.1038/nrmicro2088

Published: 23 February 2009
Issue Date: April 2009
DOI: https://doi.org/10.1038/nrmicro2088
