Perspective

The minimum information about a genome sequence (MIGS) specification

Abstract

With the quantity of genomic data increasing at an exponential rate, it is imperative that these data be captured electronically, in a standard format. Standardization activities must proceed under the auspices of open-access and international working bodies. To tackle the issues surrounding the development of better descriptions of genomic investigations, we have formed the Genomic Standards Consortium (GSC). Here, we introduce the minimum information about a genome sequence (MIGS) specification with the intent of promoting participation in its development and discussing the resources that will be required to develop improved mechanisms of metadata capture and exchange. As part of its wider goals, the GSC also supports improving the 'transparency' of the information contained in existing genomic databases.

Acknowledgements

We thank the UK National Institute of Environmental eScience (NIEeS) and the European Bioinformatics Institute (EBI) for hosting GSC workshops, and the UK Natural Environment Research Council for funding coordination (NE/D01252X/1) and infrastructure-building activities (NE/E007325/1).

Author information

Affiliations

  1. Natural Environment Research Council Centre for Ecology and Hydrology, Oxford OX1 3SR, UK.

    • Dawn Field
    • Tanya Gray
    • Paul Swift
    • Adrian Tett
    • Sarah Turner
    • Gareth Wilson
  2. Michigan State University, East Lansing, Michigan 48824, USA.

    • George Garrity
    • James Cole
  3. School of Computer Science, University of Manchester, Manchester M13 9PL, UK.

    • Norman Morrison
    • David Hancock
    • Robert Stevens
  4. NERC Environmental Bioinformatics Centre, Oxford Centre for Ecology and Hydrology, Oxford OX1 3SR, UK.

    • Norman Morrison
    • David Hancock
  5. J. Craig Venter Institute (JCVI), 9704 Medical Center Drive, Rockville, Maryland 20850, USA.

    • Jeremy Selengut
    • Samuel V Angiuoli
    • Michael Ashburner
    • Nelson Axelrod
    • Dan Haft
    • Leonid Kagan
    • Saul Kravitz
    • Kelvin Li
    • Barbara Methe
    • Karen Nelson
  6. European Molecular Biology Laboratory (EMBL) Outstation, European Bioinformatics Institute (EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

    • Peter Sterk
    • Guy Cochrane
    • Nadeem Faruque
    • Henning Hermjakob
    • Susanna-Assunta Sansone
    • Chris Taylor
    • Bob Vaughan
  7. National Center for Biotechnology Information (NCBI), National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, Maryland 20894, USA.

    • Tatiana Tatusova
    • Ilene Mizrachi
  8. Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.

    • Nicholas Thomson
    • Christiane Hertz-Fowler
    • Julian Parkhill
  9. Plymouth Marine Laboratory, Prospect Place, Plymouth PL1 3DH, UK.

    • Michael J Allen
    • Jack Gilbert
    • Ian Joint
  10. Institute for Genome Sciences and Department of Epidemiology and Preventive Medicine, University of Maryland School of Medicine, 20 Penn Street, Baltimore, Maryland 21201, USA.

    • Samuel V Angiuoli
    • Michael Ashburner
    • Owen White
  11. Department of Genetics, University of Cambridge, Cambridge CB2 3EH, UK.

  12. Department of Biology, University of York, Box 373, York YO10 5YW, UK.

    • Sandra Baldauf
  13. National Institute of Environmental eScience, Department of Earth Sciences, University of Cambridge, Downing Street, Cambridge CB2 3EQ, UK.

    • Stuart Ballard
  14. US Department of Energy (DOE) Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, California 94598, USA.

    • Jeffrey Boore
  15. Department of Applied Mathematics and Computer Science, Ghent University, Krijgslaan 281 S9, B-9000 Ghent, Belgium.

    • Peter Dawyndt
  16. Laboratory of Microbiology, Ghent University, K.L. Ledeganckstraat 35, B-9000 Ghent, Belgium.

    • Paul De Vos
  17. BCCM/LMG Bacteria Collection, Ghent University, K.L. Ledeganckstraat 35, B-9000 Ghent, Belgium.

    • Paul De Vos
  18. Penn State University, 208 Mueller Laboratory, University Park, Pennsylvania 16802, USA.

    • Claude dePamphilis
  19. Department of Computer Science, 5500 Campanile Drive, San Diego State University, San Diego, California 92182, USA.

    • Robert Edwards
  20. Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, Illinois 60439, USA.

    • Robert Edwards
    • Natalia Maltsev
  21. SymBio Corporation, 1455 Adams Drive, Menlo Park, California 94025, USA.

    • Robert Feldman
  22. California Institute for Telecommunications and Information Technology (Calit2), a University of California San Diego (UCSD)/University of California Irvine partnership, 9500 Gilman Drive, La Jolla, California, 92093, USA.

    • Paul Gilna
  23. Microbial Genomics Group, Max Planck Institute for Marine Microbiology and Jacobs University Bremen, Bremen 28359, Germany.

    • Frank Oliver Glöckner
    • Renzo Kottmann
  24. Department of Ecology and Evolutionary Biology and University of Colorado Natural History Museum, 218 UCB, University of Colorado, Boulder, Colorado 80309, USA.

    • Philip Goldstein
    • Robert Guralnick
  25. Microbial Ecology Program, DOE Joint Genome Institute, 2800 Mitchell Drive, Building 400-404, Walnut Creek, California 94598, USA.

    • Phil Hugenholtz
  26. The National Science Foundation, 4201 Wilson Boulevard, Arlington, Virginia 22230, USA.

    • Matthew Kane
    • Lita Proctor
  27. School of Computing, Napier University, Merchiston Campus, 10 Colinton Road, Edinburgh, Scotland EH10 5DT, UK.

    • Jessie Kennedy
  28. Department of Terrestrial Microbial Ecology, Netherlands Institute of Ecology, Centre for Terrestrial Ecology, PO Box 40, Heteren 6666 ZG, Netherlands.

    • George Kowalchuk
  29. BIATECH Institute, 19310 North Creek Parkway South, Suite 115, Bothell, Washington 98011, USA.

    • Eugene Kolker
  30. Division of Biomedical and Health Informatics, Department of Medical Education and Biomedical Information, University of Washington, Seattle, Washington 98195, USA.

    • Eugene Kolker
  31. Seattle Children's Hospital Research Institute, 1900 9th Avenue, Seattle, Washington 98101, USA.

    • Eugene Kolker
  32. Genome Biology Program, DOE Joint Genome Institute, 2800 Mitchell Drive, Building 400-404, Walnut Creek, California 94598, USA.

    • Nikos Kyrpides
  33. Department of Plant Biology, University of Georgia, Athens, Georgia 30602-7271, USA.

    • Jim Leebens-Mack
  34. Department of Molecular and Cell Biology, University of California, 539 Life Sciences Addition, Berkeley, California 94720-3200, USA.

    • Suzanna E Lewis
  35. School of Computing Science, Newcastle University, Newcastle upon Tyne NE1 7RU, UK.

    • Allyson L Lister
    • Phillip Lord
    • Anil Wipat
  36. Centre for Integrative Systems Biology of Ageing and Nutrition (CISBAN), Henry Wellcome Laboratory for Biogerontology Research, Newcastle University, Newcastle General Hospital, Newcastle upon Tyne NE4 6BE, UK.

    • Allyson L Lister
    • Anil Wipat
  37. Biological Data Management and Technology Center, Computational Research Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, California 94720, USA.

    • Victor Markowitz
  38. Department of Ecology and Evolutionary Biology, University of California, 455 Steinhaus Hall, Irvine, California 92697, USA.

    • Jennifer Martiny
  39. Molecular Infectious Diseases Group, Weatherall Institute of Molecular Medicine and University of Oxford Department of Paediatrics, John Radcliffe Hospital, Headington, Oxford OX3 9DU, UK.

    • Richard Moxon
  40. Department of Biology, Howard University, 415 College Street, NW, Washington, DC 20059, USA.

    • Karen Nelson
  41. LTER Network Office, Department of Biology, University of New Mexico, Albuquerque, New Mexico 87171, USA.

    • Inigo San Gil
  42. SIMBIOS Centre, University of Abertay Dundee, Dundee, DD1 1HG, UK.

    • Andrew Spiers
  43. Center for Information Biology and DNA Data Bank of Japan, National Institute of Genetics, Research Organization of Information and Systems, Mishima, Shizuoka 411-8540, Japan.

    • Yoshio Tateno
  44. Center for Biological Sequence Analysis, The Technical University of Denmark, Lyngby, DK-2800 Kgs. Lyngby, Denmark.

    • David Ussery
  45. Department of Molecular Biology, University of Wyoming, Laramie, Wyoming 82071, USA.

    • Naomi Ward
  46. Center for Bioinformatics and Department of Genetics, University of Pennsylvania School of Medicine, 14th Floor Blockley Hall, 423 Guardian Drive, Philadelphia, Pennsylvania 19104, USA.

    • Trish Whetzel

Authors

  1. Dawn Field
  2. George Garrity
  3. Tanya Gray
  4. Norman Morrison
  5. Jeremy Selengut
  6. Peter Sterk
  7. Tatiana Tatusova
  8. Nicholas Thomson
  9. Michael J Allen
  10. Samuel V Angiuoli
  11. Michael Ashburner
  12. Nelson Axelrod
  13. Sandra Baldauf
  14. Stuart Ballard
  15. Jeffrey Boore
  16. Guy Cochrane
  17. James Cole
  18. Peter Dawyndt
  19. Paul De Vos
  20. Claude dePamphilis
  21. Robert Edwards
  22. Nadeem Faruque
  23. Robert Feldman
  24. Jack Gilbert
  25. Paul Gilna
  26. Frank Oliver Glöckner
  27. Philip Goldstein
  28. Robert Guralnick
  29. Dan Haft
  30. David Hancock
  31. Henning Hermjakob
  32. Christiane Hertz-Fowler
  33. Phil Hugenholtz
  34. Ian Joint
  35. Leonid Kagan
  36. Matthew Kane
  37. Jessie Kennedy
  38. George Kowalchuk
  39. Renzo Kottmann
  40. Eugene Kolker
  41. Saul Kravitz
  42. Nikos Kyrpides
  43. Jim Leebens-Mack
  44. Suzanna E Lewis
  45. Kelvin Li
  46. Allyson L Lister
  47. Phillip Lord
  48. Natalia Maltsev
  49. Victor Markowitz
  50. Jennifer Martiny
  51. Barbara Methe
  52. Ilene Mizrachi
  53. Richard Moxon
  54. Karen Nelson
  55. Julian Parkhill
  56. Lita Proctor
  57. Owen White
  58. Susanna-Assunta Sansone
  59. Andrew Spiers
  60. Robert Stevens
  61. Paul Swift
  62. Chris Taylor
  63. Yoshio Tateno
  64. Adrian Tett
  65. Sarah Turner
  66. David Ussery
  67. Bob Vaughan
  68. Naomi Ward
  69. Trish Whetzel
  70. Inigo San Gil
  71. Gareth Wilson
  72. Anil Wipat

Corresponding author

Correspondence to Dawn Field.

Supplementary information

Word documents

  1. Supplementary Text and Figures

    Supplementary Table 1