Annotated high-throughput microscopy image sets for validation

Ljosa, Vebjorn; Sokolnicki, Katherine L; Carpenter, Anne E

doi:10.1038/nmeth.2083

Correspondence
Published: 28 June 2012

Nature Methods volume 9, page 637 (2012)

A Corrigendum to this article was published on 29 April 2013

This article has been updated

To the Editor:

Choosing among algorithms for analyzing biological images can be a daunting task, especially for nonexperts. Software toolboxes such as CellProfiler¹,² and ImageJ³ make it easy to try out algorithms on a researcher's own data, but it can still be difficult to assess whether an algorithm will be robust across an entire experiment based on the small subset of images that is practical to examine or annotate. Even if controls are available, a pilot high-throughput experiment may be insufficient to show that an algorithm will robustly identify rare phenotypes and handle the experimental artifacts that will invariably be present in a high-throughput experiment. It is therefore useful to know that a particular algorithm has proven superior on several similar image sets. The performance comparisons presented in papers that introduce new algorithms are often not very helpful for assessing this because each study typically relies on a different test image set (often to the advantage of the proposed algorithm), the algorithms compared may not be the ones the researcher is most interested in and the authors may not have implemented other algorithms as optimally as their own. Although biologists should always also validate algorithms on their own images, it would be useful if developers would quantitatively test new algorithms against a publicly available established collection of image sets. In this way, objective comparison can be made to other algorithms, as tested by the developers of those algorithms. We see a need for such a collection of image sets, together with ground truth and well-defined performance metrics.

Here we present the Broad Bioimage Benchmark Collection (BBBC), a publicly available collection of microscopy images intended as a resource for testing and validating automated image-analysis algorithms. The BBBC is particularly useful for high-throughput experiments and for providing biological ground truth for evaluating image-analysis algorithms. If an algorithm is sufficiently robust across samples to handle high-throughput experiments, low-throughput applications also benefit because tolerance to variability in sample preparation and imaging makes the algorithm more likely to generalize to new image sets.

Each image set in the BBBC is accompanied by a brief description of its motivating biological application and a set of ground-truth data against which algorithms can be evaluated. The ground truth sets can consist of cell or nucleus counts, foreground and background pixels, outlines of individual objects, or biological labels based on treatment conditions or orthogonal assays (such as a dose-response curve or positive- and negative-control images). We describe canonical ways to measure an algorithm's performance so that algorithms can be compared against each other fairly, and we provide an optional framework to do so conveniently within CellProfiler. For each image set, we list any published results of which we are aware.
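To make the kinds of ground truth listed above concrete, the following is a minimal sketch of how an algorithm's output might be scored against three of them: object counts, foreground/background pixels, and positive- and negative-control labels. The letter does not specify which metrics the BBBC prescribes, so the choices here (mean relative count error, pixel-level F1 score, and the Z'-factor) are assumptions chosen for illustration, and all function names are hypothetical.

```python
from statistics import mean, stdev


def mean_relative_count_error(predicted, truth):
    """Mean of |predicted - truth| / truth over images.

    Assumes every ground-truth count is greater than zero.
    """
    if len(predicted) != len(truth):
        raise ValueError("need one predicted count per ground-truth count")
    return mean(abs(p - t) / t for p, t in zip(predicted, truth))


def pixel_f1(predicted_mask, truth_mask):
    """F1 score over foreground pixels.

    Both masks are equal-shaped nested lists of 0/1 values,
    where 1 marks a foreground pixel.
    """
    tp = fp = fn = 0
    for pred_row, true_row in zip(predicted_mask, truth_mask):
        for p, t in zip(pred_row, true_row):
            if p and t:
                tp += 1  # foreground in both
            elif p:
                fp += 1  # predicted foreground, truly background
            elif t:
                fn += 1  # missed foreground pixel
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def z_prime_factor(positive, negative):
    """Z'-factor: 1 - 3 * (sd_pos + sd_neg) / |mean_pos - mean_neg|.

    Inputs are per-well (or per-image) readouts for the positive and
    negative controls; sample standard deviations are used.
    """
    separation = abs(mean(positive) - mean(negative))
    if separation == 0:
        raise ValueError("control means are identical")
    return 1 - 3 * (stdev(positive) + stdev(negative)) / separation


if __name__ == "__main__":
    # Toy values, not taken from any BBBC image set.
    print(mean_relative_count_error([90, 220], [100, 200]))
    print(pixel_f1([[1, 1], [0, 0]], [[1, 0], [1, 0]]))
    print(z_prime_factor([100, 102, 104], [0, 2, 4]))
```

Reporting the same metric on the same image set is what makes results from different developers comparable; which metric suits a given image set depends on the type of ground truth that accompanies it.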

The BBBC is freely available from http://www.broadinstitute.org/bbbc/. The collection currently contains 18 image sets, including images of cells (Homo sapiens and Drosophila melanogaster) as well as of whole organisms (Caenorhabditis elegans) assayed in high throughput. We are continuing to extend the collection during the course of our research, and we encourage the submission of additional image sets, ground truth and published results of algorithms.

Author contributions

K.L.S. and V.L. curated image sets and oversaw collection of ground-truth annotations. K.L.S. developed benchmarking pipelines. V.L. defined benchmarking protocols. A.E.C. conceived the idea and guided the work. All authors wrote the manuscript.

Change history

25 February 2013
In the version of this article initially published, funding information was not included. The work was funded by US National Institutes of Health grant R01 GM089652. The error has been corrected in the HTML and PDF versions of the article.

References

1. Carpenter, A.E. et al. Genome Biol. 7, R100 (2006).
2. Kamentsky, L. et al. Bioinformatics 27, 1179–1180 (2011).
3. Schneider, C.A., Rasband, W.S. & Eliceiri, K.W. Nat. Methods 9, 671–675 (2012).

Acknowledgements

We thank the contributors of image sets as well as V. Uhlmann and the Carpenter laboratory members who have helped annotate them with ground truth. This work was funded by US National Institutes of Health grant R01 GM089652.

Author information

Authors and Affiliations

Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, Massachusetts, USA
Vebjorn Ljosa, Katherine L Sokolnicki & Anne E Carpenter

Corresponding authors

Correspondence to Vebjorn Ljosa or Anne E Carpenter.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

About this article

Cite this article

Ljosa, V., Sokolnicki, K. & Carpenter, A. Annotated high-throughput microscopy image sets for validation. Nat Methods 9, 637 (2012). https://doi.org/10.1038/nmeth.2083

Published: 28 June 2012
Issue Date: July 2012
DOI: https://doi.org/10.1038/nmeth.2083
