One purpose of the biomedical literature is to report results in sufficient detail that the methods of data collection and analysis can be independently replicated and verified. Here we present reporting guidelines for gene expression localization experiments: the minimum information specification for in situ hybridization and immunohistochemistry experiments (MISFISHIE). MISFISHIE is modeled after the Minimum Information About a Microarray Experiment (MIAME) specification for microarray experiments. Both guidelines define what information should be reported without dictating a format for encoding that information. MISFISHIE describes six types of information to be provided for each experiment: experimental design, biomaterials and treatments, reporters, staining, imaging data and image characterizations. This specification has benefited the consortium within which it was developed and is expected to benefit the wider research community. We welcome feedback from the scientific community to help improve our proposal.


High-throughput analyses of gene expression in biological samples (for example, using microarrays or proteomics) often do not provide information about the cell types or spatial domains within tissues in which genes are expressed and may not reveal dynamic or transient gene expression. Therefore, such analyses are often followed by experiments to determine the location and degree of gene expression in specific cell types within the tissue by probing with reporters for the genes of interest. In addition, high-throughput analyses of fresh samples can be supplemented with a wealth of clinical information associated with tissue samples in large collections worldwide. However, studies that use in situ hybridization (ISH) and immunohistochemistry (IHC) staining and/or their resulting images are often presented without the information needed to understand the images or the methods that produced them. Furthermore, neither the reagents and methods nor the results are easily searchable through current biomedical literature databases such as PubMed. As the interpretation of ISH and IHC stains may differ between observers, between different image analysis platforms and programs, and even between different sessions using the same image analysis platform and program1, communication of the methods used is critically important for evaluating published work.

Data annotation specifications for microarray experiments1,2,3,4 have begun to benefit the biomedical research community. Many researchers participated in the debate around MIAME and contributed to its development. The accessibility of data increased significantly, aided by common exchange formats; open-source software and ontologies were developed by many groups; and discussion forums promoted interaction between manufacturers and experimenters. Similar guidelines are under development for other high-throughput technologies5,6,7,8,9,10.

In the area of microscopy images, data formats that facilitate the exchange of data have been proposed. The XML (Extensible Markup Language) data format for tissue microarrays does not include minimum-information reporting guidelines11. Also available is Open Microscopy Environment (OME), which provides a flexible XML data format for storing and transmitting metadata for microscopy image datasets (see Table 1 for URL). However, there is no comprehensive specification for facilitating the exchange of data from visual interpretation–based experiments that seek to determine the abundance and/or localization of proteins or mRNA in tissues (hereafter referred to as 'gene expression localization experiments'), such as ISH and IHC.

Table 1: Organizations, standards and resources

The goals of MISFISHIE

MISFISHIE describes the minimum information to be provided when publishing, making public or exchanging results from visual interpretation–based tissue gene expression localization experiments, such as ISH, IHC, lectin affinity histochemistry, and experiments that involve reporter gene constructs (for example, green fluorescent protein (GFP) and β-galactosidase). Compliance with this specification should enable researchers at different laboratories to fully evaluate data and reproduce experiments. Although MISFISHIE facilitates the identification of specific sources of variability, it cannot, and does not aim to, reduce this variability. However, if complete information, including raw image data, is always provided, the original interpretations may be reevaluated by other researchers. Like MIAME12, MISFISHIE prescribes that the most relevant details within each of the sets of broad categories of information be provided, relying on data producers and reviewers to ensure that each category contains the information deemed necessary to allow readers to fully assess and reproduce experiments.

MISFISHIE does not dictate a specific format for reporting information. We intend to develop a data model based on the concepts of MAGE-OM (MicroArray Gene Expression Object Model) and software based on the MAGEstk (MicroArray Gene Expression software tool kit)3. This model and the associated XML-based mark-up language will provide a data format for archiving or transferring data. Because a major revision of MAGE-OM, the FuGE-OM (Functional Genomics Experiment Object Model)13, is at present being developed to accommodate data from other functional genomics experiments, the MISFISHIE-derived object model will probably be an extension of FuGE-OM rather than a separate construct. A simpler, non-XML format following the concepts of MAGE-TAB14 may also facilitate data sharing in cases where simplicity is most important15.

MISFISHIE was designed to function together with other technology-related specifications such as MIAME and MIAPE (Minimum Information About a Proteomics Experiment)16 to support functional genomics investigations. We anticipate that MISFISHIE will be integrated with other MGED (Microarray and Gene Expression Data) Society standards17, in particular through the Reporting Structure for Biological Investigation (RSBI) working group18 and the Minimum Information for Biological and Biomedical Investigations (MIBBI) project19. Clearly, the goal of integrating different data types will be best served by a common reporting structure. Separation of the minimum information specification and the data format is important because in the data format there should be scope to provide unlimited further information beyond the minimum specification and there should be the ability to encode incomplete information for optimal flexibility. Furthermore, broad acceptance of a minimum information standard would greatly aid the design of a data model.

To facilitate data transfer between some existing expression databases, a MISFISHIE-compliant XML data format has been developed. A document type definition (DTD) was developed for three expression databases: ANISEED (Table 1), COMPARE20 and 4DXpress21. Its format follows MISFISHIE. This DTD and an associated example are available at ANISEED (http://crfb.univ-mrs.fr/aniseed/exchange_format.php) and at COMPARE (http://compare.ibdml.univ-mrs.fr/exchange_format.php).

It has long been appreciated that improved standards for IHC are needed. However, the focus has been on developing standardized technical protocols that would produce more uniform staining22 or on reducing subjectivity in interpreting histological sections23. MISFISHIE does not endorse standardized methodologies or data interpretation but rather seeks to promote complete disclosure of the methodologies used.

Guidelines for tumor-marker prognostic studies, known as REMARK24, were recently established. REMARK encompasses outcome studies based on tumor markers of any kind, not just those of IHC. MISFISHIE encompasses nearly any study employing IHC or ISH regardless of context, such as a tumor-marker study or a zebrafish embryo study. We expect that requirements pertaining to specialized subdomains (for example, clinical prognostic studies) will be added to MISFISHIE in the future.

Existing databases for gene expression localization data provide a useful framework from which to build a specification. Two databases for the mouse research community, the Mouse Gene Expression Database (GXD)25 and the Edinburgh Mouse Atlas Gene Expression (EMAGE) database26, influenced the design of MISFISHIE. We replaced mouse-specific fields with more organism-neutral ones and eliminated fields deemed unnecessary. In these databases, many experiments entered by curators using information in journal articles have empty fields because the papers lacked sufficient detail. MISFISHIE-compliant publications will result in more complete database descriptions. Although MISFISHIE is primarily designed for peer-reviewed journal articles, it will guide database development as well. For example, the release of ANISEED version 3.0 is based on MISFISHIE rules, and the new schema of this database is MISFISHIE compliant. The inclusion of specific experimental details, such as tissue type, reagents and methods, will allow investigators to more efficiently find precedents for their experiments. For example, an investigator might rapidly search all publications that reported immunoperoxidase localization of membrane metallo-endopeptidase (CD10, MME) in the human prostate using a database and retrieve information on how the gene localization experiments were conducted.

Proposed guidelines

An abbreviated MISFISHIE checklist is provided in Box 1. A printable version is available at http://scgap.systemsbiology.net/standards/misfishie/MISFISHIE_Checklist_2007-10-28.xls. The complete checklist is available at http://scgap.systemsbiology.net/standards/misfishie. One example of real experimental data annotated according to MISFISHIE is given in Supplementary Note 1 online; more examples accompany the complete checklist at the preceding URL.

Box 1: Checklist

This checklist should be used in conjunction with the full specification, not instead of it.

  • Experiment design

    • Experiment description

    • Assay type(s) (IHC, ISH, GFP, etc.)

    • Experimental design (multiple reporter survey, specimen variation)

    • Experimental factors (variables in assays such as reporter or specimen, etc.)

    • Total number of assays performed

    • (optional) URL for more information

    • Contact information

  • Biomaterials and treatments

    • Attributes of the individual (organism, sex, strain, line, developmental stage, age, etc.)

    • Physiologic state (e.g., normal versus disease)

    • Relevant exogenous factors (treatment, special diet, etc.)

    • Anatomic source of specimens

    • Provider of the specimens

    • Assay preparation protocol (enough to reproduce?)

  • Reporter (probe or antibody) information

    • Unambiguous reporter identification, ideally genomic

    • Full sequence or clone ID of the reporters

    • Protocol for obtaining exact reporter (purchase from_____, create, etc.)

    • Other important attributes (mono- or polyclonal, generating organism, etc.)

  • Staining protocols and parameters

    • Detection method (number of reporters, detection reagent and systems)

    • Staining protocol (enough to reproduce?)

    • Details about positive and negative controls

  • Imaging data and parameters

    • The digital images for each assay (can download to your computer and explore?)

    • Detection method

    • (optional) Control images and imaging acquisition protocol

  • Image characterizations

    • Definition of structural units (from ontology or manual definition)

    • Definition of intensity scale

    • Characterization of results in tabular form (digital or printed)

    • (optional) Characterization protocol

The checklist covers six types of information; for each, the guiding principle is to supply enough information to allow the experiment to be reproduced.

  1. Experimental design

  2. Biomaterials (specimens) and treatments (section or whole-mount preparation)

  3. Reporters (probes or antibodies)

  4. Staining

  5. Imaging data

  6. Image characterizations
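For authors or reviewers who want to track compliance programmatically, the six sections can be held in a simple lookup table. The following is only a sketch: MISFISHIE does not prescribe any encoding, and the item names (shortened from Box 1), the `REQUIRED_ITEMS` dictionary and the `missing_items` function are hypothetical. Optional checklist items are deliberately left out.

```python
# Hypothetical encoding of the six MISFISHIE sections. The item names are
# shortened from Box 1; the structure itself is an assumption of this sketch,
# not part of the specification. Optional items are omitted.
REQUIRED_ITEMS = {
    "experimental_design": [
        "experiment_description", "assay_types", "design_type",
        "experimental_factors", "total_assays", "contact",
    ],
    "biomaterials_and_treatments": [
        "individual_attributes", "physiologic_state", "exogenous_factors",
        "anatomic_source", "specimen_provider", "preparation_protocol",
    ],
    "reporters": [
        "reporter_identification", "sequence_or_clone_id",
        "source_protocol", "other_attributes",
    ],
    "staining": ["detection_method", "staining_protocol", "controls"],
    "imaging_data": ["digital_images", "detection_method"],
    "image_characterizations": [
        "structural_units", "intensity_scale", "results_table",
    ],
}


def missing_items(report):
    """Return {section: [item, ...]} for required items that are absent or empty."""
    gaps = {}
    for section, items in REQUIRED_ITEMS.items():
        provided = report.get(section, {})
        absent = [item for item in items if not provided.get(item)]
        if absent:
            gaps[section] = absent
    return gaps
```

An empty result means every required item has some content; it says nothing about whether that content is detailed enough to reproduce the experiment, which remains a judgment for authors and reviewers.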

Ontologies, such as the MGED Ontology (MO)4 or the Ontology for Biomedical Investigations (OBI; formerly named FuGO)27, are extremely advantageous as a source of descriptors because they facilitate computational searches of data. For terms outside the scope of OBI, such as those in anatomy, another appropriate ontology may be used. A good list of ontologies is maintained at the Open Biomedical Ontologies (OBO) website (Table 1). Use of OBI and other ontologies will be especially important as MISFISHIE-supporting applications and databases are developed. Many of the terms used in this specification are already defined in OBI.

Experimental design. The experiment as a whole is described by the following:

  1. Experiment description: the aims of the experiment.

  2. Assay type(s): for example, IHC, ISH, lectin affinity histochemistry, cell lineage– or tissue-specific reporter expression.

  3. Experiment design type: for example, comparisons of normal versus diseased tissue, of multiple tissue or embryo specimens of similar type, or of multiple probes or antibodies applied to the same tissue; a localization screen; etc. The MGED Ontology ExperimentDesignType has many entries categorizing design type.

  4. Experimental factors: the parameters or conditions that are tested, such as probe or antibody, disease state, genetic variation, structural unit, age, etc. Again, the MGED Ontology is a rich source of terms for describing the factors being tested.

  5. Total number of assays performed in the experiment: an assay is defined as one instance of a hybridization/stain of a single specimen with a single reporter. Thus, the result of a tissue microarray consisting of a 10 × 10 array of tissues would be counted as 100 assays. If replicates or reruns are a component of the experimental design, provide details, including the number of replicates per tissue, per reporter, etc.

  6. URL of any websites or database accession numbers (if available) pertinent to the experiment.

  7. Contact information for communicating with the experimenters.
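The assay-counting rule in item 5 is simple arithmetic, sketched below. The function name `count_assays` is hypothetical, and treating each replicate as a separate assay is an assumption of this sketch; the specification asks only that replicate details be reported.

```python
def count_assays(n_specimens, n_reporters, n_replicates=1):
    """Count assays, where one assay is one stain of one specimen with one reporter.

    Counting every replicate as its own assay is an assumption of this sketch.
    """
    if min(n_specimens, n_reporters, n_replicates) < 1:
        raise ValueError("all counts must be at least 1")
    return n_specimens * n_reporters * n_replicates


# A 10 x 10 tissue microarray stained with a single reporter
tma_assays = count_assays(n_specimens=10 * 10, n_reporters=1)
print(tma_assays)  # 100
```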

Biomaterials (specimens) and treatments (section or whole-mount preparation). Describing specimens comprehensively is challenging, as they may have dozens or hundreds of characteristics. This is especially true for material from human subjects when clinical information is available. Characteristics that are known to differ among specimens should be provided with each specimen, whereas common attributes of all the specimens may be provided only once. The biological sample is described by the following:

  1. Origin of the specimens.

    1. Attributes of the individual(s). The organism species must be named, preferably using the US National Center for Biotechnology Information (NCBI) taxonomy, and for non-human organisms the strain and mutant alleles should be named according to the accepted standards for that organism. Other attributes may include, but are not limited to, sex, age, developmental stage, genotype, and phenotype.

    2. Physiologic state of the individual(s) (normal versus diseased).

    3. Relevant exogenous factors (for example, treatment, special diet).

    4. Anatomic source of the tissue or cell sample.

    5. Provider of the specimens.

    The information necessary to reproduce the biomaterials is not limited to the above examples. Use of an ontology or controlled vocabulary is highly encouraged, although a standardized set of terms and a single, widely accepted ontology are not yet available. The rationale for providing specific structural detail is that the location of an object, such as a cell type that is being studied, may correlate with expression of a specific gene by that cell type. Structural detail may be important not only for cases where gene expression depends on tissue handling (for example, there is stronger labeling at the specimen edges) but also in cases where there is heterogeneity even within a single microanatomical unit (for example, in lung tumors, cell cycle regulatory genes are highest at the periphery)28.

  2. Manner of preparation of the specimens.

    1. Nature of the specimens (for example, whole tissue, whole mounts of tissue, tissue sections, thickness of sections, whole cells, or sections of cells).

    2. Manner in which the specimens were prepared for the experiments (for example, fixed specimens, with type of fixative and duration of fixation; fresh, non-fixed, non-frozen specimens; or non-fixed, frozen specimens; sections mounted on slides versus sections floating in reagents).

    3. Protocols used. Referencing previously published protocols is permissible if the protocols are appropriately detailed and were strictly followed.

    Sensitivity of the immunoreaction of some gene products to fixation is exemplified by the observation that cyclin-dependent kinase inhibitor p27Kip1 (CDKN1B) was least frequent and least intense in prostate cancer cells that were farthest from the cut surface of a fixed tissue. These were the cells that were least rapidly fixed29.

Reporters (probes or antibodies). Reporters (probes, lectins or antibodies) can differ in reactivity from lot to lot and from manufacturer to manufacturer. A manufacturer's literature usually provides most of the needed information but may not be permanent. For privately produced reporters, enough information should be provided to enable another lab to generate them. Validation of reporters in the current literature is often poor. MISFISHIE does not at present require that researchers validate reporters, but such validation is encouraged and should be reported when performed.

  1. Unambiguous genomic identification of each reporter:

    1. For in situ hybridizations, provide the corresponding GenBank/EMBL/DDBJ accession number and, if applicable, the start and end nucleotide positions of the probe within that sequence. Also, provide the accession number version or database release version.

    2. For antibodies, provide the protein identifier, including specific version information for the accession number or database release.

  2. Full sequence of each probe or clone number of each antibody. For fluorescent protein experiments, the promoter sequence should be specified. In each case, provide the method by which the reporter was characterized.

    1. If the sequence or clone number is not known, the template or clone must be made publicly available. Provide specific details on how the template or clone may be obtained.

    2. In tissue localization experiments based on expression of a fluorescent protein reporter gene fused to the promoter of a gene of interest, what is important is not the sequence of the reporter but the sequence of the promoter, which confers cell and tissue specificity on the reporter.

  3. Protocol(s) for how the reporters were designed and produced or the source from which they were obtained.

    1. For reporters purchased from a company, the company name, address, catalog number and lot number should be provided.

    2. For a custom-made antibody, the putative antigen and references to studies that characterize the sensitivity and specificity of the antibody in tissue immunostains should be given.

  4. Additional attributes of the reporter:

    1. For antibodies, the type of primary antibody (monoclonal or polyclonal), the immunoglobulin isotype and the organism in which the antibody was generated.

    2. For lectins, the full name (for example, Dolichos biflorus), the source of the lectin (for example, which company produced it), how it was detected (for example, whether it was fluorescently labeled or biotinylated, with follow-up histochemical analysis), and how it was labeled (if, for example, the investigators labeled the lectin themselves, they should give source of the reagents, the method and/or the labeling kit).
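As an illustration of item 1, an unambiguous probe identifier might be recorded as an accession, its version and, where applicable, the probe coordinates. This is a minimal sketch: the `ProbeIdentifier` class and its field names are hypothetical, the 1-based inclusive coordinate convention is an assumption, and `XX000000` is a placeholder rather than a real accession.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProbeIdentifier:
    """Hypothetical record for the identification of an ISH probe."""
    accession: str                # sequence database accession, without version
    version: int                  # accession version number
    start: Optional[int] = None   # assumed 1-based start of the probe
    end: Optional[int] = None     # assumed 1-based inclusive end of the probe

    def __post_init__(self):
        if self.version < 1:
            raise ValueError("version must be at least 1")
        if (self.start is None) != (self.end is None):
            raise ValueError("start and end must be given together")
        if self.start is not None and not (1 <= self.start <= self.end):
            raise ValueError("need 1 <= start <= end")

    def label(self):
        """Return 'accession.version', with ':start-end' appended when known."""
        base = f"{self.accession}.{self.version}"
        if self.start is None:
            return base
        return f"{base}:{self.start}-{self.end}"


# Placeholder accession, not a real database entry
probe = ProbeIdentifier("XX000000", 2, start=101, end=650)
print(probe.label())  # XX000000.2:101-650
```

Recording the version alongside the accession matters because the underlying sequence record can be revised after the experiment is published.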

Staining. Staining protocols vary considerably, and the merits of standardizing them have been discussed extensively in the literature.

  1. Number of detectable reporters in the hybridization or stain (for example, more than one for multiple-dye fluorescence microscopy) and details about the detection method:

    1. Detection reagent (for example, fluors, enzyme substrates, gold particles).

    2. Source of the detection system and description of the reaction.

  2. Protocol to produce the hybridization or immunostain, including a description of how the tissue (organism, organ or section) was mounted onto the slide or substrate and treatments of the section (for example, IHC protocol inclusive of parameters such as buffer, temperature, post-wash conditions, etc.). Referencing previously published protocols is permissible if the protocols are appropriately detailed and were strictly followed. Also:

    1. What steps, if any, were taken to decrease nonspecific reaction product. For example, in immunoperoxidase experiments, the specimen preparation may be preincubated with albumin solution to block nonspecific binding or with peroxide solution to block signal due to endogenous peroxidase.

    2. Use of an antigen or gene product retrieval method.

  3. Information about assay controls: the nature of both positive and negative tissue and reporter controls (or state if controls were not performed). The same level of detail for the tissue controls should be reported as for the cells or tissues that are being studied. Optionally, provide specificity reporter controls, such as competitive inhibition with either purified protein or peptide in IHC.

Imaging data. Although the MIAME specification stops short of requiring microarray image data, we propose that representative IHC or ISH images be provided, as interpretation of these images varies among observers. Images are not needed to reproduce an experiment, but they aid in the analysis. Both positive and negative results should be reported.

Repositories for images from gene expression localization experiments exist for several model organisms, such as GXD25 and EMAGE26 for mouse and ZFIN30 for zebrafish, but not for human. A general, organism-independent database would be very valuable, as it could provide examples of tissue localization studies, serve as a reference site for verifying the tissue localizations of reporter reagents and provide accession numbers for publications. Two projects aim to provide these features. MorphBank is a general-purpose image repository for biological research that is already available. BioImage is an image repository under construction (ref. 31).

The information on imaging data should include:

  1. Digital images for each assay in the study, digitally available for download without charge. Images should be of sufficient resolution to allow independent characterization and provided in a standard file format (for example, JPEG, PNG, GIF, TIFF). Images should be named or tagged with the reporter and specimen that they represent.

  2. Detection method by which hybridization or staining is observed (for example, for each channel, fluorescence excitation and emission wavelengths if more than one reporter is used). If the detection method is the same for all images, it need only be mentioned once.

  3. Images for the controls are optional.
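Item 1 asks that images be named or tagged with the reporter and specimen they represent. One possible file-naming convention is sketched below. The double-underscore separator and the functions `image_name` and `parse_image_name` are hypothetical; MISFISHIE does not prescribe a naming scheme, and embedded metadata tags would serve equally well.

```python
from pathlib import PurePath

# File extensions for the standard formats named in item 1
STANDARD_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".tif", ".tiff"}


def _safe(text):
    """Replace runs of whitespace with single hyphens."""
    return "-".join(text.split())


def image_name(reporter, specimen, extension):
    """Build '<reporter>__<specimen><ext>' (hypothetical convention)."""
    ext = extension.lower()
    if not ext.startswith("."):
        ext = "." + ext
    if ext not in STANDARD_EXTENSIONS:
        raise ValueError(f"not a standard image format: {extension}")
    return f"{_safe(reporter)}__{_safe(specimen)}{ext}"


def parse_image_name(name):
    """Recover (reporter, specimen) from a name built by image_name."""
    reporter, specimen = PurePath(name).stem.split("__", 1)
    return reporter, specimen


name = image_name("CD10", "prostate section 3", "TIFF")
print(name)  # CD10__prostate-section-3.tiff
```

The convention breaks if a reporter name itself contains a double underscore, so a real implementation would need to escape or forbid that sequence.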

Image characterizations. Interpretations of the results should be reported in categorized tabular format so that they can be easily stored in a database, queried and compared with other expression data. The following minimum requirements can be supplemented with further characterizations as needed.

  1. Ontology entries, including reference to the ontology terms, accession numbers or terms and definitions if sufficient detail cannot be found in an existing ontology32,33,34,35 for individual structural units used for classification. (Note that some ontologies, such as the College of American Pathologists' Systematized Nomenclature of Medicine–Clinical Terms (SNOMED CT) and the Unified Medical Language System (UMLS) of the US National Institutes of Health's National Library of Medicine, may contain licensing restrictions that make them unavailable to some or limit the use of the terms; a MISFISHIE-compliant document that contains SNOMED CT entries or some UMLS entries may not be legally redistributable36). Structural units include organ, tissue, cell and subcellular component. List and characterize only the relevant structural units and not those that are visible in assays or slides but irrelevant.

  2. Intensity scale, ideally one from the MGED Ontology. For example, a three-level scale of present, absent or equivocal might be appropriate for evaluating IHC stains. However, any appropriate scale may be used as long as each gradation of intensity in the scale is defined.

  3. For each relevant structural unit in each assay or image:

    1. Staining intensity or the fraction of the structural unit's population showing each intensity (see example below).

    2. Other optional annotations or characterizations of the structural unit: for example, feature density, qualitative characteristics or spatial distribution of the structural unit or staining. The use of referenced ontology terms is encouraged.

    Both positive and negative calls of staining relevant to the experiment should be reported. A negative result is an upper limit to the expression level, where the limit is usually not well known. If some structural units cannot be characterized for some reporters, corresponding calls may be null. For example:

    Luminal epithelial cell: present

    Basal epithelial cell: absent

    is sufficient; or, when appropriate, more detail:

    Luminal epithelial cell: 90% present, 10% equivocal, 0% absent

    Basal epithelial cell: 10% present, 10% equivocal, 80% absent

    Unless only a few expression calls are presented, it is clearest if the calls are presented in tabular form.

  4. Optionally, the protocol for the characterization and information about the basic characterization technique. For example, how many observers performed the characterizations, whether the characterizations were performed from the images themselves or visually through the instrument and any exceptions or assumptions made in characterizing the data. One example of a well described characterization protocol may be found in ref. 37. We also note that the use of digital images may have advantages in terms of replication and decreased intra- and interobserver variability38.
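The expression calls in the example above lend themselves to a small table. The sketch below assumes the three-level scale of present, equivocal and absent, integer percentages that must sum to 100 for each structural unit, and a null call represented by `None`. The names `INTENSITY_SCALE`, `validate_call` and `calls_table` are hypothetical, and the tab-separated layout is only one of many acceptable tabular forms.

```python
INTENSITY_SCALE = ("present", "equivocal", "absent")


def validate_call(fractions):
    """Check that a call uses only scale levels and that its percentages sum to 100."""
    unknown = set(fractions) - set(INTENSITY_SCALE)
    if unknown:
        raise ValueError(f"levels not in intensity scale: {sorted(unknown)}")
    total = sum(fractions.values())
    if total != 100:
        raise ValueError(f"percentages sum to {total}, expected 100")


def calls_table(calls):
    """Render {structural unit: percentages or None} as tab-separated text."""
    lines = ["structural_unit\t" + "\t".join(INTENSITY_SCALE)]
    for unit, fractions in calls.items():
        if fractions is None:  # unit could not be characterized: null call
            row = ["null"] * len(INTENSITY_SCALE)
        else:
            validate_call(fractions)
            row = [str(fractions.get(level, 0)) for level in INTENSITY_SCALE]
        lines.append(unit + "\t" + "\t".join(row))
    return "\n".join(lines)


calls = {
    "Luminal epithelial cell": {"present": 90, "equivocal": 10, "absent": 0},
    "Basal epithelial cell": {"present": 10, "equivocal": 10, "absent": 80},
}
print(calls_table(calls))
```

A simple call such as "present" can be written in the same form as `{"present": 100}`, so both levels of detail share one table.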

Survey of the recent literature

To compare MISFISHIE with current publication practice, we examined articles reporting on IHC or ISH in the last 7 years. Three articles39,40,41 were assessed and discussed by all ten ad hoc reviewers to minimize inter-reviewer variability. Another 29 articles42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69 were assigned to individual reviewers. Each reviewer assessed articles as if reviewing a manuscript submitted to a journal that required MISFISHIE compliance. Compliance for each MISFISHIE subsection was rated on a scale of 0 to 10, where a 10 indicates inclusion of all information needed to understand or reproduce the experiment without making any assumptions. Scores of 8 and 9 were considered a low pass; the reviewer could reproduce the experiment with a few assumptions. One example of a paper deemed MISFISHIE compliant is ref. 64.

Of the 32 papers assessed, four (13%) were deemed MISFISHIE compliant in all six sections (Table 2). Another 28% were out of compliance with only one section, and 31% did not comply in two sections. More than 90% complied with sections 1 and 2 (experimental design; biomaterials and treatments). Compliance for sections 3 and 4 (reporters; staining) was 75%. Section 5 (imaging data) proved to be the most troublesome, with only 16% of the articles compliant. Finally, 47% complied with section 6 (image characterizations). The reviewers felt that the majority of noncompliant papers would require only modest additions to become compliant, with the possible exception of section 5. This section requires that at least one representative image of each assay be made electronically available in a model organism database, a generic image database, a journal's supplemental data web site or even the author's web site (the least preferable option).

Table 2: Summary of statistics from the MISFISHIE assessment survey of a cohort of selected current literature
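The scoring rule used in the survey (a score of 8 or higher in a section counts as compliant, and a paper is fully compliant only if all six sections pass) can be expressed compactly. The sketch below is not the procedure the reviewers actually used to tabulate results; the function names are hypothetical and the four example score lists are invented purely to show the calculation.

```python
PASS_THRESHOLD = 8  # scores of 8, 9 and 10 count as compliant
N_SECTIONS = 6


def is_fully_compliant(section_scores):
    """A paper is fully compliant if every section meets the threshold."""
    return all(score >= PASS_THRESHOLD for score in section_scores)


def summarize(papers):
    """Return (fraction fully compliant, list of per-section compliant fractions)."""
    n = len(papers)
    fully = sum(is_fully_compliant(p) for p in papers) / n
    per_section = [
        sum(p[i] >= PASS_THRESHOLD for p in papers) / n
        for i in range(N_SECTIONS)
    ]
    return fully, per_section


# Invented scores for four imaginary papers, one list of six section scores each
example_papers = [
    [10, 9, 8, 8, 8, 10],
    [10, 10, 7, 9, 3, 8],
    [9, 8, 8, 8, 2, 5],
    [10, 10, 10, 10, 10, 10],
]
fully, per_section = summarize(example_papers)
print(fully)        # 0.5
print(per_section)  # [1.0, 1.0, 0.75, 1.0, 0.5, 0.75]
```

Applied to the survey itself, four fully compliant papers out of 32 gives a fraction of 0.125, which is reported above as 13%.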


MISFISHIE was developed by the Stem Cell Genome Anatomy Projects consortium of the US National Institutes of Health National Institute of Diabetes & Digestive & Kidney Diseases to facilitate data sharing within the consortium and was discussed with members of the larger research community. The history of the creation of MISFISHIE and the lessons learned from it70 may be helpful to others aiming to create similar guidelines for other data types. We expect that MISFISHIE will be updated as other localization methods, such as DNA in situ hybridization to chromosomes, are implemented. There is still considerable room for researching the scientific best practice for performing and reporting these types of studies, and the eventual accepted specification will be achieved through discussion and consensus. Suggestions from the community are actively encouraged and will be collected and incorporated into an eventual second release, published at the MISFISHIE domain of the MGED web site: http://www.mged.org/Workgroups/MISFISHIE. Comments may be addressed to the email distribution list dedicated to discussion about MISFISHIE: mged-misfishie@lists.sourceforge.net. Some frequently asked questions and answers are listed in Box 2. After a suitable period of dialog and revision by the community, and if there is widespread acceptance by the community, we would encourage reviewers, journal editors and funding agencies to promote compliance with MISFISHIE.

Box 2: Frequently asked questions about MISFISHIE


What is MISFISHIE?

• MISFISHIE is a set of guidelines for reporting the relevant materials, methods and results of a gene expression localization experiment in a way that allows a researcher from a different lab to validate the findings.

When should the MISFISHIE guidelines be applied?

• Authors of a manuscript should include the information requested in MISFISHIE.

• Reviewers should use MISFISHIE to ensure that a manuscript describes the materials, methods and results adequately.

Does MISFISHIE provide guidelines for how I should perform my experiment?

• No, MISFISHIE by design does not promote any particular technique or best practice for carrying out or analyzing experiments. It merely provides guidelines for disclosing what was done.

What data format should be used to encode MISFISHIE-compliant information?

• At present, it is sufficient to encode this information in free text in a manuscript. Data formats that allow the encoding of MISFISHIE-compliant information in a machine-readable format are becoming available and might be required for submission to a database.

Why are there so many detailed requirements?

• The experiments can be complex and sensitive to the materials and methods used. These variables must be explicitly described to validate or reproduce the results.

Why does MISFISHIE not have more detailed requirements?

• Only those requirements deemed truly minimum for all gene expression localization experiments are included. If some critical aspect of the experiment at hand is not listed in MISFISHIE, it should nonetheless be reported; MISFISHIE should not be construed as a maximum information specification.

Our survey of recent articles indicated that only 13% of published works are fully compliant and that most fail by not making images of assays used in the study digitally accessible to the research community. Most of the surveyed papers could be brought into compliance by uploading the images into a repository and adding fewer than a dozen more sentences of description. If article length constraints hinder MISFISHIE compliance, the required information could be provided in supplementary information. Several of the model organism databases, including GXD25 and EMAGE26 for mouse and ZFIN30 for zebrafish, are already able to accept and archive the results from a publication that provides all information that MISFISHIE specifies. We encourage authors to submit their data to these databases upon submission of the manuscript for publication.

Note: Supplementary information is available on the Nature Biotechnology website.


  1. Quantitative immunohistochemistry: a new tool for surgical pathology? Am. J. Clin. Pathol. 90, 324–325 (1988).
  2. et al. Minimum information about a microarray experiment (MIAME)—toward standards for microarray data. Nat. Genet. 29, 365–371 (2001).
  3. et al. Design and implementation of Microarray Gene Expression Markup Language (MAGE-ML). Genome Biol. 3, RESEARCH0046 (2002).
  4. & The MGED ontology: a framework for describing functional genomics experiments. Comp. Funct. Genomics 4, 127–132 (2003).
  5. et al. A systematic approach to modeling, capturing, and disseminating proteomics experimental data. Nat. Biotechnol. 21, 247–254 (2003).
  6. et al. PEDRo: a database for storing, searching and disseminating experimental proteomics data. BMC Genomics 5, 68 (2004).
  7. , , , & Jr. An object model and database for functional genomics. Bioinformatics 20, 1583–1590 (2004).
  8. et al. CEBS object model for systems biology data, SysBio-OM. Bioinformatics 20, 2004–2015 (2004).
  9. et al. A proposed framework for the description of plant metabolomics experiments and their results. Nat. Biotechnol. 22, 1601–1606 (2004).
  10. et al. Summary recommendations for standardization and reporting of metabolic analyses. Nat. Biotechnol. 23, 833–838 (2005).
  11. , & The tissue microarray data exchange specification: a community-based, open source tool for sharing tissue microarray data. BMC Med. Inform. Decis. Mak. 3, 5 (2003).
  12. , , & Minimum information about a functional genomics experiment: the state of microarray standards and their extension to other technologies. Drug Discov. Today Targets 3, 159–164 (2004).
  13. et al. The Functional Genomics Experiment model (FuGE): an extensible framework for standards in functional genomics. Nat. Biotechnol. 25, 1127–1133 (2007).
  14. et al. A simple spreadsheet-based, MIAME-supportive format for microarray data: MAGE-TAB. BMC Bioinformatics 7, 489 (2006).
  15. , & Standards for systems biology. Nat. Rev. Genet. 7, 593–605 (2006).
  16. et al. HUPO — Proteomics Standards Initiative (PSI). OMICS 10, 145–151 (2006).
  17. & MGED standards. OMICS 10, 138–144 (2006).
  18. et al. A strategy capitalizing on synergies: the Reporting Structure for Biological Investigation (RSBI) working group. OMICS 10, 164–171 (2006).
  19. et al. Promoting coherent minimum reporting requirements for biological and biomedical investigations: the MIBBI Project. Nat. Biotechnol. (in the press).
  20. , , & COMPARE, a multi-organism system for cross-species data comparison and transfer of information. Bioinformatics, published online 1 December 2007 (doi:10.1093/bioinformatics/btm599).
  21. et al. 4DXpress: a database for cross-species expression pattern comparisons. Nucleic Acids Res. 36 (database issue), D847–D853 (2007).
  22. Methodologic standardization in immunohistochemistry: a doorway opens. Appl. Immunohistochem. 1, 229–231 (1993).
  23. An exaltation of experts: concerted efforts in the standardization of immunohistochemistry. Hum. Pathol. 25, 2–11 (1994).
  24. et al. Reporting recommendations for tumor marker prognostic studies (REMARK). J. Natl. Cancer Inst. 97, 1180–1184 (2005).
  25. et al. The mouse Gene Expression Database (GXD): 2007 update. Nucleic Acids Res. 35 (database issue), D618–D623 (2007).
  26. et al. EMAP and EMAGE: a framework for understanding spatially organized data. Neuroinformatics 1, 309–325 (2003).
  27. et al. Development of FuGO – an ontology for functional genomics experiments. OMICS 10, 199–204 (2006).
  28. et al. Active cyclin A-CDK2 complex, a possible critical factor for cell proliferation in human primary lung carcinomas. Am. J. Pathol. 153, 963–972 (1998).
  29. , , & Inadequate formalin fixation decreases reliability of p27 immunohistochemical staining: probing optimal fixation time using high-density tissue microarrays. Hum. Pathol. 33, 756–760 (2002).
  30. et al. The Zebrafish Information Network (ZFIN): the zebrafish model organism database. Nucleic Acids Res. 31, 241–243 (2003).
  31. & The BioImage Database Project: organizing multidimensional biological images in an object-relational database. J. Struct. Biol. 125, 97–102 (1999).
  32. & A reference ontology for biomedical informatics: the Foundational Model of Anatomy. J. Biomed. Inform. 36, 478–500 (2003).
  33. , & An ontology for cell types. Genome Biol. 6, R21 (2005).
  34. et al. An internet-accessible database of mouse developmental anatomy based on a systematic nomenclature. Mech. Dev. 74, 111–120 (1998).
  35. , , , & The Adult Mouse Anatomical Dictionary: a tool for annotating and integrating data. Genome Biol. 6, R29 (2005).
  36. A tool for sharing annotated research data: the “Category 0” UMLS (Unified Medical Language System) vocabularies. BMC Med. Inform. Decis. Mak. 3, 6 (2003).
  37. et al. Expression of luminal and basal cytokeratins in human breast carcinoma. J. Pathol. 203, 661–671 (2004).
  38. et al. Web-based tissue microarray image data analysis: initial validation testing through prostate cancer Gleason grading. Hum. Pathol. 32, 417–427 (2001).
  39. & Characterization of prostate cell types by CD cell surface molecules. Am. J. Pathol. 160, 37–43 (2002).
  40. et al. Fluorescence in situ hybridization analysis of chromosome 12p in paraffin-embedded tissue is useful for establishing germ cell origin of metastatic tumors. Mod. Pathol. 17, 1309–1313 (2004).
  41. et al. Basal cell proliferations of the prostate other than usual basal cell hyperplasia: a clinicopathologic study of 23 cases, including four carcinomas, with a proposed classification. Am. J. Surg. Pathol. 28, 1289–1298 (2004).
  42. et al. Prostate stem cell antigen is overexpressed in human transitional cell carcinoma. Cancer Res. 61, 4660–4665 (2001).
  43. et al. High levels of phosphorylated form of Akt-1 in prostate cancer and non-neoplastic prostate tissues are strong predictors of biochemical recurrence. Clin. Cancer Res. 10, 6572–6578 (2004).
  44. et al. The distribution of drug-efflux pumps, P-gp, BCRP, MRP1 and MRP2, in the normal blood-testis barrier and in primary testicular tumours. Eur. J. Cancer 40, 2064–2070 (2004).
  45. et al. Prospective evaluation of AMACR (P504S) and basal cell markers in the assessment of routine prostate needle biopsy specimens. Hum. Pathol. 35, 1462–1468 (2004).
  46. et al. Syndecan-1 expression in locally invasive and metastatic prostate cancer. Urology 63, 402–407 (2004).
  47. , & Growth and differentiation of progenitor/stem cells derived from the human mammary gland. Exp. Cell Res. 297, 444–460 (2004).
  48. , , & Localisation of breast cancer resistance protein in microvessel endothelium of human brain. Neuroreport 13, 2059–2063 (2002).
  49. , , & Molecular phenotype of airway side population cells. Am. J. Physiol. Lung Cell. Mol. Physiol. 286, L624–L630 (2004).
  50. et al. Normal and malignant prostate epithelial cells differ in their response to hepatocyte growth factor/scatter factor. Am. J. Pathol. 159, 579–590 (2001).
  51. et al. Isolation of muscle derived stem cells from rat and its smooth muscle differentiation. Mol. Cells 17, 57–61 (2004); erratum 17, 381 (2004).
  52. et al. The breast cancer resistance protein BCRP (ABCG2) concentrates drugs and carcinogenic xenotoxins into milk. Nat. Med. 11, 127–129 (2005).
  53. et al. High expression of the Met receptor in prostate cancer metastasis to bone. Urology 60, 1113–1117 (2002).
  54. et al. Investigation of MRP-1 protein and MDR-1 P-glycoprotein expression in invasive breast cancer: a prognostic study. Int. J. Cancer 112, 286–294 (2004).
  55. , & Analysis of the MRP4 drug resistance profile in transfected NIH3T3 cells. J. Natl. Cancer Inst. 92, 1934–1940 (2000).
  56. et al. High level of androgen receptor is associated with aggressive clinicopathologic features and decreased biochemical recurrence-free survival in prostate cancer patients treated with radical prostatectomy. Am. J. Surg. Pathol. 28, 928–934 (2004).
  57. et al. Persistent expression of the ATP-binding cassette transporter, Abcg2, identifies cardiac SP cells in the developing and adult heart. Dev. Biol. 265, 262–275 (2004).
  58. , , & Human embryonic stem cells express an immunogenic nonhuman sialic acid. Nat. Med. 11, 228–232 (2005).
  59. , , & Urothelium facilitates the recruitment and trans-differentiation of fibroblasts into smooth muscle in acellular matrix. J. Urol. 170, 1628–1632 (2003).
  60. et al. Alterations in smooth muscle contractile and cytoskeleton proteins and interstitial cells of Cajal in megacystis microcolon intestinal hypoperistalsis syndrome. J. Pediatr. Surg. 38, 749–755 (2003).
  61. et al. Androgen receptor levels in prostate cancer epithelial and peritumoral stromal cells identify non-organ confined disease. Prostate 63, 19–28 (2005).
  62. et al. Phenotypic heterogeneity of end-stage prostate carcinoma metastatic to bone. Hum. Pathol. 34, 646–653 (2003).
  63. et al. Quantitative determination of expression of the prostate cancer protein alpha-methylacyl-CoA racemase using automated quantitative analysis (AQUA): a novel paradigm for automated and continuous biomarker measurements. Am. J. Pathol. 164, 831–840 (2004).
  64. et al. JAGGED1 expression is associated with prostate cancer metastasis and recurrence. Cancer Res. 64, 6854–6857 (2004).
  65. et al. C-kit receptor expression in Ewing's sarcoma: lack of prognostic value but therapeutic targeting opportunities in appropriate conditions. J. Clin. Oncol. 21, 1952–1960 (2003).
  66. et al. Androgen-independent prostate cancer is a heterogeneous group of diseases: lessons from a rapid autopsy program. Cancer Res. 64, 9209–9216 (2004).
  67. et al. Genes expressed in human tumor endothelium. Science 289, 1197–1202 (2000).
  68. et al. Expression of the human cachexia-associated protein (HCAP) in prostate cancer and in a prostate cancer animal model of cachexia. Int. J. Cancer 105, 123–129 (2003).
  69. & Prostate stem cell antigen (PSCA) expression in human prostate cancer tissues: implications for prostate carcinogenesis and progression of prostate cancer. Jpn. J. Clin. Oncol. 34, 414–419 (2004).
  70. et al. Minimum Information Specification For In Situ Hybridization and Immunohistochemistry Experiments (MISFISHIE). OMICS 10, 205–208 (2006).


We thank R. Drysdale, L. Eichner, M. Heiskanen and M. Westerfield for comments and discussions during the preparation of the MISFISHIE specification and C. Emswiler for assistance with the figures. This work was funded in part with support from the US National Institute of Diabetes & Digestive & Kidney Diseases to members of the Stem Cell Genome Anatomy Projects Consortium, including DK63483 to J. Gordon, DK63481 to I. Lemischka, DK63400 to M. Little, DK63630 to A. Liu and DK63328 to L. Zon (Children's Hospital Boston).

Author information


  1. Institute for Systems Biology, 1441 N 34th Street, Seattle, Washington 98103, USA.

    • Eric W Deutsch, David Campbell, Young Ah Goo, Michael H Johnson, Asa J Oudes, Laura E Pascal, Laura Walashek & Alvin Y Liu
  2. Department of Biochemistry, Stanford University School of Medicine, 279 Campus Drive West, Stanford, California 94305, USA.

    • Catherine A Ball
  3. Association for Pathology Informatics, 9650 Rockville Pike, Bethesda, Maryland 20814, USA.

    • Jules J Berman
  4. Johns Hopkins University School of Medicine, PELICAN Laboratory, Departments of Pathology, Health Information Sciences, Genetic Medicine, Oncology, and Urology, Baltimore, Maryland 21287, USA.

    • G Steven Bova
  5. European Molecular Biology Laboratory (EMBL)–European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

    • Alvis Brazma, Helen E Parkinson & Susanna-Assunta Sansone
  6. Department of Microbiology, University of Washington, Seattle, Washington 98195, USA.

    • Roger E Bumgarner
  7. Medical Research Council (MRC) Clinical Sciences Centre Microarray Centre, Imperial College School of Medicine, London W12 0NN, UK.

    • Helen C Causton
  8. MRC Human Genetics Unit, Western General Hospital, Crewe Road, Edinburgh EH4 2XU, UK.

    • Jeffrey H Christiansen & Duncan R Davidson
  9. Institut de Biologie du Développement de Marseille, UMR 6216, Centre National de la Recherche Scientifique (CNRS)/Université de la Méditerranée, Parc Scientifique de Luminy, Case 907, F-13288 Marseille Cedex 9, France.

    • Fabrice Daian, Delphine Dauga, Gregory Gimenez & David Salgado
  10. Department of Urology, University of Washington, Seattle, Washington 98195, USA.

    • Young Ah Goo, Asa J Oudes, Laura E Pascal, Laura Walashek & Alvin Y Liu
  11. Institute for Molecular Bioscience, University of Queensland, St. Lucia, Queensland 4027, Australia.

    • Sean Grimmond
  12. EMBL Heidelberg, Meyerhofstrasse 1, D-69117 Heidelberg, Germany.

    • Thorsten Henrich & Mirana Ramialison
  13. Institute for Medical Genetics, Charité-Campus Benjamin Franklin, and Department of Developmental Genetics, Max-Planck-Institute for Molecular Genetics, Ihnestrasse 73, 14195 Berlin, Germany.

    • Bernhard G Herrmann & Martin Korb
  14. Department of Pathology and Immunology, Washington University School of Medicine, St. Louis, Missouri 63110, USA.

    • Jason C Mills
  15. CNRS UMR 8080, Université Paris-Sud, 91405 Orsay, France.

    • Nicolas Pollet
  16. Dana-Farber Cancer Institute, 44 Binney Street, M232, Boston, Massachusetts 02115, USA.

    • John Quackenbush
  17. The Jackson Laboratory, 600 Main Street, Bar Harbor, Maine 04609, USA.

    • Martin Ringwald
  18. Department of Genetics, Stanford University School of Medicine, 300 Pasteur Drive, Stanford, California 94305-5120, USA.

    • Gavin Sherlock
  19. Center for Bioinformatics and Department of Genetics, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA.

    • Christian J Stoeckert Jr
  20. Division of Gene Regulation and Expression, Wellcome Trust Biocentre, University of Dundee, Dow Street, Dundee DD1 5EH, Scotland, UK.

    • Jason Swedlow
  21. Computational Biology & Bioinformatics Group, Pacific Northwest National Laboratory, PO Box 999, MS K7-90, Richland, Washington 99352, USA.

    • Ronald C Taylor
  22. Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1HH, UK.

    • Anthony Warford
  23. Division of Developmental Neurobiology, National Institute for Medical Research, The Ridgeway, Mill Hill, London NW7 1AA, UK.

    • David G Wilkinson
  24. Stem Cell Program, Division of Hematology/Oncology, Children's Hospital Boston and Harvard Medical School, 300 Longwood Ave., Boston, Massachusetts 02115, USA.

    • Yi Zhou & Leonard I Zon
  25. Department of Pathology, University of Washington, Seattle, Washington 98195-6100, USA.

    • Lawrence D True


Corresponding author

Correspondence to Eric W Deutsch.
