
  • Perspective

From XML to RDF: how semantic web technologies will change the design of 'omic' standards

Abstract

With the ongoing rapid increase in both the volume and the diversity of 'omic' data (genomics, transcriptomics, proteomics and others), the development and adoption of data standards are of paramount importance for realizing the promise of systems biology. A recent trend in data standard development has been to use extensible markup language (XML) as the preferred mechanism for defining data representations. But as illustrated here with a few examples from proteomics data, XML, being syntactic and document-centric, cannot achieve the level of interoperability required by highly dynamic and integrated bioinformatics applications. In the present article, we discuss why semantic web technologies, as recommended by the World Wide Web Consortium (W3C), expand current data standard technology for biological data representation and management.
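The contrast drawn above between a document-centric XML record and a statement-based RDF model can be made concrete with a small sketch. It is illustrative only and is not taken from the article: the element names, the `http://example.org/2de#` namespace, the `hasSpot` and `identifiedAs` predicates, and the `proteinX` identifier are all hypothetical. The sketch flattens a nested XML description of a 2DE gel spot into subject-predicate-object triples, then merges them with statements from a second source by plain set union.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML record for one gel spot; element names are illustrative only.
SPOT_XML = """
<gel id="gel1">
  <spot id="spot42">
    <pI>5.6</pI>
    <mass>48.2</mass>
  </spot>
</gel>
"""

EX = "http://example.org/2de#"  # hypothetical namespace, not from the article


def xml_to_triples(xml_text):
    """Flatten the hypothetical gel XML into (subject, predicate, object) triples."""
    triples = set()
    gel = ET.fromstring(xml_text)
    gel_uri = EX + gel.get("id")
    for spot in gel.findall("spot"):
        spot_uri = EX + spot.get("id")
        # The nesting of <spot> inside <gel> becomes an explicit, named relation.
        triples.add((gel_uri, EX + "hasSpot", spot_uri))
        for child in spot:
            triples.add((spot_uri, EX + child.tag, child.text.strip()))
    return triples


def describe(graph, subject):
    """Return every (predicate, object) pair stated about one subject."""
    return sorted((p, o) for s, p, o in graph if s == subject)


# Statements about the same spot from a second, independent (hypothetical) source.
ms_triples = {(EX + "spot42", EX + "identifiedAs", EX + "proteinX")}

# Merging two sets of statements is a set union; no shared document schema is needed.
merged = xml_to_triples(SPOT_XML) | ms_triples

for predicate, obj in describe(merged, EX + "spot42"):
    print(predicate, obj)
```

In the XML form, the relation between the gel and the spot is implied by nesting and must be known to every parser; in the triple form, each relation is named by a URI, so statements from different sources about the same resource can be combined without agreeing on a document layout first.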


Figure 1: A hypothetical 2DE example.
Figure 2: Data relationships for a spot on a 2DE gel and its XML representation.
Figure 3: Graph model for an RDF statement.
Figure 4: An RDF model for a spot on a 2DE gel.



Acknowledgements

This work was supported by the US National Heart, Lung, and Blood Institute (NHLBI) Proteomics Initiative through contract N01-HV-28181 to the Medical University of South Carolina (Principal Investigator D. Knapp), and its bioinformatics core (core C, Principal Investigator J.S. Almeida) and mathematical modeling project (project 7, Principal Investigator E.O. Voit), as well as by its administrative center, separately funded by the same initiative to the same institution (Principal Investigator M.P. Schachte). The authors also acknowledge support from training grant 1-T15-LM07438-01, “training of toolmakers for Biomedical Informatics”, from the US National Library of Medicine of the National Institutes of Health (NIH/NLM).

Author information


Corresponding author

Correspondence to Jonas S Almeida.

Ethics declarations

Competing interests

The authors declare no competing financial interests.


About this article

Cite this article

Wang, X., Gorlitsky, R. & Almeida, J. From XML to RDF: how semantic web technologies will change the design of 'omic' standards. Nat Biotechnol 23, 1099–1103 (2005). https://doi.org/10.1038/nbt1139


  • DOI: https://doi.org/10.1038/nbt1139
