Virtual cell: How to make one

doi:10.1038/nindia.2011.170

COMMENT
25 November 2011

Constructing a virtual cell out of a real cell calls for massive number crunching of experimental biological data into computable units. The key is to use the right combination of methods and tools, says Pawan Dhar.

A computational model of any cell is a reduced-dimension representation of reality. It converts a dynamic system into a series of snapshots which, when run together, create a movie.

Studying the model and understanding the collective properties of system components vis-à-vis higher-level behavior is the new science of systems biology.

Real to virtual

One starts off by collecting experimental data, converting that into a model and watching the model grow over time. The first step is to make a whole cell map out of gene-RNA-protein interactions.

The most useful map is the one that is based on the thinking: 'if this condition exists, show me the molecular interaction'. If the question involves several organs, then a cell-cell interaction map would be the most useful for starters.

In the next step, the connectivity relationship between any two molecules is represented qualitatively or quantitatively. Qualitative data would be things like: is the gene on or off, is the protein produced or not? Quantitative information, on the other hand, would be: how much protein is produced, for how long is it produced, what is the strength of protein-DNA binding, what is the flux in a certain metabolic pathway at steady state, and so on.
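To make the distinction concrete, here is a minimal sketch of one interaction written both ways. The two molecules A and B, the rate constants and the time step are all hypothetical values chosen for illustration; they do not come from any real system.

```python
# Hypothetical two-component example: activator A switches on target B.

def qualitative_step(state):
    """Boolean update: B is on in the next step only if A is on now."""
    return {"A": state["A"], "B": state["A"]}

def quantitative_step(conc, dt=0.1, k_syn=2.0, k_deg=0.5):
    """One Euler step: B is made at rate k_syn * A and degraded at rate k_deg * B."""
    d_b = k_syn * conc["A"] - k_deg * conc["B"]
    return {"A": conc["A"], "B": conc["B"] + dt * d_b}

# Qualitative view: only on/off information is tracked.
on_off = qualitative_step({"A": True, "B": False})

# Quantitative view: amounts are tracked over time (200 steps of 0.1 time units).
amounts = {"A": 1.0, "B": 0.0}
for _ in range(200):
    amounts = quantitative_step(amounts)

print(on_off["B"], round(amounts["B"], 2))
```

The Boolean version needs no parameters at all, whereas the quantitative version needs a synthesis rate and a degradation rate before it can say anything. With these assumed values B settles near k_syn * A / k_deg.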

The choice of method depends upon availability of data and familiarity with the appropriate tools¹.

While modeling whole cell systems, one has to keep track of hundreds of processes occurring simultaneously. For efficient management of large volumes of data, researchers use special data analysis, data visualisation and data storage tools.

Once the model is constructed, they use simulation and analytical tools to understand its behavior. By altering components and their associated parameters one can study system behavior under various conditions and make appropriate predictions. The model allows one to ask questions like: what would possibly happen if we knocked genes in or out? The accuracy of the answer depends upon the correctness and completeness of the data that go into the model.
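A 'what if' question of this kind might be posed to a model roughly as follows. This is a toy sketch, not a published model: the two-protein cascade, its parameter names and its values are assumptions made for illustration, and the knockout is represented simply by setting one synthesis rate to zero.

```python
def simulate(params, steps=2000, dt=0.01):
    """Euler-integrate a two-protein cascade: P1 is made constitutively and drives P2."""
    p1, p2 = 0.0, 0.0
    for _ in range(steps):
        d_p1 = params["k1"] - params["d1"] * p1
        d_p2 = params["k2"] * p1 - params["d2"] * p2
        p1 += dt * d_p1
        p2 += dt * d_p2
    return p1, p2

# Assumed parameter values; in a real model these would come from experiments.
wild_type = {"k1": 1.0, "d1": 1.0, "k2": 2.0, "d2": 1.0}
knockout = dict(wild_type, k1=0.0)  # gene 1 knocked out: no P1 synthesis

wt_p1, wt_p2 = simulate(wild_type)
ko_p1, ko_p2 = simulate(knockout)
print(f"wild type: P1={wt_p1:.2f}, P2={wt_p2:.2f}")
print(f"knockout:  P1={ko_p1:.2f}, P2={ko_p2:.2f}")
```

In this sketch the downstream protein disappears when the upstream gene is removed. The prediction is only as good as the wiring and the numbers supplied to the model.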

The inventories

Making a virtual cell calls for massive data integration. The data come in the form of a parts inventory, an interaction inventory and a context inventory.

The 'parts inventory' is made of genes, various types of RNAs and proteins. The 'interaction inventory' is composed of RNA-protein, protein-protein, RNA-RNA and protein-DNA interactions stitched into metabolic pathways, gene regulatory networks and signaling networks. The 'context inventory' is composed of all the environmental or cell culture conditions under which the data were collected.
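One way to picture the three inventories is as linked records. The sketch below is only illustrative: the class names, fields and the placeholder entries ("proteinA", "geneB", "heat shock") are hypothetical and do not follow any particular database schema.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Part:
    name: str
    kind: str  # "gene", "RNA" or "protein"

@dataclass(frozen=True)
class Interaction:
    source: str
    target: str
    kind: str  # e.g. "protein-DNA" or "protein-protein"

@dataclass(frozen=True)
class Context:
    description: str  # conditions under which the data were collected

@dataclass
class CellModelData:
    parts: dict = field(default_factory=dict)
    interactions: list = field(default_factory=list)

    def add_part(self, part):
        self.parts[part.name] = part

    def add_interaction(self, interaction, context):
        # Refuse interactions that mention parts missing from the parts inventory.
        for name in (interaction.source, interaction.target):
            if name not in self.parts:
                raise KeyError(f"unknown part: {name}")
        self.interactions.append((interaction, context))

    def interactions_under(self, description):
        """Return the interactions recorded under one experimental context."""
        return [i for i, c in self.interactions if c.description == description]

model = CellModelData()
model.add_part(Part("geneB", "gene"))
model.add_part(Part("proteinA", "protein"))
model.add_interaction(
    Interaction("proteinA", "geneB", "protein-DNA"),
    Context("heat shock"),
)
hits = model.interactions_under("heat shock")
print(hits)
```

Storing the context alongside every interaction is what lets the map answer 'if this condition exists, show me the molecular interaction'.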

One frequently ends up integrating gene expression, protein expression, RNA and protein interaction data into contextual relationships. This calls for developing appropriate interfaces among various data types. It also calls for data management, data visualisation and a suite of analytical tools that help find answers across databases.

Modeling lingo

Currently more than 150 tools assist researchers in biological model building. To ensure that different tools talk to each other seamlessly, exchange data and enable data display and analysis, the Systems Biology Markup Language (SBML) was invented².

The SBML framework brings rich, heterogeneous data outputs and operating system formats down to a common XML-based denominator. It uses client-side libraries such that a model built on Linux, for example, can be opened for editing and compilation by another tool developed on Windows.
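Because an SBML document is XML text, any tool with an XML parser can at least read its contents. The sketch below uses only Python's standard library on a deliberately simplified, hand-written fragment. The model, its species and its reaction are hypothetical, and the fragment omits attributes and sections that a real SBML validator would require, so treat it as an illustration of the idea rather than a valid model file.

```python
import xml.etree.ElementTree as ET

# Namespace of SBML Level 3 Version 1 core.
SBML_NS = "http://www.sbml.org/sbml/level3/version1/core"

# Simplified, hypothetical fragment: one reaction converting S into P.
DOCUMENT = f"""<sbml xmlns="{SBML_NS}" level="3" version="1">
  <model id="toy_model">
    <listOfSpecies>
      <species id="S" compartment="cell"/>
      <species id="P" compartment="cell"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="conversion">
        <listOfReactants><speciesReference species="S"/></listOfReactants>
        <listOfProducts><speciesReference species="P"/></listOfProducts>
      </reaction>
    </listOfReactions>
  </model>
</sbml>
"""

root = ET.fromstring(DOCUMENT)
ns = {"sbml": SBML_NS}

# Collect species identifiers in document order.
species_ids = [s.get("id") for s in root.findall(".//sbml:species", ns)]

# Map each reaction id to its (reactants, products) lists.
reactions = {}
for reaction in root.findall(".//sbml:reaction", ns):
    reactants = [
        ref.get("species")
        for ref in reaction.findall("sbml:listOfReactants/sbml:speciesReference", ns)
    ]
    products = [
        ref.get("species")
        for ref in reaction.findall("sbml:listOfProducts/sbml:speciesReference", ns)
    ]
    reactions[reaction.get("id")] = (reactants, products)

print(species_ids, reactions)
```

In practice one would use a dedicated SBML library rather than a raw XML parser, since such libraries also check units, identifiers and mathematical expressions.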

Despite enormous efforts in the systems biology community over the last decade to address data treatment and data portability issues, there is still a need to integrate data emerging from various technologies and to build data standards for storage and communication.

For example, it is quite challenging to model gene expression scenarios and merge them with spatial and temporal information of pathways and networks. Likewise, semantically codified biological data is sparse due to lack of common data publishing standards and non-uniform data storage formats among biological databases.

Data power

In quantitative modeling, a lack of relevant and sufficient data often hampers a well-planned modeling effort. Frequently, one is forced to import data from unrelated systems to generate a sense of completeness in the model. For example, if the aim is to build a pathway model in a certain plant and data on the catalytic turnover rates of its enzymes are missing, one frequently ends up importing the data from yeast and E. coli!

Another important issue is the accuracy of published data. For example, the enzyme kinetic data reported in publications come from experiments that use aqueous buffers at a certain pH. In reality, however, a cell is not a bag of aqueous solution. It resembles a gel. One also often comes across data conflicts that are difficult to resolve.

It is standard practice to use the Michaelis-Menten equation to model metabolic pathways. However, enzyme kinetic equations were derived using a number of assumptions that are unrealistic in a biological setting, for example a well-mixed reactor.
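For reference, the Michaelis-Menten rate law gives the reaction rate as v = Vmax [S] / (Km + [S]). A minimal sketch follows; the parameter values are arbitrary numbers chosen for illustration, not measurements.

```python
def michaelis_menten_rate(substrate, v_max, k_m):
    """Reaction rate v = v_max * [S] / (k_m + [S]) for substrate concentration [S]."""
    return v_max * substrate / (k_m + substrate)

# Assumed values in arbitrary units.
V_MAX = 10.0
K_M = 2.0

half_max = michaelis_menten_rate(K_M, V_MAX, K_M)           # [S] equal to k_m
near_saturation = michaelis_menten_rate(2000.0, V_MAX, K_M)  # [S] far above k_m

print(half_max, near_saturation)
```

When the substrate concentration equals Km the rate is half of Vmax, and at very high substrate the rate approaches Vmax. The formula treats concentration as a single number for the whole compartment, which is exactly the well-mixed assumption questioned above.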

For these reasons, modeling is more of an art, dependent upon a personal understanding of the most biologically relevant abstraction.

One tends to heavily mathematize biology in a quest to create a linear system out of non-linear biology. However, forcing mathematics onto biology creates its own problems: a lack of knowledge of fundamental parameter values, a huge quantitative space of unknowns, and error-prone predictive methods to fill in that space.

In an ideal setting one would want to integrate the cell-level inventory with the tissue-level inventory. However, the data integration challenge is enormous: one must keep track of bad data, false discovery rates and the incompleteness of the model.

Although a mathematical transformation of biological data makes it more tractable, the nature of the abstraction itself unfortunately constrains the evolution of the system within given parameters. Also, one does not know how to handle an input for which prior knowledge was not hardwired into the model. For this reason, emergent properties that arise from simple interactions are difficult to capture and simulate.

Finally, there are situations where part of the system may be modeled qualitatively and part of the system quantitatively. An interface that seamlessly enables data movement between two different subsystems, within the same model, needs to be developed. This calls for innovative thinking and sound knowledge of biology.
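What such an interface has to do can be sketched in a few lines. This is a toy illustration under stated assumptions, not an established method: the threshold, the rates and the helper names to_boolean and to_rate are hypothetical, and real hybrid models face much harder questions about timing and units.

```python
def to_boolean(concentration, threshold):
    """Quantitative to qualitative: report whether a level exceeds a threshold."""
    return concentration > threshold

def to_rate(is_on, max_rate):
    """Qualitative to quantitative: map an on/off state to a synthesis rate."""
    return max_rate if is_on else 0.0

def run(signal_levels, threshold=1.0, max_rate=2.0, k_deg=0.5, dt=0.1):
    """Drive a Boolean gene with a measured signal and track its protein amount."""
    protein = 0.0
    trace = []
    for signal in signal_levels:
        gene_on = to_boolean(signal, threshold)    # qualitative subsystem
        rate = to_rate(gene_on, max_rate)          # interface between the two
        protein += dt * (rate - k_deg * protein)   # quantitative subsystem (Euler step)
        trace.append((gene_on, protein))
    return trace

# Signal is absent for 50 steps, then present for 200 steps.
trace = run([0.0] * 50 + [2.0] * 200)
print(trace[49], trace[-1])
```

The two conversion functions are where information is lost or invented: the threshold discards how strong the signal is, and the fixed rate assumes every 'on' state is equally productive.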

This article is the second in a series entitled 'Virtual Cell'.

References

1. Ghosh, S. et al. Software for systems biology: from tools to integrated platforms. Nat. Rev. Genet. 12, 821-832 (2011).
2. Hucka, M. et al. The systems biology markup language (SBML): a medium for representation and exchange of biochemical network models. Bioinformatics 19, 524-531 (2003).
