Network reconstruction and validation of the Snf1/AMPK pathway in baker’s yeast based on a comprehensive literature review

Background/Objectives: The SNF1/AMPK protein kinase has a central role in energy homeostasis in eukaryotic cells. It is activated by energy depletion and stimulates processes leading to the production of ATP while it downregulates ATP-consuming processes. The yeast SNF1 complex is best known for its role in glucose derepression. Methods: We performed a network reconstruction of the Snf1 pathway based on a comprehensive literature review. The network was formalised in the rxncon language, and we used the rxncon toolbox for model validation and gap filling. Results: We present a machine-readable network definition that summarises the mechanistic knowledge of the Snf1 pathway. Furthermore, we used the known input/output relationships in the network to identify and fill gaps in the information transfer through the pathway, to produce a functional network model. Finally, we convert the functional network model into a rule-based model as a proof-of-principle. Conclusions: The workflow presented here enables large scale reconstruction, validation and gap filling of signal transduction networks. It is analogous to but distinct from that established for metabolic networks. We demonstrate the workflow capabilities, and the direct link between the reconstruction and dynamic modelling, with the Snf1 network. This network is a distillation of the knowledge from all previous publications on the Snf1/AMPK pathway. The network is a knowledge resource for modellers and experimentalists alike, and a template for similar efforts in higher eukaryotes. Finally, we envisage the workflow as an instrumental tool for reconstruction of large signalling networks across Eukaryota.

The elemental reactions are defined in the rxncon reaction list as "subject-verb-object" clauses, which are equivalent to the format used to document protein-protein interactions in e.g. the BioGRID database (3). In the case of transport or catalytic modification, the channel or enzyme takes the subject position. For reciprocal reactions such as protein-protein interactions, the components are assigned as subject or object based on alphabetic order, to keep reaction names deterministic. In the Snf1 network reconstruction, we used the following twelve reactions:

Network reconstruction
We collected two classes of information during the literature curation: Mechanistic information in terms of elemental reactions and contingencies, and physiological/functional information in terms of input-output relationships. The first was used for the actual network reconstruction, and the second for the network validation. Hence, we used qualitatively different data to build and to validate the network reconstruction.
The mechanistic information was divided into reactions and contingencies. We can exemplify this with the relationship between Snf1 and Mig1: Snf1 phosphorylation leads to nuclear export of Mig1. This is actually three different statements, each of which requires its own evidence. First, Snf1 phosphorylates Mig1. Second, the nuclear pore complex exports Mig1. Third, the nuclear export only happens when Mig1 is phosphorylated. The first two statements are reactions, and can be written in rxncon short notation (as they appear in figure 2) as: The product states of these reactions are Mig1-{P} and Mig1-{Cytoplasm}. Note that these states define two possibly overlapping sets; the intersection of these two states would define the cytoplasmic, phosphorylated form of Mig1. Only reaction 2 has a source state, Mig1-{Nuclear}, which it consumes. With these pieces, we can now define the contingency: Meaning that the nuclear export reaction (NPC_NEXP_Mig1) requires (!) Mig1 to be phosphorylated (Mig1-{P}).
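The Snf1/Mig1 example can be sketched as a small data structure. This is a minimal illustration and not the rxncon file format or software: the class names, the `can_fire` helper and the reaction name `Snf1_P+_Mig1` are assumptions made for this sketch; only NPC_NEXP_Mig1 and the three Mig1 states are taken from the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Reaction:
    """Elemental reaction written as a subject-verb-object clause."""
    subject: str
    verb: str
    obj: str
    product_state: str
    source_state: Optional[str] = None  # state consumed by the reaction, if any

    @property
    def name(self) -> str:
        return f"{self.subject}_{self.verb}_{self.obj}"


@dataclass(frozen=True)
class Contingency:
    """Constraint on a reaction; '!' means the reaction requires the state."""
    target: str    # reaction name
    symbol: str    # only '!' (requirement) is handled in this sketch
    modifier: str  # state that the reaction depends on


# 'Snf1_P+_Mig1' is a hypothetical name used for illustration.
reactions = [
    Reaction("Snf1", "P+", "Mig1", product_state="Mig1-{P}"),
    Reaction("NPC", "NEXP", "Mig1", product_state="Mig1-{Cytoplasm}",
             source_state="Mig1-{Nuclear}"),
]
contingencies = [Contingency("NPC_NEXP_Mig1", "!", "Mig1-{P}")]


def can_fire(reaction, true_states):
    """True if the source state (if any) and all required states are present."""
    if reaction.source_state is not None and reaction.source_state not in true_states:
        return False
    return all(
        c.modifier in true_states
        for c in contingencies
        if c.target == reaction.name and c.symbol == "!"
    )
```

In this sketch, nuclear export can fire only when both Mig1-{Nuclear} and Mig1-{P} are present, which mirrors the third statement above.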
The physiological/functional information consists of input/output relationships that are known to require Snf1. For example, we know that glucose depletion relieves repression of Mig1-regulated genes via Snf1. Hence, a functional model should, upon changes in glucose, convey these changes to the Mig1-regulated genes.
We took a conservative approach to the network reconstruction. The network reconstruction is based on direct mechanistic connections between components, which must be possible to define in terms of reactions or contingencies as defined above. That means that genetic data alone is of limited use, due to the possibility of indirect effects. It also means that we re-evaluated the evidence presented in the papers, rather than relying on the often highly accurate but sometimes speculative interpretation of the authors. Hence, we could not make use of review papers. Because direct mechanistic evidence was lacking, this conservative angle led us to exclude some components that are likely to be part of the extended Snf1 network.
The most prominent example is Adr1: Adr1 is phosphorylated on Ser230 in a Snf1-dependent manner (4). However, the mechanism of this interaction is unknown. The interaction between Adr1 and Snf1 could be direct since they have been found to co-localise (5). However, since the snf1∆ mutant shows the opposite transcription phenotype to the Adr1 S230A mutant, it is unlikely that Snf1 is the kinase of Adr1 (4). Moreover, deletion of Snf1 increases the phosphorylation level of Adr1 on Ser230, indicating that, under glucose-limited conditions, Snf1 activates a phosphatase or inactivates a protein kinase rather than directly phosphorylating Adr1 (6). Literature evidence clearly shows that Snf1 regulates the transcription factor Adr1; however, the exact mechanism and its components are unknown, and we therefore did not include it in the network reconstruction.
The mechanistic information was entered into the reaction and contingency list tabs of the Excel based rxncon network definition (Sup. file 1 and 2), together with annotations and references.

Network visualisation
The network graphs were generated automatically using the rxncon software and visualised in Cytoscape (1, 7). The reaction graph (Fig 1) visualises the network components and the reactions between them at a topological level. Hence, it represents mechanistic connections and defines all possible information paths, but omits any information on causality. This representation is therefore incomplete, but useful to get an overview of the pathway components and their connections.
In contrast, the regulatory graph (e.g. Fig 2) visualises the information flow between elemental reactions and states, i.e., how reactions produce or consume states, and how states stimulate or inhibit reactions via contingencies. The reaction-to-state edges define which reactions produce or consume which states, corresponding to the information in the reaction graph (Fig 1). The state-to-reaction edges define how states influence reactions, and hence define the causal relationship missing in the reaction graph. For a network to transmit information, it must be connected in the regulatory graph. However, this is not sufficient, as it must be possible both to turn on and shut off reactions and states that are dynamically responsive to the signal.
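The connectivity requirement can be illustrated with a toy regulatory graph for the Snf1/Mig1 example. This is a sketch under stated assumptions: the adjacency-list encoding, the `reachable` helper and the reaction name `Snf1_P+_Mig1` are hypothetical and are not taken from the rxncon software. It checks only connectivity, which, as noted above, is necessary but not sufficient for information transfer.

```python
from collections import deque

# Directed bipartite graph: reactions point to the states they produce,
# states point to the reactions they influence via contingencies.
regulatory_graph = {
    "Snf1_P+_Mig1": ["Mig1-{P}"],          # reaction -> state (production)
    "Mig1-{P}": ["NPC_NEXP_Mig1"],         # state -> reaction (contingency '!')
    "NPC_NEXP_Mig1": ["Mig1-{Cytoplasm}"], # reaction -> state (production)
}


def reachable(graph, start):
    """Return all nodes reachable from start by breadth-first search."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for successor in graph.get(node, []):
            if successor not in seen:
                seen.add(successor)
                queue.append(successor)
    return seen


# Without the state-to-reaction edge, only reaction-to-state edges remain,
# and the phosphorylation no longer connects to the export.
without_contingency = {k: v for k, v in regulatory_graph.items() if k != "Mig1-{P}"}

connected = "Mig1-{Cytoplasm}" in reachable(regulatory_graph, "Snf1_P+_Mig1")
disconnected = "Mig1-{Cytoplasm}" not in reachable(without_contingency, "Snf1_P+_Mig1")
```

Removing the contingency edge breaks the path from the phosphorylation reaction to the cytoplasmic state, which is the kind of gap the validation step is meant to expose.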

Model generation and simulation
The rxncon network uniquely defines a bipartite Boolean model (bBM) with one specific truth table.
The bBM is based on the regulatory graph and, hence, captures the information flow through the pathway. We generated the bBM with the rxncon software, simulated it using the built-in Boolean simulator and visualised the attractor states on the regulatory graph in Cytoscape (8). The model generation is automatic, based on the network reconstruction. However, to simulate it, we adapted the initial conditions. By default, all components are TRUE (wild-type; no regulated expression of pathway components), and all reactions and states start as FALSE. However, we needed to seed the localisation of all components that are translocated: Snf1 (cytosol), Gln3 (cytosol), Msn2 (cytosol), Mig1 (nucleus) and Hxk2 (both nucleus and cytosol), to mimic cells grown in glucose without stress (our default condition). Before model generation, we also eliminated the mutual exclusivity in the bindings between Gal83--Snf1, Sip1--Snf1, and Sip2--Snf1, as well as Glc7--Reg1 and Glc7--Reg2. These bindings are mutually exclusive at the level of individual proteins, but lead to artificial cyclic attractors in the Boolean model that have no informational value.
We ran the simulations with these settings and synchronous updating in the Boolean network modelling software BooleanNet (9). It provides a set of functions for the simulation of biological regulatory networks in a Boolean formalism and is integrated in the rxncon bBM simulator. The Boolean simulations that went beyond the elementary input/output analyses (e.g. automated searches for multiple attractor states) were performed with custom-made Python scripts that call the BooleanNet software library directly.
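Synchronous updating until an attractor is reached can be sketched in plain Python. This illustration does not use the BooleanNet or rxncon APIs, and the five-node rule set is a deliberately simplified, hypothetical stand-in for the glucose-Snf1-Mig1 axis rather than the bBM itself. Node and function names are assumptions.

```python
def step(state, rules):
    """Synchronous update: every node is computed from the previous state."""
    return {node: rule(state) for node, rule in rules.items()}


def find_attractor(state, rules, max_steps=1000):
    """Iterate until a state repeats; return the repeating states in order."""
    seen = {}
    trajectory = []
    for i in range(max_steps):
        key = tuple(sorted(state.items()))
        if key in seen:
            return trajectory[seen[key]:]
        seen[key] = i
        trajectory.append(state)
        state = step(state, rules)
    raise RuntimeError("no attractor found within max_steps")


# Toy rules (assumptions for illustration only).
rules = {
    "glucose": lambda s: s["glucose"],            # input, held constant
    "snf1_active": lambda s: not s["glucose"],
    "mig1_p": lambda s: s["snf1_active"],
    "mig1_nuclear": lambda s: not s["mig1_p"],
    "target_gene": lambda s: not s["mig1_nuclear"],
}

# Seeded start: Mig1 nuclear, everything else FALSE.
start = {"glucose": True, "snf1_active": False, "mig1_p": False,
         "mig1_nuclear": True, "target_gene": False}

on_glucose = find_attractor(start, rules)
no_glucose = find_attractor({**start, "glucose": False}, rules)
```

Both runs end in a point attractor (a single repeating state). In this toy model the target gene is OFF on glucose and ON after glucose removal, which is the kind of input/output relationship used for validation.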

Network validation and gap filling
The gap filling process was done by manual evaluation and adaptation of the bBM. We simulated the bBM with a given input configuration until the attractor state was reached, and compared this to the expected simulation outcome (Table 1). All listed outputs should match the expected state (ON/OFF) when the input conditions are as specified, and should have the opposite state on glucose without stress. When an output did not show the expected behaviour, we manually examined the network to see where the signal was stuck: this would be between the nodes that changed according to the input and the nodes that did not change but were expected to. When necessary, we adapted the network definition to resolve blocks and/or constitutive activities as detailed in the results section. All such adaptations have been clearly labelled as hypotheses in the updated network definition (Sup. file 2).
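The comparison step can be sketched as follows. The evaluation in this work was manual; the helper functions, node names and attractor values below are hypothetical and serve only to illustrate how a block is narrowed down to the boundary between responsive and unresponsive nodes.

```python
def mismatches(attractor, expected):
    """Outputs whose attractor value differs from the expected ON/OFF state."""
    return {
        node: (attractor.get(node), wanted)
        for node, wanted in expected.items()
        if attractor.get(node) != wanted
    }


def responsive_nodes(reference, perturbed):
    """Nodes whose value differs between the reference and perturbed attractors."""
    return {node for node in reference if reference[node] != perturbed[node]}


# Hypothetical attractors of a model with a gap between Mig1-{P} and export.
glucose = {"snf1_active": False, "mig1_p": False,
           "mig1_nuclear": True, "target_gene": False}
no_glucose = {"snf1_active": True, "mig1_p": True,
              "mig1_nuclear": True, "target_gene": False}

# Expected output without glucose (hypothetical entry, not taken from Table 1).
expected_no_glucose = {"target_gene": True}

failed = mismatches(no_glucose, expected_no_glucose)
changed = responsive_nodes(glucose, no_glucose)
stuck = set(glucose) - changed
```

Here the signal reaches `mig1_p` but not `mig1_nuclear`, so the block lies between these two nodes.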

Analysis of initial conditions for the bBM
We also analysed the impact of the initial settings. We could not perform an exhaustive search, as the numbers of reactions and states are 72 and 64, respectively, corresponding to 2^136 (more than 10^40) possible starting conditions. However, we reversed the initial settings (all states TRUE, all reactions TRUE), and scanned both settings for changes in individual states (i.e. all but one state/reaction TRUE or FALSE). We tested both these sets in the presence and absence of glucose, for a total of 544 simulations. All these simulations converged on two attractors, each appearing 272 times, depending on whether glucose was present or not (stress perturbations were not considered in this search). Finally, we mimicked the snf1∆ mutant by setting the Snf1 node to FALSE. In this last test, none of the input-output relationships of the NR2 were functional, as expected, since all signals pass through the Snf1 complex.
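The counts in this paragraph can be checked with a few lines of arithmetic. The breakdown of the 544 simulations into two baseline settings, one single-node change per reaction or state, and two glucose conditions is our reading of the text, stated here as an assumption; the variable names are illustrative.

```python
n_reactions = 72
n_states = 64
n_nodes = n_reactions + n_states          # 136 Boolean nodes

# Exhaustive search space: every node can start TRUE or FALSE.
n_starting_conditions = 2 ** n_nodes

# Scan actually performed (assumed breakdown):
baseline_settings = 2                     # all FALSE, and all TRUE
glucose_conditions = 2                    # glucose present / absent
n_simulations = baseline_settings * n_nodes * glucose_conditions
per_glucose_condition = n_simulations // glucose_conditions
```

Under this reading, the scan covers 544 runs, with 272 runs per glucose condition, matching the number of times each attractor was observed.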