Introduction

The notion of complex networks has emerged in the last decades as an important framework for describing and investigating the organization of natural and living systems1. A complex network can be regarded as any collection of units, or nodes, that are interacting as a system and that are connected by directed or undirected links. In this respect, complex networks serve as natural models to describe the structure of a diverse range of systems1,2, with examples from social3, economical4, technological5,6 and biological systems7. The network perspective also plays an important role for describing the organization of ecological systems8,9. In particular, the network theoretic analysis of food webs has proven to be very useful to explore the properties of ‘who eats whom’ in ecosystems10.

Over the last years a wide variety of techniques has been put forward that allow to characterize the structure and topology of complex networks1,11. These include statistical measures, such as centrality indices, that characterize the importance of single nodes or links in the network and network modules that capture the large-scale structure within the network. These studies demonstrated that many real-world networks and in particular biological networks, contain small network building blocks, the network motifs, that recur in a network more frequently than statistically expected12. The importance of network motifs relies largely on the observation that small subgraphs indicate particular patterns of interactions between network nodes and thus may carry a regulatory or dynamic function. The frequency distribution of small subgraphs in the network can by visualized in the form of a significance profile or motif spectrum. This can be regarded as a fingerprint of the local network structure and allows to systematically compare different networks and group them into superfamilies of networks with very similar significance profiles13. The analysis of motif spectra has recently gathered much attention as a useful concept to unveil universal design principles underlying the structure of complex networks and it has been applied for the analysis of diverse networks6,12,13,14,15, including food webs12,16.

Standard network characteristics, such as motif spectra, however cannot per se capture the structure of networks that have multiple layers of complexity and therefore cannot be represented as a traditional graph. Here, we propose an extension of standard motif analysis for a specific network type, which we denote as directed ordered networks. Ordered networks are networks in which the nodes can be compared or related to each other by a binary ordering relation, <, that is independent of the graph topology. That is, we assume that for every two nodes in the network it is known which node is ‘smaller’ or ‘larger’ than the other. Ordered networks might reflect, for example, systems where the nodes are associated with an additional intrinsic property, such as size, fitness, importance, or geographic location17. In these systems the nodes can be naturally ordered according to the value of their intrinsic state variable. For example, individuals in a social network may be sorted according to their social status or financial income, (air)ports in a large-scale transportation network may be sorted according to their size or geographic location and proteins in a regulatory network may be sorted according to their molecular mass or ubiquity among biota. In fact, such intrinsic states arise naturally in many natural and technical systems and therefore most empirical networks can be regarded as being ordered.

Ordered relationships are also of fundamental importance for describing trophic species interactions in ecological food webs. Here, species are sorted according to their body-size, accounting for the fact that it is more likely to find large predators feeding on small prey, than vice versa, i.e. large prey having small predators18,19,20. Empirical food webs have been constructed for diverse ecosystems21,22,23,24,25 and statistical10,18,19,20 as well as evolutionary26,27 models have been developed to understand mechanisms underlying them. Food webs have been characterized by a variety of network theoretic measures, such as typical food chain length, clustering coefficient, or degree distribution10. In particular, motif analysis of empirical food webs has shown to be helpful to obtain a quantitative analysis of their local structure12,16,28,29. However, none of these measures is able to incorporate the body-size related hierarchy pattern, i.e., the directed ordered structure, of food webs.

In this report, we develop a general framework for motif analysis in directed ordered networks. We first propose the notion of directed ordered networks as directed networks where the nodes constitute a totally ordered set. Next, we expand standard motif analysis to be able to capture the hierarchical structure in such systems. Our new approach is based on the identification of all ordered 3-node substructures and the visualization of their significance profile. Finally, we apply this technique to computer generated ensembles of ordered networks and to empirical food web data. Thereby, we demonstrate that the extended motif analysis is a promising technique to analyze hierarchical patterns and the role of body-size order in natural food webs and other complex networks.

Directed Ordered Networks

Definition of Ordered Networks

We define an ordered network as a graph, consisting of a collection of nodes that are connected by links, where the nodes constitute a totally ordered set. This means that nodes can be linearly ordered, i.e., for any pair of non-identical nodes i ≠ j either the binary relation i < j or j < i is true. This binary relation is transitive, that is if i < j and j < k then also i < k. When the nodes are ordered we denote the index of a node as its rank. If all network edges have an orientation (i.e., the underlying network is directed) we speak of a directed ordered network.

Ordered networks arise naturally, but are not restricted to, the important situation where every node i is associated with an intrinsic state 17. In this case, the nodes can be embedded along a one dimensional niche axis and, accordingly, we denote this intrinsic state as the node’s niche variable. Node sets with niche variables can be naturally ordered such that ni < nj is equivalent to i < j. That is, nodes with smaller niche variables have smaller rank. Depending on the specific context, nodes can be ordered according to any continuous niche variable that characterizes or is associated to the node, such as the size, fitness, importance, a dynamical state variable, a physical (e.g., temperature) or biological (e.g., population density) condition, or the geographic location. In this sense, ordered networks are ubiquitous in nature and, depending on the question, the same network may be ordered differently according to different niche variables.

The existence of a total ordering breaks the exchange symmetry between the network nodes and thus has important consequences for the network structure. In particular, in a directed ordered graph, any two nodes i and j are related in one of four possible ways (Fig. 1): Two nodes may not be connected at all, there may be a unidirectional link downwards, a unidirectional link upwards (note that we always place nodes with smaller rank vertically below nodes with higher rank), or there may be a bidirectional link. This distinction between downward connections (i.e., links pointing from nodes with a higher rank to a lower rank, Fig. 1, case 2) and upward connections (Fig. 1, case 3) increases the combinatorial complexity of directed ordered networks compared to non-ordered networks and will be the base of our subsequent motif analysis.

Figure 1
figure 1

Four possible relations between any two nodes i and j (with i > j) in a directed ordered network: (1) no connection, (2) downward connection, (3) upward connection, (4) bidirectional connection.

The node j with smaller rank j < i is plotted vertically below the node i of higher rank.

Motif Analysis of Directed Ordered Networks

Network motifs are small subgraphs, commonly corresponding to triplets of nodes, that are significantly overrepresented in a network12. To identify the motifs in a given network, the frequency of subgraphs in the network is compared to the expected number in an ensemble of randomized networks. If the frequency of a given subgraph in the network is significantly larger than the mean frequency in a randomized ensemble, the subgraph is considered a network motif.

In the following we develop an approach to extend the motif analysis to directed ordered networks. Since in a directed ordered graph any two nodes i and j are related in one of four possible ways (Fig. 1), each triplet of nodes can be in one of 43 = 64 possible configurations. Ten of those configurations contain isolated nodes (i.e., nodes without any link) and will not be considered in the following. The remaining 54 connected ordered configurations, or substructures, are shown in Fig. 2. When the hierarchy among nodes is neglected, the ordered substructures collapse to the well-known 13 classes from standard motif analysis12. Thereby, each of the 13 classes has a different size, i.e. it contains a different number of ordered 3-node substructures (Fig. 2). For identification, we denote each substructure by a pair of indices (q,s), where q = 1…13 is the class ID and 1 ≤ s ≤ 6 is the member ID.

Figure 2
figure 2

All unordered and ordered network motifs with 3 nodes.

The top row contains all 13 possible 3-node motifs when the node-order is not considered. Taking into account node-order yields 54 ordered motifs (shown below the top row), which are arranged below their respective isomorphism class. Thereby, each of the 13 unordered 3-node motif classes (labeled by their respective motif ID) corresponds to 1 to 6 ordered ordered 3-node motifs (labeled by their member ID). For each ordered motif, the rank of nodes increases from bottom to top.

In general, in directed ordered networks, downward links will have a different frequency than upward links. This generalizes to substructures containing three nodes: their likelihood of appearance depends on the number of up- and downward connections. We call the numbers of appearances, η(q,s), of all ordered substructures in a network the spectrum of ordered 3-node substructures or motif spectrum for short. Unordered motif spectra can be retrieved by summing over member IDs

The ordered motif spectrum is more fine-grained than that of unordered 3-node motifs: dissimilarities in the appearance of ordered motifs within the same (unordered) motif-class are merged and cannot be detected in standard motif analysis (Fig. 2, top row). Since the frequencies of different substructures can easily differ by orders of magnitude, we usually plot the motif spectrum on a logarithmic scale.

A Random Model for Directed Ordered Networks

We first consider a simple statistical model, the directed ordered random network, that provides a simple approach to generate statistical ensembles of ordered networks and allows to derive properties of the motif spectrum analytically. Our model is a straightforward generalization of the Erdös-Rényi random graph model1 and depends only on two connection parameters, and . Let N be the number of indexed nodes (we assume that the nodes are ordered according to their rank, i = 1…N). For each ordered pair of nodes, j < i, an upward directed link is introduced with probability from the node j with smaller rank to the node i with larger rank and a downward directed link is introduced with probability . Thus, for any ordered pair of nodes the probability for the appearance of each of the four possible configurations in Fig. 1 is given by

In a similar way, the probability for the appearance of each ordered substructure with three nodes (Fig. 2) can be calculated. For the substructure (1,1), for example, the appearance probability is

where the first two multipliers describe the absence of links between middle and top nodes, the third and fourth multipliers correspond to the link between bottom and top nodes and the last two multipliers encode the link between bottom and middle nodes. The expected number, η(q,s), of appearances of each substructure with class ID q and member ID s can be calculated as

where the binomial coefficient corresponds to the number of all possible ordered 3-node combinations.

Figure 3a shows an exemplary motif spectrum in the special case of top-down symmetry, where . In this case the 54 ordered motifs can be grouped into five sets, each of which has a specific appearance probability (see solid horizontal lines in Fig. 3a). Thereby, the probability of appearance for each substructure depends only on the number of directed links l (with 2 ≤ l ≤ 6) within the motif class15

Figure 3
figure 3

Motif-spectra for the directed ordered random network model.

The plots show the appearance number, η(q,s), of ordered 3-node substructures from numerical simulation of 100 network realizations with N = 100 (black dots) and from analytical calculation (open circles). Note, the logarithmic scale. The horizontal lines indicate discrete levels of substructure frequencies. a Symmetric case with . Five distinct levels of appearance numbers according to equation (5) emerge. All substructures within the same motif class have identical appearance numbers. b Asymmetric case with , . Substructures within the same motif class can have different appearance numbers, but are restricted to 13 distinct levels.

Since all substructures of the same motif class have the same number of directed links (Fig. 2) they also have the same appearance probability.

Next, we study the situation where the top-down symmetry is broken. Without loss of generality we assume that upward directed links are more common than downward links (). The corresponding motif spectrum is illustrated in Fig. 3b. In contrast to the symmetric case (Fig. 3a) the substructures within a single motif class may now have different probabilities of appearance. Nevertheless, all appearance probabilities in the spectrum are grouped to 13 distinct levels (marked by solid horizontal lines). Those levels arise from permutations of multipliers in equation (3), corresponding to different substructures, but identical probabilities. We call the set of substructures with identical mean frequencies a statistical class. Note that the 13 statistical classes differ from the 13 isomorphism classes (Fig. 2, top row). In other words, members of any isomorphism class can differ widely in their rate of appearance.

Remarkably, the spectrum in Fig. 3b exhibits three dominant substructures, which correspond to the ordered motifs that contain two upward connections and no downward connection: substructure (1,3), where the node of largest rank is reached by upward links by the other two nodes; substructure (2,6), an upward chain; and substructure (4,2), where the node with the smallest rank has an upward link to each of the other nodes. As will be shown below, the same substructures play an important role in natural food webs.

Motif analysis of Empirical and Simulated Food webs

Using the Niche Model to Generate Directed Ordered Networks

In the following, we show that motif spectra of ordered networks can be used to analyze food web data. For the analysis we compare data from an empirical lake food web with statistical ensembles of directed ordered networks, which are generated by the niche model18. The niche model combines stochastic elements with simple link assignment rules and is well known to be able to synthesize networks of trophic interactions between species that closely resemble empirical food webs10,18. The model depends to two parameters, the species richness N (i.e., the number of biological species, each represented by a node in the food web) and the connectance C (i.e., the proportion of possible links in the food web that actually occur). In the niche model, each species i = 1…N is assigned a random niche variable ni [0,1], drawn from a uniform distribution. The niche variable can be regarded to be a proxy of body-size, which determines the node’s incoming links and it constitutes a natural ordering of the network nodes as described above. In the model, species are constrained to consume prey from a contiguous range of species on the niche axis18. That is, species i preys upon all species j that have a niche parameter nj inside a finite segment of length ri = xni, centred at a position ci that is chosen randomly inside the interval [ri/2,ni]. Here, 0 ≤ x ≤ 1 is a random variable from a beta distribution p(x) = β(1 − x)1−β, with β = (1/2C) − 1.

Motif Spectra of Empirical and Simulated Food Webs

In Fig. 4 we compare the motif spectrum of an empirically measured food web with that from an ensemble of computer generated ordered networks. Figure 4a shows the appearance numbers η(q,s) of all ordered 3-node substructures, which we obtained from the pelagic food web of Alford lake from the Adirondack park21 (data kindly provided by U. Brose). The average body-mass of adult individuals was used as niche variable to order species. The data set contains N = 56 species with a connectance of C = 0.0692. As shown in the figure, only eight different substructures occur in the empirical motif spectrum. The most abundant of these are the substructures (1,3), (2,6) and (4,2). Substructure (1,3) corresponds to a motif of a predator that feeds on two prey species of smaller body-size, (2,6) is a tri-trophic chain where prey have smaller body-size than predators and (4,2) describes a prey that is preyed upon by two predators of larger body-size. The same substructures have also been identified as the most dominant substructures in the asymmetric random model of directed ordered networks (Fig. 3b). This agreement can be explained by the fact that feeding relations between two species are not symmetric with respect to body-size. It is more likely that an upward link from a prey of smaller body-size to a larger predator occurs, than vice versa (as usual, we assume that the direction of feeding links in a food web reflects the direction of energy flow, i.e., a directed link from species A to species B means that B eats A).

Figure 4
figure 4

Motif-spectra of empirical and computer generated food-webs (N = 56 species, connectance C = 0.0692).

a Appearance numbers, η(q,s), of ordered 3-node substructures from the pelagic food web of Alford lake (open circles) and from 1000 realizations of the fitted niche model (grey dots) on a logarithmic scale. The mean appearance number of the model realizations is shown by black dots. Out of the 54 possible substructures, in the food web only 8 and in the realizations of the niche model only 37, substructures occur. b Match between model and experimental data by the Z-score Z(q,s). Motifs (1,3), (6,3) and (5,6) show the largest deviation.

Next, we used the empirical values of N and C to generate statistical food web ensembles with the niche model. In Fig. 4a we plot the ordered motif spectra from 1000 realization of the fitted niche model. As shown in the figure, in the statistical ensemble we observe a total of 37 different substructures—clearly more than in the empirical data set. The substructures that occur in both spectra have appearance frequencies that match rather well and, in general, coincide with the substructures of higher frequencies in the statistical ensemble (Fig. 4).

To characterize deviations between the model and the empirical data, we use the Z-score

where denotes the frequency of substructure (q,s) in the empirical food web, and are the model mean frequency and standard deviation of a substructure, respectively. As shown in Fig. 4b, the strongest deviations between data and model (i.e., a positive Z-score) occur for the motifs (1,3), (6,3) and (5,6), all of which are connected to a pattern of omnivory (see Fig. 2). These deviations can be explained by the observation that the fitted niche model generates fewer of these motifs than observed in the field. The, by far, largest Z-score occurs for the motif (1,3), which is also the most abundant in the empirical motif spectrum (Fig. 4a). Analyzing further food-webs of the Adirondack park (results not shown) we find that this is a common pattern: typically the motif spectrum of the niche model deviates most strongly from that of the most abundant substructure in the empirical data. These results conform with the well-known observation that even though structural food web models, such as the niche model, are able to provide detailed understanding about the structural complexity of natural food webs, they still show some systematic deficiencies to predict the fine structure of complex food webs16,19.

To test for the relevance of body-size ordering in empirical food webs, we compare the motif spectrum of the pelagic food web of Alford lake with that of its randomly re-ordered counterparts (Fig. 5a). If a substructure does not appear in the empirical food web we set η(q,s) = 0 (not shown on the logarithmic scale in Fig. 5a). The figure reveals only slight agreement between the motif spectra of the empirical food webs before and after randomized ordering. This is confirmed by our calculation of the spectrum of Z-scores in Fig. 5b. Note, that now and in Eq. (6) denote the mean frequency and standard deviation of a substructure in the randomized webs. The large entries in the spectrum of Z-scores reveal substantial deviations in the structure of the natural and randomized food webs. If the body-size order is neglected by summing over all member IDs, the spectrum reduces to the 13 standard unordered motif classes and the spectra of empirical and randomized networks become indistinguishable (Fig. 5c). These results indicate that hierarchy due to body-size is a crucial aspect of the structure of empirical food webs. The precise role of body-size order for structuring natural food webs provides an intriguing possibility for future research.

Figure 5
figure 5

Motif-spectra of the pelagic food-web of Alford lake and its 1000 randomly reordered realizations (grey dots).

Data can only be shown for the 6 motif classes that actually occur in the empirical food web. a Comparison of the absolute motif counts η(q,s) on a logarithmic scale. b Relative deviations between the empirical and randomized food webs, measured by the Z-score Z(q,s). c Comparison of the number ηq of unordered substructures according to equation (1) between experimental data and randomized networks on a logarithmic scale. Symbols are explained in the legend.

Discussion

The method presented in this paper is a natural extension of the classic network motif analysis12 to networks with hierarchically structured nodes. For these networks, the spectrum of ordered substructures yields a quantitative description of the connectivity-patterns with respect to node-rank. Thereby, highly abundant ordered motifs, such as substructure (1,3) in Fig. 3b or (4,2) in Fig. 4, represent connectivity-patterns that are typical within the node hierarchy. The spectrum can be expressed in absolute motif counts η(q,s) or appearance probabilities P(q,s), whereby the latter is independent of network size, which makes it suitable for unified comparisons across different networks.

We have shown that ordered motif spectra can reflect a breaking of the top-down symmetry in the hierarchy among nodes. This means that networks, in which connections to nodes of higher rank are more frequent than downward directed connections, naturally contain a larger share of the corresponding motifs that are mostly composed of upward links, or vice versa. In general, we observe large variations in the frequencies of ordered substructures that easily can span several orders of magnitudes. A high abundance of an ordered motif does, however, not directly imply its statistical significance. The latter requires a model to compare against, as demonstrated by our use of the randomly reordered networks and the niche model. On the other hand, rare motifs should not immediately be disregarded, as they might still play a key role in the network. Assume, for example, that in a food web most of the biomass is concentrated on a small number of interacting species. In this case, even if the corresponding motif(s) are very rare, they could still carry an important ecosystem function. Therefore, in general, we suggest to take the whole spectrum of ordered motifs into account for network analysis.

In this work we only considered substructures composed of three nodes, however, the presented technique generalizes to larger motifs. In practice, counting larger motifs might be more challenging due to the rapidly increasing computational demands with growing motif- and network-size. Our approach also generalizes to different classes of natural and theoretical networks. Here, we have discussed food webs, using the species’ niche coordinates to define the ordering among the network nodes. But many other networks, for which motifs have been analyzed traditionally12, might possess natural ordering criteria or hidden niche variables, which would allow for a worthwhile reanalysis using ordered motifs.

Finally, the proposed approach may be helpful to detect structural changes in adaptive networks that are coevolutionary changing in time (e.g., evolving food webs)30 and it might open new avenues for generalizing motif analysis to networks that contain multiple layers of connectivity31,32. Such structures recently have gotten in focus of the scientific attention as networks of networks or multi-layered networks and are characterized by nodes that are connected by more than one type of relationship. For example, the risk of cascading failures in important infrastructure facilities, such as electrical power grids5, may be related to connections within multiple interdependent communication, transport, or infrastructure subsystems31. Similar, in ecology, trophic interactions represent only one of many possible forms by which species can influence each other. It is increasingly recognized that ecological networks contain different interactions beyond feeding relationships, such as host-parasitoid interactions, interference competition and other forms of non-trophic interactions (e.g., mutualism, habitat modification, or facilitation)33,34. The exploration of the structural and dynamical properties of such multi-layered networks is still in its infancy and in particular, simple robust techniques are needed that allow to capture the huge complexity of such systems32. In this sense, our proposed method may not only be helpful to identify hierarchical patterns in empirical food webs and other natural networks, it may also provide the base for extending motif analysis to multi-layered networks.

Additional Information

How to cite this article: Paulau, P. V. et al. Motif analysis in directed ordered networks and applications to food webs. Sci. Rep. 5, 11926; doi: 10.1038/srep11926 (2015).