Abstract
Complex biological, neuroscience, geoscience and social networks exhibit heterogeneous, self-similar, higher-order topological structures that are usually characterized as being multifractal in nature. However, describing their topological complexity through a compact mathematical description and deciphering their topological governing rules has remained elusive and has prevented a comprehensive understanding of networks. To overcome this challenge, we propose a weighted multifractal graph model capable of capturing the underlying generating rules of complex systems and characterizing their node heterogeneity and pairwise interactions. To infer the generating measure with hidden information, we introduce a variational expectation-maximization framework. We demonstrate the robustness of the network generator reconstruction as a function of model properties, especially in noisy and partially observed scenarios. The proposed network generator inference framework is able to reproduce network properties, differentiate varying structures in brain networks and chromosomal interactions, and detect topologically associating domain regions in conformation maps of the human genome.
Introduction
Mining the topological complexity of networks must go beyond estimating statistical network metrics (e.g., degree^{1,2}, clustering coefficient^{3}, path-length distribution) or measuring the network’s geometric^{4,5,6} properties. Instead, we must elucidate the underlying hidden heterogeneous rules that govern the emergence and dynamics of complex networks. For instance, the interactions between brain regions or neurons (topological structures) generate cognitive functional states, but challenges remain in understanding the brain wiring mechanism and the rules related to cognitive processes in network neuroscience^{7}. Furthermore, topological analysis of yeast chromatin maps reveals a transition from intra- to inter-chromosomal interactions when the yeast undergoes different growing states^{8}, but fails to identify the network generators or rules corresponding to this transition. Moreover, multifractal topological analysis reveals that chromosomal interactions are bifractal^{9}. While these multifractal network analysis efforts can detect subtle conformational changes among complex network components (e.g., chromosomes in human stem cells), they fail to explain the emergence and evolution of networks, identify their general set of hidden network generators, and explain how small changes to these generators can lead to exhibited or enhanced complex behavior. Aside from chromosomal interactions, it has also been proven that various real networks possess a hierarchically organized (self-similar) community structure that grows recursively and copies itself^{10,11,12,13,14,15}. For example, neuronal culture networks^{16} and protein interaction networks^{17} also possess complex multifractal behaviors.
Multifractality has been studied as a topological feature with box-covering/box-counting methods^{4,18}. The scaling behavior is examined by a renormalization process that coarse-grains the network into boxes^{6,19}. However, commonly used approaches, like renormalization-group-inspired algorithms (such as box covering or sandbox), fail to illustrate the network emergence^{20}. Various graph models have been proposed to model the growing scale-free properties and multifractal degree distribution by recursively inheriting and adding nodes and connections^{21,22}. Nevertheless, multifractality measured at the topological level alone cannot uncover the hidden community structure or the generating rules.
Unlike exploring and measuring the multifractality of various topological structures, we focus on identifying the underlying network generating functions and developing a general mathematical framework, together with efficient algorithms, to mine the multifractality encoded in the node attributes and the weighted interactions among nodes. The network generating function can provide a high-level and condensed description of complex systems. Uncovering the generating rules will enable us not only to generate synthetic graph structures with different topological properties, but also to reproduce and mine their topological complexity and heterogeneity. The probabilistic description of the generating function should also help to assess the validity of links in a noisy graph and apply to various scenarios. To the best of our knowledge, a robust and general framework for multifractal generating models, along with comprehensive analysis, is lacking. Although the multifractal network generators^{23} can generate networks with multifractal properties and any given graph metrics, the simulated-annealing-based parameter estimation is not robust and the model is limited to binary graphs. The stochastic Kronecker graph model^{10,24,25} is also capable of capturing self-similarity, but the network size must be tied to the model and the heterogeneity in node attributes is neglected. The multiplicative attribute graph model generalizes the two aforementioned models and characterizes the node attributes in social networks^{26,27}. Though the model formulation is general, the estimation algorithm targets only binary node attributes.
To address these research gaps and better understand the complexity and multifractality of real-world networks, one must address the following major challenges: (1) How can we construct a general multifractal network generating model capable of not only capturing the observed multifractal characteristics, but also providing mathematical tools for efficiently investigating and engineering their macroscopic properties? (2) How can we efficiently and correctly reconstruct the underlying network generator model? (3) Can we recover the weighted multifractal network generative model from incomplete (partial) observations and noisy or adversarial data/influences? (4) Do such techniques scale up to real-world networks and enable us to study whether multifractality appears in real-world applications such as brain connectomes and chromosomal interactions?
To answer these questions, here we propose a weighted multifractal graph model (WMGM) constructed recursively from generative measures of linking probabilities and capable of capturing the multifractality and weighted heterogeneity of functional interactions. To clarify the difference between characterizing the multifractality in the topological structure and in the network generating model, we specify that the terms functional level and model level refer to analyzing networks through their reconstructed generative model. In contrast, the term graph level means that we are examining the properties of the network topology. To efficiently learn the parameters of the network generating model, we provide a rigorous variational inference framework capable of reconstructing the underlying multifractal network generator for partially observed networks. This inference method can deal with networks of arbitrary size and any attribute cardinality; it also offers robust parameter estimation. We examine our proposed approach on both synthetic and real-world networks. We show that the proposed model can characterize and reproduce many graph properties (i.e., degree, clustering coefficient, weight distribution). We demonstrate the efficiency and robustness of the proposed model and inference method against incomplete and noisy observations. By applying the network generator inference framework to real-world datasets (e.g., brain networks, chromatin interactions), we reveal the hidden structure of complex systems at the functional level. The results indicate that the WMGM is capable of differentiating between various structures in brain networks and in chromatin interactions. We further show that the proposed inference algorithm can help to detect topologically associating domains (TADs) in chromosomal interaction maps.
Results
Weighted multifractal graph model
We propose the weighted multifractal graph model (WMGM). It is meant to serve as a generalized network generating rule that captures the observed multifractal properties associated with node attributes and the heterogeneity in weights (intensities of pairwise interactions). Building on measure theory concepts, the crux of this multifractal network generating model is to construct a series of probabilities that we associate with the side lengths of rectangles that are recursively built up by repeatedly splitting a unit square. This ensures a heterogeneous selfsimilar network structure. These probabilities are then used to generate the node attributes and edges for the network.
We first define an initial generating measure \({\theta }^{(1)}=\left({l}^{(1)},{p}^{(1)}\right)\) on a unit square. The rationale for considering a unit square is to ensure that the probability mass function of node attributes sums to 1. Next, the unit square is divided into M^{2} rectangles, where \({\{{l}_{i}^{(1)}\}}_{i = 1}^{M}\) are the side lengths of each rectangle. To these rectangles, we assign the probabilities \({\{{p}_{ij}^{(1)}\}}_{i,j = 1}^{M}\). We consider symmetric p^{(1)} in this work, but, as an extension, the asymmetric case could model directed networks. Along the same lines as the multifractal network generator^{20,23}, the self-similar WMGM \({\theta }^{(K)}=\left({l}^{(K)},{p}^{(K)}\right)\) is formulated recursively from this unit square θ^{(1)} with interval lengths l^{(K)} = l^{(K−1)} ⊗ l^{(1)} and linking probabilities p^{(K)} = p^{(K−1)} ⊗ p^{(1)}. Here ⊗ denotes the Kronecker product.
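As a minimal sketch, the recursive Kronecker construction of θ^{(K)} can be written in a few lines; the numerical values below are the synthetic generator used later in the Results, and the function name `build_wmgm` is illustrative, not from the paper:

```python
import numpy as np

def build_wmgm(l1, p1, K):
    """Recursively build the order-K generating measure
    theta^(K) = (l^(K), p^(K)) from theta^(1) via Kronecker products."""
    lK, pK = l1.copy(), p1.copy()
    for _ in range(K - 1):
        lK = np.kron(lK, l1)   # interval lengths: l^(K) = l^(K-1) x l^(1)
        pK = np.kron(pK, p1)   # linking probs:    p^(K) = p^(K-1) x p^(1)
    return lK, pK

# Example generator from the synthetic experiments (M = 2, K = 3)
l1 = np.array([0.7, 0.3])
p1 = np.array([[0.8, 0.5], [0.5, 0.4]])
lK, pK = build_wmgm(l1, p1, K=3)
```

Because l^{(1)} sums to 1 and p^{(1)} is symmetric, l^{(K)} remains a probability vector of length M^{K} and p^{(K)} a symmetric M^{K}-by-M^{K} matrix of linking probabilities.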
An undirected weighted graph is then generated by the following procedure: (1) N nodes are spread into M^{K} classes with prior probabilities l^{(K)}. The indicator variable ϕ_{uq} denotes the label indicating whether a node u has attribute q. Note that ϕ_{uq} = 0 or 1 and \(\mathop{\sum }\nolimits_{q = 1}^{{M}^{K}}{\phi }_{uq}=1\). The attributes follow the categorical distribution \(P\{{\phi }_{uq}=1\}={l}_{q}^{(K)}\), q = 1…M^{K}. (2) The edges between nodes u and v are generated with linking probability p^{(K)}. Let \({\{w(r)\}}_{r = 0}^{\infty }\) denote the predefined weight distribution, where w(0) = 0 and \(\left\{w(r)\right\}\) is monotonically increasing. The probability that an edge between nodes u and v has weight w(r) is given by \(P\{{e}_{uv}=w(r)\mid {\phi }_{uq}=1,{\phi }_{vh}=1\}={\left({p}_{qh}^{(K)}\right)}^{r}\left(1-{p}_{qh}^{(K)}\right)\). For simplicity, we denote it as \({p}_{qh}^{(K)}({r}_{uv})\), where r_{uv} is the weight category r of the edge between node u and v. Here, the probability that an edge does not exist is \({p}_{qh}^{(K)}(0)=1-{p}_{qh}^{(K)}\) and the chance that an edge (regardless of the weight) exists is \(\mathop{\sum }\nolimits_{r = 1}^{\infty }{p}_{qh}^{(K)}(r)={p}_{qh}^{(K)}\). It naturally satisfies \(\mathop{\sum }\nolimits_{r = 0}^{\infty }{p}_{qh}^{(K)}(r)=1\) and can be easily mapped to unweighted graphs. In contrast to ref.^{20}, where the linking probabilities p^{(1)}(r) are determined for each weight level r, we design the edge distribution \(P\{{e}_{uv}=w(r)\mid {\phi }_{uq}=1,{\phi }_{vh}=1\}\) as a geometric distribution with \({p}_{qh}^{(1)}\) identical for all weights. The rationale is that smaller weights are more common. We also aim at using fewer parameters to capture the heterogeneous graph structure.
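The two-step sampling procedure (attributes from l^{(K)}, then geometric edge-weight categories from p^{(K)}) might be sketched as follows; `sample_wmgm_graph` and the toy generator are illustrative, not the paper's implementation. Note that NumPy's geometric sampler counts trials to the first success, so drawing with success probability 1 − p_{qh} and subtracting 1 yields P{r} = (p_{qh})^r (1 − p_{qh}) with r = 0 meaning no edge:

```python
import numpy as np

def sample_wmgm_graph(lK, pK, N, rng):
    # Step 1: assign each node an attribute q with prior probability l_q^(K)
    attrs = rng.choice(len(lK), size=N, p=lK)
    # Step 2: draw each edge's weight category r_uv from the geometric law
    W = np.zeros((N, N), dtype=int)
    for u in range(N):
        for v in range(u + 1, N):
            p = pK[attrs[u], attrs[v]]
            r = rng.geometric(1 - p) - 1   # P{r} = p^r (1 - p), r >= 0
            W[u, v] = W[v, u] = r          # symmetric (undirected graph)
    return attrs, W

rng = np.random.default_rng(0)
l1 = np.array([0.7, 0.3])
p1 = np.array([[0.8, 0.5], [0.5, 0.4]])
# K = 2 toy generator built by one Kronecker product
attrs, W = sample_wmgm_graph(np.kron(l1, l1), np.kron(p1, p1), N=60, rng=rng)
```

Mapping to an unweighted graph then amounts to thresholding W at r ≥ 1.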
The multifractality of the model emerges from the recursive construction. The derivation of the partition function and the multifractal metrics are presented in the Methods section Multifractal analysis of WMGM. Special cases of the proposed model correspond to several related models. When M^{K} = N, the proposed weighted multifractal graph model retrieves the Kronecker model^{10} as a particular case. When weights are neglected (i.e., total weight level R = 1), the proposed model reduces to the multifractal network generator^{23}.
Figure 1a shows a numerical example of the model building procedure and graph generation. The model θ^{(2)} in the middle is constructed from θ^{(1)}, shown on the left-hand side, at the first iteration. In the simulated graph, the colors of the nodes represent the attributes generated by l^{(2)} and the weights of the links are generated by p^{(2)}. Figures 1b and c show two different models which generate networks with different community structures. In the generating model θ^{(K)}, the linking probability p^{(K)} approximates the connection rules and community structure. Larger values of \({p}_{qh}^{(K)}\) lead to denser connections between nodes in communities q and h. If \({p}_{qh}^{(K)}\) is even across different q, h, it suggests relatively ambiguous heterogeneity (Fig. 2a). When \({p}_{qh}^{(K)}\to 1\), each pair of nodes in communities q and h has a connection, creating a fully connected subgraph (Fig. 2b). When \({p}_{qh}^{(K)}\to 0\), no connection may exist between categories q and h, leading to an m-partite structure (q = h, Fig. 2c) or a community structure (q ≠ h, Fig. 2d).
To overcome the challenges related to mining large-scale complex systems (e.g., heterogeneity in weights, scale-dependent higher-order interactions), we investigate how the proposed WMGM can decipher the hidden rules that govern their complex topological architecture and functionality. Indeed, mapping a large network to a generative model can lose some intricate details of subnetworks and their interactions. However, reconstructing a function-level model can compress redundant information and allow us to deal with incomplete or noisy data, which is common in real-world datasets. Consequently, the WMGM can learn the hidden rules that govern structures in complex systems such as brain networks (e.g., understanding and interpreting the emergence of neuronal computation in brain networks), biological systems (e.g., understanding the emerging genotype-phenotype relationships) and social networks. To investigate the benefits and limitations of the WMGM, we evaluate its capabilities in reconstructing the network generating models from scarce, noisy observations on a series of artificially generated and real-world networks. First, we validate our method on synthetic data in terms of convergence and estimation error. Next, because real-world networks are usually only partially observed and often noisy, we investigate the robustness of our approach to such factors. We also apply our method to three real-world complex networks, namely the brain connectome of Drosophila, the chromosome interactions of yeast cells undergoing different growth states, and the conformation maps of replicated human chromosomes. The results show that our method can reproduce and elucidate important properties of real-world complex systems.
Learning the hidden network generators (rules) from partial and noisy observations of synthetic networks
To validate the ability of the proposed WMGM framework to reconstruct the ground-truth generators and to understand its estimation error, we examine three synthetic network case studies, from least to most challenging: (i) Clean, fully observed networks, (ii) Partially observed networks with varying observation sizes, and (iii) Noisy networks with spurious edges. In the fully observed setting, we demonstrate that our model can successfully reconstruct the ground-truth generator and reproduce the graph properties of the synthetic network. We also show that the WMGM inference is robust up to a certain level of missing observations in the case of partially observed networks, and can handle noise by distinguishing between spurious and true links in the last case.
Network generator reconstruction
We first examine the network generator reconstruction accuracy and the ability of the WMGM to recover simulated graph properties. We use a synthetic graph G_{syn} of 500 nodes generated by l^{(1)} = [0.7, 0.3], p^{(1)} = [0.8, 0.5; 0.5, 0.4] with hyperparameters M = 2, K = 3 and a predefined weight set \(\left\{w(r)=r\right\}\). We implement the variational expectation-maximization (EM)-based estimation method (described in the Methods section Parameter estimation of WMGM) on the fully observed simulated network with step length γ = 10^{−7} for the gradient methods in the M-step. The algorithm stops when the increment of the objective function (lower bound of the log-likelihood) after one EM iteration is smaller than 0.1. Figures 3a and b show the convergence of the lower bound of the log-likelihood \({{{{{{\mathscr{L}}}}}}}_{Q}(\theta ,R)\) and the reconstructed parameters as the EM iterations proceed, respectively. We note that the lower bound converges quickly within the first 20 EM iterations and later slightly increases and converges after 120 iterations. The relative absolute error between the reconstructed lower bound \({{{{{{\mathscr{L}}}}}}}_{Q}(\theta ,R)\) and the true log-likelihood \({{{{{{\mathscr{L}}}}}}}_{{{{\rm{syn}}}}}(R)\) of the synthetic graph G_{syn} is \(|({{{{{{\mathscr{L}}}}}}}_{Q}(\theta ,R)-{{{{{{\mathscr{L}}}}}}}_{{{{\rm{syn}}}}}(R))/{{{{{{\mathscr{L}}}}}}}_{{{{\rm{syn}}}}}(R)| =0.0022\), which shows the recoverability of the proposed WMGM framework. Meanwhile, the estimated parameters show a similar trend and converge to the ground-truth values. Figure 3c presents the mean relative absolute error (RAE) per parameter as a function of the EM iterations. It shows that the error decreases quickly within the first 20 EM iterations, and when the small increment of the log-likelihood lower bound emerges at 100 EM iterations in Fig. 3a, the error begins to drop sharply again.
After 120 EM iterations, the algorithm achieves the minimum error of 0.013 (1.3% error per parameter), and when it terminates, the error is 0.032 (3.2% error per parameter). Figure 3b shows that the value of the recovered \({p}_{22}^{(1)}\) (yellow line) crosses the ground truth 0.4 at 120 EM iterations and then decreases by a small amount. This suggests that we can terminate the algorithm early, when the objective function starts to converge, and achieve the best performance.
We also investigate the recoverability of the network structures through the proposed WMGM framework. In Fig. 3d–f, we compare the simulated and reconstructed network properties, including the clustering coefficients, degree distribution, and weight distribution. The simulated distributions (blue lines) are directly calculated from the synthetic network G_{syn} and the reconstructed results are calculated from a network G_{recon} generated by the recovered WMGM. We observe that the proposed estimation method can successfully reproduce the network properties of synthetic networks. In Supplementary Note 5 we further quantify the dissimilarity of the simulated and reconstructed graph properties, while in Supplementary Note 3 we include null models as a comparison to show the efficiency of the estimation algorithm.
There is no guarantee that the EM algorithm converges to the maximum likelihood estimator. If the objective function is nonconvex, the algorithm may terminate at or near a local optimum. The estimation accuracy is also related to the size of the network and its density. Consequently, we investigate the dependency between the mean relative absolute error and the multifractal spectrum width of the model, along with other key properties, in Fig. 4e. We select different levels of randomness in the model with varying positions of the multifractal spectrum, generate a graph of 200 nodes, then recover the model with the same model initialization and measure the mean relative absolute error per parameter. We consider three cases: all parameters randomly generated (blue asterisks); \({p}_{12}^{(1)}={p}_{21}^{(1)}=0.5\) with random \({p}_{11}^{(1)}\), \({p}_{22}^{(1)}\), \({l}_{1}^{(1)}\), \({l}_{2}^{(1)}\) (green dots); and \({l}_{1}^{(1)}={l}_{2}^{(1)}=0.5\), \({p}_{12}^{(1)}={p}_{21}^{(1)}=0.5\) with a fixed center of the multifractal spectrum (red circles). The multifractal spectrum width is calculated as in the Methods section Multifractal analysis of WMGM. Figure 4e shows that the fully random cases make such local minima particularly prominent.
Reconstruction of network generator from partial observations
Complex networks are usually partially observed. This situation has many causes, including the following scenarios: (1) the network is still growing and new nodes can join in the future; (2) it is computationally expensive or technologically impossible to examine the whole network (e.g., all neurons in the human brain). Therefore, we investigate the ability to successfully reconstruct the ground-truth WMGM from partial observations.
For the partially observed experiments, we use a synthetic graph with N_{0} = 100,000 nodes generated by l^{(1)} = [0.7, 0.3], p^{(1)} = [0.8, 0.5; 0.5, 0.4], M = 2, K = 3. In each trial, we randomly select N of the N_{0} nodes and take the connections among the selected N nodes as incomplete data. We repeat this process 10 times for each N and measure the recovered parameters. The means and standard deviations of the recovered parameters and errors are shown in Fig. 4a, b against the number of observed nodes N. We find that the model is correctly recovered with low standard deviation when N = 200 or more nodes are observed, where the mean relative absolute error (RAE) per parameter with N = 200 is 3.1% and the standard deviation is 0.5%. We conclude that the proposed WMGM and the inference method are robust against missing components in the system.
We repeat the experiments with different full original network sizes N_{0} = 10^{3}, 5 × 10^{3}, 10^{4}, 5 × 10^{4}, 10^{5} and the same N. We calculate the mean RAE and report the minimum fraction of observed nodes f needed to achieve a given small error in Fig. 4c. Both axes are in log scale. The blue dots represent mean RAE smaller than 0.035 (3.5%) and red asterisks represent mean RAE smaller than 0.050 (5.0%). They are well fitted by power laws (shown as solid lines). The regression for error smaller than 0.035 is \(f=332\times {N}_{0}^{-1.04}\) and for error less than 0.050 is \(f=95\times {N}_{0}^{-0.97}\). This shows that the fraction of observations required to achieve a given small error decreases as a power law as the original network size grows. Figure 4d shows the average error over 10 experiments for combinations of N = 50, 100, …, 500 and N_{0} = 10^{3}, 5 × 10^{3}, 10^{4}, 5 × 10^{4}, 10^{5}. The axis of N_{0} is in log scale. The underlying generating model is recoverable when the partial observation contains more than 200 nodes, regardless of the original network size N_{0}. This is critical, as in real-world complex systems, only partial observation, without full monitoring and detection, is possible. Since we use the WMGM as the generating rule and we assume the networks are partially but evenly observed, reconstruction from a partial observation (a subgraph) can achieve good performance while saving computational cost. This suggests that when dealing with very large networks, it is possible to correctly estimate the hidden generating rules even using a small subset of the network with only 200 nodes. We further perform more individual experiments to show the robustness in Supplementary Note 7.
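Because the fitted exponent is close to −1, the absolute number of nodes needed, N = f · N_{0}, stays roughly constant across network sizes, consistent with the 200-node observation above. A quick numerical check using the reported regression for mean RAE < 0.035 (the data points here are regenerated from that fit, not the paper's raw measurements):

```python
import numpy as np

# Reported fit: fraction of observed nodes needed for mean RAE < 0.035
N0 = np.array([1e3, 5e3, 1e4, 5e4, 1e5])
f = 332 * N0 ** -1.04

# Exponent near -1 implies N = f * N0 ~ 332 * N0^(-0.04), which barely
# changes over two decades of N0 (roughly 210-250 nodes)
N_needed = f * N0

# Recover the power law by linear regression in log-log coordinates,
# as done for Fig. 4c
slope, intercept = np.polyfit(np.log10(N0), np.log10(f), 1)
```

The regression in log-log space recovers the exponent (slope) and prefactor (10^intercept) of the power law directly.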
Reconstructing the network generators from noisy observations and quantifying the reconstructed link reliability
We test the proposed WMGM and the estimation algorithm on noisy networks with spurious links. For this noisy setting, we first generate a synthetic binary graph with the same model as in the section Reconstruction of network generator from partial observations. In the binary version, weights are neglected: any edge in the weighted graph with e_{uv} ≠ 0 is considered an edge in the binary version G_{0} of the graph. Next, spurious links are randomly added with probabilities p = 10^{−3}, 3 × 10^{−3}, 10^{−2}, 3 × 10^{−2}, 10^{−1}, 3 × 10^{−1} between pairs of nodes where no edges exist in the original network, producing a noisy graph G_{n}. For each noise level p, we run the experiments 10 times individually. We call links that exist in both the synthetic graph G_{0} and the noisy graph G_{n} true positives, while links added only in G_{n} are false positives. We aim at differentiating true positive and false positive links in noisy graphs. We first reconstruct the generative model θ_{n} from the noisy observation G_{n}. We then define the link reliability of an edge e_{uv} in the noisy network as its log-likelihood given the reconstructed model, \(L{R}_{uv}={{{{\mathrm{log}}}}}\,P({e}_{uv}\mid {G}_{n};{\theta }_{n})\).
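As a sketch of how LR_{uv} might be computed from a recovered generator, one can marginalize the edge likelihood over the hidden node attributes; the version below uses the prior l^{(K)} for simplicity, whereas the paper's algorithm has access to the variational posterior over attributes from the E-step. The function name `link_reliability` is illustrative:

```python
import numpy as np

def link_reliability(W, lK, pK):
    """LR_uv = log P(e_uv | G_n; theta_n), approximated here by
    marginalizing over node attributes with the prior l^(K):
    P(e_uv = w(r)) = sum_{q,h} l_q l_h (p_qh)^r (1 - p_qh)."""
    N = W.shape[0]
    prior = np.outer(lK, lK)
    LR = np.zeros((N, N))          # diagonal left at 0 (no self-loops)
    for u in range(N):
        for v in range(u + 1, N):
            r = W[u, v]
            prob = (prior * pK ** r * (1.0 - pK)).sum()
            LR[u, v] = LR[v, u] = np.log(prob)
    return LR

# Toy assortative generator (placeholder values, not a recovered model)
lK = np.array([0.5, 0.5])
pK = np.array([[0.9, 0.05], [0.05, 0.9]])
LR = link_reliability(np.array([[0, 3], [3, 0]]), lK, pK)
```

Under such a generator, implausibly heavy edges receive lower reliability, which is the property exploited to separate spurious from true links.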
Figure 5a shows the cumulative distribution of link reliability for the two labels, true positive and false positive, at noise level p = 10^{−1}. Spurious links (false positives) have lower reliability than true ones (true positives) in their distributions. The average link reliability of false positives is also smaller than that of true positives. This separation implies that the WMGM is able to detect noise in observations and therefore can help to denoise graphs. We validate the ability of graph denoising and its application in Supplementary Note 1. Figure 5b, c shows the reconstructed model parameters and the estimation error for different noise levels p. More spurious links (noise) are added to the network as the noise level p increases. As a consequence, the recovered-parameter error also increases as the noise level grows. The curve shows that the estimation error is smaller than 0.05 (5%) at low noise levels p < 10^{−1}. The estimations are also robust (with low variance of the reconstructed parameters and error) when p ≤ 10^{−1}. Though the estimation error is relatively large (10%) when p = 10^{−1}, Fig. 5a shows that even with relatively high-level noise, our model is still capable of distinguishing spurious links from true links in the network. We also perform more repetitions to show the robustness in Supplementary Note 7.
Learning the hidden network generators (rules) of biological networks
To demonstrate the capabilities and benefits of the proposed WMGM inference framework, we investigate and learn the network generators (rules) of the following three biological networks: (i) the neuronal connections in the adult Drosophila central brain^{28}, (ii) the chromatin interactions of the yeast genome^{8}, and (iii) the conformation maps of replicated human chromosomes^{29}. We show that the WMGM enables us to reveal important properties of these biological datasets, such as recovering their topological network properties, differentiating growing states, identifying specific features of brain structures in different regions, and detecting TADs. We also conduct experiments on various social networks; the results can be found in Supplementary Note 2.
Revealing the network generators of Drosophila brain connectome
We use the largest synaptic-level connectome obtained through three-photon microscopy of the fruit fly brain^{28}. Chemical synapses between neurons are detected and the numbers of synapses are taken as the intensity of neuron connections. The original Drosophila connectome G_{0} of the left alpha lobe in the mushroom body consists of 10,790 neurons (nodes) and 6444 identified synaptic connections (edges). We delete neurons without connections and use the connected 693 nodes to construct a network G with the full 6444 connections, then reconstruct the WMGM. We neglect the isolated nodes to avoid the extra computational cost and to construct a relatively denser network, which achieves better model estimation performance. When the network is large and sparse, the estimation tends to be unstable and inaccurate because we have very limited link observations. In Supplementary Note 9 we show the results with different node and edge sampling sizes. Note that the method of sampling nodes could influence the network topological structure and the reconstructed model. In the future, we will also investigate and develop strategies that allow us to select the minimum number of nodes required to accurately reconstruct the WMGM obeying different network properties and for different network sizes. Also, we can always reintroduce sparsity into the recovered WMGM by adding a negative bias to each linking probability parameter p^{(1)} once the variational EM algorithm has finished. We discretize the network G with 2^{r} ≤ w(r) < 2^{r+1} − 1 and then use it as the input to the proposed estimation algorithm.
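The dyadic binning 2^{r} ≤ w(r) < 2^{r+1} − 1 can be implemented with a base-2 logarithm. Here we shift categories by one so that every nonzero raw weight lands at r ≥ 1, since r = 0 encodes "no edge" in the model; this shift is our reading of the scheme, not stated explicitly in the text:

```python
import numpy as np

def discretize_dyadic(W):
    """Map raw synapse counts to weight categories: w = 0 stays 0
    (no edge); w in [2^k, 2^(k+1)) maps to category k + 1, so any
    observed connection gets r >= 1 (shift assumed, see lead-in)."""
    R = np.zeros(W.shape, dtype=int)
    nz = W > 0
    R[nz] = np.floor(np.log2(W[nz])).astype(int) + 1
    return R

# Toy symmetric count matrix (hypothetical synapse counts)
counts = np.array([[0, 1, 4], [1, 0, 7], [4, 7, 0]])
R = discretize_dyadic(counts)
```

Exponential bin widths match the model's geometric edge-weight law, under which small weights are the most common.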
Figure 6 shows the estimation and reconstruction results. Figure 6a, b show the convergence of the lower bound of the log-likelihood and of the parameters over the EM iterations. Figure 6c illustrates the reconstructed WMGM. The colors in the square represent the values of the p^{(K)} probabilities, and the interval lengths reflect l^{(K)}. The brain connectome in the alpha lobe is sparse; therefore most regions in the square model have small linking probability values. The exception is the bottom-right diagonal entry, which has the value \({p}_{88}^{(K)}=0.5213\). Its presence is due to a group of very strong interactions among around 20 neurons in the connectome. Figure 6d–f presents the clustering coefficients, degree distribution and cumulative weight distribution of the empirical and reconstructed brain networks. The blue line is the empirical distribution directly extracted from the brain connectome. The red line is the result calculated from a synthetic network generated by the reconstructed WMGM. The yellow dashed line represents a null network generated by the Erdős–Rényi model^{30} with the average linking probability. We also show the distributions in log scale in Supplementary Note 8. The empirical and reconstructed distributions are close to each other, showing that the WMGM and the proposed inference approach can also reconstruct the statistical properties of real networks. In the scenario of brain connectomes and neuronal connections, it is extremely hard to detect and monitor all neurons or the full functional connectivity due to the complex three-dimensional structure and the unknown physicochemical interactions. In the Reconstruction of network generator from partial observations section, we discussed the recoverability of the WMGM with limited observations. Therefore, the proposed model can enable neuroscientists to estimate hidden rules and learn topological properties of brain networks even if only limited and partial observations are available.
Brain networks in different brain regions have varying topological structures and features. We exploit our proposed WMGM inference framework to examine the structure and connectivity in four regions of the Drosophila optic lobe with different functionalities: Medulla, Accessory Medulla, Lobula and Lobula Plate. Recall that the brain connections are sparse and tend to appear in a small subset. Therefore, we select the 200 most connected nodes in these brain regions and binarize their connections as the input to the WMGM inference algorithm. To assess reconstruction accuracy, we run the inference algorithm 50 times on each brain network and calculate the means and standard deviations. For the Medulla connectome, we obtain an average network generator with the following parameters: l^{(1)} = [0.63, 0.37], p^{(1)} = [0.18, 0.26; 0.26, 0.92]. For the Accessory Medulla connectome, the parameters of the average network generator are: l^{(1)} = [0.48, 0.52], p^{(1)} = [0.07, 0.14; 0.14, 0.34]. For the Lobula connectome, the parameters of the average network generator read: l^{(1)} = [0.46, 0.54], p^{(1)} = [0.41, 0.42; 0.42, 0.95]. For the Lobula Plate connectome, we obtain the following average network generator: l^{(1)} = [0.42, 0.58], p^{(1)} = [0.12, 0.21; 0.21, 0.82]. The standard deviation of each parameter in each network is smaller than 10^{−10}. The inferred p^{(1)} and l^{(1)} are visualized as the colors and side lengths of the yellow-green squares in Fig. 7. We further show the clustering coefficients and degree distributions of the four brain networks in Supplementary Note 10. It is impossible to obtain a concise description of each network that encodes all of its properties. We conclude that the reconstructed network generator models can be easily distinguished and that our WMGM can be used to differentiate scale-dependent brain regions with different functionalities.
Moreover, we can exploit the WMGM to characterize regional cognitive functionality using the reconstructed generating rule θ = (p, l). This enables us to measure and quantify neural behaviors and divergences in cognition.
Inferring the network generators of chromosomal interactions of yeast genome in different growing states
Chromosome conformation capture analysis (also known as the Hi-C technique) reveals the topological structure of genomic sequences^{29,31} and allows scientists to examine the 3D structure of chromatin. It measures the contacts between any pair of genomic loci^{31}. In the chromosomal interaction matrix built by the Hi-C technique, nodes represent genomic loci and each pairwise edge indicates the interaction frequency between two loci in the genome^{32}.
Across its different growing states, the yeast genome exhibits a complex topological reorganization^{8}. To mine this topological complexity, we infer the WMGMs emerging from the chromosome interaction data^{8,32}. For each chromosomal interaction matrix, we first downsample the 12,048-by-12,048 matrix to 503-by-503 and then discretize it with 200 ≤ w(r) < 200 + 100r.
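A plausible implementation of the downsampling and discretization steps is sketched below; the block-averaging scheme and this reading of the rule 200 ≤ w(r) < 200 + 100r (category r = ⌊(w − 200)/100⌋ + 1 for w ≥ 200, no edge otherwise) are our assumptions, not necessarily the authors' exact procedure:

```python
import numpy as np

def downsample(W, n_out):
    """Block-average an n-by-n matrix down to n_out-by-n_out."""
    n = W.shape[0]
    edges = np.linspace(0, n, n_out + 1).astype(int)  # block boundaries
    out = np.empty((n_out, n_out))
    for a in range(n_out):
        for b in range(n_out):
            out[a, b] = W[edges[a]:edges[a + 1], edges[b]:edges[b + 1]].mean()
    return out

def discretize(W, base=200.0, step=100.0):
    """Assign weight category r = floor((w - base)/step) + 1 for w >= base,
    and 0 (no edge) otherwise -- one plausible reading of the rule
    base <= w(r) < base + step*r."""
    r = np.where(W >= base, np.floor((W - base) / step) + 1, 0)
    return r.astype(int)
```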
Figure 8a, b illustrate the reconstructed WMGM θ^{(K)} from the chromosome interactions of the yeast genome in the exponential growth and quiescence states, respectively. We fix l^{(1)} to be identical in both growing states so that the linking probabilities can be compared. The values of p^{(K)} in the major subblocks are shown in the figures. Figure 8a shows that the diagonal linking probabilities of exponentially growing yeast cells are larger than those of cells in the quiescence state shown in Fig. 8b, while the off-diagonal elements are relatively smaller. This suggests that when the yeast is growing, the interchromosomal interactions become weaker (\({\sum }_{i\ne j}{p}_{ij}^{(K)}{l}_{i}^{(K)}{l}_{j}^{(K)}\) changes from 0.2217 to 0.1758) and the intrachromosomal interactions become stronger (\({\sum }_{i = j}{p}_{ij}^{(K)}{l}_{i}^{(K)}{l}_{j}^{(K)}\) changes from 0.1201 to 0.1516). This conclusion is consistent with the analysis in ref.^{8}, where the authors measure the intrachromosomal distances between two sites on one chromosome. Figure 8c shows the cumulative weight distribution of the chromosome contact maps. The difference between the two growth states is clearer in the recovered models than in the statistical properties of the network weights. The WMGM thus enables us to identify these properties, and our model can therefore help to distinguish between different growth states.
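The intra- and interchromosomal interaction strengths quoted above are simple aggregates of the recovered generator; a minimal sketch (the function name is ours):

```python
import numpy as np

def interaction_strengths(p, l):
    """Aggregate intra- (i == j) and inter- (i != j) block interaction
    strengths sum_ij p_ij * l_i * l_j from a recovered generator."""
    weights = np.outer(l, l) * p
    intra = np.trace(weights)
    inter = weights.sum() - intra
    return intra, inter
```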
Network generator inference and analysis for the conformation maps of replicated human chromosomes
Chromosome conformation capture analysis (also known as the Hi-C technique) reveals the topological structure of genomic sequences^{29}. TAD detection identifies highly self-interacting chromatin regions. TADs emerge as square blocks whose centers lie on the diagonal of the interaction matrices. Although TADs have emerged as critical features characterizing high intra-domain contacts, an unambiguous definition is still evolving^{31}. We infer the WMGM from binarized cis sister contact maps from ref.^{29} and show that our model can help to detect TADs in Hi-C matrices.
In the variational EM based estimation method (see Methods section on Parameter estimation of WMGM), we introduce the variational parameters τ_{uq} to calculate the lower bound of the log-likelihood \({{{{{{\mathscr{L}}}}}}}_{Q}(\theta ,R)\) in Eq. (3). For each node u, \({\{{\tau }_{uq}\}}_{q = 1\ldots {M}^{K}}\) can be viewed as soft assignments capturing the probability that node u has attribute q. They are also estimators of the node attribute distribution parameters l^{(K)}, and each node u is assigned a vector τ_{u}. We calculate the entropy of the variational estimator τ_{u} for each node u as \(H(u)=-\mathop{\sum }\nolimits_{q = 1}^{{M}^{K}}{\tau }_{uq}{{{{\mathrm{log}}}}}\,{\tau }_{uq}\). Figure 9a shows the binarized Hi-C interaction matrix of the human chromosome 21 cis sister contacts. For the best reconstruction of the WMGM, we downsample it to a 483-by-483 matrix and apply the proposed inference algorithm. TADs are circled with orange and green rectangles. The node attribute entropy H(u) is shown in Fig. 9b, where entropies close to zero are circled. We find that the positions (node indices) with near-zero entropy correspond to the regions where TADs emerge. We conclude that nodes in TADs have strong intra-domain interactions and tend to have low entropy (i.e., the WMGM assigns their attributes with high confidence). This suggests that our WMGM can also help to detect TADs in Hi-C data analysis.
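The node attribute entropy used for TAD detection can be computed directly from the variational parameters; a minimal sketch, assuming rows of `tau` are the per-node soft assignments:

```python
import numpy as np

def node_attribute_entropy(tau, eps=1e-12):
    """Shannon entropy H(u) = -sum_q tau_uq log tau_uq per node;
    rows of tau are soft assignments summing to one."""
    t = np.clip(tau, eps, 1.0)       # avoid log(0) for zero assignments
    return -(tau * np.log(t)).sum(axis=1)
```

Nodes whose entropy falls below a small threshold can then be flagged as TAD candidates, e.g. `np.where(node_attribute_entropy(tau) < 0.05)[0]`.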
Discussion
Exploring topological features in complex networks has the potential to enhance our understanding of the behavior of natural and social phenomena. Among the many topological features, we focus on multifractality, an important property widely present in complex systems across numerous domains, including biology, sociology, neuroscience, and geology. Analyzing the multifractality of complex systems enables scientists to measure the multiscale interactions among components in large-scale complex networks^{5}.
Network multifractality is commonly studied and analyzed at the graph level, where the structure of connections among nodes is considered to be self-similar^{6,19}. However, past approaches suffer from a number of limitations. Renormalization-group-inspired algorithms, while capable of estimating the multifractality of graphs, fail to explain the emergence and evolution of networks and cannot decipher the hidden generating rules. The stochastic Kronecker graph model^{25} captures self-similarity by building a probabilistic model with Kronecker products. However, it requires the graph and model sizes to match, limiting its applicability to networks of arbitrary scale and to partially observed networks. In refs.^{20,23}, the authors propose multifractal network generators, but they reconstruct the model parameters by fitting graph metrics via a simulated-annealing procedure. The simulated-annealing algorithm is unstable and can return various sets of unrelated parameters, which makes it difficult to interpret the generator's physical meaning. The majority of network models also neglect the importance of the weights characterizing the interactions in complex networks.
To decipher the hidden network generators (rules) governing complex system dynamics at the functional level, we proposed the weighted multifractal graph model (WMGM), which is capable of capturing the heterogeneity and varying degrees of self-similarity of complex systems. The proposed WMGM serves as a function that maps and compresses a large graph onto a compact model. The network generating function also provides a high-level, condensed description of complex systems that integrates varying graph metrics such as the degree and clustering coefficient. To efficiently infer the model parameters, we developed a variational EM inference framework for reconstructing the underlying network generating function encoded in complex networks. We investigated ground-truth recovery and the robustness of the model inference against incomplete data and noisy observations. The provided mathematical tools can help to investigate network (topological) features by describing large-scale complex systems through functional approaches. Uncovering the generating rules enables us not only to generate synthetic graphs with different properties, but also to reproduce the topological complexity and heterogeneity of real-world systems. The proposed WMGM framework is applied to several real-world networks: neuronal connections and chromosomal interactions. The recovered WMGMs demonstrate the potential of the framework to capture and reproduce the topological structures of real networks. The probabilistic description of the generating function also helps to assess the validity of links in a noisy graph and to denoise the system. The reconstructed generator is also able to distinguish different functional regions in the brain and different yeast growing states, as well as to detect the boundaries of TADs.
In this work, we assume the distribution of node categories is the same across nodes and communities. Generalizing the topological scales, as the authors show in ref.^{33}, can improve the ability of the proposed method to capture the heterogeneity embedded in the topological structure of real-world complex systems such as brain networks. For example, instead of using one rule \(P\{{\phi }_{uq}=1\}={l}_{q}^{(K)}\) to generate the node attributes, we can introduce a heterogeneous distribution where the value of K varies across nodes in different communities. The category (community) of each node u is assigned by \({\phi }_{u} \sim categorical({l}^{({k}_{u})})\). The heterogeneity is introduced through the random variable k_{u} (the scale of node u, a positive integer between 1 and K with distribution f(k_{u})). The linking probability between nodes u and v given the communities ϕ_{u}, ϕ_{v} and scales k_{u}, k_{v} can be calculated as \({\sum }_{q,h}{p}_{qh}^{(K)}\), where the summation over q, h satisfies \({\sum }_{q}{l}_{q}^{(K)}={l}_{{\phi }_{u}}^{({k}_{u})}\) and \({\sum }_{h}{l}_{h}^{(K)}={l}_{{\phi }_{v}}^{({k}_{v})}\).
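The proposed heterogeneous-scale extension can be sketched as a two-stage sampling procedure. The function below is a hypothetical illustration, under the assumption that l^{(k)} is the k-fold Kronecker power of the base measure:

```python
import numpy as np

def sample_heterogeneous_attributes(l1, f_k, n_nodes, seed=0):
    """Sketch of the proposed extension: each node u first draws a scale
    k_u ~ f_k, then a community phi_u ~ categorical(l^{(k_u)}), where
    l^{(k)} is the k-fold Kronecker power of the base measure l1."""
    rng = np.random.default_rng(seed)
    K_max = len(f_k)
    # precompute l^{(k)} for k = 1..K_max via repeated Kronecker products
    l_pows = [np.array(l1)]
    for _ in range(K_max - 1):
        l_pows.append(np.kron(l_pows[-1], l1))
    k = rng.choice(np.arange(1, K_max + 1), size=n_nodes, p=f_k)
    phi = np.array([rng.choice(len(l_pows[ku - 1]), p=l_pows[ku - 1])
                    for ku in k])
    return k, phi
```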
In the future, we will also investigate the effect of various network sampling methods (as performed in the Drosophila connectome and social network case studies) and develop strategies that yield a consistent WMGM generating model across subgraphs with different sizes and properties. Moreover, future applications of the proposed WMGM framework include inferring WMGM models that correspond to partially observed neuronal activity and quantifying how the identified WMGM models evolve and self-optimize during observed cognitive activities. A crucial question in neuroscience is how to measure, identify and compare higher-order topological characteristics of neuronal behaviors and activities under different (cognitive) circumstances. At the spike train level, it is difficult to compare spiking behavior across recordings of different lengths and varying numbers of neurons during nonstationary brain activity; at the neuronal network level, networks of different sizes are hard to compare. In the future, we will propose an approach for mining neuronal activity that focuses on identifying the compressed WMGM models and quantifying their evolution and the distances among the WMGM models corresponding to different cognitive tasks. More precisely, we can first reconstruct the generating measure from observed neuronal networks exhibiting different cognitive behaviors and quantify the changes in the generators programming the neural behaviors through modifications in the model parameters θ. Subsequently, we can calculate the distance between two cognitive behaviors as a distance between the parameters of the corresponding WMGM models. Other important future work includes an implementation aimed at very sparse networks and the application of the WMGM to detect hierarchical community structures in complex systems such as cyber-physical systems.
With the aid of the proposed WMGM, we also aim to quantify cognition given neuronal behaviors and neuron-glia (astrocyte) metabolic coupling and information processing under different cognitive tasks. In the future, when real-time brain activity monitoring becomes available, the WMGM can be extended to analyze the time-varying network generators of label-free real-time imaging of neuron-glia activity. This could represent a major step towards a comprehensive understanding of non-Markovian learning, decision making and other brain cognitive functions. With respect to 4D Nucleome networks, future work will focus on constructing more robust strategies for identifying TADs, with applications to Hi-C analysis.
Methods
Parameter estimation of WMGM
In this section, we discuss how to recover the parameters of the WMGM via a variational approach. We first provide a probabilistic description of a weighted network within the WMGM framework. Let R denote the N-by-N adjacency matrix of weight categories in the graph. Recall that ϕ is the latent node attribute indicator. The probabilistic description of nodes and edges is given by \(p(\phi ;l)=\mathop{\prod }\nolimits_{u = 1}^{N}\mathop{\prod }\nolimits_{q = 1}^{{M}^{K}}{\left({l}_{q}^{(K)}\right)}^{{\phi }_{uq}}\) and \(p(R|\phi ;p)={\prod }_{u\,{ < }\,v}\mathop{\prod }\nolimits_{q,h = 1}^{{M}^{K}}{\left({p}_{qh}^{(K)}({r}_{uv})\right)}^{{\phi }_{uq}{\phi }_{vh}}\). Here, we focus on undirected graphs without self-loops; directed graphs and graphs containing self-loops can be accommodated by changing the product range over u, v. In this work, we view M and K as hyperparameters; they are selected prior to the inference procedure.
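The generative description above can be sketched as a sampler. For brevity, this hypothetical illustration draws a single (binary) weight category rather than the full categorical edge weights \({p}_{qh}^{(K)}(r)\):

```python
import numpy as np

def sample_wmgm(l1, p1, K, n_nodes, seed=0):
    """Draw node attributes phi_u ~ categorical(l^{(K)}) and undirected
    edges with probability p^{(K)}_{phi_u, phi_v}, where l^{(K)} and
    p^{(K)} are K-fold Kronecker powers of the base measure (l1, p1)."""
    rng = np.random.default_rng(seed)
    lK, pK = np.array(l1), np.array(p1)
    for _ in range(K - 1):
        lK = np.kron(lK, l1)
        pK = np.kron(pK, p1)
    phi = rng.choice(len(lK), size=n_nodes, p=lK)   # latent attributes
    A = np.zeros((n_nodes, n_nodes), dtype=int)
    for u in range(n_nodes):
        for v in range(u + 1, n_nodes):             # undirected, no self-loops
            if rng.random() < pK[phi[u], phi[v]]:
                A[u, v] = A[v, u] = 1
    return phi, A
```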
Given an observed weighted network, we seek to estimate the underlying network generating function θ^{(1)} by maximizing the likelihood function \({{{{{\mathscr{L}}}}}}(\theta )\) on the left of the following:

$${{{{{\mathscr{L}}}}}}(\theta )={{{{\mathrm{log}}}}}\,\sum\limits_{\phi }p(R,\phi ;\theta )\ \ge \ {{\mathbb{E}}}_{Q}\left\{{{{{\mathrm{log}}}}}\,\frac{p(R,\phi ;\theta )}{Q(\phi )}\right\}.$$
However, the summation over ϕ makes the log-likelihood \({{{{{\mathscr{L}}}}}}(\theta )\) intractable. Therefore, instead of maximizing the log-likelihood directly, we aim to maximize the evidence lower bound \({{\mathbb{E}}}_{Q}\left\{{{{{\mathrm{log}}}}}\,\frac{p(R,\phi ;\theta )}{Q(\phi )}\right\}\) (the right-hand side above). To minimize the gap between the log-likelihood and its lower bound, which is the Kullback–Leibler divergence of Q(ϕ) from P(ϕ∣R; θ), we choose the distribution over ϕ to be \(Q(\phi )=\mathop{\prod }\nolimits_{u = 1}^{N}\mathop{\prod }\nolimits_{q = 1}^{{M}^{K}}{{\tau }_{uq}}^{{\phi }_{uq}}\), where the variational parameters τ_{uq} measure the soft assignments of node u and satisfy \(\mathop{\sum }\nolimits_{q = 1}^{{M}^{K}}{\tau }_{uq}=1\) for u = 1…N. This is known as the mean-field approach in variational inference^{34}. The lower bound of \({{{{{\mathscr{L}}}}}}(\theta )\) can therefore be computed as
Algorithm 1: Reconstructing the WMGM through a variational EM algorithm
input: N, M, K and adjacency matrix R in weight category
output: p^{(1)}, l^{(1)}
parameter sweep: M = 1, 2, 3 and K = 1, 2, 3, 4
initialization: p^{(1)}, l^{(1)}, τ
repeat
Estep: update τ by p^{(1)} and l^{(1)}
repeat
\({\tau }_{uq}\leftarrow {\lambda }_{u}{l}_{q}^{(K)}\exp \left\{\mathop{\sum }\nolimits_{v\ne u}\mathop{\sum }\nolimits_{h}{\tau }_{vh}{{{{\mathrm{log}}}}}\,\left({p}_{qh}^{(K)}({r}_{uv})\right)\right\}\)
until τ converges;
Mstep: update p^{(1)} and l^{(1)} by τ
\({l}_{i}^{(1)}=\frac{1}{NK}\mathop{\sum }\nolimits_{u,q}{\tau }_{uq}\mathop{\sum }\nolimits_{k = 1}^{K}{\mathbb{1}}\left\{q(k)=i\right\}\)
repeat
\({p}_{ij}^{(1)}\leftarrow {p}_{ij}^{(1)}+\gamma \frac{\partial {{{{{{\mathscr{L}}}}}}}_{Q}(\theta )}{\partial {p}_{ij}^{(1)}}\)
until p^{(1)} converges;
until loglikelihood converges;
The inference procedure is then performed via variational expectation maximization (EM), as shown in Algorithm 1. In the E-step, given the parameters l^{(1)} and p^{(1)}, we maximize Eq. (3) with respect to τ. We update τ_{uq} by a fixed-point iteration following a strategy similar to that in^{34}. In Algorithm 1, λ_{u} is the normalization factor enforcing the constraint \(\mathop{\sum }\nolimits_{q = 1}^{{M}^{K}}{\tau }_{uq}=1\). In the M-step, with τ obtained from the E-step, we maximize \({{{{{{\mathscr{L}}}}}}}_{Q}(\theta ,R)\) with respect to the parameters l^{(1)} and p^{(1)}. In Eq. (3), the terms involving l and p appear separately; therefore, they are updated independently. \({l}_{i}^{(1)}\) can be computed analytically by setting the partial derivatives to zero, while the \({p}_{ij}^{(1)}\) are computed numerically by the gradient method with step length γ. Here, q(k) and h(k) denote the kth indices obtained by decomposing q and h (a 'reverse' of taking Kronecker products), with \(q(k)=\left(\lfloor \frac{q-1}{{M}^{k-1}}\rfloor \,{{{{\mathrm{mod}}}}}\,\,M\right)+1\). To avoid confusion between θ^{(1)} and θ^{(K)}, we use i and j for indices in θ^{(1)} and q, h for indices in θ^{(K)}. Finally, we note that in this section, all summations over nodes u and v run from 1 to N, summations over i, j run from 1 to M, and summations over q, h run from 1 to M^{K}. Note that M and K are treated as hyperparameters in the variational EM framework; we analyze the choice of the best hyperparameters M, K in Supplementary Notes 4 and 6.
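The index decomposition q(k) and one fixed-point sweep of the E-step can be sketched as follows; the array layout of `logpK_r` (precomputed log \({p}_{qh}^{(K)}({r}_{uv})\) values) is our assumption for illustration:

```python
import numpy as np

def decompose_index(q, M, K):
    """'Reverse' Kronecker indexing: return the K base-level indices
    (1-based) q(1..K) with q(k) = (floor((q-1) / M**(k-1)) mod M) + 1."""
    return [((q - 1) // M ** (k - 1)) % M + 1 for k in range(1, K + 1)]

def e_step(tau, lK, logpK_r):
    """One fixed-point sweep of the variational E-step:
    tau_uq proportional to l_q * exp(sum_{v!=u} sum_h tau_vh log p_qh(r_uv)),
    where logpK_r[u, v, q, h] holds log p^{(K)}_{qh}(r_uv)."""
    N, Q = tau.shape
    new = np.empty_like(tau)
    for u in range(N):
        # accumulate sum_{v!=u} sum_h tau_vh * log p_qh(r_uv) for each q
        s = np.zeros(Q)
        for v in range(N):
            if v != u:
                s += logpK_r[u, v] @ tau[v]
        logit = np.log(lK) + s
        logit -= logit.max()          # stabilize before exponentiation
        new[u] = np.exp(logit)
        new[u] /= new[u].sum()        # lambda_u normalization
    return new
```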
Multifractal analysis of WMGM
Next, we use the proposed WMGM to analytically compute statistical-physics-inspired and multifractal metrics, such as the partition function, the Lipschitz–Hölder exponent, and the multifractal spectrum. For simplicity, we first reshape the linking probabilities \({\left\{{p}_{ij}^{(1)}\right\}}_{i,j = 1:M}\) in θ^{(1)} as \({\left\{{p}_{i}\right\}}_{i = 1:{M}^{2}}\). We also reshape the areas of the subrectangles in the unit square \({\left\{{l}_{i}^{(1)}{l}_{j}^{(1)}\right\}}_{i,j = 1:M}\) as \({\left\{{a}_{i}\right\}}_{i = 1:{M}^{2}}\). Following^{20,35}, the partition function of the model at an average subblock size \(\epsilon ={(\frac{1}{M})}^{2K}\) can be written as
In Eq. (4), \(\left(\begin{array}{cc}{K}\\ {{k}_{1}\ldots {k}_{j}\ldots {k}_{{M}^{2}}}\end{array}\right)\) is the number of subblocks that share the same area \(\mathop{\prod }\nolimits_{i = 1}^{{M}^{2}}{{a}_{i}}^{{k}_{i}}\) and linking probability \(\mathop{\prod }\nolimits_{i = 1}^{{M}^{2}}{{p}_{i}}^{{k}_{i}}\). The \({\left\{{k}_{i}\right\}}_{i = 1:{M}^{2}}\) are subject to \(\mathop{\sum }\nolimits_{i = 1}^{{M}^{2}}{k}_{i}=K\), and \(\mathop{\prod }\nolimits_{i = 1}^{{M}^{2}}{[{a}_{i}{p}_{i}]}^{{k}_{i}}\) is the proportion of edges generated under the corresponding linking probability in those subblocks.
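Because p^{(K)} and the box areas are Kronecker powers of the base measures, the partition function can be evaluated by taking Kronecker powers of the M^2 base box measures, which implicitly realizes the multinomial expansion in Eq. (4); normalizing the box measure to sum to one is our convention here:

```python
import numpy as np

def partition_function(p1, l1, K, q):
    """Z_eps(q) = sum over the M^K x M^K subblocks of the normalized
    box measure mu^q, with mu proportional to (area * link probability)."""
    a1 = np.outer(l1, l1).ravel()       # base box areas a_i
    p1f = np.asarray(p1).ravel()        # base linking probabilities p_i
    mu = a1 * p1f
    # K-fold Kronecker product of the M^2 base box measures
    m = mu.copy()
    for _ in range(K - 1):
        m = np.kron(m, mu)
    m = m / m.sum()                     # normalize to a probability measure
    m = m[m > 0]                        # drop empty boxes before powering
    return np.sum(m ** q)
```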
In the multifractal analysis, the multifractal metrics are calculated from the partition function Z_{ϵ}(q), where q is the order of the moment. The mass exponent is given by

$$\tau (q)=\mathop{{{\mathrm{lim}}}}\limits_{\epsilon \to 0}\frac{{{{{\mathrm{log}}}}}\,{Z}_{\epsilon }(q)}{{{{{\mathrm{log}}}}}\,\epsilon }.$$
The Lipschitz–Hölder exponent (also referred to as the coarse Hölder exponent or singularity index in other scientific works) is defined as \(\alpha (q)=\frac{{{{\rm{d}}}}\tau (q)}{{{{\rm{d}}}}q}\). The multifractal spectrum reads f(α) = α(q)q − τ(q). Here, we provide the expression for the Lipschitz–Hölder exponent:
When the order of the moment q ranges over q = −q_{0}: dq: q_{0}, the width of the multifractal spectrum can be defined and calculated as dα = α(q)_{max} − α(q)_{min} = α(−q_{0}) − α(q_{0}), since α(q) is non-increasing in q. The center of the multifractal spectrum is located at α_{center} = α(0).
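A numerical sketch of the spectrum width and center, estimating α(q) by finite differences of τ(q) ≈ log Z_ε(q)/log ε over the grid q = −q_0 : dq : q_0 (the normalization of the box measure is our assumption):

```python
import numpy as np

def _partition_function(p1, l1, K, q):
    """Z_eps(q) via K-fold Kronecker powers of the normalized box measure."""
    mu = (np.outer(l1, l1) * np.asarray(p1)).ravel()
    m = mu.copy()
    for _ in range(K - 1):
        m = np.kron(m, mu)
    m = m / m.sum()
    m = m[m > 0]
    return np.sum(m ** q)

def spectrum_width(p1, l1, K, q0=10.0, dq=0.1):
    """Estimate alpha(q) = d tau / d q by finite differences of
    tau(q) = log Z_eps(q) / log eps, and return the spectrum width
    alpha(-q0) - alpha(q0) together with the center alpha(0)."""
    M = len(l1)
    eps = (1.0 / M) ** (2 * K)           # average subblock size
    qs = np.arange(-q0, q0 + dq, dq)
    tau = np.array([np.log(_partition_function(p1, l1, K, q)) for q in qs])
    tau /= np.log(eps)
    alpha = np.gradient(tau, qs)         # alpha(q) = d tau / d q
    width = alpha[0] - alpha[-1]         # alpha(-q0) - alpha(q0)
    center = alpha[np.argmin(np.abs(qs))]
    return width, center
```

For a uniform generator (a monofractal), the width collapses to zero and the spectrum degenerates to a single point.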
Data availability
The data supporting the results of this study are from the public Drosophila connectome dataset (https://www.janelia.org/project-team/flyem/hemibrain) and the Hi-C chromosomal interaction dataset (https://aidenlab.org/juicebox/).
Code availability
Source code is available at https://github.com/ruocheny/WeightedMultifractalGraphModel.
References
Erdős, P. & Rényi, A. On the evolution of random graphs. Publ. Math. Inst. Hung. Acad. Sci 5, 17–60 (1960).
Barabási, A.L. & Albert, R. Emergence of scaling in random networks. Science 286, 509–512 (1999).
Watts, D. J. & Strogatz, S. H. Collective dynamics of ‘smallworld’ networks. Nature 393, 440 (1998).
Song, C., Havlin, S. & Makse, H. A. Selfsimilarity of complex networks. Nature 433, 392 (2005).
Gallos, L. K., Song, C. & Makse, H. A. A review of fractality and selfsimilarity in complex networks. Phys. A: Statistical Mech. Appl. 386, 686–691 (2007).
Xue, Y. & Bogdan, P. Reliable multifractal characterization of weighted complex networks: algorithms and implications. Sci. Rep. 7, 7487 (2017).
Lynn, C. W. & Bassett, D. S. The physics of brain network structure, function and control. Nat. Rev. Phys. 1, 318 (2019).
Rutledge, M. T., Russo, M., Belton, J.M., Dekker, J. & Broach, J. R. The yeast genome undergoes significant topological reorganization in quiescence. Nucl. Acids Res. 43, 8299–8313 (2015).
Pigolotti, S., Jensen, M. H., Zhan, Y. & Tiana, G. Bifractal nature of chromosome contact maps. Phys. Rev. Res. 2, 043078 (2020).
Leskovec, J., Chakrabarti, D., Kleinberg, J., Faloutsos, C. & Ghahramani, Z. Kronecker graphs: An approach to modeling networks. J. Mach. Learning Res. 11, 985–1042 (2010).
Ravasz, E. & Barabási, A.L. Hierarchical organization in complex networks. Phys. Rev. E 67, 026112 (2003).
Ravasz, E., Somera, A. L., Mongru, D. A., Oltvai, Z. N. & Barabási, A.L. Hierarchical organization of modularity in metabolic networks. Science 297, 1551–1555 (2002).
Mandelbrot, B. B. The Fractal Geometry of Nature, Vol. 173 (WH Freeman, 1983).
Mandelbrot, B. B. in Fractals in Geophysics, (eds Scholz, C. H. & Mandelbrot, B. B.) 5–42 (Springer, 1989).
DanLing, W., ZuGuo, Y. & Anh, V. Multifractal analysis of complex networks. Chin. Phys. B 21, 080504 (2012).
Yin, C. et al. Network science characteristics of brainderived neuronal cultures deciphered from quantitative phase imaging data. Sci. Rep. 10, 1–13 (2020).
Vázquez, A., Flammini, A., Maritan, A. & Vespignani, A. Modeling of protein interaction networks. Complexus 1, 38–44 (2003).
Song, C., Havlin, S. & Makse, H. A. Origins of fractality in the growth of complex networks. Nat. Phys. 2, 275 (2006).
Song, C., Gallos, L. K., Havlin, S. & Makse, H. A. How to calculate the fractal dimension of a complex network: the box covering algorithm. J. Statistical Mech.: Theory Exp. 2007, P03006 (2007).
Yang, R. & Bogdan, P. Controlling the multifractal generating measures of complex networks. Sci. Rep. 10, 1–13 (2020).
Dorogovtsev, S. N., Goltsev, A. V. & Mendes, J. F. F. Pseudofractal scalefree web. Phys. Rev. E 65, 066122 (2002).
Dorogovtsev, S. N., Mendes, J. F. F. & Samukhin, A. Multifractal properties of growing networks. EPL (Europhys. Lett.) 57, 334 (2002).
Palla, G., Lovász, L. & Vicsek, T. Multifractal network generator. Proc. Natl Acad. Sci. 107, 7640–7645 (2010).
Leskovec, J., Chakrabarti, D., Kleinberg, J. & Faloutsos, C. Realistic, mathematically tractable graph generation and evolution, using kronecker multiplication. In European Conference on Principles of Data Mining and Knowledge Discovery (eds Jorge, A., Torgo, L., Brazdil, P., Camacho, R. & Gama, J.) 133–145 (Springer, 2005).
Leskovec, J. & Faloutsos, C. Scalable modeling of real graphs using kronecker multiplication. In Proc. 24th International Conference on Machine learning (ed. Ghahramani, Z.) 497–504 (ACM, 2007).
Kim, M. & Leskovec, J. Modeling social networks with node attributes using the multiplicative attribute graph model. In Proc. Twenty-Seventh Conference on Uncertainty in Artificial Intelligence 400–409 (AUAI Press, 2011).
Kim, M. & Leskovec, J. Multiplicative attribute graph model of realworld networks. Internet Math. 8, 113–160 (2012).
Xu, C. S. et al. A connectome of the adult Drosophila central brain. BioRxiv https://doi.org/10.1101/2020.01.21.911859 (2020).
Mitter, M. et al. Conformation of sister chromatids in the replicated human genome. Nature 586, 139–144 (2020).
Erdős, P. & Rényi, A. On random graphs. Publ. Math. Debrecen 6, 290–297 (1959).
Pal, K., Forcato, M. & Ferrari, F. Hi-C analysis: from data generation to integration. Biophys. Rev. 11, 67–78 (2019).
Robinson, J. T. et al. Juicebox.js provides a cloud-based visualization system for Hi-C data. Cell Systems 6, 256–258 (2018).
Betzel, R. F. & Bassett, D. S. Multiscale brain networks. Neuroimage 160, 73–83 (2017).
Daudin, J.J., Picard, F. & Robin, S. A mixture model for random graphs. Statistics Comput. 18, 173–183 (2008).
Cheng, Q. Generalized binomial multiplicative cascade processes and asymmetrical multifractal distributions. Nonlinear Process. Geophys. 21, 477–487 (2014).
Acknowledgements
The authors gratefully acknowledge the support by the National Science Foundation Career award under Grant No. CPS/CNS1453860, the NSF award under Grant CCF1837131, MCB1936775, CNS1932620, the U.S. Army Research Office (ARO) under Grant No. W911NF1710076, the Okawa Foundation award, and the Defense Advanced Research Projects Agency (DARPA) Young Faculty Award and DARPA Director Award under Grant No. N660011714044, and a Northrop Grumman grant. The views, opinions, and/or findings contained in this article are those of the authors and should not be interpreted as representing the official views or policies, either expressed or implied by the Defense Advanced Research Projects Agency, the Air Force Research Lab, the Department of Defense or the National Science Foundation.
Author information
Authors and Affiliations
Contributions
R.Y., F.S., and P.B. designed the research study. R.Y. and F.S. wrote the code and conducted the simulations. All authors contributed to the applications, results analysis, and manuscript writing.
Corresponding author
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Peer review information Communications Physics thanks Arian Ashourvan and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Yang, R., Sala, F. & Bogdan, P. Hidden network generating rules from partially observed complex networks. Commun Phys 4, 199 (2021). https://doi.org/10.1038/s42005021007015