In the last decade, the increasing availability of data at the industry and firm level led to a vast number of studies analyzing the system of customer-supplier trade relationships—the production network—among industries1,2,3,4,5,6 or firms7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39 and their impact on country-level macroeconomic statistics40.

The heterogeneity encoded in the production network structure plays an essential role in amplifying economic growth34 and in the propagation of shocks1,12 related to exogenous events, such as Hurricane Sandy29, the Great East Asian Earthquake11,27, the Covid-19 pandemic6,17,28, or endogenous events such as the 2008 financial crisis41,42.

Even in the age of globalization—characterized by highly interconnected global supply chains—domestic production networks remain relevant. In fact, it has been shown that for a small country such as Belgium, while almost all firms directly or indirectly import from and export to foreign firms, these exchanges account for only a minority of domestic firms’ total revenues16.

While aggregated information about single firms is contained in most National Statistical Institutes’ repositories, reliable data on input/output relationships are available only for a small number of countries. For instance, the Compustat dataset contains the major customers of publicly listed firms in the USA8. The FactSet Revere dataset contains major customers of publicly listed firms at a global level, with a focus on the USA, Europe, and Asia30. Two datasets are commercially available in Japan, namely the one collected by Tokyo Shoko Research Ltd. (TSR)11 and the one collected by Teikoku DataBank Inc. (TDB)35; they cover a large share of Japanese firms but record only a limited number of commercial partners per firm. Other domestic datasets contain transaction values among VAT-liable firms, as is the case for Brazil14, Belgium15, Hungary17, Ecuador7, Kenya19, Turkey20, Spain21, Rwanda and Uganda22, and West Bengal23, or among the totality of registered domestic firms, as in the Dominican Republic18 and Costa Rica43.

Figure 1

(a) Graphical representation of the Dutch multi-layer production network. For illustrative purposes, we represent three industries/firms i, j and k as nodes, all placed on three commodity group layers, namely (from top to bottom): cereals, beer/malt, bread and other bakery products. The connections between the same three nodes are different in the different layers. (b) The possible 13 types of connected triadic subgraphs. Each triple of industries/firms can trade different products by forming, on each commodity-specific layer, either one of the 13 possible connected subgraphs or one of the remaining subgraphs where at least one node is disconnected (not shown).

However, in production networks, user firms connect to supplier firms to buy goods for their own production. Customer-supplier relationships are, hence, characterized by an intrinsic product granularity that is usually neglected. The importance of product-specific information has been highlighted, for instance, in a rare study that utilizes surveys with limited data for Japanese automotive firms44. Generally, in the economic theory of industries and firms, the problem of product granularity is ‘solved’ artificially by assuming that industries/firms supply a single product1,4. This is an oversimplification that often conflicts with reality: a single firm can possess more than one production pipeline and is capable of supplying multiple products (e.g., Samsung, a telecommunications company, also sells household appliances, and multinational companies such as Amazon and Google supply a large number of different products).

Recently, Statistics Netherlands (CBS) reconstructed from national statistics two multi-layer production network datasets for the domestic intermediate trade of Dutch firms, for 201225 and 201810, with each layer corresponding to a different product exchanged by a firm for its own production process, as illustratively depicted in Fig. 1a. The 2012 dataset has recently been used to prove the complementarity structure of production networks33 by comparing the number of cycles of order 3 and 4 with a null model that takes into account the in-degree and out-degree distributions. We use the improved 2018 version and construct an inter-industry network that will be presented in the next section.

In this study, we focus on triadic motifs and anti-motifs, i.e., patterns of directed triadic connections that occur, respectively, more or less often than expected. They are represented in Fig. 1b. Triadic and tetradic connections are known to be the building blocks of complex networks45: they play the role of functional modules or evolutionary signatures in biological networks46,47, homophily-driven connections in social networks48, and complementarity-driven structures in production networks33,36; their change in time has been interpreted as a self-organizing process in the World Trade Web (WTW)49,50 and as an early-warning signal of topological collapse in inter-bank networks51,52 and stock market networks42. It has been proven that for the majority of (available) real-world networks the triadic structure is maximally random53 and that, once it is fixed, their global structure is statistically determined54.

In contrast, research on weighted motifs and anti-motifs is still underdeveloped. To our knowledge, only one study involves trade volumes circulating on triadic subgraphs, using a probabilistic model based on random walks on the WTW55.

Motif detection strictly depends not only on the properties of the real network but also on the randomization method used to compute random expectations. In the network science literature, various methods have been advanced for network randomization, primarily edge-stub methods, edge-swapping methods, and maximum-entropy methods; here we focus on the latter. Randomization methods based on entropy maximization56,57,58 build graph probability distributions that are maximally random by construction. Available global or node-specific data are encoded as constraints in the optimization procedure, and the corresponding Lagrange multipliers are computed by maximum likelihood estimation (MLE)59. This theoretical framework has been proven to successfully reconstruct economic and financial systems60,61,62,63, statistically predicting both the topology and the weights of the WTW64,65,66, in an integrated67 or conditional fashion68, with only structural constraints or informing the models with economic factors69,70,71, statistically predicting banks’ risk exposures72, and, most recently, statistically reconstructing payment flows among Dutch firms that were clients of ABN Amro Bank or ING Bank by constraining their industry-specific production functions26. These methods have been shown by independent tests73,74,75,76 to provide the best guarantee of unbiasedness with respect to missing data. Two studies using maximum-entropy modeling are especially worthy of note for motif detection: a theoretical study where the authors develop null models for triadic motif detection and compute z-scores of triadic occurrences analytically77, and an applied study where triadic motifs and their time evolution are used as early warnings of topological collapse during the 2008 financial crisis51.

Our contribution goes in this direction: we use maximum-entropy methods constraining degree and strength distributions—in their directed form and taking into account their reciprocal nature—to characterize triadic connections and the total money circulating on them for the different product layers of the reconstructed Dutch production network. An analysis of this kind can give better insight into how much product-level granularity is needed in production network datasets and how the links and weights of a production network are organized for different products. Once product-layer patterns have been detected, National Bureau officials—having experience in the domestic trade of that single commodity—can infer whether such motifs and anti-motifs are due to commodity-specific characteristics or market imbalances, or represent structures aided by laws. If imbalances and anomalies are detected, executive government agencies can use this as input to propose policies that nudge a more desirable redistribution of connections and trade volumes.

Results

The CBS production network

The CBS production network for 201810 improves on the 2012 version by integrating more auxiliary micro- and industry-level data. Before going into detail, it is helpful to explain the industry and product classifications used by Statistics Netherlands. Industries are classified using the Dutch Standard Industrial Classification (SBI), which is equivalent to the European classification NACE Rev. 2 in the first two digits, although the subsequent digits can differ. Statistics Netherlands has industry data at two different levels: the SBI4 level, containing 132 industries, and the SBI5 level, corresponding to 888 industries. Regarding the CPA product classification, Statistics Netherlands uses a modified version of the original European CPA, mainly at 4 and 6 digits. In the data, we retrieved 192 commodities at the 4-digit level and 623 commodities at the 6-digit level.

Firm-level data are obtained from the Statistical Business Register (SBR) for 2018 for over 1,700,000 firms. The SBR contains values for net turnover, geographical location, business id, and business sector at the SBI5 classification level. After removing micro-firms with annual net turnover below 10,000 €, around 900,000 firms remain, accounting for \(99.5\%\) of the Dutch economy output in 2018. Details regarding the breakdown of output and input at the commodity level are primarily available at the industry level and for a limited number of firms. The Dutch National Supply-Use tables provide data on inter-industry and intra-industry intermediate input/output transactions for various commodities, classified at the Dutch CPA 4-digit level, with industries categorized according to the SBI4 level.

While industry-wide transactions are validated, estimating output and input for individual firms per commodity and matching suppliers with users within commodity layers remains a challenge. To estimate supply per firm, domestic turnover, calculated as VAT turnover minus export turnover, is employed as a distributional key. Firms are assumed to supply in proportion to the ratio of their domestic turnover to the overall industry turnover. Additional adjustments are made for wholesale and retail trade firms to account only for domestic turnover associated with actual production.

Estimating use per firm from VAT turnover data involves determining the ratio between intermediate use and turnover. This ratio is estimated using SBS survey data. The breakdown of supply/use per firm at the commodity level is available for a relatively large number of firms through surveys conducted by structural business statistics (SBS) for commercial firms, Prodcom for manufacturing firms, and estimates generated by National Accounts for non-commercial firms.

Specifically, SBS provides a breakdown of sales and intermediate purchases into ten to twenty commodity categories for small firms and at the CPA-classification level for large firms. Prodcom conducts a similar survey. SBS categories are then mapped into CPA commodities by National Account experts.

For firms not covered by the aforementioned surveys, the breakdown of intermediate supply/use into commodities is estimated using the distribution of the industry as a whole from the supply-use tables. This approximation can result in implausible values of annual supply and use. To address this issue, thresholds are imposed, setting supply values below 2000 € and use values below 1000 € to zero. Finally, an iterative proportional fitting (IPF) procedure is implemented to ensure consistency with the industry-level tables.
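For readers unfamiliar with IPF, a minimal sketch of the generic procedure is given below; this is the textbook bi-proportional scaling algorithm, not the exact CBS implementation, and the variable names are ours. A non-negative matrix is alternately rescaled so that its row and column sums converge to prescribed targets (here, the industry-level supply-use totals).

```python
import numpy as np

def ipf(M, row_targets, col_targets, n_iter=1000, tol=1e-9):
    """Generic iterative proportional fitting: rescale a non-negative matrix M so
    that its row and column sums converge to the prescribed targets (assumed to
    be consistent, i.e. row_targets.sum() == col_targets.sum())."""
    X = np.asarray(M, dtype=float).copy()
    row_targets = np.asarray(row_targets, dtype=float)
    col_targets = np.asarray(col_targets, dtype=float)
    for _ in range(n_iter):
        row_sums = X.sum(axis=1, keepdims=True)
        X *= np.divide(row_targets[:, None], row_sums,
                       out=np.zeros_like(row_sums), where=row_sums > 0)
        col_sums = X.sum(axis=0, keepdims=True)
        X *= np.divide(col_targets[None, :], col_sums,
                       out=np.zeros_like(col_sums), where=col_sums > 0)
        if (np.abs(X.sum(axis=1) - row_targets).max() < tol
                and np.abs(X.sum(axis=0) - col_targets).max() < tol):
            break
    return X
```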

Once supply and use per firm per commodity are obtained, the out-degree distribution is estimated using stylized facts from Japanese firms39, connecting out-degrees to firm sizes through a power-law function, while the in-degree distribution is estimated assuming a power-law relation to firm-specific input at the commodity level, an assumption consistent with recent studies on Dutch inter-firm payments26.
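Purely as an illustration of this kind of assignment, the sketch below maps firm sizes to out-degrees through a power law; the exponent and scale are hypothetical placeholders, not the calibrated values used by CBS.

```python
import numpy as np

def powerlaw_out_degrees(turnover, gamma=0.3, scale=1.0):
    """Toy power-law map from firm size (domestic turnover) to out-degree.

    gamma and scale are hypothetical placeholders; the actual parameters are
    calibrated on stylized facts from Japanese firm-level data (ref. 39).
    """
    k = scale * np.power(np.asarray(turnover, dtype=float), gamma)
    return np.maximum(np.rint(k).astype(int), 1)   # at least one customer
```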

Once in-degree and out-degree distributions per commodity are estimated for each firm, suppliers and users are matched according to a deterministic procedure that takes into account (1) a company score, encoding their net turnover, (2) a distance score, taking into account their mutual distance, (3) the presence of a link between the respective industries in the supply-use tables, and (4) the presence of the observed relationship in the Dun and Bradstreet dataset, i.e. a dataset containing the list of the users of the 500 largest suppliers in the Dutch economy. After the computation of the related ‘link score’, users in each commodity layer are ordered according to their purchase volumes. The top user then selects the best X suppliers and establishes a connection with them, where X represents its commodity-specific in-degree. The procedure continues from the second-highest purchasing-volume user to the last, until no available links remain and the degree distributions are reproduced; a simplified sketch of this greedy logic is given below. Network weights are then distributed across the generated links according to a power-law distribution. Finally, the resulting weighted inter-firm network at the 650 commodity level (National CPA level 6) is compared to the Supply-Use tables (National CPA level 4) and consequent adjustments are made to weights and links. Further details can be found in10.
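A highly simplified sketch of this greedy matching logic follows; the score function and data structures are hypothetical stand-ins, and the actual CBS procedure combines the four criteria listed above and further adjusts links and weights against the supply-use tables.

```python
def match_users_to_suppliers(users, suppliers, in_degree, out_capacity, link_score):
    """Greedy sketch: users, ordered by decreasing purchase volume, pick their
    best-scoring suppliers until their commodity-specific in-degree is met and
    suppliers run out of residual out-degree capacity.

    link_score(u, s) is a hypothetical stand-in for the combined company-score /
    distance / supply-use-table / Dun & Bradstreet criterion described above.
    """
    remaining = dict(out_capacity)              # residual out-degree per supplier
    links = []
    for u in users:                             # assumed sorted by purchase volume
        ranked = sorted((s for s in suppliers if remaining.get(s, 0) > 0),
                        key=lambda s: link_score(u, s), reverse=True)
        for s in ranked[:in_degree[u]]:         # the best X available suppliers
            links.append((s, u))
            remaining[s] -= 1
    return links
```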

We aggregate the inter-firm network at the commodity level, passing from 623 commodities (CPA level 6) to 192 commodities (CPA level 4, compatible with the supply-use tables). Then, we aggregate firms into industries at the SBI5 level, taking their business sector ids from the SBR. For the topic of interest, the self-loops implied by intra-industry trade are not important and can be removed from the dataset without adversely affecting the subsequent analysis. After removing intra-industry trade, we obtain a multi-layer inter-industry production network containing linkages and weights for 862 industries (nodes) and 187 commodity groups (layers).

The firm-level reconstructed dataset is not without limitations. One source of error arises from the breakdown provided by the SBS and Prodcom surveys, particularly regarding the documented intermediate purchases and sales: the purchases may include imports, and the sales may also include sales for final consumptive use. Another source of error stems from the assumptions made about firm out-degree and in-degree distributions. While these assumptions are supported by stylized facts from Japanese firms (for out-degrees) and payment data from a large sample of Dutch firms (for in-degrees), it cannot be assumed that the parameters used in the reconstruction are universally applicable or representative of ‘true values’. Finally, the matching procedure results in a deterministic network where the ‘best’ users have priority in connecting with their more closely aligned suppliers. This algorithm cannot account for noisy behavior or real-world uncertainties. In fact, for the 2012 version, with similar assumptions on degree distributions and the same matching algorithm, it has been demonstrated that these assumptions lead to biases in core network statistics, such as the number of links in commodity layers37, when compared with the ground truth provided by a known sample of firm-to-firm connections collected by Dun and Bradstreet (for 2012). While aggregation at the SBI5 level is bound to reduce the biases that arise at the firm level, it is still not clear how much the results are impacted by the propagation of these errors. Further discussion of limitations is provided at the beginning of the “Discussion” section.

Network randomization methods

The main goal of network randomization methods is the generation of a statistical ensemble of networks that are maximally random given the available data. In our case, we randomize each product layer of our multi-layer industry network separately using maximum-entropy methods. The available data—encoded as constraints in the entropy maximization—consist of each supplier’s (user’s) tendency to supply (use) a specific commodity and its output (input). The obtained statistical ensemble of networks represents the possible realizations of the system taking into account suppliers’ and users’ tendencies. After the generation of the synthetic ensemble of networks, it is possible to extract metrics of interest as ensemble averages.

The null models we take into account are the directed binary configuration model (DBCM)65 and the reciprocal binary configuration model (RBCM)77 for the estimation of network links, and the conditional reconstruction method A (CReM\(_{A}\))68 and the newly developed conditionally reciprocal weighted configuration model (CRWCM) for the conditional estimation of network weights. The DBCM maximizes the Shannon entropy attached to the distribution of possible binary adjacency matrices, given that the in-degree and out-degree distributions are constrained on average. The RBCM is also used for the estimation of links by maximizing the Shannon entropy attached to the distribution of possible binary adjacency matrices, but it makes use of additional information, namely the non-reciprocated out-degree, non-reciprocated in-degree, and reciprocated degree distributions. These metrics are obtained by distinguishing reciprocated links from non-reciprocated ones and summing over them. Turning our attention to weighted networks, the CReM\(_{A}\) is the maximum-entropy model that maximizes the conditional Shannon entropy attached to the distribution of weighted networks, given the realization of the adjacency matrix A. The constraints used in the conditional optimization are the out-strength and in-strength distributions, corresponding to the sum of weights leaving and entering a node, respectively. The CRWCM, instead, is an augmented version of CReM\(_{A}\), which better accounts for reciprocation by constraining the out-strength and in-strength distributions separately for reciprocated and non-reciprocated links. Both CReM\(_{A}\) and CRWCM are estimated using an annealed approach, following68,80, and subsequently coupled with the corresponding binary model. Specifically, directionality is encoded in the DBCM\(+\)CReM\(_{A}\) model, also denoted as the directed model, while directional and reciprocal information is encoded in the RBCM\(+\)CRWCM model, denoted as the reciprocated model. For further information and the mathematical generation of link and weight distributions, please refer to the “Methods” section.

Measuring empirical reciprocity statistics

Table 1 Description of the distribution of statistics such as the number of active industries N, the number of links L, the total weight \(W_{tot}\), the topological reciprocity \(r_{t}\) and the weighted reciprocity \(r_{w}\) across commodity layers of the inter-industry network.

The presence of data on product granularity gives us the opportunity to study heterogeneity across commodity layers. Let us consider in Table 1 the number of layer-active industries N, the number of links L, the total weight \(W_{tot}\), and reciprocity measures such as the topological reciprocity \(r_{t}\), defined as the ratio of reciprocated links to L, i.e.

$$\begin{aligned} r_{t} = \dfrac{L^{\leftrightarrow }}{L} = \dfrac{\sum _{i,j \ne i}a_{ij}^{\leftrightarrow }}{\sum _{i,j \ne i}a_{ij}}, \end{aligned}$$
(1)

and its weighted counterpart \(r_{w}\), defined as the ratio of the total weight on reciprocated links to \(W_{tot}\), i.e.

$$\begin{aligned} r_{w} = \dfrac{W_{tot}^{\leftrightarrow }}{W_{tot}} = \dfrac{\sum _{i,j \ne i}w_{ij}^{\leftrightarrow , out}}{\sum _{i,j \ne i}w_{ij}}. \end{aligned}$$
(2)
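As a minimal illustration, both reciprocity measures can be computed directly from the weighted adjacency matrix of a layer; the toy matrix below is hypothetical, not CBS data.

```python
import numpy as np

def reciprocity(W):
    """Topological (r_t) and weighted (r_w) reciprocity of a directed, weighted layer.

    W[i, j] > 0 means industry i supplies industry j; the diagonal (intra-industry
    trade) is ignored, as in the main text.
    """
    W = np.asarray(W, dtype=float).copy()
    np.fill_diagonal(W, 0.0)
    A = (W > 0).astype(int)                 # binary adjacency matrix

    L = A.sum()                             # number of links
    L_recip = (A * A.T).sum()               # links whose opposite link also exists
    r_t = L_recip / L if L > 0 else np.nan

    W_tot = W.sum()                         # total weight
    W_recip = (W * (A * A.T)).sum()         # weight sitting on reciprocated links
    r_w = W_recip / W_tot if W_tot > 0 else np.nan
    return r_t, r_w

# toy example: three industries, one reciprocated dyad (0 <-> 1)
W_toy = np.array([[0.0, 2.5, 0.0],
                  [1.0, 0.0, 4.0],
                  [0.0, 0.0, 0.0]])
print(reciprocity(W_toy))   # (2/3, 3.5/7.5)
```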

The median of N is 149, meaning that for around \(50\%\) of commodity layers fewer than 149 industries are active (as suppliers or users). At the same time, \(25\%\) of commodity layers have fewer than 62 active industries, and another \(25\%\) have more than 544. Consequently, for half of the commodity groups industries are specialized in a small number of business activities, but a small, non-negligible number of layers is characterized by a high number of active industries and hence by high industry heterogeneity. One example is suppliers of plastic goods, which sell to users with heterogeneous specializations, for instance bread, beer, cereals, fish, etc. The number of commodity-specific links L and the related total weight \(W_{tot}\) also have wide distributions, with minima of 3 links and 0.95 million euro and maxima of 15,198 links and 23,767 million euro, implying a high degree of heterogeneity in network structure across commodity layers.

Passing from these layer-wide statistics to \(r_{t}\) and \(r_{w}\), we again see a high degree of heterogeneity: the minimum value of 0 corresponds to layers where no link is reciprocated, i.e. users and suppliers form two distinct sets of nodes (a bipartite graph). In the majority of the commodities (above \(75\%\)), however, reciprocity is non-zero; the medians are 0.05 and 0.08, respectively. A small number of commodities (below \(10\%\)) are characterized by large reciprocity, with a maximum of 0.78 for both \(r_t\) and \(r_w\).

Reciprocity can arise for different reasons: (1) the aggregation from firms to industries or (2) the aggregation of products. To illustrate the first case, consider two firms A and B in industry i and two other firms C and D in industry j. Suppose firm A supplies firm D while firm C supplies firm B, in the same commodity layer. Once the firms are aggregated into the related industries, a reciprocated link emerges between them, even though reciprocity is not present at the firm level.

The second case follows from the fact that if each commodity layer represented a unique product—as in the finest CPA product classification (with around 5000 products)—and we considered only intermediate supply and use, it would not be reasonable to think that firms are at the same time suppliers and users of that specific product. In the case of product aggregation, instead, firms may be suppliers of one product inside a commodity group and users of another product inside the same commodity group.

Let us now move to the analysis of triads. We define the triadic occurrence \(N_{m}\) as the number of times a specific m-type subgraph appears, and the triadic flux \(F_{m}\) as the total amount of money circulating on subgraphs of type m. In Fig. 2, we depict their values normalized by their sum across the m-types. The normalized \(N_{m}\) and \(F_{m}\) can be considered as the relative importance of a specific type of triadic subgraph in the network. The aggregated network (depicted in blue), where the weights of all commodity groups are summed, and three commodity layers, namely ‘cereals’ (in green), ‘gas/hot water/city heating’ (in orange) and ‘agricultural services’ (in pink), are displayed. In the aggregated network, the structures that occur relatively more often are \(m=1\), a supplier connected to two users, and \(m=13\), the totally reciprocated cyclical triad. While \(m=13\) is probably due to product aggregation, the predominance of \(m=1\) is a signal of structural dependency on a limited number of suppliers. When the normalized \(F_{m}\) are investigated, \(m=13\) still contains the majority of the volume. A similar profile, in the binary case, is given by agricultural services, with the predominance of \(m=1\) and \(m=13\). At the same time, a relatively smaller amount of money is concentrated on \(m=13\) with respect to the aggregated case, while \(m=1\) and \(m=11\) carry a greater amount of money. Under product disaggregation, the weights on \(m=13\) in the aggregated network are redistributed over other subgraphs, especially \(m=1\). In ‘cereals’ and ‘gas/hot water/city heating’ these differences are even larger, with a relevant increase of triadic occurrences and fluxes on \(m=1\), further increasing the dependency of the network on a limited number of suppliers. Note that the triad types counted in Fig. 2 are not nested, i.e. a subgraph of type \(m=8\) requires two reciprocated links and hence does not contain two subgraphs of type \(m=1\), which contain only non-reciprocated links. Consequently, the occurrences and fluxes of the triadic subgraphs are structurally independent across types.
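A brute-force sketch of how \(N_{m}\) and \(F_{m}\) can be computed is given below: each connected triple is reduced to a canonical code over its six possible directed edges, so that every triple contributes to exactly one isomorphism class (which guarantees the non-nestedness mentioned above). The mapping from canonical codes to the labels \(m=1,\dots ,13\) of Fig. 1b is arbitrary here and would have to be aligned with the paper’s ordering; the enumeration over all triples is only meant as an illustration, not as an efficient implementation.

```python
import numpy as np
from itertools import combinations, permutations

def triad_census(W):
    """Count connected triadic subgraphs and the money circulating on them.

    Returns two dicts keyed by a canonical triad code (an integer): the number of
    occurrences and the total flux (sum of weights on the edges of each triple).
    The codes identify isomorphism classes but are NOT the m = 1..13 labels of Fig. 1b.
    """
    W = np.asarray(W, dtype=float)
    A = (W > 0).astype(int)
    np.fill_diagonal(A, 0)
    n = A.shape[0]
    counts, fluxes = {}, {}

    for i, j, k in combinations(range(n), 3):
        nodes = (i, j, k)
        sub = A[np.ix_(nodes, nodes)]
        # skip triples where at least one node is disconnected from the other two
        if ((sub + sub.T).sum(axis=1) == 0).any():
            continue
        # canonical code: minimum over the 6 node relabelings of the 6-bit edge pattern
        code = min(
            sum(int(sub[p[a], p[b]]) << bit
                for bit, (a, b) in enumerate([(0, 1), (1, 0), (0, 2),
                                              (2, 0), (1, 2), (2, 1)]))
            for p in permutations(range(3))
        )
        flux = W[np.ix_(nodes, nodes)].sum()   # money on all edges of this triple
        counts[code] = counts.get(code, 0) + 1
        fluxes[code] = fluxes.get(code, 0.0) + flux

    return counts, fluxes
```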

Figure 2

Normalized triadic occurrences (a) and fluxes (b). The aggregated network (blue) presents a high occurrence of subgraphs \(m=1\) and \(m=13\), representing open Vs and completely reciprocated triads, respectively; the latter covers most of the total amount of money traded. The cereals commodity layer (green) shows a high occurrence of subgraph \(m=1\), with a relatively high amount of money distributed across \(m=1\), \(m=4\) and \(m=6\). The gas/hot water/city heating layer (orange) has a predominant occurrence and flux in subgraph \(m=1\). The agricultural services layer (pink) shows a highly heterogeneous spectrum of occurrences and fluxes. Completely cyclical triads have a high occurrence in the aggregated network but break apart when passing to single commodity layers such as gas/hot water/city heating and cereals, except for rare cases such as agricultural services. In single commodity layers \(m=1\) receives the highest concentration of money, signalling that a large amount of money flows over structures that greatly depend on a limited number of suppliers, which control the market.

Binary motif analysis

Figure 3

Triadic binary motif analysis: DBCM (blue circle) vs RBCM (orange circle). (a) Analysis of the aggregated network with a single representative commodity: numerous motifs and anti-motifs are present using DBCM and RBCM as null models. (b–d) Commodity groups where the RBCM reproduces all the triadic structures, namely cereals, electrical components, and the construction of tunnels, waterways, and roads. (e,f) Commodity groups with one network motif, namely bread and gasoline. (g) Commodity group with two network motifs, namely beer/malt. The CIs are computed by extracting the 2.5-th and 97.5-th percentiles from an ensemble distribution of 500 graphs. The numerous motifs and anti-motifs in the aggregated network can be seen as the aggregation of commodity groups presenting very few characteristic patterns.

Figure 4

Comparison of DBCM (blue circle) vs. RBCM (orange circle): (a) empirical counter-cumulative distribution function (ECCDF) of the number of deviating binary triadic motifs and anti-motifs across commodity layers. (b) Number of commodities \(c_{h}(m)\) having an m-type motif (over-occurrence). (c) Number of commodities \(c_{l}(m)\) having an m-type anti-motif (under-occurrence). The RBCM explains more triadic structures than the DBCM, as shown by the difference in their ECCDFs. Passing from DBCM to RBCM reduces the number of motifs across commodities, with the exception of \(m=6\), and of anti-motifs, with the exception of \(m=8\). The deviation of those triads is, hence, due to three-node correlations that go beyond directional and reciprocal tendencies of supply/use among industries. The RBCM, hence, signals an increased vulnerability to demand shocks originating from the bankruptcy of industries of type k in sub-types \(m=6\) and an increased resilience to supply/demand shocks of industries of type j in triadic formations \(m=8\).

We analyze the number of occurrences \(N_{m}\) of all the possible triadic connected subgraphs depicted in Fig. 1b. To quantify their deviations from randomized expectations, we define the binary z-score of subgraph m

$$\begin{aligned} z\left[ N_{m}\right] = \dfrac{N_{m}(A^*)-\langle N_{m}\rangle }{\sigma \left[ N_{m}\right] } \end{aligned}$$
(3)

where \(N_{m}(A^*)\) is the number of occurrences of the m-type subgraph in the empirical adjacency matrix, \(\langle N_{m} \rangle\) is its model-induced expected number of occurrences, and \(\sigma \left[ N_{m}\right]\) is the model-induced standard deviation.

An analytical procedure77 has been developed to compute the z-scores in the binary case. However, the assumption on the confidence intervals—represented as the interval \((-3,3)\)—holds true only if the ensemble distribution of \(N_{m}\) is Normal for each m. For all the commodities, m-types, and binary null models, we test this assumption using a Shapiro–Wilk test79. According to the test, a large proportion of the \(N_{m}\) ensemble distributions are not normal at the \(5\%\) significance level. Consequently, we must use a numerical approach. Networks are sampled according to the DBCM recipe by (1) computing the induced connection probability \(p_{ij; DBCM}\) and (2) establishing a link between industries i and j if and only if a uniformly distributed random number \(u_{ij} \in U(0,1)\) is below \(p_{ij; DBCM}\). The analogous recipe for the RBCM requires (1) computing the dyadic probabilities of a non-reciprocated link from i to j, \(p_{ij}^{\rightarrow}\), of a non-reciprocated link from j to i, \(p_{ij}^{\leftarrow}\), of a reciprocated link, \(p_{ij}^{\leftrightarrow}\), and of no link, \(p_{ij}^{\not \leftrightarrow}\), (2) generating a uniform random variable \(u_{ij} \in U(0,1)\), and (3) establishing the appropriate links in the dyad in the following way:

  • a non-reciprocated link from i to j if \(u_{ij} \le p_{ij}^{\rightarrow }\);

  • a non-reciprocated link from j to i if \(u_{ij} \in (p_{ij}^{\rightarrow },p_{ij}^{\rightarrow }+p_{ij}^{\leftarrow }]\);

  • a reciprocated link from i to j (and from j to i) if \(u_{ij} \in (p_{ij}^{\rightarrow }+p_{ij}^{\leftarrow },p_{ij}^{\rightarrow }+p_{ij}^{\leftarrow }+p_{ij}^{\leftrightarrow }]\);

  • no links from i to j and from j to i otherwise.

In both cases, we generate a realization of A and extract the \(N_{m}\) statistic. \(\langle N_{m} \rangle\) and \(\sigma \left[ N_{m}\right]\) are the average and standard deviation of \(N_{m}\) extracted from the ensemble distribution of 500 realizations of A. After having computed \(z\left[ N_{m}\right]\), we also extract the 2.5-th and 97.5-th percentiles from the ensemble distribution of \(N_{m}\) for all models and standardize them using Eq. (3), replacing the empirical \(N_{m}\) with the percentile. These measures serve as the \(95\%\) CI for the z-score.
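A sketch of this sampling-and-standardization recipe is given below. The arrays p_to, p_from and p_both stand for the RBCM dyadic probabilities \(p_{ij}^{\rightarrow }\), \(p_{ij}^{\leftarrow }\) and \(p_{ij}^{\leftrightarrow }\) obtained from the fitted Lagrange multipliers, and triad_count is any function returning the vector of \(N_{m}\) for a sampled adjacency matrix (for instance, the census sketched earlier); both names are assumptions of this sketch rather than code from the cited references.

```python
import numpy as np

rng = np.random.default_rng(42)

def sample_rbcm(p_to, p_from, p_both):
    """Draw one adjacency matrix from the RBCM ensemble, dyad by dyad."""
    n = p_to.shape[0]
    A = np.zeros((n, n), dtype=int)
    for i in range(n):
        for j in range(i + 1, n):
            u = rng.uniform()
            if u <= p_to[i, j]:                                   # i -> j only
                A[i, j] = 1
            elif u <= p_to[i, j] + p_from[i, j]:                  # j -> i only
                A[j, i] = 1
            elif u <= p_to[i, j] + p_from[i, j] + p_both[i, j]:   # reciprocated dyad
                A[i, j] = A[j, i] = 1
            # otherwise: no link in either direction
    return A

def z_scores_with_ci(N_emp, p_to, p_from, p_both, triad_count, n_samples=500):
    """Numerical z-scores of triadic occurrences and their 95% ensemble CIs."""
    ensemble = np.array([triad_count(sample_rbcm(p_to, p_from, p_both))
                         for _ in range(n_samples)])
    mean, std = ensemble.mean(axis=0), ensemble.std(axis=0)
    lo, hi = np.percentile(ensemble, [2.5, 97.5], axis=0)
    with np.errstate(divide="ignore", invalid="ignore"):
        z = (N_emp - mean) / std
        z_lo, z_hi = (lo - mean) / std, (hi - mean) / std   # standardized CI bounds
    return z, z_lo, z_hi
```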

The results for the aggregated inter-industry network are shown in Fig. 3a. The z-scores computed with respect to the DBCM are depicted in blue in the left panel, while the z-scores computed with respect to the RBCM are depicted in orange in the right panel. The corresponding \(95\%\) confidence intervals are depicted in the same color (blue or orange) but with slight transparency. The majority of the \(N_{m}\) are not reproduced by the randomization methods, i.e. the z-scores fall outside the confidence intervals. Specifically, only \(N_{8}\) is reproduced by the DBCM, while both \(N_{1}\) and \(N_{9}\) are reproduced by the RBCM. Discounting reciprocal information not only increases the number of triads that are statistically well described, but potentially changes their type, implying a qualitatively different z-score profile. At the same time, in the aggregated picture, \(m=1\) and \(m=9\) are described by a null model implementing reciprocity, i.e. neither a high dependency on suppliers (\(m=1\)) nor unstable feedback loops (\(m=9\)), where industries supply to each other in a cyclical fashion, are revealed. The aggregated network is, hence, characterized by a multitude of structures that are not well described by the null model and are due to additional three-node correlations, but it is relatively resilient to supply shocks and cyclical input/output. When disaggregating the monolayer into the multi-commodity network, the majority of commodity layers have triadic structures that are statistically reproduced by the reciprocal null model. Only 1 or 2 motifs or anti-motifs are present for the majority of the remaining commodities, a result indicating that beneath the aggregated picture, commodity groups are characterized by a small number of commodity-specific motifs and anti-motifs.

In Fig. 3b–d, z-score profiles for three commodity layers are displayed, namely cereals, electrical components, and the construction of tunnels, waterways, and roads. The RBCM describes all subgraph occurrences well (\(z_{N_{m}}\) is within the CI), while the DBCM signals the presence of anti-motifs for \(m=10\), \(m=11\) and \(m=12\) for cereals, and of anti-motif \(m=12\) and motif \(m=13\) for the construction layer. In Fig. 3e,f, two z-score profiles are displayed—namely for bread and other bakery products and for gasoline—for which the RBCM signals the presence of at least one motif or anti-motif. A motif \(m=12\) is present for the former layer, while an anti-motif \(m=4\) is present for the latter. Notice that for bread the DBCM does not signal any motif or anti-motif, implying that deviations can emerge by introducing information on the reciprocal structure. Moreover, subgraph \(m=9\) in bread and the majority of subgraphs in the gasoline commodity layer are characterized by a degenerate confidence interval: in all of the generated synthetic networks \(N_{9}\) corresponds to the empirical \(N_{9}^*\) with zero variance, i.e. the constraints imposed on the ensemble completely determine the occurrence of that specific m-type, which can happen regardless of the lack of statistics in the related \(N_{m}\). Finally, in Fig. 3g, the z-profile for the commodity layer beer/malt is considered. The DBCM signals a large number of motifs, specifically for \(m=2\), \(m=10\), and \(m=11\), and anti-motifs for \(m=3\) and \(m=8\). In contrast, the RBCM signals a lone motif \(m=3\) and an anti-motif \(m=6\).

In Fig. 4a, the empirical counter-cumulative distribution of the number of deviating binary triads is shown. Introducing information on the reciprocal structure reduces the number of motifs and anti-motifs present across commodities. For instance, the percentage of commodities with at least one motif or anti-motif is \(61\%\) when compared to the DBCM and \(48\%\) when compared to the RBCM, while the percentage of commodities having at least two motifs or anti-motifs is \(46\%\) when compared to the DBCM and \(27\%\) when compared to the RBCM.

Lastly, we identify the occurrence of m-type motifs and anti-motifs across commodities by introducing two quantities, \(c_{h}(m)\) and \(c_{l}(m)\): \(c_{h}(m)\) represents the number of commodities having a motif of type m, while \(c_{l}(m)\) represents the same measure for anti-motifs. The addition of the reciprocal structure reduces the number of commodity-specific motifs for each subgraph type, with the exception of motif \(m=6\), as depicted in Fig. 4b, and the number of anti-motifs for each type, with the exception of anti-motif \(m=8\), as depicted in Fig. 4c. The reciprocal null model, hence, reveals a higher number of commodities that are relatively more vulnerable to demand shocks due to the bankruptcy of industries of type k in triadic formations \(m=6\), while it reveals an increased resilience to supply/demand shocks originating from the bankruptcy of industries of type j in formations \(m=8\).

Weighted motif analysis

Figure 5

Triadic weighted motif analysis: DBCM+CReM\(_{A}\) (blue circle) vs RBCM+CRWCM (orange circle). (a) Analysis of the aggregated network with a single representative commodity: a large number of motifs and anti-motifs are present when using DBCM+CReM\(_{A}\), while three motifs are present when using the RBCM+CRWCM. (b–d) Commodity groups where RBCM+CRWCM reproduces all the triadic structures, namely seeds, metal components for doors and windows, and airline services. (e,f) Commodity groups with one network motif, namely coffee/tea and textile raw materials and products. (g) Commodity group with two network motifs, namely shipping services. The CIs are computed by extracting the 2.5-th and 97.5-th percentiles from an ensemble distribution of 500 graphs. Passing from the aggregated network to the disaggregated product layers unveils the presence of a few commodity-specific motifs and anti-motifs.

Figure 6

Comparison of DBCM+CReM\(_{A}\) (blue circle) vs. RBCM+CRWCM (orange circle): (a) empirical counter-cumulative distribution function (ECCDF) of the number of deviating weighted triadic motifs and anti-motifs across commodity layers. (b) The number of commodities \(c_{h}(m)\) having an m-type motif. (c) The number of commodities \(c_{l}(m)\) having an m-type anti-motif. RBCM+CRWCM explains slightly more triadic fluxes than DBCM+CReM\(_{A}\), as shown by the difference in their ECCDFs. Passing from the directed to the reciprocal model reduces the number of anti-motifs, with the exception of \(m=8\). In contrast, it qualitatively changes the motif profile, with a slight dominance of \(m=11\)-type motifs when the directed model is used and a clear dominance of \(m=1\)-type motifs when the reciprocal model is used. The reciprocated model unveils a vulnerability to supply shocks originating from a decrease in supply volumes of industries of type j in formations \(m=1\).

While the bankruptcy of an entire industry is unrealistic, a shock due to a decrease in the flow of goods among industries can propagate along the supply chain, with side effects on the real economy. This implies that not only binary information is important for shock propagation but also weighted information, namely the amount of money circulating on connected structures.

Consider the triadic flux \(F_{m}\) on subgraph type m, defined as the total money circulating on triadic subgraphs of type m. We characterize the deviation of the empirical \(F_{m}\) from null-model expectations by defining the weighted z-score as

$$\begin{aligned} z\left[ F_{m}\right] = \dfrac{F_{m}(W^*)-\langle F_{m}\rangle }{\sigma \left[ F_{m}\right] } \end{aligned}$$
(4)

where \(\langle F_{m} \rangle\) is the model-induced average amount of money circulating on motif m and \(\sigma \left[ F_{m}\right]\) represents the model-induced standard deviation over the ensemble of network realizations.

The theoretical benchmark (or null model) is built by using a combination of binary and conditional weighted models, depending on the desired constraints. If we deem reciprocal information of negligible importance, we use the combination given by the DBCM, for the sampling of the binary adjacency matrix, and the CReM\(_{A}\), constraining the out-strength and in-strength sequences. If we deem reciprocal information necessary, a combination of the RBCM and CRWCM should be used. Here we compare the two to establish the importance of adding reciprocity information for the detection of weighted motifs.

In operational terms, using a two-step model such as the DBCM+CReM\(_{A}\) reduces to (1) establishing a link between industries i and j when a uniform random number \(u_{ij} \in U(0,1)\) is such that \(u_{ij} \le p_{ij; DBCM}\), and (2) if i and j are connected, sampling \(w_{ij}\) by inverse transform sampling, i.e., we generate a uniformly distributed random variable \(\eta _{ij} \in U(0,1)\) such that

$$\begin{aligned} F(v_{ij})=\int _0^{v_{ij}}q_{CReM_{A}}(w_{ij}|a_{ij}=1)dw_{ij} = \eta _{ij}, \end{aligned}$$
(5)

then we invert the relationship to find the weight \(v_{ij}\) to load on the link (i, j).

The network sampling for the RBCM+CRWCM follows the same logic with two major differences: (1) links are established using the RBCM recipe and (2) the dyadic conditional weight probability \(q_{CReM_{A}}(w_{ij}|a_{ij}=1)\) is substituted with \(q_{CRWCM}(w_{ij}|a_{ij}=1)\) in the inverse transform sampling.
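A minimal sketch of the weight-loading step is given below, assuming (as in CReM\(_{A}\) with out- and in-strength constraints) that the conditional weight distribution on an existing link is exponential with rate \(\beta _{i}^{out}+\beta _{j}^{in}\); in that case the inverse transform has the closed form \(v_{ij}=-\ln (1-\eta _{ij})/(\beta _{i}^{out}+\beta _{j}^{in})\). For the RBCM+CRWCM the same logic applies, with the rate depending on the reciprocal character of the link.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_weights_crem(A, beta_out, beta_in):
    """Load weights on an existing topology A by inverse transform sampling.

    Sketch assuming the CReM_A conditional law q(w_ij | a_ij = 1) is exponential
    with rate beta_out[i] + beta_in[j]; solving F(v) = 1 - exp(-rate * v) = eta
    gives v = -log(1 - eta) / rate.
    """
    n = A.shape[0]
    W = np.zeros((n, n))
    for i, j in zip(*np.nonzero(A)):
        rate = beta_out[i] + beta_in[j]
        eta = rng.uniform()
        W[i, j] = -np.log1p(-eta) / rate     # log1p(-eta) = log(1 - eta)
    return W
```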

In Fig. 5a, the z-score profile for the aggregated network with a single representative commodity is depicted using the directed model (in blue in the left panel) and the reciprocal model (in orange in the right panel). There is a large number of motifs and anti-motifs when the benchmark model is directed; only \(F_{3}\) does not deviate significantly.

When reciprocity information is considered, the picture changes: only three motifs, namely \(m=6\), \(m=11\), and \(m=13\), and four anti-motifs, namely \(m=3\), \(m=8\), \(m=10\), and \(m=12\), are found when the reciprocal null model is employed. This model’s enhanced accuracy unveils a higher-than-expected volume of financial activity on sub-types characterized by a single exclusive user and two suppliers utilizing each other’s products (\(m=6\)), two users supplying to each other while employing a product from the same supplier (\(m=11\)), and entirely cyclical triads (\(m=13\)). In contrast, a lower-than-expected level of financial activity transpires in open triads with two reciprocated ties (\(m=8\)), in those with one reciprocated link and one exclusive user (\(m=3\)), and in closed triads of type \(m=10\) and \(m=12\). While it might be contended that the heightened concentration of funds on \(m=13\) is attributable to aggregation bias, it is crucial to recognize that aggregation solely accounts for the increased monetary worth of this particular sub-type in absolute terms, not for the weighted motif obtained after adjusting for the statistical null model. It should be noticed that the emergence of these specific motifs cannot be easily explained without delving into greater detail, given the representative-commodity scheme, and that the picture cannot be merely reduced to a higher activity on open triads and a lower activity on closed triads.

Similarly to the binary case, passing from the aggregated network to the disaggregated product-level layers, it is possible to identify a small number of commodity-specific weighted motifs and anti-motifs.

In Fig. 5b–d, three commodity layers are depicted for which no motifs or anti-motifs are present when z-scores are computed using the reciprocal model. They are ‘seeds’, ‘metal components for doors and windows’ and ‘airline services’. In the ‘seeds’ layer, the directed model signals the presence of an anti-motif for \(m=5\). In the second layer, no deviations are registered by either null model, but the CIs are of a different nature; in fact, the reciprocal model allows a more restricted range of z-scores than the directed model for \(m=9\). In the ‘airline services’ layer, no deviations are present for either model and three CIs are degenerate, for \(m=5\), \(m=9\), and \(m=10\). In Fig. 5e,f, the z-scores for the commodity groups ‘coffee/tea’ and ‘textile raw materials and products’ are depicted, for which one motif is detected using the reciprocal model. For both the directed and reciprocal models there is a weighted motif \(m=2\) in the ‘coffee/tea’ layer. In contrast, in the textile products layer the directed model signals an anti-motif for \(m=2\), while the reciprocal model signals a motif for \(m=1\). In Fig. 5g, the z-score profile for the commodity layer ‘shipping services’ is shown: the directed model signals a large number of anti-motifs, specifically for \(m=5\), \(m=7\) and \(m=12\), while it registers a motif for \(m=11\). The reciprocal model, instead, registers a motif for \(m=4\) and anti-motifs for \(m=5\) and \(m=12\). Different commodity layers exhibit different motifs and anti-motifs, which are due to their specific characteristics. In this paper, we refrain from characterizing every single commodity layer, but a specific and thorough analysis is possible by visualizing the number of triadic sub-types, the z-score profile for \(N_{m}\) and their weighted analogues.

The empirical counter-cumulative distribution ECCDF(\(\#\) deviating W\(\Delta )\) of the number of deviating weighted triads is depicted in Fig. 6a. The number of deviating triadic fluxes is consistently lower using the reciprocal model: the \(F_{m}\) are maximally random for \(49\%\) of the commodity layers when the directed benchmark is used and for \(55\%\) when the reciprocal one is used. The reduction in the number of motifs is, however, not as significant as in the binary case.

In Fig. 6b,c, we plot the weighted analogues of \(c_{h}(m)\) and \(c_{l}(m)\). Reciprocal information decreases the occurrence of all types of anti-motifs across commodities, with the exception of \(m=8\). The profile induced by \(c_{h}(m)\), instead, is significantly different between the two null models. For instance, according to the directed model \(F_{1}\) is almost always well predicted, whereas it is the most frequently occurring motif according to the reciprocal model. At the same time, reciprocity unveils the dependency of more than 40 commodity layers on a limited number of suppliers, which in this case control the market. In fact, the high presence of the \(m=1\) weighted motif signals the vulnerability of the industry-industry network to supply shocks provoked by a reduction of supply volumes.

Discussion

The study of triadic motifs in production networks is still in its infancy due to a scarcity of reliable data. In the existing literature, only binary triadic motifs on one production network, the Japanese one, have been characterized for a single representative commodity36, while the Hungarian dataset has been analyzed only in terms of triadic occurrences, without resorting to a null model81. The Japanese study revealed a simple but significant pattern: open triadic subgraphs are over-represented while closed triadic subgraphs are under-represented. This phenomenon was attributed to complementarity, whereby economic actors connect in tetradic structures—better explained by open triads—due to complementary needs33.

Our findings corroborate the notion that an analysis based on a single representative commodity is insufficient to fully characterize a production network. Product-level data is essential for disaggregating the network into layers that are characterized by commodity-specific binary motifs and anti-motifs. Moreover, we found that the majority of layers exhibit maximally random triadic structures when the reciprocal structure is considered.

At the level of binary motifs, we detected that the cyclical reciprocated triadic subgraphs, which are dominant in the aggregated network, break up in the disaggregated product layers, where open triangles become dominant, especially \(m=1\). However, using the RBCM as a benchmark, we showed that \(m=1\) is always well described. Conversely, the completely cyclical triads, even if partially broken up in the disaggregated layers, are often over-represented compared to the benchmark estimate. In general, constraining the reciprocation capacity of industries—by constraining the reciprocated degrees—is of foremost importance when characterizing triadic motifs, as shown by the better accuracy and the decrease in binary triadic motifs and anti-motifs when using the RBCM as a benchmark compared to the DBCM.

We also characterized weighted motifs and anti-motifs, defined in terms of the amount of money circulating on triadic subgraphs, with a novel model which constrains strengths, decomposing them according to the character of the corresponding links. This type of analysis is entirely novel in the context of production networks and rarely seen with benchmark models55. We find a non-trivial result already when analyzing the aggregated network: subgraphs that are well explained in binary terms—their occurrence is well described by the statistical ensemble induced by the DBCM or RBCM—can be poorly described in weighted terms, meaning that even if a binary triadic subgraph has the expected occurrence it can accommodate an unexpected concentration of money. Furthermore, we identified a high presence of \(m=1\) weighted motifs across commodity layers, a signal of commodity-specific dependency on a limited number of suppliers, which control the market. This implies that a large number of layers are vulnerable to supply shocks, which can arise from a decrease in supplied volumes (and not only from a supplier’s bankruptcy as in the binary case).

Changing the benchmark from a directed to a reciprocal model significantly changes the identity of motifs and anti-motifs across commodities. Hence, it is essential to take into account the type of the corresponding link on which weights are sampled, by constraining reciprocated and non-reciprocated strengths.

Overall, our results indicate that product-level information is strictly necessary to identify triadic structures and fluxes in production networks. We hope that our study can encourage Statistics Bureaus around the world to implement policies and techniques to reveal or reconstruct a reliable product heterogeneity for firm-level transaction data. Our analysis also shows that most commodity-specific layers can be reconstructed via null models that incorporate reciprocity while maintaining dyads independent. For these layers, network reconstruction methods of the type introduced in26, if extended to incorporate reciprocity, are likely to perform well in replicating the properties of the entire layers starting from partial, node-specific information. Most other layers show at most one or a couple of deviating triadic motifs that are unexplained by the null model. For these layers, additional information is needed to achieve a good reconstruction. Once a rigorous product analysis has been performed, experts in the single commodity can interpret why such triadic formations over-occur or under-occur, accommodating an excessive or insufficient amount of trade volume, unveiling the detailed structure of the commodity-specific production networks.

To suggest improvements for further research, we conclude by noting that our study is subject to two main limitations. First, an industry-level analysis inherently yields results that differ from those obtained in firm-level studies and underestimates the risk associated with exogenous and endogenous shocks40. Second, the dataset analyzed pertains to industries at the SBI5 level, a classification intermediate between firms and SBI4-level industries. Analyzing the dataset at the firm level was not feasible due to potential biases arising from the deterministic imputation and the degree-distribution assumptions, which are exacerbated when dealing with highly granular data. Conversely, analyzing industries at the SBI4 level, which encompasses a maximum of 132 industries, would imply that for a substantial number of commodities very few industries are active. Consequently, the null model would trivially replicate, in a statistical sense, the triadic structures for the majority of commodity layers due to a lack of relevant observations. However, the same biases anticipated at the firm level can arise, even if mitigated, when selecting SBI5-level industries. This could potentially bias our analysis, especially the type of motifs and anti-motifs found for each commodity. However, validating all of our ‘fingerprints’ would require fully empirical data for industries at the SBI5 level for each of the 187 commodities, information that, to the best of our knowledge, is not currently available in any country.

Methods

Binary null models

For binary-directed graphs, the maximum entropy formalism prescribes the maximization of the graph entropy functional S[P(A)]

$$\begin{aligned} S[P(A)] =-\sum _{A \in \textbf{A}} P(A) \ln P(A) \end{aligned}$$
(6)

subject to the normalization of the graph probability P(A) and to the constraints on network properties \(C_{\alpha }^{*}\), i.e.

$$\begin{aligned} {\left\{ \begin{array}{ll} \sum _{A \in \textbf{A}} P(A) &{}= 1 \\ \sum _{A \in \textbf{A}} P(A) C_{\alpha }(A) &{}= C_{\alpha }^{*}, \quad \forall \alpha , \end{array}\right. } \end{aligned}$$
(7)

hence maximizing the unbiasedness of the resulting P(A) given available data. Solving the optimization problem, we obtain the canonical P(A)

$$\begin{aligned} P(A)&=\dfrac{e^{-\sum _{\alpha }\theta _{\alpha }C_{\alpha }(A)}}{\sum _{A \in \textbf{A}}e^{-\sum _{\alpha }\theta _{\alpha }C_{\alpha }(A)}} \nonumber \\&= \dfrac{e^{-H(A)}}{\sum _{A \in \textbf{A}}e^{-H(A)}} \end{aligned}$$
(8)

where H(A) is denoted as the Graph Hamiltonian and is defined as

$$\begin{aligned} H(A) \equiv \sum _{\alpha } \theta _{\alpha } C_{\alpha }(A). \end{aligned}$$
(9)

In this section, we focus on the binary reconstruction methods taking into account local properties.

The directed binary configuration model

In the directed binary configuration model (DBCM), we choose as local properties the out-degree \((k_i^{out})\) and the in-degree \((k_i^{in})\), representing the number of industries industry i sells to and the number of industries industry i buys from, respectively.

Out-degrees and in-degrees can be defined mathematically in terms of the adjacency matrix \(A =(a_{ij})\) as

$$\begin{aligned} {\left\{ \begin{array}{ll} k_i^{out} &{}= \sum _{j\ne i} a_{ij} \\ k_i^{in} &{}= \sum _{j\ne i} a_{ji}. \end{array}\right. } \end{aligned}$$
(10)

Solving the constrained entropy maximization we obtain the graph probability P(A) in Eq. (8) where

$$\begin{aligned} H(A) = \sum _{i} \alpha _{i}^{out} k_i^{out} + \alpha _{i}^{in} k_i^{in} . \end{aligned}$$
(11)

The graph probability P(A) can be re-written as the product of Bernoulli trials

$$\begin{aligned} P(A) = \prod _{i,j \ne i} (p_{ij})^{a_{ij}}(1-p_{ij})^{1-a_{ij}} \end{aligned}$$
(12)

where \(p_{ij} = P(a_{ij}=1)\) denotes the probability of connection of supplier i with user j and is equal to

$$\begin{aligned} p_{ij} = \dfrac{x_{i}^{out}x_{j}^{in}}{1+x_{i}^{out}x_{j}^{in}} \end{aligned}$$
(13)

with \(x_{i}^{out} \equiv e^{-\alpha _{i}^{out}}\) and \(x_{i}^{in} \equiv e^{-\alpha _{i}^{in}}\). By maximum likelihood estimation (MLE) on the log-likelihood \(\mathcal {L}=\ln (P(A))\) we obtain the Lagrange parameters \(\alpha _{i}^{out}\) and \(\alpha _{i}^{in}\) \(\forall i\), a procedure equivalent to solving a system of 2N coupled equations

$$\begin{aligned} {\left\{ \begin{array}{ll} k_{i}^{out,*} &{}= \langle k_{i}^{out} \rangle = \sum _{j \ne i} p_{ij} \\ k_{i}^{in,*} &{}= \langle k_{i}^{in} \rangle = \sum _{j \ne i} p_{ji}. \end{array}\right. } \end{aligned}$$
(14)

where N is the number of industries in the network and \(\langle k_i ^{out} \rangle\) and \(\langle k_i ^{in} \rangle\) denote the ensemble averages of out-degrees and in-degrees respectively.
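A minimal numerical sketch of one way to solve this system is given below: a fixed-point iteration obtained by isolating \(x_{i}^{out}\) and \(x_{i}^{in}\) in Eq. (14). This is a common fitting strategy for configuration models, not necessarily the solver used in the cited references.

```python
import numpy as np

def fit_dbcm(k_out, k_in, n_iter=5000, tol=1e-10):
    """Fit the DBCM multipliers x_out, x_in by a simple fixed-point iteration on
    Eq. (14): x_out[i] <- k_out[i] / sum_{j != i} x_in[j] / (1 + x_out[i] x_in[j]),
    and symmetrically for x_in. Returns the multipliers and the matrix p_ij."""
    k_out = np.asarray(k_out, dtype=float)
    k_in = np.asarray(k_in, dtype=float)
    # rough initial guess proportional to the degrees
    x_out = k_out / np.sqrt(k_out.sum() + 1.0)
    x_in = k_in / np.sqrt(k_in.sum() + 1.0)
    for _ in range(n_iter):
        denom = 1.0 + np.outer(x_out, x_in)      # denom[i, j] = 1 + x_out[i] x_in[j]
        np.fill_diagonal(denom, np.inf)          # drops the j = i terms from the sums
        new_x_out = k_out / (x_in[None, :] / denom).sum(axis=1)
        new_x_in = k_in / (x_out[:, None] / denom).sum(axis=0)
        delta = max(np.abs(new_x_out - x_out).max(), np.abs(new_x_in - x_in).max())
        x_out, x_in = new_x_out, new_x_in
        if delta < tol:
            break
    p = np.outer(x_out, x_in) / (1.0 + np.outer(x_out, x_in))
    np.fill_diagonal(p, 0.0)                     # no self-loops
    return x_out, x_in, p
```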

The reciprocal binary configuration model

In the reciprocal binary configuration model (RBCM), we decompose the degree according to the reciprocal nature of the connection at hand, namely into the non-reciprocated out-degree \(k_i^{\rightarrow }\), the non-reciprocated in-degree \(k_i^{\leftarrow }\) and the reciprocated degree \(k_i^{\leftrightarrow }\). These measures can be defined mathematically in terms of the adjacency matrix \(A =(a_{ij})\) as

$$\begin{aligned} {\left\{ \begin{array}{ll} k_{i}^{\rightarrow } &{}= \sum _{j \ne i} a_{ij}(1-a_{ji}) = \sum _{j \ne i} a_{ij}^{\rightarrow } \\ k_{i}^{\leftarrow } &{}= \sum _{j \ne i} a_{ji}(1-a_{ij}) = \sum _{j \ne i} a_{ij}^{\leftarrow } \\ k_{i}^{\leftrightarrow } &{}= \sum _{j \ne i} a_{ij}a_{ji} = \sum _{j \ne i} a_{ij}^{\leftrightarrow }. \end{array}\right. } \end{aligned}$$
(15)
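These decomposed degrees translate directly into array operations on the adjacency matrix; a minimal NumPy sketch:

```python
import numpy as np

def reciprocal_degrees(A):
    """Decomposed degrees of Eq. (15) from a binary adjacency matrix A."""
    A = np.asarray(A, dtype=int).copy()
    np.fill_diagonal(A, 0)                 # no self-loops
    A_recip = A * A.T                      # a_ij^<-> : both directions present
    A_out_only = A * (1 - A.T)             # a_ij^->  : i -> j without j -> i
    A_in_only = A.T * (1 - A)              # a_ij^<-  : j -> i without i -> j
    return (A_out_only.sum(axis=1),        # non-reciprocated out-degrees k_i^->
            A_in_only.sum(axis=1),         # non-reciprocated in-degrees  k_i^<-
            A_recip.sum(axis=1))           # reciprocated degrees         k_i^<->
```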

Solving the constrained entropy maximization problem, we obtain the graph probability P(A) as in Eq. (8) with graph Hamiltonian given by

$$\begin{aligned} H(A) = \sum _{i} \alpha _i^{\rightarrow } k_i^{\rightarrow } + \alpha _i^{\leftarrow } k_i^{\leftarrow } + \alpha _i^{\leftrightarrow } k_i^{\leftrightarrow }. \end{aligned}$$
(16)

The model-induced graph probability P(A) is a product, over dyads, of the probabilities of four mutually exclusive events

$$\begin{aligned} P(A) = \prod _{j < i} \left( p_{ij}^{\rightarrow }\right) ^{a_{ij}^{\rightarrow }} \left( p_{ij}^{\leftarrow }\right) ^{a_{ij}^{\leftarrow }} \left( p_{ij}^{\leftrightarrow }\right) ^{a_{ij}^{\leftrightarrow }} \left( p_{ij}^{\nleftrightarrow }\right) ^{a_{ij}^{\nleftrightarrow }} \end{aligned}$$
(17)

with

$$\begin{aligned} {\left\{ \begin{array}{ll} p_{ij}^{\rightarrow } &{}= \dfrac{x_i^{\rightarrow }x_{j}^{\leftarrow }}{1+x_{i}^{\rightarrow }x_{j}^{\leftarrow } +x_{i}^{\leftarrow }x_{j}^{\rightarrow }+x_{i}^{\leftrightarrow }x_{j}^{\leftrightarrow }} \\ p_{ij}^{\leftarrow } &{}= \dfrac{x_i^{\leftarrow }x_{j}^{\rightarrow }}{1+x_{i}^{\rightarrow }x_{j}^{\leftarrow } +x_{i}^{\leftarrow }x_{j}^{\rightarrow }+x_{i}^{\leftrightarrow }x_{j}^{\leftrightarrow }} \\ p_{ij}^{\leftrightarrow } &{}= \dfrac{x_i^{\leftrightarrow }x_{j}^{\leftrightarrow }}{1+x_{i}^{\rightarrow }x_{j}^{\leftarrow } +x_{i}^{\leftarrow }x_{j}^{\rightarrow }+x_{i}^{\leftrightarrow }x_{j}^{\leftrightarrow }} \\ p_{ij}^{\nleftrightarrow } &{}= \left[ 1+x_{i}^{\rightarrow }x_{j}^{\leftarrow }+x_{i}^{\leftarrow }x_{j}^{\rightarrow } +x_{i}^{\leftrightarrow }x_{j}^{\leftrightarrow }\right] ^{-1}. \end{array}\right. } \end{aligned}$$
(18)

where \(x_{i}^{\rightarrow } \equiv e^{-\alpha _{i}^{\rightarrow }}\), \(x_{i}^{\leftarrow } \equiv e^{-\alpha _{i}^{\leftarrow }}\) and \(x_{i}^{\leftrightarrow } \equiv e^{-\alpha _{i}^{\leftrightarrow }}\) are the exponentiated Lagrange multipliers associated with the non-reciprocated out-degree, the non-reciprocated in-degree and the reciprocated degree, respectively. The Lagrange multipliers \(\alpha _{i}^{\rightarrow }\), \(\alpha _{i}^{\leftarrow }\) and \(\alpha _{i}^{\leftrightarrow }\) are found using MLE on the log-likelihood \(\mathcal {L} = \ln (P(A))\), a procedure equivalent to solving the system of 3N coupled equations reading

$$\begin{aligned} {\left\{ \begin{array}{ll} k_{i}^{\rightarrow } &{}= \langle k_{i}^{\rightarrow } \rangle = \sum _{j \ne i} p_{ij}^{\rightarrow }\\ k_{i}^{\leftarrow } &{}= \langle k_{i}^{\leftarrow } \rangle = \sum _{j \ne i} p_{ij}^{\leftarrow }\\ k_{i}^{\leftrightarrow } &{}= \langle k_{i}^{\leftrightarrow } \rangle = \sum _{j \ne i} p_{ij}^{\leftrightarrow }, \end{array}\right. } \end{aligned}$$
(19)

i.e., equating the reciprocated and non-reciprocated degrees to their ensemble averages.
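A minimal sketch, under the same conventions as the DBCM sketch above, of how the reciprocal decomposition of Eq. (15) and the dyadic probabilities of Eq. (18) can be computed; the fixed-point scheme mirrors the DBCM one, non-degenerate degree sequences are assumed, and all names are illustrative.

import numpy as np

def reciprocal_degrees(A):
    # Non-reciprocated out-/in-degrees and reciprocated degree, Eq. (15).
    A = np.asarray(A, dtype=float)
    k_nr_out = (A * (1.0 - A.T)).sum(axis=1)     # k_i^->
    k_nr_in = (A.T * (1.0 - A)).sum(axis=1)      # k_i^<-
    k_rec = (A * A.T).sum(axis=1)                # k_i^<->
    return k_nr_out, k_nr_in, k_rec

def solve_rbcm(k_r, k_l, k_b, n_iter=10000, tol=1e-10):
    # Fixed-point sketch for Eq. (19); k_r, k_l, k_b are the observed
    # non-reciprocated out-, non-reciprocated in- and reciprocated degrees.
    k_r, k_l, k_b = (np.asarray(k, dtype=float) for k in (k_r, k_l, k_b))
    scale = np.sqrt(k_r.sum() + k_b.sum())
    xr, xl, xb = k_r / scale, k_l / scale, k_b / scale
    for _ in range(n_iter):
        D = 1.0 + np.outer(xr, xl) + np.outer(xl, xr) + np.outer(xb, xb)
        t_r, t_l, t_b = xl[None, :] / D, xr[None, :] / D, xb[None, :] / D
        for t in (t_r, t_l, t_b):
            np.fill_diagonal(t, 0.0)             # exclude self-loops (j != i)
        new_xr, new_xl, new_xb = k_r / t_r.sum(1), k_l / t_l.sum(1), k_b / t_b.sum(1)
        err = max(np.abs(new_xr - xr).max(), np.abs(new_xl - xl).max(),
                  np.abs(new_xb - xb).max())
        xr, xl, xb = new_xr, new_xl, new_xb
        if err < tol:
            break
    D = 1.0 + np.outer(xr, xl) + np.outer(xl, xr) + np.outer(xb, xb)
    p_r, p_l, p_b = np.outer(xr, xl) / D, np.outer(xl, xr) / D, np.outer(xb, xb) / D
    for p in (p_r, p_l, p_b):
        np.fill_diagonal(p, 0.0)
    return p_r, p_l, p_b                         # dyadic probabilities of Eq. (18)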

Conditional weighted null models

When inspecting network weights, the numerical nature of the trade volumes restricts the set of available models: if the weights are discrete-valued, constrained entropy maximization leads to a family of geometric distributions66,70,78, whereas continuous-valued weights lead to a family of exponential distributions when the constraints are node-specific properties68,71. Here we treat the conditional problem, which is well defined only once the form of the binary adjacency matrix A has been decided.

The conditional graph entropy S[Q(W|A)], which measures the uncertainty on the weighted adjacency matrix W compatible with a given realization of the binary adjacency matrix A, i.e.

$$\begin{aligned} S[Q(W|A)] =-\sum _{A \in \textbf{A}} P(A) \int _{W_{A}} Q(W|A) \ln Q(W|A) dW \end{aligned}$$
(20)

is maximized given the normalization of the conditional weighted probability density function Q(W|A) and the constraints \(C_{\alpha }(W)\)

$$\begin{aligned} {\left\{ \begin{array}{ll} \int _{W_{A}} Q(W|A)dW &{}= 1 \\ \sum _{A} P(A) \int _{W_{A}} Q(W|A) C_{\alpha }(W) dW &{}= C_{\alpha }^*, \quad \forall \alpha . \end{array}\right. } \end{aligned}$$
(21)

where the set of \(C_{\alpha }^*\) represents the known node-specific properties. From this constrained conditional maximization we obtain Q(W|A) as

$$\begin{aligned} Q(W|A) = {\left\{ \begin{array}{ll} \dfrac{e^{-H(W)}}{\int _{W_{A}}e^{-H(W)} dW_{A}} \quad &{}W \in W_A \\ 0 &{}W \notin W_A \end{array}\right. } \end{aligned}$$
(22)

where \(W_A\) stands for the ensemble of realizations of W compatible with A (with weights sampled only on connected dyads \(a_{ij}=1\)) and the graph Hamiltonian H(W) is defined as

$$\begin{aligned} H(W) \equiv \sum _{\alpha } \beta _{\alpha } C_{\alpha }(W). \end{aligned}$$
(23)

Parameters \(\beta _{\alpha }\) are estimated via MLE on the log-likelihood function \(\mathcal {L}_{W|A}\), reading

$$\begin{aligned} \mathcal {L}_{W|A} = - H_{\mathbf {\beta }}(W) - \ln (Z_{\mathbf {\beta },A}) \end{aligned}$$
(24)

where \(Z_{\mathbf {\beta },A}\) is the conditional partition function, whose computation is possible only if full information about A is available. However, estimating the parameters on the empirical topology A neglects its intrinsic random variability when A is sampled from a binary model such as the DBCM or the RBCM. This problem is solved in the network-science literature by defining the generalized log-likelihood \(\mathcal {G}_{\mathbf {\beta }}\)68,80

$$\begin{aligned} \mathcal {G_{\mathbf {\beta }}} = - H_{\mathbf {\beta }}(\langle W \rangle ) - \sum _{A \in \textbf{A}} P(A) \ln (Z_{\mathbf {\beta },A}) \end{aligned}$$
(25)

where P(A) is the graph probability induced by the binary model. In the following, we mainly rely on the estimation based on \(\mathcal {G}_{\mathbf {\beta }}\) for the weighted models. Within this framework, we can solve the conditional maximum-entropy problem while constraining weighted local properties.
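As a sketch of why only the link probabilities of the binary model enter the weighted estimation, consider a Hamiltonian that is additive over dyads with continuous non-negative weights, such as the one used for the CReM\(_A\) below. In that case the conditional partition function factorizes over the links of A,

$$\begin{aligned} \ln Z_{\mathbf {\beta },A} = \sum _{i,j \ne i} a_{ij} \ln \int _{0}^{\infty } e^{-(\beta _i^{out}+\beta _j^{in})w}\,dw = -\sum _{i,j \ne i} a_{ij} \ln \left( \beta _i^{out}+\beta _j^{in}\right) , \end{aligned}$$

so that, by linearity of the expectation,

$$\begin{aligned} \sum _{A \in \textbf{A}} P(A) \ln Z_{\mathbf {\beta },A} = -\sum _{i,j \ne i} \langle a_{ij} \rangle \ln \left( \beta _i^{out}+\beta _j^{in}\right) , \end{aligned}$$

which is why only the ensemble averages \(\langle a_{ij} \rangle\) of the binary model appear in the estimation equations of the weighted models below.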

The CReM\(_A\)

When randomizing the weighted adjacency matrix W, trade marginals such as the out-strength \(s_{i}^{out}\) and the in-strength \(s_{i}^{in}\), representing the total output and the total input of industry i respectively, are usually constrained66,67. The out-strength \(s_i^{out}\) and in-strength \(s_i^{in}\) sequences are defined as the marginals of the weighted adjacency matrix W, namely

$$\begin{aligned} {\left\{ \begin{array}{ll} s_i^{out} &{}= \sum _{j\ne i} w_{ij} \\ s_i^{in} &{}= \sum _{j\ne i} w_{ji}. \end{array}\right. } \end{aligned}$$
(26)

Solving the constrained conditional entropy maximization leads to a conditional probability density Q(W|A) as in Eq. (22), where

$$\begin{aligned} H(W) = \sum _{i} \beta _i^{out} s_i^{out} + \beta _i^{in} s_i^{in} \end{aligned}$$
(27)

with a conditional graph distribution

$$\begin{aligned} Q(W|A)&= \prod _{i,j \ne i; a_{ij}=1} q_{ij}(w|a=1) = \nonumber \\&= \prod _{i,j \ne i; a_{ij}=1} \left[ \left( \beta _i^{out} + \beta _{j}^{in} \right) e^{-(\beta _i^{out} + \beta _{j}^{in})w_{ij}}\right] ^{a_{ij}} \end{aligned}$$
(28)

i.e. the product of dyadic exponential distributions in \(w_{ij}\), conditional on the establishment of the link \(a_{ij}\) and regulated by the node-specific Lagrange parameters \(\beta _{i}^{out}\) and \(\beta _{i}^{in}\) \(\forall i\). The Lagrange parameters are found via generalized log-likelihood estimation (GLE), a procedure that amounts to replacing \(a_{ij}\) in the dyadic conditional probability with the dyadic term \(f_{ij} = \langle a_{ij} \rangle\), i.e. the ensemble average of \(a_{ij}\) under the chosen binary model, so that

$$\begin{aligned} q_{ij}(w_{ij}|a_{ij}=1) = \left[ \left( \beta _{i}^{out} + \beta _{j}^{in} \right) e^{-(\beta _{i}^{out}+\beta _{j}^{in})w_{ij}} \right] ^{f_{ij}}. \end{aligned}$$
(29)

Maximizing \(\mathcal {G}_{\mathbf {\beta }}\), we obtain a system of 2N coupled equations reading

$$\begin{aligned} {\left\{ \begin{array}{ll} s_i^{out} &{}= \sum _{j\ne i} \dfrac{f_{ij}}{\beta _{i}^{out}+\beta _{j}^{in}} = \langle s_{i}^{out} \rangle \\ s_i^{in} &{}= \sum _{j\ne i} \dfrac{f_{ji}}{\beta _{i}^{in}+\beta _{j}^{out}} = \langle s_{i}^{in} \rangle \end{array}\right. } \end{aligned}$$
(30)

and find \(\{\beta _i^{in},\beta _{i}^{out}\}\) for each industry.
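As an illustrative numerical sketch (not the authors' code), the system in Eq. (30) can be solved with a standard root finder once the link probabilities \(f_{ij}\) of the chosen binary model are available, e.g. the DBCM probabilities of Eq. (13); the substitution \(\beta = e^{\theta }\) simply keeps the parameters positive, strictly positive strength sequences are assumed, and all names are illustrative.

import numpy as np
from scipy.optimize import least_squares

def solve_crem_a(f, s_out, s_in):
    # Solve Eq. (30): f is the (N, N) matrix of link probabilities f_ij = <a_ij>
    # (zero diagonal), s_out and s_in the observed strength sequences.
    N = len(s_out)

    def residuals(theta):
        b_out, b_in = np.exp(theta[:N]), np.exp(theta[N:])    # keep betas positive
        exp_w = f / (b_out[:, None] + b_in[None, :])           # <w_ij> = f_ij / (b_i^out + b_j^in)
        np.fill_diagonal(exp_w, 0.0)
        return np.concatenate([exp_w.sum(axis=1) - s_out,      # <s_i^out> - s_i^out
                               exp_w.sum(axis=0) - s_in])      # <s_i^in>  - s_i^in

    sol = least_squares(residuals, np.zeros(2 * N), xtol=1e-12, ftol=1e-12)
    return np.exp(sol.x[:N]), np.exp(sol.x[N:])                # beta_out, beta_in

# Illustrative usage: W is the weighted adjacency matrix and p the DBCM
# probability matrix from the earlier sketch.
# beta_out, beta_in = solve_crem_a(p, W.sum(axis=1), W.sum(axis=0))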

The CRWCM model

In order to take reciprocity into account, we develop a novel model, denoted as the conditionally reciprocal weighted configuration model (CRWCM), which considers the different nature of the links on which weights are sampled, namely reciprocated and non-reciprocated links. This choice leads to the definition of four trade marginals for each supplier/user (a minimal computational sketch follows the list), namely

  • the non-reciprocated out-strength \(s_{i}^{\rightarrow }\) which measures the output of supplier i to users from which it does not buy, defined in terms of W as

    $$\begin{aligned} s_{i}^{\rightarrow } = \sum _{j \ne i} a_{ij}^{\rightarrow } w_{ij} = \sum _{j \ne i} w_{ij}^{\rightarrow } \end{aligned}$$
    (31)
  • the non-reciprocated in-strength \(s_{i}^{\leftarrow }\), which measures the input of industry i from suppliers to which it does not itself supply, defined as

    $$\begin{aligned} s_{i}^{\leftarrow } = \sum _{j \ne i} a_{ij}^{\leftarrow } w_{ji} = \sum _{j \ne i} w_{ij}^{\leftarrow } \end{aligned}$$
    (32)
  • the reciprocated out-strength \(s_{i}^{\leftrightarrow , out}\), measuring the output of supplier i to users from which it also purchases, reading

    $$\begin{aligned} s_{i}^{\leftrightarrow , out} = \sum _{j \ne i} a_{ij}^{\leftrightarrow } w_{ij} = \sum _{j \ne i} w_{ij}^{\leftrightarrow , out} \end{aligned}$$
    (33)
  • and the reciprocated in-strength \(s_{i}^{\leftrightarrow , in}\), measuring the input of user i from suppliers to which it also supplies, defined as

    $$\begin{aligned} s_{i}^{\leftrightarrow , in} = \sum _{j \ne i} a_{ij}^{\leftrightarrow } w_{ji} = \sum _{j \ne i} w_{ij}^{\leftrightarrow , in} \end{aligned}$$
    (34)
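The sketch announced above computes these four marginals from a weighted adjacency matrix W, under the assumption that \(w_{ij} > 0\) if and only if \(a_{ij} = 1\); function and variable names are illustrative.

import numpy as np

def reciprocal_strengths(W):
    # The four trade marginals of Eqs. (31)-(34); W[i, j] is the weight of the
    # link from supplier i to user j (zero diagonal assumed).
    W = np.asarray(W, dtype=float)
    A = (W > 0).astype(float)
    a_nr_out = A * (1.0 - A.T)                  # a_ij^->
    a_nr_in = A.T * (1.0 - A)                   # a_ij^<-
    a_rec = A * A.T                             # a_ij^<->
    s_nr_out = (a_nr_out * W).sum(axis=1)       # s_i^->        (Eq. 31)
    s_nr_in = (a_nr_in * W.T).sum(axis=1)       # s_i^<-        (Eq. 32)
    s_rec_out = (a_rec * W).sum(axis=1)         # s_i^<->,out   (Eq. 33)
    s_rec_in = (a_rec * W.T).sum(axis=1)        # s_i^<->,in    (Eq. 34)
    return s_nr_out, s_nr_in, s_rec_out, s_rec_in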

Solving the constrained conditional maximum entropy problem, we obtain the conditional weighted graph probability in Eq. (22) where the graph Hamiltonian is given by

$$\begin{aligned} H(W) = \sum _{i} \beta _{i}^{\rightarrow } s_{i}^{\rightarrow } + \beta _{i}^{\leftarrow } s_{i}^{\leftarrow } + \beta _{i}^{\leftrightarrow , out} s_{i}^{\leftrightarrow , out} + \beta _{i}^{\leftrightarrow , in} s_{i}^{\leftrightarrow , in} \end{aligned}$$
(35)

leading to

$$\begin{aligned} Q(W|A) = \prod _{i,j \ne i;\, a_{ij}=1} q_{ij}(w_{ij}|a_{ij}=1) \end{aligned}$$
(36)

where \(q_{ij}(w_{ij}|a_{ij})\), for a single dyad, depends on the state of the link carrying \(w_{ij}\), namely

$$\begin{aligned} {\left\{ \begin{array}{ll} (\beta _{i}^{\rightarrow }+\beta _{j}^{\leftarrow })e^{-(\beta _{i}^{\rightarrow }+\beta _{j}^{\leftarrow })w_{ij}^{\rightarrow }} &{}\text {for}\, w_{ij}^{\rightarrow }> 0 \\ (\beta _{i}^{\leftrightarrow ,out}+\beta _{j}^{\leftrightarrow ,in})e^{-(\beta _{i}^{\leftrightarrow ,out}+\beta _{j}^{\leftrightarrow ,in})w_{ij}^{\leftrightarrow , out}} &{}\text {for}\, w_{ij}^{\leftrightarrow , out} > 0 \\ 0 &{}\text {for }\, w_{ij} = 0. \end{array}\right. } \end{aligned}$$
(37)

Rephrasing the vector \(\{ a_{ij}^{\rightarrow },a_{ij}^{\leftarrow },a_{ij}^{\leftrightarrow },a_{ij}^{\nleftrightarrow }\}\) of \(a_{ij}\)-states in terms of the vector of ensemble averages \(\{ f_{ij}^{\rightarrow },f_{ij}^{\leftarrow },f_{ij}^{\leftrightarrow },f_{ij}^{\nleftrightarrow }\}\), where \(f_{ij}^{(\cdot )} = \langle a_{ij}^{(\cdot )} \rangle\) depends on the binary model of choice, we can use GLE to estimate the 4N parameters. The resulting generalized log-likelihood is separable into a reciprocated and a non-reciprocated component, i.e., \(\mathcal {G}_{\mathbf {\beta }} = \mathcal {G}_{\mathbf {\beta }}^{\rightarrow } + \mathcal {G}_{\mathbf {\beta }}^{\leftrightarrow }\) (see Appendix B for details). The Lagrange parameters \(\mathbf {\beta }\) are retrieved by maximizing \(\mathcal {G}_{\mathbf {\beta }}\), which amounts to solving two uncoupled systems of 2N coupled equations each, reading

$$\begin{aligned} {\left\{ \begin{array}{ll} s_i^{\rightarrow } &{}= \sum _{j\ne i} \dfrac{f_{ij}^{\rightarrow }}{\beta _{i}^{\rightarrow }+\beta _{j}^{\leftarrow }} = \langle s_{i}^{\rightarrow } \rangle \\ s_i^{\leftarrow } &{}= \sum _{j\ne i} \dfrac{f_{ij}^{\leftarrow }}{\beta _{i}^{\leftarrow }+\beta _{j}^{\rightarrow }} = \langle s_{i}^{\leftarrow } \rangle \end{array}\right. } \end{aligned}$$
(38)

for the non-reciprocated sub-problem and

$$\begin{aligned} {\left\{ \begin{array}{ll} s_i^{\leftrightarrow , out} &{}= \sum _{j\ne i} \dfrac{f_{ij}^{\leftrightarrow }}{\beta _{i}^{\leftrightarrow , out}+\beta _{j}^{\leftrightarrow , in}} = \langle s_{i}^{\leftrightarrow , out} \rangle \\ s_i^{\leftrightarrow , in} &{}= \sum _{j\ne i} \dfrac{f_{ij}^{\leftrightarrow }}{\beta _{i}^{\leftrightarrow , in}+\beta _{j}^{\leftrightarrow , out}} = \langle s_{i}^{\leftrightarrow , in} \rangle \end{array}\right. } \end{aligned}$$
(39)

for the reciprocated sub-problem (Supplementary Information).
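Since Eqs. (38) and (39) share the algebraic structure of Eq. (30) (indeed \(f_{ij}^{\leftarrow } = f_{ji}^{\rightarrow }\) and \(f_{ij}^{\leftrightarrow }\) is symmetric), a minimal sketch can reuse the illustrative solve_crem_a routine given after Eq. (30). Here f_nr is assumed to be the matrix of non-reciprocated link probabilities \(f_{ij}^{\rightarrow }\) and f_rec the matrix of reciprocated link probabilities \(f_{ij}^{\leftrightarrow }\), obtained from the chosen binary model (e.g. the RBCM probabilities of Eq. (18)), while the strength sequences are those computed in the sketch following Eqs. (31)-(34).

# Illustrative use of the solve_crem_a sketch for the two uncoupled sub-problems.
beta_nr_out, beta_nr_in = solve_crem_a(f_nr, s_nr_out, s_nr_in)        # Eq. (38)
beta_rec_out, beta_rec_in = solve_crem_a(f_rec, s_rec_out, s_rec_in)   # Eq. (39)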