Author Correction: Hypergraph reconstruction from network data

The empirical findings are qualitatively unchanged; the authors have found highly similar compression ratios when comparing the clique descriptions to the optimal descriptions obtained by minimising the description length (DL). The exact values of the minimum description length (MDL) differ because the modification to Eq. (12) changes the absolute value of P(H), and because the Markov chain Monte Carlo algorithm is stochastic and therefore yields slightly different results from run to run. Below is a detailed list of the changes brought about by modifying Eq. (12) and running new simulations.


Changes to the main text
In the subsection "Hypergraph prior", in the seventh paragraph, right before Eq. (10), "calculated" has been replaced by "approximated."
The original Eq. (10) read […] and has now been replaced by […].
The justification for Eq. (10), which in the original text read "first computing the reciprocal of the probability that two nodes are not connected by any hyperedge in the hypergraph and then multiplying the result by the total number of node pairs", has been replaced by "by assuming that hyperedges do not overlap on average."
The original Eq. (11) read μ = (L − 1) log 1 […] and has now been replaced by μ = E/(L − 1).
Eq. (12) was originally […] and has now been replaced by […].
The text immediately following Eq. (12), which read "We note that μ diverges as the density $E/\binom{N}{2}$ of G approaches one, correctly reflecting the fact that even an infinitely dense hypergraph could have generated the data. This divergence is a sign that our empirical prior is not well-defined in the extremely dense limit. But as we have discussed in the introduction, the empirical networks we typically encounter are sparse by construction – we need not worry about this limit in practice.", has been removed since the approximated equation for μ does not diverge. Instead, Eq. (12) is now followed by "which is the equation we will use henceforth, with μ = E/(L − 1)."
The scaling equation for log P(H|G) appearing in "Results and discussion", under the heading "Properties of the posterior distribution," was […] and now reads […].
The conditioning of this equation on G is discussed in the text following the equation, which originally stated: "This equation tells us that the log-posterior log P(H|G) decreases with growing β, because the argument of the logarithm is at least one. Furthermore we have P(G|H) = 1 by construction, ... " It now reads: "This equation tells us that the log-posterior log P(H|G) decreases with growing β, because the argument of the logarithm is greater or equal to one. Furthermore, the likelihood equals one by construction...."
The ratio of posterior distributions under changes to a minimal hypergraph, originally appearing in the third paragraph of the Results subsection titled "Properties of the posterior distribution", was written as P(H′_m|G)/P(H_m|G) = […] and has now been corrected to […].
The paragraph following this equation read "This ratio is smaller than one: the minimal property of H_m implies that $E_k < \binom{N}{2}$, and the term in the parenthesis is greater than one because N ≫ k. As a result, adding a spurious hyperedge to a minimal hypergraph decreases the posterior probability. As a corollary of the two above observations, we conclude that the minimal hypergraphs are high-quality local maxima of P(H|G). We cannot simply pick one of these optima as our reconstruction, however, because there may exist multiple ones of comparable quality. Further, non-optimal hypergraphs may account for a significant fraction of the posterior probability in principle. Instead, we handle these possibly conflicting descriptions by combining them." It now reads "where Z′_k is the quantity in Eq. (7) for the modified minimal hypergraph, and Z_k is the same quantity for the minimal hypergraph. One can show that this ratio is always smaller than one and that, as a result, adding a spurious hyperedge to a minimal hypergraph decreases the posterior probability. The proof is straightforward and relies on the observation that for a minimal […] The result follows by direct computation when $E_k < \binom{N}{2}$ and uses the fact that Z′_k = 2 when $E_k = \binom{N}{2}$ (because adding a single hyperedge to a completely connected minimal hypergraph means one has to double-up one hyperedge)."
In "Results and discussion", under the heading "Detailed case study of higher-order interactions in an empirical network," the numbers of nodes and edges in the Football dataset have been added.
The text has changed from "The nodes of this network represent teams playing..." to "The 115 nodes of this network represent teams playing..." and "The relationships between teams are viewed through the lens of pairwise interactions ..." now reads "The relationships between teams are viewed through the lens of 613 pairwise relationships..." Under the subheading "Best model fit," four numerical changes have been made. The number of hyperedges of H* involving more than two nodes was 86 and is now 30. The description length was 4123.3 bits and is now 2405.8 bits, which represents a 43.3% saving (instead of the previous 33.6%) over the description length of the maximal clique hypergraph, which is now 4246.5 bits (instead of 6208.5 bits).
This change appears in two locations: before Eq. (17), and in the caption of Fig. 6.
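The reported savings can be checked from the description lengths quoted above. The following is a minimal sketch of that arithmetic only; the function name `saving` and the variable names are illustrative and do not come from the article or its code.

```python
def saving(dl_model: float, dl_baseline: float) -> float:
    """Fractional saving of dl_model relative to dl_baseline (both in bits)."""
    return 1.0 - dl_model / dl_baseline


# Description lengths (bits) quoted in the correction for the Football data.
corrected = saving(2405.8, 4246.5)  # best fit vs. maximal clique hypergraph, corrected
original = saving(4123.3, 6208.5)   # same comparison, originally published values

print(f"corrected saving: {corrected:.1%}")
print(f"original saving:  {original:.1%}")
```

Both ratios round to the percentages stated in the text (43.3% and 33.6%, respectively).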
The following sentence read "With a threshold of α = 0.05, we find six uncertain triangles (hyperedges on three nodes) and five uncertain edges in the Football data", and now reads instead "With a threshold of α = 0.05, we find 16 uncertain triangles (hyperedges on three nodes), 70 uncertain edges, and 9 additional uncertain interactions of higher orders." After Eq. (17), the threshold for uncertainty is now updated to S* ≈ 0.286 (instead of S* = 0.169).
On page 9, first paragraph, the correlation coefficients reported under the heading "Systematic analysis of higher-order interactions in empirical networks" have been updated to reflect the results obtained with the new posterior distribution. The average degree of the nodes still correlates with compression (τ = 0.52, previously 0.53) and, as before, the average local clustering does not correlate with compression (τ = 0.03, previously τ = −0.07). The average interaction size is no longer significantly correlated with the tested properties.
As a consequence, in the second paragraph, the sentence "The correlation between local properties and interaction size is not as strong as with compression, but there are some dependencies (τ = 0.40 and τ = 0.27 for the degree and local clustering, respectively). These might be partly explained by constraints on the possible values that the average interaction size 〈s〉 can adopt." is now rewritten as: "The correlation between local properties and interaction size is weak (τ = 0.09 and τ = 0.12 for the degree and local clustering, respectively). Nonetheless, we expect some weak dependencies as these network properties put constraints on the possible values that the average interaction size 〈s〉 can adopt."
In the second paragraph of page 9, the average interaction size has been updated. The sentence "Other datasets yield hypergraphs with large interactions on average, involving as many as five nodes in the airport network." is now changed to "Other datasets yield hypergraphs with large interactions on average, involving as many as 4.4 nodes in the airport network."
In the third paragraph of the conclusion, on page 10, the word "undoubtedly" was removed from the sentence "The method we have proposed here is undoubtedly one of the simplest instantiations...", which now reads "The method we have proposed here is one of the simplest instantiations..."
[…] Table 2 below for detailed numerical values). See Correction Fig. 5 for the original version of Fig. 7.

Changes to the supplementary information
Supplementary note 1. Equations (2) and (4) have been updated to reflect the matching changes made to Eqs. (12) and (11) of the main text, respectively. In particular, Supplementary Equation (2) was […] and has now been replaced by […]. Supplementary Equation (4) was […] and has now been replaced by […].
The caption title of Supplementary Table 1 was "Properties of the empirical bipartite networks analyzed in Section II.E of the main text.", which erroneously referred to an invalid reference (Section II.E), and has now been changed to "Properties of the empirical bipartite networks analyzed in Fig. 4 of the main text."
The caption title of Supplementary Table 2 was "Properties of the empirical bipartite networks analyzed in Section II.G of the main text.", which erroneously referred to an invalid reference (Section II.G), and has now been changed to "Properties of the empirical bipartite networks analyzed in Fig. 7 of the main text." See Correction Fig. 6 for the original version of Supplementary Table 2.
All corrections described above have now been implemented in both the HTML and PDF versions of the article. The Supplementary Information file has also been updated with the corrected version.