Generative hypergraph models and spectral embedding

Many complex systems involve interactions between more than two agents. Hypergraphs capture these higher-order interactions through hyperedges that may link more than two nodes. We consider the problem of embedding a hypergraph into low-dimensional Euclidean space so that most interactions are short-range. This embedding is relevant to many follow-on tasks, such as node reordering, clustering, and visualization. We focus on two spectral embedding algorithms customized to hypergraphs, which recover linear and periodic structures, respectively. In the periodic case, nodes are positioned on the unit circle. We show that the two spectral hypergraph embedding algorithms are associated with a new class of generative hypergraph models. These models generate hyperedges according to node positions in the embedded space and encourage short-range connections. They allow us to quantify the relative presence of periodic and linear structures in the data through maximum likelihood. They also improve the interpretability of node embedding and provide a metric for hyperedge prediction. We demonstrate the hypergraph embedding and follow-on tasks—including quantifying the relative strength of structures, clustering, and hyperedge prediction—on synthetic and real-world hypergraphs. We find that the hypergraph approach can outperform clustering algorithms that use only dyadic edges. We also compare several triadic edge prediction methods on high school and primary school contact hypergraphs, where our algorithm improves upon benchmark methods when the amount of training data is limited.

A typical graph-based data set captures pairwise interactions between nodes. There is growing interest in understanding higher-order, group-level interactions, with different paradigms being proposed 1,2. In this work, we represent such interactions with a hypergraph formulation; here each hyperedge involves two or more nodes. This framework is discussed in [3][4][5] and has found application in real-world problems such as epidemic spread modelling 6, image classification 7, and the study of biological networks 8.
A fundamental learning task on graph-based data is to embed nodes into a low-dimensional Euclidean space 9,10 . The learned embedding could be used in follow-on tasks such as clustering, classification, and structure recovery. There are various types of learning algorithms for graphs; some design and analyze Laplacian matrices related to the graph 11 , some solve maximum likelihood problems associated with random graph models 12,13 , and others involve more complex machine learning frameworks 14,15 .
In this work, we build on the use of spectral methods which derive node embeddings from eigenvectors of a Laplacian matrix 16 . Such spectral algorithms are popular, since they can be implemented efficiently on large sparse graphs and they are backed up by accompanying consistency theory 17 . Two main approaches have also recently been proposed for spectral clustering on hypergraphs. One approach is to employ higher-order Laplacian tensors 18 . Tensors in general contain richer information; however, their use can require considerably more computational expense than matrix algorithms, and the results can be difficult to visualize and interpret. A second approach is to "flatten" the higher-order information into a representative node-level matrix. Some matrix-based approaches analyze the vertex-edge incidence matrix associated with a random walk interpretation 9 ; other frameworks utilise motif-based Laplacian matrices that can be generalized to various motifs and time steps 10,19 . The methodology that we develop here fits into this second category by building a node-based matrix, using an intermediate step that looks over all hyperedge dimensions in order to incorporate higher-order information.
A second aspect of our work is the connection between spectral methods and random models. Many graph embedding 20 , re-ordering 12 , clustering 21 , and structure recovery 22,23 techniques solve maximum-likelihood problems on graphs assuming specific generative models. Besides their application in these inverse problems, random graph models are useful inference tools for quantifying structure, predicting new or missing links, and improving the interpretability of learning algorithms by relating node embeddings to edge probabilities 24,25 .
Many spectral algorithms are naturally related to optimization problems. This is the case when the Laplacian matrix is Hermitian, so that its eigenvectors are critical points of a quadratic form 26 . For example, spectral embedding for undirected graphs using the standard combinatorial Laplacian is related to minimizing the unnormalized cut 11,27 .
Furthermore, such optimization formulations may lead to interesting random graph interpretations of spectral algorithms. When the quadratic form can be expressed as the log-likelihood of the graph under a suitable model, the optimization problem may be restated as a node reordering. Such connections have been investigated for undirected graphs 28 and directed graphs 25 , and here we extend these ideas to the hypergraph setting.
In particular, we associate customized spectral embedding algorithms with generative models that belong to a new class of range-dependent random hypergraphs that encourages short-range connections between nodes, generalizing existing graph models 12,29 . These range-dependent random hypergraphs offer flexibility that is not available in stochastic block models 21 which require block sizes to be pre-specified or inferred.
The rest of the paper is structured as follows. Our notation is introduced in the "Notation" section. In the "Linear hypergraph embedding" and "Periodic hypergraph embedding" sections we define the linear and periodic hypergraph embedding algorithms and derive associated optimization problems. We propose random models associated with the hypergraph embedding algorithms in the "Generative hypergraph models" section, which leads to a model comparison workflow that quantifies the relative strength of linear versus periodic structures. Numerical studies on synthetic and real-world hypergraphs using the proposed models are presented in the "Experiments" section.
The main contributions of this work are as follows.
• We propose new range-dependent generative models for hypergraphs that generate linear and periodic cluster patterns. • We establish their connection with linear and periodic spectral embedding algorithms.
• We demonstrate on synthetic and real data that, after tuning model parameters to the data, these models can quantify the relative strength of linear and periodic structures. • We perform prediction of triadic hyperedges (triangles) using the proposed linear model and show that it outperforms the existing average-score based method 30 on synthetic hypergraphs, and also on high school and primary school contact data when the amount of training data is limited.

Notation
We consider undirected, unweighted hypergraphs G = (V, E) on the vertex set V containing n nodes and the hyperedge set E. We let R ∈ 𝓡 be an unordered set of nodes, where 𝓡 denotes the collection of all such sets. We use |R| to denote the number of nodes in the tuple R, that is, its cardinality, and we assume 2 ≤ |R| ≤ T for all R ∈ E. Let A_R indicate the presence of a hyperedge, so that A_R = 1 if R ∈ E and A_R = 0 otherwise. We define the t-th order n-by-n adjacency matrix W^[t], whose entry W^[t]_ij counts the number of hyperedges with cardinality t that contain the distinct nodes i and j; hence, W^[t]_ij = |{R ∈ E : |R| = t and i, j ∈ R}| for i ≠ j, with W^[t]_ii = 0. We use i to denote √−1 and 1 to denote the vector in R^n with all entries equal to one. We let a′ represent the transpose of a real-valued vector a and let b^H denote the conjugate transpose of a complex-valued vector b. We use 𝓟 to denote the set of all permutation vectors, that is, all vectors in R^n that contain each of the integers 1, 2, …, n. We will focus on one-dimensional embedding. We let x_i ∈ R be the location to which node i is assigned, and write x = (x_1, x_2, …, x_n)′ ∈ R^n for the vector of node locations.

Linear hypergraph embedding
Given a hypergraph, suppose we wish to find node embeddings x ∈ R^n such that hyperedges tend to contain nodes that are a small distance apart. To formalize this idea, we can define a linear incoherence function I_lin(x, R) that sums the squared Euclidean distances between all node pairs in the tuple R:

I_lin(x, R) = Σ_{i,j∈R, i<j} (x_i − x_j)².    (1)

We may then define the total linear incoherence of the hypergraph, η_lin(G, x), by aggregating the linear incoherence over all hyperedges. Furthermore, we may wish to tune the weights of hyperedges of different cardinalities through a coefficient c_|R| ≥ 0 for the node tuple R; that is,

η_lin(G, x) = Σ_{R∈E} c_|R| I_lin(x, R).    (2)

One justification for these tuning parameters c_|R| is that they allow us to avoid the case where high-cardinality hyperedges dominate the expression. For example, we could choose c_t = 1/(t(t − 1)) to balance the contributions from hyperedges of different sizes. A suitable choice of c_t may also depend on the relative importance of hyperedges in the application.
In Proposition 3.1 we show that the total linear incoherence may be written as a quadratic form involving the hypergraph Laplacian matrix

L = Σ_{t=2}^{T} c_t L^[t],  where L^[t] = D^[t] − W^[t] and D^[t] is the diagonal degree matrix with D^[t]_ii = Σ_j W^[t]_ij.    (3)

Proposition 3.1 For any x ∈ R^n, with L defined in (3) and η_lin(G, x) defined in (2), we have

η_lin(G, x) = x′ L x.    (4)

We note that each Laplacian L^[t] is symmetric and positive semi-definite with smallest eigenvalue 0.
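As an illustration, the Laplacian in (3) can be assembled directly from a hyperedge list, and the identity in Proposition 3.1 checked numerically. The following sketch uses NumPy; the function names and the toy hyperedge list are ours, not from the paper.

```python
import numpy as np
from itertools import combinations

def hypergraph_laplacian(n, hyperedges, c):
    """Assemble L = sum_t c_t (D^[t] - W^[t]) from a hyperedge list,
    where W^[t]_ij counts the cardinality-t hyperedges containing i and j."""
    L = np.zeros((n, n))
    for R in hyperedges:
        w = c.get(len(R), 0.0)
        for i, j in combinations(R, 2):
            L[i, j] -= w; L[j, i] -= w   # off-diagonal part: -c_t W^[t]
            L[i, i] += w; L[j, j] += w   # diagonal part: c_t D^[t]
    return L

def total_linear_incoherence(x, hyperedges, c):
    """Equation (2): sum of c_|R| * I_lin(x, R) over all hyperedges."""
    return sum(c.get(len(R), 0.0) * sum((x[i] - x[j]) ** 2
               for i, j in combinations(R, 2)) for R in hyperedges)

# Check the quadratic-form identity on a small random embedding.
rng = np.random.default_rng(0)
n, E, c = 6, [(0, 1), (1, 2, 3), (2, 4, 5), (3, 5)], {2: 1.0, 3: 1 / 3}
L = hypergraph_laplacian(n, E, c)
x = rng.standard_normal(n)
assert np.isclose(x @ L @ x, total_linear_incoherence(x, E, c))
```

Each hyperedge contributes w(x_i − x_j)² to x′Lx for every node pair it contains, which is exactly its weighted linear incoherence.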

Assumption 3.1
We assume that the unweighted, undirected graph associated with the binarized version of L is connected. It then follows that L has a single eigenvalue equal to 0 with all other eigenvalues positive. We further assume that there is a unique smallest positive eigenvalue, λ_2 (the eigenvector v^[2] corresponding to λ_2 is a generalization of the classic Fiedler vector).
In minimizing the total linear incoherence (2) we must avoid the trivial cases where (a) all nodes are located arbitrarily close to the origin and (b) all nodes are assigned to the same location. Hence it is natural to impose the constraints ‖x‖_2 = 1 and x′1 = 0. It then follows from the Rayleigh–Ritz theorem [31, Theorem 4.2.2] that the quadratic form in Proposition 3.1 is minimized by x = v^[2]. This leads us to Algorithm 1 below, which could also be considered a special case of the algorithm in 18, where the motifs considered are hyperedges.

Remark 3.1 Algorithm 1 could be extended to higher-dimensional embeddings where node i is assigned to x^[i] ∈ R^d. In this case we could generalize (1) to

I_lin(x, R) = Σ_{i,j∈R, i<j} ‖x^[i] − x^[j]‖².    (5)

If we require the coordinate directions to be orthogonal, then the embedding is found via the eigenvectors corresponding to the d smallest non-zero eigenvalues; see 27 for details in the graph case.
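The embedding step of Algorithm 1 amounts to one symmetric eigenvalue problem. A minimal sketch follows (our own illustrative code); we test it on a plain dyadic path graph, where the generalized Fiedler vector is known to order the nodes monotonically along the line.

```python
import numpy as np

def linear_spectral_embedding(L):
    """Embedding step of Algorithm 1: return the eigenvector of L
    associated with the smallest non-zero eigenvalue (the generalized
    Fiedler vector). Assumption 3.1 makes this eigenvector well defined."""
    eigenvalues, eigenvectors = np.linalg.eigh(L)  # ascending eigenvalues
    return eigenvectors[:, 1]                      # skip the zero eigenvalue

# Path graph on 6 nodes: the embedding should order nodes 0..5 monotonically.
n = 6
L = np.zeros((n, n))
for i, j in zip(range(n - 1), range(1, n)):
    L[i, i] += 1; L[j, j] += 1; L[i, j] -= 1; L[j, i] -= 1
x = linear_spectral_embedding(L)
order = list(np.argsort(x))
assert order == list(range(n)) or order == list(range(n - 1, -1, -1))
```

The sign of an eigenvector is arbitrary, so both the forward and the reversed ordering are acceptable recoveries of the linear structure.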

Periodic hypergraph embedding
In this section, we look at the periodic analogue of linear hypergraph embedding. Here nodes are embedded into the unit circle rather than along the real line. Such a periodic structure formed the basis of the classic "small world" model of Watts and Strogatz 32 . Results in 24 showed that certain real networks are better represented via this type of "wrap-around" notion of distance. Hence, it is of interest to develop concepts that apply to the hypergraph case.
We may position nodes on the unit circle by mapping them to phase angles θ = {θ_i}_{i=1}^n ∈ [0, 2π). We may then use a periodic incoherence function to quantify the distance between node pairs in the tuple R:

I_per(θ, R) = Σ_{i,j∈R, i<j} |e^{iθ_i} − e^{iθ_j}|².    (6)

Then the total periodic incoherence of the hypergraph becomes

η_per(G, θ) = Σ_{R∈E} c_|R| I_per(θ, R).    (7)

In Proposition 4.1 below, we relate the total periodic incoherence to a quadratic form involving the hypergraph Laplacian matrix (3).
Proposition 4.1 For any θ ∈ [0, 2π)^n, let ψ ∈ C^n have components ψ_j = e^{iθ_j}. Then, with L defined in (3),

η_per(G, θ) = ψ^H L ψ.    (8)

Proof Since |ψ_i − ψ_j|² = |e^{iθ_i} − e^{iθ_j}|², the quadratic form ψ^H L ψ expands over node pairs in the same way as in the linear case. Then the proof may be completed in a similar way to the proof of Proposition 3.1.
Appealing again to the Rayleigh–Ritz theorem [31, Theorem 4.2.2], the quadratic form in (8) is minimized over all ψ ∈ C^n with ‖ψ‖_2 = 1 and ψ^H 1 = 0 by taking ψ = v^[2]. However, this real-valued eigenvector cannot be proportional to a vector with components of the form e^{iθ_j}. Hence, following the approach in 24, we will use the heuristic of setting θ_i via v^[2]_i + i v^[3]_i = |v^[2]_i + i v^[3]_i| · e^{iθ_i}, where v^[3] is an eigenvector corresponding to the next-smallest eigenvalue of L. This heuristic converts two real eigenvectors into a complex vector, which gives an approximate solution to the minimization problem. The resulting workflow is summarized in Algorithm 2.
We also note that for simple unweighted, undirected graphs, finding θ that minimizes η_per(G, θ) is equivalent to the formulation proposed in 24. This may be shown by letting u_i = cos θ_i and z_i = sin θ_i and expanding each term of (7) as |e^{iθ_i} − e^{iθ_j}|² = (u_i − u_j)² + (z_i − z_j)², which simplifies to 2 − 2(u_i u_j + z_i z_j). This is essentially equation (3.1) in 24, derived from a slightly different perspective. We then arrive at Algorithm 2 below.
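The phase-angle heuristic can be sketched as follows (illustrative code of our own). We test it on a plain dyadic cycle graph, where the recovered angles should be equispaced around the unit circle; note that the cycle's repeated eigenvalue technically violates the uniqueness part of Assumption 3.1, yet the heuristic still recovers the periodic ordering.

```python
import numpy as np

def periodic_spectral_embedding(L):
    """Algorithm 2 heuristic: combine the eigenvectors of the two smallest
    non-zero eigenvalues into a complex vector and read off phase angles
    theta_i in [0, 2*pi)."""
    eigenvalues, V = np.linalg.eigh(L)
    psi = V[:, 1] + 1j * V[:, 2]          # v^[2] + i v^[3]
    return np.mod(np.angle(psi), 2 * np.pi)

# Cycle graph on 8 nodes: angles should come out equispaced.
n = 8
L = np.zeros((n, n))
for k in range(n):
    i, j = k, (k + 1) % n
    L[i, i] += 1; L[j, j] += 1; L[i, j] -= 1; L[j, i] -= 1
theta = periodic_spectral_embedding(L)
assert np.allclose(np.diff(np.sort(theta)), 2 * np.pi / n, atol=1e-6)
```

Any orthonormal basis of the cycle's two-dimensional eigenspace yields angles of the form constant ± 2πk/n, so the node ordering around the circle is recovered up to rotation and reflection.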

Generative hypergraph models
We now discuss a connection between the minimization of total incoherence and generative models. Let us consider finding a node embedding x ∈ R^n that minimizes a generic total hypergraph incoherence

η(G, x) = Σ_{R∈E} c_|R| I(x, R),    (11)

for a non-negative incoherence function I(x, R). We consider the case where the x_i ∈ R must take distinct values from a discrete set {ν_1, ν_2, …, ν_n}. In the linear case, this set may be the integers from 1 to n, and in the periodic case this set may be equally spaced angles in [0, 2π).
Now consider a random hypergraph model where each hyperedge involving node tuple R ∈ 𝓡 is generated independently with probability

P(A_R = 1) = f_R(x),    (12)

for a function f_R that takes values between 0 and 1. We have the following connection.
Theorem 5.1 Suppose x ∈ R^n is constrained to take values from a discrete set such that x_i = ν_{p_i}, where p ∈ 𝓟 is a permutation vector. Then minimizing the total incoherence (11) over all such x is equivalent to maximizing over all such x the likelihood that the hypergraph is generated by a model of the form (12), where

f_R(x) = 1 / (1 + e^{γ c_|R| I(x, R)}),    (13)

for any positive γ.
Proof Using (12), the likelihood of the whole hypergraph is

Π_{R∈𝓡} f_R(x)^{A_R} (1 − f_R(x))^{1−A_R},

which leads to the log-likelihood

−γ Σ_{R∈E} c_|R| I(x, R) + Σ_{R∈𝓡} log(1 − f_R(x)).

The second term on the right-hand side, which is the log of the probability of the null hypergraph, is independent of the permutation. Hence, with (13), maximizing the log-likelihood of the hypergraph is equivalent to minimizing Σ_{R∈E} c_|R| I(x, R) = η(G, x).

Remark 5.1 Theorem 5.1 could be extended to the case where node i is assigned to x^[i] ∈ R^d for d > 1, and the higher-dimensional incoherence function in (5) is considered. In this scenario, we constrain the x^[i] to take distinct values from a discrete set of points in R^d, indexed by a permutation vector p ∈ 𝓟. Then we could follow the same arguments as in Theorem 5.1 to derive a model described by (12) and (13).

For a hypergraph generated by model (13), the number of hyperedges connecting the node tuple R follows a Bernoulli distribution with probability 1/(1 + e^{γ c_|R| I(x, R)}). The log-odds of the hyperedge decay linearly with the incoherence of the node tuple, since

log( f_R(x) / (1 − f_R(x)) ) = −γ c_|R| I(x, R),

where the factor γ c_|R| determines the decay rate. The probability of a hyperedge is highest when all nodes overlap, i.e., I(x, R) = 0, which gives probability 1/2. If we generate hyperedges in repeated trials for the node tuple R, the variance of the number of hyperedges per trial is e^{γ c_|R| I(x, R)} / (1 + e^{γ c_|R| I(x, R)})². When I(x, R) = 0, the largest variance of 1/4 is achieved. The expected total number of hyperedges of the whole hypergraph G can be expressed as Σ_{R∈𝓡} f_R(x).

We note that Theorem 5.1 introduces the extra scaling parameter γ. This parameter plays no direct role in Algorithms 1 and 2. However, a value for γ is needed if we wish to compare the likelihoods of the two models having inferred the embeddings. In principle, we may fit the parameter γ to a given hypergraph by matching the observed number of hyperedges with their expectation.
However, from a computational point of view, this is rather challenging in general, since the computational complexity of the expectation calculation is O(n^T) when the maximum cardinality of a considered hyperedge is T. Hence, given an embedding, in practice we prefer to pick γ by maximizing the likelihood, as described in the following subsection.
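Given an embedding, the log-likelihood is a one-dimensional function of γ, so a simple grid search suffices. The sketch below (function and parameter names are ours) evaluates the likelihood of model (12)-(13) with the linear incoherence by enumerating all candidate tuples up to a small maximum cardinality, which is only feasible for small n.

```python
import numpy as np
from itertools import combinations

def log_likelihood(gamma, x, hyperedges, c, max_card=3):
    """Log-likelihood of an observed hypergraph under model (12)-(13):
    each candidate tuple R is present independently with probability
    f_R = 1 / (1 + exp(gamma * c_|R| * I_lin(x, R)))."""
    observed = {tuple(sorted(R)) for R in hyperedges}
    ll = 0.0
    for t in range(2, max_card + 1):
        for R in combinations(range(len(x)), t):
            I = sum((x[i] - x[j]) ** 2 for i, j in combinations(R, 2))
            f = 1.0 / (1.0 + np.exp(gamma * c[t] * I))
            ll += np.log(f) if R in observed else np.log(1.0 - f)
    return ll

# Grid search for gamma* on a toy embedding with two tight clusters.
x = np.array([0.0, 0.1, 2.0, 2.1])
E = [(0, 1), (2, 3), (0, 1, 2)]
c = {2: 1.0, 3: 1 / 3}
grid = np.linspace(0.1, 10.0, 100)
gamma_star = max(grid, key=lambda g: log_likelihood(g, x, E, c))
```

For larger hypergraphs one would restrict the enumeration, for example to observed tuples plus a sample of non-edges, to avoid the O(n^T) cost noted above.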

Model comparison.
Under the assumption that a given hypergraph arose from a mechanism that favours connections between "nearby" nodes (in some latent, unobservable configuration), it is of interest to know whether a linear or a periodic distance provides a better description. We may address this question using a model comparison approach. As in the "Linear hypergraph embedding" and "Periodic hypergraph embedding" sections, we consider one-dimensional embeddings, so that both the linear and periodic versions have n + 1 parameters given the Laplacian coefficients: n node embeddings and a decay parameter γ. The node embeddings are estimated from Algorithms 1 and 2. For any choice of γ, we may then calculate the corresponding likelihood for each type of hypergraph, given the embedding. We may then compare the models by reporting plots of likelihood versus γ or by reporting the maximum likelihood over all γ. We note that Theorem 5.1 states that node embeddings that minimize the incoherence also maximize the hypergraph likelihood under the given discrete constraints, whereas Algorithms 1 and 2 minimize the linear and periodic incoherence after relaxing those discrete constraints for computational feasibility. Such relaxations are often used in discrete programming. Therefore, instead of the exact maximum likelihood, we obtain an estimated maximum likelihood. An overall workflow is shown below in Algorithm 3.
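The workflow can be sketched end-to-end on a toy example. The simplified code below is ours: it uses only dyadic edges, a coarse grid for γ, and a ring-shaped data set, for which the periodic model should attain the higher estimated maximum likelihood.

```python
import numpy as np
from itertools import combinations

def embed(L, periodic):
    """Algorithms 1 and 2: linear (real) or periodic (phase-angle) embedding."""
    _, V = np.linalg.eigh(L)
    if periodic:
        return np.mod(np.angle(V[:, 1] + 1j * V[:, 2]), 2 * np.pi)
    return V[:, 1]

def pair_incoherence(pos, i, j, periodic):
    """Dyadic incoherence: periodic (6) or linear (1) distance."""
    if periodic:
        return abs(np.exp(1j * pos[i]) - np.exp(1j * pos[j])) ** 2
    return (pos[i] - pos[j]) ** 2

def estimated_max_log_lik(pos, edges, n, periodic):
    """Maximize the dyadic log-likelihood of model (13) over a gamma grid."""
    E = {tuple(sorted(e)) for e in edges}
    def ll(g):
        total = 0.0
        for i, j in combinations(range(n), 2):
            f = 1.0 / (1.0 + np.exp(g * pair_incoherence(pos, i, j, periodic)))
            total += np.log(f) if (i, j) in E else np.log(1.0 - f)
        return total
    return max(ll(g) for g in np.linspace(0.1, 10.0, 50))

# Ring data: dyadic edges around a cycle should favour the periodic model.
n = 10
edges = [(k, (k + 1) % n) for k in range(n)]
L = np.zeros((n, n))
for i, j in edges:
    L[i, i] += 1; L[j, j] += 1; L[i, j] -= 1; L[j, i] -= 1
lin = estimated_max_log_lik(embed(L, False), edges, n, False)
per = estimated_max_log_lik(embed(L, True), edges, n, True)
assert per > lin
```

The linear embedding of a ring folds the cycle onto a line, assigning near-identical positions to some unconnected node pairs, which the likelihood penalizes; the periodic embedding keeps every edge at the same short arc length.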

Experiments
Model comparison. Synthetic hypergraphs. In this section we test the performance of Algorithms 1, 2 and 3 in a controlled setting. To do this, we generate hypergraphs with either linear or periodic clustered structure using the proposed random model. For simplicity, we only consider dyadic and triadic edges, although the experiments could be extended to include higher-order hyperedges.
Linear hypergraph with clustered nodes. We first generate hypergraphs with K planted clusters C_1, C_2, …, C_K of size m, and n = mK nodes. We embed the nodes using x_i = 2(l − 1) + ε_i for i ∈ C_l, where ε_i ~ U(−a, a) is additive uniform noise. Hyperedges are then drawn randomly according to model (13) with the linear incoherence (1).
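The sampling step can be sketched as follows (illustrative code of our own; a small m and K keep the tuple enumeration cheap).

```python
import numpy as np
from itertools import combinations

def sample_linear_hypergraph(m, K, gamma0, c, a=0.05, seed=1):
    """Draw dyadic and triadic hyperedges from model (13) with the linear
    incoherence (1): cluster C_l sits near x = 2(l - 1), plus U(-a, a) noise."""
    rng = np.random.default_rng(seed)
    n = m * K
    x = np.repeat(2.0 * np.arange(K), m) + rng.uniform(-a, a, size=n)
    E = []
    for t in (2, 3):
        for R in combinations(range(n), t):
            I = sum((x[i] - x[j]) ** 2 for i, j in combinations(R, 2))
            if rng.random() < 1.0 / (1.0 + np.exp(gamma0 * c[t] * I)):
                E.append(R)
    return x, E

# With a strong decay parameter, dyadic edges stay within clusters:
# cross-cluster pairs have incoherence of at least (2 - 2a)^2.
x, E = sample_linear_hypergraph(m=5, K=3, gamma0=4.0, c={2: 1.0, 3: 1 / 3})
cluster = np.repeat(np.arange(3), 5)
assert all(cluster[R[0]] == cluster[R[1]] for R in E if len(R) == 2)
assert len(E) > 0
```

Within a cluster the incoherence is near zero, so each candidate tuple appears with probability close to the model maximum of 1/2, producing the dense diagonal blocks seen in Fig. 1a,b.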
We note that, in practice, the embedding algorithms must choose values c_2 and c_3 in order to form the hypergraph Laplacian, and the model comparison algorithm must choose a value for γ. We are therefore interested in the sensitivity of the process with respect to c_2 and c_3, and in the accuracy with which γ can be estimated. We use c_2, c_3 and γ_0 to denote parameters used by the generative model to create the synthetic data; we also let c*_2 and c*_3 denote the corresponding parameters used in the spectral embedding algorithms and let γ* represent an inferred value of γ_0. We choose c_2 = 1 and c_3 = 1/3 so that the weight of a hyperedge is inversely proportional to the number of node pairs involved. We let m = 50, K = 5, a = 0.05, and vary the decay parameter γ_0 from 0 to 10. Figure 1a shows an example of the dyadic adjacency matrix, W^[2], with γ_0 = 4, where dots represent non-zeros. A corresponding triadic adjacency matrix, W^[3], is shown in Fig. 1b. In all our tests we discard hypergraphs that do not satisfy Assumption 3.1.
For each synthetic hypergraph, we estimate the maximum log-likelihood assuming a linear or a periodic structure using Algorithm 3. For each input decay parameter γ_0, 40 hypergraphs are generated independently and the average maximum log-likelihood is plotted in Fig. 1c. The shaded regions represent the estimated 80% confidence interval. In this case, the linear model correctly achieves a higher average maximum log-likelihood. The tight confidence interval suggests that the result is consistent across random trials.
We then perform K-means clustering using the periodic and linear embeddings assuming 5 clusters and plot the Adjusted Rand Index (ARI) [33][34][35] in Fig. 1d. Here, a larger ARI indicates a better clustering result. The dotted line shows the average over 40 independent trials for each γ_0 value and the shaded area is the estimated 80% confidence interval. The plot suggests that the clustering from the linear embedding outperforms the clustering from the periodic embedding.
We are interested in the effect of the parameters c_3 and c*_3 that control the weight of triadic edges in the random graph model and the spectral embedding algorithm, respectively. To conduct an experiment, we fix the weight of dyadic edges c_2 = 1, c*_2 = 1, and the decay parameter γ_0 = γ* = 1, while varying c_3 and c*_3. The maximum log-likelihood of the linear model (Fig. 1e) and the ARIs using the linear embedding (Fig. 1f) are shown as heatmaps over c_3 and c*_3. Values are the average over 40 random trials. Overall, choosing c*_3 = c_3 gives the highest maximum likelihood. Therefore, when the true c_3 is not known, it could be estimated using a maximum likelihood method. In terms of the clustering result, we note that when c_3 is large, for example, when c_3 > 0.3, using information from triadic edges by setting c*_3 > 0 achieves a better ARI than using only dyadic edges, i.e., c*_3 = 0. This is because a large c_3 encourages more triadic edges to be formed within clusters, whereas a small c_3 leads to more triadic edges between clusters. In general, the larger the c_3, the less sensitive the ARI is to the choice of c*_3.

Periodic hypergraph with clustered nodes. To generate hypergraphs with periodic clusters, we use a node embedding based on a vector of angles θ = (θ_1, θ_2, …, θ_n)′ in [0, 2π), forming K clusters C_1, C_2, …, C_K of size m. In particular, we let θ_i = 2π(l − 1)/K + ε_i for i ∈ C_l, where ε_i ~ U(−a, a) is the added noise. The hyperedges are generated using model (13), where the incoherence function is defined in (6). We choose a = 0.05π, c_2 = 1, c_3 = 1/3 and vary the decay parameter γ_0. Examples of the dyadic and triadic adjacency matrices with γ_0 = 1 are shown in Fig. 2a,b.
Using the same approach as in the previous section, we compare the maximum log-likelihood and ARIs assuming linear and periodic structures in Fig. 2c,d. We see that the periodic model achieves a higher maximum, and on average the periodic embedding produces higher ARIs.
Heat-maps in Fig. 2e,f show results for different combinations of c_3 and c*_3 for the periodic embedding algorithm. These results were generated in the same way as for Fig. 1c,d. Higher maximum likelihoods are achieved near the diagonal where c*_3 = c_3; hence the true parameter c_3 for the underlying hypergraph could be estimated using the maximum likelihood method. As in the previous example, when c_3 ≥ 0.3, using the triadic edges (c*_3 > 0) improves the ARI. When c_3 < 0.3, increasing c*_3 leads to an inferior clustering result. However, when c_3 ≥ 0.3, the ARI becomes less sensitive to the choice of c*_3 as long as it is positive.

In summary, these tests indicate that the algorithms are able to correctly distinguish between linear and periodic range-dependency when one such structure is present in the data. We observed that setting c*_3 > 0 improves the ARI when the triadic edges have a strong structural pattern; that is, when c_3 is large. Moreover, when the true parameter c_3 is unknown, we recommend choosing c*_3 based on a maximum likelihood estimate, that is, finding the value c*_3 that returns the largest maximum in Algorithm 3. Such a choice also achieves reasonable ARIs in our synthetic examples, as shown in the diagonal entries of Figs. 1f and 2f.
Real hypergraphs. High school contact data. The high school contact data from 36 records the frequency of student interactions. Students are represented as nodes, and contacts between two or three students are registered as dyadic or triadic edges. We retrieved the hypergraph from 21; it contains 327 nodes, and we study only its dyadic and triadic edges in view of the computational complexity. We construct the hypergraph Laplacian L = c*_2 L^[2] + c*_3 L^[3] and perform linear and periodic spectral embedding. For the linear embedding we map nodes into 3-dimensional Euclidean space using the eigenvectors corresponding to the three smallest eigenvalues that are larger than 0.01. We make this choice because the eigenvector associated with the smallest non-zero eigenvalue has only a few non-zero entries and leads to trivial clusters. We fix c*_2 = 1 and vary c*_3, since only the relative weight c*_3/c*_2 matters in node embedding. The maximum likelihoods and ARIs evaluated using various c*_3 are shown in the left column of Fig. 3. The true clusters are defined by the classes the students came from. Overall, the periodic embedding achieves higher likelihoods and ARIs despite the linear embedding involving more parameters. Since linear clusters tend to have more marginalized groups that are far from other clusters, our results may suggest a lack of marginalisation driven by class membership.
We note that setting c*_3 = 0 causes the algorithm to ignore triangles, and hence to reduce to classical spectral clustering. For the linear algorithm, we see that incorporating triadic edges by using a positive c*_3 can improve the ARI by up to around 0.09. We note that in 21, modularity maximization-based clustering achieved ARI = 1 on the same data. However, those methods have more parameters, which makes the ARI not directly comparable.

Primary school contact data. The primary school contact hypergraph 21 is constructed from the contact pattern between primary students from 10 classes 37. Nodes represent students or teachers, and hyperedges represent their physical contact. Each node is labelled by the class of the student or as a teacher. The hypergraph contains 242 nodes and 11 classes of labels. We extracted the dyadic and triadic edges from the hypergraph and performed the same likelihood comparison and clustering workflow. The periodic embedding achieves its maximum likelihood at c*_3 = 0.7 and overall performs better than the linear embedding in the clustering task. These results may be related to the existence of a teacher group that connects with all student groups. When we arrange the dyadic and triadic adjacency matrices by node classes, these connections appear as off-diagonal entries. As we have shown in Fig. 2a,b, the periodic model tends to produce more off-diagonal connections than the linear model.

Senate bills data.
In the senate bills hypergraph 21,38,39, nodes are US Congresspersons and hyperedges contain the sponsor and co-sponsors of Senate bills. There are in total 294 nodes, and each node is labelled as either Democrat or Republican. We performed the likelihood comparison and clustering with only the dyadic and triadic edges. Since the node degree distribution is highly inhomogeneous, we observe many trivial eigenvectors that are close to indicator functions. To address this issue we trimmed off the top and bottom 2% of nodes by degree, and used the eigenvector associated with the smallest eigenvalue that is greater than 0.01. The linear and periodic models have similar maximum likelihoods and clustering ARIs, as shown in the right column of Fig. 3. In contrast with the previous examples, there are only two clusters present in this data set. Hence the difference between the periodic and linear models, which could be reflected in the connection (or disconnection) pattern between the first and the last group, is less prominent.
Hyperedge prediction. Once the node embeddings are estimated from the spectral algorithms, the probability of hyperedges may be computed from the proposed models. The hyperedge probability can naturally serve as a score for hyperedge prediction. We implement and test such triadic edge prediction on timestamped high school contact data 30,36 , primary school contact data 30,37 , and synthetic linear hypergraphs. The results will be compared against approaches based on average-scores proposed in 30 . Other hyperedge prediction methods include feature-based prediction 40 , model-based prediction 41 , and machine learning-based prediction 42 .
For the high school and primary school contact data, we used three and four eigenvectors, respectively, corresponding to the smallest eigenvalues that are greater than 0.01, to be consistent with the previous section, and only consider dyadic and triadic edges. The hyperedges are sorted by time stamps and split into training and testing data. For example, an 80:20 training/testing split means we use the first 80% of the hyperedges to train the model and the last 20% to test the predictions. When the training ratio is low, the subgraph for training may be disconnected, violating Assumption 3.1. Therefore, we only consider nodes in the largest connected component of the graph associated with the binarized version of L of the training subgraph, and test the prediction on the same set of nodes. Note that in real data, the parameters γ_0, c_2 and c_3 for the hypergraph model are unknown. We fix c*_2 = 1 and choose c*_3 and γ* using a maximum likelihood estimate through a grid search on the training data. On the training set, we assign scores to each triplet using five methods: a random score as a baseline, the hyperedge probability from the linear model, and the arithmetic mean, harmonic mean, and geometric mean from 30. On the test set, we measure the prediction performance with the area under the precision-recall curve (AUC-PR) 43. A precision-recall (PR) curve traces Precision = TP/(TP + FP) and Recall = TP/(TP + FN) for different thresholds, where TP, FP and FN denote true positives, false positives and false negatives. The AUC-PR balances both precision and recall, with a value of 1 indicating perfect prediction at any threshold. Setting c*_3 = 0 would assign a probability of 0.5 to all triplets, which is equivalent to the random score approach if we break ties randomly.
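The competing triplet scores are easy to state side by side. The sketch below (our own illustrative code, with the dyadic weights W and embedding x as assumed inputs) computes the linear-model probability from (13) alongside the three mean-based scores of ref. 30, and shows that the geometric and harmonic means vanish whenever one pair in the triplet has never been connected.

```python
import numpy as np

def triplet_scores(i, j, k, x, W, gamma, c3):
    """Candidate scores for predicting the triadic edge {i, j, k}:
    the linear-model probability from (13), plus the arithmetic,
    geometric and harmonic means of the three dyadic weights."""
    I = (x[i] - x[j]) ** 2 + (x[i] - x[k]) ** 2 + (x[j] - x[k]) ** 2
    model = 1.0 / (1.0 + np.exp(gamma * c3 * I))
    w = np.array([W[i, j], W[i, k], W[j, k]], dtype=float)
    arith = w.mean()
    geom = w.prod() ** (1 / 3)
    harm = 0.0 if (w == 0).any() else 3.0 / (1.0 / w).sum()
    return model, arith, geom, harm

# Triplet with one unobserved pair: mean-based scores collapse, while the
# model still scores the triplet by proximity in the embedding.
W = np.zeros((3, 3))
W[0, 1] = W[1, 0] = 2.0
W[0, 2] = W[2, 0] = 1.0          # pair (1, 2) never observed
x = np.array([0.0, 0.05, 0.1])    # nodes embedded close together
model, arith, geom, harm = triplet_scores(0, 1, 2, x, W, gamma=1.0, c3=1 / 3)
assert geom == 0.0 and harm == 0.0
assert 0.0 < model < 0.5
```

This is the mechanism discussed below: with sparse training data, the embedding-based score can still rank a triplet highly, whereas the geometric and harmonic means predict nothing for it.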
On the high school contact data shown in Table 1, the harmonic and geometric means attain the highest AUC-PR for large amounts of training data (see, for example, the 80:20 data split), while the linear model predictions achieve the best results for small amounts of training data, as seen in the 20:80 data split. This could be because, when training data is insufficient, there are more unobserved "missing" dyadic and triadic edges. In this case, the node embedding algorithm can infer node proximity from common neighbours; in other words, it can place nodes with common neighbours nearby even if they have not been directly linked before. On the other hand, the geometric and harmonic means assign a score of zero to a triplet if any pair of its nodes has not been connected previously, and will therefore predict no such triadic edges.
We also test the triadic edge prediction on synthetic linear hypergraphs, generated in the manner described in the "Experiments" section, with K = 4, m = 60, γ_0 = 10, c_2 = 1, and c_3 = 0.3, so that the clustered pattern resembles the high school contact data. We consider three eigenvectors associated with the smallest eigenvalues that are greater than 0.01 for the synthetic linear model. Since defining a periodic model with more than one eigenvector is beyond the scope of this work, we only test the linear hypergraphs. We randomly select a portion of the hyperedges as the training set, while ensuring the sampled hypergraph is connected, and test the performance on the remaining hyperedges. The AUC-PR averaged over 20 random hypergraphs is shown in Table 1. We observe that the linear model outperforms the random score and average scores for various training data sizes.

Conclusion
In this work we have developed new random models and embedding algorithms for hypergraphs, and investigated their equivalence. In particular, we focused on two spectral embedding algorithms customized for hypergraphs, which aim to reveal linear and periodic structures, respectively. We also described random hypergraph models associated with these algorithms, which allow us to quantify the relative strength of linear and periodic structures based on maximum likelihood. We demonstrated the model comparison approach on synthetic linear and periodic hypergraphs, showing that the results are consistent with the generating mechanism. When applied to high school and primary school contact hypergraphs, the model comparison suggests the periodic structure is more prominent. On this data set we also showed that the "spectral embedding plus random hypergraph" approach gives a useful strategy for predicting new hyperedges.
In future work, it would be interesting to investigate how these linear and periodic hypergraph models compare with other versions that use alternative assumptions, including those based on core-periphery 23 and stochastic block model 13,21 structures.

Data availability
This research made use of public domain data that is available over the internet, as indicated in the text.