Determinable and interpretable network representation for link prediction

As an intuitive description of complex physical, social, or brain systems, complex networks have fascinated scientists for decades. Recently, network representation has become a prevalent technique for abstracting a network's topological and dynamical attributes: it maps a network or its substructures (like nodes) into a low-dimensional vector space. Since its mainstream methods are mostly based on machine learning, a black box of input-output data fitting, the dimension of the learned vector space is generally indeterminable and its elements are not interpretable. Although massive efforts have been made to cope with this issue, including automated machine learning by computer scientists and learning theory by mathematicians, the root causes remain unresolved. Consequently, enterprises need to spend enormous computing resources to work out a set of model hyperparameters that brings good performance, and business personnel still find it difficult to explain the practical meaning of the learned vectors. Given that, from a physical perspective, this article proposes two determinable and interpretable node representation methods. To evaluate their effectiveness and generalization, this article proposes Adaptive and Interpretable ProbS (AIProbS), a network-based model that can utilize node representations for link prediction. Experimental results showed that the AIProbS can reach state-of-the-art precision beyond baseline models on some small data sets whose training and test distributions are usually not uniform enough for machine learning methods to perform well. Besides, it can make a good trade-off with machine learning methods on precision, determinacy (or robustness), and interpretability. In practice, this work contributes to industrial companies that lack sufficient computing resources but pursue good results based on small data during their early stage of development, and that require high interpretability to better understand and carry out their business.


Introduction
Physics has long been regarded as a driver of civilization's evolution. The establishment of Newtonian mechanics and thermodynamics drove the "first technological revolution". The discovery of electromagnetic induction laid the theoretical foundation for the "second technological revolution". Condensed matter physics and quantum physics gave rise to the silicon semiconductor industry for the "third technological revolution". In the ongoing "fourth technological revolution", physics is also propelling innovation and development in artificial intelligence, of which the study of complex networks [1,2] is a case in point. Complex networks use nodes and edges to intuitively describe the nonlinear and heterogeneous interaction patterns of the components of complex physical, social, or brain systems, and their use soon widened to various fields. For decades, scientists have been dedicated to understanding a network's structural and dynamical attributes (like vital node identification [3,4], high-order network structural analysis [5,6,7,8], and percolation theory [9]) and to utilizing these attributes in specific applications, such as link prediction [10], natural language processing [11], and recommender systems [12,13,14].
Recently, network representation [15,16] has intrigued scientists as a pivotal tool that abstracts a network's structural and dynamical attributes by mapping the network or its substructures (like nodes) into a low-dimensional vector space. There is ample evidence that network representation has several virtues dear to both academia and industry [14]: reusable object representations obtained by manual or automated feature engineering, enhanced model precision, and efficient parallel computation on GPUs, some of which are unprecedented compared with its predecessors. Nevertheless, the current methods of network representation are mostly based on machine learning [17], which is almost a black box: its input-output data fitting rationale limits explainability and requires tedious hyperparameter tuning. As a result, the dimension of the learned vector space is generally indeterminable and its elements are not interpretable. Consequently, enormous computing resources are required to search for a suboptimal dimension of the vector space within a range, yet in most cases researchers still cannot interpret why such a dimension works or what realistic meanings the space's elements may represent. Although recent years have seen massive efforts by computer scientists and mathematicians to cope with this issue, the root causes remain unresolved. For example, although automated machine learning [18], proposed by computer scientists, can accelerate the search for the space's suboptimal dimension without manual effort, the search mechanism still requires enormous computing resources. Moreover, although mathematicians can derive an empirical formula for selecting the optimal dimension of the space [19], it is normally built on specific models or data, limiting its interpretability and generalization to other scenarios. Given these inadequacies, determinable and interpretable network representation is still an open and important question.
In this work, from a physical perspective, this article proposes two methods of determinable and interpretable network representation. Methodologically, the first method is based on the degree, H-index, and coreness (DHC) theorem [20], which constructs an operator to generate sequences (with a fixed length) of H-indices for nodes. Regarding their realistic meanings and on the advice of the rich club theory [21], this article utilizes these H-indices to construct node representations, which can represent nodes' local attributes around the neighborhood. To abstract nodes' global attributes in a complex network, the second method is based on the DHC entropy (DHC-E) [22], a hyperparameter-free and explainable whole graph embedding algorithm we proposed. For a bipartite network, its m × n adjacency matrix can be extended to an (m + n) × (m + n) augmented matrix, a simple matrix that can be decomposed into m + n matrices, each of which corresponds to a node and carries the node's global attributes. After implementing the DHC-E algorithm on them, node representations can be generated. Unlike those learned by machine learning-based methods, the node representations generated by the two methods have both a determined dimension and interpretable elements.
To evaluate the effectiveness and generalization of the two proposed methods, this article further proposes Adaptive and Interpretable ProbS (AIProbS), a network-based link prediction model for bipartite networks, which utilizes the nodal representations generated by the two methods in an attempt to enhance the prediction precision. Methodologically, built on a classical network-based framework called ProbS [23], the AIProbS controls the resource diffusion process of the ProbS framework by setting edge weights quantified with node representations, which can perceive the similarity between nodes. Equipped with such artificial intelligence, as machine learning-based models are, the AIProbS makes up for the flaw of the classical ProbS framework in self-adaptive perception oriented to different scenarios (which is analyzed in Sec. 2.2). At the same time, compared with machine learning-based link prediction models [24,25,26,27,28], the AIProbS is hyperparameter-free. In addition, in several designed control experiments on diverse recommender systems (a specific application of link prediction in artificial intelligence), the AIProbS reached state-of-the-art precision beyond baseline models in some scenarios and, by and large, made a good trade-off with machine learning-based models on precision, determinacy, and interpretability.

The model
First, this article proposes two novel network representation methods in Sec. 2.1. Then, a classical network-based link prediction framework called ProbS is introduced and its flaws are revealed in Sec. 2.2. Based on the ProbS framework, this article proposes Adaptive and Interpretable ProbS (AIProbS) in Sec. 2.3, a network-based link prediction model for bipartite networks, which utilizes the nodal representations generated by the two methods and enhances the prediction precision of the classical ProbS framework by making up for its flaws.
2.1. Generate a complex network's nodal representations
2.1.1. Method one
Degree, H-index, and coreness are three measurements used to quantify nodal influence in a complex network. A node's degree measures nodal influence by counting the node's neighbors: the greater a node's degree is, the more neighbors it is connected with, and the higher influence it has. A node's H-index [29] is the maximum value h such that the node has at least h neighbors with a degree no less than h. Furthermore, to take location into account, coreness calculated by k-core decomposition [30] measures a node's centrality: a greater coreness indicates that a node is located more centrally in a complex network and hence has a higher influence.
The DHC theorem [20] reveals that degree, H-index, and coreness are all related. To describe the relationship, the DHC theorem constructs an operator $\mathcal{H}$, which calculates for each node the maximum value h such that the node has at least h neighbors with H-indices no less than h. For each node i in a complex network, taking its degree $k_i$ as the zero-order H-index $h_i^{(0)}$ to begin with, the first-order H-index $h_i^{(1)}$ of node i is calculated by $\mathcal{H}(h_{j_1}^{(0)}, h_{j_2}^{(0)}, \ldots, h_{j_{k_i}}^{(0)})$, where $h_{j_1}^{(0)}, h_{j_2}^{(0)}, \ldots, h_{j_{k_i}}^{(0)}$ are the zero-order H-indices (i.e., the degree values) of the $k_i$ neighbors of node i. By iteratively doing so, $h_i^{(2)} = \mathcal{H}(h_{j_1}^{(1)}, h_{j_2}^{(1)}, \ldots, h_{j_{k_i}}^{(1)})$, as well as $h_i^{(3)}, \ldots$, can be calculated. Finally, a sequence $h_i^{(0)}, h_i^{(1)}, h_i^{(2)}, \ldots$ with a fixed length is generated for node i, which converges to node i's coreness, as the DHC theorem states:
Theorem 2.1. For each node i in a complex network, its H-index sequence $h_i^{(0)}, h_i^{(1)}, h_i^{(2)}, \ldots$ converges to the coreness of node i.
Proof. See [20].
According to the rich club theory [21] (which originated in social network analysis [31] and soon spread to interdisciplinary studies like computer science [32] or cognitive science [33]), a node's influence can reflect its attributes and functions around the neighborhood and in the whole network structure. Accordingly, this article proposes the following assumption:
Assumption 2.1. A node's H-index sequence can abstract the node's multidimensional influence in the neighborhood, where the number of convergence steps reflects the magnitude of the node's influence. The more important the role played by the node in the neighborhood, the more slowly its influence decays during the dynamic evolution (i.e., the convergence process by the DHC theorem), and thus the larger its number of convergence steps.
Built on Assumption 2.1, this article takes a node's H-index sequence as its node representation. In this way, provided n nodes in a complex network and given that their H-index sequences converge after up to s steps, this method maps the n nodes to an s-dimensional vector space consisting of their H-indices as node representations. This is a determinable and interpretable network representation method, since for an arbitrary complex network the dimension of its nodal representations is determined as s and the elements can be interpreted as nodal multidimensional influence with different magnitudes.
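As an illustration of Method one, the following minimal Python sketch iterates the operator H on a small toy graph until the H-indices stop changing. The function names (`h_operator`, `h_index_sequences`), the adjacency-dictionary input, and the toy graph are assumptions made for this example and are not taken from the authors' code.

```python
from typing import Dict, List, Set


def h_operator(values: List[int]) -> int:
    """Return the largest h such that at least h of the values are >= h."""
    ordered = sorted(values, reverse=True)
    h = 0
    for rank, value in enumerate(ordered, start=1):
        if value >= rank:
            h = rank
        else:
            break
    return h


def h_index_sequences(adj: Dict[str, Set[str]]) -> Dict[str, List[int]]:
    """Build each node's H-index sequence, starting from its degree.

    The iteration stops once applying the operator no longer changes any
    node's value, so every node ends up with a sequence of equal length.
    """
    current = {node: len(neighbors) for node, neighbors in adj.items()}
    sequences = {node: [degree] for node, degree in current.items()}
    while True:
        updated = {
            node: h_operator([current[j] for j in neighbors])
            for node, neighbors in adj.items()
        }
        if updated == current:  # converged: values equal the coreness
            return sequences
        for node, h in updated.items():
            sequences[node].append(h)
        current = updated


# Hypothetical toy graph: a triangle (a, b, c) with a pendant node d on a.
toy_graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
sequences = h_index_sequences(toy_graph)
```

For this toy graph the sequences are [3, 2] for node a, [2, 2] for nodes b and c, and [1, 1] for node d. The last entries (2, 2, 2, 1) coincide with the nodes' coreness, and every node receives a representation of the same length.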
2.1.2. Method two
Following method one, to further abstract a node's global attributes in a complex network, for a bipartite network its adjacency matrix $A_{m \times n}$ can be extended to $B_{(m+n) \times (m+n)}$ constructed by
$$B_{(m+n) \times (m+n)} = \begin{pmatrix} O_{m \times m} & A_{m \times n} \\ (A_{m \times n})^{T} & O_{n \times n} \end{pmatrix},$$
where O denotes the null matrix. Based on it, a series of $\lambda_i$ and $B_i$ can be obtained by the decomposition in the following theorem.
Theorem 2.2. The adjacency matrix $B_{(m+n) \times (m+n)}$ can be decomposed by $B = \sum_{i=1}^{m+n} \lambda_i B_i$, where $\lambda_i$ is the i-th eigenvalue of $B_{(m+n) \times (m+n)}$ and $B_i$ is the corresponding idempotent matrix.
Proof. See Appendix A.
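The decomposition in Theorem 2.2 can be checked numerically. The sketch below is an illustration under stated assumptions: it takes each $B_i$ to be the rank-one projector $v_i v_i^T$ built from an orthonormal eigenvector $v_i$ of the symmetric matrix B (one standard way to obtain idempotent components), and the helper names and the toy matrix are hypothetical.

```python
import numpy as np


def augmented_adjacency(A: np.ndarray) -> np.ndarray:
    """Extend an m x n bipartite adjacency matrix to the (m+n) x (m+n) matrix B."""
    m, n = A.shape
    return np.block([[np.zeros((m, m)), A], [A.T, np.zeros((n, n))]])


def spectral_components(B: np.ndarray):
    """Return eigenvalues and rank-one projectors of a symmetric matrix B."""
    # eigh is the eigen-solver for symmetric matrices; eigenvectors are columns.
    eigenvalues, eigenvectors = np.linalg.eigh(B)
    projectors = [np.outer(v, v) for v in eigenvectors.T]
    return eigenvalues, projectors


# Hypothetical toy bipartite network with 2 nodes of one type and 3 of the other.
A_toy = np.array([[1.0, 1.0, 0.0],
                  [0.0, 1.0, 1.0]])
B_toy = augmented_adjacency(A_toy)
lambdas, components = spectral_components(B_toy)

# Summing lambda_i * B_i recovers B up to floating-point error.
B_rebuilt = sum(lam * P for lam, P in zip(lambdas, components))
```

Here `B_rebuilt` matches `B_toy` up to rounding, and each component satisfies `P @ P ≈ P`, which is the idempotent property named in the theorem.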
After that, this article implements the DHC-E operator E [22] on each $B_i$ or $\lambda_i B_i$ one by one, generating the representations of the m + n nodes of the bipartite network correspondingly. Here, the DHC-E operator works as follows: the DHC theorem is used to generate an H-index matrix $H_{n \times s}$, whose rows contain the H-indices of each of the n nodes in a complex network, converged after s steps; the operator E then calculates the Shannon entropy of each column of $H_{n \times s}$ and obtains a vector $e_{1 \times s}$ as the whole graph embedding of the network. This method is therefore also a determinable and interpretable network representation method. The interpretable and hyperparameter-free characteristics of the DHC-E algorithm are thoroughly explained in [22].
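The column-wise entropy step of the operator E can be sketched as follows. This is only an illustration: it assumes that the Shannon entropy of a column is taken over the empirical distribution of the H-index values in that column and that the logarithm is base 2, neither of which is specified in this section (see [22] for the authoritative definition). The function name and the toy matrix are hypothetical.

```python
import numpy as np


def column_entropies(H: np.ndarray) -> np.ndarray:
    """Shannon entropy (base 2, an assumption) of the values in each column of H."""
    entropies = []
    for column in H.T:
        # Empirical distribution of the distinct values in this column.
        _, counts = np.unique(column, return_counts=True)
        p = counts / counts.sum()
        entropies.append(float(-(p * np.log2(p)).sum()))
    return np.array(entropies)


# Hypothetical H-index matrix: 4 nodes (rows), 2 iteration steps (columns).
H_toy = np.array([[3, 2],
                  [2, 2],
                  [2, 2],
                  [1, 1]])
e_toy = column_entropies(H_toy)  # one entropy per column, i.e. a 1 x s vector
```

Under these assumptions, the embedding has exactly as many entries as there are columns in the H-index matrix, so its dimension is fixed by the convergence of the H-index sequences rather than by a tuned hyperparameter.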

The ProbS framework and its flaws
To evaluate the two methods' effectiveness and generalization, this article utilizes them in link prediction for bipartite networks. Since network representation can be seen as artificial intelligence that recognizes and abstracts a complex network's underlying structural and dynamical attributes, this article explores how such artificial intelligence (i.e., nodal representations generated by these methods) can be used to enhance the precision of classical link prediction models.
Among classical (non-machine learning-based) link prediction models for bipartite networks, the ProbS [23] framework is a typical one. By means of a resource diffusion mechanism inspired by the physical process of material diffusion, the ProbS framework quantifies the similarity between nodes by initializing and diffusing resources. Fig. D1 includes an example that illustrates the schematics of the ProbS framework. For instance, when predicting node B's unobserved links with nodes a and b, resources are first initialized at nodes c and d (the nodes that are connected with node B) with value 1. They are then diffused to nodes A, B, and C along edges after being equally divided by the degree of each node, and are finally diffused back to nodes a, b, c, and d in the same way. The resulting resources quantify the similarity between node B and the four nodes, respectively. A larger similarity between two nodes indicates a higher probability that an unobserved link exists between them.
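The two-step equal-allocation diffusion described above can be sketched in a few lines of NumPy. This is a minimal illustration of the classical mass-diffusion procedure, not a transcription of the article's Eq. (1); the function name `probs_scores`, the convention that rows are users and columns are items, and the toy matrix are assumptions made for the example.

```python
import numpy as np


def probs_scores(A: np.ndarray) -> np.ndarray:
    """One round of ProbS-style resource diffusion on an m x n adjacency matrix."""
    A = np.asarray(A, dtype=float)
    k_user = A.sum(axis=1)  # degree of each row node (user)
    k_item = A.sum(axis=0)  # degree of each column node (item)
    inv_k_user = np.divide(1.0, k_user, out=np.zeros_like(k_user), where=k_user > 0)
    inv_k_item = np.divide(1.0, k_item, out=np.zeros_like(k_item), where=k_item > 0)
    # Step 1: each item holding one unit splits it equally among its users.
    on_users = (A * inv_k_item) @ A.T  # shape (m, m)
    # Step 2: each user splits the received resource equally among its items.
    return (on_users * inv_k_user) @ A  # shape (m, n)


# Hypothetical toy network: 2 users, 3 items.
A_toy = np.array([[1, 1, 0],
                  [0, 1, 1]])
R_toy = probs_scores(A_toy)
```

For the toy network, the first user ends up with resources (0.75, 1.0, 0.25) on the three items, so the unobserved third item receives a score of 0.25. The total resource in each row equals that user's degree, which reflects the conservation of resources under equal allocation.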
This article provides a mathematical perspective on the ProbS framework by constructing an operator T to describe its diffusion mechanism. Consider a bipartite network consisting of m + n nodes of two different types, whose adjacency matrix is represented by $A_{m \times n}$. Let $R_{m \times n}$ denote the predicted matrix, where $R_{ij}$ represents the similarity (i.e., the probability of the existence of a link) between nodes i and j. Then, through the ProbS framework, $R_{m \times n}$ can be calculated by
where $\cdot$ denotes the dot product, and $\circ$ denotes the Hadamard product.
where $k^U_i$ is user i's degree. In Eq. (1) the operator T =
The operator T explains why the ProbS framework converges when R is derived from A and A is then replaced with the derived R iteratively, as stated in the following theorem.
Theorem 2.3. Let the operator T = (D
Since the differences between the values in A become smoother as the convergent iterative process progresses, whereas link prediction relies on distinct differences between the predicted similarity values for higher precision [14], the best number of iteration steps for the ProbS framework in link prediction is one, and so is that for the AIProbS proposed in Sec. 2.3. In addition, from such a mathematical perspective, it is easy to see that the ProbS framework faces fundamental limits on intelligence because its resource diffusion mechanism is based only on equal allocation, shown as $D_I$ and $D_U$ in Eq. (1). In practice, as in recommender systems (an application of link prediction for bipartite networks in artificial intelligence), such a mechanism raises a key question. If the nodes of the two different types are taken as users and items in a recommender system, the resources diffused back and forth between users and items represent, to some extent, users' preferences for items or items' attractiveness to users. Neither of them is necessarily equal, since user biases [34,35,36] and item biases [34,37] generally exist in reality. Moreover, these biases usually depend on the recommendation scenario: in different scenarios a user's preferences may differ, and so may an item's attractiveness or popularity. In practice, however, the ProbS framework fails to take these biases into consideration, let alone adaptively perceive and quantify their differences in various scenarios.

The AIProbS model
The essential condition for the ProbS framework to realize that intelligence is to be equipped with self-adaptive perception, an ability to perceive and utilize the attributes of nodes (i.e., nodal representations) in a complex network for different scenarios. To utilize the nodal representations generated by the two methods proposed in Sec. 2.1 in the ProbS framework, this article proposes Adaptive and Interpretable ProbS (AIProbS).
In the first step, since the rich club theory [21] suggests that nodes with high centrality tend to form tightly interconnected communities, this article generalizes this conclusion to the field of link prediction and proposes the following assumption:
Assumption 2.2. The similarity between node pairs having strongly correlated nodal representations (i.e., similar features or similar influence) is higher than that between weakly correlated ones.
To measure the similarity between nodes, the AIProbS uses the cosine similarity metric. Provided two n-dimensional vectors x and y, the cosine similarity between them is calculated by cos
where vector α = (
After obtaining the nodal similarity matrix $S_{m \times n}$, the second step is to utilize it to control the diffusion process of the classical ProbS framework. To assign proper weights to every node pair for the diffusion mechanism of the ProbS framework, the AIProbS further completes some normalization and proportioning operations on $S \circ A$, where A is the adjacency matrix shown in Eq. (1). Since the elements of $S \circ A$ vary in [−1, 1] while the diffused resources are supposed to be positive, the AIProbS normalizes the value range of the elements to [0, 1] using the max-min normalization operation, for each row vector $(S \circ A)_{i*}$,
where the max and min are the maximum and minimum elements of the row vector $(S \circ A)_{i*}$, respectively. Based on that, the weight matrix $W_U$ for nodes belonging to set U is calculated by the proportioning operation as
On the other hand, the same operations are completed on $S_{m \times n}$ by column, generating the weight matrix $W_I$ (of size m × n) for nodes belonging to set I.
In the last step, the predicted matrix $R_{m \times n}$, where $R_{ij}$ represents the predicted similarity between nodes i and j, is calculated through the AIProbS by
Conceivably, there are other metrics for similarity measurement. More combinations were tested in this article (see Appendix C for details), but none of them performed better than the one proposed in this section. All in all, the whole process of the AIProbS is summarized in the pseudocode shown in Appendix D. For a more intuitive illustration, Fig. D1 in Appendix D presents its schematics.
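The row-wise weighting steps described in this section (similarity, masking by the adjacency matrix, max-min normalization, and proportioning) can be sketched as follows. This is a hedged illustration, not the authors' implementation: it assumes the plain cosine similarity between representation vectors, interprets the proportioning operation as dividing each row by its sum, and guards against constant rows; the function names and toy inputs are hypothetical. The final AIProbS prediction formula is not reproduced here.

```python
import numpy as np


def cosine_similarity_matrix(X: np.ndarray, Y: np.ndarray) -> np.ndarray:
    """Cosine similarity between every row of X and every row of Y.

    Assumes no representation vector is all zeros.
    """
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    Yn = Y / np.linalg.norm(Y, axis=1, keepdims=True)
    return Xn @ Yn.T


def row_weights(S: np.ndarray, A: np.ndarray) -> np.ndarray:
    """Row-wise weights from the element-wise product of S and A."""
    M = S * A  # keep similarities only on observed edges
    row_min = M.min(axis=1, keepdims=True)
    row_max = M.max(axis=1, keepdims=True)
    span = np.where(row_max > row_min, row_max - row_min, 1.0)  # avoid 0/0
    normalized = (M - row_min) / span  # max-min normalization to [0, 1]
    totals = normalized.sum(axis=1, keepdims=True)
    totals = np.where(totals > 0, totals, 1.0)
    return normalized / totals  # proportioning (assumed: each row sums to 1)


# Hypothetical representations for 2 nodes of set U and 3 nodes of set I.
U_repr = np.array([[3.0, 2.0], [2.0, 2.0]])
I_repr = np.array([[1.0, 1.0], [2.0, 1.0], [1.0, 2.0]])
A_toy = np.array([[1.0, 1.0, 0.0],
                  [0.0, 1.0, 1.0]])

S_toy = cosine_similarity_matrix(U_repr, I_repr)
W_toy = row_weights(S_toy, A_toy)
```

In this toy case the representations are non-negative, so unobserved pairs keep a weight of zero and each row of `W_toy` distributes one unit of resource unevenly over the observed edges, in contrast to the equal allocation of the classical ProbS framework.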

Performance Evaluation
To evaluate the AIProbS's precision as well as its pros and cons in link prediction for bipartite networks, which also reflects the effectiveness of the nodal representations generated by the two network representation methods proposed in Sec. 2.1, this article designs control experiments based on recommender systems, an application of link prediction in artificial intelligence.

Recommender systems
By analyzing observed user-item relations to predict a user's preferred items from millions of candidates, recommender systems [12,14] are recognized as a pivotal tool to alleviate the information overload problem. Among different user-item relations, implicit user-item interactions (e.g., a user's historical clicks or purchases of items) record the existence of a user's interactions with items, defined as a binary state using 1 and 0. From the perspective of a complex network, recommendation on implicit user-item interactions can be seen as a process of link prediction for bipartite networks, where users and items correspond to the two types of nodes and implicit user-item interactions represent the edges between nodes. Therefore, the designed experiments in this article are based on recommendation with implicit user-item interactions, on which most current models are based.

Data sets
In light of the no-free-lunch theorem [38], which states that no model can always perform as well as expected in all scenarios, this article designs control experiments to evaluate the performance of the AIProbS on diverse real recommendation scenarios, in order to explore not only the AIProbS's pros but also its cons in different scenarios. As shown in Tab. 1, |U| and |I| represent the number of users and items, respectively, and the interactions between users and items are implicit ones. The sparsity in Tab. 1 represents the ratio of the number of unobserved interactions to the maximum number of all possible interactions between users and items (e.g., that between m users and n items is mn). As a control group, MovieLens 100K, MovieLens 1M, and LastFM are three classical data sets from two different recommender systems of movies and music, with distinctive ratios of |U| to |I|, data scales, and sparsity. Results based on them could be more persuasive than those based on newly published data sets, since these classical data sets have been widely used for evaluation in previous works. To guarantee the reproducibility of experiments, each of the three data sets is obtained from the RecBole public resources (https://recbole.io/dataset_list.html), organized into tuples (user, item, 0/1) without preprocessing. Each of them is randomly split into a "train/evaluate/test" set by the ratio of "80/10/10%". After independently repeating the splitting process 30 times, 30 realizations are generated for each data set. One can get the split data used in this article through the hyperlink address posted in Sec. 6.
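The sparsity definition and the repeated random split can be sketched as below. This is an illustrative sketch only: the function names, the use of integer seeds 0 to 29, and the toy sizes are assumptions, and the split actually used in the article is the one distributed by the authors.

```python
import numpy as np


def sparsity(num_users: int, num_items: int, num_observed: int) -> float:
    """Share of unobserved user-item pairs among all possible pairs."""
    return 1.0 - num_observed / (num_users * num_items)


def random_split(num_interactions: int, seed: int, ratios=(0.8, 0.1, 0.1)):
    """Randomly split interaction indices into train/evaluate/test parts."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(num_interactions)
    n_train = int(round(ratios[0] * num_interactions))
    n_valid = int(round(ratios[1] * num_interactions))
    return (order[:n_train],
            order[n_train:n_train + n_valid],
            order[n_train + n_valid:])


# 30 independent realizations of a hypothetical data set with 1000 interactions.
realizations = [random_split(1000, seed) for seed in range(30)]

# Toy example: 10 users, 20 items, 50 observed interactions.
toy_sparsity = sparsity(10, 20, 50)
```

Fixing one seed per realization keeps every split reproducible while the 30 realizations remain independent of each other.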

Evaluation metrics
To quantify the precision of the AIProbS on these data sets, three commonly used metrics are chosen in this article. Given a user u ∈ U (U is the user set) and the length N of the recommendation list, the set of recommended items for the user is denoted by $\hat{R}(u)$ and the ground-truth set of items the user interacted with is denoted by $R(u)$. Based on them, the first evaluation metric is the Recall@N [39], which calculates the fraction of predicted relevant items out of all ground-truth relevant items by
$$\mathrm{Recall@}N = \frac{1}{|U|} \sum_{u \in U} \frac{|\hat{R}(u) \cap R(u)|}{|R(u)|},$$
where $|R(u)|$ represents the item count of $R(u)$.
To calculate the reciprocal rank of the first relevant item recommended to each user, the second evaluation metric MRR@N [40] is denoted as
$$\mathrm{MRR@}N = \frac{1}{|U|} \sum_{u \in U} \frac{1}{\mathrm{rank}^{*}_{u}},$$
where $\mathrm{rank}^{*}_{u}$ is the rank position of the first relevant item recommended to user u. Moreover, as the third evaluation metric, the NDCG@N [41] can further measure the overall ranking quality in a manner that accounts for the position of the hit by assigning higher scores to hits at top ranks as
$$\mathrm{NDCG@}N = \frac{1}{|U|} \sum_{u \in U} \left( \frac{1}{\sum_{i=1}^{\min(|R(u)|, N)} \frac{1}{\log_2(i+1)}} \sum_{i=1}^{N} \delta\big(i \in R(u)\big) \frac{1}{\log_2(i+1)} \right),$$
where $\delta(\cdot)$ is an indicator function and positions are discounted logarithmically.
In practice, the greater the values of these evaluation metrics are, the higher a model's precision is.
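For concreteness, the three metrics can be computed as in the following sketch, which follows the commonly used definitions of Recall@N, MRR@N, and NDCG@N (binary relevance, logarithmic position discount, and an ideal ranking that places min(|R(u)|, N) hits at the top). The function names and the toy recommendation lists are hypothetical, and users without any ground-truth item are assumed to be excluded beforehand.

```python
import math
from typing import Dict, List, Set


def recall_at_n(recommended: List[int], relevant: Set[int], n: int) -> float:
    """Fraction of the ground-truth items that appear in the top-n list."""
    hits = sum(1 for item in recommended[:n] if item in relevant)
    return hits / len(relevant)


def mrr_at_n(recommended: List[int], relevant: Set[int], n: int) -> float:
    """Reciprocal rank of the first relevant item in the top-n list (0 if none)."""
    for rank, item in enumerate(recommended[:n], start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def ndcg_at_n(recommended: List[int], relevant: Set[int], n: int) -> float:
    """Discounted gain of the hits, normalized by the best achievable gain."""
    dcg = sum(1.0 / math.log2(rank + 1)
              for rank, item in enumerate(recommended[:n], start=1)
              if item in relevant)
    ideal_hits = min(len(relevant), n)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg


def evaluate(recs: Dict[str, List[int]], truth: Dict[str, Set[int]], n: int):
    """Average each metric over all users."""
    users = list(recs)
    return {
        "recall": sum(recall_at_n(recs[u], truth[u], n) for u in users) / len(users),
        "mrr": sum(mrr_at_n(recs[u], truth[u], n) for u in users) / len(users),
        "ndcg": sum(ndcg_at_n(recs[u], truth[u], n) for u in users) / len(users),
    }


# Hypothetical toy data: two users, top-3 recommendation lists.
toy_recs = {"u1": [5, 2, 9], "u2": [1, 3, 4]}
toy_truth = {"u1": {2, 9, 7}, "u2": {8}}
toy_scores = evaluate(toy_recs, toy_truth, n=3)
```

In the toy data, user u1 has two hits at ranks 2 and 3 and user u2 has none, so the averaged Recall@3 is 1/3 and the averaged MRR@3 is 0.25.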

Baseline methods
This article constructs or chooses nine baseline models as follows to evaluate the pros and cons of the AIProbS compared with both classical and machine learning-based predecessors.
Classical baselines include two models. As the bedrock of the AIProbS, the ProbS [23] is a necessary baseline for evaluating the improvement achieved by the AIProbS. In addition, one might expect to base the recommendation directly on the nodal representations generated by the methods proposed in Sec. 2.1 rather than on the ProbS framework. To test this strategy, this article constructs the Pure-DHC model, which performs the recommendation by Eq. (2) based on the user-item similarity of their H-indices (i.e., nodal representations).
Machine learning-based baselines include seven models. To avoid the baseline pitfalls that have plagued earlier research on the comprehensive and objective evaluation of proposed models, this article further chooses seven representative machine learning-based models as baselines, which are based on six different machine learning techniques: NeuMF [42] based on deep neural networks, ConvNCF [43] and SpectralCF [44] based on convolution operations, GCMC [45] based on graph auto-encoder frameworks, LINE [46] based on random walking, NGCF [47] based on graph neural networks, and DGCF [48] based on attention mechanisms.
To guarantee the fairness and reproducibility of experiments, the implementation and evaluation of models were carried out on RecBole [49], a public open pipeline for recommender systems. One can get the codes used in this article through the hyperlink address posted in Sec. 6.

Results
Based on the experimental settings in Sec. 2.4, this section presents the experimental results on the precision and robustness of the AIProbS and baseline models in Sec. 3.1 and Sec. 3.2, respectively, revealing their pros and cons in different recommendation scenarios.

Precision analysis
Tabs. 2, 3 and 4 present the results on model precision, where the length N of the recommendation list is set to 10 and each model's precision is averaged over independent runs on 30 different realizations. The values in parentheses indicate the percentage of improvement or decline in the precision of the AIProbS compared with the respective baseline models on each specific data set and evaluation metric, where the percentage of improvement is shown in bold.
Conceivably, when speaking of the necessity of determinable and interpretable network representation and its utilization in link prediction, one might raise doubts: are the nodal representations generated by the two methods in Sec. 2.1 not sufficient for link prediction by themselves? Can the precision of classical link prediction frameworks really be enhanced by involving nodal representations as intelligence? Are the machine learning-based network representation methods not precise enough? As shown in Tabs. 2, 3 and 4, on all three data sets the Pure-DHC, which directly utilizes the nodal representations generated by the methods in Sec. 2.1 for recommendation, achieved the worst precision among the AIProbS and baseline models. That is to say, such nodal representations contribute little to recommendation unless they are utilized in the ProbS or another recommendation framework. After utilizing these nodal representations in the ProbS framework, as shown in Tabs. 2, 3 and 4, the AIProbS outperformed the classical ProbS on all three data sets. When compared with machine learning-based baselines, the AIProbS indeed performed worse than some machine learning models, most obviously on MovieLens 1M. But it still achieved state-of-the-art precision on LastFM, suggesting that nodal representations generated by the methods in Sec. 2.1 may be able to abstract the underlying attributes of a complex network better than those learned by machine learning methods. To put these results in more general terms, designing control experiments to guarantee the comprehensiveness and objectivity of model performance evaluation is indispensable because, as shown in Tabs. 2, 3 and 4, the comparative advantage between different models, or even that between the classical and machine learning-based models, differs across data sets. For instance, compared with its predecessor (the ProbS framework), the AIProbS at best improved the Recall@10 by 17.6% on MovieLens 1M and at worst by 3.1% on MovieLens 100K. Such a gap of 14.5 percentage points shows that the advantage of the AIProbS over the ProbS is not equally significant in all recommendation scenarios. Overall, on MovieLens 1M, although it achieved an appreciable improvement over the ProbS, the AIProbS still performed worse than five of the machine learning-based models, revealing the advantage of the machine learning-based frameworks over the classical ones on this data set. However, that advantage faded on MovieLens 100K, where only two machine learning-based models (i.e., NGCF and DGCF) outperformed the AIProbS. On LastFM, none of the machine learning-based models outperformed the AIProbS; in other words, the AIProbS achieved state-of-the-art precision.
Figuring out the determinant factors of model precision in different recommendation scenarios is neither easy nor intuitive, not to mention accurately predicting a model's performance for one specific scenario. Still, based on the clues given in Tabs. 2, 3 and 4, some observations can be summarized as follows. (1) The machine learning-based models might have an advantage on data sets with large scales. Since MovieLens 1M and MovieLens 100K share the same recommendation scenario, the machine learning-based models show a more obvious advantage on the former, whose data scale is comparatively larger. However, it is hard to assert that the differences in the ratio of |U| to |I| and in the sparsity of the two data sets play no role. (2) Integrating classical frameworks might bring a large improvement in recommendation scenarios with high sparsity. Since the sparsity of LastFM is the highest among the three data sets, the machine learning-based models face fundamental limits there for lack of enough user-item interactions for training, whereas the AIProbS and the ProbS, which are built on the classical framework, showed their advantage as a result of their network structure-oriented approach. Nevertheless, the ratio of |U| to |I| of LastFM, which seems to be a little higher than those of the other two, could also be a decisive factor.

Robustness analysis
As revealed in Sec. 3.1, with the increase in data scale the precision of the AIProbS decreased compared with that of machine learning-based baselines, because the mechanism of data fitting (or pattern representation) adopted by machine learning methods can be brought into fuller play in scenarios with larger data scales. Nevertheless, this does not mean that the AIProbS is useless in those scenarios. Since hyperparameters that have no realistic meaning make a model's implementation process indeterminable and vague, machine learning-based models lose almost all interpretability of their results, such as why a model generates certain recommendations for a user. This problem is not confined to recommender systems but also haunts other applications requiring high interpretability of results, such as machine translation or knowledge graph completion. Besides, since tedious hyperparameter tuning is required for a model to reach its optimal performance, machine learning methods generally reach higher precision by sacrificing computing efficiency. Such a strategy brings about heavy financial (i.e., computing resources) and time costs for implementation on huge data sets. In contrast, the AIProbS is determinable and its results are interpretable, meaning that this model could be more suitable for applications requiring high interpretability and for scenarios with huge data scales but insufficient computing resources.
Taking two realizations of ml-100k as an instance, Tabs. 1, 2, 3, and 4 present the relations between different settings of two representative hyperparameters (i.e., representation dimension and learning rate) and a model's average precision when one hyperparameter is fixed and the others are traversed within a specified search range. The standard deviation of precision is presented by an error bar at each data point, and each model's number of hyperparameters is presented in parentheses. Together they reflect the magnitude of a model's performance fluctuation under setting disturbance, which quantifies the model's robustness. As shown in Tab. 1, the AIProbS had a stable performance on recall@10, since the representation dimension of its results is determined. In contrast, as a hyperparameter, the representation dimension can largely influence the precision of machine learning-based baselines. Although the performance on recall@10 of ConvNCF, LightGCN, and NGCF was relatively stable across representation dimensions, that of GCMC and SpectralCF fluctuated largely with the change of representation dimension. For example, when the representation dimension of SpectralCF is set to 48, its average precision could be around 27% higher than when it is set to 16. On top of that, even when the representation dimension of SpectralCF is set to 48, seemingly the optimal choice, its performance on recall@10 still shows a 125% gap between its performance extremes, arising from the different settings of other hyperparameters. Similar fluctuations in the precision of machine learning-based baselines were revealed by the influence of different settings of representation dimension on mrr@10 and by the results shown in Tab. 2 when considering learning rate as the controlled hyperparameter.

Discussion
This article proposes two determinable and interpretable node representation methods. Other attempts include automated machine learning by computer scientists, which searches for and analyzes the suboptimal (or optimal) representation dimension of a machine learning-based network representation model, and computational theory by mathematicians, which interprets the model's implementation process and results. In contrast, from a physical perspective, the two proposed methods generate nodal representations with a determined dimension and interpretable elements, reaching their optimal performance once implemented. When these representations were used for link prediction in bipartite networks, experimental results showed that the AIProbS can make a good trade-off with machine learning-based models on precision, determinacy, and interpretability, indicating the effectiveness of the nodal representations generated by the two proposed methods.
Importantly, these methods, which generalize well, may motivate further research. For example, nodal representations generated by the two proposed methods can also be used in machine learning-based models as initial features or network representations, and the AIProbS provides a unified architecture into which nodal representations generated by other methods, such as machine learning-based ones, can be incorporated, which may further improve the precision of link prediction.
Nevertheless, in line with the no-free-lunch theorem [38], which states that no model can always perform as well as expected in all scenarios, the AIProbS has its disadvantages in some scenarios. As the data scale increased, the AIProbS overall underperformed machine learning-based models on precision. Although the AIProbS can make a good trade-off with machine learning methods on precision and interpretability, in applications where interpretability of results is unnecessary, such as computer vision, machine learning methods seem a better choice. Besides, since quantum machine learning is often claimed to be the next generation of machine learning, one that can exponentially improve a model's computing efficiency, would costly hyperparameter tuning no longer be a concern in the future? In other words, would determinable network representation, which may sacrifice some precision, rather than representation learning-based (i.e., machine learning-based) methods, which excel in precision, still be worth running on quantum computing devices in the future?

Acknowledgments
The author would like to acknowledge Linyuan Lü, Hao Wang, and Fang Zhou for their discussions and suggestions.
100K for an instance, among which the combination cosine + M-M + P used in this article achieves the best performance.

Figure 1. Relation between model precision and representation dimension reflected on ml-100k realization 1

Figure 2. Relation between model precision and learning rate reflected on ml-100k realization 1

Figure 3. Relation between model precision and representation dimension reflected on ml-100k realization 2

Figure 4. Relation between model precision and learning rate reflected on ml-100k realization 2
the same way, given m + n nodes belonging to two sets U and I of two different types in a bipartite network, respectively. Through the network representation methods proposed in Sec. 2.1, the representation matrices $F_U^{m \times s}$ and $F_I^{n \times s}$ of the two types of nodes are generated, each of which consists of nodal representations by row. Then, the $m \times n$ nodal similarity matrix $S^{m \times n}$ calculated by the cosine similarity metric is
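As a hedged illustration of the cosine similarity step, the sketch below assumes the standard definition, in which entry (i, j) is the dot product of row i of the U-side representation matrix and row j of the I-side matrix, divided by the product of their Euclidean norms. The function name `cosine_similarity_matrix`, the toy matrices, and the choice to return 0 for an all-zero row are assumptions made for illustration and are not taken from the article.

```python
import math


def cosine_similarity_matrix(F_U, F_I):
    """Return the m x n matrix of cosine similarities between rows.

    F_U is an m x s list of row vectors and F_I is an n x s list of row
    vectors. A pair involving an all-zero row gets similarity 0.0
    (an assumption; the article does not specify this case).
    """
    def norm(v):
        return math.sqrt(sum(x * x for x in v))

    norms_U = [norm(u) for u in F_U]
    norms_I = [norm(i) for i in F_I]
    S = []
    for u, nu in zip(F_U, norms_U):
        row = []
        for i, ni in zip(F_I, norms_I):
            if nu == 0.0 or ni == 0.0:
                row.append(0.0)
            else:
                dot = sum(a * b for a, b in zip(u, i))
                row.append(dot / (nu * ni))
        S.append(row)
    return S


# Toy representations with m = 2, n = 3, s = 2 (made-up values).
F_U = [[1.0, 0.0], [1.0, 1.0]]
F_I = [[2.0, 0.0], [0.0, 3.0], [1.0, 1.0]]
S = cosine_similarity_matrix(F_U, F_I)
```

Because each entry depends only on the two rows involved, the result has one row per node in U and one column per node in I, matching the m × n shape stated in the text.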

Table 1. Overview of data sets.

Table 2. Results of model precision on LastFM.

Table 3. Results of model precision on MovieLens 100K.

Table 4. Results of model precision on MovieLens 1M.

Table C1. Results of model precision based on other combinations on MovieLens 100K, as an instance.