Abstract
Determining the core structure of complex network systems allows us to simplify them. Using h-bridge and h-strength measurements in a weighted network, we extract the h-backbone core structure. We find that focusing on the h-backbone in a network allows greater simplification because it has fewer edges and thus fewer adjacent nodes. We examine three practical applications: the co-citation network in an information system, the open flight network in a social system, and co-authorship in network science publications.
Introduction
The contemporary study of complex networks began with Watts & Strogatz^{1} and Barabási & Albert^{2}, and the resulting complex network science is now widely used in research on social, information, biological, and technological networks^{3,4,5,6,7,8,9}. Although extracting the network backbone is an important task in network analysis^{10,11,12}, it is difficult to extract the interactions between nodes or edges and the unique core structure. The numerous attempts to extract the backbone of a complex network have used different quantities (e.g., the degree distribution or the edge-betweenness centrality distribution^{13}) in an effort to preserve backbone information. Other approaches have focused on network type, e.g., economic systems^{14} or online recommendation networks^{15}. Another key issue is that these backbones are not unique, and some parameters must be set artificially.
Using the h-index^{16} metric, which is now commonly used in recommendation networks and in other network applications^{17}, we previously introduced the h-degree and h-strength and extracted the h-core and h-subnet of a weighted network^{18,19,20}. Although in that work we were able to use the h-degree and h-strength to extract functionally significant core information, both h-factors overlook nodes and edges that have a relatively low weight, which are often the very nodes and edges that are vital in transporting the flow of information. Moreover, according to weak tie theory^{21,22}, some weak links can be structurally important in networks.
Since the 1970s, different types of centrality have been defined to quantify the importance of each node and edge in a given network^{23,24,25,26}. When extracting important network information, ranking edge centrality is more effective than ranking node centrality, because nodes can exist in isolation but edges always connect two nodes. Edge weights are naturally generated in a network, better represent interaction levels between nodes, and thus provide an index that quantifies the importance of network functions. At the same time, edge betweenness reveals the structural characteristics of a network. In some of the literature^{13,27} edge betweenness is used to extract the structural skeleton of a network. Thus combining edge weight and edge betweenness can provide important information about both network function and structure.
In our research we combine the h-bridge and h-strength to capture the structurally important interactions of edges with adjacent nodes. After extracting the structural h-bridge and the functional h-strength in a weighted network, we synthesize an h-backbone that combines both structural and functional interactions.
Data
We use three sets of data in our research.

(1) Co-citation network: From the ISI Web of Science (WoS) on 18 May 2017 we obtained the top 100 most-cited articles that cited Hirsch’s original paper defining the h-index (“An index to quantify an individual’s scientific research output”). We examined the references that occurred more than five times, set up a co-citation network, and then deleted Hirsch’s original paper. Allowing it to remain would have affected the edge betweenness because it was connected to all the other references.

(2) Open flight network: We obtained the updated open flight data online in January 2012 (https://openflights.org/data.html). It lists approximately 60,000 routes between over 3200 airports worldwide. We transformed the data into an undirected weighted network in which the weight of a route is the number of airlines flying between two nodes (two airports).

(3) Co-authorship in network science publishing: We also use the classic co-authorship network of scientists working on network theory and experiment^{24}, compiled by M. Newman in May 2006. We assign the network weights as described in Newman’s work^{26}.

These three data sets represent two typical types of network. The first and the last are information networks, and the second is a social (transportation) network. Table 1 shows the main features of these weighted networks.
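The transformation described for the open flight data can be illustrated with a minimal Python sketch. It assumes that each route record is a tuple of (airline, source airport, destination airport) and that "number of airlines" means the number of distinct airlines serving a pair of airports in either direction. The function name `build_route_network` and the toy records are hypothetical and are not taken from the OpenFlights file.

```python
from collections import defaultdict

def build_route_network(routes):
    """Collapse (airline, source, destination) records into undirected weighted edges.

    Assumption: the weight of an edge is the number of distinct airlines
    serving the airport pair, regardless of direction. Self-loops are dropped.
    """
    airlines_by_pair = defaultdict(set)
    for airline, src, dst in routes:
        if src == dst:
            continue
        pair = tuple(sorted((src, dst)))  # undirected: (A, B) equals (B, A)
        airlines_by_pair[pair].add(airline)
    return {pair: len(airlines) for pair, airlines in airlines_by_pair.items()}

# Hypothetical toy records for illustration only.
routes = [
    ("AA", "ORD", "JFK"),
    ("UA", "JFK", "ORD"),
    ("AA", "JFK", "ORD"),  # same airline in the reverse direction: counted once
    ("DL", "ORD", "ANC"),
]
weights = build_route_network(routes)
```

Storing each pair as a sorted tuple is one simple way to make the network undirected; any canonical ordering of the two endpoints would serve equally well.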
Results
We run experiments to test our method of identifying the h-backbone in a weighted network.
Figure 1 shows the procedure for identifying the h-backbone in a co-citation network. The left side shows the original network and the right its h-backbone.
The h-backbone in Figure 1 contains both highly cited papers, such as Egghe’s paper in 2006 and Ball’s in 2005, and bridge papers that connect related research topics, such as Brin’s article in Computer Networks & ISDN Systems, which provides a foundation for many later articles that combine web search engine design and h-index research. Table 2 provides structural information and lists all of the nodes in the h-backbone that form the core of the weighted network. The percentages of edges and nodes in the h-backbone of the co-citation network vs. the total are 0.08% and 2.47%, respectively.
Figure 2 shows the h-backbone of the open flight network. On the left side is the image of the original network and on the right is its h-backbone.
In the original open flight network, a node is an airport labeled by its IATA code. To clarify the information, we add the name of the city to the IATA code.
Using the h-backbone network we identify the airports that are structurally and functionally most important, e.g., “Chicago-ORD,” which is one of the world’s biggest passenger airports, and “Anchorage-ANC,” which is one of the world’s busiest cargo airports. We evaluate airport performance in terms of passengers, cargo (freight and mail), and aircraft movement. Table 3 supplies examples of important h-backbone nodes according to the ACI 2012 World Annual Traffic Report (WATR). The percentages of h-backbone edges and nodes in the open flight network vs. the total are 0.30% and 1.96%, respectively.
Here the importance of an airport is determined by combining its passenger traffic, its cargo traffic, and its aircraft movements. The h-backbone thus reflects this importance.
Figure 3 shows the h-backbone of co-authorship in network science publishing. On the left side is an image of the original network^{24}, in which only the largest component is shown. On the right is the h-backbone of the entire network. The blue triangles on the left are the nodes in the h-backbone. Note that these h-backbone nodes are important in the original network. The percentages of edges and nodes in the h-backbone vs. the total are 0.9% and 0.5%, respectively.
These three cases show that we can identify an h-backbone in a weighted network, and that with fewer than 1% of the edges and 3% of the nodes the h-backbone is a core structure in the weighted network. This approach effectively locates and extracts the structurally and functionally important edges with adjacent nodes in weighted networks.
Discussion
Unlike the backbones produced by other approaches^{10,11,12}, the h-backbone is unique for each network. In the Serrano approach^{10}, the edges adjacent to a node that are judged to be more significant are assigned to the backbone. This “significance” is determined using a “disparity filter” with a variable α that strongly affects how many edges or nodes remain in the backbone. In the h-backbone algorithm, the number of edges remaining in the h-backbone is determined solely by network characteristics, i.e., edge weight (h-strength) and network structure (h-bridge). The h-backbone algorithm is also highly efficient, and it preserves the small number of edges and nodes that carry important information. Moreover, because the h-backbone focuses on edges rather than nodes, it retains more structural characteristics. As a result, there are no isolated nodes in the h-backbone, and every node is connected to at least one other node. Figure 4 shows a comparative example.
Table 4 shows a numerical comparison of the h-backbone and the Serrano backbone in three real-world networks.
In Table 4, each number is the number of nodes or edges in the corresponding network. The number in parentheses is the percentage of nodes or edges that overlap with the h-backbone, i.e., the number of nodes or edges in both the Serrano backbone and the h-backbone divided by the number of nodes or edges in the Serrano backbone.
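The overlap percentage in parentheses can be computed as in the following Python sketch. The function name `overlap_percentage` and the toy edge sets are hypothetical; undirected edges are stored as `frozenset` objects so that (A, B) and (B, A) compare equal.

```python
def overlap_percentage(serrano_items, h_items):
    """Percentage of Serrano-backbone items (nodes or edges) also in the h-backbone."""
    serrano_items, h_items = set(serrano_items), set(h_items)
    if not serrano_items:
        raise ValueError("the Serrano backbone is empty, so the ratio is undefined")
    return 100.0 * len(serrano_items & h_items) / len(serrano_items)

# Hypothetical toy backbones, not data from Table 4.
serrano_edges = {
    frozenset(("A", "B")),
    frozenset(("B", "C")),
    frozenset(("C", "D")),
    frozenset(("D", "E")),
}
h_edges = {frozenset(("B", "A")), frozenset(("C", "D"))}

edge_overlap = overlap_percentage(serrano_edges, h_edges)  # 2 of 4 edges shared
```

Because the denominator is the size of the Serrano backbone, a value of 100% means that every Serrano edge (or node) also appears in the h-backbone, not that the two backbones are identical.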
Note that the Serrano backbone requires the artificial parameter α. When this parameter changes, the number of network nodes and edges changes drastically. When α = 0.01, the similarity between the two backbones exceeds 30%, and in one case (the co-citation network) the overlap is a complete 100%. When α = 0.05, the similarity is lower, in part because the number of edges preserved by the h-backbone is smaller than the number preserved by the Serrano backbone.
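To make the dependence on α concrete, the following Python sketch implements the disparity filter as it is commonly described for the Serrano approach^{10}: for a node of degree k and strength s, an edge of weight w has normalized weight p = w/s and significance (1 − p)^{k−1}, and the edge is kept if this value is below α for at least one endpoint of degree greater than one. This is an illustration under those assumptions, not the code used for Table 4; the function name and the toy weights are hypothetical, and all weights are assumed to be positive.

```python
from collections import defaultdict

def disparity_filter(weights, alpha):
    """Return the edges kept by a disparity filter at significance level alpha.

    `weights` maps an undirected edge (u, v) to a positive weight.
    """
    strength = defaultdict(float)
    degree = defaultdict(int)
    for (u, v), w in weights.items():
        for node in (u, v):
            strength[node] += w
            degree[node] += 1
    kept = set()
    for (u, v), w in weights.items():
        for node in (u, v):
            k = degree[node]
            if k <= 1:
                continue  # a degree-1 endpoint cannot certify its only edge
            p = w / strength[node]
            if (1.0 - p) ** (k - 1) < alpha:
                kept.add((u, v))
                break
    return kept

# Hypothetical star network around a hub H.
toy = {("H", "A"): 70.0, ("H", "B"): 10.0, ("H", "C"): 10.0, ("H", "D"): 10.0}
kept_005 = disparity_filter(toy, 0.05)  # (1 - 0.7)**3 is about 0.027 < 0.05
kept_001 = disparity_filter(toy, 0.01)  # 0.027 is not below 0.01
```

In this toy case the backbone shrinks from one edge to none when α is lowered from 0.05 to 0.01, which illustrates how strongly the chosen threshold controls the size of the result.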
Unlike the backbones in the current literature, the h-backbone needs no parameter to adjust the size of the resulting backbone, and thus the h-backbone of each network is uniquely determined. Using the h-backbone method eliminates artificial interference in the process of backbone extraction.
Whether an h-backbone is connected or unconnected is determined by the original structure of the network. In our examples, the h-backbone of the co-citation network is connected and the h-backbone of the open flight network is unconnected.
In general, if we assume that the h-backbone has m edges and n nodes, with the h-bridge and h-strength of h_{b} and h_{s} respectively, the number of edges in the h-backbone will be fewer than or equal to h_{b} + h_{s}, and the number of nodes in the h-backbone will be fewer than or equal to 2(h_{b} + h_{s}). Because one edge links two nodes, n ≤ 2m. Thus
$$m \le h_{b} + h_{s}, \qquad n \le 2m \le 2(h_{b} + h_{s}).$$
The structure of h-backbones varies from network to network, and because of this complexity we have not attempted to provide a mathematical proof for the h-backbone, which is a limitation of this work. However, recent research^{28} has demonstrated the relation between the h-index and coreness. The h-backbone combines the structural importance of the h-bridge with the functional importance of the h-strength, and thus it retains both structural and functional core interactions.
Conclusion
We have introduced a method of finding the h-backbone, which is a core structure in weighted networks. This core network structure of edges and adjacent nodes is important both structurally and functionally, and our method can be used to simplify complex weighted networks. Because the h-backbone integrates core edges with adjacent nodes, the important information of the weighted network is retained. Unlike previous backbones, the h-backbone is a unique core network structure.
The h-backbone methodology can be generalized to other weighted networks. Currently, our case study addresses only undirected weighted information networks, leaving directed weighted, heterogeneous, and multilayer weighted networks^{29} for future research. Dynamic issues are also left for future study.
Method
A network (graph) consists of nodes (vertices) and edges (links)^{30,31}. When nodes and edges represent information-related and society-related objects, we designate the two systems information and social networks, respectively.
Betweenness centrality is a measure of centrality in a graph based on shortest paths. There are two variants, node betweenness and edge betweenness. We focus on edge betweenness because it quantifies the number of times an edge acts as a bridge on the shortest path between two nodes. As introduced by Linton Freeman^{23}, the betweenness centrality of a node is the number of shortest paths that pass through it. The edge betweenness of an edge can be defined similarly^{27}.
In a given network G = (V,E), the edge betweenness of an edge v is defined as
$$EB(v) = \sum_{s \neq t \in V} \frac{\sigma_{st}(v)}{\sigma_{st}},$$
where σ_{st} is the total number of shortest paths from node s to node t and σ_{st}(v) is the number of those paths that pass through edge v.
Edge betweenness quantifies the structural importance of a network edge. An edge with a higher edge betweenness often acts as a bridge to transmit information. Note that, by definition, in a network of N nodes the maximum edge betweenness of a given edge is N × (N − 1), i.e., the greater the number of nodes in a network, the larger the edge betweenness of most of the edges. Thus we introduce a new measurement, the bridge, which we obtain by dividing the edge betweenness by the number of all nodes N,
$$\mathrm{bridge}(v) = \frac{EB(v)}{N}.$$
After we calculate the bridge for all edges, we rank them using an h-index approach.
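As an illustration of the bridge measurement, the following Python sketch computes edge betweenness with a Brandes-style accumulation and then divides by the number of nodes. Two assumptions are made here that the text does not settle: shortest paths are counted by number of hops (edge weights are ignored), and the sum runs over ordered pairs (s, t), which is consistent with the stated maximum of N × (N − 1). The function names are hypothetical.

```python
from collections import deque

def edge_betweenness(adj):
    """Edge betweenness over ordered pairs (s, t), s != t, on unweighted shortest paths.

    `adj` maps every node to the set of its neighbours (symmetric adjacency).
    """
    eb = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for s in adj:
        # Breadth-first search from s: count shortest paths and record predecessors.
        dist = {s: 0}
        sigma = {node: 0 for node in adj}
        sigma[s] = 1
        preds = {node: [] for node in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate path shares from the farthest nodes back towards s.
        delta = {node: 0.0 for node in adj}
        for w in reversed(order):
            for v in preds[w]:
                share = sigma[v] / sigma[w] * (1.0 + delta[w])
                eb[frozenset((v, w))] += share
                delta[v] += share
    return eb

def bridge_values(adj):
    """Bridge of each edge: edge betweenness divided by the number of nodes N."""
    n = len(adj)
    return {edge: value / n for edge, value in edge_betweenness(adj).items()}

# Path graph A - B - C: each edge lies on 4 of the 6 ordered shortest paths.
path = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}}
bridges = bridge_values(path)
```

If weighted shortest paths are intended, the breadth-first search would need to be replaced by Dijkstra's algorithm; the accumulation step stays the same.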
Definition 1. h-bridge
The h-bridge (h_{b}) of a network is the largest natural number h_{b} such that there are h_{b} links in the network, each with a bridge at least equal to h_{b}.
We also define the h-strength^{20}.
Definition 2. h-strength
The h-strength (h_{s}) of a network is the largest natural number h_{s} such that there are h_{s} links in the network, each with a strength at least equal to h_{s}.
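Both definitions apply the same h-index rule to a list of edge values (bridges for h_{b}, strengths for h_{s}). A minimal Python sketch follows; the function name `h_index` and the sample strengths are hypothetical.

```python
def h_index(values):
    """Largest natural number h such that at least h of the values are >= h."""
    h = 0
    for rank, value in enumerate(sorted(values, reverse=True), start=1):
        if value >= rank:
            h = rank
        else:
            break
    return h

# Hypothetical link strengths: four links have strength >= 4, but not five >= 5.
strengths = [9, 7, 4, 4, 2, 1]
h_s = h_index(strengths)
```

The same function works for non-integer values such as bridges, since only the comparison with the integer rank matters.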
Because the h-bridge quantifies the structurally important edges connecting the network, and the h-strength characterizes the core edges of a network in terms of link strengths, we can obtain the core backbone structure by combining them.
Definition 3. The h-backbone
An h-backbone of a network is a core subnetwork consisting of all edges whose bridge is larger than or equal to the h-bridge or whose strength is larger than or equal to the h-strength of the network, together with their adjacent nodes.
In a weighted network the algorithm for extracting the h-backbone has three steps (Fig. 5).
Step 1: Find the edges with a bridge higher than or equal to the h-bridge;
Step 2: Find the edges with a weight higher than or equal to the h-strength;
Step 3: Identify the h-backbone by merging the edges of Steps 1 and 2 and adding their adjacent nodes.
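The three steps can be sketched in Python as follows, assuming the bridge and the weight of every edge have already been computed and are stored in dictionaries keyed by edge. The function names and the toy values are hypothetical, and the bridge values below are made up for illustration rather than derived from the toy graph.

```python
def h_index(values):
    """Largest natural number h such that at least h of the values are >= h."""
    h = 0
    for rank, value in enumerate(sorted(values, reverse=True), start=1):
        if value >= rank:
            h = rank
        else:
            break
    return h

def h_backbone(weights, bridges):
    """Return (edges, nodes, h_b, h_s) of the h-backbone.

    Note: if h_b or h_s is 0, the corresponding threshold keeps every edge;
    how that corner case should be treated is not specified in the text.
    """
    h_b = h_index(bridges.values())
    h_s = h_index(weights.values())
    step1 = {edge for edge, b in bridges.items() if b >= h_b}  # Step 1
    step2 = {edge for edge, w in weights.items() if w >= h_s}  # Step 2
    edges = step1 | step2                                      # Step 3: merge
    nodes = {node for edge in edges for node in edge}          # adjacent nodes
    return edges, nodes, h_b, h_s

# Hypothetical toy values for five edges.
weights = {("A", "B"): 5, ("B", "C"): 3, ("C", "D"): 3, ("D", "E"): 1, ("A", "C"): 1}
bridges = {("A", "B"): 0.5, ("B", "C"): 1.2, ("C", "D"): 2.4, ("D", "E"): 2.0, ("A", "C"): 0.3}
edges, nodes, h_b, h_s = h_backbone(weights, bridges)
```

In this toy case the edge ("D", "E") enters the backbone through its bridge alone even though its weight is low, which is the kind of structurally important weak link the h-bridge is meant to retain.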
References
1. Watts, D. & Strogatz, S. Collective dynamics of ‘small-world’ networks. Nature 393, 440–442 (1998).
2. Barabási, A. & Albert, R. Emergence of scaling in random networks. Science 286, 509–512 (1999).
3. Wasserman, S. & Faust, K. Social Network Analysis: Methods and Applications. Cambridge University Press, Cambridge (1994).
4. Strogatz, S. Exploring complex networks. Nature 410, 268–276 (2001).
5. Albert, R. & Barabási, A. Statistical mechanics of complex networks. Rev Mod Phys 74, 47–97 (2002).
6. Otte, E. & Rousseau, R. Social network analysis: a powerful strategy, also for the information sciences. J Inf Sci 28, 441–453 (2002).
7. Newman, M. The structure and function of complex networks. SIAM Rev 45, 167–256 (2003).
8. Barrat, A., Barthelemy, M., Pastor-Satorras, R. & Vespignani, A. The architecture of complex weighted networks. Proc Natl Acad Sci USA 101, 3747–3752 (2004).
9. Borner, K., Sanyal, S. & Vespignani, A. Network science. Ann Rev Inf Sci Technol 41, 537–607 (2007).
10. Serrano, M., Boguna, M. & Vespignani, A. Extracting the multiscale backbone of complex weighted networks. Proc Natl Acad Sci USA 106, 6483–6488 (2009).
11. Radicchi, F., Ramasco, J. J. & Fortunato, S. Information filtering in complex weighted networks. Phys Rev E 83, 046101 (2011).
12. Zhang, X., Zhang, Z., Zhao, H., Wang, Q. & Zhu, J. Extracting the Globally and Locally Adaptive Backbone of Complex Networks. PLoS One 9, e100428 (2014).
13. Kim, D., Noh, J. & Jeong, H. Scale-free trees: The skeletons of complex networks. Phys Rev E 70, 046126 (2004).
14. Glattfelder, J. & Battiston, S. Backbone of complex networks of corporations: The flow of control. Phys Rev E 80, 036104 (2009).
15. Zhang, Q., Zeng, A. & Shang, M. Extracting the Information Backbone in Online System. PLoS One 8, e62624 (2013).
16. Hirsch, J. An index to quantify an individual’s scientific research output. Proc Natl Acad Sci USA 102, 16569–16572 (2005).
17. Schubert, A., Korn, A. & Telcs, A. Hirsch-type indices for characterizing networks. Scientometrics 78, 375–382 (2009).
18. Zhao, S. X., Rousseau, R. & Ye, F. Y. h-Degree as a basic measure in weighted networks. J Informetr 5, 668–677 (2011).
19. Zhao, S. X. & Ye, F. Y. Exploring the directed h-degree in directed weighted networks. J Informetr 6, 619–630 (2012).
20. Zhao, S. X., Zhang, P., Li, J., Tan, A. M. & Ye, F. Y. Abstracting the Core Subnet of Weighted Networks Based on Link Strengths. J Assoc Inf Sci Tech 65, 984–994 (2014).
21. Granovetter, M. The strength of weak ties. Am J Sociol 78, 1360–1380 (1973).
22. Jack, S. The role, use and activation of strong and weak network ties: A qualitative analysis. J Manage Stud 42, 1233–1259 (2005).
23. Freeman, L. C. A Set of Measures of Centrality Based on Betweenness. Sociometry 40, 35–41 (1977).
24. Newman, M. Finding community structure in networks using the eigenvectors of matrices. Phys Rev E 74, 036104 (2006).
25. Opsahl, T., Agneessens, F. & Skvoretz, J. Node centrality in weighted networks: Generalizing degree and shortest paths. Soc Netw 32, 245–251 (2010).
26. Newman, M. Scientific collaboration networks. II. Shortest paths, weighted networks, and centrality. Phys Rev E 64(2), 016132 (2001).
27. Girvan, M. & Newman, M. Community structure in social and biological networks. Proc Natl Acad Sci USA 99, 7821–7826 (2002).
28. Lu, L., Zhou, T., Zhang, Q. & Stanley, H. E. The H-index of a network node and its relation to degree and coreness. Nat Commun 7, 10168 (2016).
29. Li, S. X., Lin, X., Liu, X. Z. & Ye, F. Y. H-crystal as a Core Structure in Multilayer Weighted Networks. Am J Inf Sci Comput Eng 2(4), 29–44 (2016).
30. Boccaletti, S., Latora, V., Moreno, Y., Chavez, M. & Hwang, D. U. Complex networks: Structure and dynamics. Phys Rep 424, 175–308 (2006).
31. Newman, M. Networks: An Introduction. Oxford University Press, Oxford (2010).
Acknowledgements
We acknowledge the financial support from the National Natural Science Foundation of China Grant No. 71673131. The Boston University Center for Polymer Studies is supported by NSF Grants PHY-1505000, CMMI-1125290, and CHE-1213217, by DTRA Grant HDTRA1-14-1-0017, and by DOE Contract DE-AC07-05Id14517.
Contributions
R.J.Z. initiated the idea, collected data and processed figures and tables, H.E.S. checked the research and wrote the paper, and F.Y.Y. designed the research and wrote the paper.
Ethics declarations
Competing Interests
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Zhang, R.J., Stanley, H.E. & Ye, F.Y. Extracting h-Backbone as a Core Structure in Weighted Networks. Sci Rep 8, 14356 (2018). https://doi.org/10.1038/s41598-018-32430-1
Keywords
 Core Structure
 Science Network Publishing
 Adjacent Nodes
 Complex Network System
 Edge Betweenness