  • Letter

Hierarchical structure and the prediction of missing links in networks

Abstract

Networks have in recent years emerged as an invaluable tool for describing and quantifying complex systems in many branches of science [1,2,3]. Recent studies suggest that networks often exhibit hierarchical organization, in which vertices divide into groups that further subdivide into groups of groups, and so forth over multiple scales. In many cases the groups are found to correspond to known functional units, such as ecological niches in food webs, modules in biochemical networks (protein interaction networks, metabolic networks or genetic regulatory networks) or communities in social networks [4,5,6,7]. Here we present a general technique for inferring hierarchical structure from network data and show that the existence of hierarchy can simultaneously explain and quantitatively reproduce many commonly observed topological properties of networks, such as right-skewed degree distributions, high clustering coefficients and short path lengths. We further show that knowledge of hierarchical structure can be used to predict missing connections in partly known networks with high accuracy, and for more general network structures than competing techniques [8]. Taken together, our results suggest that hierarchy is a central organizing principle of complex networks, capable of offering insight into many network phenomena.
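As a rough illustration of the idea in the abstract, the following Python sketch shows one way a hierarchical random graph could be represented and used to rank candidate missing links. It is a minimal sketch under stated assumptions, not the authors' implementation: the nested-tuple dendrogram format and the function names (`leaves`, `edge_probabilities`, `sample_graph`, `fit_probabilities`, `rank_missing_links`) are hypothetical, and the dendrogram is fixed by hand rather than inferred from the data. The fitting procedure actually used is described in the Supplementary Notes.

```python
import random

# Assumed representation: a leaf is an int (vertex id); an internal node is a
# tuple (left_subtree, right_subtree, p), where p is the probability that a
# vertex under the left subtree links to a vertex under the right subtree.


def leaves(node):
    """Return the vertex ids under a dendrogram node."""
    if isinstance(node, int):
        return [node]
    left, right, _ = node
    return leaves(left) + leaves(right)


def edge_probabilities(node, probs=None):
    """Map each pair (u, v), u < v, to the p of its lowest common ancestor."""
    if probs is None:
        probs = {}
    if isinstance(node, int):
        return probs
    left, right, p = node
    for u in leaves(left):
        for v in leaves(right):
            probs[(min(u, v), max(u, v))] = p
    edge_probabilities(left, probs)
    edge_probabilities(right, probs)
    return probs


def sample_graph(dendrogram, rng):
    """Draw one graph: each pair is linked independently with its ancestor's p."""
    probs = edge_probabilities(dendrogram)
    return {pair for pair, p in probs.items() if rng.random() < p}


def fit_probabilities(node, edges):
    """Reset p at each internal node to (edges across the split) / (|left| * |right|)."""
    if isinstance(node, int):
        return node
    left, right, _ = node
    l, r = leaves(left), leaves(right)
    crossing = sum((min(u, v), max(u, v)) in edges for u in l for v in r)
    return (
        fit_probabilities(left, edges),
        fit_probabilities(right, edges),
        crossing / (len(l) * len(r)),
    )


def rank_missing_links(dendrogram, observed_edges):
    """Rank unobserved pairs by fitted probability, most likely first."""
    fitted = fit_probabilities(dendrogram, observed_edges)
    probs = edge_probabilities(fitted)
    candidates = [(p, pair) for pair, p in probs.items() if pair not in observed_edges]
    return [pair for p, pair in sorted(candidates, key=lambda t: (-t[0], t[1]))]


# Toy example: two tight groups {0, 1, 2} and {3, 4, 5} joined by one edge.
# The initial p values are placeholders; fit_probabilities overwrites them.
dendrogram = (((0, 1, 0.5), 2, 0.5), ((3, 4, 0.5), 5, 0.5), 0.5)
observed = {(0, 1), (0, 2), (2, 3), (3, 4), (3, 5), (4, 5)}
ranked = rank_missing_links(dendrogram, observed)
print(ranked[0])  # (1, 2): the only unobserved pair inside a group
```

In this toy case the within-group pair (1, 2) gets a fitted probability of 1/2, while each unobserved between-group pair gets 1/9, so the within-group pair is ranked first. A single hand-built dendrogram is used here only to keep the example short; it says nothing about how well any one tree would perform on real data.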

Figure 1: A hierarchical network with structure on many scales, and the corresponding hierarchical random graph.
Figure 2: Application of the hierarchical decomposition to the network of grassland species interactions.
Figure 3: Comparison of link prediction methods.

References

  1. Wasserman, S. & Faust, K. Social Network Analysis (Cambridge Univ. Press, Cambridge, 1994)

  2. Albert, R. & Barabási, A.-L. Statistical mechanics of complex networks. Rev. Mod. Phys. 74, 47–97 (2002)

  3. Newman, M. E. J. The structure and function of complex networks. SIAM Rev. 45, 167–256 (2003)

  4. Ravasz, E., Somera, A. L., Mongru, D. A., Oltvai, Z. N. & Barabási, A.-L. Hierarchical organization of modularity in metabolic networks. Science 297, 1551–1555 (2002)

  5. Clauset, A., Newman, M. E. J. & Moore, C. Finding community structure in very large networks. Phys. Rev. E 70, 066111 (2004)

  6. Guimerà, R. & Amaral, L. A. N. Functional cartography of complex metabolic networks. Nature 433, 895–900 (2005)

  7. Lagomarsino, M. C., Jona, P., Bassetti, B. & Isambert, H. Hierarchy and feedback in the evolution of the Escherichia coli transcription network. Proc. Natl Acad. Sci. USA 104, 5516–5520 (2007)

  8. Liben-Nowell, D. & Kleinberg, J. M. The link-prediction problem for social networks. J. Am. Soc. Inform. Sci. Technol. 58, 1019–1031 (2007)

  9. Girvan, M. & Newman, M. E. J. Community structure in social and biological networks. Proc. Natl Acad. Sci. USA 99, 7821–7826 (2002)

  10. Krause, A. E., Frank, K. A., Mason, D. M., Ulanowicz, R. E. & Taylor, W. W. Compartments revealed in food-web structure. Nature 426, 282–285 (2003)

  11. Radicchi, F., Castellano, C., Cecconi, F., Loreto, V. & Parisi, D. Defining and identifying communities in networks. Proc. Natl Acad. Sci. USA 101, 2658–2663 (2004)

  12. Watts, D. J., Dodds, P. S. & Newman, M. E. J. Identity and search in social networks. Science 296, 1302–1305 (2002)

  13. Kleinberg, J. in Proc. 2001 Neural Inform. Processing Systems Conf. (eds Dietterich, T. G., Becker, S. & Ghahramani, Z.) 431–438 (MIT Press, Cambridge, MA, 2002)

  14. Palla, G., Derényi, I., Farkas, I. & Vicsek, T. Uncovering the overlapping community structure of complex networks in nature and society. Nature 435, 814–818 (2005)

  15. Casella, G. & Berger, R. L. Statistical Inference (Duxbury, Belmont, 2001)

  16. Newman, M. E. J. & Barkema, G. T. Monte Carlo Methods in Statistical Physics (Clarendon, Oxford, 1999)

  17. Newman, M. E. J. Assortative mixing in networks. Phys. Rev. Lett. 89, 208701 (2002)

  18. Huss, M. & Holme, P. Currency and commodity metabolites: Their identification and relation to the modularity of metabolic networks. IET Syst. Biol. 1, 280–285 (2007)

  19. Krebs, V. Mapping networks of terrorist cells. Connections 24, 43–52 (2002)

  20. Dawah, H. A., Hawkins, B. A. & Claridge, M. F. Structure of the parasitoid communities of grass-feeding chalcid wasps. J. Anim. Ecol. 64, 708–720 (1995)

  21. Bryant, D. in BioConsensus (eds Janowitz, M., Lapointe, F.-J., McMorris, F. R., Mirkin, B. & Roberts, F.) pp. 163–184 (Series in Discrete Mathematics and Theoretical Computer Science, Vol. 61, American Mathematical Society-DIMACS, Providence, RI, 2003)

  22. Dunne, J. A., Williams, R. J. & Martinez, N. D. Food-web structure and network theory: The role of connectance and size. Proc. Natl Acad. Sci. USA 99, 12917–12922 (2002)

  23. Szilágyi, A., Grimm, V., Arakaki, A. K. & Skolnick, J. Prediction of physical protein–protein interactions. Phys. Biol. 2, S1–S16 (2005)

  24. Sprinzak, E., Sattath, S. & Margalit, H. How reliable are experimental protein-protein interaction data? J. Mol. Biol. 327, 919–923 (2003)

  25. Ito, T. et al. A comprehensive two-hybrid analysis to explore the yeast protein interactome. Proc. Natl Acad. Sci. USA 98, 4569–4574 (2001)

  26. Lakhina, A., Byers, J. W., Crovella, M. & Xie, P. in INFOCOM 2003: Twenty-Second Annual Joint Conf. IEEE Computer and Communications Societies (ed. Bauer, F.) Vol. 1 332–341 (IEEE, Piscataway, New Jersey, 2003)

  27. Clauset, A. & Moore, C. Accuracy and scaling phenomena in Internet mapping. Phys. Rev. Lett. 94, 018701 (2005)

  28. Martinez, N. D., Hawkins, B. A., Dawah, H. A. & Feifarek, B. P. Effects of sampling effort on characterization of food-web structure. Ecology 80, 1044–1055 (1999)

  29. Hanley, J. A. & McNeil, B. J. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 143, 29–36 (1982)

  30. Sales-Pardo, M., Guimerà, R., Moreira, A. A. & Amaral, L. A. N. Extracting the hierarchical organization of complex systems. Proc. Natl Acad. Sci. USA 104, 15224–15229 (2007)

Acknowledgements

We thank J. Dunne, M. Gastner, P. Holme, M. Huss, M. Porter, C. Shalizi and C. Wiggins for their help, and the Santa Fe Institute for its support. C.M. thanks the Center for the Study of Complex Systems at the University of Michigan for hospitality while some of this work was conducted.

Author information

Corresponding author

Correspondence to Aaron Clauset.

Supplementary information

Supplementary Notes

This file contains Supplementary Notes including the technical details of our hierarchical model and the methods used to fit it to empirical data. It also contains additional results on graph resampling and the prediction of missing links, and the algorithmic specifics of our experimental studies. (PDF 123 kb)

About this article

Cite this article

Clauset, A., Moore, C. & Newman, M. Hierarchical structure and the prediction of missing links in networks. Nature 453, 98–101 (2008). https://doi.org/10.1038/nature06830

  • DOI: https://doi.org/10.1038/nature06830
