Link communities reveal multiscale complexity in networks

Journal name: Nature
Volume: 466
Pages: 761–764
DOI: 10.1038/nature09182

Networks have become a key approach to understanding systems of interacting objects, unifying the study of diverse phenomena including biological organisms and human society [1–3]. One crucial step when studying the structure and dynamics of networks is to identify communities [4, 5]: groups of related nodes that correspond to functional subunits such as protein complexes [6, 7] or social spheres [8–10]. Communities in networks often overlap [9, 10] such that nodes simultaneously belong to several groups. Meanwhile, many networks are known to possess hierarchical organization, where communities are recursively grouped into a hierarchical structure [11–13]. However, the fact that many real networks have communities with pervasive overlap, where each and every node belongs to more than one group, has the consequence that a global hierarchy of nodes cannot capture the relationships between overlapping groups. Here we reinvent communities as groups of links rather than nodes and show that this unorthodox approach successfully reconciles the antagonistic organizing principles of overlapping communities and hierarchy. In contrast to the existing literature, which has entirely focused on grouping nodes, link communities naturally incorporate overlap while revealing hierarchical organization. We find relevant link communities in many networks, including major biological networks such as protein–protein interaction [6, 7, 14] and metabolic networks [11, 15, 16], and show that a large social network [10, 17, 18] contains hierarchically organized community structures spanning inner-city to regional scales while maintaining pervasive overlap. Our results imply that link communities are fundamental building blocks that reveal overlap and hierarchical organization in networks to be two aspects of the same phenomenon.
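As a rough illustration of why grouping links "naturally incorporates overlap", the following Python sketch assigns each link to exactly one community and then reads off node memberships. It is not the authors' code; the function name `node_memberships` and the toy community labels are hypothetical.

```python
from collections import defaultdict


def node_memberships(link_communities):
    """Map each node to the set of link-community labels its links belong to."""
    memberships = defaultdict(set)
    for label, links in link_communities.items():
        for u, v in links:
            # A node inherits the community of every link attached to it.
            memberships[u].add(label)
            memberships[v].add(label)
    return dict(memberships)


# Hypothetical toy network: two triangles that share the node "c".
toy = {
    "family": [("a", "b"), ("b", "c"), ("a", "c")],
    "work": [("c", "d"), ("d", "e"), ("c", "e")],
}
memberships = node_memberships(toy)
```

Each link carries a single label, yet node "c" ends up in both communities, which is the sense in which a partition of links yields overlapping groups of nodes.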

At a glance

Figures

  1. Overlapping communities lead to dense networks and prevent the discovery of a single node hierarchy.

    a, Local structure in many networks is simple: an individual node sees the communities it belongs to. b, Complex global structure emerges when every node is in the situation displayed in a. c, Pervasive overlap hinders the discovery of hierarchical organization because nodes cannot occupy multiple leaves of a node dendrogram, preventing a single tree from encoding the full hierarchy. d, e, An example showing link communities (colours in d), the link similarity matrix (e; darker entries show more similar pairs of links) and the link dendrogram (e). f, Link communities from the full word association network around the word ‘Newton’. Link colours represent communities and filled regions provide a guide for the eye. Link communities capture concepts related to science and allow substantial overlap. Note that the words were produced by experiment participants during free word associations.

  2. Assessing the relevance of link communities using real-world networks.

    Composite performance (Methods and Supplementary Information) is a data-driven measure of the quality (relevance of discovered memberships) and coverage (fraction of network classified) of community and overlap. Tested algorithms are link clustering, introduced here; clique percolation [9]; greedy modularity optimization [26]; and Infomap [21]. Test networks were chosen for their varied sizes and topologies and to represent the different domains where network analysis is used. Shown for each are the number of nodes, N, and the average number of neighbours per node, ⟨k⟩. Link clustering finds the most relevant community structure in real-world networks. AP/MS, affinity-purification/mass spectrometry; LC, literature curated; PPI, protein–protein interaction; Y2H, yeast two-hybrid.

  3. Community and membership distributions for the metabolic and mobile phone networks.

    The distribution of community sizes and node memberships (insets). Community size shows a heavy tail. The number of memberships per node is reasonable for both networks: we do not observe phone users that belong to large numbers of communities and we correctly identify currency metabolites, such as water, ATP and inorganic phosphate (Pi), that are prevalently used throughout metabolism. The appearance of currency metabolites in many metabolic reactions is naturally incorporated into link communities, whereas their presence hindered community identification in previous work [11, 15].

  4. Meaningful communities at multiple levels of the link dendrogram.

    a–c, The social network of mobile phone users displays co-located, overlapping communities on multiple scales. a, Heat map of the most likely locations of all users in the region, showing several cities. b, Cutting the dendrogram above the optimum threshold yields small, intra-city communities (insets). c, Below the optimum threshold, the largest communities become spatially extended but still show correlation. d, The social network within the largest community in c, with its largest subcommunity highlighted. The highlighted subcommunity is shown along with its link dendrogram and partition density, D, as a function of threshold, t. Link colours correspond to dendrogram branches. e, Community quality, Q, as a function of dendrogram level, compared with random control (Methods).
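The captions above mention a link similarity matrix, a link dendrogram and a partition density, D, evaluated as a function of threshold, t, but this page does not define them. The Python sketch below fills that gap under stated assumptions: similarity between two links that share a node is taken as the Jaccard overlap of the inclusive neighbourhoods of their unshared endpoints, and D = (2/M) Σ_c m_c (m_c − (n_c − 1)) / ((n_c − 2)(n_c − 1)), where M is the total number of links and m_c and n_c are the links and nodes in community c. Both definitions are recalled from the paper's Methods rather than taken from this excerpt, and all function names are hypothetical. It assumes a simple undirected graph with no self-loops or duplicate links.

```python
from fractions import Fraction
from itertools import combinations, groupby


def link_similarity(adj, e1, e2):
    """Jaccard overlap of the inclusive neighbourhoods of the two unshared endpoints."""
    shared = set(e1) & set(e2)
    if len(shared) != 1:
        return Fraction(0)  # links that do not share exactly one node
    (i,) = set(e1) - shared
    (j,) = set(e2) - shared
    n_i = adj[i] | {i}
    n_j = adj[j] | {j}
    return Fraction(len(n_i & n_j), len(n_i | n_j))


def partition_density(communities, n_links):
    """Assumed form of D: link-weighted average of per-community link density."""
    total = 0.0
    for links in communities:
        m = len(links)
        n = len({node for link in links for node in link})
        if n > 2:  # a single link (n == 2) contributes zero
            total += m * (m - (n - 1)) / ((n - 2) * (n - 1))
    return 2.0 * total / n_links


def cluster_links(edges):
    """Single-linkage clustering of links; returns (D, threshold, communities) at maximum D."""
    edges = [tuple(sorted(e)) for e in edges]
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    scored = [(link_similarity(adj, a, b), a, b) for a, b in combinations(edges, 2)]
    scored = sorted((s for s in scored if s[0] > 0), key=lambda s: s[0], reverse=True)

    label = {e: idx for idx, e in enumerate(edges)}
    members = {idx: {e} for idx, e in enumerate(edges)}
    best = (0.0, None, [set(c) for c in members.values()])

    # Walk down the dendrogram: merge all pairs tied at similarity t, then score the cut.
    for t, tied in groupby(scored, key=lambda s: s[0]):
        for _, a, b in tied:
            keep, drop = label[a], label[b]
            if keep != drop:
                for e in members[drop]:
                    label[e] = keep
                members[keep] |= members.pop(drop)
        d = partition_density(members.values(), len(edges))
        if d > best[0]:
            best = (d, t, [set(c) for c in members.values()])
    return best


# Hypothetical toy network: two triangles sharing node "c".
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"), ("d", "e"), ("c", "e")]
density, threshold, communities = cluster_links(edges)
```

On this toy network the maximum of D falls at the cut that separates the two triangles, with node "c" touching both link communities. Cutting at a higher threshold leaves smaller groups and cutting lower merges everything, which mirrors the behaviour described for panels b and c.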

References

  1. Newman, M. E. J., Barabási, A.-L. & Watts, D. J. The Structure and Dynamics of Networks (Princeton Univ. Press, 2006)
  2. Caldarelli, G. Scale-Free Networks: Complex Webs in Nature and Technology (Oxford Univ. Press, 2007)
  3. Dorogovtsev, S. N., Goltsev, A. V. & Mendes, J. F. F. Critical phenomena in complex networks. Rev. Mod. Phys. 80, 1275–1335 (2008)
  4. Girvan, M. & Newman, M. E. J. Community structure in social and biological networks. Proc. Natl Acad. Sci. USA 99, 7821–7826 (2002)
  5. Fortunato, S. Community detection in graphs. Phys. Rep. 486, 75–174 (2010)
  6. Krogan, N. J. et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature 440, 637–643 (2006)
  7. Gavin, A.-C. et al. Proteome survey reveals modularity of the yeast cell machinery. Nature 440, 631–636 (2006)
  8. Wasserman, S. & Faust, K. Social Network Analysis: Methods and Applications. Structural analysis in the social sciences (Cambridge Univ. Press, 1994)
  9. Palla, G., Derényi, I., Farkas, I. & Vicsek, T. Uncovering the overlapping community structure of complex networks in nature and society. Nature 435, 814–818 (2005)
  10. Palla, G., Barabási, A. & Vicsek, T. Quantifying social group evolution. Nature 446, 664–667 (2007)
  11. Ravasz, E., Somera, A. L., Mongru, D. A., Oltvai, Z. N. & Barabási, A.-L. Hierarchical organization of modularity in metabolic networks. Science 297, 1551–1555 (2002)
  12. Sales-Pardo, M., Guimera, R., Moreira, A. & Amaral, L. Extracting the hierarchical organization of complex systems. Proc. Natl Acad. Sci. USA 104, 15224–15229 (2007)
  13. Clauset, A., Moore, C. & Newman, M. E. J. Hierarchical structure and the prediction of missing links in networks. Nature 453, 98–101 (2008)
  14. Yu, H. et al. High-quality binary protein interaction map of the yeast interactome network. Science 322, 104–110 (2008)
  15. Guimerà, R. & Amaral, L. A. N. Functional cartography of complex metabolic networks. Nature 433, 895–900 (2005)
  16. Feist, A. M. et al. A genome-scale metabolic reconstruction for Escherichia coli K-12 MG1655 that accounts for 1260 orfs and thermodynamic information. Mol. Syst. Biol. 3, 121 (2007)
  17. Onnela, J.-P. et al. Structure and tie strengths in mobile communication networks. Proc. Natl Acad. Sci. USA 104, 7332–7336 (2007)
  18. González, M. C., Hidalgo, C. A. & Barabási, A.-L. Understanding individual human mobility patterns. Nature 453, 779–782 (2008)
  19. Radicchi, F., Castellano, C., Cecconi, F., Loreto, V. & Parisi, D. Defining and identifying communities in networks. Proc. Natl Acad. Sci. USA 101, 2658–2663 (2004)
  20. Newman, M. E. J. & Girvan, M. Finding and evaluating community structure in networks. Phys. Rev. E 69, 026113 (2004)
  21. Rosvall, M. & Bergstrom, C. T. Maps of random walks on complex networks reveal community structure. Proc. Natl Acad. Sci. USA 105, 1118–1123 (2008)
  22. Reichardt, J. & Bornholdt, S. Detecting fuzzy community structures in complex networks with a Potts model. Phys. Rev. Lett. 93, 218701 (2004)
  23. Li, D. et al. Synchronization interfaces and overlapping communities in complex networks. Phys. Rev. Lett. 101, 168701 (2008)
  24. Lancichinetti, A., Fortunato, S. & Kertesz, J. Detecting the overlapping and hierarchical community structure in complex networks. N. J. Phys. 11, 033015 (2009)
  25. Fortunato, S. & Barthélemy, M. Resolution limit in community detection. Proc. Natl Acad. Sci. USA 104, 36–41 (2007)
  26. Clauset, A., Newman, M. E. J. & Moore, C. Finding community structure in very large networks. Phys. Rev. E 70, 066111 (2004)
  27. Lancichinetti, A. & Fortunato, S. Community detection algorithms: a comparative analysis. Phys. Rev. E 80, 056117 (2009)
  28. The Gene Ontology Consortium. The Gene Ontology project in 2008. Nucleic Acids Res. 36, D440–D444 (2008)
  29. Evans, T. S. & Lambiotte, R. Line graphs, link partitions and overlapping communities. Phys. Rev. E 80, 016105 (2009)
  30. Evans, T. S. & Lambiotte, R. Edge partitions and overlapping communities in complex networks. Preprint at <http://arxiv.org/abs/0912.4389> (2009)


Author information

  1. These authors contributed equally to this work.

    • Yong-Yeol Ahn,
    • James P. Bagrow &
    • Sune Lehmann

Affiliations

  1. Center for Complex Network Research, Department of Physics, Northeastern University, Boston, Massachusetts 02115, USA

    • Yong-Yeol Ahn &
    • James P. Bagrow
  2. Center for Cancer Systems Biology, Dana-Farber Cancer Institute, Harvard University, Boston, Massachusetts 02215, USA

    • Yong-Yeol Ahn &
    • James P. Bagrow
  3. Institute for Quantitative Social Science, Harvard University, Cambridge, Massachusetts 02138, USA

    • Sune Lehmann
  4. College of Computer and Information Science, Northeastern University, Boston, Massachusetts 02115, USA

    • Sune Lehmann

Contributions

Y.-Y.A., J.P.B. and S.L. designed and performed the research and wrote the manuscript.

Competing financial interests

The authors declare no competing financial interests.

Supplementary information

PDF files

  1. Supplementary Information (2.5M)

    This file contains Supplementary Information (see Table of Contents), Supplementary Figures S1–S32 with legends, Supplementary Tables S1–S2 and References.

Zip files

  1. Supplementary Table 1 (25K)

    This file contains the details for PPI link communities.

  2. Supplementary Table 2 (12K)

    This file contains the details for metabolic link communities.

  3. Supplementary Table 3 (92K)

    This file contains the details for word association link communities.
