Abstract
One of the great challenges of modern science is to faithfully model, and understand, matter at a wide range of scales. Starting with atoms, the vastness of the space of possible configurations poses a formidable challenge to any simulation of complex atomic and molecular systems. We introduce a computational method to reduce the complexity of atomic configuration space by systematically recognising hierarchical levels of atomic structure, and identifying the individual components. Given a list of atomic coordinates, a network is generated based on the distances between the atoms. Using the technique of modularity optimisation, the network is decomposed into modules. This procedure can be performed at different resolution levels, leading to a decomposition of the system at different scales, from which hierarchical structure can be identified. By considering the amount of information required to represent a given modular decomposition we can furthermore find the most succinct descriptions of a given atomic ensemble. Our straightforward, automatic and general approach is applied to complex crystal structures. We show that modular decomposition of these structures considerably simplifies configuration space, which in turn can be used in the discovery of novel crystal structures, and opens up a pathway towards accelerated molecular dynamics of complex atomic ensembles. The power of this approach is demonstrated by the identification of a possible allotrope of boron containing 56 atoms in the primitive unit cell, which we uncover using an accelerated structure search, based on a modular decomposition of a known dense phase of boron, γ-B_{28}.
Introduction
Considerable growth in computational power and its ubiquity has been coupled with the development of efficient algorithms^{1, 2} and their implementation in robust^{3,4,5} and reliable^{6} computer codes. This has permitted the first-principles, quantum-mechanical (through density functional theory (DFT)^{7,8,9}) treatment of material,^{10} chemical^{11} and biological systems^{12} of increasing complexity. High-throughput computations can be performed directly on pre-existing data of more modest complexity, or some modification of it,^{13} or as a way to sample configuration space to discover previously unknown structures.^{14,15,16,17} Altogether, these approaches offer a route to computational materials discovery.^{18}
In parallel, increasing computational power and an abundance of data have given rise to another rapidly expanding field: the study of complex networks.^{19,20,21,22} A key reason for its success is the fact that related mathematical approaches can be applied to a wide range of real-world network data across many academic disciplines. The structure of networks can be studied at a variety of resolutions. Local measures of connectivity can quantify the topological properties of individual nodes or edges. Global measures, calculated across the entire network, such as the average shortest path length,^{19} can help us to compare networks as a whole. Between these two extremes, however, lies an entire field of research that searches for meaningful descriptions of intermediate structures, such as ‘cliques’,^{23} ‘communities’^{24} and ‘rich clubs’,^{25} among others. These are sets of nodes or edges which are particularly densely connected, or which share some other defining topological feature. Many definitions of such structures have been put forward.^{23,24,25,26,27,28} Here we select an approach that is particularly good at detecting hierarchical modularity^{27} and apply it to atomic networks, which until now have received scant attention from the networks research community, beyond the study of proteins.^{29,30,31} Our aim is to provide an automated coarse-graining of crystal structures. The simplification of the space of possible configurations of complex atomic systems has the potential to vastly accelerate the process of computational materials discovery, among other tasks that can benefit from automatic coarse-graining based on hierarchical decomposition. We illustrate this through the identification of a possible new allotrope of boron.
Results
Determining the modularity of atomic networks
For atomic structures of a single atomic species we can generate an unweighted network of atoms by simply imposing a threshold d^{*} on the interatomic distance and connecting atoms that are closer to each other than this threshold distance. The communities in this network can then be extracted by using the algorithm of Arenas et al.,^{27} as discussed below.
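The thresholding step described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `distance_network` and the example coordinates are hypothetical, and periodic images of the unit cell, which a crystal structure would require, are deliberately ignored here.

```python
import math
from itertools import combinations


def distance_network(coords, d_star):
    """Build an unweighted network: atoms i and j are linked if their
    separation is smaller than the threshold d_star.

    Assumption: an isolated set of atoms (no periodic boundary conditions).
    Returns a dict mapping each atom index to the set of its neighbours.
    """
    adj = {i: set() for i in range(len(coords))}
    for i, j in combinations(range(len(coords)), 2):
        if math.dist(coords[i], coords[j]) < d_star:
            adj[i].add(j)
            adj[j].add(i)
    return adj


# Four atoms on the corners of a unit square (hypothetical coordinates).
square = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
near_neighbours = distance_network(square, d_star=1.2)   # sides only
with_diagonals = distance_network(square, d_star=1.5)    # sides and diagonals
```

With a threshold of 1.2 only the sides of the square (length 1) become edges; raising it to 1.5 also links the diagonals (length √2), which shows how the choice of d^{*} changes the network that is passed on to community detection.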
The extent to which a network has well-defined community structure can be quantified with a metric known as the modularity^{24} (Fig. 1). This is defined as the fraction of edges that run between nodes of the same community, minus the expected fraction if the edges of the network were positioned randomly:
\[Q = \frac{1}{2A}\sum_{i,\,j}\left(A_{ij} - P_{ij}\right)\delta(C_{i}, C_{j})\]
Here i, j ∈ [1, n], where n is the number of nodes. δ(C_{i}, C_{j}) = 1 if nodes i and j belong to the same community, and 0 otherwise, and A_{ij} and P_{ij} are the adjacency matrices of the network and of the null model (the randomised network), respectively. \(A = \frac{1}{2}\mathop {\sum}\nolimits_{i,\, j} {A_{ij}}\) gives the total number of edges in the network.
This metric relies on the concept that a random graph is not expected to exhibit community structure. As such, the quality of the proposed community structure can be quantified by the difference between the network and the null model. The degree of a node i is defined as the number of edges of the node, \({A_i} = \mathop {\sum}\nolimits_j {A_{ij}}\;\). Choosing the null model to have the same degree distribution as the target network, \(P_{ij} = A_i A_j/2A\), gives:
\[Q = \frac{1}{2A}\sum_{i,\,j}\left(A_{ij} - \frac{A_{i}A_{j}}{2A}\right)\delta(C_{i}, C_{j})\]
This can be rewritten as a sum over the M communities of the network:
\[Q = \sum_{s=1}^{M}\left[\frac{A_{ss}}{A} - \left(\frac{A_{s}}{2A}\right)^{2}\right]\]
where, for unweighted networks, A_{ss} is the number of edges within community s, and A_{s} is the sum of the degrees of nodes within community s. (Equivalently, A_{s} is the number of edges exiting the community plus 2A_{ss}.) This metric allows community detection to be recast as an optimisation problem; maximising Q minimises the number of edges between communities. However, naive optimisation of the modularity has been shown to have a fundamental resolution limit.^{32} Communities smaller than a certain size \(A_{ss}^{min}\) will not be detected. This threshold depends on the total number of edges in the network:
This resolution limit arises due to the explicit dependence on the number of edges within the null model. This introduces a preferred size for the communities in the network.
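As an illustration of the community-sum form of the modularity, the following Python sketch evaluates Q for a given assignment of nodes to communities. The function name `modularity` and the toy graph are hypothetical; the graph is stored as a dictionary of neighbour sets, which is an assumption of this sketch rather than anything specified in the text.

```python
def modularity(adj, labels):
    """Modularity Q = sum_s [A_ss / A - (A_s / 2A)^2] of an unweighted graph.

    adj:    dict mapping each node to the set of its neighbours
    labels: dict mapping each node to its community label
    """
    two_a = sum(len(nbrs) for nbrs in adj.values())  # 2A, the sum of all degrees
    a_ss = {}  # number of edges inside each community
    a_s = {}   # sum of the degrees of the nodes in each community
    for i, nbrs in adj.items():
        s = labels[i]
        a_s[s] = a_s.get(s, 0) + len(nbrs)
        # every internal edge is seen from both of its ends, hence the factor 1/2
        a_ss[s] = a_ss.get(s, 0) + sum(labels[j] == s for j in nbrs) / 2
    a = two_a / 2
    return sum(a_ss[s] / a - (a_s[s] / two_a) ** 2 for s in a_s)


# Two triangles (0,1,2) and (3,4,5) joined by the single edge 2-3.
two_triangles = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
                 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
split_at_bridge = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_community = {i: "a" for i in two_triangles}

q_split = modularity(two_triangles, split_at_bridge)
q_single = modularity(two_triangles, one_community)
```

For this toy graph A = 7, each triangle has A_{ss} = 3 and A_{s} = 7, so splitting at the bridge gives Q = 2(3/7 − 1/4) = 5/14, whereas placing all nodes in a single community gives Q = 0.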
The algorithm of Arenas et al.^{27} utilises this bound on the size of detectable communities. By adding a self-loop of strength w to each node in the network, A → A + w I, the resolution limit inequality becomes:
where N_{s} is the number of nodes in community s, and N the total number of nodes. By varying the effective resolution limit, the scale of the communities extracted can be varied. As w is increased, modularity optimisation will result in an increasingly fragmented decomposition. By optimising modularity for a range of this parameter w, and across a range of threshold values d^{*}, one obtains a variety of hierarchical decompositions into modules (see the section ‘Relax and Shake Algorithm’ for details of the optimisation algorithm). The simplest quantity one can establish across this two-dimensional (2D) space is the number of modules. Regions across which this value is stable represent more meaningful modules that may have real physical or biological meaning as rigid clusters or units of protein architecture.
Pauling’s rule of parsimony suggests that the number of different kinds of constituents in a crystal is small.^{33} This suggests that in complex crystal structures, in which modules are likely to exhibit a degree of symmetry, it makes sense to minimise a more sophisticated quantity, namely the information content of a structure. The identification of modules corresponds to a compression if these modules contain symmetries or if the same module appears multiple times. We can calculate the amount of information I required to describe a given module structure in terms of the global degrees of freedom of that structure, and minimise this quantity over the space of d^{*} and w.
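The effect of the self-loop strength w can be illustrated with a short Python sketch that evaluates the modularity after the substitution A → A + wI. The sketch assumes that each self-loop adds w both to the strength of its node and to the internal weight of its community, so that the total weight becomes 2A + Nw; the function name and the toy graph are hypothetical.

```python
def modularity_with_self_loops(adj, labels, w=0.0):
    """Modularity of an unweighted graph after adding a self-loop of
    strength w to every node (A -> A + w I).

    Assumed convention: a self-loop contributes w to the node strength,
    so the community terms become
        (2 A_ss + N_s w) / (2A + N w) - ((A_s + N_s w) / (2A + N w))^2.
    """
    n = len(adj)
    total = sum(len(nbrs) for nbrs in adj.values()) + n * w  # 2A + N w
    inside = {}    # 2 A_ss + N_s w for each community
    strength = {}  # A_s + N_s w for each community
    for i, nbrs in adj.items():
        s = labels[i]
        strength[s] = strength.get(s, 0.0) + len(nbrs) + w
        inside[s] = inside.get(s, 0.0) + sum(labels[j] == s for j in nbrs) + w
    return sum(inside[s] / total - (strength[s] / total) ** 2 for s in strength)


# Two triangles joined by one edge: compare a coarse and a fully fragmented partition.
two_triangles = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
                 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
triangles = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
singletons = {i: i for i in two_triangles}

for w in (0.0, 1.0, 5.0):
    q_coarse = modularity_with_self_loops(two_triangles, triangles, w)
    q_fine = modularity_with_self_loops(two_triangles, singletons, w)
    print(f"w = {w}: triangles {q_coarse:.3f}, single atoms {q_fine:.3f}")
```

In this small example the two-triangle partition has the higher modularity at w = 0 and w = 1, while at w = 5 the fully fragmented partition scores higher, which is the behaviour the resolution parameter is meant to control.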
In order to calculate I, consider M modules of M′ distinct types. The number of modules with one atom only is 0 ≤ M^{*} ≤ M, and the number of modules with two atoms is 0 ≤ M^{**} ≤ M. To position and rotate the M modules relative to one another we need 6M − 6 degrees of freedom in general, with one degree less for every two-atom module, and three degrees less for every one-atom module. Now consider each of the M′ distinct modules: If we have n_{i} > 2 inequivalent atoms in module i we need 3n_{i} − 6 degrees of freedom to describe them. If we have n_{i} = 2 inequivalent atoms in module i we need 3n_{i} − 5 = 1 degree of freedom to describe them, which corresponds to the distance between the atoms. If we have n_{i} = 1 inequivalent atom in module i we need 0 degrees of freedom to describe it internally. The global number of degrees of freedom is then:
\[I = 6M - 6 - 3M^{*} - M^{**} + \sum_{i=1}^{M'} f(n_{i}), \qquad f(n_{i}) = \begin{cases} 3n_{i} - 6, & n_{i} > 2 \\ 1, & n_{i} = 2 \\ 0, & n_{i} = 1 \end{cases}\]
Note that the number of inequivalent atoms n_{i} depends on the number of atoms in module i, as well as its point group symmetry. If all N atoms are in one module, repeated once, we have M = 1, M′ = 1, M^{*} = 0, M^{**} = 0, and n_{i} = N. Hence I = 3N − 6, as required. If all N atoms are in N modules of one atom we have M = N, M′ = 1, M^{*} = N, M^{**} = 0, n_{i} = 1. Hence in this case also I = 3N − 6, as required. The information I can be normalised by the maximum possible value of 3N − 6. This normalised value is used in all the heatmap figures in this manuscript.
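The degree-of-freedom count can be written as a small Python function, which also makes the two limiting cases easy to check. This is a sketch under stated assumptions: the input format (one tuple per distinct module type) and the function names are hypothetical, and the third example uses made-up module sizes purely for illustration.

```python
def internal_dof(n_inequivalent):
    """Degrees of freedom needed to describe one distinct module internally."""
    if n_inequivalent > 2:
        return 3 * n_inequivalent - 6
    return 1 if n_inequivalent == 2 else 0


def information(module_types):
    """Global number of degrees of freedom I of a modular decomposition.

    module_types: list of (atoms_per_module, inequivalent_atoms, copies),
    one entry per distinct module type (hypothetical input format).
    """
    m_total = sum(copies for _, _, copies in module_types)             # M
    m_one = sum(copies for n, _, copies in module_types if n == 1)     # M*
    m_two = sum(copies for n, _, copies in module_types if n == 2)     # M**
    external = 6 * m_total - 6 - 3 * m_one - m_two
    internal = sum(internal_dof(k) for _, k, _ in module_types)
    return external + internal


def normalised_information(module_types):
    """I divided by its maximum value 3N - 6 (requires N > 2 atoms)."""
    n_atoms = sum(n * copies for n, _, copies in module_types)
    return information(module_types) / (3 * n_atoms - 6)


one_big_module = [(10, 10, 1)]   # all 10 atoms in a single module
all_single_atoms = [(1, 1, 10)]  # 10 one-atom modules of one type
two_symmetric = [(12, 3, 2)]     # illustrative: two copies, 3 inequivalent atoms each
```

Both limiting cases return I = 3N − 6 = 24 for N = 10, as the text requires, while the illustrative decomposition into two identical 12-atom modules with three inequivalent atoms each needs only 6 + 3 = 9 degrees of freedom out of a possible 66.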
The decomposition which has the minimum I gives us the most concise description of a structure. This minimisation of the description length is conceptually related to the idea of algorithmic information theory,^{34,35,36} as the symmetry operations and inequivalent atomic positions that form part of the compression can be thought of as algorithms which allow us to reconstruct the original atomic structure. The length of the shortest such description is a quantitative measure of the structure’s complexity. Because of the presence of crystal symmetries, we need to establish the modularity of the atomic network with high accuracy. In addition, the modularity is highly degenerate; there is a greater than exponential number of distinct possible community structures, and many will have modularity values close to that of the global maximum.^{37} Moreover, these structures may have very different topologies to that of the true partition, resulting in a large change in the compression achieved. We therefore employ an algorithm similar to the ‘relax and shake’ algorithm^{15} or zero-temperature basin hopping.^{38}
Relax and Shake Algorithm
The relax and shake (RASH) algorithm uses a repeated series of local modularity optimisations (relax) followed by the assignment of a small number of nodes into random communities (shake), in order to escape local maxima. The local optimisation follows existing work on community detection.^{39, 40} The modularity is optimised by moving each node in the network to the community of the neighbouring node that gives the largest increase in the modularity (if > 0). This is repeated until no further local moves increase the modularity. Following the local optimisation, a subset of the nodes (10%) is shaken into other communities within the network, and the local optimisation is repeated. This continues until 200 consecutive relax-and-shake iterations have failed to improve the modularity. As an additional check on the solution, the modularity change resulting from merging any two communities is calculated; if this results in a modularity increase, the merge is performed, and the relax-and-shake iteration process is begun again using the new partition. The above can be considered a single optimisation step; following this, a larger subset of the nodes (20%) is shaken into either pre-existing or new communities. The optimisation is then performed again. This shake-and-relax is performed until 200 iterations fail to improve the modularity. The whole process is repeated until three consecutive runs have failed to produce a community structure with a higher modularity. The degree of repetition is parameterisable, and allows us to have confidence in the community structure obtained (at the cost of speed).
If we partition a structure of N atoms into M multi-atom modules, so that typically M ≪ N, and assume that the modules correspond to rigid clusters, then we reduce the dimensionality of configuration space from 3N − 6 (atomic positions minus global translation and rotation) to 6M − 6 (as we have to specify a relative translation and rotation for each module) or fewer (if any of the modules have fewer than three atoms). We will show that this can be exploited in the first-principles prediction of crystal structure.
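The core relax-and-shake loop can be sketched as follows. This is a deliberately simplified illustration, not the authors' code: it keeps only the inner loop (local moves followed by a 10% shake into existing communities, stopped after a configurable number of failed iterations) and omits the community-merge check, the outer 20% shake and the three-run repetition. For clarity it recomputes the full modularity for every trial move, which is only practical for small networks. All function and parameter names are hypothetical.

```python
import random


def modularity(adj, labels):
    """Q = sum_s [A_ss / A - (A_s / 2A)^2] for an unweighted graph."""
    two_a = sum(len(nbrs) for nbrs in adj.values())
    inside, degree = {}, {}  # 2*A_ss and A_s for each community
    for i, nbrs in adj.items():
        c = labels[i]
        degree[c] = degree.get(c, 0) + len(nbrs)
        inside[c] = inside.get(c, 0) + sum(labels[j] == c for j in nbrs)
    return sum(inside[c] / two_a - (degree[c] / two_a) ** 2 for c in degree)


def relax(adj, labels):
    """Move each node to the neighbouring community giving the largest gain
    in Q, sweeping over all nodes until no move improves Q."""
    labels = dict(labels)
    q = modularity(adj, labels)
    improved = True
    while improved:
        improved = False
        for i in sorted(adj):
            home = labels[i]
            best_c, best_q = home, q
            for c in sorted({labels[j] for j in adj[i]} - {home}):
                labels[i] = c
                trial_q = modularity(adj, labels)
                if trial_q > best_q + 1e-12:
                    best_c, best_q = c, trial_q
            labels[i] = best_c
            if best_c != home:
                q, improved = best_q, True
    return labels, q


def relax_and_shake(adj, shake_fraction=0.1, patience=200, seed=0):
    """Alternate shakes and relaxations until `patience` consecutive
    iterations fail to improve the best modularity found so far."""
    rng = random.Random(seed)
    nodes = sorted(adj)
    best, best_q = relax(adj, {i: i for i in nodes})  # start from single-node communities
    n_shake = max(1, round(shake_fraction * len(nodes)))
    failures = 0
    while failures < patience:
        trial = dict(best)
        pool = sorted(set(trial.values()))
        for i in rng.sample(nodes, n_shake):  # shake: reassign a few nodes at random
            trial[i] = rng.choice(pool)
        trial, trial_q = relax(adj, trial)
        if trial_q > best_q + 1e-12:
            best, best_q, failures = trial, trial_q, 0
        else:
            failures += 1
    return best, best_q


# Two triangles joined by one edge.
two_triangles = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
                 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
partition, q_best = relax_and_shake(two_triangles, patience=50, seed=1)
```

For the toy graph the loop returns the two triangles as separate communities with Q = 5/14. A production implementation would update the modularity incrementally for each trial move instead of recomputing it from scratch.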
Application to crystal structures
Boron. Boron is known to form several allotropes,^{41} including α-B,^{42} β-B^{43, 44} and γ-B.^{45,46,47} The structure of rhombohedral α-B is widely recognised as being made up of interconnected B_{12} icosahedra (Fig. 2). However, because the bonds between different icosahedra are shorter than the bonds within, simple thresholds on bond length will not yield the underlying modular structure of this crystal phase. This observation motivated the development of our current scheme, which, based on network modularity, does yield the scientifically agreed icosahedral modules.
The structure of β-B is much more complicated, and various models have been proposed, typically with 105 or 106 atoms per unit cell. We choose the 105-atom model of Geist et al.^{44} for further investigation (Fig. 3). Our modularity detection scheme identifies four icosahedra, two larger 25-atom modules with threefold cyclic point group symmetry C_{3v}, and one seven-atom module with threefold dihedral symmetry D_{3d}. Two of the four icosahedra are slightly distorted, resulting in C_{2v} symmetry, rather than icosahedral I_{h} symmetry. Of course, the decomposition of complex boron structures into compact and symmetric subunits is not unprecedented; see for example Fig. 2 in ref. 41. We emphasise, however, that our scheme performs the decomposition automatically and is suitable for integration into complex computational workflows.
Recently the structure of a high-pressure phase of boron, γ-B_{28}, has been described in the literature^{41, 46, 47} (Fig. 4). A w_{s} vs. d^{*} heatmap of I for the unweighted network reveals a global minimum at w_{s} = 1.0 and d^{*} = 2.0 Å. This corresponds to two 14-atom modules with D_{2h} symmetry, which are icosahedra plus two atoms on either side (Fig. 4). Note that this contrasts with the conventional decomposition into two icosahedra and two dimers found in the literature,^{46} which is less favoured in our scheme as it corresponds to a higher value of I. Our approach offers a meaningful partition of this structure into modules, providing insight into the organisation and visualisation of this structure and opening the door to the systematic exploration of the structure space that neighbours this γ-B_{28} allotrope.
Phosphorus. Like boron, phosphorus exhibits rich allotropism, from the highly metastable white phosphorus, to layered black phosphorus, and extremely complex fibrous, or layered, structures.^{48} There is considerable current interest in 2D black phosphorus, or phosphorene,^{49} and other layered forms.^{50} Here we investigate the crystal structure of red phosphorus,^{51} and attempt to identify a simple decomposition into modules using our current scheme. In the 42-atom primitive unit cell (space group P\(\bar 1\)) the modularity decomposition finds two modules that each occur twice (Fig. 5). The bigger module has C_{s} symmetry and contains 13 atoms, while the smaller has C_{2v} symmetry and contains eight atoms. The relatively low degree of symmetry of red phosphorus means that the landscape of I with varying w_{s} and d^{*} is flatter, but our approach nevertheless finds a parsimonious decomposition of the crystal structure.
Metal-organic frameworks. We extend our framework to multispecies structures, which requires only a definition of the relationship between d^{*} and the interatomic distances used to determine the network connectivity. In principle, a separate d^{*} could be defined for each pair of atomic species. However, this introduces the cost of exploring a higher-dimensional space in the search for an optimal I. Instead, we define a single dimensionless parameter \(d_{{\rm{eff}}}^{\rm{*}}\) that specifies the distance threshold as the \(d_{{\rm{eff}}}^{\rm{*}}\)-fold multiple of the sum of the fixed atomic radii for a given pair of atoms. We apply this multispecies version of our approach to the metal-organic framework MOF-5, or Zn_{4}O(BDC)_{3}, where BDC^{2–} is 1,4-benzenedicarboxylate.^{52} Metal-organic frameworks exhibit a vast range of structures and are of great interest because their porosity allows them to be used for the storage of gases, such as hydrogen, or carbon dioxide.^{53} As can be seen in Fig. 6, our algorithm finds two similar decompositions with almost equally low I-values. The lowest minimum corresponds to a decomposition of the structure into six 16-atom modules with D_{2h} symmetry, and two six-atom modules with tetrahedral symmetry (T_{d}). The 16-atom modules correspond to the BDC^{2–} molecules that are sometimes referred to as the ‘struts’ of metal-organic frameworks. This suggests that the decomposition derived automatically through our procedure is chemically meaningful.
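The multispecies threshold can be sketched in Python as a pair-dependent cutoff equal to the dimensionless parameter times the sum of the two atomic radii. The text does not state which radii are used, so the values in `RADII` below are placeholder covalent radii chosen only for illustration; the function name, the toy fragment and the neglect of periodic images are likewise assumptions of this sketch.

```python
import math
from itertools import combinations

# Placeholder covalent radii in angstroms (illustrative values only;
# the radii actually used are not given in the text).
RADII = {"H": 0.31, "C": 0.76, "O": 0.66, "Zn": 1.22}


def multispecies_network(species, coords, d_eff, radii=RADII):
    """Link atoms i and j if their distance is below d_eff * (r_i + r_j).

    species: list of element symbols, one per atom
    coords:  list of (x, y, z) positions in angstroms
    d_eff:   dimensionless threshold multiplying the sum of the radii
    Assumption: an isolated set of atoms (no periodic boundary conditions).
    """
    adj = {i: set() for i in range(len(coords))}
    for i, j in combinations(range(len(coords)), 2):
        cutoff = d_eff * (radii[species[i]] + radii[species[j]])
        if math.dist(coords[i], coords[j]) < cutoff:
            adj[i].add(j)
            adj[j].add(i)
    return adj


# Hypothetical linear H-C-H fragment with C-H separations of 1.09 angstroms.
species = ["C", "H", "H"]
coords = [(0.0, 0.0, 0.0), (1.09, 0.0, 0.0), (-1.09, 0.0, 0.0)]
loose = multispecies_network(species, coords, d_eff=1.2)
tight = multispecies_network(species, coords, d_eff=1.0)
```

With these placeholder radii the C–H cutoff is 1.2 × 1.07 Å ≈ 1.28 Å for the looser setting, so both C–H pairs are linked while the H–H pair (2.18 Å apart, cutoff 0.74 Å) is not; at a multiplier of 1.0 the cutoff of 1.07 Å falls below the C–H distance and no edges remain.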
Structure prediction
Ab initio random structure searching (AIRSS)^{14, 15} is a simple, and demonstrably effective, approach to first principles structure prediction. It has been applied to a wide range of systems, from the crystal structures of dense hydrogen^{54} and hydrogen rich compounds,^{14} to matter under extreme compression,^{55} and interfaces.^{56} The approach involves selecting initially random structures from distributions defined by physically motivated constraints (for example, density, composition, atomic distances, symmetry, molecular units or fragments). These random ‘sensible’^{15} structures are fully relaxed (moved to the nearest local minimum in the energy landscape) under forces derived from DFT. Once a large number of computations have been performed the resulting structures can be ranked according to energy (free enthalpy) or any computable property of interest.
It has been a surprise to many that such a naive approach performs well, but the method’s success is linked to intrinsic features of the first-principles energy landscape, such as its smoothness (a result of the quantum mechanical interactions between the atoms and electrons) and the relatively large number of large energy basins. Because, in a smooth energy landscape, the size of the basins correlates with their depth (deep basins occupying a large volume of configuration space), random sampling followed by relaxation is naturally biased towards the stable, low-energy and relevant structures. In what follows we exploit our new approach for the automatic decomposition of known structures into minimum-I fragments to accelerate the search for complex structures by restricting the regions of configuration space that must be explored.
Application to dense boron. We generate 3303 initial random structures based on packing four 14-atom D_{2h} modules derived from γ-B_{28} into unit cells with a randomly chosen shape, and the same density as γ-B_{28}. The units are not permitted to overlap each other, or be closer than 1.63 Å (the measured intericosahedral distance in a computed α-B structure), and are related to each other by symmetry. The symmetry is chosen at random from those space groups with four symmetry operators in the primitive cell. The random initial structures are then relaxed to nearby local energy minima (see ‘Methods’ for computational details). Four of the initial structures relaxed to supercells of the Pnnm γ-B_{28} structure, and three of them relaxed to a previously unreported structure with space group Pbcn and 56 atoms in the unit cell. This structure has a density very close to that of γ-B_{28}, and is just 3 meV/atom less stable. Figure 7 shows that this near degeneracy persists over a wide range of pressures.
Our new 56-atom structure corresponds to a distorted hexagonal packing of boron icosahedra, whereas γ-B_{28} corresponds closely to cubic packing. The small energy difference indicates that γ-B may be susceptible to polytypism or stacking disorder. The 3 meV/atom energy difference between the hexagonal and cubic polytypes is small compared to the 25 meV difference between cubic and hexagonal (carbon) diamond computed at the same level. The situation is very similar to that for α-B_{12}, for which an alternatively packed 24-atom structure with space group Cmcm has been identified^{15} and further discussed.^{57, 58}
Discussion
In contrast to other applications of community-detection algorithms, in which a degree of ambiguity in the definition of communities is often tolerated, our application relies on the detection of robust and compressible modules of atoms with a maximum amount of symmetry. Small differences in the assignments of atoms to modules can have large effects on the information measure I if modular symmetries are established or broken as a result. For this reason the implementation of the modularity optimisation requires particular care in order to find maximally robust modular decompositions.
In the context of multispecies atomic structures, and particularly biological molecules, it can be valuable to consider weighted atomic networks, in which the edge weight scales with the interatomic distance. The simplest choice is a linear one:
for d_{ij} < d^{*} and w_{ij} = 0 otherwise, where d_{ij} is the distance between atoms i and j, and K is an arbitrary constant. Multiplying all edge weights by a constant factor leaves the modularity unchanged, so in practice K is chosen to ensure numerical precision. This choice of edge weighting reflects the fact that equilibrium bond lengths scale with the equilibrium bond energies.
Our approach can be applied in the context of proteins, as in previous work,^{29,30,31} where the vastness of configuration space has traditionally also been a difficult barrier to overcome. The lack of symmetry in biological molecules means that the information required to describe the structure is less useful than in crystal structures. Other measurements, such as the stability of the module number, can replace the information measurement as a criterion for assessing the quality of a modular decomposition in biomolecules. While methods for rigidity analysis in proteins already exist, such as the FIRST algorithm,^{59} the tuning parameter in our method allows for the detection of rigid clusters on a variety of length scales, making it a complementary approach. Like FIRST, our method could inform coarse-grained multiscale molecular dynamics methods such as FRODA,^{60} supplying the rigid units described as ‘ghost templates’ in FRODA, and lead to more efficient computational models of protein dynamics.
In the context of complex crystal structures this approach has numerous potential applications. It first of all suggests an automatic coarse-graining and thereby provides an intuitive simplification and visual aid. The discovery of modules also facilitates the accelerated exploration of configuration space, particularly in the context of random structure search. This has been demonstrated by the new structure of boron closely related to γ-B_{28} that we find using a module-based search. We believe that these results show the potential of atomic network analysis as a tool for materials discovery. Our algorithm is fast enough for these purposes: the run time for calculating the decomposition of the metal-organic framework MOF-5 (106 atoms per unit cell) was 2.3 s on a 3.1 GHz Intel Core i7 processor.
Methods
The density functional computations on boron were performed using CASTEP 17.2,^{3} a full-featured plane-wave pseudopotential total-energy code. The GGA-PBE^{61} density functional was used, along with a legacy Vanderbilt ultrasoft pseudopotential,^{62} a plane-wave cutoff of 240 eV, and a k-point sampling density of 0.1 × 2π Å^{−1}, for the random searches. The enthalpies were calculated using a higher-precision default on-the-fly pseudopotential, a 700 eV plane-wave cutoff and a k-point sampling density of 0.03 × 2π Å^{−1}.
Data availability
The data associated with this manuscript is made available in ref. 63.
References
Car, R. & Parrinello, M. Unified approach for molecular dynamics and density-functional theory. Phys. Rev. Lett. 55, 2471 (1985).
Payne, M. C., Teter, M. P., Allan, D. C., Arias, T. & Joannopoulos, J. Iterative minimization techniques for ab initio total-energy calculations: molecular dynamics and conjugate gradients. Rev. Mod. Phys. 64, 1045 (1992).
Clark, S. J. et al. First principles methods using CASTEP. Zeitschrift für Kristallographie - Crystalline Materials 220, 567–570 (2005).
Giannozzi, P. et al. Quantum ESPRESSO: a modular and open-source software project for quantum simulations of materials. J. Phys.: Condens. Matter 21, 395502 (2009).
Kresse, G. & Furthmüller, J. Efficient iterative schemes for ab initio total-energy calculations using a plane-wave basis set. Phys. Rev. B 54, 11169 (1996).
Lejaeghere, K. et al. Reproducibility in density functional theory calculations of solids. Science 351, aad3000 (2016).
Hohenberg, P. & Kohn, W. Inhomogeneous electron gas. Phys. Rev. 136, B864 (1964).
Kohn, W. & Sham, L. J. Self-consistent equations including exchange and correlation effects. Phys. Rev. 140, A1133 (1965).
Parr, R. G. & Yang, W. Density-functional theory of the electronic structure of molecules. Annu. Rev. Phys. Chem. 46, 701–728 (1995).
Hasnip, P. J. et al. Density functional theory in the solid state. Philos. Trans. R. Soc. A 372, 20130270 (2014).
Zhao, Y. & Truhlar, D. G. Density functionals with broad applicability in chemistry. Acc. Chem. Res. 41, 157–167 (2008).
Cole, D., Skylaris, C.-K., Rajendra, E., Venkitaraman, A. & Payne, M. Protein-protein interactions from linear-scaling first-principles quantum-mechanical calculations. Europhys. Lett. 91, 37004 (2010).
Jain, A., Shin, Y. & Persson, K. A. Computational predictions of energy materials using density functional theory. Nat. Rev. Mater. 1, 15004 (2016).
Pickard, C. J. & Needs, R. J. High-pressure phases of silane. Phys. Rev. Lett. 97, 045504 (2006).
Pickard, C. J. & Needs, R. J. Ab initio random structure searching. J. Phys.: Condens. Matter 23, 053201 (2011).
Oganov, A. R. & Glass, C. W. Crystal structure prediction using ab initio evolutionary techniques: Principles and applications. J. Chem. Phys. 124, 244704 (2006).
Wang, Y., Lv, J., Zhu, L. & Ma, Y. Calypso: a method for crystal structure prediction. Comput. Phys. Commun. 183, 2063–2070 (2012).
Needs, R. J. & Pickard, C. J. Perspective: role of structure prediction in materials discovery and design. APL Mater. 4, 053210 (2016).
Watts, D. & Strogatz, S. Collective dynamics of ‘small-world’ networks. Nature 393, 440–442 (1998).
Barabasi, A. & Albert, R. Emergence of scaling in random networks. Science 286, 509–512 (1999).
Albert, R. & Barabasi, A. Statistical mechanics of complex networks. Rev. Mod. Phys. 74, 47–97 (2002).
Newman, M. Networks: An Introduction (Oxford University Press, 2010).
Palla, G., Derényi, I., Farkas, I. & Vicsek, T. Uncovering the overlapping community structure of complex networks in nature and society. Nature. 435, 814–818 (2005).
Newman, M. & Girvan, M. Finding and evaluating community structure in networks. Phys. Rev. E 69, 026113 (2004).
Colizza, V., Flammini, A., Serrano, M. A. & Vespignani, A. Detecting rich-club ordering in complex networks. Nat. Phys. 2, 110–115 (2006).
Blondel, V. D., Guillaume, J.-L., Lambiotte, R. & Lefebvre, E. Fast unfolding of communities in large networks. J. Stat. Mech.: Theory Exp. 2008, P10008 (2008).
Arenas, A., Fernández, A. & Gómez, S. Analysis of the structure of complex networks at different resolution levels. New. J. Phys. 10, 053039 (2008).
Ahn, Y.-Y., Bagrow, J. P. & Lehmann, S. Link communities reveal multiscale complexity in networks. Nature 466, 761–U11 (2010).
Delvenne, J. C., Yaliraki, S. N., Barahona, M. & Newman, M. Stability of graph communities across time scales. Proc. Natl Acad. Sci. USA 107, 12755–12760 (2010).
Delmotte, A., Tate, E. W., Yaliraki, S. N. & Barahona, M. Protein multiscale organization through graph partitioning and robustness analysis: application to the myosin–myosin light chain interaction. Phys. Biol. 8, 055010 (2011).
Amor, B., Yaliraki, S. N., Woscholski, R. & Barahona, M. Uncovering allosteric pathways in caspase-1 using Markov transient analysis and multiscale community detection. Mol. Biosyst. 10, 2247 (2014).
Fortunato, S. & Barthelemy, M. Resolution limit in community detection. Proc. Natl Acad. Sci. USA 104, 36–41 (2007).
Pauling, L. The principles determining the structure of complex ionic crystals. J. Am. Chem. Soc. 51, 1010–1026 (1929).
Kolmogorov, A. N. Three approaches to the quantitative definition of information. Probl. Inf. Transm. 1, 1–7 (1965).
Chaitin, G. J. Algorithmic Information Theory (Wiley Online Library, 1982).
Ahnert, S., Johnston, I., Fink, T., Doye, J. & Louis, A. Self-assembly, modularity, and physical complexity. Phys. Rev. E 82, 026117 (2010).
Good, B. H., de Montjoye, Y.-A. & Clauset, A. Performance of modularity maximization in practical contexts. Phys. Rev. E 81, 046106. doi:10.1103/PhysRevE.81.046106 (2010).
Leary, R. H. Global optimization on funneling landscapes. Journal of Global Optimization 18, 367–383 (2000).
Blondel, V. D., Guillaume, J.-L., Lambiotte, R. & Lefebvre, E. Fast unfolding of communities in large networks. J. Stat. Mech.: Theory Exp. 2008, P10008. http://iopscience.iop.org/article/10.1088/17425468/2008/10/P10008 (2008).
Massen, C. & Doye, J. Identifying communities within energy landscapes. Phys. Rev. E 71, 046101. https://journals.aps.org/pre/abstract/10.1103/PhysRevE.71.046101 (2005).
Albert, B. & Hillebrecht, H. Boron: elementary challenge for experimenters and theoreticians. Angewandte Chemie International Edition 48, 8640–8668 (2009).
Decker, B. & Kasper, J. The crystal structure of a simple rhombohedral form of boron. Acta. Crystallogr. 12, 503–506 (1959).
Talley, C. P., La Placa, S. & Post, B. A new polymorph of boron. Acta. Crystallogr. 13, 271–272 (1960).
Geist, D., Kloss, R. & Follner, H. Verfeinerung des β-rhomboedrischen Bors. Acta. Crystallogr. B. 26, 1800–1802 (1970).
Wentorf, R. H. Boron: Another form. Science 147, 49–50 (1965).
Oganov, A. R. et al. Ionic high-pressure form of elemental boron. Nature 457, 863–867 (2009).
Zarechnaya, E. Y. et al. Superhard semiconducting optically transparent high pressure phase of boron. Phys. Rev. Lett. 102, 185501 (2009).
Bachhuber, F. et al. The extended stability range of phosphorus allotropes. Angewandte Chemie International Edition 53, 11629–11633 (2014).
Liu, H. et al. Phosphorene: An unexplored 2d semiconductor with a high hole mobility. ACS nano 8, 4033–4041 (2014).
Schusteritsch, G., Uhrin, M. & Pickard, C. J. Single-layered Hittorf’s phosphorus: a wide-bandgap high mobility 2D material. Nano. Lett. 16, 2975–2980 (2016).
Ruck, M. et al. Faserförmiger roter phosphor. Angew. Chem. 117, 7788–7792 (2005).
Li, H., Eddaoudi, M., O’Keeffe, M. & Yaghi, O. M. Design and synthesis of an exceptionally stable and highly porous metal-organic framework. Nature 402, 276–279 (1999).
Kitagawa, S. et al. Metal–organic frameworks (mofs). Chem. Soc. Rev. 43, 5415–5418 (2014).
Pickard, C. J. & Needs, R. J. Structure of phase iii of solid hydrogen. Nat. Phys. 3, 473–476 (2007).
Pickard, C. J. & Needs, R. Aluminium at terapascal pressures. Nat. Mater. 9, 624–627 (2010).
Schusteritsch, G. & Pickard, C. J. Predicting interface structures: from SrTiO_{3} to graphene. Phys. Rev. B 90, 035424 (2014).
He, C. & Zhong, J. Structures, stability, mechanical and electronic properties of α-boron and α*-boron. AIP Adv. 3, 042138 (2013).
Zhu, Q., Oganov, A. R., Lyakhov, A. O. & Yu, X. Generalized evolutionary metadynamics for sampling the energy landscapes and its applications. Phys. Rev. B 92, 024106 (2015).
Jacobs, D. J., Rader, A. J., Kuhn, L. A. & Thorpe, M. F. Protein flexibility predictions using graph theory. Proteins 44, 150–165 (2001).
Wells, S., Menor, S., Hespenheide, B. & Thorpe, M. F. Constrained geometric simulation of diffusive motion in proteins. Phys. Biol. 2, S127–S136 (2005).
Perdew, J. P., Burke, K. & Ernzerhof, M. Generalized gradient approximation made simple. Phys. Rev. Lett. 77, 3865 (1996).
Vanderbilt, D. Soft self-consistent pseudopotentials in a generalized eigenvalue formalism. Phys. Rev. B 41, 7892 (1990).
Pickard, C. Complex atomic networks, Ahnert, Grant, and Pickard, 2017. https://figshare.com/articles/Complex_atomic_networks_Ahnert_Grant_and_Pickard_2017/4780456 (2017).
Acknowledgements
S.E.A. is supported by a Royal Society University Research Fellowship and a Gatsby Career Development Fellowship. C.J.P. acknowledges financial support from the Engineering and Physical Sciences Research Council (EPSRC) of the United Kingdom (Grant Nos. EP/G007489/2 and EP/K013688/1). C.J.P. is also supported by the Royal Society through a Royal Society Wolfson Research Merit Award. W.P.G. acknowledges financial support from the EPSRC Centre for Doctoral Training in Computational Methods for Materials Science under grant EP/L015552/1.
Author information
Contributions
S.E.A. and C.J.P. designed the study and methodology. S.E.A., C.J.P. and W.P.G. performed computations, generated data and wrote the manuscript.
Ethics declarations
Competing Interests
The authors declare that they have no competing financial interests.
Additional information
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Ahnert, S.E., Grant, W.P. & Pickard, C.J. Revealing and exploiting hierarchical material structure through complex atomic networks. npj Comput Mater 3, 35 (2017). https://doi.org/10.1038/s41524-017-0035-x
DOI: https://doi.org/10.1038/s41524-017-0035-x