Introduction

Within recent years, network theory became an indispensable tool in a broad range of applied sciences ranging from social psychology and epidemiology to transport engineering, biology and physics1,2. It allows us to study the spreading of rumours and ideas3,4, but also the spreading of diseases within populations5 and cascades of failures in the electricity grid6. These and alike works established the universal language of network science that is valid across disciplines: it is conceivable that a study on the neural network of the human brain may teach us to better optimise transportation networks in growing cities7. The beginnings of network science, by which we refer to a combination of graph and probability theories, are commonly linked with works of S. Milgram and P. Erdős, see for example ref.8. Nonetheless, not of least importance for the foundation of the field played works of Flory who proposed to use random graph-like structures to study hyperbranched and cross-linked polymers9. In fact, Flory might have been the first to show interest in the percolation phenomenon even before Erdős introduced his famous random graph concept10. Furthermore, the same scientists who applied percolation to polymerisation developed the percolation theory of networks by working extensively on the network and polymer sides of the problem11,12,13,14,15,16,17,18,19,20. Although network science developed as a separate vibrant discipline of its own ever since, current theories of hyperbranched polymers are largely based on the early developments of network science, and the modern viewpoint of complex networks is only starting to diffuse back into polymer chemistry where this theory has arguably originated21,22,23,24,25,26,27.

Conventional polymer networks are formed by a process called polymerization, during which small molecules bind together by means of covalent bonds and form large molecular structures. The functionality of these molecules is typically limited by the underlying chemistry and the bonds appear symmetric or asymmetric depending on the nature of functional groups that the reactants of the binding reaction bear28. It is mainly due to the variations of their topologies that polymeric materials feature such a broad range of physical properties. One of the most common polymerisation processes is the step-growth polymerisation of multifunctional monomers. This process leads to hyperbranched polymers of disperse sizes and irregular topologies that undergo a phase transition in their connectivity structure during the course of the polymerisation28. This transition is closely related to the percolation on networks3,4,5,6. The phase transition is marked by emergence of the gel, that is the giant molecule that spans the whole volume1,29. Flory provided simple analytical expressions for the average molecular size in these systems and was first to explain the onset of gelation, but limited himself to monomers of prescribed type An or An + B29. Later, Stockmayer presented a formal expression for the whole distribution of molecular sizes30, however the practical use of this expression is limited due to combinatorial complexity of computations. Durand and Claude derived a more general analytical expression for averages of the molecular size distribution31. Considerable progress has been made for the case of multifunctional monomers of type An, which feature symmetric bonds22,32,33, whereas among asymmetric multifunctional monomers only monomers of the type AB2 have a known analytical expression for the molecular size distribution as was demonstrated by Ziff11. For these reasons, the search continued resulting in a wave of fast approximate methods: as in works of Kryven et al.34,35, Wulkow et al.36, Tobita37, Hillegers and Slot38. Although these methods are computationally fast, the approximate methods are hard to adapt to new polymerisation schemes, and especially the schemes requiring description with multidimensional distributions. In the same time, multidimensional distributions naturally arise in master equations describing monomers with multiple functional groups.

Yet another approach that has been applied to polymer networks only recently, the Molecular Dynamics (MD) simulation, is especially attractive as it produces very detailed information on the structure of polymer networks39,40. Molecular dynamics simulations are notorious for being computationally expensive, and therefore, limited to small samples and short time scales. In our previous work39 we have demonstrated on the case of an acrylate polymer featuring predominantly symmetric covalent bonds, that many of the MD-generated network properties can be also reproduced by the configuration model for undirected random networks41,42. Furthermore, the recent developments in directed configuration models29,43 present an opportunity to develop a generic polymerisation framework that will cover asymmetrical bonds as well. The latter, despite posing a more complex mathematical problem, are also more ubiquitous in polymerisation chemistry, and especially in that of hyperbranched and super-molecular polymers44,45,46.

The current paper presents a new look on exactly solvable expressions for hyperbranched polymers by utilising latest developments of random graph models1,41,47. Being inspired by the kinetic theory of Krapivsky, Redner, and Ben-Naim17,48,49, we employ a two-stage approach: we first devise a kinetic model for the transformations the monomer units undergo in time, and then we construct a configuration random graph, which deduces the global properties of the network from the two-variate degree distribution that is obtained on the first stage. Analytical expressions are obtained for various distributional properties of the polymer resulting from step-growth polymerisation of random combination of arbitrary functional monomers. The advantages of the proposed random graph model are grounded in the generic applicability and analytical expressions that are also fast to compute.

Results

A polymer is a large molecule that consists of many repeat units, the monomers, and is formed as a result of chemical reactions that lead to covalent bonding between the monomers. Step-growth polymerization does not require an initiator and occurs between monomers that carry reactive functional groups. Many polymers with real world applications are formed as a result of step-growth polymerisation. Figure 1 features a few important examples related to polyesters, polyamides, and polyurethanes. The maximum number of chemical bonds that a single monomer bears is limited by the number and type of functional groups that are present in this monomer. If a system consists of solely two-functional monomers, only linear polymers are formed in the course of the step-growth polymerisation. However, if some (or all) monomers have more than two functional groups, it is possible to form hyperbranched polymers and networks.

Figure 1
figure 1

Examples of linear and branched polymers. The structural formulas of the monomers together with their AB representations that define the underlaying network topology are displayed. The list of polymers include: polyester59, branched polyurethane60, polyurea61 and polyamide62.

In many chemical systems, two monomers bind through an asymmetric reaction that occurs between functional groups of different kinds. When two functional groups of different kinds are reacting, for example, as in the reaction between an acid and an alcohol leading to an ester, we refer to one group as the A-group, and the other – as the B-group. This asymmetric reaction is at the main focus of the current paper. Symmetric reactions occurring between two groups of the same kind, e.g. two alcohols reacting to form an ether, have been covered elsewhere22. Figure 1 exemplifies this notation on a few cases of polymers that feature linear and branched topologies. For each case, we indicate the structural formulas of the relevant monomers, highlight the functional groups, and give the corresponding AB notation. In Figure 2 the asymmetric reaction between A and B groups is illustrated on the example of an A2 monomer (a monomer with two A-groups) reacting with an AB2 monomer (a monomer with one A-group and two B-groups).

Figure 2
figure 2

Illustration of an AB-reaction binding one A-group of an A2 and one B-group of an AB2 monomer. In (a) the colour of the bond indicates which monomer provided the A- and the B-group. In (b), the graph representation, the type of the functional group is stored in the directionality of the edge. An in-edge corresponds to a reacted A-group, an out-edge to a reacted B-group.

Before introducing the random graph model, we briefly summarise the terminology commonly used in graph theory. A directed graph consists of nodes and directed edges connecting them. A subgraph of a graph, in which any two nodes are connected by an undirected path is called a weakly connected component. In this work, we drop the prefix weakly, and refer to these components as connected components. When representing a polymer system as a graph, the monomers are identified with nodes, the chemical bonds with edges and thus a polymer molecule with a whole connected component. As the two sides of a chemical bond in the AB-reaction are not identical, we represent this asymmetry with directed edges. Without loss of generality, the directionality is defined as pointing from the B-group towards the A-group. The graph representation and the mapping of a reacted A-/B-group to an in-/out-edge is depicted in Figure 2.

The number of bonds that a monomer is bearing equals to the degree of the corresponding node. Generally speaking, monomers with distinct numbers of bonds may have different concentrations. To capture these differences, we refer to the node degree distribution, u(i, j), which defines the probability of a randomly chosen node to have i adjacent in-edges, and j adjacent out-edges, and therefore, u(i, j) is proportional to the concentration monomers with i and j reacted A- and B-groups. Figure 3b, demonstrates that the directed degree distribution u(i, j) and the projected undirected degree distribution \(u(k)=\sum _{i+j=k}u(i,j)\) that ignores the direction of the edges may lead to different sizes for connected components. In this extreme example, the directed degree distribution only allows connected components of size s = 3, whereas the undirected degree distribution does not limit these sizes at all. It turns out that much of the global information about the polymer system can be deduced from the degree distribution.

Figure 3
figure 3

The concept of the directed configuration model. (a) (left:) Nodes are signed “half-edges” according to the bivariate degree distribution that provides an input for the model; (right:) these nodes are connected randomly so that the degree distribution is strictly satisfied. (b) An example of a simple degree distribution and the corresponding ensemble of connected components for the case of undirected and directed graphs. Note that in this example there is a qualitative difference between the structure of undirected and directed chains.

Hyperbranched molecules appear in a broad range of topologies that result from various chains of reactions, which occur between selected functional groups. One important feature here is the asymmetry of such reactions: they often occur between functional groups of complementary types (as opposed to symmetric reactions occurring between groups of identical types). To capture this variability we employ the configuration model that is defined by a directed degree distribution. The degree distribution reflects the state of the polymer system as driven by the chemical kinetics, and therefore, this distribution is time-dependent. The configuration model maximises entropy of all possible configurations that satisfy a given degree distribution at a time point of interest. This means that the model is egalitarian with respect to functional groups: every pair of complementary functional groups has equal probability to establish a bond. This principle is illustrated in Figure 3a.

The output of the configuration model is then processed to obtain various global properties of the polymer network, as for instance, the molecular weight distribution (probability that a randomly chosen node belongs to a weakly connected component of size s), the gel fraction (probability that a randomly chosen node belongs to the giant component), the mean-square gyration radii (related to the Wiener index of connected components), and the length of the average shortest path. The mathematical derivations of these results are presented in the Methods.

Reaction kinetics and degree distribution

The evolution of the degree distribution is governed by the reaction kinetics of the step-growth polymerization. This process converts an arbitrary pair of A- and B-groups into a bond, which is represented as an edge between nodes in the model. Since a monomer may carry multiple A- and B-groups, probability that a pair of monomers react is dependent on the number of unreacted functional groups they carry. The formal mechanism for the reaction between two monomers is given by:

$$(i,j,I,J)+(i^{\prime} ,j^{\prime} ,I^{\prime} ,J^{\prime} )\,\mathop{\longrightarrow }\limits^{{k}_{AB}(I-i)(J^{\prime} -j^{\prime} )}\,(i+1,j,I,J)+(i^{\prime} ,j^{\prime} +1,I^{\prime} ,J^{\prime} ),$$
(1)

where vector (i, j, I, J) denotes the state of a monomer: I, J ≥ 0 are the numbers of respectively A- and B-groups on this monomer; i = 0, 1, …, I and j = 0, 1, … J denote the number of groups of respectively type A or B that have already been converted into bonds by the reaction. For each monomer, the indices (I, J) are defined a priori, whereas (i, j) change over time. The reaction rate is given by the product of the rate constant kAB, the number of unreacted A-groups on the first monomer (I − i), and the number of unreacted B-groups on the second monomer (J′ − j′). The following assumptions are made: (1) the reactivity for any pair of A- and B-groups is equal; (2) the reactivity does not change throughout the process. Let Mi,j,I,J(t) be the probability that a randomly chosen monomer has configuration (i, j, I, J) at time t. This probability is proportional to the concentration of the monomers. The time variation \(\frac{\partial {M}_{i,j,I,J}(t)}{\partial t}\) as governed by the process eq. (1) is described by the corresponding master equation, see Methods. In the general case of more than two distinct functional groups (e.g. A-, B-, C-, D-groups) and several reaction mechanisms (e.g. AB-, CD- reaction), the master equation becomes more complex due to a combinatorial number of monomer species of defined type and state. In that case, automated reaction networks can be applied to algorithmically construct the corresponding master equation23.

The temporal degree distribution u(i, j, t) is directly deduced from Mi,j,I,J(t) by summating over all monomer types I, J. As discussed in Methods, the step-growth process eq. (1) leads to the following degree distribution u(i, j, t):

$$u(i,j,t)=\sum _{I,J\ge 0}\,(\begin{array}{c}I\\ i\end{array}){p}_{{\rm{A}}}{(t)}^{i}{(1-{p}_{{\rm{A}}}(t))}^{I-i}(\begin{array}{c}J\\ j\end{array}){p}_{{\rm{B}}}{(t)}^{j}{(1-{p}_{{\rm{B}}}(t))}^{J-j}P(I,J).$$
(2)

This expression is given by the product of binomial distributions for the in- and out-edges, with pA(t) and pB(t) being the probability that a random A-/B-group is reacted (also referred to as A-/B-conversion), see eqs (12) and (13). The probability distribution P(I, J) defines the initial concentration of monomer types and is referred to as the monomer functionality distribution. Probabilities P(I, J) provide the sole input to the model.

In some cases, the degree distribution u(i, j, t) can be measured directly by nuclear magnetic resonance (NMR). Due to the chemical shift in the NMR spectrum, every monomer state has its own distinct frequency. However, the bigger the monomer is and the more functional groups it has, the harder it is to identify the states. In Figure 4 we compare the degree distributions predicted by eq. (2) with the experimental NMR data from ref.50, and also, for a different polymerisation system, against MD simulation data from ref.40.

Figure 4
figure 4

Comparison of the theory to data from experiments and computer simulations. (a) Experimental degree distribution as extracted form the 13C NMR spectrum of hyperbranched polyester (HPE) in dimethylformamide (DMF)50 compared to theoretical degree distribution in the A2:B3 = 1:1 system at pA = 0.93. (b) Phenolic unit degree distribution (only out-edges) in a phenol-methylene system predicted by MD simulation (doted lines)40 compared to the theory (solid lines). The gel point is predicted at pA = 0.58 by the MD simulation, the theory predicts it at pA = 0.5.

Gelation point and gel fraction

The gelation point marks the transition of the system from a liquid-like to a solid-like state during the polymerization process. After this transition, an increasing positive fraction of monomers becomes part of the single gel molecule. The transition is typically observed by measuring the fraction of the insoluble part of the polymer, or performing rheological experiments. From the network theory perspective, this is a well-known phenomenon as the gel transition point corresponds to the emergence of the giant component in the network topology. The size of the giant component is of the order of the whole system size, and we therefore quantify the gel size gf as the probability that a randomly chosen monomer belongs to the giant component. The Methods section explains how gf can be calculated if the degree distribution is known.

The above described process of polymerisation is closely related to percolation on networks. The phenomenon of percolation is always studied on a specific networks, e.g square lattice, Bethe lattice, or any arbitrary network. In polymer chemistry, it is mostly introduced as a process of randomly adding bonds to an empty template of the network51. This process is precisely reverse to the process of removing bonds from the same full network. Let p be the probability that a randomly selected edge is not removed.

Under this notation, percolation can be thought of as a temporal process that starts at p = 0 and ends at p = 1. Moreover, this process is known to feature a phase transition at critical probability pcritical, the point at which the giant component appears. When there are equal amounts of A and B functional groups, this process is precisely reverse to the step-growth polymerisation where the edges are being added to a network randomly, and pcritical coincides with the gel-point conversion of functional groups. If there are more B-groups than A-groups present in the system initially, the A-groups are limiting the reaction and we conventionally refer to the conversion of A groups as the conversion: p = pA. Without loss of generality it may be assumed that the number of A-groups is less or equal to the number of B-groups. The moment in time when gelation occurs is completely defined by the proportion of monomers of different functionalities that are present initially in the system. Generally speaking, the higher the functionality, the earlier gelation occurs in time and/or conversion, and a precise quantitative estimate of the gelation conversion is discussed in Methods. It turns out that one can connect the critical conversion directly to the functionality distribution P(I, J):

$${p}_{{\rm{critical}}}=\frac{{{\rm{\nu }}}_{01}}{{{\rm{\nu }}}_{11}+\sqrt{({{\rm{\nu }}}_{02}-{{\rm{\nu }}}_{01})({{\rm{\nu }}}_{20}-{{\rm{\nu }}}_{10})}}.$$
(3)

where \({{\rm{\nu }}}_{mn}=\sum _{I,J\ge 0}{I}^{m}{J}^{n}P(I,J)\). It is important to note that in the case of a symmetric functionality distribution P(I, J) = P(J, I) (see also Methods), eq. (3) gives the same result as its counterpart for an AA-system (undirected system22) with \(P(K)={\sum }_{K=I+J}P(I,J)\). In the case of P(K) being distinct from zero for only one K-functional species, eq. (3) simplifies to the Flory-Stockmayer equation for gelation \({p}_{{\rm{critical}}}=\frac{1}{z-1}\), with z = I + J the number of first neighbours. See Table 1 for illustration.

Table 1 Examples of mappings between symmetric AB-systems and AA-systems and the corresponding critical conversion pcritical. The critical conversions in the first column are calculated by eq. (3), of the second by eq. (43)22 and the third column by the Flory-Stockmayer criterium \(\frac{1}{z-1}\).

Equation (3) allows one to screen a vast number of systems and determine their gel-point conversions if such occur. We also can deduce from the theory discussed in Methods that some systems will never form gel. Here again, one can identify whether a system of given monomer functionalities gelates by studying P(I, J). Namely, a monomer system forms gel if the following inequality is satisfied:

$$({{\rm{\nu }}}_{02}-{{\rm{\nu }}}_{01})({{\rm{\nu }}}_{20}-{{\rm{\nu }}}_{10})-{({{\rm{\nu }}}_{11}-{{\rm{\nu }}}_{01})}^{2} > 0.$$
(4)

For the physical properties of the final material, both factors play a definitive role: when does gelation start and how does the gel fraction evolve in the course of the polymerisation? The growth rate of the gel fraction, gf, is determined by the functionalities of the initial monomers and their concentration distribution. In Figure 5, a few examples illustrate different types of behaviour of the gel buildup. In these examples, we optimise the initial functionalities and concentrations of monomers to reach two final target properties: (1) a fixed gel point conversion of either pA,critical = 0.5, as depicted by the solid lines, or pA,critical = 0.33, as indicated by the dashed lines; (2) we distinguish three different types of growth behaviour: (a) a steep growth with most monomers being incorporated into the gel rapidly after gel point, (b) a slow growth with the gel reaching full size only at full conversion, (c) a slow growth with the gel never reaching the system size. Behaviour (a) is observed for systems with purely high-functional monomers, (b) for systems with few high-functional monomers and many 2-functional monomers, and (c) for few high-functional monomers and many 1-functional monomers that act as terminal units. The reason for the gel in (c) never reaching the full system size is the formation of small connected components that stop growing because of having all functional groups being capped with one-functional terminal units. For example, when a component is composed of one 6-functional monomer connected to six 1-functional monomers. The Methods section gives the general equations for the gel-point conversion and gel fractions.

Figure 5
figure 5

Illustration of three types of gel growth behaviour: (a) steep growth, (b) slow growth, (c) gel does not reach full system size. The different types of behaviour are cased by the composition of monomers. Two groups are depicted: (1) solid lines, gel point at pA,critical = 0.5, (a) A3:B3 = 1:1, (b) A6:B6:A2:B2 = 1:1:9:9, (c) A6:B6:A1:B1 = 1:1:9:9; (2) dashed lines, gel point at pA,critical = 0.33, (a) A4:B4 = 1:1, (b) A8:B8:A2:B2 = 1:1:8:8, (c) A8:B8:A1:B1 = 1:1:11:11.

These results are also interesting when studying polymer ageing and degradation. Consider a degradation process under which every chemical bond dissociates independently with equal probability. This process is reverse to the introduced polymerization process, and the gel fraction is a measure of how strongly the system is interconnected. Clearly, the systems of type (a) will show a different behaviour during degradation than systems of type (b). For (a), the system will stay connected for a long time, but will eventually collapse into many small pieces quite abruptly. Type (b) systems will show a more continuous, and therefore more predictable, degradation behaviour, which might be more desirable from application point of view.

Molecular size distribution, averages and asymptotes

From a network theory perspective, a separate polymer molecule is a connected component. The sizes of the latter are typically characterised by a size distribution. There are two common ways that such distributions can be defined. The molecular weight distribution w(s) corresponds to the probability that a randomly chosen monomer belongs to a connected component of size s, whereas the molecular size distribution n(s) is the probability that a randomly chosen component has size s. One can be converted into the other by an appropriate weighting and normalisation: n(s) = Cs−1w(s). In Methods we present an exact equation that connects the degree distribution with the molecular weight distribution w(s). As a general rule, the exact values of w(s) can be computed spending \({\mathscr{O}}({s}^{2}\,\mathrm{log}\,s)\) multiplicative operations, and in a special case of only one initial monomer type, the analytic expression for the molecular weight distribution is given in Methods.

The global behaviour that is observed in all polymerising systems can be summarised as follows: Initially, all monomers are unconnected, thus only molecules of size s = 1 are present. With increasing conversion pA, larger molecules emerge. The size distribution features the exponential decrease at the tail, and becomes broader with progressing conversion until the gel point pA,critical is reached. Only at this single point the size distribution becomes scale-free. Figure 6a demonstrates this behaviour on an example. After the gel transition point, the size distribution describes only the soluble part of the system, so that the size of the gel is given by the gel fraction \({g}_{f}=1-\sum _{s=1}^{\infty }\,w(s)\). Furthermore, the size distribution returns to its exponential behaviour and becomes narrower with increasing conversion.

Figure 6
figure 6

(a) Molecular size distributions of the system A2:B3 = 3:2 at different conversions as indicated by the colour scheme. At pA,critical = 0.707 the distribution is scale-free, \(n(s)\propto {s}^{-5/2}\). (b) Molecular size distributions n(s) and their critical asymptotes as obtained for two systems: (1) A2:B3 = 3:2 and (2) AB2. (c) Weight-average molecular weight for the system A2:B3 = 3:2 features singularity at pcritical = 0.7. (d) Oscillating size distributions of the system AB2:A1:B1 = 1:2:1 at different conversions as indicated by the colour scheme. Examples of terminated (black colour) and living (grey colour) polymer topologies are indicated for reference.

Surprisingly, there are two distinct types of polymerisation systems featuring different types of asymptotic behaviour. Most of polymer systems feature a size distribution with asymptote

$$n(s)\propto {e}^{-{C}_{1}s}{s}^{-5/2}.$$
(5)

This, for instance, includes the A2:B3 = 3:2 system as illustrated in Figure 6a. However, some polymers may also feature a different asymptotic mode, namely

$$n(s)\propto {e}^{-{C^{\prime} }_{1}s}{s}^{-3/2}.$$
(6)

This asymptote arises in all systems of type ABn, for n > 1. These two distinct types of asymptotic behaviours can also be attributed to the different types of kernel functions (additive versus product kernel) in the underlying aggregation equation11,12. In Figure 6b, this peculiar case of asymptotic behaviour is illustrated by comparing two very similar polymer systems AB2 and A2 + B3 that yet feature different asymptotic modes.

The gel transition is also noticeable in the evolution of weight average molecular weight Mw, which features a singularity at the critical point. The evolution of Mw as predicted by the theory is compared against stochastic simulations in Figure 6c. The figure shows good agreement between the theory and the numerical simulation except for critical conversion. At this point, the stochastic simulations (scatter plot) suffer form the small-system-size effect. The Methods section gives analytical equations for the weight average molecular weight in the pre-gel and gel regimes.

Interestingly, in some cases molecular size distributions feature oscillations. One such example is given in Figure 6d, depicting the system AB2:A1:B1 = 1:2:1. At full conversion pA = 1 (dark red line) only molecules of specific favoured sizes are present. At lower conversions, also other sizes occur that exhibit strongly reduced probability as compared to the favoured sizes. In this example the concentration of odd-size component is favoured, however, depending on the distribution of functionalities such oscillations may occur with arbitrary large “periods”. It turns out, that the monomers of functionality one play an important role in these oscillations, as they terminate the growth of polymer molecules and thus fix sizes of these molecules at a constant value.

Gyration radius

Consider a branched polymer molecule that is composed of s monomers. The actual volume this molecule spans is related to how’branched’ this molecule is. In systems that contain no gel, or are below the gel transition, it is conventional to characterise this volume by the a quantity called gyration radius Rg(s), which can also be estimated by light scattering experiments in a polymer solution52.

Linear chains feature \({R}_{{\rm{g}},{\rm{lin}}}(s)=b\sqrt{\frac{s}{6}}\), where b is the Kuhn length. In Methods, we derive the analytical equation that links the degree distribution and the mean square gyration radius for \(s\gg 1\). Figure 7a shows how one can influence Rg(s) by tuning the set of initial monomers. This figure also compares the theoretical gyration radii against gyration radii obtained form stochastically generated networks. An alternative way of looking at the gyration radius is the contraction factor53, which is defined by \(g(s)={R}_{{\rm{g}}}^{2}(s)/{R}_{{\rm{g}},{\rm{lin}}}^{2}(s)\). The contraction factor tells us how much more compact the actual molecule is in comparison to the linear one having the same number of monomers. Figure 7b compares the theoretical predictions of this quantity versus simulations.

Figure 7
figure 7

Comparison of the theory (solid lines) to simulation data (scattered data). The simulation data is the average over 100 generated networks that consist of N = 10000 nodes. (a) Gyration radius, three different systems are investigated: (1) linear AB, (2) sparsely branched A2B2:AB = 1:49, (3) hyperbranched A2B2. (b) Contraction factor of branched components, two systems are considered: (1) sparsely branched A2B2:AB = 1:49, (2) hyperbranched A2B2.

Another unexpected result that is revealed by directed random graph theory, is that during the progress of step-growth polymerisation, for all molecules of size s the mean gyration radius is constant in time. Since the size distribution does change in time, this time-independency is lost if one calculates an average of this quantity over different molecular sizes. It is likely that other than step-growth polymerisation processes do not feature time-invariant gyration radii.

The gyration radius is also directly proportional to the Wiener index54, which is another topological index to characterise branched molecules. The Wiener index \({\mathscr{W}}(s)\) is defined as the sum of the lengths of the shortest paths between all pairs of the monomer units in the molecule. The relation between the mean square gyration radius and the Wiener index is given by \({\mathscr{W}}(s)=\frac{{R}_{{\rm{g}}}^{2}(s)}{{s}^{2}}\).

Scaling of the node neighbourhood size and criticism of well-mixing assumption

Since the radius of gyration characterises only finite-sized molecules rather than the gel, which is virtually infinite in size, we are in need of devising an additional measure for the degree of connectedness of the gel. With this aim, we investigate the number of nodes at distance l from a randomly selected one, which we shortly refer to as \({ {\mathcal B} }_{l}\). Here, we utilise the topological notion of distance: nodes i and j are at distance l if the shortest undirected path connecting these nodes has length l. Since the giant component is infinite in size, the larger the distance l the more nodes are incorporated in the volume of a sphere. In fact, we are mainly interested in the way this quantity asymptotically depends on \(l\gg 1\). Newman derived an expression for a similar quantity for directed paths41, whereas in the polymer context we are interested in the weak sense of connectivity. This means that we consider the shortest paths that does not respect the directionality of the bonds.

An analytical expression for this behaviour is derived in Methods. We observe that the average path length in the gel exponentially depends on l:

$${ {\mathcal B} }_{l}=C{e}^{l/{l}_{0}},\,l\gg 1,$$
(7)

where C and l0 are constants. Note that structures that reside in three-dimensional Euclidean space must feature a different scaling law:

$${ {\mathcal B} }_{l}=C{N}^{{d}_{f}},$$
(8)

where 1 ≤ df ≤ 3 is the cluster growth dimension. The estimate given in eq. (7) is quite a discouraging result, as a network that features an exponential scaling cannot be physically embedded in the Euclidean space under the condition that the nodes are uniformly distributed with constant density. This unphysical exponential growth of a node neighbourhood is caused by one of the fundamental assumptions made in the random graph model, and many other popular models that do not explicitly track the spatial configuration of the network also suffer from the same criticism. In fact, we refer here to one of the most commonly used assumptions of a well-mixed system: any two functional groups react with an equal probability irrespectively of their location in the topology. In real systems, however, monomers interact with the rest of the network that may locally hinder them to react. That said, it is important to note, that the scaling given by eq. (7) does not hold before, and precisely at, the gel transition, and the well-mixed system assumption may remain a good approximation in these regimes.

Discussion

This paper employs recent developments in network science to formalise the step-growth polymerisation process as a problem fully described by network generation. In order to do so, we proposed to view the polymer architecture as a directed network that is described by a dynamic degree distribution. Although the physical connection between monomers by means of covalent bonding is completely symmetrical, the directionality of the edges keeps record of the asymmetry of the chemical reaction that created the corresponding covalent bond. This approach allows a classification and a general treatment of a vast range of real-world polymerisation problems and can be used for optimisation and design of new materials. As a general rule, the parameters of step-growth polymerising systems comprise a high-dimensional parameter space that dictates the reaction kinetics, network structures, and physical properties of the final material. Therefore, it is important to have a fast way to map the polymerisation parameters to the final topological properties of the polymer network.

We have matched various idioms present in polymer chemistry to corresponding graph-theoretical analogues and indicated how these can be predicted knowing the input parameters of the system by means of analytical expressions. For instance, we gave analytical quantifications of the gelation time, the topological phase transition and the associated to it molecular weight singularity, the molecular size distribution and its asymptotes, the gyration radii of polymers, and the scaling of the monomer neighbourhood size. Some of these findings also provide an unexpected qualitative insight on chemistry of polymerisation. For instance, we have revealed the existence of two asymptotical modes that appear in the molecular size distribution, the fact that these size distributions might feature a peculiar oscillating behaviour and that the gyration radii in step-growth polymerised molecules are not dependent on time but only on the sizes of these molecules.

On a broader scope, this work paves the way to viewing polymer networks, as well as other types of network formations in condensed matter, as being complex networks in which the topology and additional layers of information (multiplexity) all contribute to the formation and function of the macroscopic material. The theoretical framework that addresses multiplexity is a topic of current network science, and one of the goals of the authors is to advance the understanding of multiplex networks in condensed matter in their future works.

Although they were produced to aid polymer chemistry in the first place, these findings are also relevant to a broader network science community as one can view polymerisation as a process that is related to percolation. In this way, the paper amounts to understanding percolation in directed networks, which turn out to feature a richer behaviour then in undirected networks. Moreover, maintaining the link to polymers and polymerisation allows one to validate complex network theories with physics. Newly developed theories of such kind have been predominantly compared to experimental data derived from single point observations, which are hard to reproduce. For example, we cannot grow the Internet from scratch again. Yet, the degree distribution may be measured by NMR and more experimental techniques that will extract network properties from matter are on their way.

We finalise the paper with a note of caution that is addressed to the whole modelling community of polymer networks. The network analysis of the node-neighbourhood scaling in the configuration model points out the existence of unphysical features that appear after the gel transition. Importantly, the unphysical scaling is not an artefact of our approach but rather an implication of the commonly trusted assumption of chemical systems being well-mixed. This assumption is standard for many modelling methods that do not account for spacial embedding of the chemical species in the three-dimensional space, as for instance is the case for the rate equations, Flory-Stockmayer theory, population balance equations, kinetic Monte Carlo, and other methods. We therefore would like to encourage the search of new network models that bring together mutual interaction of the topology and space.

Methods

Master equation of the degree distribution

In this subsection we briefly summarise the theory from ref.29 that allows us to recover the time evolution of the bivariate degree distribution by constructing an analytically solvable master equation. We distinguish monomer species by counting the numbers of functional groups of both types I, J and the numbers of in- and out-edges i, j. During the progress of polymerisation the functional groups are converted into chemical bonds between the monomers, and the concentration profiles Mi,j,I,J(t) obey the following master equation:

$$\begin{array}{rcl}\frac{\partial }{\partial t}{M}_{i,j,I,J}(t) & = & (I-i+1)({{\rm{\nu }}}_{01}-\mu (t)){M}_{i-1,j,I,J}(t)\\ & & +\,(J-j+1)({{\rm{\nu }}}_{10}-\mu (t)){M}_{i,j-1,I,J}(t)\\ & & -\,(\,(I-i)({{\rm{\nu }}}_{01}-\mu (t))+(J-j)({{\rm{\nu }}}_{10}-\mu (t))\,){M}_{i,j,I,J}(t).\end{array}$$
(9)

Since initially, at t = 0, there are no bonds, the system is completely described by the distribution of functional groups: Mi,j,I,J(0) = P(I, J), i, j = 0 and Mi,j,I,J = 0 for i > 0 or j > 0. This linear master equation describes the evolution of monomer species Mi,j,I,J(t) in time. Such a description is different from the commonly used aggregation equation, which is a non-linear master equation that describes the evolution of the polymer species of different size. The most important information we like to extract from this master equation is the time dependant degree distribution u(i, j, t). The latter is readily obtained by lumping together all monomer species having the same numbers of in- and out-edges:

$$u(i,j,t)=\sum _{I,J\ge 0}\,{M}_{i,j,I,J}(t).$$
(10)

The master eq. (10) is a linear differential-difference equation that can be transformed to an analytically solvable system of partial differential equations by applying the generating function (GF) transform. We thus directly proceed by writing the expression for the degree distribution:

$$u(i,j,t)=\sum _{I,J\ge 0}\,(\begin{array}{c}I\\ i\end{array}){p}_{{\rm{A}}}{(t)}^{i}{(1-{p}_{{\rm{A}}}(t))}^{I-i}(\begin{array}{c}J\\ j\end{array}){p}_{{\rm{B}}}{(t)}^{j}{(1-{p}_{{\rm{B}}}(t))}^{J-j}P(I,J).$$
(11)

where

$$\begin{array}{l}{p}_{{\rm{A}}}(t)=\frac{\mu (t)}{{{\rm{\nu }}}_{10}},\,{\rm{and}}\,{p}_{{\rm{B}}}(t)=\frac{\mu (t)}{{{\rm{\nu }}}_{01}},\end{array}$$
(12)

denote the fractions of converted in- and out-edges and

$$\mu (t)={\mu }_{10}(t)={\mu }_{01}(t)={\nu }_{01}-\frac{{{\rm{\nu }}}_{01}({{\rm{\nu }}}_{01}-{{\rm{\nu }}}_{10})}{{{\rm{\nu }}}_{01}-{{\rm{\nu }}}_{10}{e}^{t({{\rm{\nu }}}_{10}-{{\rm{\nu }}}_{01})}}$$
(13)

is the expected in/out degree. Here, νmn and μmn(t) refer to the mixed moments of respectively P(I, J) and u(i, j, t):

$$\begin{array}{rcl}{{\rm{\nu }}}_{mn} & = & \sum _{I,J\ge 0}\,{I}^{m}{J}^{n}P(I,J),\\ {\mu }_{mn}(t) & = & \sum _{i,j\ge 0}\,{i}^{m}{j}^{n}u(i,j,t).\end{array}$$
(14)

Note, that since the expected in- and out-degrees coincide, we have μ(t) = μ10(t) = μ01(t). Mixed moments of the degree distribution μmn(t) can be obtained in the form of analytical expressions by performing the appropriate summations of eq. (11). For instance, the list of mixed moments up to order n + m = 2 is as follows:

$$\begin{array}{rcl}{\mu }_{00}(t) & = & 1,\\ {\mu }_{10}(t) & = & {\mu }_{01}(t)=\mu (t)={p}_{{\rm{A}}}(t){{\rm{\nu }}}_{10}={p}_{{\rm{B}}}(t){{\rm{\nu }}}_{01},\\ {\mu }_{20}(t) & = & {p}_{{\rm{A}}}(t)({p}_{{\rm{A}}}(t){{\rm{\nu }}}_{20}-{p}_{{\rm{A}}}(t){{\rm{\nu }}}_{10}+{{\rm{\nu }}}_{10}),\\ {\mu }_{02}(t) & = & {p}_{{\rm{B}}}(t)({p}_{{\rm{B}}}(t){{\rm{\nu }}}_{02}-{p}_{{\rm{B}}}(t){{\rm{\nu }}}_{01}+{{\rm{\nu }}}_{01}),\\ {\mu }_{11}(t) & = & {p}_{{\rm{A}}}(t){p}_{{\rm{B}}}(t){{\rm{\nu }}}_{11},\\ {\mu }_{22}(t) & = & {p}_{{\rm{A}}}(t){p}_{{\rm{B}}}(t)({p}_{{\rm{A}}}(t){p}_{{\rm{B}}}(t)({{\rm{\nu }}}_{22}-{{\rm{\nu }}}_{21}-{{\rm{\nu }}}_{12}+{{\rm{\nu }}}_{11})\\ & & +\,{p}_{{\rm{A}}}(t)({{\rm{\nu }}}_{21}-{{\rm{\nu }}}_{11})+{p}_{{\rm{B}}}(t)({{\rm{\nu }}}_{12}-{{\rm{\nu }}}_{11})+{{\rm{\nu }}}_{11}).\end{array}$$
(15)

Below we make use of the expressions for μi,j(t) to determine the gelation conversion, whereas u(i, j, t) is linked to the distribution of molecular weights, to the average molecular weight, and to the typical shortest path length.

Generating functions of the degree and excess degree distributions

In this section, we omit the time dependance in the degree distribution (11), and demonstrate how one can study directed networks defined by the bivariate degree distribution as combinatorial species55,56. The GF of a bivariate distribution is formally given by57:

$$U(x,y)=\sum _{i,j\ge 0}\,u(i,j){x}^{i}{y}^{j},$$
(16)

with |x|, |y| ≤ 1, \(x,y\in {\mathbb{C}}\) and \(U(x,y){|}_{x,y=1}=1\). The excess distributions uin(i, j) and uout(i, j), are defined as the degree distributions of nodes that are reached by randomly choosing an in- or out-edge:

$$\begin{array}{rcl}{u}_{{\rm{in}}}(i,j) & = & \frac{1}{\mu }(i+1)u(i+1,j),\\ {u}_{{\rm{out}}}(i,j) & = & \frac{1}{\mu }(j+1)u(i,j+1).\end{array}$$
(17)

The GFs of the latter distributions are given by:

$$\begin{array}{l}{U}_{{\rm{in}}}(x,y)=\frac{1}{\mu }\frac{\partial U(x,y)}{\partial x},\\ {U}_{{\rm{out}}}(x,y)=\frac{1}{\mu }\frac{\partial U(x,y)}{\partial y},\end{array}$$
(18)

and satisfy \({U}_{{\rm{in}}}(x,y{)|}_{x,y=1}=1\) and \({U}_{{\rm{out}}}(x,y{)|}_{x,y=1}=1\). Plugging the degree distributions (11) into eqs (1618) gives:

$$\begin{array}{l}U(x,y)=\sum _{I,J\ge 0}P(I,J)((x-\mathrm{1)}{p}_{{\rm{A}}}+{\mathrm{1)}}^{I}{((y-\mathrm{1)}{p}_{{\rm{B}}}+\mathrm{1)}}^{J},\\ {U}_{{\rm{in}}}(x,y)=\frac{1}{\mu }\sum _{I,J\ge 0}P(I,J)I{p}_{{\rm{A}}}{((x-\mathrm{1)}{p}_{{\rm{A}}}+\mathrm{1)}}^{I-1}{((y-\mathrm{1)}{p}_{{\rm{B}}}+\mathrm{1)}}^{J},\\ {U}_{{\rm{out}}}(x,y)=\frac{1}{\mu }\sum _{I,J\ge 0}P(I,J)J{p}_{{\rm{B}}}{((x-\mathrm{1)}{p}_{{\rm{A}}}+\mathrm{1)}}^{I}{((y-\mathrm{1)}{p}_{{\rm{B}}}+\mathrm{1)}}^{J-1}.\end{array}$$
(19)

Having these expressions in hand allows us to link the moments of the degree distribution u(i, j, t) as defined by eq. (14) to the partial derivatives of the GFs by writing:

$${\mu }_{mn}={[{(x\frac{\partial }{\partial x})}^{m}{(y\frac{\partial }{\partial y})}^{n}U(x,y)]|}_{x,y=1}.$$
(20)

These relations are used below to derive analytical expressions for various global features of the polymer network.

Molecular weight distribution

From chemical point of view any cluster of monomers that are connected together by means of covalent bonds is considered to be a molecule. In our directed network a molecule is therefore represented by a connected component, whereas the molecular weight is simply the size of this component. The distribution of molecular weights is a popular descriptor of polymer materials. In fact, there are two ways to define such distribution: the probability w(s) that a randomly chosen monomer belongs to a component of size s is called the molecular weight distribution. Alternatively, by applying the weight \(\frac{1}{s}\) we obtain the molecular size distribution,

$$n(s)=\frac{C}{s}w(s),$$

that is the probability that a randomly chosen molecule has size s. In the latter equation C provides the appropriate normalisation of probability.

Here we link the molecular weight distribution w(s) to the size distribution of connected components in the directed configuration model as derived in ref.43, and briefly discuss the insights that this interpretation brings to understanding the step-growth polymerisation polymerisation process. The first values of w(s) can be found by following simple considerations. For instance w(1) is the probability to choose an isolated node with no neighbours, and therefore:

$$w(1)=u(0,0).$$

Furthermore, w(2) is the probability that a randomly chosen node has one edge and its only neighbour has no edges except the one that connects it with the first node:

$$w(2)=\frac{2}{\mu }u(1,0)u(0,1).$$

Continuing this list would lead to a combinatorial explosion of possibilities. A much faster alternative is to employ the GFs. The GF for w(s) is formally defined as:

$$W(x)=\sum _{s}\,w(s){x}^{s},\,|x|\le 1,\,x\in {\mathbb{C}},$$
(21)

and is obtained from the following system of functional equations:

$$\begin{array}{rcl}W(x) & = & xU[{W}_{{\rm{out}}}(x),{W}_{{\rm{in}}}(x)],\\ {W}_{{\rm{in}}}(x) & = & x{U}_{{\rm{in}}}[{W}_{{\rm{out}}}(x),{W}_{{\rm{in}}}(x)],\\ {W}_{{\rm{out}}}(x) & = & x{U}_{{\rm{out}}}[{W}_{{\rm{out}}}(x),{W}_{{\rm{in}}}(x)],\end{array}$$
(22)

where the GFs U(x, y), Uin(x, y) and Uout(x, y) are defined by eqs (16) and (18). The functions \({W}_{{\rm{in}}}(x)={\sum }_{s > 0}\,{w}_{{\rm{in}}}(s){x}^{s}\) and \({W}_{{\rm{out}}}(x)={\sum }_{s > 0}\,{w}_{{\rm{out}}}(s){x}^{s}\) denote the GFs of the excess component size distributions win(s) and wout(s).

The formal solution to eq. (22) is given by the following relation43:

$$w(s)=\{\begin{array}{cc}\sum _{m=0}^{s-1}a(m,s-m-1), & s > 1,\\ u(0,0), & s=1,\end{array}$$
(23)

where

$$a(m,n)={u(i,j)\ast {u}_{{\rm{out}}}{(i,j)}^{\ast m-1}\ast {u}_{{\rm{in}}}{(i,j)}^{\ast n-1}\ast d(i,j)|}_{\begin{array}{l}i=m\\ j=n\end{array}},\,m,\,n\ge 0,$$
(24)

and

$$d(i,j)=[{u}_{{\rm{out}}}(i,j)-i{u}_{{\rm{out}}}(i,j)]\,\ast \,[{u}_{{\rm{in}}}(i,j)-j{u}_{{\rm{in}}}(i,j)]-j{u}_{{\rm{out}}}(i,j)\,\ast \,i{u}_{{\rm{in}}}(i,j\mathrm{)}.$$
(25)

Here, f(i, j)*n denotes the convolution power \(f{(i,j)}^{\ast n}=f{(i,j)}^{\ast n-1}\,\ast \,f(i,j),\,f{(i,j)}^{\ast 0}:\,=1\), and the bivariate convolution is defined as

$$f(i,j)\,\ast \,g(i,j)=\sum _{{i}_{1}+{i}_{2}=i,{j}_{1}+{j}_{2}=j}\,f({i}_{1},{j}_{1})g({i}_{2},{j}_{2}\mathrm{)}.$$
(26)

In practice, numerical values of the convolution can be conveniently obtained by Fast Fourier Transform FFT. The asymptotical analysis of w(s) yields two distinct asymptotes:

$${w}_{\infty }(s)\propto {s}^{-\frac{3}{2}}{e}^{-{C}_{1}s},\,{\rm{if}}\,{{\rm{\nu }}}_{20}-{{\rm{\nu }}}_{10} > 0\,{\rm{or}}\,{{\rm{\nu }}}_{02}-{{\rm{\nu }}}_{10} > 0,$$
(27)

and

$${w}_{\infty }(s)\propto {s}^{-\frac{1}{2}}{e}^{-{C^{\prime} }_{1}s},\,{\rm{if}}\,{{\rm{\nu }}}_{20}-{{\rm{\nu }}}_{10}=0\,{\rm{or}}\,{{\rm{\nu }}}_{02}-{{\rm{\nu }}}_{10}=0.$$
(28)

The exact expression for the coefficients C1 and C1′ are given in ref.43 and νnm are as defined in eq. (14).

Systems with a single monomer type

Although the numerical values of eq. (23) are accessible at the cost of \({\mathscr{O}}({s}^{2}\,\mathrm{log}\,s)\) operations, explicit analytical relations can be obtained in some special cases. Consider the case when there is only one monomer species bearing I groups of type A and J groups of type B, that is the AIBJ monomer. We will now derive an explicit analytical expression for w(s). Note that in this case, the distribution of initial functionalities is trivial, P(I, J) = 1, and therefore the expression for the degree distribution given in eq. (11) simplifies to a bivariate binomial distribution:

$$u(i,j,t)=(\begin{array}{c}I\\ i\end{array}){p}_{{\rm{A}}}{(t)}^{i}{(1-{p}_{{\rm{A}}}(t))}^{I-i}(\begin{array}{c}J\\ j\end{array}){p}_{{\rm{B}}}{(t)}^{j}{(1-{p}_{{\rm{B}}}(t))}^{J-j},$$
(29)

where

$${p}_{{\rm{A}}}(t)=J\frac{{{\rm{e}}}^{t(I-J)}-1}{I{{\rm{e}}}^{t(I-J)}-J}\,{\rm{and}}\,{p}_{{\rm{B}}}(t)=I\frac{{{\rm{e}}}^{t(I-J)}-1}{I{{\rm{e}}}^{t(I-J)}-J}.$$
(30)

The first values of w(s) are readily obtained by writing:

$$\begin{array}{rcl}w\mathrm{(1)} & = & u\mathrm{(0,0)}={\mathrm{(1}-{p}_{A})}^{I}{\mathrm{(1}-{p}_{B})}^{J},\\ w\mathrm{(2)} & = & \frac{2}{\mu }u\mathrm{(1,0)}u\mathrm{(0,1)}=2J{p}_{B}{\mathrm{(1}-{p}_{A})}^{2I-1}{\mathrm{(1}-{p}_{B})}^{2J-1}\\ & = & 2I{p}_{A}{\mathrm{(1}-{p}_{A})}^{2I-1}{\mathrm{(1}-{p}_{B})}^{2J-1}.\end{array}$$

Since u(i, j) has a binomial form, one can analytically solve the convolution powers appearing in eq. (24) to obtain:

$$\begin{array}{rcl}w(s) & = & {p}_{{\rm{B}}}^{s-2}{\mathrm{(1}-{p}_{{\rm{A}}})}^{(I-\mathrm{1)}s+1}{\mathrm{(1}-{p}_{{\rm{B}}})}^{(J-\mathrm{1)}s+1}\\ & & \times \,({p}_{{\rm{B}}}(\begin{array}{c}Js\\ s-1\end{array})\,A-{p}_{{\rm{B}}}J(\begin{array}{c}Js-1\\ s-2\end{array})B-{p}_{{\rm{A}}}(I-\mathrm{1)}(\begin{array}{c}Js-2\\ s-2\end{array})C),\end{array}$$
(31)

where factors A, B, and C are defined via the Hypergeometric function:

$$\begin{array}{rcl}A & = & {}_{2}{\rm{F}}_{1}(1-s,\,(I-\mathrm{1)}s+\mathrm{1;}\,\,\,-Js;\,\,\frac{{p}_{{\rm{A}}}}{{p}_{{\rm{B}}}}),\\ B & = & {}_{2}{\rm{F}}_{1}(2-s,\,(I-\mathrm{1)}s+\mathrm{1;}\,\,\,1-Js;\,\,\frac{{p}_{{\rm{A}}}}{{p}_{{\rm{B}}}}),\\ C & = & {}_{2}{\rm{F}}_{1}(2-s,\,(I-\mathrm{1)}s+\mathrm{2;}\,\,\,2-Js;\,\,\frac{{p}_{{\rm{A}}}}{{p}_{{\rm{B}}}}).\end{array}$$

As a quick check-up, we consider the AB2-system by setting I = 1, J = 2 and \(p=\frac{{p}_{A}}{2}={p}_{B}\le 0.5\). This substitution reduces eq. (31) to:

$$w(s)=s\mathrm{(1}-2p){p}^{s-1}{\mathrm{(1}-p)}^{s+1}\frac{\mathrm{(2}s)!}{s!(s+\mathrm{1)!}}.$$
(32)

This result is identical, up to the normalisation factor s, to the size distribution derived in ref.11. Note we normalise with s to guarantee that w(s) is a proper probability mass function, that is \(\sum _{s\ge 1}\,w(s)=1\).

Weight-average molecular weight

The weight-average molecular weight is a widely-used quantity in polymer chemistry. It is defined by the follwing ratio:

$$\langle s\rangle :\,=\frac{{\sum }_{s > 0}\,sw(s)}{{\sum }_{s > 0}\,w(s)}=\frac{W^{\prime} \mathrm{(1)}}{W\mathrm{(1)}}.$$
(33)

Note that before the gel point, w(s), win(s) and wout(s) are appropriately normalised,

$$\sum _{s > 0}\,w(s)=W(1)={W}_{{\rm{in}}}(1)={W}_{{\rm{out}}}(1)=1.$$

By plugging eq. (22) into definition (33) we obtain:

$${\langle s\rangle }_{t < {t}_{{\rm{gel}}}}=\frac{W^{\prime} \mathrm{(1)}}{W\mathrm{(1)}}=W^{\prime} \mathrm{(1)}=\tfrac{{\mu }^{2}(\,-2{\mu }_{11}+{\mu }_{20}+{\mu }_{02})}{{\mu }_{11}^{2}-2\mu {\mu }_{11}-{\mu }_{02}{\mu }_{20}+\mu ({\mu }_{20}+{\mu }_{02})}+1.$$
(34)

After gel transition, the latter expression becomes more complex and reads:

$$\begin{array}{c}{\langle s\rangle }_{t > {t}_{{\rm{g}}{\rm{e}}{\rm{l}}}}={\textstyle \tfrac{{W}^{{\rm{^{\prime} }}}(1)}{W(1)}}={\textstyle \tfrac{{\mu }^{2}}{U({r}_{{\rm{o}}{\rm{u}}{\rm{t}}},{r}_{{\rm{i}}{\rm{n}}})}}\cdot {\textstyle \tfrac{2{r}_{{\rm{i}}{\rm{n}}}{r}_{{\rm{o}}{\rm{u}}{\rm{t}}}(\mu -{U}_{11}({r}_{{\rm{o}}{\rm{u}}{\rm{t}}},{r}_{{\rm{i}}{\rm{n}}}))+{r}_{{\rm{i}}{\rm{n}}}^{2}{U}_{02}({r}_{{\rm{o}}{\rm{u}}{\rm{t}}},{r}_{{\rm{i}}{\rm{n}}})+{r}_{{\rm{o}}{\rm{u}}{\rm{t}}}^{2}{U}_{20}({r}_{{\rm{o}}{\rm{u}}{\rm{t}}},{r}_{{\rm{i}}{\rm{n}}})}{{\mu }^{2}-2\mu {U}_{11}({r}_{{\rm{o}}{\rm{u}}{\rm{t}}},{r}_{{\rm{i}}{\rm{n}}})-{U}_{02}({r}_{{\rm{o}}{\rm{u}}{\rm{t}}},{r}_{{\rm{i}}{\rm{n}}}){U}_{20}({r}_{{\rm{o}}{\rm{u}}{\rm{t}}},{r}_{{\rm{i}}{\rm{n}}})+{U}_{11}^{2}({r}_{{\rm{o}}{\rm{u}}{\rm{t}}},{r}_{{\rm{i}}{\rm{n}}})}}+1,\end{array}$$
(35)

where \({U}_{lm}(x,y)={(\frac{\partial }{\partial x})}^{l}{(\frac{\partial }{\partial y})}^{m}U(x,y)\) denote the partial derivatives, and (rin, rout) is the solution of the following system of equations:

$$\begin{array}{rcl}{r}_{{\rm{in}}} & = & {U}_{{\rm{in}}}({r}_{{\rm{out}}},{r}_{{\rm{in}}}),\\ {r}_{{\rm{out}}} & = & {U}_{{\rm{out}}}({r}_{{\rm{out}}},{r}_{{\rm{in}}}\mathrm{)}.\end{array}$$
(36)

Phase transition and gel fraction

During the evolution of the network the functional groups are converted into edges and at some critical point the system accumulates so many edges that it percolates. This critical moment can be identified by a few alternative methods. For instance, one may study the asymptotical behaviour of the size distribution of connected components as given by (27). This asymptote becomes scale-free at the critical point. The other alternative is to directly detect the percolation phase transition by looking at the degree distribution itself. In this case, the changes that occur in the degree distribution at the critical point are more subtle, yet they can be detected by a specially designed criticality criterion. This criterion was introduced by Molloy and Reed for undirected networks47, and was later generalised to the case of directed networks in ref.29. In this section we briefly discuss the implications of the latter theory on our dynamic polymer network.

If the only available information about a system is its degree distribution, we can detect whether the system is in the gel regime by the following criterion:

$${\mu }_{11}^{2}-2\mu {\mu }_{11}-{\mu }_{02}{\mu }_{20}+\mu ({\mu }_{20}+{\mu }_{02})\le 0.$$
(37)

The conversion of A-groups at the critical point is given by:

$${p}_{{\rm{A}},{\rm{critical}}}=\frac{{{\rm{\nu }}}_{01}}{{{\rm{\nu }}}_{11}+\sqrt{({{\rm{\nu }}}_{02}-{{\rm{\nu }}}_{01})({{\rm{\nu }}}_{20}-{{\rm{\nu }}}_{10})}}.$$
(38)

Thus, if pA(t) > pA,critical the system contains gel. Some system, however, never produce gel. This happens because the initial configuration of the system does not have a sufficient amount of high functional monomers, or too many monomers of functionality one that terminate the growth of the network. In either case this statement can be quantified by looking at the moments of the functionality distribution P(I, J): the phase transition occurs in finite time if at least one of the following conditions is true:

$$\begin{array}{ll} & ({{\rm{\nu }}}_{02}-{{\rm{\nu }}}_{01})({{\rm{\nu }}}_{20}-{{\rm{\nu }}}_{10})-{({{\rm{\nu }}}_{11}-{{\rm{\nu }}}_{10})}^{2} > 0,\,{\rm{and}}\,{{\rm{\nu }}}_{01}\le {{\rm{\nu }}}_{10}.\\ {\rm{or}} & \\ & ({{\rm{\nu }}}_{02}-{{\rm{\nu }}}_{01})({{\rm{\nu }}}_{20}-{{\rm{\nu }}}_{10})-{({{\rm{\nu }}}_{11}-{{\rm{\nu }}}_{01})}^{2} > 0,\,{\rm{and}}\,{{\rm{\nu }}}_{01}\ge {{\rm{\nu }}}_{10},\end{array}$$
(39)

The gel fraction is defined as the probability that a randomly selected node belongs to the gel molecule. The GF of the component size distribution W(x) only describes the components of finite size, and the gel fraction is found as the mass deficit that departs from zero at the phase transition. The amount of this ‘lost’ mass, that is the probability that a randomly chosen monomer belongs to the gel, is given by

$${g}_{f}=1-r,$$
(40)

where r = W(1). This means that in order to recover gf, one needs to solve the equation for W(x) only at a single point x = 1. By substituting x = 1 into (22) one obtains:

$$\begin{array}{rcl}r & = & U({r}_{{\rm{out}}},{r}_{{\rm{in}}}),\\ {r}_{{\rm{in}}} & = & {U}_{{\rm{in}}}({r}_{{\rm{out}}},{r}_{{\rm{in}}}),\\ {r}_{{\rm{out}}} & = & {U}_{{\rm{out}}}({r}_{{\rm{out}}},{r}_{{\rm{in}}}),\end{array}$$
(41)

where U(x, y), Uin(x, y) and Uout(x, y) are given by eq. (19).

It is important to note that the directionality of the network, as it is introduced in the present paper, is decisive only in the case of an asymmetric degree distribution \(u(i,j)\ne u(j,i)\). In case of a symmetric distribution u(i, j) = u(j, i), the directed network model will give the same results as the undirected model supplied with degree distribution \(u(k)={\sum }_{k=i+j}\,u(i,j)\). This implies that an AB-system (undergoing AB-reactions) with symmetric initial input (P(I, J) = P(I, J)) leads to the same network as an AA-system (undergoing AA-reactions) with \(P(K)={\sum }_{K=I+J}\,P(I,J)\). For example, the AB-system A3:B3 = 1:1 forms a topologically identical polymer network (when disregarding the directionality of the edges) as the AA-system A3. With this in mind, eq. (38) simplifies for the case of symmetric functionality distribution P(I, J) = P(I, J) as follows:

  1. 1.

    Symmetry leading to

    $${p}_{{\rm{critical}}}=\frac{{{\rm{\nu }}}_{10}}{{{\rm{\nu }}}_{11}+{{\rm{\nu }}}_{20}-{{\rm{\nu }}}_{10}},$$
    (42)

    with ν01 = ν10 and ν02 = ν20.

  2. 2.

    Mapping the symmetric AB-system to an AA-system (e.g. A3:B3 = 1:1 ≡ A3, A2B:AB2 = 1:1 ≡ A3) simplifies to

    $${p}_{{\rm{critical}}}=\frac{{{\rm{\nu }}}_{1}}{{{\rm{\nu }}}_{2}-{{\rm{\nu }}}_{1}},$$
    (43)

    using \({{\rm{\nu }}}_{10}=\frac{1}{2}{{\rm{\nu }}}_{1}\) and \({{\rm{\nu }}}_{11}+{{\rm{\nu }}}_{20}=\frac{1}{2}{{\rm{\nu }}}_{2}\). The deduced equations give the critical conversion in undirected networks22.

  3. 3.

    For systems of one initial monomer type we note \({{\rm{\nu }}}_{2}={{\rm{\nu }}}_{1}^{2}\), giving

$${p}_{{\rm{critical}}}=\frac{1}{{{\rm{\nu }}}_{1}-1}=\frac{1}{I+J-1}.$$
(44)

This corresponds to the Flory-Stockmayer equation, where z = ν1 is the expected number of first neighbours9.

Derivation of the gyration radius

In polymer physics, the radius of gyration is used to describe the dimensions of a branched polymer and can be experimentally observed by light scattering experiments. Consider a branched polymer with s monomers having coordinates \({r}_{i}\in {{\mathbb{R}}}^{3}\), i = 1, …, s. The radius of gyration \(R{(s)}_{{\rm{g}}}^{2}\) of this topology is conventionally defined as:

$${R}_{{\rm{g}}}^{2}(s)=\frac{1}{{s}^{2}}\,\sum _{k\mathrm{=1}}^{s}\,\sum _{l=k}^{s}\,{({\overrightarrow{r}}_{k}-{\overrightarrow{r}}_{l})}^{2}.$$
(45)

This quantity can be estimated using the Kramer’s theorem58 that states:

$$\frac{{R}_{{\rm{g}}}^{2}(s)}{{b}^{2}}=\frac{1}{{s}^{2}}\,\sum _{j\mathrm{=1}}^{s-1}\,{s}_{L}(j){s}_{R}(j),$$
(46)

where b is the Kuhn’s length, which is related to the size of a monomer unit. This sum runs over all possible cuts of the branched structure into two fragments: the left fragment of size sL and the right fragment of size sR. There are s − 1 of such partitions. In a statistical ensemble, the size distributions of sL and sR are given by respectively win(s) and wout(s), which are defined by their GFs in eq. (22). Hillegers & Sloot38 formulated the ensemble average for eq. (46) with respect to win(s) and wout(s):

$$\frac{\langle {R}_{{\rm{g}}}^{2}(s)\rangle }{{b}^{2}}=\frac{1}{{s}^{2}}\frac{{\sum }_{{s}_{A}+{s}_{B}=s}\,{s}_{A}{w}_{{\rm{in}}}({s}_{A}){s}_{B}{w}_{{\rm{out}}}({s}_{B})}{\frac{1}{s-1}\,{\sum }_{{s}_{A}+{s}_{B}=s}\,{w}_{{\rm{in}}}({s}_{A}){w}_{{\rm{out}}}({s}_{B})},$$
(47)

which we further process using a discrete Fourier transform \({ {\mathcal F} }^{-1}G(k{)|}_{s}\):

$$\begin{array}{l}\frac{\langle {R}_{{\rm{g}}}^{2}(s)\rangle }{{b}^{2}}=\frac{s-1}{{s}^{2}}\frac{(s{w}_{{\rm{in}}}(s))\,\ast \,(s{w}_{{\rm{out}}}(s))}{{w}_{{\rm{in}}}(s)\,\ast \,{w}_{{\rm{out}}}(s)}=\frac{s-1}{{s}^{2}}\frac{{ {\mathcal F} }^{-1}({W^{\prime} }_{{\rm{in}}}({x}_{k}){W^{\prime} }_{{\rm{out}}}({x}_{k}{))|}_{s-2}}{{ {\mathcal F} }^{-1}({W}_{{\rm{in}}}({x}_{k}){W}_{{\rm{out}}}({x}_{k}{))|}_{s}},\end{array}$$
(48)

with \({x}_{k}={e}^{-2\pi {\rm{i}}\frac{k}{N+1}}\), k = 0, …, N. Therefore, if Win(x) and Wout(x) are already available from the computation of the component size distribution w(s), the ensemble-average radius of gyration is obtained by applying the FFT algorithm at the cost of \({\mathscr{O}}(s\,\mathrm{log}\,s)\) multiplicative operations. For small s, it is possible to compute the convolution directly, whereas eq. (48) is the most advantageous for \(s\gg 1\), where the direct evaluation of the convolution becomes unfeasible.

Another related quantity to the gyration radius is the contraction factor g(s), which is given by the ratio between the mean square gyration radius for a branched polymer \({R}_{{\rm{g}}}^{2}(s)\) and the square gyration radius of a reference linear polymer with the same length \({R}_{{\rm{g}}}^{2}(s{)|}_{{\rm{linear}}}\):

$$g(s)=\frac{{R}_{{\rm{g}}}^{2}(s)}{{R}_{{\rm{g}}}^{2}(s{)|}_{{\rm{linear}}}},$$
(49)

where \({R}_{{\rm{g}}}^{2}(s{)|}_{{\rm{linear}}}\) is typically estimated from the Gaussian coil model:

$${\frac{{R}_{{\rm{g}}}^{2}(s)}{{b}^{2}}|}_{{\rm{linear}}}=\frac{{s}^{2}-1}{6s}.$$
(50)

Scaling of the node neighbourhood size

In this section we apply the Joyal’s theory of combinatorial species to investigate the number of nodes that are contained within a given topological distance from a randomly chosen node. The expected number of first-degree neighbours is defined as the sum of the neighbours reached by the in-edges and the neighbours reached by the out-edges. By using GF this number can be extracted from the degree distribution:

$${z}_{1}=(\frac{{\rm{\partial }}}{{\rm{\partial }}x}+\frac{{\rm{\partial }}}{{\rm{\partial }}y})U(x,y{)|}_{x,y=1}=2\mu ,$$
(51)

and moreover, the expected number of the mth-degree neighbours is given by a composition of U(x, y) and (m − 1)-fold composition of the excess GF:

$${z}_{m}=(\frac{{\rm{\partial }}}{{\rm{\partial }}x}+\frac{{\rm{\partial }}}{{\rm{\partial }}y}){U}^{[m]}(x,y{)|}_{x,y=1}.$$
(52)

Here U[m] generates the probability for the number of mth neighbours:

$${U}^{[m]}\,:\,=\{\begin{array}{cc}U(x,y), & {\rm{f}}{\rm{o}}{\rm{r}}\,m=1,\\ {U}^{[m-1]}({U}_{{\rm{o}}{\rm{u}}{\rm{t}}}(x,y),{U}_{{\rm{i}}{\rm{n}}}(x,y)), & {\rm{f}}{\rm{o}}{\rm{r}}\,m > 1.\end{array}$$
(53)

Therefore, for the first-degree neighbours we have z1 = 2μ. For second-degree neighbours we have:

$$\begin{array}{ccc}{z}_{2} & = & (\frac{{\rm{\partial }}}{{\rm{\partial }}x}+\frac{{\rm{\partial }}}{{\rm{\partial }}y})U({U}_{{\rm{o}}{\rm{u}}{\rm{t}}}(x,y),{U}_{{\rm{i}}{\rm{n}}}(x,y{))|}_{x,y=1}\\ & = & {\mu (\frac{1}{\mu }\frac{{{\rm{\partial }}}^{2}}{{{\rm{\partial }}}^{2}x}+2\frac{1}{\mu }\frac{{{\rm{\partial }}}^{2}}{{\rm{\partial }}x{\rm{\partial }}y}+\frac{1}{\mu }\frac{{{\rm{\partial }}}^{2}}{{{\rm{\partial }}}^{2}y})U(x,y)|}_{x,y=1}\\ & = & \mu {{\bf{1}}}^{T}({\frac{1}{\mu }{\bf{L}}U(x,y)|}_{x,y=1}){\bf{1}},\end{array}$$
(54)

where \({\bf{L}}=(\begin{array}{cc}\frac{{{\rm{\partial }}}^{2}}{{\rm{\partial }}x{\rm{\partial }}y} & \frac{{{\rm{\partial }}}^{2}}{{{\rm{\partial }}}^{2}y}\\ \frac{{{\rm{\partial }}}^{2}}{{{\rm{\partial }}}^{2}x} & \frac{{{\rm{\partial }}}^{2}}{{\rm{\partial }}x{\rm{\partial }}y}\end{array})\) and \({{\bf{1}}}^{T}=(11)\). The expected number of the third-degree neighbours is given by:

$$\begin{array}{ccc}{z}_{3} & = & (\frac{{\rm{\partial }}}{{\rm{\partial }}x}+\frac{{\rm{\partial }}}{{\rm{\partial }}y})U({U}_{{\rm{o}}{\rm{u}}{\rm{t}}}({U}_{{\rm{o}}{\rm{u}}{\rm{t}}}(x,y),{U}_{{\rm{i}}{\rm{n}}}(x,y)),{U}_{{\rm{i}}{\rm{n}}}({U}_{{\rm{o}}{\rm{u}}{\rm{t}}}(x,y),{U}_{{\rm{i}}{\rm{n}}}(x,y{)))|}_{x,y=1}\\ & = & \mu (\frac{1}{{\mu }^{2}}\frac{{{\rm{\partial }}}^{2}}{{\rm{\partial }}x{\rm{\partial }}y}(\frac{{{\rm{\partial }}}^{2}}{{\rm{\partial }}x{\rm{\partial }}y}+\frac{{{\rm{\partial }}}^{2}}{{{\rm{\partial }}}^{2}y})+\frac{1}{{\mu }^{2}}\frac{{{\rm{\partial }}}^{2}}{{{\rm{\partial }}}^{2}y}(\frac{{{\rm{\partial }}}^{2}}{{{\rm{\partial }}}^{2}x}+\frac{{{\rm{\partial }}}^{2}}{{\rm{\partial }}x{\rm{\partial }}y})\\ & & {+\frac{1}{{\mu }^{2}}\frac{{{\rm{\partial }}}^{2}}{{{\rm{\partial }}}^{2}x}(\frac{{{\rm{\partial }}}^{2}}{{\rm{\partial }}x{\rm{\partial }}y}+\frac{{{\rm{\partial }}}^{2}}{{{\rm{\partial }}}^{2}y})+\frac{1}{{\mu }^{2}}\frac{{{\rm{\partial }}}^{2}}{{\rm{\partial }}x{\rm{\partial }}y}(\frac{{{\rm{\partial }}}^{2}}{{{\rm{\partial }}}^{2}x}+\frac{{{\rm{\partial }}}^{2}}{{\rm{\partial }}x{\rm{\partial }}y}))U(x,y)|}_{x,y=1}\\ & = & \mu {{\bf{1}}}^{T}{(\frac{1}{\mu }{\bf{L}}U(x,y{)|}_{x,y=1})}^{2}{\bf{1}}.\end{array}$$
(55)

By using induction we arrive at the following expression for the expected number of the mth-degree neighbours in terms of the degree distribution moments:

$${z}_{m}=\mu {{\bf{1}}}^{T}{{\bf{A}}}^{m-1}{\bf{1}},$$
(56)

where

$${\bf{A}}={\frac{1}{\mu }{\bf{L}}U(x,y)|}_{x,y=1}=\frac{1}{\mu }(\begin{array}{cc}{\mu }_{11} & {\mu }_{02}-\mu \\ {\mu }_{20}-\mu & {\mu }_{11}\end{array}).$$
(57)

Now, the total number of nodes N contained within a topological ball of radius l is given as a sum:

$$N=1+\sum _{m=1}^{l}\,{z}_{m}=1+\mu \,\sum _{m=1}^{l}\,{{\bf{1}}}^{T}{{\bf{A}}}^{m-1}{\bf{1}}=1+\mu {{\bf{1}}}^{T}(\sum _{m=1}^{l}\,{{\bf{A}}}^{m-1}){\bf{1}}.$$
(58)

Using the equation for the sum of the geometric series, \({\sum }_{m=0}^{l}\,{{\bf{A}}}^{m}=({{\bf{A}}}^{l+1}-{\bf{I}}){({\bf{A}}-{\bf{I}})}^{-1}\), with I denoting the identity matrix, we obtain:

$$N=1+\mu {{\bf{1}}}^{T}({{\bf{A}}}^{l}-{\bf{I}}){({\bf{A}}-{\bf{I}})}^{-1}{\bf{1}}.$$
(59)

The latter equality transforms to:

$${{\bf{1}}}^{T}{{\bf{A}}}^{l}{({\bf{A}}-{\bf{I}})}^{-1}{\bf{1}}=\frac{N-1}{\mu }+{{\bf{1}}}^{T}{({\bf{A}}-{\bf{I}})}^{-1}{\bf{1}}.$$
(60)

We will now perform an asymptotic analysis of the latter equation by assuming that \(l\gg 1\), in which case, the asymptotic behaviour of Al is driven by the leading eigenvalue of A. Using the eigendecomposition Al = PDlP−1 gives \({{\bf{1}}}^{T}{\bf{P}}{{\bf{D}}}^{l}{{\bf{P}}}^{-1}{({\bf{A}}-{\bf{I}})}^{-1}{\bf{1}}=\frac{N-1}{\mu }+{{\bf{1}}}^{T}{({\bf{A}}-{\bf{I}})}^{-1}{\bf{1}}\), and defining aT = 1TP, b = P−1(A − I)−11 and \({\mathscr{C}}={{\bf{1}}}^{T}{({\bf{A}}-{\bf{I}})}^{-1}{\bf{1}}\) leads to

$${{\bf{a}}}^{T}{{\bf{D}}}^{l}{\bf{b}}=\frac{N-1}{\mu }+{\mathscr{C}},$$

or equivalently,

$${a}_{1}{b}_{1}{\lambda }_{1}^{l}+{a}_{2}{b}_{2}{\lambda }_{2}^{l}=\frac{N-1}{\mu }+{\mathscr{C}},$$
(61)

where the eigenvalues of the matrix A, \({\lambda }_{1,2}=\frac{1}{\mu }({\mu }_{11}\pm \sqrt{({\mu }_{20}-\mu )({\mu }_{02}-\mu )})\), are defined by the characteristic polynomial:

$$|(\frac{1}{\mu }\begin{array}{cc}{\mu }_{11} & {\mu }_{02}-\mu \\ {\mu }_{20}-\mu & {\mu }_{11}\end{array})-\lambda {\bf{I}}|=0.$$
(62)

Note that the gel criterion given by eq. (37) can also be rewritten as a determinant:

$$det\,{\bf{A}}^{\prime} =|\frac{1}{\mu }(\begin{array}{cc}{\mu }_{11} & {\mu }_{02}-\mu \\ {\mu }_{20}-\mu & {\mu }_{11}\end{array})-{\bf{I}}|\le 0.$$
(63)

Thus, at the gel point, the matrix A′ has at least one eigenvalue equal to zero. The relation of the eigenvalues of matrix A′, \({\lambda ^{\prime} }_{1}\) and \({\lambda ^{\prime} }_{2}\), to the eigenvalues of A, λ1 and λ2, is as follows: \({\lambda ^{\prime} }_{1}={\lambda }_{1}-1\) and \({\lambda ^{\prime} }_{2}={\lambda }_{2}-1\). Furthermore, above the gel transition, \(det\,{\bf{A}}^{\prime} ={\lambda ^{\prime} }_{1}{\lambda ^{\prime} }_{2} < 0\). This is the case, only if one eigenvalue, \({\lambda ^{\prime} }_{1} > 0\), is positive and the other one, \({\lambda ^{\prime} }_{2} < 0\), is negative, and therefore, the eigenvalues of A satisfy λ1 > 1, λ2 < 1 and |λ1| > |λ2|. The implications for eq. (61) are as follows: for large l, \({\lambda }_{1}^{l}\gg {\lambda }_{2}^{l}\), and consequently, the total number of nodes N contained within a topological ball of radius l features the exponential growth after the gel transition:

$$N\propto {\lambda }_{1}^{l}={e}^{l/{l}_{0}},\,{l}_{0}={(\mathrm{log}{\lambda }_{1})}^{-1},\,l\gg 1.$$
(64)