Combinatorial entropy behaviour leads to range selective binding in ligand-receptor interactions

From viruses to nanoparticles, constructs functionalized with multiple ligands display peculiar binding properties that only arise from multivalent effects. Using statistical mechanical modelling, we describe here how multivalency can be exploited to achieve what we dub range selectivity, that is, binding only to targets bearing a number of receptors within a specified range. We use our model to characterise the region in parameter space where one can expect range selective targeting to occur, and provide experimental support for this phenomenon. Overall, range selectivity represents a potential path to increase the targeting selectivity of multivalent constructs.

lines) and 1 year later (AP-PEGPDPA psomes 1 year, blue dashed lines). In panel B, "Batch reproducibility", we report the DLS characterisation of the samples used for the experiments in this work (AP-PEGPDPA psomes t=0, blue solid lines) and of a second, independent batch of ligand- and fluorophore-conjugated polymersomes immediately after sample preparation (AP-PEGPDPA independent batch, green solid lines). To show that the presence of the ligands or the fluorophore in our polymersome formulations does not affect their size, in panel C, "Batch reproducibility among pristine and functionalized psomes", we compare the number distribution of the ligand- and fluorophore-conjugated polymersomes used for the experiments in this study (AP-PEGPDPA psomes t=0, blue solid lines) with that of pristine PEG-PDPA polymersomes immediately after preparation (pristine psomes t=0, black solid line) and 1 year after preparation (pristine psomes 1 year, red solid line). No significant differences are observed among different batches or over long periods of time.

SUPPLEMENTARY METHODS III
Polydispersity and its effect on range-selectivity. To gauge the polydispersity of our sample, Transmission Electron Microscopy (TEM) was used together with a Matlab algorithm for image analysis (the Matlab script used to compute the PDI is available in the GitHub repository at https://github.com/GabrieleMarchello/PDI). Briefly, the algorithm comprises the following steps. First, the image was scanned in raster fashion and divided into small, partially overlapping patches (i.e. areas of pixels). The mean intensity of each patch was then subtracted from the intensity of that patch, in order to compensate for the uneven illumination of the image. The current implementation of the algorithm used 128x128 square patches, extracted every 2 pixels in both directions [3]. The image was then filtered with a Gaussian filter 2 pixels wide. This filter smooths out the fine details of the image, significantly reducing the amount of misleading information and making the particle identification more robust [4].
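The two preprocessing steps described above can be illustrated with a short, dependency-free Python sketch. This is not the Matlab script from the repository: it swaps the patch-based raster scan for a simpler per-pixel sliding window, and it interprets "2 pixels wide" as a Gaussian standard deviation of 2 pixels. Both choices, as well as all function names, are assumptions made for illustration only.

```python
import math

def subtract_local_mean(image, half_window):
    """Subtract from each pixel the mean of the window centred on it.

    The window is clipped at the image borders. Removing the local mean
    suppresses slowly varying background such as uneven illumination.
    """
    rows, cols = len(image), len(image[0])
    result = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        r0, r1 = max(0, i - half_window), min(rows, i + half_window + 1)
        for j in range(cols):
            c0, c1 = max(0, j - half_window), min(cols, j + half_window + 1)
            window_sum = sum(image[r][c] for r in range(r0, r1) for c in range(c0, c1))
            result[i][j] = image[i][j] - window_sum / ((r1 - r0) * (c1 - c0))
    return result

def gaussian_kernel(sigma):
    """Normalised 1D Gaussian kernel truncated at three standard deviations."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth_1d(values, kernel):
    """Convolve a 1D sequence with the kernel, clamping indices at the edges."""
    radius = len(kernel) // 2
    last = len(values) - 1
    return [
        sum(w * values[min(max(n + k - radius, 0), last)] for k, w in enumerate(kernel))
        for n in range(len(values))
    ]

def gaussian_smooth(image, sigma):
    """Separable Gaussian smoothing: filter the rows first, then the columns."""
    kernel = gaussian_kernel(sigma)
    smoothed_rows = [smooth_1d(row, kernel) for row in image]
    smoothed_cols = [smooth_1d(list(col), kernel) for col in zip(*smoothed_rows)]
    return [list(row) for row in zip(*smoothed_cols)]

# Toy image: a horizontal illumination gradient with one bright pixel.
toy = [[0.1 * j for j in range(12)] for _ in range(12)]
toy[6][6] += 5.0
flattened = subtract_local_mean(toy, half_window=3)
smoothed = gaussian_smooth(flattened, sigma=2.0)
```

In this toy case the gradient is largely removed by the first step, while the bright pixel survives as a smooth bump after filtering, which is the kind of feature an edge detector would then pick up.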
In order to simplify the particle identification step, the edges (i.e. the sharp variations of brightness) in the image were computed, creating a mask with the profiles of the imaged elements. The edges were computed by applying the Canny method [5]. At this point, the circular elements in the image were identified, saving only the elements with the two main
$$\mathrm{PDI} = (\mu/\sigma)^2 \qquad (1)$$
Representative TEM images and the size distribution of the polymersomes obtained from all the aggregate TEM data are reported in Supplementary Figure 3.
A natural question is how polydispersity affects range selectivity. Whereas the polydispersity of the system affects the exact quantitative details of the binding curve, the qualitative non-monotonic behaviour that we dubbed range selectivity is not affected. This can be deduced by calculating the average binding probability for a given number distribution $P(R)$ of the particle radius in our sample, i.e.
$$\langle \theta \rangle = \int_0^\infty \theta(R)\, P(R)\, \mathrm{d}R$$
To show the effect of including polydispersity, we plot in Supplementary Figure 5 the value of θ calculated for a monodisperse dispersion as well as for a log-normal distribution of the size in our samples (consistent with the fact that R > 0):
$$P(R) = \frac{1}{R\,\sigma\sqrt{2\pi}} \exp\left[-\frac{(\ln R - \mu)^2}{2\sigma^2}\right]$$
Note that the mean $\bar{R} = \langle R \rangle$ and variance $\bar{R}_2 = \langle R^2 \rangle - \langle R \rangle^2$ in our samples, as measured by transmission electron microscopy, can be used to derive µ and σ for the log-normal distribution as:
$$\mu = \ln\left(\frac{\bar{R}^2}{\sqrt{\bar{R}^2 + \bar{R}_2}}\right)$$
and
$$\sigma^2 = \ln\left(1 + \frac{\bar{R}_2}{\bar{R}^2}\right)$$
To provide a relevant example, we use here a mean radius of 23 nm and a root-mean-square deviation of 12 nm, as for the case of 1% functionalisation in our polymersomes. All other values characterising the polymersome and the ligand-receptor pair are taken to be exactly those of our experimental system. As can be seen, the effect of polydispersity is minimal. Moreover, the occurrence of range selectivity is not much affected even for these relatively high values of polydispersity, and its non-monotonic behaviour is robust with respect to it (see Supplementary Figure 5).

Supplementary Figure 5. Comparison of the binding probability of a mono- vs polydisperse system. In the polydisperse system, we use a mean and variance equal to those of our sample at 1% ligand loading, which gives a value of the polydispersity index of approximately 0.25, representative of that of all our samples. As can be seen, range selectivity is preserved and the effect of polydispersity is simply to slightly smooth the adsorption curve.
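As a minimal sketch of the averaging procedure, the following Python snippet derives the log-normal parameters from a measured mean and standard deviation by standard moment matching, and then averages an arbitrary function of the radius over the resulting distribution. The function names are hypothetical, and the binding probability θ(R) itself is not reproduced here: any callable can be passed in its place. The polydispersity index is assumed to be the squared ratio of standard deviation to mean.

```python
import math

def lognormal_parameters(mean, std):
    """Moment matching: log-normal (mu, sigma) reproducing a given mean and std."""
    sigma = math.sqrt(math.log(1.0 + (std / mean) ** 2))
    mu = math.log(mean) - 0.5 * sigma ** 2
    return mu, sigma

def polydisperse_average(f, mu, sigma, n_points=2001, n_sigma=8.0):
    """Average f(R) over a log-normal P(R).

    The integral is evaluated in the variable x = ln R, where the weight is a
    normal density, using the trapezoid rule on [mu - n_sigma*sigma, mu + n_sigma*sigma].
    """
    lower = mu - n_sigma * sigma
    step = 2.0 * n_sigma * sigma / (n_points - 1)
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    total = 0.0
    for k in range(n_points):
        x = lower + k * step
        edge_weight = 0.5 if k in (0, n_points - 1) else 1.0
        density = norm * math.exp(-0.5 * ((x - mu) / sigma) ** 2)
        total += edge_weight * f(math.exp(x)) * density
    return total * step

# Values quoted in the text for 1% functionalisation (radius in nm).
mu, sigma = lognormal_parameters(23.0, 12.0)
mean_radius = polydisperse_average(lambda r: r, mu, sigma)
second_moment = polydisperse_average(lambda r: r * r, mu, sigma)
pdi = (12.0 / 23.0) ** 2  # assumed definition: (std / mean)^2
print(mu, sigma, mean_radius, second_moment - mean_radius ** 2, pdi)
```

Recovering the input mean and variance from the numerical average is a useful sanity check before replacing the test functions with an actual θ(R).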

SUPPLEMENTARY METHODS III
Polymersomes adsorption on the membrane. We show in Supplementary Figure 6 a representative set of images, whose analysis was used to establish the adsorption data reported in Fig. 4a in the main text. As is also qualitatively clear from a naked-eye analysis, due to the non-monotonic nature of the adsorption probability in range-selective

We do this by assuming that every ligand whose grafting point is at a maximum distance $d = 2R_g$ from the surface, $R_g$ being the ligand gyration radius, can bind receptors (note that this value is the most probable value of the end-to-end distance in a Gaussian chain).
In practice, ligands that are much farther away than that would have to stretch too much, and their effective bond energy would become very large (see e.g. [6] for a full treatment of the distance-dependent single-bond energy); in other words, the bond would be very weak and provide a negligible contribution to binding. Similarly, ligands and receptors away from this region (or from its projection on the binding surface, $A^{\mathrm{surf}}_{\mathrm{int}}$, for the case of receptors) are barely confined between the nanoparticle and the cell surface, thus also providing a negligible contribution to the repulsive energy. Overall, this provides the following formulas for the interactive area on the nanoparticle and on the surface, $A^{\mathrm{np}}_{\mathrm{int}}$ and $A^{\mathrm{surf}}_{\mathrm{int}}$, respectively:
where $R$ and $R_g$ are the nanoparticle radius and the gyration radius of the ligand, respectively, please see Supplementary

Similarly to what has been done in [7], we can start by assuming that any particle whose distance to the surface is below a certain value $z_0$, defined as the value at which bonds can form, can be considered as bound. As we shall see, however, the exact choice of this distance is not important. Let us call $\Delta F(z)$ the total free energy as a function of the distance $z$ between the nanoparticle and the surface, measured with respect to a reference state where no bonds are possible (i.e., the particle is in the bulk, far away from the binding surface). The bound partition function (q in Eq.(??) in the main text) can be defined as [7]:
where $A_{\mathrm{site}} = \pi R^2$ is the area occupied by a single particle and excluded to others, $R$ being the nanoparticle radius. If we now look at the form of $\Delta F(z)$, it has a minimum in a region of width $\approx R_g$ around $z \approx R_g$, which allows us to use a saddle-point approximation to calculate the integral in Eq. (8).
The minimum is expected to be at around $R_g$ for the following reason. For $z \ll R_g$, both the ligands and the receptors are highly compressed (against the binding surface or the nanoparticle's brush, respectively), which quickly increases the repulsive contribution $F_{\mathrm{rep}}$ and thus $F_{\mathrm{tot}}$. For $z \gg R_g$, the repulsive contribution quickly drops to zero, but ligands must stretch considerably to bind the receptors, leading to very weak bonds and in turn to a binding free energy that is small in magnitude. For this reason, the minimum will be around $z = R_g$, where $F_{\mathrm{rep}} \approx 0$ but $F_{\mathrm{att}}$ gives a sizeable (negative) contribution. We note that we are not the first to use this approximation, see e.g. [7], which gives good agreement with detailed molecular simulations.

SUPPLEMENTARY NOTES III
A mean-field approximation to the radial binding scenario. Eq. 7 in the main text provides a mean-field approximation to the radial binding scenario. This approximation can be arrived at in the following way. Consider $N_L$ ligands, each of which can bind $N_R$ receptors with a single-bond energy $\Delta G$. In a mean-field approximation, ligands are considered independent from each other, and thus the total partition function $Q$ can be written simply as the product of the single-ligand partition functions $q$, i.e.:
$$Q = q^{N_L} = \left(1 + N_R\, e^{-\beta \Delta G}\right)^{N_L}$$
where the term 1 in $q$ corresponds to the ligand in the unbound state and the second term corresponds to each of the $N_R$ states where the ligand is bound to one of the $N_R$ different receptors available. One thus arrives at the following approximation for the attractive part of the free energy:
$$\beta F_{\mathrm{att}} = -\ln Q = -N_L \ln\left(1 + N_R\, e^{-\beta \Delta G}\right) \qquad (11)$$
Note that some authors define $Q$ as the sum over all states where at least a single bond is present, and thus subtract 1 from $Q$ to remove the contribution from the state where no bonds at all are present [8]. Here we instead define the partition function so that it is the attractive free energy $F_{\mathrm{att}}$, rather than $Q$, that goes to zero when no bonds can be made (formally, when $\Delta G \to +\infty$) [9].
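A short Python sketch of the mean-field expression, assuming the single-ligand partition function q = 1 + N_R exp(−β∆G) described above, makes the limiting behaviour explicit. The function name and the numerical values are illustrative only.

```python
import math

def beta_f_att(n_ligands, n_receptors, beta_delta_g):
    """Mean-field attractive free energy in units of k_B T.

    Each ligand contributes -ln(q), with q = 1 + N_R * exp(-beta * dG);
    log1p keeps the result accurate when the bound-state weight is tiny.
    """
    return -n_ligands * math.log1p(n_receptors * math.exp(-beta_delta_g))

# Illustrative values: 10 ligands, 100 receptors.
for beta_dg in (-2.0, 2.0, 10.0, 50.0):
    print(beta_dg, beta_f_att(10, 100, beta_dg))
```

As the bond energy becomes large and positive, the attractive free energy tends to zero, in line with the convention adopted here for the partition function. Swapping the roles of the two arguments gives the symmetric formula obtained when the receptors, rather than the ligands, are treated as independent.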
By assuming a mean-field approximation for the receptors, i.e. assuming they are independent from each other, one obtains the symmetric formula in which $N_L$ is substituted with $N_R$, and vice versa. The mean-field approximation works well in the regime $N_L \ll N_R$ (or vice versa, when receptors are considered) because in this regime the ligands are almost independent. For a more in-depth and general discussion we refer the reader to Refs. [6,9]; here we consider for simplicity the case of two ligands, each of which can bind the same $N_R$ receptors. Let us consider the conditional probabilities $p_{1,\bar{2}}$ and $p_{1,2}$ that ligand 1 is bound given that ligand 2 is not, and that ligand 1 is bound given that ligand 2 also is, respectively. Using Bayes' theorem, we can write these as:
where we defined $\chi = \exp(-\beta \Delta G)$, and $Q_{1\&\bar{2}}$ and $Q_{1\&2}$ are the partial partition functions summing over all contributions where ligand 1 is bound and 2 is not, or where both are bound, respectively. For independent ligands, the two conditional probabilities must be the same; in other words, what happens to ligand 2 is irrelevant for the state of ligand 1. If we take the limit of a large number of receptors and use a Taylor expansion in the variable $\alpha = N_R^{-1}$, we find that both probabilities are the same to first order in $\alpha$, $p_{1,\bar{2}} \approx p_{1,2} \approx 1 - \chi\alpha$; hence ligands behave independently, as suggested, and the mean-field approximation becomes a good one.
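The approach to independence can be checked numerically with a small Python sketch that enumerates all states of the two-ligand system. It assumes that each receptor can host at most one ligand and that every bond carries a Boltzmann weight χ; the function name is hypothetical and the enumeration is only meant as an illustration of the argument.

```python
from itertools import product

def conditional_binding_probabilities(n_receptors, chi):
    """Return (P[1 bound | 2 unbound], P[1 bound | 2 bound]) by enumeration.

    Each ligand is either unbound (None) or bound to one receptor index.
    States where both ligands sit on the same receptor are excluded.
    """
    options = [None] + list(range(n_receptors))
    partial = {(b1, b2): 0.0 for b1 in (False, True) for b2 in (False, True)}
    for s1, s2 in product(options, repeat=2):
        if s1 is not None and s1 == s2:
            continue  # assumed: a receptor cannot bind two ligands at once
        bound1, bound2 = s1 is not None, s2 is not None
        partial[(bound1, bound2)] += chi ** (bound1 + bound2)
    p_given_unbound = partial[(True, False)] / (partial[(True, False)] + partial[(False, False)])
    p_given_bound = partial[(True, True)] / (partial[(True, True)] + partial[(False, True)])
    return p_given_unbound, p_given_bound

for n_r in (2, 10, 100, 1000):
    p_unbound, p_bound = conditional_binding_probabilities(n_r, chi=1.0)
    print(n_r, p_unbound, p_bound, p_unbound - p_bound)
```

Under these assumptions the gap between the two conditional probabilities shrinks rapidly as the number of receptors grows, which is the sense in which the ligands become effectively independent.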

SUPPLEMENTARY NOTES IV
Super-selectivity parameter in our system. To evaluate how sharply binding responds to a variation in the number of receptors, we report in Supplementary Figure 8 the equivalent of the curves in Fig. 2 of the main text as a log-log plot. Note that the derivative of this curve is the so-called super-selectivity parameter α [8]. However, because of the non-monotonic behaviour of the bound partition function and of the corresponding binding probability, the typical interpretation of α used for standard (i.e., monotonic) multivalent adsorbing systems is not valid. More precisely, α > 1 does not necessarily mean a superlinear response, nor can α be interpreted as the Hill exponent to be used to fit multivalent binding adsorption curves.
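To illustrate how α is obtained as a log-log derivative, and why its sign changes in a non-monotonic system, the following Python sketch uses a toy model rather than the expressions of the main text: a mean-field attraction, a repulsion linear in the number of receptors, and a Langmuir-like binding probability. The model form, the parameter values and the function names are all assumptions chosen only to produce a non-monotonic curve.

```python
import math

def log_theta(n_receptors, n_ligands=20, chi=0.1, repulsion=0.2, log_activity=math.log(1e-9)):
    """ln(theta) for a toy model: theta = a*q / (1 + a*q).

    ln q = N_L * ln(1 + chi * N_R) - repulsion * N_R  (assumed toy form).
    """
    log_q = n_ligands * math.log1p(chi * n_receptors) - repulsion * n_receptors
    x = log_activity + log_q
    return x - math.log1p(math.exp(x))

def alpha(n_receptors, rel_step=1e-4):
    """Super-selectivity parameter: d ln(theta) / d ln(N_R), central difference."""
    upper = n_receptors * (1.0 + rel_step)
    lower = n_receptors * (1.0 - rel_step)
    return (log_theta(upper) - log_theta(lower)) / (math.log(upper) - math.log(lower))

for n_r in (10.0, 50.0, 90.0, 150.0, 300.0):
    print(n_r, math.exp(log_theta(n_r)), alpha(n_r))
```

In this toy model α is well above 1 on the rising side of the curve, vanishes at the maximum of the bound partition function and becomes negative beyond it, so a single Hill-like exponent cannot describe the whole curve.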

SUPPLEMENTARY NOTES V
Estimating the kinetics for binding from the brush. To give a rough estimate of the binding kinetics and of how they are influenced by the brush, we use a simple Debye-Smoluchowski description to determine the average time at which a nanoparticle reaches the surface of a cell, taken to be the average distance at which a ligand-receptor bond is made [10]. As we are only interested in an order-of-magnitude estimate, we treat both the cell and the nanoparticles as floating spheres in solution with radii $R = R_{\mathrm{np}} + h$ ($R_{\mathrm{np}}$ being the bare nanoparticle size and $h$ the average height of its protective polymer brush) and $R_{\mathrm{cell}}$, respectively. The interaction (free-)energy is constant (and set to zero) as long as the surface-to-surface distance is larger than $R + R_{\mathrm{cell}}$, and equal to $F_{\mathrm{rep}}$, the total repulsive free energy due to receptors and ligands (see Eq. 4-6 in the main text), for a distance $R_{\mathrm{np}} + R_g < d \ll R$, at which the contact is considered to occur. Note that in this way we are overestimating the repulsion, since we assume that its value is constant in this region, whereas in reality it increases continuously from 0 to this value. As a result, we expect to obtain a timescale larger than the actual one.
Under the previous assumptions, the average time it takes two particles under this interaction free energy to meet is [10]:
where $z$ is the concentration of the nanoparticles in solution. Within this description, we can interpret $\tau_\infty$ as the time for the cell to get in contact with the outer part of the nanoparticle's brush and $\tau_b$ as the time required to compress it by the amount necessary to reach the ligand buried below it. Furthermore, the effective diffusion coefficient
where $D_{\mathrm{np}}$ and $D_{\mathrm{cell}}$ are the diffusion coefficients of the nanoparticle and the cell, respectively, and we further use Einstein's relation to calculate their value in water, i.e.:
$\eta \approx 8.9 \cdot 10^{-4}$ Pa s being the viscosity of water. Considering that the size of a cell is about 3 orders of magnitude larger than any of the characteristic length scales describing the polymersomes, i.e. $R_{\mathrm{cell}} \gg R_{\mathrm{np}}, h, R_g$, we can further simplify $D \approx D_{\mathrm{np}}$ and $R_{\mathrm{cell}} + R_{\mathrm{np}} + R_g \approx R_{\mathrm{cell}}$, thus giving:
which, substituting the values for our system, $h = 8$ nm, $R_g = 3$ nm, $F_{\mathrm{rep}} = 9.6$ (corresponding to the maximum value, achieved with the highest ligand loading) and $R_{\mathrm{cell}} \approx 1$ µm, gives an impingement time of $10^{-1}$ s, much shorter than the timescale of our experiments (1 h). For our system and within our assumptions, $\tau$ is dominated by $\tau_b$ (compared to $\tau_\infty \approx 10^{-2}$ s), and overcoming the repulsive barrier is the rate-limiting step as long as $F_{\mathrm{rep}} > 5 k_B T$. Generally speaking, one should expect kinetic considerations to come into play if the adsorption measurements are taken at a time $t_{\mathrm{exp}} < \tau$.
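The simplification D ≈ D_np can be checked with a short Python sketch based on the Stokes-Einstein relation D = k_B T / (6πηR) for a sphere of radius R, with the effective diffusion coefficient taken as the sum of the two. The temperature of 298 K and the bare nanoparticle radius of 23 nm (the mean radius quoted in the polydispersity analysis) are assumptions made for this illustration; the viscosity, brush height and cell radius are those given in the text.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K

def stokes_einstein(radius_m, temperature_k=298.0, viscosity_pa_s=8.9e-4):
    """Diffusion coefficient (m^2/s) of a sphere of given radius in a viscous fluid."""
    return K_B * temperature_k / (6.0 * math.pi * viscosity_pa_s * radius_m)

# Assumed bare radius 23 nm plus brush height h = 8 nm; cell radius 1 micron.
d_np = stokes_einstein(23e-9 + 8e-9)
d_cell = stokes_einstein(1e-6)
d_effective = d_np + d_cell

print(d_np, d_cell, d_cell / d_np, d_effective / d_np)
```

With these values the cell contributes only a few percent to the effective diffusion coefficient, which supports neglecting it in an order-of-magnitude estimate.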

SUPPLEMENTARY NOTES VI
Deriving an optimal number of ligands for multivalent constructs to display maximum binding strength. To derive Equation 8 in the main text, we work in the regime where $N_R \ll N_L$, in which the attractive part of the free energy grows logarithmically with the number of ligands and we expect the total adsorption energy $F_{\mathrm{tot}}$ to start to increase after reaching its minimum, which corresponds to the maximum binding strength. In this case, the total binding energy as a function of the number of ligands and receptors is, up to terms that do not depend on $N_L$ and assuming for the ligands a linear repulsive contribution $B N_L$:
$$\beta F_{\mathrm{tot}} = -N_R \ln\left(1 + \chi N_L\right) + B N_L$$
from which the minimum, corresponding to the maximum binding strength, can be easily calculated by imposing
$$\left.\frac{\partial \beta F_{\mathrm{tot}}}{\partial N_L}\right|_{N_L = N_L^{\mathrm{optimal}}} = 0$$
leading to:
$$N_L^{\mathrm{optimal}} = \frac{N_R}{B} - \frac{1}{\chi}$$
where again $\chi = \exp(-\beta \Delta G)$ can be interpreted as the bond strength, or simply as a rescaled version of the ligand-receptor binding constant, since $\chi = K_{\mathrm{bind}}/v_0$, $v_0$ being the ligand-receptor binding volume (more precisely, a measure of the change in the molecular partition function of the bound ligand-receptor pair [6]). A few interesting predictions can be made based on this formula. The first is that for weak enough bonds ($\chi \to 0$), the value of $N_L^{\mathrm{optimal}}$ can be negative. Since for $N_L > N_L^{\mathrm{optimal}}$ the derivative of $F_{\mathrm{tot}}$ with respect to $N_L$ is positive, in this case the total binding energy is a monotonically increasing function of the number of ligands for any physical value, hence adding ligands would only decrease the overall binding strength. The same expression also shows that even for very strong ligands ($\chi \to \infty$), and as long as $N_L > N_R$, the optimal number of ligands is still finite and depends on the number of receptors as well as on the strength of the repulsion, i.e. $N_L^{\mathrm{optimal}} = N_R/B$.
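The predictions above can be checked numerically with a minimal Python sketch. It assumes that the part of the total free energy that depends on the number of ligands takes the form βF_tot = −N_R ln(1 + χN_L) + B N_L, which is our reading of the regime described above and yields N_L^optimal = N_R/B − 1/χ; the function names and parameter values are illustrative only.

```python
import math

def beta_f_tot(n_ligands, n_receptors, chi, b):
    """Assumed N_L-dependent part of the total free energy, in units of k_B T."""
    return -n_receptors * math.log1p(chi * n_ligands) + b * n_ligands

def optimal_ligands(n_receptors, chi, b):
    """Stationary point of beta_f_tot with respect to N_L."""
    return n_receptors / b - 1.0 / chi

# Strong-ish bonds: the closed form matches a brute-force scan over integers.
n_r, chi, b = 50, 1.0, 0.5
closed_form = optimal_ligands(n_r, chi, b)
brute_force = min(range(0, 1000), key=lambda n: beta_f_tot(n, n_r, chi, b))
print(closed_form, brute_force)

# Weak bonds: the optimum is negative, so the free energy only increases with N_L.
print(optimal_ligands(n_r, 1e-3, b))

# Very strong bonds: the optimum approaches N_R / B.
print(optimal_ligands(n_r, 1e9, b), n_r / b)
```

The scan confirms that, under the assumed free-energy form, binding is strongest at a finite number of ligands, and that for weak bonds no physical optimum exists.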