Size conservation emerges spontaneously in biomolecular condensates formed by scaffolds and surfactant clients

Biomolecular condensates are liquid-like membraneless compartments that contribute to the spatiotemporal organization of proteins, RNA, and other biomolecules inside cells. Some membraneless compartments, such as nucleoli, are dispersed as separate condensates that do not grow beyond a certain size or do not coalesce over time. In this work, using a minimal protein model, we show that phase separation of binary mixtures of scaffolds and low-valency clients that can act as surfactants (i.e., that significantly reduce the droplet surface tension) can yield either a single drop or multiple droplets that conserve their sizes on long timescales (herein the 'multidroplet size-conserved' scenario), depending on the scaffold-to-client ratio. Our simulations demonstrate that protein connectivity and condensate surface tension regulate the balance between these two scenarios. The multidroplet size-conserved scenario spontaneously arises at increasing surfactant-to-scaffold concentrations, when the interfacial penalty for creating small liquid droplets is sufficiently reduced by the surfactant proteins that are preferentially located at the interface. In contrast, low surfactant-to-scaffold concentrations enable continuous growth and fusion of droplets without restrictions. Overall, our work proposes one thermodynamic mechanism to help rationalize how size-conserved coexisting condensates can persist inside cells, shedding light on the roles of protein connectivity, binding affinity, and droplet composition in this process.


Scientific Reports | (2021) 11:15241 | https://doi.org/10.1038/s41598-021-94309-y | www.nature.com/scientificreports/

[...] recruitment of clients can significantly alter the stability and structural properties of condensates 22-24. Therefore, together, scaffolds and clients shape the biophysical properties of the biomolecular condensates that play fundamental roles within the cell 25-28. A striking observation is the presence of condensates inside cells that do not grow beyond a certain size 7,29-31. Basic thermodynamics suggests that, over time, LLPS should result in the formation of a single large condensate rather than multiple coexisting small droplets 32. The latter case is disfavored because, cumulatively, it yields a high surface-area-to-volume ratio (i.e., a high interfacial free energy penalty), while the former ensures that the interfacial free energy penalty is minimized. Despite the thermodynamic preference for single condensates over multidroplet systems, both types of architectures are present within cells. Indeed, single large condensates have been observed 26,33, but diverse coexisting size-restricted droplets have also been reported in different in vivo systems, such as in the amphibian oocyte nucleolus 7, in the Saccharomyces cerevisiae cytoplasmic processing bodies 34, and elsewhere 35-37. Additionally, size-conserved multidroplet architectures have been found in noncoalescing ribonucleoprotein condensates 26 and in multiphase complex coacervates in vitro 38. However, the underlying molecular mechanisms and biophysical driving forces behind the formation of multiple coexisting droplets, also known as emulsification 39, require further investigation.
Several explanations for why or how emulsions can be thermodynamically stable at biologically relevant timescales are currently under debate 40. One possible mechanism is that the presence of active ATP-dependent processes might regulate the conditions under which droplets grow and coalesce 41-44. Other studies suggest that proteins with various highly distinct interacting domains may form micelle-like condensates 31,45,46. In multicomponent mixtures, another possibility could be that specific binding proteins act as powerful surfactants and, thus, reduce the droplet surface tension penalty, leading to multicondensate coexistence 35,36,38,47. Moreover, a recent alternative explanation suggests that the interplay between protein diffusion and saturation of protein binding sites can also induce size conservation in condensates 29,30. It is plausible that all these different mechanisms contribute to the size conservation of condensates inside cells under different conditions.
In this work, we use a minimal protein model, which recapitulates the experimentally observed relationship between protein valency and critical parameters 23,48,49, to investigate the regulation of droplet size in binary mixtures of multivalent proteins (scaffolds and clients). We show that liquid-liquid phase separation of scaffold and client mixtures, where clients act as surfactants, can give rise to single droplets or multiple size-conserved droplets (Fig. 1a). Further, we reveal that the transition between the two scenarios can be regulated by the condensate scaffold/surfactant client ratio. Our simulations suggest how general molecular features such as protein connectivity, binding affinity, and droplet composition can critically modulate and stabilize the formation of size-conserved condensates 29,30, and might also have implications for understanding the phase behavior of multilayered condensates 23,38,50.

Results
A minimal protein model for scaffold and client mixtures. Coarse-grained potentials have emerged as powerful tools for describing the phase behavior of biomolecules, such as proteins and nucleic acids, and delineating the underlying physicochemical features that drive LLPS 51,52. Various levels of molecular resolution can be achieved with coarse-grained models, encompassing mean-field descriptions 53,54, lattice-based approaches 55,56, minimal models 23,48,49,56-60, and sequence-dependent representations 46,60-65. Here, we employ our minimal protein model 48, which has been previously applied to unveil the role of protein multivalency in multicomponent condensates 49 and multilayered condensate organization 66, as well as to investigate the role of RNA in RNA-binding protein nucleation and stability 67. In this model, proteins are described by a pseudo hard-sphere potential 68 that accounts for their excluded volume, and by short-range potentials that model the different protein binding sites and thereby mimic protein multivalency 48 (Fig. 1b). For computational efficiency, an implicit solvent is used; therefore, the condensed phase corresponds to a liquid phase, and the diluted phase to a vapor. In what follows, the unit of distance is $\sigma$, the molecular diameter of the proteins (both scaffolds and clients), and the unit of energy is $k_B T$ (for further details on the model parameters and the employed reduced units see the "Methods" section).
Following the framework of Banani et al. 20, scaffolds are defined as proteins that can establish both homotypic interactions and heterotypic interactions with clients, while clients (hereafter called surfactants) are defined as proteins that can bind only to scaffolds (i.e., they do not bind to other clients, except where otherwise stated). Within this scheme, phase separation is driven by scaffolds (high-valency proteins), whereas surfactants (proteins with lower valency) are recruited to condensates at the expense of depleting LLPS-stabilizing scaffold-scaffold interactions. We model scaffolds as particles with four binding sites and surfactants as particles with three binding sites (Fig. 1b). This choice fulfills two important requirements: (1) it ensures that scaffolds have a higher valency than the clients, and (2) it allows us to easily establish a common simulation temperature for the system at which each type of patchy particle distinctly behaves as either a scaffold or a surfactant client. The two-phase coexisting densities as a function of temperature for our model of scaffold proteins (blue), a 50:50 binary mixture of scaffolds and surfactants (black), and a system composed of just surfactants (red) are depicted in Fig. 1c (Top panel). Note that while surfactant proteins do not interact homotypically in the 50:50 mixture, they do in the pure surfactant system; the latter allows us to compare the effect of protein valency on the critical parameters of pure self-interacting protein systems. These simulations show that the addition of surfactant proteins that are strong competitors for the scaffold binding sites significantly hinders the ability of scaffolds to phase separate (i.e., clients lower the critical temperature) 20,49. Moreover, the presence of surfactant clients drastically reduces the surface tension of the condensates (black curve), as shown in Fig. 1c (Bottom panel).
In the following section, we elucidate the implications of the client-induced surface tension reduction on the behavior of phase-separated condensates.

[...] 68-71: (1) for the pure scaffold system, and (2) for the 50:50 binary mixture of scaffolds and surfactants shown in Fig. 1c. At a constant temperature of $T^*/T^*_c = 0.75$ and a global (system) density of $\rho^* = 0.136$ (both in reduced units; see the "Methods" section for further details on the employed reduced magnitudes), we create three different simulation box geometries with the dimensions summarized in Table 1. Using this approach, we can effectively modulate the surface/volume ratio of the condensed phase (hereafter called droplet):

$$\frac{S}{V} = \frac{2 L_y L_z}{L_y L_z L_{x,\mathrm{slab}}} = \frac{2}{L_{x,\mathrm{slab}}},$$

where the droplet surface is $S = 2 L_y L_z$ and its volume is $V = L_y L_z L_{x,\mathrm{slab}}$, with $L_{x,\mathrm{slab}}$ representing the width of the condensate in the x direction, and $L_x$, $L_y$ and $L_z$ being the sides of the simulation box. Figure 2a-c summarizes the phase behavior of the pure scaffold system (Top panels) and the 50:50 binary mixture (Middle panels) across the three designed simulation box geometries. The pure scaffold condensate exhibits distinct surface/volume ratios depending on the box geometry (see Table 1), while maintaining the same droplet density in all three cases. In other words, the scaffold condensate would be able to grow continuously as a single droplet at the expense of the diluted phase until reaching equilibrium. In contrast, the scaffold-surfactant mixture yields various coexisting equilibrium droplets with a roughly constant surface/volume ratio of 0.21(2) $\sigma^{-1}$ in all systems [...] 35,36.

Figure 1. [...] (b) Minimal coarse-grained model for protein LLPS: blue and red spheres represent the excluded volume of scaffold and surfactant (client) proteins, respectively, while gray patches represent the binding sites of the proteins. Two different proteins are modeled: scaffold proteins, with 4 promiscuous binding sites in a tetrahedral arrangement, and surfactant proteins, with 3 binding sites in a planar equidistant arrangement that can only bind to scaffold binding sites (except where otherwise stated). Details on the model parameters are provided in the "Methods" section. (c) Top: Phase diagram in the temperature-density plane for a scaffold protein system (blue), a 50:50 scaffold-surfactant mixture (black), and a hypothetical surfactant system in which client proteins can self-interact (red). The same self-interacting potential employed for scaffold proteins (blue) is also applied to surfactant proteins that can hypothetically self-interact (red). This system serves to further illustrate the effect of multivalency in LLPS. Filled circles indicate the estimated coexisting densities from Direct Coexistence simulations 68-71, and empty ones depict the critical points calculated via the law of rectilinear diameters and critical exponents 72 (see Supporting Information for details on these calculations). Bottom: Surface tension ($\gamma$) dependence on temperature for the scaffold system (blue), the 50:50 scaffold-surfactant mixture (black), and the hypothetical system in which surfactant proteins can self-interact (red). Filled squares represent direct estimations of $\gamma$, continuous curves depict fits to our data of the form $\gamma \propto (T^*_c - T^*)^{1.26}$, and empty squares show the critical temperature of each system evaluated through the law of rectilinear diameters and critical exponents 72. The vertical dashed line indicates the temperature at which the remainder of our study is performed. Note that temperature (in reduced units, $T^*$) is renormalized by the critical temperature $T^*_c$ (also in reduced units) of the scaffold protein system ($T^*_c = 0.12$).
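The slab geometry used above makes the surface/volume ratio depend only on the slab width, since a slab that spans the box in y and z has two flat interfaces of area L_y L_z each. A minimal Python sketch of that relation follows; the function names are illustrative and not part of the original work.

```python
def slab_surface_to_volume(l_x_slab: float) -> float:
    """Surface/volume ratio of a slab that spans the box in y and z.

    S = 2 * L_y * L_z (two flat interfaces) and V = L_y * L_z * L_x_slab,
    so the cross-section cancels and S/V = 2 / L_x_slab.
    Lengths are in units of sigma; the result is in sigma^-1.
    """
    if l_x_slab <= 0:
        raise ValueError("slab width must be positive")
    return 2.0 / l_x_slab


def slab_width_from_ratio(s_over_v: float) -> float:
    """Inverse relation: slab width (in sigma) for a given S/V (in sigma^-1)."""
    if s_over_v <= 0:
        raise ValueError("surface/volume ratio must be positive")
    return 2.0 / s_over_v


if __name__ == "__main__":
    # Slab width implied by the reported S/V of about 0.21 sigma^-1.
    print(slab_width_from_ratio(0.21))
```

Under this geometric assumption, the reported ratio of about 0.21 $\sigma^{-1}$ corresponds to slabs roughly 9.5 $\sigma$ wide.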
We next analyze the composition of the different coexisting scaffold-surfactant condensates across the distinct box geometries [Fig. 2 (Bottom panels)]. In all cases, including the pure scaffold condensate, the properties of the droplets are remarkably similar; i.e., the density of all droplets, as well as their composition and surface tension, are roughly constant (Table 2). Notably, we find that the surfactant density profile becomes higher than that of scaffolds at the droplet interface, showing that the partition coefficient of surfactants is greater than that of scaffolds in the outer region. Previous works suggest that accumulation of surfactants at the interface is preferable as it minimizes the condensate surface tension 35,38,66. We also verify that the presence of multiple coexisting droplets is the thermodynamically stable state, rather than just a metastable one, by simulating over sufficiently long timescales to allow for multiple droplet fusion events and variations in droplet composition. Importantly, these tests reveal that, even when in contact, the droplets coexist without coalescing or altering their equilibrium composition. The multidroplet behavior of size-conserved condensates (in our case with a surface/volume ratio of $\sim 0.21\,\sigma^{-1}$) is a consequence of the thermodynamic conditions of our system (i.e., mixture composition, temperature, and density). Note that droplet curvature effects such as the Laplace internal pressure 80 or the dependence of surface tension on droplet curvature 81 have not been considered in our simulations, since we do not expect them to play an important role at biologically relevant droplet size scales ($\mathcal{O}(\mu\mathrm{m})$) 40. Those effects are only expected to be dominant at the nanometer scale (i.e., up to droplet radii of tens of nanometers) 81-84.
The presence of surfactant clients within the condensate substantially lowers the liquid network connectivity 49,85,86 and, therefore, reduces the enthalpic gain sustaining LLPS. Consequently, the system minimizes its free energy by optimizing the number of surfactants that are incorporated into the condensed phase; i.e., by creating higher surface/volume ratios, where surfactants are preferentially located towards the interface rather than in the core. Such free energy optimization yields multiple coexisting condensates of a certain size, rather than a single-condensate system. Moreover, the emergence of multiple coexisting droplets stabilised by surfactant [...]

Table 1. Simulation box dimensions and condensate surface/volume ratios (S/V) of the three box geometries represented in Fig. 2 for the pure scaffold system and the 50:50 binary mixture of scaffold and surfactant proteins. Geometries (a), (b) and (c) account for the (a), (b) and (c) panels shown in Fig. 2. [...]

Figure 2. Direct Coexistence simulations for a scaffold protein system (Top panels) and a 50:50 binary mixture of scaffold and surfactant proteins (Middle panels) with different simulation box geometries (see Table 1 for [...]

Surfactant concentration critically modulates droplet size.
As discussed in the previous section, our minimal protein model shows that for a given composition of scaffold and surfactant clients, independent of the imposed box geometry, droplets can only grow to a certain size. This size-restricted growth is, in turn, determined by the optimal surface/volume ratio that minimizes the free energy of the coexisting liquid phases. Therefore, larger system sizes lead to a higher number of coexisting size-conserved droplets. On the other hand, when the condensate is composed only of scaffold proteins, the condensate simply grows as the system size increases, instead of yielding new size-conserved droplets (Fig. 2). These results illustrate how a simple model for scaffold and surfactant proteins, merely controlled by protein valency and binding affinity, can recapitulate mesoscale features of in vivo and in vitro condensates that exhibit size-conserved growth 7,26,34-38.
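The link between system size and droplet number can be sketched for the slab geometry of the Direct Coexistence simulations. Assume, purely for illustration, that the condensed phase splits into $n$ identical slabs whose widths add up to a total condensed width $W$ (both $n$ and $W$ are symbols introduced here, not quantities reported in the text). Each slab contributes two interfaces of area $L_y L_z$, so

$$\frac{S}{V} = \frac{2 n L_y L_z}{L_y L_z W} = \frac{2n}{W} \quad\Longrightarrow\quad n \approx \frac{W}{2}\left(\frac{S}{V}\right)_{\mathrm{eq}}.$$

At a fixed equilibrium ratio $(S/V)_{\mathrm{eq}}$, the number of droplets therefore grows in proportion to the amount of condensed material. For instance, with $(S/V)_{\mathrm{eq}} \approx 0.21\,\sigma^{-1}$, a hypothetical total condensed width of $W \approx 19\,\sigma$ would correspond to $n \approx 2$.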
Since this phase behavior only arises when the concentration of surfactants is not negligible, we now investigate how condensates can switch between both scenarios, and how their surface/volume ratio is modulated by their relative scaffold-surfactant composition. By gradually increasing the surfactant client concentration of the scaffold-surfactant mixture (at a constant temperature and system density), the condensate size progressively decreases to accommodate the equilibrium droplet surface/volume ratio to the simulation box geometry; i.e., the condensate splits into two and, subsequently, into three coexisting liquid droplets (see Fig. 3). We note that the composition of the different coexisting droplets at a given surfactant concentration is remarkably similar, highlighting that all droplets are in equilibrium. In parallel, we evaluate the surface tension of the different coexisting droplets as a function of surfactant concentration. We find that $\gamma$ decreases monotonically (but not linearly) as the surfactant concentration increases (Fig. 3). This result is not surprising (see Fig. 1c) given that one of the key molecular driving forces behind size-conserved multidroplet formation is the reduction of $\gamma$ by surfactants coating the droplet surface. Above a certain surfactant concentration (exceeding 65% for our system at the given conditions), LLPS is inhibited. Beyond this limit, the condensate liquid network connectivity sustained by scaffold proteins can no longer compensate for the mixing entropy of the system. We also analyze the surface/volume ratio of the droplets as a function of surfactant concentration. At vanishingly low surfactant concentration, the condensate displays a ratio of $\sim 0.09\,\sigma^{-1}$. However, such a ratio is fully determined by the total number of proteins in the system and the box geometry since, as shown in Fig. 2, scaffold condensates can reach any droplet size when surfactants are absent.
At low surfactant concentrations (i.e., below 27.5% surfactant), the maximum droplet size corresponds to a surface/volume ratio of $\sim 0.11\,\sigma^{-1}$. Beyond that threshold concentration, the condensate shrinks and, to achieve the equilibrium surface/volume ratio, it splits into smaller coexisting droplets. The maximum equilibrium droplet size in the two-droplet regime corresponds to ratios of $\sim 0.19\,\sigma^{-1}$. Finally, for surfactant compositions higher than 38%, three coexisting condensates emerge. The maximum surface/volume ratio that droplets can achieve is $\sim 0.27\,\sigma^{-1}$ at 60% client composition, which is only possible due to the extreme reduction in the surface tension ($\gamma = 0.15\,k_B T\,\sigma^{-2}$), more than one order of magnitude lower than that of the pure scaffold condensate ($\gamma = 1.58\,k_B T\,\sigma^{-2}$) at the same temperature and system density.
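The regimes reported above can be summarized as a small lookup, shown here as a Python sketch. It only restates the thresholds quoted in the text for this particular system, box, temperature, and density; the function name and the treatment of values that fall exactly on a threshold are assumptions.

```python
# Thresholds quoted in the text (percent surfactant in the mixture).
TWO_DROPLET_ONSET = 27.5    # below this: single droplet
THREE_DROPLET_ONSET = 38.0  # above this: three droplets
LLPS_LIMIT = 65.0           # above this: LLPS is inhibited

# Surface tensions quoted in the text, in k_B T / sigma^2.
GAMMA_PURE_SCAFFOLD = 1.58
GAMMA_60_PERCENT_CLIENT = 0.15


def droplet_regime(surfactant_percent: float) -> str:
    """Return the regime reported for a given surfactant percentage.

    Assumption: values exactly on a threshold are assigned to the
    regime that begins at that threshold.
    """
    if not 0.0 <= surfactant_percent <= 100.0:
        raise ValueError("percentage must lie in [0, 100]")
    if surfactant_percent > LLPS_LIMIT:
        return "no LLPS"
    if surfactant_percent > THREE_DROPLET_ONSET:
        return "three droplets"
    if surfactant_percent >= TWO_DROPLET_ONSET:
        return "two droplets"
    return "single droplet"


if __name__ == "__main__":
    for pct in (10.0, 30.0, 50.0, 70.0):
        print(pct, droplet_regime(pct))
    # Ratio of the two quoted surface tensions (slightly above 10).
    print(GAMMA_PURE_SCAFFOLD / GAMMA_60_PERCENT_CLIENT)
```

The number of droplets in each regime reflects the simulated box size; in a larger system the same equilibrium surface/volume ratio would be met by more droplets.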
Table 2. Properties of systems presented in Fig. 2 containing 50:50 scaffold-surfactant compositions in different simulation box geometries. In all cases, 6 independent simulations (with different initial velocity distributions), each starting from a pre-equilibrated configuration, were performed. For the geometries with more than one droplet, the values are averaged over the different droplets, although the variance between distinct droplets is very small, as shown in Fig. 2 (Bottom panels). [...]

Previous studies have highlighted the challenges associated with measuring condensate surface tensions due to the small size of protein droplets 3,87. Nonetheless, there are available estimates of this magnitude for ribonucleoprotein condensates, and these measurements demonstrate that surfactant proteins can reduce $\gamma$ by orders of magnitude 26. With our minimal model, we qualitatively observe such behavior when surfactant proteins are recruited to the condensate, giving rise to emulsions of multiple coexisting droplets with very low surface tension. Surfactant proteins can lead to the formation of multidroplet emulsions by inducing multilayered condensate architectures 26,38. Diverse biomolecular organelles, either in the cell nucleus, such as the nucleoli 26 or nuclear speckles 77, or in the cytosol, such as stress granules 75,76, exhibit this type of organization. Moreover, different in vitro complex coacervates 38,50,73,74, bioengineered ribonucleoprotein condensates in living cells 88, and mixtures of RNA-binding proteins and RNA molecules 78,79 are also known to show multilayered assemblies. In Fig. 4a, we analyze the droplet architecture of a protein condensate with a 50:50 scaffold-surfactant composition [Fig. 2a (Middle panel)]. We find that in the droplet core, the scaffold-to-surfactant ratio is much higher than along the interfacial region (Fig. 4a), where it drops to almost half that of the surfactant proteins.
Nonetheless, the surfactant concentration within the droplet core is still remarkably high considering its destabilizing role in the condensate liquid network connectivity 49. The observed non-homogeneous condensate organization stems from the higher valency of scaffold proteins, which allows them to establish higher molecular connectivity within the condensate core and, thus, induce a higher enthalpic gain upon multilayered assembly. Finally, to gain further insight into the droplet liquid network connectivity, we evaluate the average number of engaged binding sites per protein ($\phi$) as a function of distance from the center of mass of the condensates (Fig. 4b). At the droplet core, scaffold proteins present a significantly higher number of molecular connections per protein than surfactants (i.e., $\phi \sim 3.7$ and $\phi \sim 2.5$ for scaffolds and surfactants, respectively). This observation highlights how surfactants negatively contribute to the stability of the condensate. However, at the interface, the diminished connectivity of surfactant proteins ($\phi \sim 1$) with respect to that of scaffolds ($\phi \sim 3$) substantially reduces the condensate surface tension by decreasing the enthalpic cost ($h_i$) of creating an interface. This energetically favorable protein arrangement, controlled in our model just by the difference in protein valency of the components, is expected to also be influenced by changes in relative binding affinities among proteins, and [...] 7,34. Furthermore, such variations can also be relevant to understanding the physical parameters controlling multilayered condensate organization and, ultimately, the regulation of the formation of size-conserved multidroplet emulsions 29-31.

Figure 4. [...] (b) Average number of engaged binding sites per protein ($\phi$) as a function of distance from the droplet center of mass for scaffold (blue) and surfactant proteins (red). One binding site is considered to be engaged to another if the distance between them is less than 0.145 $\sigma$ (i.e., the maximum bond length interaction between distinct protein binding sites; for further details on these calculations see Supporting Information).
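The distance criterion for engaged binding sites can be illustrated with a short Python sketch. This is not the authors' analysis code (their procedure is described in the Supporting Information): the function name and input layout are hypothetical, periodic boundary conditions are ignored, and no check is made on which patch types are allowed to bind.

```python
import math
from collections import defaultdict

BOND_CUTOFF = 0.145  # maximum patch-patch distance for a bond, in units of sigma


def engaged_sites_per_protein(patches, cutoff=BOND_CUTOFF):
    """Count engaged binding sites for each protein.

    patches: list of (protein_id, (x, y, z)) tuples, one per binding site,
             with coordinates in units of sigma.
    A site counts as engaged if a site on a *different* protein lies
    closer than `cutoff`. Each site is counted at most once.
    Returns a dict mapping protein_id to its number of engaged sites.
    """
    engaged = defaultdict(int)
    for protein_id, _ in patches:
        engaged[protein_id] += 0  # make sure unbound proteins appear with 0
    for i, (pid_i, pos_i) in enumerate(patches):
        for j, (pid_j, pos_j) in enumerate(patches):
            if i != j and pid_i != pid_j and math.dist(pos_i, pos_j) < cutoff:
                engaged[pid_i] += 1
                break  # this site is engaged; move on to the next site
    return dict(engaged)


if __name__ == "__main__":
    # Two toy proteins with two sites each; only one pair of sites is in range.
    toy = [
        (0, (0.0, 0.0, 0.0)),
        (0, (1.0, 0.0, 0.0)),
        (1, (0.1, 0.0, 0.0)),
        (1, (5.0, 0.0, 0.0)),
    ]
    print(engaged_sites_per_protein(toy))
```

Averaging these counts over proteins binned by distance from the droplet center of mass would give a profile of the kind described for $\phi$; the double loop is quadratic in the number of sites, so a neighbor list would be preferable for production-sized systems.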

Conclusions
In this work, we employ our minimal protein model 48 to demonstrate how biomolecular multidroplet emulsification can be controlled by the subtle balance between liquid network connectivity and droplet surface tension, and how general molecular features such as protein valency and binding affinity can critically regulate this behavior. By using a binary mixture of scaffold and client proteins that act as surfactants (following the original definition proposed by Banani et al. 20), we design a set of Direct Coexistence simulations in which we can conveniently modulate the simulation box geometry to assess the propensity of the condensates to accommodate different surface/volume ratios. The ability (or inability) of these mixtures to adopt different surface/volume ratios, imposed by the box geometry, can be regarded as an indirect measurement of the droplet propensity to grow beyond a certain size. We find that pure-component scaffold condensates can easily adapt to distinct surface/volume ratios, in support of their ability to grow and fuse into a single droplet. However, 50:50 binary scaffold-surfactant mixtures instead stabilize several coexisting liquid condensates with roughly constant surface/volume ratios to accommodate the imposed system size geometry. Such behavior is a clear signature of size-conserved multidroplet emulsification, as found in the nucleolus 7, ribonucleoprotein condensates 26,88, micelle-like condensates 31,45, and in vitro complex coacervates 38.
We also elucidate the role of surfactant concentration in size-conserved droplet growth. By gradually decreasing the scaffold-surfactant ratio in our mixtures, we observe that the maximum droplet size is reduced, while the number of coexisting condensates simultaneously increases. This trend continues until a sufficiently high surfactant concentration is reached, where LLPS is no longer possible. Moreover, as clients are added, the droplet surface tension dramatically decreases, facilitating the formation of multiple coexisting small liquid droplets at low interfacial energetic cost. Client proteins, besides decreasing the stability of the condensates 49, can effectively behave as natural droplet surfactants 35,36. Due to their considerably lower molecular connectivity compared to that of scaffolds, surfactants preferentially migrate towards the droplet interface, thereby minimizing the enthalpic cost of creating an interface 23. Heterogeneous molecular organizations of condensates have been observed in stress granules 75,76, the nucleoli 26 and nuclear speckles 77. We find that such heterogeneity enables the maximization of the condensate liquid network connectivity through scaffold-scaffold protein interactions within the droplet core.
Rationalizing the underlying mechanisms employed by cells to precisely regulate the size of their diverse membraneless compartments and processing bodies 7,34,36 represents a crucial step towards understanding intracellular spatiotemporal organization. Taken together, our coarse-grained simulations help to elucidate the relationship between single-droplet phase formation and size-conserved multidroplet architecture, and put forward general molecular features such as valency and binding affinity as chief drivers in these scenarios.

Methods
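We model our coarse-grained multivalent proteins using the MD-Patchy potential proposed in Ref. 48, which is composed of two sets of potentials: a Pseudo Hard-Sphere (PHS) potential 89 to continuously describe the repulsive interaction and excluded volume between different protein replicas, and a continuous square-well (CSW) potential 90 to describe the patch-patch interactions among different protein binding sites. The $u_{\mathrm{PHS}}$ potential is described by the following expression:

$$u_{\mathrm{PHS}}(r) = \begin{cases} \lambda_r \left(\dfrac{\lambda_r}{\lambda_a}\right)^{\lambda_a} \varepsilon_R \left[\left(\dfrac{\sigma}{r}\right)^{\lambda_r} - \left(\dfrac{\sigma}{r}\right)^{\lambda_a}\right] + \varepsilon_R, & r < \left(\dfrac{\lambda_r}{\lambda_a}\right)\sigma \\ 0, & r \geq \left(\dfrac{\lambda_r}{\lambda_a}\right)\sigma \end{cases}$$

where $\lambda_a = 49$ and $\lambda_r = 50$ are the exponents of the attractive and repulsive terms, respectively, $\varepsilon_R$ accounts for the energy shift of the PHS interaction, $\sigma$ is the molecular diameter (and our unit of length), and $r$ is the center-to-center distance between different PHS particles. For the patch-patch interaction we use the following expression:

$$u_{\mathrm{CSW}}(r) = -\frac{1}{2}\,\varepsilon_{\mathrm{CSW}} \left[1 - \tanh\left(\frac{r - r_w}{\alpha}\right)\right]$$

where $\varepsilon_{\mathrm{CSW}}$ is the depth of the potential energy well, $r_w$ is the radius of the attractive well, and $\alpha$ controls the steepness of the well. We choose $\alpha = 0.005\sigma$ and $r_w = 0.12\sigma$ so that each patch can only interact with a single other patch.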
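The two potentials can be evaluated numerically with a few lines of Python. The sketch below assumes the commonly used pseudo hard-sphere form (a 50-49 Mie potential shifted by $\varepsilon_R$ and cut at $(\lambda_r/\lambda_a)\sigma$) and a tanh-shaped continuous square well; these functional forms, the function names, and the default arguments are assumptions made for illustration rather than the authors' implementation.

```python
import math

LAMBDA_A = 49  # exponent of the attractive term
LAMBDA_R = 50  # exponent of the repulsive term


def u_phs(r, sigma=1.0, eps_r=1.0):
    """Pseudo hard-sphere energy at center-to-center distance r.

    Assumed form: a Mie 50-49 potential shifted up by eps_r and cut at
    r = (LAMBDA_R / LAMBDA_A) * sigma, where it goes continuously to zero.
    """
    if r <= 0:
        raise ValueError("distance must be positive")
    r_cut = (LAMBDA_R / LAMBDA_A) * sigma
    if r >= r_cut:
        return 0.0
    prefactor = LAMBDA_R * (LAMBDA_R / LAMBDA_A) ** LAMBDA_A
    x = sigma / r
    return prefactor * eps_r * (x ** LAMBDA_R - x ** LAMBDA_A) + eps_r


def u_csw(r, eps_csw=1.0, r_w=0.12, alpha=0.005):
    """Continuous square-well energy between two patches at distance r.

    Assumed form: -eps_csw / 2 * (1 - tanh((r - r_w) / alpha)), which is
    close to -eps_csw inside the well and close to zero outside it.
    Distances are in units of sigma.
    """
    return -0.5 * eps_csw * (1.0 - math.tanh((r - r_w) / alpha))


if __name__ == "__main__":
    for r in (0.99, 1.0, 1.01, 1.03):
        print("u_PHS", r, u_phs(r))
    for r in (0.05, 0.12, 0.20):
        print("u_CSW", r, u_csw(r))
```

In this form the PHS energy equals $\varepsilon_R$ at $r = \sigma$ and rises steeply for shorter distances, while the CSW energy passes through $-\varepsilon_{\mathrm{CSW}}/2$ at $r = r_w$, so $\alpha$ sets how sharply the well switches off.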
The mass of each patch is 5% of the central PHS particle mass, which is set to $3.32 \times 10^{-26}$ kg, although this choice is irrelevant for equilibrium simulations. This 5% ratio fixes the moment of inertia of the patchy particles (our minimal proteins). The molecular diameter of the proteins, both scaffolds and clients, is $\sigma = 0.3405$ nm, and the value of $\varepsilon_R/k_B$ is 119.81 K. All our results are presented in reduced units: reduced temperature is defined as $T^* = k_B T/\varepsilon_{\mathrm{CSW}}$, reduced density as $\rho^* = (N/V)\sigma^3$, reduced pressure as $p^* = p\sigma^3/(k_B T)$, and the reduced unit of time as $\sqrt{\sigma^2 m/(k_B T)}$. In order to keep the PHS interaction as similar as possible to a pure HS interaction, we fix $k_B T/\varepsilon_R$ at a value of 1.5, as suggested in Ref. 89 (fixing $T = 179.71$ K). We then control the effective strength of the binding protein attraction by varying $\varepsilon_{\mathrm{CSW}}$ such that the reduced temperature, $T^* = k_B T/\varepsilon_{\mathrm{CSW}}$, is of the order of $\mathcal{O}(0.1)$. Since both the $u_{\mathrm{PHS}}$ and $u_{\mathrm{CSW}}$ potentials are continuous and differentiable, we perform all our simulations using the LAMMPS Molecular Dynamics package 91. Periodic boundary conditions are used in the three directions of space. The timestep chosen for the Verlet integration of the equations of motion is $\Delta t^* = 3.7 \times 10^{-4}$. The cut-off radius of the interactions of both potentials is set to $1.17\sigma$. We use a Nosé-Hoover thermostat 92,93 for the NVT simulations with a relaxation time of 0.074 in reduced units. For NpT simulations, a Nosé-Hoover barostat is employed with the same relaxation time 94.
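To connect the reduced units with physical ones, the following Python sketch converts the quoted parameters. It assumes that the reduced time unit is $\sqrt{\sigma^2 m/(k_B T)}$ (required on dimensional grounds) and that $m$ is the mass of the central PHS particle; the variable and function names are illustrative only.

```python
import math

K_B = 1.380649e-23     # Boltzmann constant, J/K
SIGMA = 0.3405e-9      # molecular diameter, m (0.3405 nm)
MASS = 3.32e-26        # central PHS particle mass, kg
EPS_R_OVER_KB = 119.81 # epsilon_R / k_B, K

# k_B T / epsilon_R is fixed at 1.5, which sets the physical temperature.
TEMPERATURE = 1.5 * EPS_R_OVER_KB  # K

# Assumed reduced time unit: sqrt(sigma^2 * m / (k_B * T)), in seconds.
TAU = math.sqrt(SIGMA ** 2 * MASS / (K_B * TEMPERATURE))

# Reduced timestep and thermostat relaxation time converted to seconds.
TIMESTEP = 3.7e-4 * TAU
THERMOSTAT_RELAXATION = 0.074 * TAU


def eps_csw_in_kbt(t_star: float) -> float:
    """Patch well depth in units of k_B T for a reduced temperature T*.

    Follows from T* = k_B T / eps_CSW, i.e. eps_CSW / (k_B T) = 1 / T*.
    """
    if t_star <= 0:
        raise ValueError("reduced temperature must be positive")
    return 1.0 / t_star


if __name__ == "__main__":
    print("T (K):", TEMPERATURE)
    print("time unit (s):", TAU)
    print("timestep (s):", TIMESTEP)
    # T*/T*_c = 0.75 with T*_c = 0.12 gives T* = 0.09.
    print("eps_CSW / k_B T at T* = 0.09:", eps_csw_in_kbt(0.75 * 0.12))
```

Under these assumptions the reduced time unit is roughly 1.2 ps, so the reduced timestep corresponds to a fraction of a femtosecond, and the working temperature $T^* = 0.09$ corresponds to a patch well depth of about 11 $k_B T$.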
The methodological details of the calculation of the phase diagram, surface tension and engaged binding sites per protein through a local order parameter, are provided in the Supporting Information document.