A physical wiring diagram for the human immune system

The human immune system is composed of a distributed network of cells circulating throughout the body, which must dynamically form physical associations and communicate using interactions between their cell-surface proteomes1. Despite their therapeutic potential2, our map of these surface interactions remains incomplete3,4. Here, using a high-throughput surface receptor screening method, we systematically mapped the direct protein interactions across a recombinant library that encompasses most of the surface proteins that are detectable on human leukocytes. We independently validated and determined the biophysical parameters of each novel interaction, resulting in a high-confidence and quantitative view of the receptor wiring that connects human immune cells. By integrating our interactome with expression data, we identified trends in the dynamics of immune interactions and constructed a reductionist mathematical model that predicts cellular connectivity from basic principles. We also developed an interactive multi-tissue single-cell atlas that infers immune interactions throughout the body, revealing potential functional contexts for new interactions and hubs in multicellular networks. Finally, we combined targeted protein stimulation of human leukocytes with multiplex high-content microscopy to link our receptor interactions to functional roles, in terms of both modulating immune responses and maintaining normal patterns of intercellular associations. Together, our work provides a systematic perspective on the intercellular wiring of the human immune system that extends from systems-level principles of immune cell connectivity down to mechanistic characterization of individual receptors, which could offer opportunities for therapeutic intervention.

These different factors of course need to be integrated to make reliable predictions about the propensity of two cells to interact with each other. It is essential to quantiatively model these factors in the physically-realistic way, because simple heuristic 'rules' would otherwise fail to resolve ambiguous situations such as answering how likely two cells are to interact if they have high expression of a low-affinity receptor interaction compared to if they have low expression of a high-affinity receptor.
Our approach is to use equations derived from the law of mass action to provide a simple yet biophysically-grounded method of quantitatively aggregating these different factors into a final prediction of cellular interconnectivity. The concentrations of surface proteins (e.g. [ ], [ ]) can be combined with the binding affinity (expressed as a dissociation constant, K D ) to calculate a hypothetical quantity of bound receptors between the cells if they reached a perfect equilibrium. Summing those quantities of bound proteins for a cellular pair then provides a basis for inferring cellular interconnectivity.
Although the cell-cell binding scores generated by the model are not the exact quantities that would be seen in vivo given assumptions described in the next section, they are nevertheless very useful predictors of cellular behavior in the context of multicellular communities like human immune cells from circulation.
As a concluding illustration of the power of these quantitative modeling approaching, consider a system made up of 3 distinct cell types that are mixed together. At the simplest level, one could add-up the number of distinct interacting protein pairs between each cell pair. Although simple, the basic premise of this has proven successful in other contexts, such as in a previous analysis that counted the number of distinct receptor binding pairs detected between two cell types and found that could correlate with the cell's co-occurrence in spatial images of a tissue [83]. However, those successes could be due to a backwards causality: cells physically proximal in a tissue would be likely to evolve more complementary receptors to communicate. As depicted in the schematic below, often in contexts like the immune system this rudimentary counting approach will either fail to detect substantial differences between cell pairs or produce misleading estimates due to counting the numerous receptors which have little functional role in adhesion.
Similarly, as mentioned before, only considering one variable at a time, such as constructing a model purely based on interacting proteins and their corresponding expression levels, can be misleading because it neglects the substantial role that quantitative binding strength plays in determining which protein interactions are major contributors to cellular adhesion.
With an intuition for the conceptual basis of the model now described, in the next sections the technical features and derivation of the model will be explained in detail. 3

Core model
The core model integrates information about molecular interactions, their affinities, and expression levels in order to predict relative 'interaction scores' that represent the fundamental tendency of two cells to physically interact with one another. This reductionist model uses the equations of mass action kinetics to make principled inferences on cellular connectivity. It is important to note that many layers of complexity beyond this can influence intercellular interactions [85], but the formulation of this model provides a useful intermediate-scale representation of the system without demanding prohibitive molecular dynamics calculations or others that are common for microscale representations.

Cell size measurements
Average values for cell sizes were taken from the BioNumbers database [90]. Cell radius measurements are used in cell geometry calculations, to adjust for size differences affecting receptor densities. It is worth noting that we found the most accurate cell geometry values available result in the total surfaceprotein expression levels between cell types to become roughly normalized, so no cell type has broadly higher surface expression densities than the others. This criteria of eliminating biases can be used when multiple values are available in the literature.

Protein expression data
A previous study used high-sensitivity mass spectrometry proteomics to measure all the proteins expressed by purified human immune cell types [1]. Their data were calibrated against a standardized "ruler" so each measurement represents an absolute count of protein copy-number per cell. Measurements were taken for 4 human donors to provide estimates of expression variability.

Cell type fractions
In most comparisons between the model and experimental data, the cell type labels available in the proteomics do not perfectly align with the labels observed in the experiment. Therefore where needed we incorporated data of the fraction of each cell subtype that comprises a larger cell type category (e.g. if total CD14+ monocytes is the experimentally-observed category, while the proteomics measures classical, nonclassical, and intermediate CD14+ monocytes, then the fraction of those 3 monocyte subtypes is needed to approximate the measured total-monocyte category). Datasets include [89] (CD4 T cells), [

Binary interaction network
We completed a systematic binary interaction screen of 630 immune cell surface proteins, covering every cell-surface protein detected in the proteomics dataset across all cell types and states (except for those such as multi-pass or highly polymorphic proteins which we are not able to express), plus proteins assigned a "CD" number that were missed by the proteomics. We validated our new interaction hits with an independent secondary screen and surface plasmon resonance (SPR). The list of new interactions we validated and all previously-published interactions validated by a high-quality method such as isothermal titration calorimetry, analytical ultracentrifugation, or SPR combined to make our final interaction network.

Binding kinetics
For every new interaction we discovered, we measured its binding with SPR and fit a model curve to calculate its equilibrium dissociation constant (K D ). For all of the previously-published interactions in our network, we found published kinetic data with reported K D values.

Mass action equations
From the Law of Mass Action, a second-order chemical reaction such as two protein species, denoted A and B, binding to form complex AB can be described in terms of its equilibrium K D in 3 dimensions While these equations were derived based on the statistics of random collisions in 3 dimensions, in the context of cell-to-cell interactions, the proteins on the surface of each cell are bounded to diffuse in only 2 dimensions. To approximate this conversion, the 3-dimensional K D (K D 3D ) can be multiplied by a "confinement length", h, representing the average height dimension of proteins on the cell membrane to give the 2-dimensional K D (K D 2D ) [97]. The value of h can be roughly estimated as 10 nm, although empirical measurements exist for several specific surface proteins.
The proteomics data measures total levels of A and B, while the equations above describe the concentrations of bound and unbound molecules. Because cell volume is treated as a constant and binding reactions do not destroy or create proteins, the conservation of molecules can be invoked to determine the total protein levels measured.

[A total ] = [A] + [AB]
(3) (4) While the equation above was defined based on the 3D concentrations of molecules (denoted [A]) measured when K D 3D is calculated by SPR, the same relationship applies to 2D concentrations of molecules (denoted [[A]]). Rewriting the 3-dimensional mass-action equation in terms of the 2-dimensional proteomic measurements yields This equation can be re-arranged to derive a formula for the fraction of bound proteins given the total protein amounts. First terms are grouped to give a quadratic expression with respect to the desired output, the bound con- Only the negative root of the quadratic is considered because the positive root fails to satisfy the inequalities (3) and (4). Replacing the two-dimensional dissociation constant with the easier to measure K D 3D using the approximation in (2), this yields a final equation entirely in terms that we can experimentally measure.
(7) These equations assume that A and B only participate in one interaction out of all immune surface proteins, which is true for most but not all proteins. In the future it could be possible to derive equations that factor in multiple equilibria between binding reactions involving a protein (or proteins) in common.

Cell interaction model 3.1 Assumptions
To predict the frequencies at which different cells interact, our model makes the following assumptions : (i) Immune cells that are in circulation are well-mixed and do not have a fixed spatial structure. Because of this, the odds of any two cell types encountering each other is equal before an equilibrium is reached based on the cell's surface molecules which we model.
(ii) Cell-surface proteins on each cell are evenly dispersed overall. The probability of a given surface protein species on a cell physically encountering another molecule is proportional the density of that species, and not substantially changed on average by, for example, all the available proteins clustering on one small region of the cell membrane.
(iii) The strength of multiple interactions is the sum of the constituent interactions. The kinetic equations used in our model do not factor in how avidity may make the combined affinity of multiple to scale nonlinearly compared to the sum of those individual interactions.
(iv) Our interaction network is complete enough that it can describe the global interaction patterns of cells, even though no interaction network is likely to be cover every single molecular interaction that can occur between cells, including those not mediated by proteins.
(v) The simplified geometry of our model is sufficient to describe the complex interactions of cells. This includes our conversion of kinetic parameters to 2-dimensions based on an average confinement length and averaging out the shape of cells to be roughly spherical for calculations.
Given the immense additional complexity possible within areas such as micro-scale cell geometry and surface-protein clustering dynamics, we believe our model offers one of the most complete descriptions of aggregate-level system behavior possible without resorting to prohibitively large molecular dynamics simulations.

Model overview
First, for each cell type we defined an expression matrix of the copy numbers of every surface protein involved in a cell-to-cell interaction according to our systematic interaction network. The absolute copy numbers per cell were converted to 2-dimensional densities by assuming all cells are approximately spherical with a radius provided from the literature. For every possible pairing of cell types (including the cell type with itself), we repeated the following procedure: for all protein-protein interactions between the cell types, use equation (6) to calculate the concentration of protein expected to be bound in complex when the cells are in physical proximity. We sum these bound concentrations together to derive a weighted interaction propensity score, I, specific to the cell-cell pair. When this procedure is done across all possible cell-cell pairs, the result is a symmetric matrix of relative cell affinities. From the relative affinities, it should be possible to predict how the occurrence of different cell interactions change in different conditions, such as upon a change in expression or pharmaceutical blocking of a particular receptor.
for n interactions between cells x and y (10) This interaction score provides a relative measure of cellular connectivity. In units of molecules per surface area, it can be thought of conceptually as the quantity of bound molecules holding two cells together that would be present if all the surface proteins of those cells could perfectly equilibrate and access each other. Interaction scores can be directly correlated against observed rates of cellular association, to verify the computed scores correlate with observed interaction frequencies. This interaction score is also used as input for the differential equation extension to our core model, described later.

Validation
To calibrate and validate our model, we have access to high-throughput microscopy data of human immune cells that were mixed and labeled based on cell type [24]. The frequencies at which different cells contact each other is quantified, giving an experimental proxy for our interaction metrics.
We can correlate the measurements of how often different immune cells associate at equilibrium to model predictions. Because this core model does not include data on cell abundances, the observed frequencies would need to be adjusted against cell abundance. Otherwise, the most frequently-counted cellular interactions might simply be among cells that are more abundant, as opposed to having interaction rates above the baseline expectation. Scores for achieving that kind of normalization on experimental data have already been established [96].
Another direct test of the predictive power of this model would be to experimentally block specific cell-surface protein interactions (such as by adding recombinant ligands that competitively inhibit their receptors' interactions) and gauge how cell-to-cell contact frequencies change. If these changes are predictable from our "ground-up" mass action kinetics model, that would demonstrate the utility of our systematic approach. We describe these later in our differential equation model extension.
Future extensions of this model could also be interesting for spatial measurements of cells in tissues. Methods such as CODEX can label cell-type markers 8 on cells resident in tissues, and have been used to generate datasets counting the contact frequencies of immune cells in lymphoid tissues [88]. The relative interaction scores would provide predictions about how expression changes could alter cellular contacts.

Differential equation extension
Although the core model can calculate the relative affinities for different immune cell populations, a limitation of these values is that they cannot effectively predict how perturbations alter immune cell interactions. For example, simulating any knockout of a receptor would only lead to predictions of cellular interactions decreasing, even though it would be expected that decreases in certain cellular interactions would accompany others that increase in frequency as cells are freed up from their original binding partners. Thus we complemented our core model with a differential equation extension that explicitly models all possible cell pairs associating and dissociating from each other with rate constants informed by our core model. This enables more accurate predictions of a range of perturbation conditions.

Relative cell association rates
The mass-action rate constants for determining how a cell pair x and y associates and dissociate were directly taken from the I x,y values of the core model described above. Because the model gives greater scores for cell pairs with more predilection to associate, the inverse of these values was taken to give dissociation rate constants.

Cell-cell binding reactions
The law of mass action describes how reactants in a process equilibrate with their products. For example, formulas derived from mass action laws are what we previously used to formalize the reaction where a pair of proteins go from an unbound state to reversibly forming a bound complex. We reasoned the same basic principle could be applied to describe how cells in a mixture reversibly associate with each other in well-mixed equilibrium conditions. For blood immune cells in particular this analogy seems reasonable, given the high densities of immune cells (e.g. approximately 2.5 million per mL of blood) and their dynamic motility (in contrast to more static body tissues).
By this formulation, every cell type has the ability to bind with every other cell type. Thus the number of binding terms to account for increases combinatorially with the number of unique cell states (by the function n 2 ). For this reason, we utilized rule-based algorithms to automatically construct the full system of differential equations for arbitrarily large models of cells [74].
3 Differential equation model details

Assumptions
Unlike the core mechanistic model that generates relative interaction scores, this extension of the model abstracts aspects of cell to cell binding in order to better estimate how cell interactions change under perturbations : (ii) The relative interaction score from the mass-action model represents the dissociation constant for a particular pair of cells binding in a reversible reaction. The equilibrium constant for this reaction is treated as comprising a constant on-rate (k on ) for cells colliding into a bound cell-cell pair, and an off-rate (k off ) inversely proportional to the relative interaction score. The proportionality constant was selected to give approximately the expected fraction of cells in paired states versus singlet cells.
(iii) Cells are either in a single (unbound) state or a paired bound state. Higher-order combinations like 3 or more mutually bound cells are considered to occur at comparatively low rates and ignored.

Model extension overview
For brevity of illustration considering all equations share similar forms, one example each of differential equation type is given below. The first is for calculating the change in the proportion of a cell in its unbound state, and the other is for calculating the change in a cell in its paired state bound onto another cell. For a cell pair (e.g. CD4 T cells associating with dendritic cells): While for the quantity of cells in an unbound state in a simple 5 cell type system (e.g. for calculating unbound CD4 T cells in a model also including B cells, NK cells, monocytes, and dendritic cells): T 4 (t) = T 4 : N K(t) · k T 4:N K + T 4 : DC(t) · k T 4:DC + T 4 : M 14(t) · k T 4:M 14 + T 4 : B(t) · k T 4:B Full systems of differential equations for a larger 7 cell-type model are provided in the appendix.
The ordinary differential equation system is then numerically solved from an initial condition corresponding to all cells being unbound at their natural proportions. In all simulations, a stable equilibrium is soon reached. At equilibrium, the proportion of every possible cell state (unbound, and all cell-cell combinations) is taken for further analysis. When simulating changes induced upon perturbation, the simulation is simply repeated after changing an input parameter.

Validation
The same automated microscopy methods used for validating the core model can also be adapted to testing how well the differential equation extension to the core model captures the effects of perturbations on cellular interactions. We conducted the first experimental tests pairing that automated blood immune cell imaging platform with recombinant proteins. Exogenously applying these soluble proteins which have binding activity to receptors on the surfaces of leukocytes can then act as perturbations. Although a rudimentary expectation would be that all soluble proteins act to disrupt the receptor's natural function of physically linking together cells, it is also conceivable in some cases that the soluble proteins could cluster the receptor in a way that does not block its binding sites, thus enhancing rather than disrupting an interaction.