Droplet-based screening of phosphate transfer catalysis reveals how epistasis shapes MAP kinase interactions with substrates

The combination of ultrahigh-throughput screening and sequencing informs on function and intragenic epistasis within combinatorial protein mutant libraries. Establishing a droplet-based, in vitro compartmentalised approach for robust expression and screening of protein kinase cascades (>107 variants/day) allowed us to dissect the intrinsic molecular features of the MKK-ERK signalling pathway, without interference from endogenous cellular components. In a six-residue combinatorial library of the MKK1 docking domain, we identified 29,563 sequence permutations that allow MKK1 to efficiently phosphorylate and activate its downstream target kinase ERK2. A flexibly placed hydrophobic sequence motif emerges which is defined by higher order epistatic interactions between six residues, suggesting synergy that enables high connectivity in the sequence landscape. Through positive epistasis, MKK1 maintains function during mutagenesis, establishing the importance of co-dependent residues in mammalian protein kinase-substrate interactions, and creating a scenario for the evolution of diverse human signalling networks. Here, the authors use a droplet-based screen for phosphate transfer catalysis, testing variants of the human protein kinase MKK1 for its ability to activate its downstream target ERK2. Data reveal a flexible motif in the MKK1 docking domain that promotes efficient activation of ERK2, and suggest epistasis between the residues within that sequence.

C ell signalling coordinates cellular actions in response to external stimuli. Extracellular signals are transduced from membrane receptors to transcription factors via complex signalling networks, typically through protein kinase-mediated phosphorylation cascades. These coupled enzymatic networks have evolved to govern numerous critical cellular processes from cell proliferation to apoptosis 1 . Understanding the fine-tuning of complex signalling networks is a challenge that has mainly been addressed by reductionist biochemical approaches, i.e., studying individual protein mutants and/or single components of these networks at a time 2,3 . High-throughput screening and deepsequencing analysis hold the potential to analyse a wider range of mutations in combination. Such insight is becoming increasingly important for the consideration of intragenic epistasis, i.e., the nonadditive interactions of amino acids within a protein 4 .
This work addresses the interplay between activity and sequence in a human protein kinase cascade, investigating interactions between protein kinases by monitoring the downstream phosphorylation product of the cascade. So far, human protein kinase libraries have been screened with throughputs of 10 3 variants, focusing on single-site saturation libraries [5][6][7][8] . Capturing epistasis, however, requires the screening of combinatorially randomised libraries, in which the exponential number of mutants with each added position quickly necessitates a higher experimental throughput. In vivo screening of four randomised interface positions in a bacterial histidine kinase (>10 5 variants) has enabled untangling of epistasis between said residues, although cellular background from acetyl phosphate was shown to phosphorylate target proteins as well, complicating the screening for kinase activity 9,10 . Insight into higher-order epistasis in human protein kinases would therefore benefit both from an increase in throughput, and ideally be derived from a cell-free system to avoid interference from the cellular machinery.
Here we establish an in vitro evolution system for protein kinase cascades with ultra-high throughput (>10 7 ). The screen is used to gain insights into the determinants of protein-protein interactions that control the catalytic activity of a mitogenactivated protein kinase (MAPK) cascade and the underlying epistatic effects between six residues that govern it.
Several mitogen-activated protein kinase (MAPK) pathways operate side by side in the cell 11,12 (Fig. 1a). To prevent inappropriate cross-talk between signalling responses, mitogenactivated protein kinase kinases (MKKs) contain a docking domain (D-domain) which targets a D-domain recruitment site (DRS) in MAPKs, facilitating binding and allosteric activating of their cognate partners 13,14 . (Fig. 1b 15 ). Here, we randomise the docking domain of MKK1 (>500,000 variants), and test activation of the downstream kinase ERK2 by detecting its phosphorylation product in vitro. Selecting active mutants followed by deep sequencing gives us comprehensive insight into which combinations of residues in the MKK1 D-domain promote efficient phosphate transfer between MKK1 and ERK2, their robustness to mutation and the sequence landscape describing this interaction.

Results
Design of an in vitro screening system for a synthetic MAPK cascade. We established a screening system for in vitro testing of a model protein kinase cascade (Fig. 1c) by immobilising genes encoding the chosen kinase (MKK1) on paramagnetic beads 16 , together with a phosphorylation detection mechanism (subGFP, see below) ( Supplementary Fig. 1A). The beads were cocompartmentalised in droplets with a purified downstream kinase (ERK2) and in vitro transcription and translation components (IVTT) (Supplementary Fig. S1B), so that polydisperse droplets contain at most one bead each ( Supplementary Fig. 2).
Robust in vitro expression of the upstream kinase (MKK1) begins the signalling cascade, such that MKK1 phosphorylates the hydroxyl residues T185 and Y187 of ERK2 17 . Activated ERK2 could now phosphorylate an N-terminally immobilised prolinedirected substrate peptide (VAPFSPGGRAK) 18 , labelled with a C-terminal GFP (subGFP).
At a chosen timepoint, the emulsion was broken to give a mixture of beads, which were now differentially phosphorylated, representative of each MKK1 variant's activity and complementarity to ERK2. To detect phosphorylation of the subGFP's serine, the beads were incubated with chymotrypsin: it specifically cleaves the substrate peptide after Phe as long as the peptide is not phosphorylated. In contrast, ERK2's phosphorylation of serine sterically/electrostatically hinders chymotrypsin cleavage, leaving the peptide intact and GFP still attached to the bead 18 . Ultrahighthroughput sorting of library members (each represented by one bead carrying genotype and phenotype) can then be performed by flow cytometry, where the beads encoding the best catalysts show the highest fluorescence and mutants are separated into 'bins' according to their activity. Finally, sequence recovery and nextgeneration sequencing of the encoded kinase variant couples the genotype to the activity-defined bin.
Assaying activity and binding complementarity in in vitro MKK1-ERK2 signalling. The activation of the investigated phosphorylation cascade depends on the binding complementarity between the two kinases and on the enzymatic activity of the associated individual components. To probe whether this in vitro, bead-based screening approach can select for either aspect, first individual emulsions were prepared with beads encoding MKK1 variants with different constitutive activities, IVTT mix, and ERK2 (Fig. 2a). The beads retained subGFP fluorescence intensity in accordance with the activity of the encoded variant: (i) beads encoding the constitutively active variant ca MKK1 (i.e., the mutant MKK1 Δ44-51/S218D/M219D/N221D/S222D ) 19 retained~50% subGFP fluorescence intensity, (ii) beads encoding a constitutively active variant with approximately eightfold lower relative activity for ERK2 (MKK1 S218D/S222D ) 19 retained~16%, and (iii) the wild-type MKK1 without constitutive activation retained <1% of subGFP fluorescence intensity relative to that seen with untreated beads. Likewise, the activity of the downstream cascade component is critical: when it is compromised by incubating the highly active ca MKK1 with the phosphorylation-defective variant ERK2 T185A/Y187A 20 , the percentage of retained subGFP drops to <1%.
Next, utilising the most active ca MKK1 variant, we probed the in vitro screen's ability to enrich for ca MKK1 variants based on their docking interactions with ERK2. To do so, we first prepared a binding-impaired variant by altering the ca MKK1's D-domain to contain two alanine residues (Ile9Ala and Leu11Ala, hereafter denoted ca MKK1 I9A/L11A ), impairing its ability to target the DRS of ERK2. These mutations have been shown to reduce the relative activity of MKK1 for ERK2 > 2-fold 21 (Fig. 2b, all residues numbered according to MKK1). Beads were functionalised with either ca MKK1-Cy5 or ca MKK1 I9A/L11A -TexasRed (TxR) genes ( Supplementary Fig. 3a-d) and the beads mixed in a 1:1 ratio before emulsification. After de-emulsification and chymotrypsin digest, flow cytometric analysis allows separation of ca MKK1-Cy5 and ca MKK1 I9A/L11A -TxR based on their fluorescent dye, before assessing the subGFP intensities of the beads, indicative of efficient phosphate transfer in the cascade (see the Gating strategy in Supplementary Fig. 3a-f).
The impaired D-domain of ca MKK1 I9A/L11A slows activation of ERK2 relative to ca MKK1, an effect best visualised by the bimodal distribution of subGFP-negative and subGFP-positive beads shifting over time ( Supplementary Fig. 3g-k). The bimodal distribution is likely to stem from the exponential activation of subGFP via activated ERK2, as the substrate (ERK2) is an enzyme, and will continue to phosphorylate subGFP upon activation by MKK1. At the optimal timepoint (t = 3 h, including time for MKK1 expression in droplets), the sequence-dependent D-domain function was robustly resolved (Fig. 2c, d), where subGFP-positive beads had a 92 ± 2% positive predictive value for encoding ca MKK1 over ca MKK1 I9A/L11A , which indicates a 8% false positive rate (Fig. 2e, raw data in Supplementary Fig. 4c-i). The high positive predictive value enables reliable enrichment for functional D-domains when sorting subGFP-positive beads downstream. At the same time, good coverage when screening a D-domain library can be achieved by as little as three-fold oversampling (Supplementary Fig. 4; coverage >95% of variants, limited by the percentage of false negatives in Q1: 35 ± 27%).
Combinatorial D-domain library design and MKK1-ERK2 interaction screening. The D-domain sequence comprises a set of basic, spacer and hydrophobic residues (Fig. 1b). In vivo, both MKK1 and MKK2 phosphorylate ERK2. Compared to MKK1, the MKK2 D-domain is two residues longer, containing an additional leucine at position 7a and an alanine at 8a (Fig. 2b). To Phosphorylation of ERK1/2 by MKK1/2 is facilitated by MKK's D-domain binding the D-domain recruitment site (DRS) of ERK2. Shown here is the D-domain sequence of MKK2, the residues highlighted in red bind an acidic patch of residues in the ERK2 DRS, while the large hydrophobic residues highlighted in aqua bind hydrophobic grooves within the ERK2 DRS. Spacer residues are shown in pink (PDB: 4H3Q) 15 . c (1) Paramagnetic beads (green circle) are functionalised with subGFP (blue wavy line with GFP) and ca MKK1 genes (pink wavy lines). (2) Beads are emulsified by mixing an aqueous solution containing in vitro expression components (IVTT) and purified ERK2 protein with oil and surfactant. (3) Expression of ca MKK1 in the emulsion droplets starts the cascade by phosphorylation of ERK2. Phosphorylated ERK2 is now able to phosphorylate serine within the substrate sequence of the subGFP construct. (4) After de-emulsification, beads are exposed to chymotrypsin protease (denoted as scissors) in bulk. (5) Beads with phosphorylated serine will be less susceptible to digestion by chymotrypsin, leaving subGFP attached to the bead. The bead, therefore, encompasses both the genotype and phenotype information, as an encoded active ca MKK1 mutant will retain more subGFP on a bead than a less-active kinase mutant. (6) Subsequent flow cytometric sorting of the beads based on subGFP fluorescence into activity-set gates followed by (7) deep next-generation sequencing (NGS) allows for correlation of the cascade activity to the encoded MKK1 gene. investigate the importance of both the D-domain's length and chemical composition for ERK2 interaction, we created a library of the ca MKK1 docking domain. The library contained four independently randomised hydrophobic/spacer residues, while additionally allowing for the incorporation of two insertions (MKK2-like), one insertion or none (MKK1-like), for a total of six randomised residues (Fig. 2b). In order to stay within bounds of the screening capacity, a subset of amino acids was screened at each position. Positions were chosen to ensure the inclusion of amino acids with polar, charged, or hydrophobic residues, with a wider range of possible hydrophobic substitutions as they complement the hydrophobic pockets in the ERK2 DRS (Fig. 1b). The gene was assembled using SpliMLiB, an unbiased combinatorial gene assembly method, on paramagnetic microbeads 22  ca MKK1 Cy5 M P K K K P T -P -I Q L N P ca MKK1 I9A/L11A TxR M P K K K P T -P -A Q A N P MKK2 -L A R R K P V L P A L T I N P Library Cy5 M P K K K X T X P X X Q X N X Assay of reconstituted synthetic cascades in IVTT emulsion droplets. a GFP retention of beads encoding non-constitutive active (MKK1) or constitutively active ( ca MKK1/ ca MKK1 S218D/S222D ) genes in combination with ERK2/ERK2 T185A/Y187A . Box and whisker plot shows 5th, 25th, 75th, 95th percentiles and the median for n =~10,000 screened beads per sample. b D-domain sequences and design of the randomised D-domain library of ca MKK1. Numbering is based on MKK1's WT sequence, 'a' indicates a potential insertion, while 'Δ' indicates a "deletion" of this residue when the insertion was not introduced. c Flow cytometric analysis of a 1:1 mixture of beads functionalised either with active ca MKK1 (aqua) or with docking domain impaired ca MKK1 I9A/L11A (red) before and (d) after chymotrypsin digest. Non-functionalised beads are shown in orange. Analytical gates around the bimodal distribution distinguish between false negatives (Q1), true positives (Q2), false positives (Q3) and true negatives (Q4) (e) The percentage of beads for ca MKK1 and ca MKK1 I9A/L11A in the analytical gates Q1-Q4 over seven independent measurements, of which one is shown in 'D' and all raw data is shown in Supplementary Fig. S4. The average percentage is denoted as a horizontal line. f Analysis of SpliMLiB library preparation by flow cytometry (dark blue), where positions have been randomised as described in panel b. The controls, beads functionalised directly with the full-length genes via a one-step clickreaction are shown in aqua ( ca MKK1, positive) and red ( ca MKK1 I9A/L11A , negative). Non-functionalised beads are shown in orange. g An overlay of the nine library sorts ( Supplementary Fig. S7) shown in blue and the respective gates in which the low/medium and high GFP beads were sorted (h) An overlay of the three ca MKK1 (aqua) vs ca MKK1 I9A/L11A control samples ( Supplementary Fig. S7) prepared in parallel to the library samples. i FRET sensor assay for randomly recovered ca MKK1 sequences from the high-activity (green), medium activity (orange) and low activity sorting gate (red). All values are normalised to ca MKK1. Averages derived from two biological repeats are shown. Only randomised residues are shown in sequence labels. Source data are provided as a Source Data file for panels a, e, and i. each bead by IVTT. The capture of subGFP and library synthesis was shown to be successful through flow cytometry ( Fig. 2f and Supplementary Fig. 5).
The starting library contained 500,000 variants and was assayed in nine parallel batches. We recovered 3.5 million beads after de-emulsification and chymotrypsin digest, yielding sevenfold oversampling ( Supplementary Fig. 6). These beads were sorted with flow cytometry into three activity bins; low, medium and high ca MKK1 activity, measured through subGFP fluorescence (Fig. 2g, individual plots in Supplementary Fig. 7). The low-, medium-and high-activity gates in which beads were to be sorted were set relative to the bimodal distribution of three independent bead populations encoding either ca MKK1 or ca MKK1 I9A/L11A , assayed in parallel with the library samples ( Fig. 2h, individual plots in Supplementary Fig. 7). Based on the control distributions observed for ca MKK1 and ca MKK1 I9A/L11A encoding beads, we expect the false positive rate of the high gate to be~9%, which matched our selective pressure predictions described above. Our false negative rate of finding ca MKK1functionalised beads in the low or medium gate was estimated to be~50%, which, with sevenfold oversampling, means we expected >99% of variants with functioning D-domains to be sorted at least once in the high gate for subsequent sequencing of its genotype. The majority of library beads were sorted into the low activity gate (2.1 × 10 6 or 85%), 2.8 × 10 5 (11.2%) were sorted into the medium activity gate, and 8.1 × 10 4 (3.2%) into the highactivity gate.
We further validated the ultra-high-throughput kinase screen by expressing and individually re-screening a random subset of sorted ca MKK1 variants from each gate in a plate-based secondary screen. In the secondary screen, the in vitro-expressed ca MKK1 was mixed with ERK2 and assayed against the same target peptide used in the ultra-high-throughput screen, here placed between CFP and YFP in a FRET sensor 18 . The efficiency of phosphate transfer is read out by monitoring the emission ratio after complete chymotrypsin digest: while the un-phosphorylated sensor is readily digested and gives a low-emission ratio, the sensor phosphorylated by an active kinase is digested more slowly ( Supplementary Fig. 8). The 25 randomly selected variants from the high-activity gate retained on average 84% of the emission ratio signal normalised to ca MKK1, confirming that our sort enriched for ca MKK1 variants with wild-type-like activity (Fig. 2i). In contrast, the medium gate random variants retained on average 58% of the FRET signal, while only two of the low gate variants showed a slight improvement compared to the FRET assay when done with ca MKK1 I9A/L11A . Thus we were confident that the in vitro assay and sort enriched for stable, catalytically activating protein-protein interactions.
Deep sequencing of ca MKK1 variants evaluates sequence preferences. Next, we recovered the D-domain coding sequence from each activity bin independently through limit-cycle PCR (Supplementary Figs. 9 and 10), and analysed the sequence space of MKK1-ERK2 docking complementarity through deep sequencing of the sorted variants from all three gates (Supplementary Tables 1-3). Accounting for overlap between the sequencing pools, we detected 4.9 × 10 5 unique sequences in total or 91% of the theoretical library diversity (Fig. 3a). The SpliMLiB protocol generates balanced high-quality libraries with equal proportions of randomised amino acids at every targeted position 22 . Sequence analysis of the low activity gate distribution, a proxy for an unselected library, shows the expected balance: the low activity sample contained the bulk of the library (4.7 × 10 5 sequences, or 95% of the total number of unique sequences recovered, including the expected false negatives (Fig. 3a) and exhibited a balanced distribution of amino acids (Shannon entropy >0.995 at all positions, close to the value of 1.000 that would indicate a perfectly balanced library) 23 (Supplementary Figs. 11 and 12). The amino acid proportions at each randomised position showed increasing enrichment towards preferred residues in the medium and especially the high-activity gate (Shannon entropy 0.86-0.99), giving a clear view on which amino acids confer activity and tend to be enriched at each site in the D-domain of MKK1 ( Supplementary Fig. 11).
As both highly active and medium activity variants may be observed at low frequency in the high gate (Supplementary Figs. 13 and 14), we next focused on 29,563 variants that showed a similar or better sequencing count profile as WT ca MKK1. This set of active variants contains variants that appear ≥51 times in the high gate and are enriched in the high gate over the lower gates (see Supplementary Note 1 for details). The conservative approach to variant filtering removed 19% of the sequences in the high gate, slightly above the expected 8% false positive sorting events (see "Methods" and Supplementary Fig. 13). Alternative formulations of the enrichment ratio, based on the distribution of sequencing reads in all gates, and/or dividing over all observed sequences, yielded near identical results (Supplementary Note 2 and Supplementary Fig. 15). Overall, the D-domain sequences in the final active dataset (2.9 × 10 4 variants) show strong enrichment of Leu and Ile at randomised positions 9 and 11 ( Fig. 3b). This trend follows the wild-type residues in MKK1/MKK2 and conforms with the highly conserved nature of these residues 3,24 , and their complementarity to the binding pockets on ERK2 (Fig. 1b). Similarly, we observed a preference for the MKK2-like hydrophobic residue insertions at position 7a, favouring especially the insertion of Leu. This third Leu in MKK2's D-domain (L7a) is complementary to an additional hydrophobic pocket in the ERK2-binding groove 2 (the "lower pocket"; Fig. 1b). Thus, the large hydrophobic residues at MKK1 positions 7a, 9 and 11 form the core of the D-domain features that target the ERK2 DRS.
The randomised positions at the edge of the randomised region show mild overall sequence preferences: whereas Pro is present in positions 6 and 13 in both the wild-type sequences of MKK1 and MKK2, the deep mutation scan reveals a weak preference for Pro at position 6 and a general enrichment for any hydrophobic residue at position 13. As expected, the negatively charged residue Asp was depleted across the library, especially in the core hydrophobic patch positions 7a, 9 and 11. Positively charged Lys was neutral in position 6, next to the already basic patch of amino acids in MKK1, and depleted otherwise. Similar to charged residues, the very small residues glycine and alanine were depleted in all positions, consistent with the known effect of alanine to disrupt MKK1 binding, as shown in the negative control construct ca MKK1 I9A/L11A (Fig. 3b).
Positive epistasis widens D-domain active sequence space. The sequences in the final set of active variants contained 13 variants with a single-site mutation relative to wild-type MKK1 (Supplementary Fig. 16), while 43 out of the remaining 44 possible MKK1 single mutants were abundant in lower activity gates ( Supplementary Fig. 16). Similar to the overall preference for Leu/ Ile residues in active D-domains, the point substitutions in these 13 variants tend to be hydrophobic. If only these substitutions were allowed in a combinatorial fashion, we would expect to observe at most 432 (=2 × 3 × 1 × 3 × 4 × 6) unique active variants in the final active deep-sequencing dataset:  25 .
While hotspot analysis of the 2.9 × 10 4 variants already paints a more accurate estimation of the general amino acid preferences at each of the randomised positions (Fig. 3b) since it encompasses non-linear effects, it does not provide insight into which particular pairwise interactions are epistatically intertwined. To draw out the underlying epistasis from this one-dimensional model we mapped the prevalence of all pairs of mutations in the D-domain ( Fig. 4 and Supplementary Fig. 17). We observed that for some pairs of MKK1 positions, the preferred residues at one . A logarithmic scale is used to display both enrichment (red) and depletion (blue). Each panel represents one pair of randomised positions. The presence of large hydrophobic residues Leu/Ile at any one of positions 7a, 9 or 11 serves as an "anchor", so that these residues appear in combination with non-preferred amino acids. n = 29,563 for each individual heatmap. position are influenced by the amino acid present at the other. While most position pairs followed the trends typical for each randomised position (i.e., enrichment of hydrophobic residues), we observed clusters of enrichment wherever a Leu/Ile is present, indicating that previously determined non-beneficial residues such as Ala are allowed in the context of a hydrophobic anchor residue. Further quantitative investigation of shifts in amino acid preference supports the interpretation that the system exhibits widespread epistasis and that the two-position enrichments are not just a reflection of the hydrophobic residue abundance, as the Bonferroni-adjusted p-values for the χ2 goodness of fit compar-  Table 8 and Supplementary Note 3). This prevalence of positive epistasis indicates that the presence of an Ile or Leu residue allows anchoring to ERK2 in combination with otherwise deleterious residues, such as alanine or aspartate. Interestingly, these co-enrichment patterns are also present for Leu in position 6 and all hydrophobic residues in position 13, positions with a weak overall sequence preference. This ability of the hydrophobic amino acid residues to expand the sequence space further supports the hypothesis that these residues are essential to D-domain binding and activation. Conversely, the preferred large hydrophobic residues show weak negative epistasis with each other, suggesting that a D-domain full of hydrophobic residues is not necessarily better.
Preferred motifs in the D-domain. Having shown that the amino acid preferences at the six randomised positions are interdependent and epistatically linked, we searched for structural motifs within the set of 29,563 variants that underpin this pattern. We first examined the change in preferences when one or more residues are restricted to preferred hydrophobic residues, comparing the change in amino acid distribution with the overall sequence preferences (Supplementary Fig. 19). If a Leu or Ile is fixed in any one position, there is a shift in the sequence preference in the adjacent randomised position towards hydrophobic residues, creating an Φ-X-Φ motif (Φ denoting Leu or Ile, and X the fixed spacer residue). Conversely, the sequence preference at distant positions is relaxed and becomes more tolerant of small or hydrophilic residues.
While a single large hydrophobic residue is not sufficient to anchor a D-domain, two such residues arranged in an Φ-X-Φ motif greatly reduce constraints on the rest of the D-domain sequence (Fig. 5a). Conversely, if two adjacent positions contain other residues, the remaining randomised positions are strongly enriched for Leu and Ile. The effect of the Φ-X-Φ motif is strongest in positions 7a/9, 9/11 and 11/13, where D-domains typically contain hydrophobic residues and the short Φ-X-Φ motif with two Leu/Ile residue appears sufficient. The activating effect of two hydrophobic residues is apparent regardless of the exact position of the motif, although the Φ-X-Φ motifs in positions 6/7a and 11/13 tend to be supplemented by a third hydrophobic residue nearby (Supplementary Fig. 20). The incorporation of two small or charged residues in positions 7a and 11 would prohibit the formation of any Φ-X-Φ motif, which explains the strong negative epistasis between non-preferred residues in these two positions (Supplementary Fig. 17).
Normalising the subsets of Φ-X-Φ containing datasets for the initial distribution of amino acid frequencies (Fig. 3b) gave us a glimpse into multidimensional epistasis i.e., the positive or negative magnitude epistasis with a third residue outside of the Φ-X-Φ motif (Fig. 5b). The emergence of the distinct Φ-X-Φ motif showcases how multidimensional epistasis shapes the sequence preference for a distinct Φ-X-Φ motif in active MKK1 D-domain sequences. Whenever a Φ-X-Φ is present (may it be at positions 6-X-7a, 7a-X-9, 9-X-11 or 11-X-13) negative epistasis is observed with a third hydrophobic at any other position, while positive epistasis is observed with any third residue which is non-beneficial in isolation. Likewise, when residues other than Ile/Leu are present at these positions, positive epistasis is observed with Ile and Leu at all other positions in the subset, meaning that the frequency of finding a Ile/Leu goes beyond the already abundant frequencies of Ile and Leu at these positions in the entire active variant dataset. On the contrary, non-beneficial amino acids become ever more detrimental in this context through negative epistasis.
Thus, the D-domain sequence landscape is shaped by the requirement for a two-residue hydrophobic motif, although the location of this motif in the D-domain is flexible, an effect which was not immediately apparent from hotspot analysis (relying on additivity). Accommodating this shifting motif is thought to rely on the promiscuity of the ERK2 DRS, which naturally accommodates several unique D-domains with different orientations to maximise overlap with its hydrophobic pockets 2 . The coevolved amino acids which complete the motif are less constrained in chemical functionality but likely optimise packing to the DRS-indicated by the strong negative epistasis of multiple small residues co-occurring ( Supplementary Fig. 17).
Sequence similarity network shows evolutionary connectivity.
Having observed that an active D-domain can accommodate the key hydrophobic Φ-X-Φ motif in more than one possible position, we addressed the clustering of all 29,563 active variants. We constructed a sequence similarity network (SSN) and visualised it with a force-directed layout (Fig. 6). Each variant in the final active dataset is a node and edges connect all pairs of variants that differ by a single-site mutation. We were interested in whether the SSN contains any distinct regions or clusters within the network, which would indicate the possibility of distinct biological solutions to the problem of ERK2 activation. Therefore, the largest connected subgraph of the full SSN-spanning 97% of nodes in the full SSN-was partitioned using the Leiden algorithm to identify clusters within the network 26 . Here, a cluster is a collection of nodes (protein variants) that are highly connected to each other but have relatively few links to nodes in other clusters.
This SSN contains nine clusters with substantial membership, with a modularity score of 0.75 indicating that the clusters are well-defined (possible values between −0.5 and +1, where a higher value indicates stronger divisions into clusters). The clusters are further grouped into two equally sized groups separated by the presence or absence of the 8aAla insertion (Fig. 6). Thus, the 8aAla insertion appears to divide the sequence space, likely through forcing a different conformation in the DRSbinding pocket 27 . Within each region of the sequence landscape, the clusters are distinguished by the nature and position of the defining single large hydrophobic residue. The Φ-X-Φ motif is apparent in 7 out of 9 clusters, typically with one residue strictly defining the cluster (e.g., 11 L in yellow and green clusters) and the second residue one of several choices. Two clusters show a slightly different motif: the cyan variants share the 9L-X-11F motif and the smallest, black cluster is defined by the 9M-X-11I anchoring motif-in each case, one I/L residue is substituted for a chemically similar large hydrophobic residue (Fig. 6).
Examination of the network structure shows that this SSN is not scale-free; that is, there are no clear hub variants in this network 28 . Instead, the D-domain SSN predominately features numerous moderately well-connected variants ( Supplementary  Figs. 21 and 22), which could be explained by the number of dimensions that are simultaneously probed. When screening multiple site saturations in parallel, fitness peaks might disappear as a result of neutral ridges that connect these fitness peaks in the multidimensional fitness space 29,30 . As a result, Gavrilets's "holey landscape" model predicts high fitness genotypes (in this case, the 29,563 variants) to permeate all regions of genotype space, reducing the expected number of clusters seen in a SSN. Here, the many solutions for binding and activation of human kinases established by positive epistasis are well connected, allowing kinases to roam genotype space more freely, potentially aiding the Φ-X-Φ 6-X-7a 7a-X-9 9-X-11 11-X-13 Φ-X-Φ Φ Φ 6 7a 8a 13 11 9 6 7a 8a 13 11 9 7a 8a 11 9 6 7a 8a 13 11 9 6 7a 8a 13 11 9 6 7a 8a 13 11 9 6 7a 8a 13 11 9 6 6 1 13 3 1 1 11 1 1 process of diversification or evolution of novel signalling pathways.

Analysis of synthetic D-domains from the sequence network.
After examining the general features of the D-domain sequence landscape, we investigated the affinities of D-domains towards the ERK2-binding groove from different regions in the sequence space (Fig. 6).

Discussion
The development of a in vitro ultra-high throughput assay, in tandem with a highly randomised combinatorial library design, enabled us to perform a comprehensive analysis of the epistatic sequence determinants of binding and catalysis in the MKK1 docking domain. As an in vitro screen for reaction turnover for eukaryotic kinases that reaches such throughput (~10 7 variants per day, with minimal reagent consumption-less than 200 µL total), this assay makes biological phosphate transfer practically amenable not only to sequence exploration (as in this study) but also in the future to directed evolution. This work demonstrates the screening for the MKK1-ERK2 cascade, but testing other kinase cascade elements could be extended with a suitable cognate target peptide downstream, when designed to be susceptible to protease cleavage. The in vitro character of screening in droplet compartments is experimentally liberating: screening campaigns that are otherwise hindered by difficult heterologous expression 31,32 or functional redundancy and conflicting kinase activity from background kinases in the expression host 9,10,33 , will become possible in a cell-free environment 34 . Their readout should report directly on intrinsic properties of the proteins probed, i.e., protein-protein interactions, catalysis and regulation.
Such intrinsic effects have emerged in this study, defining allowed motifs and evolvability by characterising sequence diversity in six randomised positions: active docking D-domain sequences are strongly enriched for the Φ-X-Φ motif, which can be placed anywhere along our tested sequence space and predicts productive phosphate transfer between two kinases in the MAPK pathway. It is possible that this insight might have emerged in a lower throughput study, albeit with less statistical significance. However, in addition to identifying a sequence feature, the deepsequencing data illuminate a comprehensive, six-dimensional combinatorial sequence landscape: single Leu/Ile residues in isolation and the Φ-X-Φ motif in particular re-shape the rest of the D-domain sequence preference through widespread positive epistasis.
Previous sequence-function mapping studies have largely scanned the effect of single-site mutations across full genes 35 . Datasets that include an analysis of epistatic effects have mostly focused on pairwise interactions only 25,36 , such as in the determination of structural interactions 37,38 , While higher-order epistasis (i.e., interactions between more than two residues) has been recognised as important in adaptive evolution and key to determining evolutionary trajectories 36,39 , investigations have focused on small recombination libraries 25,40 , while few have examined higher-order epistasis exhaustively with highthroughput experiments 9,41-47 . In the MKK1 docking domain, widespread positive epistasis generates evolutionary contingency, avoiding loss of function and enabling smooth transitions between residue combinations. Consequently, a type of sequence landscape emerges in which the functional variants are highly interconnected but do not contain any clear hubs, which could be explained by the Gavrilets Holey Landscape model where the presence of neutral ridges in multidimensional fitness space allows variants to permeate all regions of sequence space. In our data, mutational tolerance is epistatically enhanced in the presence of a Leu/Ile residue pattern and the Φ-X-Φ motif that promotes adaptability through positive epistasis. These features may help explain how diverse signalling cascades emerged in evolution. In addition to the versatility that the recognised modularity of kinases provides 48 , our observations suggest that epistatic effects contribute to robust intrinsic evolvability of protein function.
The experiments described in this work mimic a neutral drift scenario in evolution (a non-adaptive exploration of sequence space) that plays a prelude to future adaptive evolution by creating a multitude of starting points for functional innovation 49,50 . When the trade-off between mutational burden and retention of function in this neutral phase is favourable, successful adaptive evolution becomes more likely 51 . Such evolutionary contingency will smoothen evolutionary transitions to expand kinase cascades that govern cellular development, stress responses and maladaptive states such as cancer 52 . One of the aims of synthetic biology is the construction of synthetic networks (e.g., operating in engineered T-cell therapies) with complex and predictable outputs 53 .
More generally, our approach exemplifies how the functionality of a sequence motif can be dissected, categorised and understood by carrying out essentially just one miniaturised ultra-high-throughput experiment followed by sequence interpretation. Ultra-high-throughput technologies are key in this endeavour: the >100-fold greater throughput compared to previous human kinase deep-sequencing experiments 5-8 has yielded additional insight and at the same time provided, compared to traditional experimentation, a richer dataset suitable for systematic examination of sequence landscape parameters.

Methods
Protein expression and purification. All DNA and protein sequences are denoted in Supplementary Table 4. The MAPK ERK2 construct (UNIPROT ID P28482) was cloned as part of a pET28a expression plasmid. Oligonucleotides used for sitedirected mutagenesis to generate ERK2 T185A/Y187A were designed with NEBa-seChanger. The subGFP construct was cloned into a pHATA 54 expression vector resulting in pHAT-Avi-subGFP, where subGFP was in frame with an N-terminal His6 and Avi-tag. The FRET sensor for secondary validation was cloned into the expression vector pOP3BT, containing an N-terminal His8 tag.
All proteins were purified using standard Ni affinity chromatography. ERK2 was dialysed to Phosphate Buffered Saline (154 mM NaCl, 5.60 mM Na 2 HPO4, 1.06 mM KH 2 PO 4 without calcium and magnesium, from here referred to as PBS (Lonza)) with 2 mM DTT for all experiments except for fluorescence anisotropy, for which ERK2 was dialysed to PBS_A (PBS, 2 mM DTT, 2 mM TCEP, 2 mM EDTA, 2 mM EGTA and 0.1% (v/v) Brij35). The FRET sensor was dialysed to 20 mM Tris-HCl (pH 8.0) 100 mM NaCl. subGFP was dialysed to PBS. All proteins were aliquoted for single-use, flash-frozen and stored at −80°C.
Bead functionalisation with full-length MKK1 amplicons and subGFP. The paramagnetic carboxy-beads (Ø 5 µM; S1964, microParticle, Berlin) were first functionalised with azide and Tamavidin-2-HOT as previously described 22 . For click chemistry-mediated coupling of DBCO-DNA and azide, beads were washed in 1.5-mL Eppendorf tubes with 500 µL PBS (to which was added 0.05% (v/v) Tween-20), on a DynaMag-2 (ThermoFisher) magnetic rack which allowed separation of the magnetic beads from the buffer. This was followed by a washing step with 200 µL binding buffer (Dynabeads kilobaseBINDER kit).
The coupling reaction of the full-length, DBCO-linked MKK1 amplicons to azide-modified beads used~2 × 10 7 DNA fragments per bead for complete saturation, assuming the average molecular weight of each base pair to be 650 Daltons. For example, a 100 µL solution of ca MKK1 amplicons (1660 bp, 1.5 µg/µL) would be used to functionalised 4 × 10 6 beads-adding up to 2.1 × 10 7 DNA fragments per bead.
Beads were mixed with equal volumes of Binding Solution (Dynabeads kilobaseBINDER Kit, ThermoFisher) and purified DNA. The beads were incubated in a shaker (90 min, 1150 rpm, 37°C). The supernatant was removed using the DynaMag-2. The beads were washed with 40 µL of Washing Solution (Dynabeads kilobaseBINDER kit) while immobilised to the magnet without resuspension. Next, the beads were washed as usual with PBS (0.05% v/v Tween-20).
Preparation of SpliMLiB fragments. SpliMLiB library assembly has been described previously 22 . SpliMLiB design and oligonucleotides used are shown in Supplementary Fig. 25 and Supplementary Table 6.
SpliMLiB assembly. The 12 unique fragP13X oligos were attached via clickmediated chemistry as described for the full-length MKK1 amplicons.
Digestion and ligation of the final fragment P6X. After ligation of frag(7aX) and subsequent digestion of the beads, the last fragment could be ligated on. FragP6X was first digested in twelve parallel reactions consisting of the purified oligo (~20 µL of 400 ng/µL solution) diluted in 40 µL 1 × 3.1 buffer in 0.02% v/v tween with 4 µL BspQI. The mixtures were incubated for 90 min at 50°C, and purified using silica columns (Zymoclean Gel DNA Recovery, Zymo Research). Zymo PCR clean-up kit, and eluted in 10 µL MiliQ to give~200 ng/µL solutions of the digested P6X fragments. The digested beads which contained the sticky overhang from the previously ligated fragment, frag(7aX) were divided over 12 PCR tubes. To this, 100 µL ligation mixture was added, consisting of 10 µL (~2 µg) of the digested, purified fragP6X, in 1× T4 in 0.02% (v/v) Tween-20 with 5 µL T4 Ligase. The beads were washed with PBS (0.05% (v/v) Tween-20), pooled and stored at 4°C. Finally, subGFP was attached, as described for beads functionalised with full-length MKK1 amplicons above.
De-emulsification and chymotrypsin digest. Before de-emulsification, the excess oil below the emulsion layer was removed using a Gel-Saver-Tip II (200 µL). PBS (0.05% (v/v) Tween-20) (100 µL) and 1H,1H,2H,2H-Perfluorooctanol (PFO) (Alfa Aesar) PFO (18 µL) were added to the emulsion and mixed through manual rotation of the tube. After phase separation, the aqueous layer containing the beads was carefully removed to avoid oil contamination. This process was repeated three times.
Flow cytometric analysis and FACS. Flow cytometric analysis was carried out on a FACScan Cytek machine for beads stored in PBS (0.05% (v/v) Tween-20). SubGFP fluorescence was quantified using 488 nM excitation, with a 525/40 bandpass filter. TexasRed fluorescence was quantified using 561 nm excitation, with a 610/20 bandpass filter. Cy5 fluorescence was quantified using 640 nm excitation, with 660/10 bandpass filter. Flow cytometric sorting of beads was performed on a BD FACSAria Fusion, with four-way sorting into different tubes according to subGFP fluorescent intensity. SubGFP fluorescence was quantified using 488 nM excitation, with 530/30 bandpass filter. Cy5 fluorescence was quantified using 640 nm excitation, with 660/20 bandpass filter. The forward and side scatter profiles were used to ensure sorting was restricted to single beads, while Cy5 gating was used to select for beads that contained full-length ca MKK1 library DNA.
Recovery of full-length MKK1 amplicons from sorted beads for the secondary FRET assay. The beads sorted into the high, medium and low activity gates were pooled accordingly and washed with PBS (0.05% (v/v) Tween-20). From the pooled beads,~5% (v/v, after vortexing) were transferred for full-length recovery by PCR, whereas the other 95% (v/v) was used for recovery of the D-domain sequence (see below). The bead populations were washed with 1x Q5 buffer (NEB). The fulllength MKK1 genes were recovered using Q5 High-Fidelity 2x Master Mix (NEB), using the primers FL_Rec_F and FL_Rec_R (Supplementary Table 5). The PCR tubes containing the beads in 1× Q5 buffer were placed on a DynaMag-96 Side Magnet (ThermoFisher), and the supernatant was carefully discarded. To each tube, a 100 µL reaction mastermix was added consisting of 0.5 µM of each forward and reverse primer and 1× Q5 MM (NEB). Thermocycling was performed starting with 30 s at 98°C, followed by 30 cycles of 10 s at 98°C, 15 s at 67°C, 60 s at 72°C, followed by a final extension step at 72°C for 2 min. The PCR mixture was purified over silica columns (Clean & Concentrate, Zymo Research). The PCR recovery product was gel extracted and purified using silica columns (Zymoclean Gel DNA Recovery, Zymo Research). The PCR fragment was cloned by Gibson assembly 55 into an empty, linearised pET28a expression vector, amplified with the primers pET28_Rec_F and pET28_Rec_R (Supplementary Table 5). The plasmid population was transformed to α-Select Chemically Competent E. coli (Bioline, now discontinued). Colonies were picked to inoculate a 5 mL LB kan culture. The next day, the plasmid was purified using Miniprep (Qiagen), and sequenced. Plasmids that contained MKK1 genes with mutations outside of the D-domain (an artefact of amplification by BIOTAQ polymerase, see above) or not containing the MKK1 insert were discarded from further analysis.
Cerulean (excitation maximum λ 433 , emission maximum λ 475 ) transfers energy and excites Citrine (excitation maximum λ 516 , emission maximum λ 529 ) so that a high emission ratio corresponds to the intact (phosphorylated) FRET sensor, and a lower emission ratio to the cut (non-phosphorylated) FRET sensor after exposure to chymotrypsin. The bandwidth of excitation/emission was 9/29 nm, respectively, with 25 flashes per measurement.
Recovery of D-domain sequences from sorted beads and deep sequencing. The number of PCR cycles for recovery of the D-domain was first optimised to prevent PCR-induced biases 56 Fig. 9). A pre-exponential fluorescent threshold was set, and the number of cycles required to reach said threshold was correlated to the number of beads. The resulting formula (r 2 0.99) was used to calculate the number of cycles required to reach the same threshold when recovering the D-domain from sorted library beads (Eq. (2)). In which y equals the number of cycles and x the number of beads.
The pooled low (2.1 × 10 6 beads), medium (2.8 × 10 5 beads) and high bead population (8.1 × 10 4 beads) were washed with 1× Q5 buffer, dividing each sample into eight reactions for recovery PCR. After removing the 1x Q5 buffer, a 100 µL reaction mastermix (consisting of 0.5 µM of each forward (NGS_F) and reverse primer (NGS_R) in a 1× NEBNext Ultra II Q5 Master Mix (NEB) was added to the beads (Supplementary Table 5 Deep-sequencing analysis. The raw forward and reverse reads were merged using PEAR 57 and filtered for reads containing the correct 5-or 6-base forward or reverse sequence. Due to the close spacing of substitutions and insertions in a short sequence, the reads could not be sensibly aligned to reference as DNA sequence. Therefore, all following steps were performed on the protein sequences using a set of custom Python scripts, which are freely available at www.github.com/fhlab/ Kinases. The D-domain sequence was extracted from FASTQ files (using script "proteins.py") by searching all DNA reads with a regular expression for DNA sequence that matches the library design, and all such sequences were translated into protein sequence. The sequences were aligned to the reference sequence MPKKKXTPXQXNXAPDG using the Emboss implementation of Needleman-Wunsch algorithm 58 with a custom scoring matrix (Supplementary Table 7). Each amino acid match is scored equally positive (+5), all amino acid mismatches are penalised equally (−1) and a match to a randomised X position is scored as weakly positive (+1). Insertions and deletions are scored with gap open penalty 3 and gap extension penalty 5. This scoring scheme allows successful alignment of short, highly randomised sequences and correctly places the inserted residues (7a, 8a if present) after the randomised substitution position. The aligned sequences were parsed to extract and count all present mutations and stored as a dictionary for later analysis.
All three sequencing fractions were analysed to determine the number of observed variants, setting a cut-off of ≥ 10 sequencing reads for the high gate to reduce spurious observation of sequencing noise. However, because the medium and low gates are substantially more diverse, a lower cut-off of ≥ 3 supporting reads was used there. The amino acid diversity at each randomised position ( Supplementary Fig. 11) was calculated by extracting the amino acid position at each randomised position and weighing the composition by the number of times each variant was observed in the fraction of interest.
Statistics. The variants observed in the high gate were filtered according to the number of reads in the high gate (51 or more) to generate a final active dataset of active variants. This threshold was chosen after consideration of multiple filtering schemes (see Supplementary Notes 1 and 2) to reflect the prevalence of the WT sequence in the high gate.
The single position enrichment was derived by dividing the observed proportion (frequency) of each possible amino acid, f obs a (ranging between 0 and 1), by the ideal proportion f id a that is expected in a balanced library (f id a ¼ 1 12 ; 1 13 ; 1 2 for positions 6, 9, 11, 13; 7a; and 8a, respectively). Similarly, the two-position enrichment was calculated by dividing the joint observed frequency of amino acids a and b, f obs a;b , by the ideal joint proportion f id a;b ¼ f id a f id b (in the ideal case, the probabilities are independent) (Eq. (3)).
The presence of epistatic interactions between positions (i.e., deviation from the independence of probabilities) is obtained by dividing the f obs a;b ð¼ f id ajb f id b Þ by the joint proportions expected from the one-dimensional enrichment at each positions, given by f obs a;b ¼ f obs a f obs b without epistasis (Eq. 4) (see Supplementary Note 3 for detailed derivation). We repeated the analysis using the amino acid distribution in all observed variants instead of the ideal distribution and reached identical conclusions.
Exploration of the sequence similarity network. A sequence similarity network was constructed with Python using the NetworkX package (https://networkx.org/, version 2.4) and the open-source visualisation software Gephi (https://gephi.org/, version 0.9.2). Each variant in the final active set of active variants was added as a node and the Hamming distance between all pairs of variants was calculated; an edge was added to the network if the Hamming distance was equal to 1. The degree, betweenness, closeness and eigenvector network centrality scores were calculated, but they did not reveal any clear hub nodes. We addressed the question whether network can be partitioned in a meaningful way in order to explore node connectivity by the Louvain method 59 for community detection, as implemented by the "community louvain" module (https://pythonlouvain.readthedocs.io/en/latest/). This shows that a standard Louvain partitioning had a good modularity score of 0.7 (range −0.5 to +1.0, a positive value is better). However, the Louvain algorithm only guarantees that each node is well connected to its own community, yet some communities might be poorly connected to themselves (that is, missing a bridge within the community). Such mis-assignment can happen, if the bridge node within a community is also well connected to nodes outside of the community: so that it gets reassigned to the outside community, leaving the original community divided. Therefore, we used the improved Leiden algorithm (method described by ref. 26 , implemented in https://leidenalg.readthedocs.io/en/stable/intro.html), which identified nine large communities within the largest connected subgraph of the full network. The D-domain sequences in each community were extracted in FASTA format and used to create a WebLogo representation of sequence content 60 .
Peptide affinity assays. The procedure has been adapted from Garai et al 2 . All synthetic peptides were ordered from GenScript with purities >95%. The fluorescent reporter peptide fl-HePTP was purchased with an N-terminal fluorescent dye (5-carboxyfluorescein). Fl-HePTP was dissolved in MilliQ water. All other D-domains were dissolved in PBS_A.
Reaction volumes for all experiments were 25 µL, prepared in a 384-well assay plate (Corning, type 3820). For K D determination of fl-HePTP, ERK2 in PBS_A was concentrated to 240 µM using Amicon Ultra-4 centrifugation filters (Amicon, 30 K). Protein was sequentially diluted in PBS_A, and mixed 1:1 with fluorescent reporter peptide. For K i determination of non-labelled D-domains, D-domains (dissolved to 1 mM) were serially diluted in PBS_A. A solution containing ERK2 (25 µM) and fl-HePTP (40 nM) (final concentrations) were mixed 1:1 with the D-domain dilutions. Corning plates containing the reaction mixtures were spun down (2 min, 4000 rpm). Fluorescence polarisation was measured using a PheraStar FS (BMG Labtech), with a FP 485 520 520 filters for 5-FAM HePTP.
Reporting summary. Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability
The NGS data are available at the European Nucleotide Archive under accession number PRJEB39786. The processed NGS data are included with the source code (see below) in the "data" directory. Source data are provided with this paper.