An in vivo platform to select and evolve aggregation-resistant proteins

Protein biopharmaceuticals are highly successful, but their utility is compromised by their propensity to aggregate during manufacture and storage. As aggregation can be triggered by non-native states, whose population is not necessarily related to thermodynamic stability, prediction of poorly-behaving biologics is difficult, and searching for sequences with desired properties is labour-intensive and time-consuming. Here we show that an assay in the periplasm of E. coli linking aggregation directly to antibiotic resistance acts as a sensor for the innate (un-accelerated) aggregation of antibody fragments. Using this assay as a directed evolution screen, we demonstrate the generation of aggregation resistant scFv sequences when reformatted as IgGs. This powerful tool can thus screen and evolve ‘manufacturable’ biopharmaceuticals early in industrial development. By comparing the mutational profiles of three different immunoglobulin scaffolds, we show the applicability of this method to investigate protein aggregation mechanisms important to both industrial manufacture and amyloid disease. Protein aggregation remains a significant challenge for manufacturing of protein biopharmaceuticals. Here, the authors demonstrate the use of directed evolution and an assay for in vivo innate protein aggregation-propensity to generate aggregation-resistant scFv fragments.

Over the last 30 years, recombinant antibodies have emerged as highly effective therapeutics 1,2 . Antibody-based medicines now comprise over half of first-time approvals 3 , and seven of the ten highest grossing pharmaceuticals in 2018 were based on antibodies or antibody-like scaffolds 4 . This success, and the development of more sophisticated therapeutic strategies based on antibody scaffolds that incorporate multidentate interactions and/or effector functions 5,6 , is partly due to the ability to readily generate high affinity candidate therapeutics using hybridoma or phage display platforms [7][8][9] . While the structural and biophysical properties of antibodies and other protein scaffolds allow the formation of highly avid complexes, the inherent metastability of proteins can result in local or global unfolding that can lead to inactivation and/or protein aggregation. Here, we define aggregates as all species with higher molecular weight than the soluble monomer. This encompasses stable or transient interactions between ordered or disordered states. Aggregation can also be triggered by native-state interactions (colloidal or hydrophobic). As proteins are subjected to various stresses during manufacturing that increase the risk of protein misfolding and aggregation 10 , overcoming aggregation (which may be associated with low protein stability and/or low solubility) is a major hurdle in the development of biopharmaceuticals. Aggregation compromises the quality, stability, and even safety of a drug product [11][12][13] , yet identifying 'manufacturable' candidates with long-term stability during lead isolation and optimisation remains challenging. Similarly, predicting the in vivo aggregation propensity of intrinsically disordered proteins and globular proteins associated with protein aggregation diseases 14 , and how subtle sequence changes alter aggregation in vivo/in vitro, is also currently beyond our means.
One reason for this is a lack of structural and molecular understanding of the mechanism(s) of the initiation and propagation of protein aggregation, making development of a suitable screen difficult 11 . For biopharmaceuticals, the relationship between different 'developability' assays has recently been delineated 15 . However, the ability of these assays to predict manufacturability and long-term stability remains poor 16 , due to lack of a known key quality attribute for aggregation resistance.
A variety of in silico tools have recently been developed to identify aggregation-prone sequences to guide rational design of proteins with enhanced properties [17][18][19][20][21][22] , for example, by identifying regions of poor solubility 18,23 in the primary sequence or three-dimensional structure of a protein. While prediction of sequences with high aggregation propensity 19,24 or low solubility 18,22 is possible, predicting which of these aggregation-prone regions (APRs) will become exposed (or sequestered) by protein folding or unfolding events remains a significant challenge. Such complexity is highlighted by recent work on Tau, an intrinsically disordered protein whose aggregation is linked to neurodegenerative disease 25,26 . Molecular dynamics simulation is an attractive option to identify such APRs 27 , but the necessity of a structural model, length of computational time, the need for a greater understanding of the conformational fluctuations that trigger aggregation, and the availability of suitable force-fields to replicate the stresses found in manufacturing make this approach challenging.
In principle, directed evolution methods using phage, ribosome, or yeast display are powerful approaches capable of investigating the effects of sequence changes on protein aggregation. For example, biopharmaceutical model proteins have been generated, a priori, with enhanced soluble expression 28 , thermodynamic stability 29 or resistance to heat- or acid-induced aggregation 30,31 . As aggregation can occur by a variety of mechanisms including partial unfolding and homo- or heterotypic interactions in the native state, no single property drives aggregation. Consequently, development of a suitable screen to enable the selection and optimisation of biopharmaceuticals for resistance to innate aggregation by directed evolution has not been possible. We have previously developed a tripartite β-lactamase enzyme assay (TPBLA) (Fig. 1a, b) that allows the identification and ranking of aggregation-prone peptides, including the Alzheimer's peptide (Aβ 40/42 ) and islet amyloid polypeptide (IAPP) 32 . In this assay, the test protein is fused in-frame between the two domains of the E. coli periplasmic enzyme β-lactamase (βLa, Fig. 1a, b). This assay thus directly links the aggregation propensity of the test protein to the susceptibility of the bacterium to β-lactam antibiotics 32 . Importantly, by relying solely on the innate aggregation propensity of the protein of interest, the screen does not use arbitrary methods to destabilise proteins (e.g. heat and chemical denaturation 33 ) that may not reflect the inherent dynamics of the test protein relevant to aggregation during biomanufacture or in disease 32,34 .
Here, using both therapeutically relevant proteins and proteins involved in aggregation disease as examples, we show that the TPBLA can be used to assess the aggregation propensity of a variety of protein structural scaffolds, including scFv fragments from two monoclonal antibodies (mAbs) that differ by just three amino acids in their V H domains, but which have fundamentally different aggregation properties 35 . We then show that the TPBLA can be used as a screen for directed evolution experiments to select for sequences that are aggregation-resistant. Importantly, the approach does not require any structural knowledge or prior biophysical information about the protein of interest, and can be used to reveal residues that modulate aggregation that could not be predicted a priori using currently available algorithms. At a fundamental level, the ability to detect multiple aggregation liabilities simultaneously enables both spatially clustered and more subtle pair-wise interactions that dictate aggregation to be identified, allowing the delineation of aggregation hotspots in both industrially-relevant and medically-important proteins. In addition, identifying large numbers of sequence variants that endow protection from aggregation will empower the development of algorithms better trained to predict aggregation of biologics and other protein scaffolds. This will allow a greater understanding of the relationship between sequence, solubility and aggregation, the developability of promising biologic candidates and the prediction of mutations that may cause protein aggregation disease.

Results
Protein aggregation correlates with bacterial survival. The TPBLA has been used previously to rank the aggregation propensity of intrinsically disordered proteins and two pairs of globular proteins 32 : β 2 microglobulin and D76N (an aggregation-prone natural sequence variant), and Dp47d, a single V H domain nanobody and its non-aggregating counterpart HEL4 30 . To determine whether the assay is able to differentiate between aggregating and non-aggregating sequences of therapeutically relevant protein scaffolds, we compared the in vivo growth scores (area under the antibiotic survival curve, Fig. 1c, d) of Dp47d and its non-aggregating counterpart HEL4, alongside two other aggregation-prone therapeutically relevant protein scaffolds: granulocyte colony-stimulating factor (GCSF) (a 174 residue four helical bundle protein whose poor soluble expression in E. coli was improved 1000-fold (GCSF C3) by ribosome display and three parallel selection pressures 36 ), and the single chain variant (scFv) of an IgG1 antibody, MEDI1912 (referred to here as IgG WFL ) 35 . This recombinant human monoclonal antibody is specific for human nerve growth factor (NGF) and displayed significant aggregation and poor in vivo behaviour, which was rectified in a variant containing three substitutions in the complementarity determining regions (CDRs) 1 (W35S and F36T, IMGT numbering 37,38 , Supplementary Fig. 1) and 2 (L64T) of V H , generating the variant referred to here as IgG STT 35 . The in vivo growth score of bacteria expressing each of these constructs was measured in a 48-well agar plate assay (Supplementary Fig. 2) over a range of ampicillin concentrations (0-140 µg mL −1 ) (Fig. 1c). For each protein, the in vivo growth score for the engineered variant with low aggregation (scFv STT , GCSF C3 and HEL4, Fig. 1d) (high in vivo growth score) is significantly enhanced relative to that of its aggregation-prone counterpart (scFv WFL , GCSF and Dp47d, Fig. 1d) (low in vivo growth score).
These data validate the ability of the TPBLA to distinguish aggregation-prone proteins from their less aggregation-prone counterparts over a range of different protein scaffolds.
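The in vivo growth score described above is the area under the antibiotic survival curve. As a minimal sketch of that calculation, assuming simple trapezoidal integration (the integration scheme is not stated here) and using made-up survival values rather than data from this study, the score could be computed as follows. The function name `growth_score` and all numbers are illustrative assumptions.

```python
from typing import Sequence


def growth_score(ampicillin: Sequence[float], mcd_growth: Sequence[float]) -> float:
    """Trapezoidal area under an antibiotic survival curve.

    ampicillin: antibiotic concentrations in increasing order (e.g. ug/mL).
    mcd_growth: maximal cell dilution allowing growth at each concentration.
    The trapezoidal rule is an assumption made for this sketch.
    """
    if len(ampicillin) != len(mcd_growth) or len(ampicillin) < 2:
        raise ValueError("need two or more paired points")
    area = 0.0
    for i in range(1, len(ampicillin)):
        width = ampicillin[i] - ampicillin[i - 1]
        if width <= 0:
            raise ValueError("concentrations must be strictly increasing")
        # Area of one trapezoid between neighbouring concentrations.
        area += 0.5 * (mcd_growth[i] + mcd_growth[i - 1]) * width
    return area


# Hypothetical survival curves over 0-140 ug/mL ampicillin (not measured data).
concentrations = [0, 20, 40, 60, 80, 100, 120, 140]
resistant_curve = [6, 6, 6, 5, 5, 4, 3, 2]  # growth persists at high antibiotic
prone_curve = [6, 5, 3, 1, 0, 0, 0, 0]      # growth collapses early

resistant_score = growth_score(concentrations, resistant_curve)
prone_score = growth_score(concentrations, prone_curve)
```

In this toy case the aggregation-resistant curve gives the larger area, matching the direction of the comparison reported for each protein pair.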
As the biopharmaceutical sector is currently dominated by IgGs, and many next generation therapies will also be based on this class of proteins or their derivatives, we focussed subsequent work on scFv WFL and scFv STT . To assess the ability of the tripartite β-lactamase assay to differentiate between proteins with small changes in sequence, and to determine which of the amino acid substitutions (W35S, F36T or L64T) is responsible for the improved behaviour, the survival curves and in vivo growth scores for six variants that substituted W35S, F36T and L64T either individually or in combination were measured. The results showed that W35S largely endows aggregation resistance, followed by F36T, with L64T making little contribution (Fig. 2, Supplementary Fig. 3). Importantly, this insight can be achieved without the need to express and purify the proteins for biophysical analysis.
Fig. 1 The tripartite β-lactamase assay. a The test protein (green) is inserted into a 28-residue glycine/serine-rich linker (grey) separating the two domains of the E. coli enzyme TEM-1 β-lactamase (purple and pink). b Correct folding of the test protein in the E. coli periplasm enables the two halves of β-lactamase to be brought into close proximity to form the functional enzyme active site that hydrolyses β-lactam antibiotics. c Antibiotic survival curve of the maximal cell dilution allowing growth (MCD GROWTH ) on solid medium over a range of ampicillin concentrations for bacteria expressing the aggregation-prone scFv WFL within β-lactamase (blue) or the aggregation-resistant sequence scFv STT (pink).
In vivo scFv aggregation correlates with IgG1 aggregation. In order to use the TPBLA as a mAb developability screen, it is essential that the minimal scFv constructs used in the assay yield similar aggregation propensities when reformatted as a full-length IgG. Consequently, the eight scFv variants described above (WFL, WFT, WTL, WTT, SFL, SFT, STL and STT) were generated as IgG1 antibodies and their retention times on a high-performance size exclusion chromatography (HP-SEC) column were quantified (Fig. 2). HP-SEC is usually used in the biopharmaceutical industry to assess aggregation by quantification of monomer loss. As shown previously 35 , while IgG STT has an elution time consistent with a monomeric IgG1 (∼8.5 min), IgG WFL displays an asymmetric elution profile with a longer retention time (∼14 min) than expected based on monomer mass, even in the presence of 125 mM arginine 39 . Consequently, the retention time was used in this study to assess the non-specific interactions/aggregation of this family of biologics. Overlaying the retention times for all eight variants with the in vivo assay scores shows an excellent correlation between an improvement in bacterial growth and a decrease in column retention time (Fig. 2). The interface between IgG WFL dimers, formed en route to larger aggregates, has been shown previously (by chemical cross-linking and MS mapping) to be mediated by contacts between the V H and V L domains in different molecules 35 . Repeating these experiments for scFv WFL and scFv STT , using increasing concentrations of the amine-specific cross-linker bis(sulfosuccinimidyl)suberate (BS 3 ), showed that the majority of scFv STT (99%) remained monomeric, whilst 45% of scFv WFL was incorporated into higher-order oligomers (Supplementary Fig. 4a), consistent with both the TPBLA and HP-SEC data.
Analysis of the dimers formed in scFv WFL by mass spectrometry showed inter-protein cross-links between residues M0 (V H ) and K66 (V L ) and M0 (V H ) and M0 (V H ) (IMGT numbering), consistent with those formed for IgG WFL (Supplementary Figs. 4 and 5), confirming conservation of the interactions formed in the initiating stages of aggregation. Together, these results show that the TPBLA allows the rapid assessment of the aggregation propensity of scFv fragments, which is maintained when they are inserted into an IgG1 scaffold.
Evolving proteins with reduced aggregation propensity. Having established that the TPBLA could be used in candidate selection, we next sought to use the assay as a screen for directed evolution in order to search for novel sequences able to ameliorate poor developability for candidates with promising therapeutic activity such as IgG WFL . To achieve this, genetic variation was introduced into the gene encoding scFv WFL using error-prone PCR, before inserting the resulting library of sequences into the β-lactamase vector yielding a library of 1.3 × 10 6 mutants (Methods). The DNA sequences of 57 variants in the naive library revealed an average mutational frequency of 8 amino acid substitutions per scFv. For screening, the plasmid DNA library of βLa-scFv WFL variants (βLa-scFv WFL *) was transformed into E. coli SCS1 cells and plated onto agar containing 80 µg mL −1 ampicillin. At this antibiotic concentration, colonies should only grow if they express βLa-scFv WFL variants that increase the expression of soluble and functional β-lactamase, compared with wild-type βLa-scFv WFL (refer to Fig. 1c). From the 315 colonies that could grow under this selection pressure, 185 variants were randomly selected and their in vivo growth score was measured, together with that of βLa-scFv WFL and βLa-scFv STT . The resulting data (Fig. 3a) showed that 181 of these 185 variants displayed enhanced growth relative to WFL, with 12 having superior growth to the rationally engineered aggregation-resistant STT. To determine whether the in vivo growth score for these evolved variants also correlates with reduced aggregation propensity within an IgG1 scaffold, ten variants that spanned the rank order were converted to IgG1 molecules for further analyses. Molecules were selected sequentially across the rank (starting with 139, the best performing variant) by calculating in vivo growth scores separated by one standard deviation (s.d.) of the replicate error (βLa-scFv STT n = 16, s.d. = 130). For each of these values, the variant with the fewest substitutions relative to WFL was selected for further analyses in IgG format. This identified eight variants (11, 176, 59, 72, 126, 130, 16 and 139). Two further sequences (37 and 128) were selected for study as these were found to retain the original WFL residues (W35, F36 and L64) yet had improved in vivo growth score. The location and identity of each of the substitutions for these variants are shown in Supplementary Fig. 6. The aggregation propensity of these IgGs was then assessed using HP-SEC. As with the rationally engineered variants (Fig. 2a), a clear inverse correlation was observed between the retention time on HP-SEC and the in vivo growth score: antibodies with high in vivo growth scores exhibited shorter retention times reflecting reduced interactions with the column matrix (Fig. 3b). The aggregation properties of the ten evolved IgG variants, together with IgG WFL and IgG STT , were also measured using AC-SINS (affinity-capture self-interaction nanoparticle spectroscopy) 40,41 . This method identifies self-association by an increase in the plasmon wavelength of gold nanoparticles upon their clustering induced by self-association of antibodies immobilised on their surface. Figure 3b shows an excellent direct correlation between the magnitude of the red shift in AC-SINS and the retention time by HP-SEC, adding further support to the ability of the TPBLA to select for sequences with reduced tendency to self-associate both as scFvs and as intact IgGs. Interestingly, as the majority of the evolved variants displayed similar thermal melting transitions to those observed for IgG WFL (56 and 73°C, assessed by differential scanning fluorimetry, DSF), no correlation was found between the aggregation propensity or the in vivo growth score and thermal stability (Supplementary Fig. 7, Supplementary Table 1).
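The rank-spanning selection of variants for IgG reformatting, described earlier in this section, can be illustrated with a short sketch. The text gives only the outline (target scores spaced by one s.d. of the replicate error, starting from the best performer, and the fewest substitutions chosen at each target), so the windowing rule below (variants within half a step of each target) is an assumption, as are the `Variant` type, the function name and the example data.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Variant:
    name: str
    score: float  # in vivo growth score (A.U.)
    n_subs: int   # substitutions relative to the parent sequence


def select_across_rank(variants: Sequence[Variant], step: float) -> List[Variant]:
    """Pick variants at target scores spaced `step` apart, from the top down.

    Assumed rule: at each target, consider unchosen variants whose score lies
    within step/2 of the target and keep the one with fewest substitutions.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    if not variants:
        return []
    ranked = sorted(variants, key=lambda v: v.score, reverse=True)
    chosen = [ranked[0]]  # the best performer is always included
    bottom = ranked[-1].score
    target = ranked[0].score - step
    while target >= bottom - step / 2:
        window = [v for v in ranked
                  if abs(v.score - target) <= step / 2 and v not in chosen]
        if window:
            # Fewest substitutions wins; ties go to the higher score.
            chosen.append(min(window, key=lambda v: (v.n_subs, -v.score)))
        target -= step
    return chosen


# Hypothetical library (not data from this study); step mimics s.d. = 130.
library = [
    Variant("A", 900, 5), Variant("B", 780, 3), Variant("C", 770, 2),
    Variant("D", 640, 4), Variant("E", 500, 6), Variant("F", 495, 1),
]
picked = [v.name for v in select_across_rank(library, step=130)]
```

Spacing the targets by the replicate error keeps neighbouring picks distinguishable in the assay, and preferring the fewest substitutions keeps each pick close to the parent sequence.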
The application of a single selection pressure may result in increased aggregation-resistance at the expense of target affinity, akin to affinity/stability trade-offs 29 . To assess this possibility, IC 50 values for the cognate antigen NGF 35 were measured for each of the ten evolved IgG variants by a competition binding assay monitored by fluorescence, and the results compared with IgG WFL , IgG STT and MEDI578 (the parent antibody prior to affinity maturation into MEDI1912 35 (Fig. 3c, Supplementary Table 2)). The results showed that all of the evolved antibodies maintain higher affinity to NGF than MEDI578, demonstrating that all variants retain functional activity, with no correlation between IC 50 and in vivo growth score values ( Supplementary  Fig. 8) 35 .

Mutation hotspots identify localised frustration within IgGs.
Analysis of the mutational frequency of individual residues within the aggregation-resistant scFv sequences enabled a protein-specific profile of residues that might contribute to aggregation to be generated. Such 'hotspot' residues represent ideal targets for mutation to improve bioprocessability of the sequence when reformatted as an IgG, or to improve soluble expression of proteins more generally. The mutational-frequency profile across the V H and V L domains of the library (βLa-scFv WFL *) was constructed from the sequences of all 315 variants that grew under the selection pressure of 80 µg mL −1 ampicillin (Fig. 4a). This analysis revealed 12 hotspot residues with a mutational frequency significantly higher (>2 s.d.) than the mean (labelled by residue in Fig. 4a). Nine of these residues, which are all hydrophobic or aromatic, lie in the V H domain, and are clustered in the CDR regions: F30, W35, F36 (CDR1), I56, I57, I59 and F62 (CDR2), and I110 and L112c (IMGT numbering) (CDR3). The remaining three hotspot residues lie in the V L domain (K18, N57 and I71). The chemical identity of the most frequently selected residue, and whether a particular amino acid residue is enriched relative to all other residues possible via a single-base-pair change, was also assessed (Table 1). The hotspot residues in the V H domain ([…] of which are solvent exposed and hydrophobic) tended to be substituted with more hydrophilic residues (Ser and Thr), while the hotspot residues in the V L domain which were initially charged (K18), hydrophilic (N57) or hydrophobic (I71) were substituted with polar or other charged amino acids.
Fig. 3 In vivo growth score of evolved βLa-scFv WFL variants and the aggregation propensity and target affinity of ten selected variants in IgG1 format. a Ranked in vivo growth score of 185 variants (Inset shows error for controls βLa-scFv WFL and βLa-scFv STT , data represent mean values ± s.d. (n = 15 biological repeats)). Ten variants across the rank (11, 176, 37, 59, 128, 72, 126, 130, 16 and 139) were selected and reformatted as full-length IgG1s for biophysical analysis. b HP-SEC retention time (green dots, longer times indicate greater interaction with column matrix) and AC-SINS (purple triangles, larger plasmon shifts correlate with greater self-association; n = 3 technical repeats; error bars are smaller than the symbols, which show mean values) of the ten selected variants in IgG1 format. These data correlate inversely with in vivo growth score (grey bars represent mean values, error bars represent s.e.m., n = 3 technical repeats). c Data used to calculate the IC 50 values of binding of the ten evolved variants in IgG1 format to NGF determined using a homogeneous time-resolved fluorescence assay (HTRF). Data represent mean values ± s.d. (n = 3 technical repeats). Source data are provided as a Source Data file.
Fig. 4 a Mutational frequency of the screened scFv WFL * library reveals 12 residues with a mutational frequency greater than two standard deviations from the average value (2σ). Nine occur in V H and three in V L . b The mutation frequency profile for screened scFv Li33 reveals only three sites with a mutational frequency >2σ. c The mutational frequency of the evolved V L domain, JTO, reveals 10 residues with a mutational frequency >2σ. All profiles use IMGT numbering. The cumulative mutational frequency is normalised to 1 for each dataset. Residues showing high mutational frequencies (>2σ) are labelled in each case. Datasets are pooled from two independent experiments. Source data are provided as a Source Data file.
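The hotspot criterion used for the mutational-frequency profiles (a per-residue frequency more than two standard deviations above the mean, with the cumulative frequency normalised to 1) can be sketched as below. This is an illustration under stated assumptions rather than the analysis pipeline used here: positions are simplified to integers 1..n (real IMGT numbering includes insertion codes such as 112c), the population standard deviation is assumed, and the function name and example counts are hypothetical.

```python
from collections import Counter
from statistics import mean, pstdev
from typing import Iterable, List


def hotspot_positions(mutated_positions: Iterable[int],
                      n_positions: int,
                      n_sd: float = 2.0) -> List[int]:
    """Return positions whose normalised mutation frequency exceeds
    mean + n_sd * s.d. across all positions.

    mutated_positions: one entry per observed substitution, pooled over all
    sequenced variants; positions are integers from 1 to n_positions.
    """
    counts = Counter(mutated_positions)
    total = sum(counts.values())
    if total == 0:
        return []
    positions = range(1, n_positions + 1)
    # Normalise so that the frequencies sum to 1 over the whole sequence.
    freqs = [counts.get(p, 0) / total for p in positions]
    # Population s.d. is an assumption; a sample s.d. would shift the cut-off.
    threshold = mean(freqs) + n_sd * pstdev(freqs)
    return [p for p, f in zip(positions, freqs) if f > threshold]


# Hypothetical pooled substitutions for a 20-residue toy sequence.
observed = [5] * 10 + [12] * 8 + [1, 2, 3]
hotspots = hotspot_positions(observed, n_positions=20)
```

Because the cut-off is relative to each dataset's own mean and spread, a profile with many moderately mutated sites and one with a few strongly mutated sites can both be scored with the same rule.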
In order to understand whether the mutation frequency profile for scFv WFL was specific for this scFv sequence, or simply reflected innate frustration of the Ig-fold itself, we performed the same directed evolution screen on two other IgG scaffolds: a second industrially-derived scaffold (a scFv variant of the anti-LINGO-1 mAb, Li33 42 ) and a λV6-57 V L domain (JTO) isolated from a patient with multiple myeloma with tubular cast nephropathy 43 (Supplementary Figs. 9a and b). The resultant mutational-frequency profiles (Fig. 4b and c), from 140 and 75 DNA sequences, respectively, contrast markedly with each other and are also distinct from that for scFv WFL (Fig. 4a). For JTO, the TPBLA reveals sequence-wide frustration, with clusters of frequently mutated residues (>2σ) observed both within, or directly flanking, its CDRs, as well as in the framework region (notably involving residues 19-21 and 51-54 and the C-terminal region). In accord with recent work by Rennella et al. 44 , this apparently non-specific profile may reflect the fact that the aggregation of this domain is driven from the unfolded state by interactions between APRs throughout the structure. In this case, the TPBLA may select for sequences with both decreased aggregation propensity and increased local or global thermodynamic stability (which decreases the population/lifetime of solvent exposed APRs). In light of this, the most frequently mutated hotspot was found in β-strand B (residues 19-21, Fig. 4c), highlighting this region as a particularly important driver of aggregation as reported previously 44 .
By contrast, both to this sequence-wide effect and the relatively localised profile for scFv WFL , scFv Li33 showed minimal frustration. Only three residues, F30 and I36 in V H (most commonly substituted with S and T, respectively) and L53 in V L (most commonly substituted with P), exhibited substitution frequencies significantly (>2 s.d.) higher than the mean. The difference in the profiles of scFv WFL and scFv Li33 is remarkable, given the similarity of their framework regions (66.5% similarity and 48.2% identity (Supplementary Fig. 10)), but may be expected as their poor developability has been ascribed to different mechanisms: aberrant CDR-CDR (WFL) 35 and CDR-constant region (Li33) 42 interactions. As the TPBLA employs scFvs, it cannot detect aberrant CDR-constant region interactions, and given that Li33's solubility depends critically on the type of IgG scaffold 42 , we hypothesised that the dominant evolutionary pressure in the TPBLA for this sequence may be thermodynamic stability, rather than its innate aggregation propensity, as was also previously observed for the soluble globular protein Im7 using this assay 34 . To test this hypothesis, the scFv sequences for Li33, Li33 I36T , Li33 Y88D and Li33 L53P (the single point variants with the highest in vivo growth score, Supplementary Fig. 9) were grafted into an IgG1 scaffold. Surprisingly, no significant changes in thermal stability were detected between the wild-type and the evolved Li33 I36T and Li33 Y88D variants, whilst a single broad transition was observed for Li33 L53P (Supplementary Fig. 11, Supplementary Table 1). Instead, small but significant reductions in self-association monitored by AC-SINS (Supplementary Fig. 12a) over wild-type Li33 were observed for I36T and L53P. In addition, L53P and Y88D showed increased solubility relative to wild-type as assessed by a polyethylene glycol (PEG) precipitation assay (Supplementary Fig. 12b).
These data suggest that the TPBLA is able to identify (and resolve) specific problematic residues between proteins with identical topologies and highly similar sequences and does not simply identify scFvs with increased thermal stability. Together, these results demonstrate the power of the TPBLA to develop new understanding of the molecular determinants of aggregation associated with proteins of relevance for bioprocessing, as well as those associated with protein misfolding diseases.
Comparison of mutational hotspots with in silico predictions. Several in silico approaches have been developed to identify residues/sequences with poor solubility (e.g. structurally corrected CamSol 18 ), or high aggregation propensity (e.g. Aggrescan3D 45 and SAP 19 ). A comparison of the locations of the sequence hotspots identified here for scFv WFL by directed evolution with those predicted by these algorithms is shown in Fig. 5a, b, Supplementary Fig. 13 and Supplementary Table 3. The results portray the complexity in determining protein aggregation based on predictions alone. Each algorithm detected at least one of the insoluble or aggregation-prone residues in CDR1 and CDR2 of scFv WFL that form the large hydrophobic patch shown previously 35 , and recapitulated here for scFv WFL , to be involved in the aggregation interface. The identity of the residues involved, however, varied between algorithms. SAP and Aggrescan3D also identified a third hotspot-cluster in CDR3. In addition, each in silico method highlighted additional residues in V H that were not identified by directed evolution, and no in silico method identified any of the hotspot residues in V L . In total the three algorithms highlighted 26 residues as potential positions in which aggregation could be suppressed by mutation, including eight of the 12 most frequently mutated residues identified here by directed evolution. However, only three residues are flagged by all three algorithms (Supplementary Fig. 14).
Since the variants of scFv WFL generated by evolution each contain several mutations (for example see Supplementary Fig. 6b), the importance of individual amino acid substitutions to the properties of the proteins remained unknown. To examine the relative importance of each individual substitution and to determine how these values relate to in silico predictions, the in vivo growth score for βLa-scFv WFL * variants containing the most common amino acid substitution in each of the 12 hotspot residues was measured (Supplementary Fig. 15). While no single substitution was found to match the in vivo growth score for STT (690 ± 8 A.U.), F62S, a substitution at a residue not mutated in the rational design of STT 35 , achieved 91% of this enhancement (643 ± 27 A.U.). By contrast, substitution of F36, which was mutated to Thr in the rational design of IgG STT , only achieved 15% of the enhancement when mutated to Ser in our assay (scores for WFL and F36S were 172 ± 53 A.U. and 250 ± 9 A.U., respectively). Quantifying the effect of single substitutions also did not improve the correlation with in silico methods. For example, only three residues (F36, I59 and F62) are flagged by all three computational methods (Supplementary Fig. 14), yet these vary considerably in their in vivo growth score (ranked 10th, 5th and 1st, respectively (Supplementary Fig. 15)). Overall, while there is some agreement between in silico approaches and the TPBLA, the identity of the problematic residues determined experimentally and predicted computationally varies considerably (Supplementary Table 3). Hence, identification of the key residues to target by rational engineering would be difficult using a multi-algorithm approach, highlighting the power of using evolution to find solutions to the problem of aggregation.
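The percentages quoted above for single substitutions can be checked with a short worked calculation. The formula is an assumption (the fraction of the WFL-to-STT score difference recovered by a variant), but it reproduces the reported 91% and 15% from the mean scores given in the text. The function name is illustrative and the reported error ranges are ignored.

```python
def fraction_of_enhancement(variant: float, baseline: float, reference: float) -> float:
    """Fraction of the baseline-to-reference score gain achieved by a variant."""
    span = reference - baseline
    if span == 0:
        raise ValueError("reference and baseline scores must differ")
    return (variant - baseline) / span


# Mean in vivo growth scores (A.U.) quoted in the text.
WFL, STT = 172.0, 690.0
F62S, F36S = 643.0, 250.0

f62s_pct = round(100 * fraction_of_enhancement(F62S, WFL, STT))  # (643-172)/(690-172)
f36s_pct = round(100 * fraction_of_enhancement(F36S, WFL, STT))  # (250-172)/(690-172)
```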

Discussion
Split β-lactamase assays have been used previously as a proxy for several characteristics including protein-protein interactions 46 , a marker for gene expression 47 , and for selecting open reading frames 48 . These assays exploit β-lactamase's stability, the presence of permissive grafting sites in its structure, and the potential for high throughput screening via colorimetric assays. Here, we have shown the ability of the versatile tripartite β-lactamase platform 32,34,49 to distinguish aggregation-prone variants of diverse biopharmaceutically-relevant protein scaffolds from their more aggregation-resilient counterparts. In contrast with other in vivo systems for studying protein aggregation 50,51 , this assay has the advantage that fusion proteins are expressed in the periplasm of E. coli, allowing the formation of disulfide bonds, such as those found in IgGs and their derivatives. Most importantly, no perturbant such as increased temperature, pH or chemical denaturant is used to accelerate aggregation, allowing identification of sequence characteristics that trigger innate (unaccelerated) aggregation pathways. Furthermore, this assay has broad utility as it is agnostic to the underlying mechanism of aggregation (e.g. it can be applied to unstructured peptides with a high propensity for amyloid formation as well as to globular proteins that self-assemble through a variety of mechanisms). We have shown that the aggregation propensity driven by the self-association of IgG WFL is largely determined by the Fv region, as IgG1 and scFv homologues yield the same rank order of aggregation propensity judged both within the βLa fusion, and as purified IgGs (Fig. 2a).
As scFvs are commonly reformatted into IgG scaffolds, and scFv formats are frequently used in phage or other display systems, the assay could readily be integrated into the development pipeline to identify developable sequences directly after discovery and affinity maturation. The assay is amenable to any protein displayed as a single chain, and hence could be used to optimise a wide variety of biologics, including dAbs, scFabs, scFc (with co-expression of these allowing the detection of Fab:Fc interactions) and bispecifics (in scFv format), all of which are poorly characterised in terms of developability relative to platform IgGs. We have also shown here the power of the TPBLA combined with directed evolution to rectify problematic sequences, and to identify mutational hotspots that limit protein behaviour (due to a variety of mechanisms) in both the variable and framework regions. Here, we took a previously characterised IgG with known development issues 35 , IgG WFL , and engineered new sequences (Fig. 3a) with reduced self-association as measured by HP-SEC and AC-SINS (Fig. 3b, c). It is notable that the sequence of the rationally engineered IgG STT was not isolated during screening. The best performing evolved variant involved substitution of only one of these residues (F36), demonstrating the advantages of directed evolution and selection over rational approaches. Screening a randomised scFv library of a second industrially-derived sequence (Li33) identified substitutions in different hotspots. These substitutions were found to improve solubility in the context of an IgG1 scaffold (Supplementary Fig. 12), demonstrating the broad utility of the approach. In this regard, it is intriguing that the mutational-frequency profiles for the 'synthetically-derived' sequences of WFL and Li33 differ significantly from that of a sequence optimised for humoral immunity, examined here using JTO.
A complex mutational profile is observed for the latter, which may reflect both the selective pressure to increase thermodynamic stability and minimise APRs (whose identity found here using the TPBLA correlates with experimental results and in silico predictions 44 ). By contrast, the majority of hotspot residues identified for both scFv WFL and scFv Li33 are located in, or close to, the CDRs. Given the importance of the CDRs in determining epitope binding affinity, this is unsurprising and, at first glance, may appear to be problematic for the maintenance of a successful candidate profile. We have shown here, however, that binding affinity can be maintained concomitantly with a significant improvement in aggregation performance, at least for variants of IgG WFL , presumably because only a subset of CDR residues are directly involved in epitope binding 35 . Indeed, the crystal structure of the scFv of the IgG MEDI578 (the parental sequence of IgG WFL ) in complex with its ligand 35 shows direct interaction of only 6 of the 16 residues in CDR1 and CDR2, with CDR3 making 13 out of a total of 22 contacts to NGF.
Despite the ability of the TPBLA to generate candidate sequences with greatly improved properties, it remains challenging to determine how the different amino acid substitutions introduced actually ameliorate aggregation. This arises because aggregation results from a complex interplay of properties that includes kinetic and thermodynamic stability, the number and position (solvent accessibility) of aggregation-prone regions, and local dynamics that may expose those regions. Analysis of the substitutions made here for scFv WFL in the context of their location in the protein (Table 1) allows the cause of the liability to be putatively assigned. For example, 10 of the 12 hotspot residues found for scFv WFL are hydrophobic/aromatic in nature and all were substituted with more hydrophilic residues, consistent with the mechanism of aggregation suggested previously for this protein 35 . In accordance with this hypothesis, three different algorithms that predict solubility and aggregation propensity of amino acid sequences within structured 18 and dynamic protein domains 19,45 , identify the same region. These algorithms, however, yield different predictions, confusing the choice of residues to mutate in any rational approach to improve protein behaviour.
The ability of the TPBLA to quantify aggregation propensity in the absence of protein purification supports its use as a fundamental research tool. Firstly, it allows optimisation of protein expression for experimentally intractable proteins (e.g. aggregation-prone or insoluble proteins). Secondly, using the TPBLA as a screen for deep mutational scanning 52 would allow areas of sequence frustration (i.e. destabilisation due to functional constraints or aggregation propensity) to be mapped and their underlying mechanisms of aggregation to be better understood, enabling their modification using rational engineering approaches. A fundamental understanding of biopharmaceutical aggregation during manufacture and storage is still lacking more than 30 years since the first introduction of IgGs into the clinic. We have shown here that the TPBLA is a powerful method by which to identify (using the TPBLA alone), or re-engineer (using the TPBLA as a directed evolution screen) inherently manufacturable proteins. Combining the TPBLA with the approaches described above may thus be of enormous practical benefit to create proteins with improved behaviour, and when combined with evolution methods, may be able to provide the mechanistic understanding needed to apply a true quality-by-design approach to biopharmaceutical discovery and development.

Methods
Construction of β-lactamase fusions. Synthetic DNA sequences (purchased from Eurofins Genomics) encoding GCSF, scFv WFL and scFv Li33 , and their variants, were inserted in-frame into the 28-residue G/S linker (previously inserted between residues 196 and 197 of TEM-1 β-lactamase 34 ) in the pMB1-βla-linker plasmid via a 5′ (XhoI) and 3′ (BamHI) restriction site. Ligation products were transformed into E. coli SCS1 cells (Stratagene) and the cells grown on agar plates containing 10 µg mL −1 tetracycline. The identity of the resulting clones was verified by DNA sequencing (DNA and amino-acid sequence of the βla-linker and the test protein variants used are listed in Supplementary Tables 4 and 5).
In vivo growth assay. MCD GROWTH (maximal cell dilution allowing growth) assays were performed in sterile 48-well LB agar plates (Greiner Bio-One, cat. 677102) prepared prior to the assay. Tetracycline (10 µg mL −1 final concentration) and filter-sterilised L-arabinose (final concentration of 0.075% (w/v) for scFv WFL / scFv Li33 or 0.1% (w/v) for JTO) were added to 100 mL of sterile 1.5% (w/v) LB agar cooled to <50°C. Three hundred microlitres of this solution was added into each of the first 6 wells (first row) of the 48-well plates. Ampicillin (10 mg mL −1 stock) was then added to the LB agar stock to give the required concentration for the next row of wells. This procedure was repeated until the plate contained 8 rows of LB agar containing increasing concentrations of ampicillin. β-lactamase-test protein constructs were screened over an ampicillin range of either 0-140 μg mL −1 (20 μg mL −1 increments) or 0-280 µg mL −1 (40 µg mL −1 increments). Agar plates were left to set in a sterile environment.
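The stepwise ampicillin additions described above can be sketched as a short calculation. This is an illustrative sketch only: it assumes that a single 100 mL batch of agar supplies one plate, that 6 × 300 µL is dispensed per row, and that the volume added as ampicillin stock is accounted for at each step. The function name and the mass-balance approach are not taken from the protocol.

```python
# Sketch: volume of ampicillin stock to add before pouring each row of wells.
# Assumption: one 100 mL agar batch per plate; dispensed agar is subtracted per row.
STOCK_UG_PER_ML = 10_000.0   # 10 mg/mL ampicillin stock (from the protocol)
START_VOLUME_ML = 100.0      # LB agar batch volume (from the protocol)
WELLS_PER_ROW = 6            # wells filled per row (from the protocol)
WELL_VOLUME_ML = 0.3         # 300 µL per well (from the protocol)


def ampicillin_additions(targets_ug_per_ml):
    """Return a list of (target concentration, stock volume in µL) per row."""
    volume_ml = START_VOLUME_ML
    current = 0.0
    additions = []
    for target in targets_ug_per_ml:
        # Mass balance: (current*V + stock*v) / (V + v) = target, solved for v.
        v_ml = volume_ml * (target - current) / (STOCK_UG_PER_ML - target)
        additions.append((target, v_ml * 1000.0))
        volume_ml += v_ml
        current = target
        # Agar dispensed into the six wells of this row.
        volume_ml -= WELLS_PER_ROW * WELL_VOLUME_ML
    return additions


rows = ampicillin_additions([0, 20, 40, 60, 80, 100, 120, 140])
for target, v_ul in rows:
    print(f"{target:>4} µg/mL: add {v_ul:6.1f} µL of stock")
```

The same function covers the 0-280 µg mL −1 series by passing targets in 40 µg mL −1 steps.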
A single colony of fresh E. coli SCS1 cells (Stratagene) transformed with the appropriate plasmid was used to inoculate 5 mL sterile LB containing 10 µg mL −1 tetracycline. Cultures were incubated overnight at 37°C with shaking (200 rpm).
One millilitre of overnight culture was used to inoculate 100 mL sterile LB containing 10 µg mL −1 tetracycline and grown at 37°C (shaking at 200 rpm) until an OD 600 of 0.6 was reached. Expression of the β-lactamase fusion construct was induced by the addition of filter-sterilised arabinose at a final concentration of 0.075% (w/v) (scFvs) or 0.1% (w/v) (JTO). Cultures were incubated for a further 1 h then serially diluted 10-fold into sterile 170 mM NaCl solution. Three microlitres of each dilution was then spotted onto the pre-prepared 48-well agar plates. The plates were incubated at 37°C for 18 h and the MCD GROWTH was determined for each ampicillin concentration by visual inspection.
A single value from each MCD GROWTH assay, illustrative of the effect of each scFv on bacterial growth, was calculated from the area under the MCD GROWTH curves as a sum of the areas of 7 trapezia using Eq. (1), where A curve is the total area under the curve, and x i and y i are the x-axis and y-axis values at each concentration of ampicillin.
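Eq. (1) is not reproduced in this text. A minimal sketch consistent with the description (a sum of trapezium areas between adjacent ampicillin concentrations) is given below; it assumes the standard trapezium rule, A_curve = Σ (x_{i+1} − x_i)(y_i + y_{i+1})/2 over the 7 intervals, and the MCD GROWTH values shown are hypothetical.

```python
def area_under_mcd_curve(ampicillin, mcd_growth):
    """Sum of trapezium areas under an MCD GROWTH curve.

    ampicillin: x-axis values (ampicillin concentrations, ascending).
    mcd_growth: y-axis values (MCD GROWTH at each concentration).
    """
    if len(ampicillin) != len(mcd_growth):
        raise ValueError("x and y must have the same length")
    if len(ampicillin) < 2:
        raise ValueError("at least two points are needed")
    area = 0.0
    for i in range(len(ampicillin) - 1):
        width = ampicillin[i + 1] - ampicillin[i]
        mean_height = (mcd_growth[i] + mcd_growth[i + 1]) / 2
        area += width * mean_height
    return area


# Eight ampicillin concentrations give seven trapezia.
x = [0, 20, 40, 60, 80, 100, 120, 140]   # µg/mL, from the assay range
y = [6, 6, 5, 4, 3, 2, 1, 0]             # hypothetical MCD GROWTH values
print(area_under_mcd_curve(x, y))
```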
Construction and expression of IgGs. scFv variants chosen for further study were reformatted into the TM-YTE IgG1 backbone by cloning eukaryote codon-optimised V H and V L domains into human TM-YTE IgG1 heavy chain and light chain expression vectors 53 . The plasmids were co-transfected into HEK293/EBNA mammalian cells (Invitrogen, Catalog no. R620-07) for expression and IgG proteins purified from the culture medium using Protein A chromatography.
High-performance size exclusion chromatography (HP-SEC). HP-SEC was performed using an Agilent 1100 series HPLC fitted with a TSK SWXL HPLC guard column (Tosoh Bioscience)

Cross-linking data analysis. Data were imported into PEAKS studio (version 10) and exported as MGF files, to then be imported and analysed by StavroX. Due to the significantly lower levels of dimer formed for scFv STT , the 200x cross-linked sample was used while the 50x cross-linked sample was used for scFv WFL . The data were searched against the protein sequences for scFv WFL and scFv STT . StavroX parameters used for searching were as follows: K and R protein cleavage sites (with 2 and 1 potential missed cleavages, respectively), fixed modification of C to B (cysteine to carboxyamidomethylcysteine) and variable modifications of M to m (methionine to oxidised methionine) with a maximum of two variable modifications per peptide. The non-deuterated and deuterated BS3 cross-linkers were added to StavroX as C 8 H 10 O 2 (138.07 Da) and C 8 H 6 D 4 O 2 (142.09 Da), respectively. To include all potential cross-links, the site specificity was set as lysine (K) for peptide 1 and to lysine (K), serine (S), threonine (T) or tyrosine (Y) residues, as well as including the N-terminal amine 54 for peptide 2. Mass tolerances were set as 3.0 ppm for the precursor ions and 0.8 Da for fragment ions with mass limits of 200-8000 Da. Low-precise scoring was used with a false detection rate cut-off of 5% and a score cut-off of 10. A decoy database was generated by shuffling the sequence while keeping the protease sites. Manual data validation was achieved using a comparative approach searching for unique peptides from the digested dimer bands. Cross-linked peptides were readily identified from the doublet peak (Δ4 Da) formed by use of a deuterated and non-deuterated cross-linker.
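The doublet signature used for manual validation can be illustrated with a small search over a peak list. This is a sketch under stated assumptions, not the StavroX procedure: it assumes the peaks have already been converted to neutral monoisotopic masses, the tolerance is arbitrary, and the example masses are hypothetical.

```python
# Mass shift for four H -> D substitutions (monoisotopic masses of 2H and 1H).
DEUTERIUM_SHIFT_DA = 4 * (2.014102 - 1.007825)  # about 4.025 Da


def find_doublets(neutral_masses, tolerance_da=0.01):
    """Return (light, heavy) mass pairs separated by the d0/d4 cross-linker shift."""
    ordered = sorted(neutral_masses)
    pairs = []
    for i, light in enumerate(ordered):
        for heavy in ordered[i + 1:]:
            delta = heavy - light
            if delta > DEUTERIUM_SHIFT_DA + tolerance_da:
                break  # masses are sorted, so later ones are even further away
            if abs(delta - DEUTERIUM_SHIFT_DA) <= tolerance_da:
                pairs.append((light, heavy))
    return pairs


# Hypothetical neutral masses (Da); two of the five peaks have d4 partners.
peaks = [1500.700, 1504.725, 2100.100, 2310.400, 2314.425]
doublets = find_doublets(peaks)
print(doublets)
```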
DNA library synthesis. The Diversify PCR Random Mutagenesis Kit (Takara) was used to synthesise a scFv megaprimer (error rate of 8.1 (WFL), 2.7 (Li33) or 5.8 (JTO) mutations per 1000 bp), using forward (5ʹ-GTGGTGGTGGCTCGA) and reverse (5ʹ-AACCGCTCCCGGATC) primers that anneal to the Gly/Ser linker regions up- and downstream of the scFv sequence. The product was purified on a 1% (w/v) agarose gel and the desired band was excised and purified using Qiagen Gel Extraction Kit, according to the manufacturer's instructions. To prevent expression of wild-type βLa-scFv WFL after ligation, a 'stop template' plasmid was created. To this end, two stop codons were inserted into β-lactamase (amino acid positions 109 & 110, Supplementary Table 6) in the pMB1-βLa-scFv WFL plasmid using the Q5 Site-Directed Mutagenesis Kit (NEB). A ten-fold excess of scFv megaprimer was added to the βLa-scFv WFL stop template and splicing performed using the QuikChange Lightning Site-Directed Mutagenesis Kit (Agilent). Two microlitres DpnI was then added to each reaction (1 h, 37°C) to remove template DNA. The product was purified using Qiagen PCR Purification Kit and 2 µL was used to transform TG1 Electrocompetent cells (Lucigen) by electroporation (2.5 kV field strength, 335 Ω resistance and 15 µF capacitance). Following recovery, cells were plated onto pre-prepared LB bioassay agar plates containing 10 μg mL −1 tetracycline and incubated overnight at 37°C.
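The error rates quoted for the megaprimer synthesis can be turned into an expected mutational load per clone. The sketch below is illustrative only: the insert length of 750 bp is a hypothetical value (the lengths of the mutagenised regions are not stated here), and treating mutations as Poisson-distributed is a modelling assumption rather than part of the protocol.

```python
import math


def expected_mutations(rate_per_kb, insert_length_bp):
    """Mean number of mutations per clone for a given error rate and insert length."""
    return rate_per_kb * insert_length_bp / 1000.0


def fraction_unmutated(rate_per_kb, insert_length_bp):
    """Poisson estimate of the fraction of clones carrying no mutation (assumption)."""
    return math.exp(-expected_mutations(rate_per_kb, insert_length_bp))


HYPOTHETICAL_LENGTH_BP = 750  # illustrative insert length, not from the source
for name, rate in [("WFL", 8.1), ("Li33", 2.7), ("JTO", 5.8)]:
    mean = expected_mutations(rate, HYPOTHETICAL_LENGTH_BP)
    unmutated = fraction_unmutated(rate, HYPOTHETICAL_LENGTH_BP)
    print(f"{name}: {mean:.2f} mutations per clone, {unmutated:.3f} unmutated")
```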
Single colonies were picked for sequence analysis before the remaining colonies were removed from the bioassay plates by addition of 10 mL LB medium and scraping off. The culture was centrifuged (10 min, 5000 × g) before DNA extraction using the Qiagen Midiprep Kit, according to the manufacturer's instructions.
Evolution assay. Directed evolution bioassay plates were prepared containing 2.5% (w/v) LB, 1.5% (w/v) agar, 10 µg mL −1 tetracycline, 0.075% (scFvs) or 0.1% (JTO) (w/v) arabinose and either 80 µg mL −1 (WFL) or 140 µg mL −1 (Li33 and JTO) ampicillin. SCS1 Supercompetent Cells (Agilent) were thawed on ice for 10 min and 50 µL cells transferred to a 14 mL round-bottomed transformation tube. Two microlitres of the prepared library plasmid DNA (100 ng µL −1 ) was added to the cells and incubated on ice for 30 min before heat shocking at 42°C for 45 s. After 5 min incubation on ice, 950 µL SOC medium was added to cells and incubated (37°C, 200 rpm) for 1 h. Three millilitres SOC medium was then added to the cells along with 10 µg mL −1 tetracycline. Cells were incubated for 4 h and β-lactamase expression then induced with 0.075% (w/v) arabinose. Cells were then incubated (37°C, 200 rpm) for 1 h. The culture was spread onto the prepared assay plates and incubated overnight at 37°C.
Affinity-capture self-interaction nanoparticle spectroscopy (AC-SINS). AffiniPure goat anti-human IgG Fcγ Fragment specific (IgGα-Fc) and ChromePure Goat IgG, whole molecule (IgG WHOLE ) (Jackson ImmunoResearch) were buffer exchanged into 20 mM potassium acetate, pH 4.3 and diluted to 0.4 mg mL −1 . Nine millilitres of citrate-stabilised 20 nm gold nanoparticles (Expedeon) were incubated with 600 µL IgGα-Fc and 400 µL IgG WHOLE for 2 h at room temperature. Nanoparticles were blocked with 0.1 µM 2000 MW thiolated PEG (Sigma-Aldrich) at room temperature for 1-2 h. Nanoparticles were concentrated to 800 µL in siliconised Eppendorf tubes (VWR) and stored at 4°C. Forty-five microlitres of 50 µg mL −1 antibody samples were mixed with 5 µL nanoparticle solution and incubated at room temperature for 30 min. The mixture was transferred to a 384-well polystyrene UV transparent plate (Thermo Scientific), and the absorbance read from 400 to 700 nm in 1-nm increments. The wavelength of maximum absorbance (the plasmon wavelength) was determined, and the redshift was then calculated by subtracting the plasmon wavelength of nanoparticles in the absence of antibodies from that of the sample.
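The redshift calculation described above can be sketched as follows. The sketch takes the plasmon wavelength as the wavelength of the single highest absorbance reading, as the text describes, with no smoothing or curve fitting; the function names are illustrative and the spectra are synthetic Gaussian curves, not measured data.

```python
import math


def plasmon_wavelength(wavelengths_nm, absorbances):
    """Wavelength at which the absorbance spectrum is maximal."""
    peak_index = max(range(len(absorbances)), key=lambda i: absorbances[i])
    return wavelengths_nm[peak_index]


def plasmon_redshift(wavelengths_nm, sample_abs, reference_abs):
    """Sample plasmon wavelength minus that of nanoparticles without antibody."""
    return (plasmon_wavelength(wavelengths_nm, sample_abs)
            - plasmon_wavelength(wavelengths_nm, reference_abs))


# 400-700 nm in 1-nm increments, as read on the plate reader.
wavelengths = list(range(400, 701))


def synthetic_spectrum(peak_nm, width_nm=40.0):
    """Synthetic Gaussian spectrum used only to exercise the functions."""
    return [math.exp(-((w - peak_nm) / width_nm) ** 2) for w in wavelengths]


reference = synthetic_spectrum(530)   # hypothetical antibody-free nanoparticles
sample = synthetic_spectrum(545)      # hypothetical self-associating antibody
print(plasmon_redshift(wavelengths, sample, reference))  # 15
```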
Epitope competition assay. The relative affinity of the IgG WFL variants for NGF was established using a homogeneous time-resolved fluorescence (HTRF) epitope competition assay. The assay determines relative affinity by measuring the reduction in binding of biotinylated NGF (R&D Systems (256-GF, biotinylated in-house)) to DyLight650-labelled IgG WFL in the presence of increasing concentrations of test IgG. Binding of DyLight650-labelled IgG WFL to biotinylated NGF is detected by FRET between streptavidin Europium cryptate (CisBio), which binds biotinylated NGF and the DyLight650 conjugated to the IgG. Fluorescence was measured on a PerkinElmer EnVision plate reader with the following settings: 100 flashes, delay 70, cycle 2000, Excitation UV2 (TRF) 320 nm, Emission APC 665 (Bandwidth 7.5 nm), Emission Rhodamine 590 (Bandwidth 20 nm), mirror D400/630. The HTRF ratio is calculated by Eq. (2) and the %ΔF is calculated by Eq. (3):

$$\text{HTRF ratio} = \frac{\text{665 nm emission}}{\text{590 nm emission}} \times 10{,}000 \tag{2}$$

$$\%\Delta F = \frac{\text{Sample ratio} - \text{Negative control ratio}}{\text{Negative control ratio}} \times 100 \tag{3}$$

Differential scanning fluorimetry. Twenty microlitres of 0.52 mg mL −1 antibody solution in PBS was added to a white PCR plate (BioRad). SYPRO Orange protein gel stain (5000× stock, Invitrogen) was diluted to 40× in distilled H 2 O prior to addition of 5 µL to each well. The plate was sealed, and melt curves obtained on a BioRad CFX96 Real-Time PCR system (20-95°C, increments of 0.2°C per min and hold time of 10 s) by measuring fluorescence intensities using the FRET channel with excitation from 450 to 490 nm and detection from 560 to 580 nm.
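Eqs. (2) and (3) of the epitope competition assay can be expressed as two small functions. This is a minimal sketch; the function names and the raw emission counts are hypothetical and serve only to show the arithmetic.

```python
def htrf_ratio(emission_665, emission_590):
    """Eq. (2): ratio of 665 nm to 590 nm emission, scaled by 10,000."""
    return emission_665 / emission_590 * 10_000


def percent_delta_f(sample_ratio, negative_control_ratio):
    """Eq. (3): sample ratio relative to the negative control, as a percentage."""
    return (sample_ratio - negative_control_ratio) / negative_control_ratio * 100


# Hypothetical raw counts for one sample well and one negative-control well.
sample = htrf_ratio(5000, 40000)
negative_control = htrf_ratio(1250, 40000)
print(sample, negative_control, percent_delta_f(sample, negative_control))
```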
Relative surface accessibility (RSA). RSA values were calculated by taking the absolute solvent accessible surface area for the residue in the model of the structure of scFv WFL (created by mutating PDB 5J7Z 35 using Pymol 2.1.0) and dividing it by the maximum possible area for the amino acid type as described by Miller et al. 54 .
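The RSA calculation amounts to a single division per residue, sketched below. The maximum areas listed are the values commonly tabulated from Miller et al. for a few residue types and should be checked against the original reference before use; the function name and the example absolute area are hypothetical.

```python
# Maximum solvent accessible surface areas (Å^2) for a few residue types.
# Assumption: values as commonly tabulated from Miller et al.; verify before use.
MAX_ASA = {
    "GLY": 85.0,
    "ALA": 113.0,
    "LEU": 180.0,
    "PHE": 218.0,
    "TRP": 259.0,
}


def relative_surface_accessibility(residue_type, absolute_asa):
    """Absolute solvent accessible area divided by the maximum for that residue type."""
    return absolute_asa / MAX_ASA[residue_type]


# Hypothetical phenylalanine with 109 Å^2 exposed in the structural model.
print(relative_surface_accessibility("PHE", 109.0))  # 0.5
```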
Poly(ethylene glycol) (PEG) precipitation assay. A 40% (w/v) PEG 10,000 (Sigma) solution was prepared in PBS and adjusted to pH 7.0. PEG solution, PBS and 20 µL of IgG stock solution were combined in triplicate in a 96-well plate to achieve a PEG concentration range of 0-10% (w/v) and a final IgG concentration of 0.5 mg mL −1 . Plates were sealed with adhesive sealing film and incubated at 4°C for 24 h. After incubation, samples were thoroughly mixed in their respective wells before 2 µL of each sample was transferred to a Lunatic plate for turbidity measurement at 500 nm on a Lunatic (Unchained Labs). Turbidity of buffer-only controls was subtracted from final readings.
In silico aggregation predictors. A model of the structure of scFv WFL (described above) was used. The CamSol 18 webserver (http://www-vendruscolo.ch.cam.ac.uk/camsolmethod.html) was used to generate a structurally corrected profile at pH 7 with a 10 Å patch radius to identify soluble and insoluble amino acids. The Aggrescan3D 2.0 45 server (http://biocomp.chem.uw.edu.pl/A3D2/) was used to predict aggregation propensity. Predictions were made in dynamic mode with a 10 Å radius, and the stability calculation option was selected, using FoldX 55 to optimise the input structure. Spatial aggregation propensity (SAP) calculations were performed using CHARMM 56 simulations and the method described by Chennamsetty et al. 19 using a 10 Å radius.
Reporting summary. Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability
The datasets generated during the current study are available in the University of Leeds data repository (https://doi.org/10.5518/739) and the Source Data file.