Crystal structure of an L chain optimised 14F7 anti-ganglioside Fv suggests a unique tumour-specificity through an unusual H-chain CDR3 architecture

Targeted cancer immunotherapy offers increased efficacy concomitantly with reduced side effects. One antibody with promising clinical potential is 14F7, which specifically recognises the NeuGc GM3 ganglioside. This antigen is found in the plasma membrane of a range of tumours, but is essentially absent from healthy human cells. 14F7 can discriminate NeuGc GM3 from the very similar NeuAc GM3, a common component of cell membranes. The molecular basis for this unique specificity is poorly understood. Here we designed and expressed 14F7-derived single-chain Fvs (scFvs), which retained the specificity of the parent antibody. Detailed expression and purification protocols are described as well as the synthesis of the NeuGc GM3 trisaccharide. The most successful scFv construct, which comprises an alternative variable light chain (VLA), allowed structure determination to 2.2 Å resolution. The structure gives insights into the conformation of the important CDR H3 loop and the suspected antigen binding site. Furthermore, the presence of VLA instead of the original VL elucidates how this subdomain indirectly stabilises the CDR H3 loop. The current work may serve as a guideline for the efficient production of scFvs for structure determination.

GM3, which contains the sialic acid N-glycolyl neuraminic acid. NeuGc GM3 is displayed on the cell surface of a range of cancer cells, e.g. breast carcinoma, melanoma, retinoblastoma and lymphoid tumors [14][15][16][17] . The ceramide tail is anchored in the membrane, whereas the hydrophilic glycan head is exposed to the extracellular environment. NeuGc GM3 is structurally highly similar to the ganglioside N-acetyl GM3 (NeuAc GM3), with the only difference being an additional oxygen atom found in NeuGc GM3. The presence of NeuGc GM3 in certain human cancer types remains enigmatic [18][19][20] . NeuAc GM3 can be converted to NeuGc GM3 by cytidine monophosphate-N-acetylneuraminic acid hydroxylase (cmah) 21 . However, in contrast to other mammals, humans carry a deletion in the CMAH gene rendering normal human cells unable to produce NeuGc GM3 22,23 . Nevertheless, there are minute amounts of NeuGc GM3 present even in human healthy cells, probably due to incorporation through the diet 24,25 . Still, NeuGc GM3 fulfils the criteria of being a tumour-specific extracellular marker, which makes it a very potent antigen for targeted cancer immunotherapy.
The monoclonal antibody (mAb) 14F7 is a murine IgG that binds selectively to NeuGc GM3 14 . It has been reported to kill cells by disrupting the integrity of tumour cells through an oncosis-like mechanism 26 and in addition serves as a powerful prognostic tool 15,27 . The affinity for NeuGc GM3 has earlier been reported to be in the low nanomolar range (K D = 25 nM), whereas 14F7 exhibits poor affinity for NeuAc GM3 28 . However, the structural basis for 14F7′s discriminatory power is currently poorly understood. The crystal structure of the 14F7 Fab domain without its ganglioside ligand was determined more than a decade ago 29 , but the ganglioside complex has remained recalcitrant to crystallisation and efforts to reliably model the binding mode have still left many open questions 30 . What is known from light chain shuffling studies 28 is that the interaction is mainly with the 14F7 heavy chain Fv. Moreover, in vitro directed evolution through combinatorial phage display and subsequent mutagenesis have identified crucial residues for the recognition of the ganglioside 30 .
One of our main experimental challenges has been to reproducibly crystallise the 14F7 Fab. Therefore we wished to explore an alternative approach, based on a single-chain Fv (scFv). A scFv is an antibody-derived fragment that contains the heavy chain variable region (V H ) and the light chain variable region (V L ) tethered covalently by a linker. The construction and expression of a panel of 14F7-derived scFv antibody fragments that maintain recognition of NeuGc GM3 using a phagemid vector have been reported previously 28 , but only allowed the production of small amounts of soluble scFvs. For structural characterisation, we need larger quantities of stable protein. With the aim of obtaining the crystal structure of the NeuGc binding paratope, we set out to explore alternative scFv design strategies, and develop expression and purification protocols that may also be applicable to other systems. We describe the construct design and validated production protocols for four 14F7-derived scFvs and report the successful crystallisation and structure determination of one of these to 2.2 Å resolution. In addition we describe a synthesis protocol for the NeuGc GM3 trisaccharide. Together, this information provides a valuable platform for future structural investigation and engineering of the unique specificity of 14F7 for its tumour specific antigen.

Results
Synthesis of the NeuGc GM3 trisaccharide. The synthesis of the NeuGc GM3 trisaccharide was carried out according to the scheme in Fig. 1. Initial glycosylation attempts between benzoyl protected 3′,4′-diol lactose acceptor 2 31 and donor 1 32 using IBr/AgOTf 33 as promoter gave no sialylation product, only elimination of the Neu5Gc donor. However, using the same reaction conditions and donor, but changing the acceptor to the more active benzylated lactoside 3 34 efficiently afforded the α-linked sialylation product 4 in 70% yield. De-protection of 4 through catalytic hydrogenolysis followed by Zemplen deacylation and saponification afforded the target structure 5 in 94% yield.
Design of scFv variants. Four different scFv variants of the original murine 14F7 were designed ( Fig. 2 and Supplementary Data S1). In addition to the original 14F7 light chain, we used an alternative light chain variable region (V LA ) based on previous phage display light chain shuffling experiments (clone 3Fm) 28 . Furthermore, we designed two linkers connecting V H and V L termed L1 and L2. L1 (N-KLSGSASAPKLEEGEFSEARV-C) was adapted from an established vector system for expression of scFvs in Escherichia coli 35 , while L2 (N-KLAPQAKSSGSGSESKVDARV-C) consists of an extended version of the linker used in the prior light chain shuffling screening 28 . The residues from the prior version are underlined. Matching V H with either V L or V LA and linkers L1 or L2, gave rise to four different 14F7-derived scFv constructs named C1 to C4 (Fig. 2).
Periplasmic expression and purification of 14F7-derived scFvs C1 to C4. The codon-optimised scFv 14F7 variants were cloned into an engineered pFPKEN vector that had a pelB leader sequence for translocation to the periplasm following translation. Two of the four constructs showed weak expression (C2 and C4), while the other two (C1 and C3) gave good yields (5 mg/L culture and 3 mg/L culture, respectively) (Fig. 3A). Both well-expressing constructs contain the alternative light chain 3Fm, in keeping with the earlier light chain shuffling experiments by Rojas et al. 28 . The highest quality protein was obtained by periplasmic extraction using the described buffer containing sucrose and lysozyme. Isolation and purification of the protein from whole cell lysate was also attempted and gave higher yields, but the scFvs appeared partly degraded based on SDS-PAGE analysis (Supplementary Data S2). Affinity purification using sepharose-coupled protein L proceeded immediately after isolation from the periplasmic fraction. Protein L binds to the majority of correctly folded antibody Vκ domains, and therefore offers a highly selective purification step for antibodies and fragments thereof with this domain 36 . During the initial purification, the scFv constructs were prone to proteolysis. Speed of work, keeping all solutions on ice and the presence of protease inhibitors were crucial to obtaining non-degraded protein, with an inhibitor cocktail showing the best effect. The 14F7 scFv fragments bound firmly to the protein L resin and eluted in the later fractions. In a second step, the protein was further purified using SEC, eluting in two peaks. The main fraction contained monomeric scFv (Fig. 3B,C). Repeating SEC using the monomeric fraction did not indicate further dimerization (not shown), therefore the protein was judged to be sufficiently stable and mono-disperse to allow down-stream characterisation and crystallisation. 
Storage of the scFv at −80 °C was possible without losing activity provided that the aliquoted protein samples were flash-frozen in liquid nitrogen. Based on the expression yields and purification efficiency, the scFv C1 construct was selected for detailed characterisation and crystallisation trials. This construct contains the alternative variable light chain V LA and the L1 linker adapted from an established vector system for expression of single-chain T-cell receptors (scTCRs) and scFvs in E. coli 35,37 , which was not previously used for 14F7-related work.
Stability, affinity and specificity of scFv C1. In order to assess the binding activity and selectivity of the designed 14F7 scFvs, the C1 representative was subjected to ganglioside-binding analysis using ELISA. NeuGc GM3 and NeuAc GM3 were immobilised on the plate and incubated with scFv C1 or control 14F7 mAb. Both antibody formats could clearly distinguish between the two gangliosides: they showed strong binding to the N-glycolyl variant and negligible binding to NeuAc GM3 (Fig. 3D,E). Fitting of the ELISA data yielded a K D,app of 4 nM for scFv binding to NeuGc GM3 (Fig. 3D), compared to a K D,app of 2 nM for the mAb (Fig. 3E), in line with earlier experiments 28 . The thermostability of the scFv construct was estimated from a tryptophan fluorescence scan at increasing temperatures. A single clear transition occurred around 71.4 °C, indicating the melting point of the protein. No significant difference was found in the presence of the NeuGc GM3 trisaccharide 5 (Fig. 3F). Both ELISA and thermostability measurements were carried out in triplicate. The data are summarised in Fig. 3G.
Crystal structure of scFv C1. The crystal structure of scFv C1 was determined to 2.2 Å resolution. Despite the different light chains, it is structurally highly similar to the 14F7 Fab structure determined earlier to 2.6 Å resolution (PDB ID: 1RIH) 29 , with an all-atom r.m.s.d. value of 0.5 Å. The largest difference between the two structures is found in the CDR regions, notably in the important CDR H3 loop, which is subject to a small backbone shift (Fig. 4A). Based on mutation analysis and computational modelling, this loop was predicted to play an important role in NeuGc recognition 29 , which has recently been confirmed by phage display studies 30 . This work showed for example that CDR H3 residue Arg98 is essential for retaining binding affinity and cannot be substituted by any other amino acid without losing activity. While the exact position of Arg98 is governed by the shift in the backbone of the CDR H3 loop, the position in the scFv is comparable to that in the Fab. In the scFv C1 structure, four molecules (M1 to M4, see Supplementary Data S3) are present in the crystal's asymmetric unit, which allows for an independent structural comparison of the CDR H3 loop. The loop cannot be fully traced in the electron density of two of these molecules (M3 and M4), which likely reflects some degrees of flexibility. In molecules M1 and M2 (Fig. 4B,C), however, the CDR H3 loops are well defined. In both cases the CDR H3 loop is engaged in different crystal contacts (see Supplementary Data S3). The CDR H3 conformation of M1 resembles that of the 14F7 Fab more closely than that of M2, with main chain distances at the loop tip of 2.3 Å and 5.5 Å, respectively ( Fig. 4B,C, dashed lines). Another region that showed disorder in the crystal structure is the L1 linker region, which was not traceable in any of the four molecules, suggesting increased mobility compared to the rest of the molecule.
A second residue critical for NeuGc recognition is Trp33 in CDR H1 30 (Fig. 4D). This residue is positioned exactly as in the 14F7 Fab. At the higher resolution of the current analysis, the structure reveals an extensive hydrogen bonding network in the suspected hydrophilic binding pocket 29 .
The differences in light chains between scFv C1 (V LA ) and 14F7 mAb and Fab (V L ) call for a detailed comparison. The sequence identity between the two light chains is 61%. A sequence alignment and stereo image of the scFv, with V L -V LA differences mapped, can be found in Supplementary Data S4. We focused in particular on the interactions between the CDR regions of V H and V L or V LA . At 874 Å 2 versus 828 Å 2 , the interface between V LA and V H in the scFv is somewhat larger than that between V L and V H in the Fab, as calculated using PDBsum 38 . The interface is largely hydrophobic and quite tightly packed, featuring a number of aromatic residues involved in stacking interactions and hydrogen bonds. Especially important is the interaction with V H CDR H3 (Fig. 4E). It is therefore plausible that V L /V LA contribute indirectly but importantly to NeuGc GM3 binding by stabilising the characteristic long CDR H3 loop and configuring it for antigen recognition. Such stabilisation may be mediated directly, e.g. by π-π interactions of CDR H3 Tyr100 E with V LA residues Tyr49 and Tyr50 (the latter in CDR L2) (Fig. 4E), as well as indirectly through a water-mediated hydrogen bonding network close to the interface with the scFv framework (Fig. 4D).

Discussion
There is considerable interest in antibody fragments that fully retain the antigen-binding capacity, due to their application as building blocks in advanced immuno-therapeutics [39][40][41] . Several scFvs for therapeutic use are in development; however, none have entered the market. ScFvs have the advantage of rapid delivery and penetration of tumour cells compared to mAb molecules 42,43 . Their unique specificity renders them a versatile delivery vehicle for diagnosis, radio-immunotherapy and gene therapy, especially if fused to drugs or effector molecules 44 . Furthermore, the smaller scFv (28 kDa) is likely more crystallisable than the larger Fab or full mAb molecules. This work can therefore provide important structural information to guide further scFv development.
The production of scFvs, in the amounts and quality sufficient for structural determination, requires an efficient expression system that allows successful formation and maintenance of disulphide bonds crucial for structural integrity. The periplasmic chaperone FkpA enhances expression yields by facilitating folding and reducing protein degradation 45,46 . Here, we focused on the scFv derived from the 14F7 mAb, an anti-tumour antibody against NeuGc GM3 that has significant clinical potential 15,27 . The 14F7 Fab was crystallised more than a decade ago 29 , however, with very poor reproducibility; and in particular, there is still no crystal structure of the antigen complex. In this work, we therefore followed two aims: to provide good starting conditions to finally solve the structure of the 14F7 antigen complex and to more generally explore strategies to produce well-expressing scFvs.
In total four different constructs were produced and evaluated, based on two different variable light chains (V L and V LA ) and two different linkers (L1 and L2) (Fig. 2). Only two of these constructs, C1 and C3, expressed in adequate yields in XL1-Blue E. coli cells. Remarkably both C1 and C3 contain the alternative light chain variable region (V LA ) previously selected from a phage-displayed light chain shuffling library 28 based on the ability to pair with 14F7 V H and allow the secretion of the corresponding scFv to the bacterial periplasm. Thus the need of replacing the original 14F7 V L to achieve successful bacterial expression was confirmed in the context of a different vector/host strain system, implying that V L somehow impairs the secretion process of the antibody fragments, whereas the sequence of the linker region appears to be of less importance in terms of expression levels. However, the linker may still affect stability as indicated generally for scFvs in computational studies 47 . In this work, two different linkers were used, both of which are slightly longer than those of the previous constructs by Rojas et al. and designed based on experiences by Løset and colleagues from expression of scTCRs and scFvs in E. coli 35,37 . The linker region was disordered in the crystal structure of scFv C1. Mutagenesis of parts of the linker to residues conferring structural rigidity and reduced conformational plasticity, as well as exhibiting higher propensity towards the formation of secondary structure elements (i.e., α-helix or β-sheet) may stabilise this region more and reduce proneness to aggregation and/or unwanted proteolytic activity. In addition, this may also improve chances of crystallisation. Indeed, in another study, the L1 linker used herein was subject to a Leu to Pro mutation in the second position when channelled through thermostability engineering as part of a highly unstable scTCR 48 . 
We further note that the elongation of the linker from N-EKSSGSGSESKVD-C 28 to N-KLAPQAKSSGSGSESKVDARV-C (as in L2) possibly releases a strain between the domains, allowing for the detailed structural characterisation presented here.
Apart from the construct design, the expression and purification conditions were important. For example, we found that periplasmic isolation of the protein was worthwhile, despite the loss in overall yields, due to the significantly improved scFv quality. The protein molecules were much less degraded (compare Fig. 3A and Supplementary Data S2). Additional benefits were achieved by working at 4 °C throughout and using a protease inhibitor cocktail rather than individual inhibitors. Several attempts to increase yields by induction with IPTG were found to be counterproductive and we therefore assume that the leaky basal expression is close to optimal at the levels of chaperone present.
The scFv C1 construct was tested for binding activity using ELISA and compared with the 14F7 mAb. The data clearly show that scFv C1, just as the 14F7 mAb, can discriminate strongly between NeuGc and NeuAc GM3, and barely binds the latter. This is important for potential clinical applications of the humanised antibody. While the ELISA experiment is a relatively crude method for determining the K D , it nevertheless indicates that the K D of the scFv is comparable to the apparent K D of the mAb with respect to NeuGc GM3 binding. The strong binding affinity, differentiation between NeuGc and NeuAc, and high thermostability make scFv C1 a very attractive molecule for drug delivery and immunotherapy. Surprisingly, no further increase in thermostability was observed when bound to the NeuGc GM3 trisaccharide 5. This could indicate that the lipid part of the ganglioside contributes importantly to binding and the trisaccharide has limited binding affinity on its own. Alternatively, since its thermostability is already high, it is possible that 5 binds strongly to its target, but that the stabilizing effect of binding is nevertheless negligible in the already highly stable scFv.
Structural analysis of scFv C1 revealed that the important CDR H3 region adopts a similar conformation as in the 14F7 Fab, stabilised by tight interactions with the light chain CDRs. There is, however, some variation between the four different molecules in the crystal. For M1 and M2, the electron density was of sufficient quality to allow tracing of the chains. They adopt slightly different conformations, which at the tip diverge by up to 5.5 Å from the Fab structure (Fig. 4C). The different conformations may provide suitable alternative starting positions for in silico modelling of the antigen complex. An additional benefit of the scFv structure compared to the Fab is its higher resolution. While the conformation responsible for recognizing the NeuGc GM3 antigen may only be revealed experimentally, the dynamic nature of the CDR H3 may be important for the actual recognition mechanism. After all, this long loop has to be inserted quite deeply into the biological membrane, as discussed in more detail below.
Phage display-based directed evolution experiments showed that while most substitutions in CDR H3 abolished binding to NeuGc GM3, mutation of Trp33 to Phe, Tyr or Gln rescued binding affinity 30 . Furthermore, Trp33Gln produced moderate cross-reactivity to NeuAc GM3. The position of Trp33 in the scFv is identical to that of the Fab, but at the higher resolution, additional water molecules were identified. These water molecules contribute to a hydrogen bonding network that may be important for ligand recognition (Fig. 4D). Furthermore, the maintenance of aromaticity without hydrogen bonding capability resulting from the Trp33Phe mutation 30 suggests that a π interaction at this position may be essential to NeuGc specificity. The other rescuing variant, Trp33Gln, suggests that Gln may give rise to an alternative hydrogen bonding network that still supports trisaccharide binding, but compromising specificity.
Previous work indicated that it is mainly, if not exclusively, the prominent CDR H3 loop that recognises the antigen [28][29][30] . However, a comparison of the V L -V H interface of the different variable light chains of the Fab and scFv C1 structures (V L and V LA , respectively) reveals important interactions that may be indirectly linked to antigen affinity (Fig. 4E). These include the π-π interactions of Tyr100 E found in CDR H3 with residues of the V L and V LA CDR L1. Interestingly, computational studies indicate that destabilisation of the V L CDRs may lead to an increased stability of V H CDRs and vice versa 47 , thus suggesting that more distal, indirect effects on protein dynamics by mutations in CDR regions could affect ligand binding.
One intriguing aspect of the small NeuGc GM3 glycan head group is that it would bring the scFv in close proximity to the plasma membrane, thus logically implying a potential interaction with lipids surrounding the ganglioside. In fact, molecular dynamics simulations have suggested that GM3 is deeply embedded in the cell membrane and that only the terminal saccharides are exposed 49,50 . Moreover, the interaction of gangliosides with cholesterol in lipid rafts appears to bend the glycolipid head group into a conformation almost parallel to the membrane 51,52 . Recently, internalizing scFvs have been described and even selected for by phage display 53,54 . Selecting for this trait in conjunction with recognition of NeuGc GM3 may provide a very potent delivery molecule for future antibody-drug conjugate development. The stable production and structure determination of this active scFv, which retains high binding affinity and stability, is a good starting point for the structural characterisation of the antigen complex. Such a structure would finally reveal the molecular details of NeuGc GM3 selectivity and moreover provide the basis for scFv-membrane interactions, contributing valuable information to the development of 14F7-derived therapeutics. Last but not least, the work described here provides valuable new insight and general guidelines for the production of well-expressing scFvs.

(4,7,8,9- (4). A mixture of acceptor 3 (0.200 g, 0.226 mmol), donor 1 (0.170 g, 0.272 mmol) and molecular sieves (3 Å) in dry CH 2 Cl 2 (3 mL) was stirred for 20 min under a nitrogen atmosphere, when a solution of AgOTf (0.102 g, 0.678 mmol) in dry MeCN (4.5 mL) was added. After 15 min of stirring at room temperature, the reaction mixture was cooled to −45 °C. Thereafter a solution of IBr (1.0 M in CH 2 Cl 2 , 0.45 mL, 0.452 mmol) was added dropwise. The reaction was stirred and gradually allowed to warm to −30 °C, then stirred for a total reaction time of 3 h.
Diisopropylamine was added to neutralise the reaction and the stirring was continued for another 20 min. The mixture was filtered through a pad of Celite, and the filtrate concentrated in vacuo. The residue was purified by flash column chromatography on silica gel (toluene/MeCN, 3:1) to give 4 (0.231 g, 70%) as a white solid. [α] D + 2.3 (c 1.0, CHCl 3 ). The correct product was verified by NMR and mass spectrometry. 1 H NMR (500 MHz, CDCl 3 ) chemical shifts can be found in Supplementary Data S5.
(5). 10% w Pd(OH)2 (75 mg) was added to 4 (0.100 g, 0.068 mmol) in MeOH (5 mL). The reaction mixture was stirred under a hydrogen atmosphere (using a balloon) at room temperature for 2 days, until conversion was complete according to TLC. The reaction was filtered through PTFE frits (3 frits stacked on top of each other: 20 μm, 10 μm, 5 μm) and rinsed with MeOH. The filtrate was concentrated in vacuo to give a crude de-benzylated compound as a white solid. Freshly prepared sodium methoxide was added to a solution of the crude product dissolved in dry methanol (3 mL) under a nitrogen atmosphere. The mixture was stirred at room temperature until conversion was complete (16 h) according to TLC. Then water (0.5 mL) was added to the reaction mixture and additional sodium methoxide was added to reach a pH of 12. The reaction was stirred at room temperature for 24 h and then neutralised by the addition of Dowex H + ion exchange resin. The resin was filtered off and washed with methanol and water. The filtrate was concentrated to a crude product, followed by freeze drying to afford an anomeric mixture (α:β, 6:11) of compound 5 (0.042 g, 94%) as a light white solid. The correct product was verified by NMR and mass spectrometry. 1 H NMR (500 MHz, CDCl 3 ) chemical shifts can be found in Supplementary Data S5.
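The reagent stoichiometry and isolated yield quoted for compound 4 can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: the function names are hypothetical, and the molar mass of 4 is not stated in the text, so it is back-calculated here from the quantity "0.100 g, 0.068 mmol" given for the deprotection step (an assumption).

```python
# Quantities taken from the glycosylation procedure for compound 4 (mmol).
acceptor_mmol = 0.226  # acceptor 3, treated as the limiting reagent
reagents_mmol = {"donor 1": 0.272, "AgOTf": 0.678, "IBr": 0.452}


def equivalents(reagent_mmol, limiting_mmol):
    """Molar equivalents of a reagent relative to the limiting reagent."""
    return reagent_mmol / limiting_mmol


def percent_yield(product_g, product_mw, limiting_mmol):
    """Isolated yield in percent; product_mw is in g/mol."""
    product_mmol = product_g / product_mw * 1000.0
    return 100.0 * product_mmol / limiting_mmol


# Assumption: molar mass of 4 back-calculated from "0.100 g, 0.068 mmol".
mw_4 = 0.100 / 0.068 * 1000.0  # roughly 1470 g/mol

eq = {name: round(equivalents(n, acceptor_mmol), 1)
      for name, n in reagents_mmol.items()}
yield_4 = percent_yield(0.231, mw_4, acceptor_mmol)

print(eq)
print(f"yield of 4: {yield_4:.1f}%")
```

Under these assumptions the donor, AgOTf and IBr correspond to about 1.2, 3 and 2 equivalents, and the computed yield is consistent with the reported 70%.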
Expression and purification of scFv constructs. 14F7-derived scFvs C1-4 were expressed in XL1-Blue E. coli cells. Transformed cells were grown overnight in 2x YT medium supplemented with 2% glucose and 100 μg/mL ampicillin (2x YT GA ). Other media (Lysogeny Broth and Terrific Broth) were also tested, and Terrific Broth gave almost as good yields as YT medium. Overnight cultures were used to inoculate larger volumes and were allowed to grow at 37 °C at 125 rpm until OD 600 reached 0.6-0.8. The cells were pelleted at 4000 × g for 40 minutes at 4 °C. Pellets were re-suspended in equal volumes of 2x YT without glucose (2x YT A ) to allow protein expression at 30 °C overnight. No IPTG was added to induce expression, after tests revealed that the application of IPTG leads to lower protein quality. Cells were harvested the following morning by centrifugation at 4000 × g at 4 °C. Periplasmic extracts were prepared by re-suspending the pellets in extraction buffer (50 mM Tris-HCl pH 7.5, 20% sucrose, 1 mM EDTA, 80 μg/μL lysozyme, 80 μg/μL DNase, cOmplete protease inhibitor (Sigma)), and yielded approximately 5 ml per gram cell pellet. The solutions were stirred for one hour on ice, before the soluble fractions were isolated by centrifugation at 18000 rpm for 30 min at 4 °C. The periplasmic extracts were loaded on a 1 mL protein L column (Pierce), washed with PBS, and the captured proteins were eluted with 0.1 M glycine, pH 2.5. Following elution, fractions were immediately neutralised with 1 M Tris-HCl pH 7.5. The proteins were further purified by size exclusion chromatography (SEC). SEC of scFv C1 used in subsequent crystallisation and diffraction experiments was done in 20 mM Tris-HCl pH 7.5 and 150 mM NaCl. SEC of scFv used for downstream characterisation experiments (ELISA, thermostability), was performed in PBS. 
Protein concentrations were determined (Implen NanoPhotometer) using extinction coefficients calculated from the protein sequences by ProtParam 55 (Supplementary Data S1). Protein integrity was assessed by reducing SDS-PAGE and visualised by Coomassie staining.
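For readers unfamiliar with the conversion, the concentration estimate from absorbance at 280 nm follows the Beer-Lambert law, c = A / (ε · l). The sketch below is a minimal illustration only: the function name is hypothetical, the extinction coefficient is a placeholder (the values actually used are in Supplementary Data S1 and are not reproduced here), and a 1 cm path length is assumed.

```python
def concentration_mg_per_ml(a280, ext_coeff, mw_da, path_cm=1.0):
    """Protein concentration from A280 via the Beer-Lambert law.

    ext_coeff is the molar extinction coefficient in M^-1 cm^-1,
    mw_da the molecular weight in Da (g/mol). Returns mg/mL.
    """
    molar = a280 / (ext_coeff * path_cm)  # mol/L
    return molar * mw_da                  # g/L, numerically equal to mg/mL


# Hypothetical example: A280 = 1.0, placeholder extinction coefficient of
# 50,000 M^-1 cm^-1, and the approximate scFv mass of 28 kDa.
example = concentration_mg_per_ml(1.0, 50000.0, 28000.0)
print(f"{example:.2f} mg/mL")
```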
Binding assays using ELISA. NeuGc GM3 was obtained from horse erythrocytes using a modification of the Folch method 56 . NeuAc GM3 was purchased (Avanti Polar Lipids). Both gangliosides were solubilised in methanol. All steps of the protocol were performed at room temperature. The wells of a Nunc-Immuno 96 MicroWell PolySorp solid plate (Sigma) were coated with 100 μL ganglioside solution at 10 μg/mL and left to dry overnight. The next morning the wells were washed three times with 100 μL PBS T per well (1x PBS containing 0.1% Tween 20) before the addition of 200 μL of PBS TB (PBS T containing 2% BSA). The blocking solution was left for one hour, followed by three washes with PBS T . The scFv C1 stock solution was prepared in PBS TB at concentrations of 2.10, 6.20, 18.5, 55.6, 167 and 500 nM. The 14F7 mAb was prepared in PBS TB at concentrations of 3.70, 11.1, 33.3, 100 and 300 nM. The mAb served as a positive control, and two negative controls were also included: PBS TB (no protein added) to map background absorption and an unrelated scFv to control for unspecific binding. All samples were set up in triplicate to ensure reproducibility, using 100 μL protein solution per well. The samples were incubated for one hour, followed by three washes with PBS T . The target-bound scFv or mAb was detected using protein L coupled to horseradish peroxidase (HRP, Genscript), diluted 1:2000 in 1x PBS. 100 μL were added to each well and incubated for one hour. Subsequently, the plates were washed four times with PBS T , followed by the addition of 100 μL of the substrate 3,3′,5,5′-tetramethylbenzidine (TMB). The reaction was stopped by the addition of 100 μL 1 M HCl to each well and the absorbance was measured at 450 nm in a Multimode microplate reader (Thermo Scientific). The ELISA data were background subtracted and normalised to the data point of the highest protein concentration, assuming saturation at this point (θ = 1).
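The concentration series and the normalisation step can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: the listed concentrations are consistent with three-fold serial dilutions from 500 nM (scFv) and 300 nM (mAb), which is assumed here, and the function names and absorbance readings are hypothetical.

```python
def serial_dilution(top, factor, n):
    """n concentrations, ascending, obtained by diluting `top` by `factor`."""
    return [top / factor ** i for i in range(n)][::-1]


def normalise(signals, background):
    """Subtract the background and scale so that the last point
    (highest protein concentration) equals 1, i.e. theta = 1."""
    corrected = [s - background for s in signals]
    return [c / corrected[-1] for c in corrected]


scfv_nM = serial_dilution(500.0, 3.0, 6)  # approx. 2.06 ... 500 nM
mab_nM = serial_dilution(300.0, 3.0, 5)   # approx. 3.70 ... 300 nM

# Hypothetical mean A450 readings for the six scFv concentrations
# and for the no-protein control (background).
a450 = [0.35, 0.65, 0.90, 1.00, 1.04, 1.05]
theta = normalise(a450, background=0.05)

print([round(c, 2) for c in scfv_nM])
print([round(t, 2) for t in theta])
```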
A simple one-binding-site model, $\theta = \frac{[\mathrm{protein}]}{K_{D,\mathrm{app}} + [\mathrm{protein}]}$, was fitted to the data, where [protein] is the actual concentration of either mAb or scFv and K D,app is the apparent dissociation constant of the system. Due to the potentially bivalent interaction of the mAb (two binding sites), the value does not represent a true K D for the mAb.
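A one-binding-site model of the form θ = [protein] / (K D,app + [protein]) has a single free parameter, so a least-squares estimate can be obtained with a simple one-dimensional search. The sketch below is one possible implementation and not necessarily the fitting routine used for the published values; the function names are hypothetical and the data are synthetic, generated with K D,app = 4 nM purely to check that the search recovers it.

```python
import math


def theta_model(protein_nM, kd_app_nM):
    """Fractional occupancy for a single-site binding model."""
    return protein_nM / (kd_app_nM + protein_nM)


def fit_kd_app(conc_nM, theta_obs, lo=1e-3, hi=1e4, iterations=200):
    """Least-squares K_D,app via golden-section search over log10(K).

    Assumes the sum of squared residuals has a single minimum
    between `lo` and `hi` (both in nM).
    """
    def sse(log_k):
        k = 10.0 ** log_k
        return sum((theta_model(c, k) - t) ** 2
                   for c, t in zip(conc_nM, theta_obs))

    a, b = math.log10(lo), math.log10(hi)
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iterations):
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if sse(c) < sse(d):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    return 10.0 ** ((a + b) / 2.0)


# Synthetic, noise-free data generated with K_D,app = 4 nM.
conc = [2.06, 6.17, 18.5, 55.6, 167.0, 500.0]
obs = [theta_model(c, 4.0) for c in conc]
kd_fit = fit_kd_app(conc, obs)
print(f"fitted K_D,app = {kd_fit:.3f} nM")
```

Searching in log space keeps the resolution comparable across the several orders of magnitude that a dissociation constant may span.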
Assessment of scFv thermostability. The thermostability of scFv C1 was determined by recording the tryptophan fluorescence signal while heating the protein sample. Measurements were done in a JASCO-8500 fluorimeter equipped with a Peltier heating unit. A protein concentration of 2 μM was used. The sample was placed in a 200 μL cuvette, excited at 295 nm and stirred with a magnetic bead at 200 rpm during the entire measurement. The emission at 350 nm was recorded with the photomultiplier set to high sensitivity. The sample was exposed to a temperature gradient from 25 °C to 90 °C at ΔT = 1 °C/min. Melting point determination was done in the presence and absence of 20 μM NeuGc GM3 trisaccharide 5. The ratio of the tryptophan fluorescence intensities at 350 nm and 330 nm (F350/F330) was calculated. This ratio reports on the shift from folded to denatured protein, in which the tryptophan residues become solvent exposed. A Boltzmann sigmoidal function was fitted to the data in order to determine the melting temperature T m at the inflection point using GraphPad Prism 7.
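The melting-temperature determination can be illustrated with a Boltzmann sigmoid, y = bottom + (top − bottom) / (1 + exp((T m − T) / slope)). The sketch below does not reproduce the nonlinear fit performed in GraphPad Prism. It swaps in a simpler estimate: the plateaus are taken from the ends of the scan, and a straight line is fitted to the logit-transformed transition region. All names are hypothetical and the data are synthetic, generated with T m = 71.4 °C.

```python
import math


def boltzmann(t, bottom, top, tm, slope):
    """Boltzmann sigmoid; tm is the inflection point."""
    return bottom + (top - bottom) / (1.0 + math.exp((tm - t) / slope))


def estimate_tm(temps, ratios, n_plateau=5):
    """Estimate (tm, slope) by linear regression on logit-transformed data.

    Plateaus are approximated by the mean of the first and last
    n_plateau points; only points between 10% and 90% of the
    transition enter the regression.
    """
    bottom = sum(ratios[:n_plateau]) / n_plateau
    top = sum(ratios[-n_plateau:]) / n_plateau
    xs, ys = [], []
    for t, r in zip(temps, ratios):
        frac = (r - bottom) / (top - bottom)
        if 0.1 < frac < 0.9:
            xs.append(t)
            ys.append(math.log(frac / (1.0 - frac)))  # equals (t - tm) / slope
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    grad = sxy / sxx
    intercept = my - grad * mx
    return -intercept / grad, 1.0 / grad


# Synthetic F350/F330 scan from 25 to 90 °C in 1 °C steps.
temps = [float(t) for t in range(25, 91)]
ratios = [boltzmann(t, 0.8, 1.2, 71.4, 2.0) for t in temps]
tm_est, slope_est = estimate_tm(temps, ratios)
print(f"estimated Tm = {tm_est:.1f} °C")
```

For noisy experimental scans, a full four-parameter nonlinear fit is generally preferable, since the logit transform amplifies noise near the plateaus.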
Crystallisation. Purified scFv was concentrated to 16 mg/mL and crystallised using the Structure screen
Data collection and structure determination. Diffraction data to 2.2 Å resolution were collected from a single crystal at the ID29 micro focus beam line at the European Synchrotron Radiation Facility (ESRF) Grenoble, France. The data were processed with the EDNA auto-processing procedure provided at the beam line 57 . The space group was determined to be P2 1 and the data cut-off chosen based on the correlation coefficient (CC 1/2 ) statistics in accordance with ref. 58 . We decided to use a conservative cut-off in CC 1/2 of 0.80 after evaluation of map quality and refinement R-factors. The most probable number of molecules in the asymmetric unit was determined to be four from the Matthews coefficient 59 . The data were phased by molecular replacement using a homology model of scFv C1 based on PDB entry 3UMT, another scFv with 65% sequence identity to C1. Molecular replacement was done using Phaser 60 from the PHENIX suite 61 . The homology model was prepared by SWISS-MODEL 62 . Based on the phased data, an initial experimental model was built using phenix.autobuild. The final model was obtained after several cycles of refinement (phenix.refine) and manual model building with Coot 63 . The model was numbered according to the Kabat system 64 . Data collection and refinement statistics are summarised in Table 1.
Table 1. X-ray crystallographic data collection and refinement statistics. *Data collected on a single crystal. **Values in parentheses are for highest-resolution shell.
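The Matthews-coefficient estimate of the number of molecules per asymmetric unit, mentioned in the structure determination paragraph, can be sketched as V M = V cell / (Z · n · MW), with an approximate solvent fraction of 1 − 1.23 / V M . The unit-cell parameters below are hypothetical placeholders, since the actual values are not given in this text. The molecular weight of 28 kDa is taken from the Discussion, and Z = 2 is the number of asymmetric units per cell in space group P2 1 . The function names are illustrative only.

```python
import math


def monoclinic_volume(a, b, c, beta_deg):
    """Unit-cell volume (Å^3) for a monoclinic cell."""
    return a * b * c * math.sin(math.radians(beta_deg))


def matthews(volume_A3, mw_da, n_mol, z=2):
    """Matthews coefficient (Å^3/Da) and approximate solvent fraction.

    z is the number of asymmetric units per unit cell (2 for P2_1),
    n_mol the number of molecules per asymmetric unit.
    """
    vm = volume_A3 / (z * n_mol * mw_da)
    solvent = 1.0 - 1.23 / vm
    return vm, solvent


# Hypothetical cell parameters, NOT the values from the paper.
volume = monoclinic_volume(60.0, 100.0, 90.0, 95.0)

for n in range(1, 7):
    vm, solvent = matthews(volume, 28000.0, n)
    print(f"n = {n}: V_M = {vm:.2f} Å^3/Da, solvent = {solvent:.0%}")
```

With these placeholder values, four molecules per asymmetric unit give a V M of about 2.4 Å^3/Da, within the range typically observed for protein crystals, which illustrates how such a scan singles out the most probable copy number.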