The repeat region of cortactin is intrinsically disordered in solution

The multi-domain protein cortactin contains a 37-residue repeating motif that binds to actin filaments. The cortactin repeat region comprises 6½ similar copies of this motif. To better understand this region of cortactin, and its fold, we conducted extensive biophysical analysis. Size exclusion chromatography with multi-angle light scattering (SEC-MALS) reveals that neither the cortactin repeats alone nor the repeats together with the adjacent helical region homo-oligomerize. Using circular dichroism (CD) we find that in solution the cortactin repeats resemble a coil-like intrinsically disordered protein. Small-angle X-ray scattering (SAXS) also indicates that the cortactin repeats are intrinsically unfolded, and the experimentally observed radius of gyration (Rg) coincides with that calculated by the program Flexible-Meccano for an unfolded peptide of this length. Finally, hydrogen-deuterium exchange mass spectrometry (HDX-MS) indicates that the domain contains limited hydrophobic core regions. These experiments therefore provide evidence that in solution the repeat region of cortactin is intrinsically disordered.


Results
The cortactin repeat and helical domains are monomeric. Cortactin must bind to F-actin to regulate actin polymerization and branching; however, a molecular understanding of the cortactin-actin interaction, and of potential conformational transitions in cortactin that allow the interaction, is lacking. The region of cortactin that is both necessary and sufficient to bind actin 18 is termed the cortactin repeat (cortactinCR) domain. This domain contains 6½ highly similar repeats (Fig. 1A,B), and is followed by a C-terminal helical, or coiled coil, domain (Fig. 1A). Because coiled-coil domains sometimes homo-oligomerize, we wished to probe the oligomerization state of this region of cortactin. We therefore conducted size exclusion chromatography with multi-angle light scattering (SEC-MALS) for two constructs: the cortactin repeat domain (cortactinCR), and the cortactin repeat and helical domains (cortactinCRH) (Fig. 1C). We found that both constructs elute as a monomeric peak. Analysis of the SEC-MALS data indicated a molecular mass of approximately 43.9 kDa (±1.2%) for cortactinCRH and 29.9 kDa (±1.5%) for cortactinCR. The expected molecular weights for monomeric forms of these proteins (including N-terminal vector derived residues GPLGS) are 36.5 kDa and 27.4 kDa, respectively.

Circular dichroism suggests a coil-like intrinsically disordered structure for the cortactin repeats. The fold of the cortactin repeat domain remains controversial. Divergent results from studies of this protein suggest that it is either natively unfolded 21, or a folded protein whose secondary structure does not change upon binding to F-actin 19,20. Published circular dichroism results for different cortactin constructs came to divergent conclusions regarding the structure of the cortactin repeats 19,21. To address this controversy, we first conducted circular dichroism experiments.
We used circular dichroism to probe the secondary structure of cortactinCR, and compared these analyses to a well-folded control protein (CCM3) (Fig. 2). The control protein, CCM3, showed extensive secondary structure at 4 °C, which was lost on heating to 90 °C (Fig. 2C). In contrast, CD spectra of cortactinCR had a minimum at 202 nm, indicating the presence of mostly random coil, consistent with a natively unfolded protein. We observe a slight red shift (2 nm) and a decrease in negative signal, indicating minor structural changes on heating from 4 °C to 90 °C (Fig. 2A). Our control protein, CCM3, shows denaturation at 66.1 °C (Fig. 2D); however, no such inflection point is observed for cortactinCR (Fig. 2B). CD spectra can be deconvoluted to describe the secondary structure content of the protein of interest. We used the server BeStSel 22 to analyze the data from 200-250 nm for the 4 °C samples of cortactinCR and CCM3. For cortactinCR the deconvolution suggested a composition of 0% helix, 14% sheet, 13% turn and 73% other/irregular. For our control protein, CCM3, the deconvolution suggested a composition of 85% helix, 14% sheet, 1% turn and 0% other/irregular, which compared favorably with crystal structures (PDB deposited structure 3L8J has a composition of 73% helix, 0% sheet, 5% turn and 22% other/irregular).

Figure 1. (C) SEC-MALS for cortactinCRH (green) and cortactinCR (blue). Predicted molecular masses for monomeric proteins including N-terminal vector derived residues are 36.5 kDa and 27.4 kDa for cortactinCRH and cortactinCR, respectively. SEC-MALS observed experimental molecular masses are 43.9 (±1.2%) kDa and 29.9 (±1.5%) kDa for cortactinCRH and cortactinCR, respectively.
Overall, the CD data show that cortactinCR lacks significant secondary structure in solution. Furthermore, plotting

Small-angle X-ray scattering finds the cortactin repeats to be intrinsically disordered. The solution scattering properties of intrinsically disordered proteins are very different from those of folded proteins. This difference allows small-angle X-ray scattering (SAXS), a technique orthogonal to circular dichroism, to provide clear evidence of intrinsic disorder 24-28. Therefore, we conducted SAXS for cortactinCRH (Fig. 3A). We found no aggregation, and the molecular weight estimated from the Porod volume corresponded well with that expected for a monomeric protein (Table 1). However, Guinier approximations indicated a radius of gyration (Rg) for this 324 amino acid protein of ~47.5 Å (Fig. 3B, Table 1), significantly larger than would be expected for a globular protein (~20 Å) 29. Analysis of the scattering properties of cortactinCRH shows that it displays other features expected for intrinsically disordered proteins: its Kratky plot displays a monotonic increase characteristic of intrinsic disorder 24 (Fig. 3C), and its Porod-Debye plot does not plateau as would be expected for a globular protein 30 (Fig. 3D). We next calculated distance distribution functions P(r) and found an extended Dmax of ~180 Å (Fig. 4A and Table 1). The molecular envelopes that we calculated for cortactinCRH were consistently elongated and conformationally diverse (Fig. 4B). Finally, to predict the overall shape and size of cortactinCRH we used the program Flexible-Meccano 31. This is a well-validated technique that models intrinsically disordered protein structure as a random coil based on the amino acid sequence 24,31,32.
We find that the experimentally observed Rg of cortactinCRH (~47.5 Å) coincides with the peak of the distribution (47.5 Å, 1821 occurrences) of Rg values for 100,000 predicted models of an unfolded 324 amino acid peptide chain (Fig. 4C). The SAXS analysis therefore finds that cortactinCRH is intrinsically disordered as a random coil.

Because it depends on the solvent accessibility and hydrogen bonding of the amide hydrogen, H/D exchange is a good analytical probe of protein conformational dynamics and interactions 33-36. We conducted an HDX-MS time-course study by incubating cortactinCR with D2O for H/D exchange periods of 0, 0.5, 1, 2, 4, 8, 15, 30, 60, 120, and 240 min. For each proteolytic peptide, the percentage of D-uptake (i.e., the number of deuteriums divided by the number of amide hydrogens, not counting prolines) after each incubation period was color-coded to produce a heat map. Examination of the cortactinCR data reveals solvent exposure that agrees with the CD and SAXS experiments described above. We find that most regions of cortactinCR rapidly reached HDX saturation by the first time-point (Fig. 5), indicating that cortactinCR is largely unprotected, contains a minimal hydrophobic core, and is largely intrinsically disordered.
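The percentage D-uptake described above, and the per-residue averaging over overlapping peptides described in the Methods, can be sketched as follows. This is a minimal illustration rather than the analysis pipeline used for Fig. 5: the function names are hypothetical, and the choice to skip the N-terminal residue of each peptide when counting exchangeable amides is an assumption not stated in the text.

```python
from typing import List, Optional, Tuple


def exchangeable_amides(peptide: str) -> int:
    """Count backbone amide hydrogens available for exchange.

    Assumptions: the N-terminal residue is skipped (it has no peptide-bond
    amide) and prolines are skipped (they carry no amide hydrogen).
    """
    return sum(1 for residue in peptide[1:] if residue != "P")


def percent_uptake(deuteriums: float, peptide: str) -> float:
    """Percentage D-uptake: deuteriums taken up / exchangeable amides."""
    sites = exchangeable_amides(peptide)
    if sites == 0:
        raise ValueError("peptide has no exchangeable amide hydrogens")
    return 100.0 * deuteriums / sites


def residue_heat_map(
    length: int, peptides: List[Tuple[int, int, float]]
) -> List[Optional[float]]:
    """Average peptide-level deuteration over every peptide covering a residue.

    Each peptide is (start, end, level) with a 0-based start and an exclusive
    end; residues covered by no peptide are reported as None.
    """
    totals = [0.0] * length
    counts = [0] * length
    for start, end, level in peptides:
        for i in range(start, end):
            totals[i] += level
            counts[i] += 1
    return [t / c if c else None for t, c in zip(totals, counts)]
```

For example, `percent_uptake(1.5, "GPLGS")` divides 1.5 deuteriums by the three exchangeable amides of that peptide, and `residue_heat_map(5, [(0, 3, 0.8), (2, 5, 0.6)])` averages the two peptide levels only at the single residue where the peptides overlap.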

Discussion
Cortactin contains 6½ cortactin repeats that form what is termed the 'cortactin repeat domain' (Fig. 1). Whether and how the cortactin repeat domain folds in solution has been controversial, and the literature supports two possibilities: either an extended or natively unfolded domain 2,21, or a folded domain 19,20. To resolve the question of whether the cortactin repeats are folded in solution we conducted studies based on the orthogonal biophysical techniques of circular dichroism, small-angle X-ray scattering, and hydrogen-deuterium exchange mass spectrometry. Our studies clearly demonstrate the intrinsically disordered nature of the repeat region of cortactin. We began by probing the overall oligomerization state of the repeat region of cortactin. We were particularly interested in what is termed either the 'coiled coil' or 'helical' region C-terminal to the cortactin repeats. Coiled coil domains are mediators of homotypic or heteromeric protein-protein oligomerization (e.g., 37); the presence of a region with coiled coil properties would raise the question of whether or not cortactin can oligomerize through this domain. Our SEC-MALS analysis of purified constructs of cortactinCR and cortactinCRH convincingly shows that there is no oligomerization (Fig. 1), so we propose that this region of cortactin be exclusively referred to as the 'helical region'.

Table 1. Small-angle X-ray scattering data collection and structural statistics.

Figure 5. Hydrogen-deuterium exchange mass spectrometry for cortactinCR. Percentage of deuterium uptake is indicated for HDX incubation periods ranging from 30 s to 240 min. Minimal changes in deuterium uptake are observed over the time course, suggesting a minimal hydrophobic core for cortactinCR, and that the protein is largely unprotected and in an unfolded state. Alternating orange and black sequences indicate cortactin repeats.
We next conducted circular dichroism for the cortactin repeat region of the protein (Fig. 2). This analysis clearly showed a lack of secondary structure, and no denaturation transition on heating to 90 °C. This behavior is typical of intrinsically disordered proteins. Furthermore, our analysis suggests that circular dichroism can classify cortactin as a coil-like unfolded protein rather than a pre-molten globule protein 23. Pre-molten globule-like proteins contain more ordered secondary structure than the coil-like group 23, indicating that any conformational change of the cortactin repeats to a folded domain would be extensive, and perhaps suggesting that the cortactin repeats may not form a folded domain on binding to actin filaments.
Small-angle X-ray scattering is an orthogonal technique that can demonstrate intrinsic disorder, and for the cortactinCRH construct it demonstrates a clear unfolded profile in the Kratky plot (Fig. 3C), which for folded proteins tends towards a bell-shaped distribution with a well-defined maximum 24. Furthermore, the P(r) curves demonstrate an extended molecule (Fig. 4A), and our observation of Rg at ~47.5 Å matches extremely well with the Rg predicted for an intrinsically disordered protein of 324 amino acids in length (Fig. 4C). The extended nature of the cortactinCRH construct in solution (Dmax at ~180 Å) also correlates well with previous studies based on deep etch electron microscopy and analytical ultracentrifugation that found the full-length protein to be an extended molecule of between 220 Å and 290 Å in length 2. When taken together, the CD, SAXS, and HDX-MS analyses indicate that the cortactin repeats are intrinsically disordered in solution.
Overall, we have shown that the cortactin repeat domain is intrinsically disordered in solution; however, the molecular basis of how cortactin binds to actin remains unknown. Low-resolution negative stain EM found that the repeats do not interact with actin filaments in an extended fashion 38, and crosslinking suggested that the cortactin repeats may change conformation upon binding to actin 21. Some mapping also suggests that the fourth repeat may be required for the interaction with actin 39,40. These important aspects of cortactin's function remain poorly understood. We therefore propose that future directions in the study of cortactin structure should focus on understanding the molecular basis for cortactin-actin binding.

Methods
Protein expression and purification. Three fragments of mouse cortactin (Uniprot: Q60598) comprising residues Gly83-Phe324 or Gly83-Thr401 of cortactin were subcloned into the pGEX6p-1 expression vector (GE), with an N-terminal glutathione S-transferase (GST) affinity tag followed by a PreScission protease site. These were transformed into Escherichia coli strain Rosetta(DE3) (Novagen) for expression. Production of the targeted proteins was induced by 0.2 mM isopropyl 1-thio-β-D-galactopyranoside (IPTG) at 16 °C overnight. Cells were harvested and lysed in 1x PBS buffer supplemented with protease inhibitors (Roche), and clarified supernatant was loaded onto glutathione-Sepharose 4B beads (GE) or Ni-NTA beads (GE). GST-cortactin was then digested with PreScission protease on-column overnight at 4 °C. The cleaved target protein was applied to a Resource S column (GE) in a buffer of 20 mM MES pH 6, 5% glycerol, 1 mM DTT, and eluted with an NaCl gradient from 10 mM to 500 mM. The elution peak was loaded onto a Superdex 200 Increase (GE) column. Each construct resulted in a single peak of cortactin protein. The final purified fragments of cortactin each contain N-terminal vector derived residues GPLGS followed by cortactin. The constructs are termed cortactinCR (residues Gly83-Phe324) and cortactinCRH (Gly83-Thr401).

Size exclusion chromatography with multi-angle light scattering (SEC-MALS). The purified proteins, cortactinCR and cortactinCRH, were analyzed by SEC-MALS by use of an in-line HPLC (Agilent Technologies 1260 Infinity) and MALS system (Wyatt DAWN HELEOS II and OPTILAB T-rEX). Each SEC purified protein was loaded onto a WTC-300 silica-based column (Wyatt) in 1x PBS buffer supplemented with 0.02% sodium azide. For each run, a 100 µL sample at 0.6 mg/mL for cortactinCRH or 1.5 mg/mL for cortactinCR was injected, and the flow rate was 0.4 mL/min with a total 120 min profile. Astra chromatography software (Wyatt) was used for collecting and analyzing data.
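The monomer assignment reported in the Results can be illustrated by comparing each observed SEC-MALS mass with the predicted monomer mass. The sketch below is a simple nearest-integer comparison under that assumption, not the analysis performed in the Astra software; the function name `oligomer_estimate` is hypothetical, and the masses are the values quoted in the text.

```python
def oligomer_estimate(observed_kda: float, monomer_kda: float) -> tuple[int, float]:
    """Return (nearest whole-number oligomer state, observed/expected mass ratio)."""
    ratio = observed_kda / monomer_kda
    # Assumption: the closest integer multiple of the monomer mass is taken
    # as the oligomeric state, with a floor of 1 (monomer).
    return max(1, round(ratio)), ratio


# Observed SEC-MALS masses and predicted monomer masses (kDa) from the Results.
constructs = {
    "cortactinCRH": (43.9, 36.5),
    "cortactinCR": (29.9, 27.4),
}

for name, (observed, expected) in constructs.items():
    state, ratio = oligomer_estimate(observed, expected)
    print(f"{name}: observed/expected = {ratio:.2f}, nearest state = {state}-mer")
```

Both ratios (about 1.20 and 1.09) are much closer to 1 than to 2, which is consistent with the monomeric peaks described for both constructs.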

Circular dichroism (CD). Purified cortactinCR was SEC purified in a buffer of 1x PBS supplemented with 5% glycerol. CD spectra were collected for cortactinCR at a concentration of 10 µM by use of a Chirascan CD spectrometer (Applied Photophysics). Constant temperature spectra were collected at 4 °C and at 90 °C, and averages of 20 spectra were calculated for each temperature. The control protein, CCM3, was purified as previously described 41, and CD spectra were collected with the same CD protocol for purified CCM3 at a concentration of 12.5 µM. For stepped temperature ramp CD experiments a temperature range of 5 °C to 90 °C was analyzed, and the spectra were repeated 3 times to average the data. The temperature-ramp experiments were conducted at 202 nm for cortactinCR and 209 nm for CCM3, the respective minima for their constant temperature spectra at 4 °C.
Small angle X-ray scattering (SAXS). CortactinCRH was dialyzed against 20 mM Tris pH 8, 300 mM NaCl, 1 mM TCEP at final concentrations of 0.4 mg/mL and 1.1 mg/mL. X-ray scattering was conducted at the LiX beamline at the National Synchrotron Light Source II (NSLS-II) and data were collected with a Pilatus 1M detector. Five individual 5-second exposures were collected for each concentration and for a buffer blank. Data integration, averaging, and buffer subtraction were conducted by use of pyXS 42. Following inspection of each exposure with Primus 43, radiation-damaged exposures were excluded. Exposures were merged by use of pyXS, and Guinier analysis was performed with Primus to calculate the radius of gyration (Rg). Pair distribution functions P(r) and forward scattering I(0) were calculated with GNOM 44, and molecular weights were estimated separately from Porod volumes calculated in Primus and from excluded bead volumes of ab initio models from DAMMIF 45. Dimensionless Kratky plots of $(qR_g)^2\,I(q)/I(0)$ vs. $qR_g$ were generated as described 30,46. Porod-Debye plots of $q^4 I(q)$ vs. $q^4$ were generated as described 30. The amino acid sequence of cortactinCRH was used to generate 100,000 conformers of this 324 amino acid construct as a random coil type intrinsically disordered protein by use of the program Flexible-Meccano 31, run with default options. The expected Rg for a folded protein is calculated from the formula $R_g = 0.395\,N^{3/5} + 7.257$, in which N is the number of residues (324 for cortactinCRH) 29.
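The folded-protein Rg estimate and the dimensionless Kratky transform described above can be sketched in a few lines. This is an illustrative sketch only: the function names are hypothetical, and the Guinier curve, I(q) = I(0)·exp(-(q·Rg)²/3), is used purely as synthetic stand-in data for a compact particle, not as measured scattering.

```python
import math
from typing import List, Sequence, Tuple


def globular_rg(n_residues: int) -> float:
    """Expected Rg (in Å) of a folded protein, from the formula quoted above."""
    return 0.395 * n_residues ** (3 / 5) + 7.257


def dimensionless_kratky(
    q: Sequence[float], intensity: Sequence[float], rg: float, i0: float
) -> Tuple[List[float], List[float]]:
    """Return x = q*Rg and y = (q*Rg)^2 * I(q)/I(0) for each data point."""
    xs = [qi * rg for qi in q]
    ys = [x ** 2 * (ii / i0) for x, ii in zip(xs, intensity)]
    return xs, ys


def guinier_intensity(q: float, rg: float, i0: float = 1.0) -> float:
    """Synthetic Guinier-type intensity, used here only as example input."""
    return i0 * math.exp(-((q * rg) ** 2) / 3)


if __name__ == "__main__":
    # Folded-protein expectation for the 324-residue cortactinCRH construct.
    print(f"Expected globular Rg for N = 324: {globular_rg(324):.1f} Å")

    # Dimensionless Kratky curve for a synthetic compact particle.
    rg = 20.0
    q = [0.005 * k for k in range(1, 60)]
    xs, ys = dimensionless_kratky(q, [guinier_intensity(v, rg) for v in q], rg, 1.0)
    peak_x = xs[ys.index(max(ys))]
    print(f"Synthetic compact particle peaks near qRg = {peak_x:.2f}")
```

For N = 324 the formula gives roughly 20 Å, the folded-protein value against which the measured ~47.5 Å is compared. A Guinier-type compact particle peaks at qRg = √3 with height 3/e in this representation, whereas the Results describe a monotonic increase for cortactinCRH.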

Hydrogen-deuterium exchange mass spectrometry (HDX-MS). CortactinCR was analyzed by HDX-MS at the National High Magnetic Field Laboratory (NHMFL) by use of on-line LC-ESI FT-ICR methods 47. After proteolysis, separation and desalting of the cortactinCR peptides (with and without LmnA) were performed over a Pro-Zap Expedite MS C18 column (1.5 μm particle size, 500 Å pore size, 2.1 × 10 mm²; Grace Davison, Deerfield, IL), with a Jasco high performance liquid chromatography/supercritical fluid chromatography (HPLC/SFC) system triggered by the HTC PAL autosampler (Eksigent Technologies). Peptides were eluted over a 2 min gradient from 2 to 95% Solvent B (Solvent A: acetonitrile/H2O/formic acid (4.5:95:0.5); Solvent B: acetonitrile/H2O/formic acid (95:4.5:0.5)). After ionization by ESI at 3.8 kV, the sample was directed into a custom-built hybrid Velos Pro 14.5 T FT-ICR mass spectrometer (Thermo Fisher, San Jose, CA) 48. Approximately 350 mass spectra were collected from m/z 210-1300 over a period of 6.5 min, at high mass resolving power (m/Δm50% = 200,000 at m/z 400, in which Δm50% is the magnitude spectral peak full width at half-maximum peak height).
After the deuterium uptake profile was analyzed for each of the peptides, a deuterium uptake "heat map" was drawn as a visual representation of the localized deuteration rate for cortactinCR, to confirm and complement structural information from the other experiments. The "heat map" summarizes deuterium uptake information for all peptides from cortactinCR. Briefly, the deuteration level of each peptide is calculated by dividing the observed deuterium uptake by the maximum possible deuterium uptake for that peptide, and the deuterium uptake of each residue is calculated by averaging the deuteration levels of all overlapping peptides that contain it. Although deuterium uptake could vary across a peptide, so that this calculation does not represent the exact extent of deuteration for each residue, this approach incorporates all available information from all overlapping peptides without introducing bias by manually selecting which peptide to display in the "heat map".

Data availability statement. Data and constructs will be made available upon reasonable request.