Simultaneous Analysis of Secondary Structure and Light Scattering from Circular Dichroism Titrations: Application to Vectofusin-1

Circular dichroism (CD) data are often decomposed into their constituent spectra to quantify the secondary structure of peptides or proteins, but the estimation of the secondary structure content fails when light scattering distorts the spectra. If peptide-induced liposome self-association occurs, subtracting control curves cannot correct for this. We show that if the cause of the light scattering is independent of the peptide structural changes, the CD spectra can be corrected using principal component analysis (PCA). The light scattering itself is analysed and found to be in good agreement with backscattering experiments. This method therefore allows structural changes related to peptide-liposome binding and peptide-induced liposome self-association to be followed simultaneously. We apply this method to study the structural changes and liposome binding of vectofusin-1, a transduction-enhancing peptide used in lentivirus-based gene therapy. Vectofusin-1 binds to POPC/POPS liposomes, causing a reversal of the negative liposome charge at high peptide concentrations. When the peptide charges exactly neutralise the lipid charges on both leaflets, reversible liposome self-association occurs. These results are in good agreement with biological observations and provide further insight into the conditions required for efficient transduction enhancement.

PCA scores of first and second component. Each lipid/peptide ratio is shown inside a circle with a colour corresponding to the spectra presented in the main paper. The values on the PC1 and PC2 axes represent the amount that each of these components is present in the sample. The independence of the two components is clearly visible by the nearly orthogonal scores of L/P ratio 0-16 on the one hand and 32-48 on the other hand. The reversibility of the proteoliposome aggregation is observed as clustering of the low and high L/P ratios. The confidence region drawn in white represents the 95% confidence region based on a beta distribution.

The first curve (labelled 0) is peptide alone and cannot be reliably analysed due to a low amount of scattered light. The presence of larger particles at L/P ratios of 32, 37 and 42 is clearly visible as the appearance of a "shoulder" with a larger correlation time. The return to smaller particle sizes is visible by the overlapping curves at all other L/P ratios tested.

SVD-based PCA Decomposition of Titration Data.
The method and equations described below can be used to carry out principal component analysis [1,2,3] of any dataset. The specific application described in this paper, the separation of light scattering from structural changes in CD spectroscopy, is described in the final step, with additional details given in the rest of the paper.
• Record CD spectra [4,5] and subtract baselines as usual. Do not apply any other data processing.
• Organise the spectra in an $(n \times m)$ data matrix $X$ with elements $x_{ij}$, where the $n$ rows are the spectra (indexed by $i$) and the $m$ columns are the wavelengths (indexed by $j$).
• Mean centre the data by subtracting the mean from each column: $x_{ij} \leftarrow x_{ij} - \bar{x}_j$
• Autoscaling: divide each column by its standard deviation $s_j$ [6].
• Compute the singular value decomposition (SVD) of the centred and scaled data matrix using the built-in SVD function of scientific software. The SVD returns three matrices: $U$, $S$, and $V^T$. Matlab, Octave, R, and Python/NumPy have built-in functions to compute the SVD. Example code is given in the supplementary information. Alternatively, you can use the built-in PCA function directly if there is one, taking care of the following: (1) some software uses different terminology (eigenvectors are sometimes called loadings) or scaling (scores and loadings may or may not be standardised), and (2) the PCA method should use a stable algorithm such as NIPALS [1] or SVD instead of eigenvalue decomposition.
• Compute the (standardised) scores $T$ and loadings $P$.
• Compute the backscaled loadings [7]: backscaled loadings $= P_{ij} \cdot s_j$
• Analyse the scores (columns of $T$) and loadings (columns of $P$) as desired. See the main text of this paper for an example.
• To correct the data for light scattering, identify and remove the component k representing the light scattering.
reconstructed data $= T_{k^*} \cdot P_{k^*}^T$, where the subscript $k^*$ indicates that column $k$ has been removed.
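
The steps above can be sketched in Python/NumPy as follows. This is a minimal illustration and not the authors' supplementary code. The function names `pca_svd` and `remove_component` are hypothetical. The scaling convention (unit-variance scores $T = U\sqrt{n-1}$, with the singular values absorbed into the loadings $P = VS/\sqrt{n-1}$) is an assumption about what "standardised" means here, as is the choice to undo the autoscaling and mean centring after removing a component. The random matrix only stands in for measured, baseline-subtracted CD spectra.

```python
import numpy as np

def pca_svd(X):
    """PCA of an (n x m) matrix X: rows are spectra, columns are wavelengths."""
    n = X.shape[0]
    mean = X.mean(axis=0)            # column means, used for mean centring
    s = X.std(axis=0, ddof=1)        # column standard deviations (must be non-zero)
    Z = (X - mean) / s               # mean-centred, autoscaled data
    U, sv, Vt = np.linalg.svd(Z, full_matrices=False)
    # Assumed convention: scores with unit variance, singular values in the loadings.
    T = U * np.sqrt(n - 1)
    P = Vt.T * sv / np.sqrt(n - 1)
    # Backscaled loadings: multiply each wavelength (row of P) by its standard deviation.
    backscaled = P * s[:, None]
    return T, P, backscaled, mean, s

def remove_component(T, P, mean, s, k):
    """Rebuild the data without component k (0-based column index)."""
    keep = [c for c in range(T.shape[1]) if c != k]
    Z_corr = T[:, keep] @ P[:, keep].T
    # Assumption: return to the original units by undoing scaling and centring.
    return Z_corr * s + mean

# Synthetic stand-in for 12 spectra recorded at 40 wavelengths.
rng = np.random.default_rng(0)
X = rng.normal(size=(12, 40))

T, P, backscaled, mean, s = pca_svd(X)
X_corr = remove_component(T, P, mean, s, k=1)  # here component 1 is removed for illustration
```

With this convention, `T @ P.T` reproduces the centred and scaled data exactly, so dropping one column from both matrices removes only the contribution of that component. Which component represents light scattering has to be identified from the scores and loadings of the actual data.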

Source code for PCA
Note that all code for error handling and output was removed for clarity. Also note that we have chosen not to use built-in PCA methods in order to (1) illustrate the method and (2) be explicit about the data pre-processing and the definition of scores, loadings, and backscaled loadings. The Python source code contains comments; these are omitted from the equivalent code provided for R and Octave.