Gamma estimator of Jarzynski equality for recovering binding energies from noisy dynamic data sets

A fundamental problem in thermodynamics is the recovery of macroscopic equilibrium interaction energies from experimentally measured single-molecule interactions. The Jarzynski equality provides a theoretical basis for recovering the free energy difference between two states from the exponentially averaged work performed to switch between the states. In practice, the exponentially averaged work is estimated as the mean over a finite sample. Numerical simulations have shown that samples of thousands of measurements are not large enough for this mean to converge when the fluctuation of the external work exceeds 4 kBT, a level readily observed in biomolecular interactions. We report the first example of a statistical gamma work distribution applied to single-molecule pulling experiments. With this approach, the Gibbs free energy of surface adsorption can be evaluated accurately even for a small sample size. The values obtained are comparable to those derived from multi-parametric surface plasmon resonance measurements and molecular dynamics simulations.
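As a minimal sketch of the two estimators contrasted above (not the authors' implementation), the following Python code compares the direct sample-mean form of the Jarzynski equality with a closed form obtained under a gamma work distribution. It assumes work values are positive and expressed in the same units as kBT. To stay within the standard library, the gamma shape k and scale θ are fitted by the method of moments rather than by maximum likelihood, and the function names are hypothetical. For W ~ Gamma(k, θ), the exponential average is ⟨exp(−W/kBT)⟩ = (1 + θ/kBT)^(−k), which gives ΔG = k·kBT·ln(1 + θ/kBT).

```python
import math
import random
import statistics


def jarzynski_sample_mean(work, kT=1.0):
    """Direct estimator: dG = -kT * ln(mean(exp(-W / kT)))."""
    # Shift by the smallest work value so every exponent is <= 0
    # (log-sum-exp trick), which avoids overflow for large |W| / kT.
    w_min = min(work)
    total = sum(math.exp(-(w - w_min) / kT) for w in work)
    return w_min - kT * math.log(total / len(work))


def jarzynski_gamma(work, kT=1.0):
    """Closed form assuming W ~ Gamma(shape k, scale theta).

    Assumption: k and theta come from the method of moments
    (mean = k * theta, variance = k * theta**2), not from MLE.
    Requires at least two distinct, positive work values.
    """
    mean = statistics.fmean(work)
    var = statistics.variance(work)
    theta = var / mean
    k = mean / theta
    # <exp(-W/kT)> = (1 + theta/kT)**(-k)  =>  dG = kT * k * ln(1 + theta/kT)
    return kT * k * math.log1p(theta / kT)


if __name__ == "__main__":
    # Synthetic small sample (illustrative parameters, in units of kT).
    rng = random.Random(0)
    sample = [rng.gammavariate(4.0, 2.0) for _ in range(50)]
    print("sample-mean estimate:", jarzynski_sample_mean(sample))
    print("gamma estimate:      ", jarzynski_gamma(sample))
    print("exact for k=4, theta=2:", 4.0 * math.log(3.0))
```

The sample-mean estimator is dominated by the rare low-work trajectories in the left tail, whereas the gamma form depends only on the fitted parameters, which is why it can remain usable at small sample sizes if the gamma assumption holds.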


Supplementary Figure. Peptide P1 interaction with graphene sheets. (a) Retraction trajectories acquired from single-molecule pulling experiments. (b) Trajectories selected for the free energy calculation. (c) Work distribution fitted with a gamma distribution. (d) Empirical cumulative distribution function compared with the theoretical gamma CDF, showing that the fit is good.

Figure 5. Computational experiments to test the effect of the linker. (a) The simulated system. (b) End-point conformation after the first pulling experiment. (c) End-point conformation after the second pulling experiment. (d) The most likely adsorbed conformation. (e) Histogram of the distance of the alpha carbons of tyrosine and cysteine to the surface.

Supplementary Note 1: Limitations in deriving a distribution model from histogram
Assuming a parametric distribution model such as the normal or gamma distribution, the maximum likelihood method is consistent and efficient in estimating the parameters that determine the density function. However, if a suitable parametric distribution family is not known, a simple way to learn a distribution model nonparametrically, directly from the data, is to make a histogram. A histogram gives a piecewise constant estimate of the density function, since the probability density within each bin is constant. It has been proved that the asymptotic integrated mean squared error between a true density function $f(x)$ and the piecewise constant density estimate $\hat{f}(x)$ obtained from a histogram is
$$\mathrm{AMISE}(h) = \frac{1}{nh} + \frac{h^{2}}{12}\int f'(x)^{2}\,dx,$$
where $n$ is the number of data points, $h$ is the bin size, and $f'(x)$ is the first derivative of the true density function. Differentiating with respect to $h$ and setting the derivative equal to zero, we obtain the optimal $h$ as
$$h^{*} = \left(\frac{6}{n\int f'(x)^{2}\,dx}\right)^{1/3}.$$
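To illustrate why the optimal bin size cannot be obtained without knowing the density, the short Python sketch below evaluates the standard asymptotic error 1/(nh) + (h²/12)·R, with R = ∫ f'(x)² dx, and its minimizer h* = (6/(nR))^(1/3). It assumes, purely for illustration, that the true density is normal with standard deviation σ, for which R = 1/(4√π σ³); the function names are hypothetical.

```python
import math


def roughness_normal(sigma):
    """R = integral of f'(x)^2 for a normal density with std sigma."""
    return 1.0 / (4.0 * math.sqrt(math.pi) * sigma ** 3)


def amise(h, n, roughness):
    """Asymptotic integrated MSE of a histogram with bin size h."""
    return 1.0 / (n * h) + h * h * roughness / 12.0


def optimal_bin_width(n, roughness):
    """Minimizer of amise(h): h* = (6 / (n * R))**(1/3)."""
    return (6.0 / (n * roughness)) ** (1.0 / 3.0)


if __name__ == "__main__":
    sigma = 1.0  # assumed, normally unknown in practice
    for n in (50, 500, 5000):
        h_star = optimal_bin_width(n, roughness_normal(sigma))
        print(n, h_star, amise(h_star, n, roughness_normal(sigma)))
```

Because R depends on the derivative of the unknown density, any practical choice of h rests on a reference assumption about f (a normal reference gives h* ≈ 3.49 σ n^(−1/3)), and the bin size shrinks only as n^(−1/3), so histograms from small samples remain coarse.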

Supplementary Note 2: Impact of the linker
In the single-molecule pulling experiment, a GGGC linker is included to provide freedom of rotation and to facilitate linkage to the maleimide group on the PEG linker. Additional experimental data (set D in the main text), acquired after deleting GGG, yield the same adsorption free energy. In the MP-SPR measurements, only the A3 peptide is studied, and yet a comparable adsorption free energy is obtained. These findings indicate that the linker has minimal impact on peptide binding. To understand the mechanism leading to the agreement with and without GGG, molecular dynamics simulations were carried out to observe how the molecule adsorbs on the surface.
Since cysteine is included for conjugation of the peptide via the maleimide-thiol interaction, we must include the maleimide compound in our simulation. Supplementary

Supplementary Note 3: Applicability to other systems
To test whether the approach is applicable to other model systems, we collected a set of data for the interaction of peptide P1 (HSSYWYAFNNKT) with a graphene surface. [5] The graphene-binding P1 peptide was purchased with a -GGGC terminal group to provide freedom of rotation and facilitate linkage to the maleimide group on the PEG linker. [6][7][8] The peptide, HSSYWYAFNNKT-GGGC, was purchased from GenScript at 98.9% purity. AFM tips were prepared from APTES-modified DNP Bruker probes purchased from Novascan Technologies, Ames, IA, USA. PEG modification was performed using the same protocol as for data sets A, C, and D in the main text, and peptide modification was performed following the same protocol as in the main text, using P1-GGGC as