Validation of Size Estimation of Nanoparticle Tracking Analysis on Polydisperse Macromolecule Assembly

As the physicochemical properties of drug delivery systems are governed not only by the materials of which they are composed but also by their size, it is crucial to determine the size and size distribution of such systems with nanometer-scale precision. The standard technique used to measure the size distribution of nanometer-sized particles in suspension is dynamic light scattering (DLS). Recently, nanoparticle tracking analysis (NTA) has been introduced, which measures the diffusion coefficient of particles in a sample to determine their size distribution and whose results are commonly interpreted in relation to DLS results. Because DLS and NTA rely on the same physical quantity, the diffusion coefficient, to determine particle size but differ in the weighting of the distribution, NTA can be a good verification tool for DLS and vice versa. In this study, we evaluated two NTA data analysis methods based on maximum-likelihood estimation, namely finite track length adjustment (FTLA) and an iterative method, on monodisperse polystyrene beads and polydisperse vesicles by comparing the results with DLS. The NTA results from both methods agreed well with the mean size and relative variance values from DLS for monodisperse polystyrene standards. However, for the lipid vesicles prepared in various polydispersity conditions, the iterative method resulted in a better match with DLS than the FTLA method. Further, it was found that it is better to compare the native number-weighted NTA distribution with DLS, rather than its converted distribution weighted by intensity, as the variance of the converted NTA distribution deviates significantly from the DLS results.

Efforts to develop new drugs are not limited to the physicochemical properties of pharmaceuticals. They also include explorations of effective ways to deliver those drugs without compromising efficacy or safety [1][2][3][4][5][6][7][8][9][10] . Despite advances in molecular biology research, many drugs still have serious side effects due to the lack of a specific target and of an appropriate controlled-release profile, and these side effects limit our ability to design optimal medications for many diseases, including cancer, neurodegenerative diseases and infectious diseases [11][12][13][14][15] . To address this issue, researchers have developed several new types of drug delivery system (DDS) that have entered clinical practice, including nanoparticles based on polymers, noble metals and lipid-based carriers 3,[16][17][18][19] . The interactions and stability of such materials are strongly dependent on carrier size, whose characterization is crucial in assessing the quality and determining the efficiency of the DDS [20][21][22] . In particular, chemical modification of nanoparticles is necessary to make them suitable for physiological conditions, and accurate measurement of their size is necessary for quality control [23][24][25] . This requirement has become more significant as nanoparticles, and their chemical modifications, have been developed for more specific purposes 1,26,27 . Likewise, lipid vesicles, due to their versatile engineering capabilities, have been combined with various therapeutic agents to achieve desired pharmaceutical properties 28,29 . Because lipids inherently self-assemble, validation of the size and size distribution of the resulting vesicles is essential to understand the physical properties that directly correlate with drug efficacy. All of these factors highlight the importance of using accurate and precise measurement techniques to characterize the size distribution of biological and synthetic nanoparticle suspensions.
The comparison confirmed that both MLE-based methods can recover the narrow size distribution of the monodisperse reference standards. The estimation methods were also applied to polydisperse vesicle samples prepared in various polydispersity conditions to verify whether FTLA and the iterative method are applicable to polydisperse samples of unknown size distribution. While the iterative method achieved results comparable with DLS, the results from FTLA deviated from those of DLS because FTLA assumes an arbitrary size distribution that may not be appropriate for polydisperse samples. In addition, the size distribution of NTA acquired with the estimation methods was converted from its number-weighted form into an intensity-weighted size distribution to investigate whether the conversion gives a better comparison with the DLS results.

Results and Discussion
Size Measurement of Polystyrene Latex Nanoparticle Standards. To validate NTA and its size distribution analysis methods, monodisperse polystyrene (PS) latex standards were measured by DLS and NTA. The mean size and the polydispersity index (PI) of the DLS results were acquired by cumulant analysis of the measured auto-correlation function and are shown in Fig. 1. The mean size obtained by DLS is in good agreement with the nominal values, with 91 ± 1, 278 ± 1 and 352 ± 3 nm acquired for the 92, 269 and 343 nm particle size standards, respectively. The PIs of the samples were 0.037 ± 0.029, 0.016 ± 0.009 and 0.035 ± 0.018, respectively, indicating that the samples were monodisperse.
Size information from NTA, shown in Fig. 1, was extracted from the track data with segment length greater than 5, and processed by the direct conversion method. The mean sizes of the 92, 269 and 343 nm PS standards were 94 ± 4, 261 ± 3 and 320 ± 3 nm, respectively, close to their respective nominal values. The relative variances (RVs, the variance divided by the square of the mean size) of the 92, 269 and 343 nm PS standards were 0.075 ± 0.009, 0.033 ± 0.007 and 0.044 ± 0.001, respectively, confirming that the size distributions were monodisperse.
To determine whether the RV of the NTA results from the direct conversion could be reduced by accounting for the limited track segment length, the two MLE-based size estimation methods, i.e., FTLA and the iterative method, were applied to the particle tracks acquired from the standard samples. The size distributions of the 92 nm standard sample acquired by direct conversion, FTLA and the iterative method are shown in Fig. 2. The distributions of the other standard samples are presented in the Supporting Information. As shown in Fig. 1, mean sizes of 88 ± 4, 250 ± 3 and 294 ± 7 nm acquired by FTLA and 89 ± 4, 250 ± 4 and 295 ± 6 nm acquired by the iterative method were determined for the 92, 269 and 343 nm standard samples, respectively. RVs of 0.019 ± 0.003, 0.010 ± 0.008 and 0.052 ± 0.022 by FTLA and 0.038 ± 0.003, 0.041 ± 0.010 and 0.058 ± 0.002 by the iterative method were acquired for the 92, 269 and 343 nm standard samples, respectively. The RVs of the 269 and 343 nm standards acquired by the FTLA and iterative methods are not very different from those derived by direct conversion, whereas that of the 92 nm standard is significantly reduced. This suggests that the MLE-based methods are especially effective for size distribution estimation when the particle size is small, because the large diffusion coefficient of small particles limits the acquisition of tracks with a sufficiently long track segment length. For small particle samples, the error in size estimation by direct conversion is enlarged as the error is inversely proportional to the square root of the track segment length, making the observed size distribution broad 57 . For large particle samples, the particles in the sample are tracked for long enough that the error in size estimation by direct conversion is as small as that of the MLE-based methods.
In the comparison, the mean sizes of the standard samples determined by the FTLA and iterative methods match each other well despite their different approaches to finding the maximum likelihood. However, the RVs from FTLA are smaller than the corresponding values from the iterative method. The assumed size distribution of FTLA has the advantage of a smoother size distribution but at the expense of reduced likelihood 60 , resulting in the smaller RVs of these PS standard sample measurements. Given that the samples are monodisperse, assuming a log-normal distribution for the size distribution estimation does not decrease the likelihood compared with the corresponding result from the iterative method (see Supporting Information).
POPC Vesicles. POPC vesicles prepared in various polydispersity conditions were measured by DLS. The results are presented in Fig. 3. The mean size of the vesicle samples increased with increasing pore size of the extrusion filter. For vesicle samples extruded through filters with pore sizes of 50 nm (V50), 100 nm (V100), 200 nm (V200) and 400 nm (V400), DLS revealed mean sizes of 91 ± 4, 132 ± 3, 213 ± 4 and 458 ± 13 nm, respectively, and PIs of 0.089 ± 0.032, 0.064 ± 0.035, 0.077 ± 0.048 and 0.291 ± 0.013, respectively. Despite the increase in mean size, the PI does not vary much for pore sizes of 50, 100 and 200 nm, indicating that these samples are relatively monodisperse. However, it jumps to about 0.3 for the 400 nm pore size, showing that the vesicle sample extruded through 400 nm pores is highly polydisperse.
For the different extrusion filter pore sizes, the mean sizes measured by NTA were similar to those from DLS, increasing with increasing filter pore size. However, the RVs acquired from NTA indicated that the samples were polydisperse and did not significantly vary with filter pore size, unlike the PI values obtained by DLS. For the various numbers of FT cycles, the trend of mean size measured by NTA also matched that obtained by DLS, decreasing gradually as the number of cycles increased. However, the RVs of the samples from NTA show a gradual increase as the number of FT cycles increases, which is opposite to the trend of the PI measured by DLS; the DLS trend probably arises because the FT treatment led to homogenization of the vesicle samples 61 . Remarkably, for some of the samples the mean size acquired from NTA was larger than its corresponding mean size from DLS, in contrast to the expectation that the mean size of an intensity-weighted distribution is larger than that of its corresponding number-weighted distribution 45,48 . Because the polydispersity of the samples is very large, a larger mean size from NTA is all the more unexpected.
Regardless of the condition of the vesicle samples, the mean sizes estimated by the two MLE-based methods match very well despite the different approaches of the two methods. However, the RVs from the two methods are different, with the RVs from the FTLA method being smaller than those from the iterative method except for the 400-nm filter pore size condition. Although FTLA finds a smaller RV than the iterative method, as noted above, it does so at the cost of a smaller likelihood value because it assumes a certain shape for the size distribution, i.e., a log-normal distribution, for parameter optimization 60 . In fact, the likelihood value of the size distribution acquired through FTLA is smaller than that from the iterative method except in one case (see Supporting Information). Moreover, some of the likelihood values are even smaller than their corresponding values from the direct conversion, i.e., without any statistical processing. This implies that FTLA is inappropriate for estimating the size distributions of NTA measurements on polydisperse samples unless a proper assumption about the sample size distribution is available, and that the iterative method is preferable to FTLA for such samples 54 .
With the iterative method applied as the preferred method for the size distribution estimation of NTA measurements, the size distributions of the vesicle samples acquired by DLS and NTA are compared in Fig. 4. A comparison of the distribution profiles shows that the size distribution is monomodal for most of the vesicle samples, V400 being the exception, and that the distribution from NTA is shifted toward smaller sizes relative to its corresponding distribution from DLS. This shift can be attributed to the different physical quantities by which the two measurement techniques weight the distribution, i.e., the intensity of scattered light for DLS and the number of particles for NTA, and a larger shift is expected for higher polydispersity 45,48 . However, the shift (i.e., the difference in the mean sizes measured by DLS and NTA) is very small compared with the acquired PI and RV of the samples, even after applying the iterative method for better size estimation of NTA. Filipe et al. proposed that the small shift can be explained by the lower detection limit of DLS, which detects small particles below 30 nm; this would also explain the relatively high PI value measured by DLS 52 . This explanation predicts a smaller RV for a size distribution acquired by NTA than the corresponding PI from DLS, since NTA would miss the small-size range of the true size distribution and report a narrower distribution. However, the results of FT11 and FT13 show that their respective RVs from NTA are larger than their corresponding PIs from DLS while the mean size of each sample measured by NTA and DLS is similar, which contradicts this expectation.
Theoretically, the small difference in the mean sizes measured by DLS and NTA can be due to the complex nature of the light scattering intensity in relation to the scattering form factor of the sample 48 . Therefore, the number-weighted size distribution acquired by NTA was reconstructed into an intensity-weighted distribution (as shown in Fig. 4) by assuming an isotropic thin-shell hollow sphere model for the vesicle samples and introducing the RGD approximation to construct the form factor 48,55,62 . The reconstruction revealed mean sizes of 102 ± 1, 118 ± 9, 191 ± 4 and 279 ± 46 nm and RVs of 0.065, 0.056, 0.033 and 0.271 for V50, V100, V200 and V400, respectively, and mean sizes of 209 ± 17, 201 ± 7, 173 ± 4, 165 ± 3, 147 ± 6, 148 ± 5, 131 ± 2 and 130 ± 3 nm and RVs of 0.116 ± 0.045, 0.169 ± 0.021, 0.090 ± 0.020, 0.076 ± 0.002, 0.113 ± 0.019, 0.128 ± 0.002, 0.110 ± 0.003 and 0.102 ± 0.013 for the vesicle samples treated with 3 to 17 FT cycles, respectively. Figure 5 compares the mean sizes and RVs of the original number-weighted and the converted intensity-weighted distributions from NTA with those from DLS. The mean sizes measured by NTA and those acquired from its reconstructed intensity-weighted size distribution are very close to each other. However, the RVs from the reconstructed NTA size distributions are markedly low for some samples, and it is questionable whether those samples can be considered monodisperse based only on the reconstructed values. This result indicates that converting the NTA size distribution to an intensity-weighted distribution does not effectively mitigate the difference in the mean sizes observed by DLS and NTA and that it is better to compare the measurements by DLS and NTA when the size distributions are expressed in their original weightings.

Conclusion
This paper introduces and compares two particle size measurement techniques, DLS and NTA, for the size characterization of polydisperse macromolecular assemblies. While both techniques acquire size information by detecting the diffusion coefficient of the measured particles, they differ in the weighting of the size distribution they report, with NTA reporting a number-weighted distribution and DLS an intensity-weighted one. Three size distribution estimation methods for NTA were tested to compare their performance using monomodal PS latex standards. The two MLE-based methods, i.e., the FTLA and iterative methods, produced results comparable to each other and in line with those from DLS, while the direct conversion resulted in a larger variance, especially for small particles, due to the limitation of obtaining sufficiently long particle tracks. This result indicates that an MLE-based approach should be applied for the accurate measurement of small particles with NTA.
The two MLE-based size distribution estimation methods for NTA were further tested on measurements of vesicle samples prepared with various polydispersities. The two methods obtained the same mean sizes despite their different strategies for finding the optimal size distribution from the given track data. However, the calculated likelihood of the size distributions obtained by the two methods indicated that FTLA sacrificed likelihood in favor of a smoother size distribution, further suggesting that the iterative method is preferable for polydisperse samples.
The results for the vesicle samples obtained by NTA using the iterative method were then compared with those from DLS. The mean sizes were comparable except for those samples with a very high polydispersity index, i.e., V400 and FT3. However, the mean sizes of some samples, e.g., FT9 to FT17, measured by NTA were larger than or very close to those from DLS, which seems contradictory to the fact that the mean size from an intensity-weighted size distribution is larger than that from its corresponding number-weighted size distribution. Although the size detection limit of NTA, which would reduce the variance measured by NTA, has been pointed out as a source of the contradiction, it does not fully explain why the mean sizes of V50 differed less between DLS and NTA than those of V100 and V200 despite its higher PI value.
Additionally, the number-weighted size distribution of the vesicle samples from NTA was converted into an intensity-weighted distribution by assuming the thin-shell hollow sphere model with the RGD approximation to verify the influence of the different quantities that DLS and NTA produce. While the mean size given by NTA after conversion was very close to the value before conversion, the RV after the conversion was much smaller than those from DLS and from NTA before conversion. Considering the nature of the vesicle samples, it is questionable whether the small relative variance after reconstruction is reliable. Therefore, the conversion of a number-weighted size distribution of NTA into an intensity-weighted one does not seem to effectively explain the difference in the mean sizes measured by DLS and NTA, and the size information from DLS and NTA is better compared in their original weightings.

Theory
Nanoparticle Tracking Analysis. Similar to DLS, NTA extracts the size information of particles in suspension by measuring their diffusion coefficient 51 . As illustrated in Fig. 6, by taking sequential images of illuminated particles in suspension at a periodic time interval, the displacement of a particle can be identified from successive images and constructed into a track. To determine the displacement of particles, NTA compares the two-dimensional location of particles in an image frame with that in the subsequent frame. In doing so, NTA sets a certain threshold distance, also known as the maximum jump distance, to properly identify whether the two particles in the two adjacent frames are the same particle. If exactly one particle is found in the successive frame within the threshold distance from the location of the particle in the previous image, the two particles are recognized as the same particle and make a track segment.
In the same manner, this recognition process is performed for subsequent frames and track segments are combined to construct a particle's track. The track terminates if there is no particle in the following frame or if there are two or more particles within the threshold distance. Because of the nature of the tracking process, the track segment length, and the number of track segments, are finite and variable.
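The linking rule described above can be illustrated with a minimal sketch. It assumes that a track is extended only when exactly one candidate lies within the maximum jump distance and is ended otherwise; the function name `link_frames` and the coordinates are hypothetical, and the sketch ignores the case where two tracks claim the same particle, which real tracking software has to resolve.

```python
import math

def link_frames(prev_pts, next_pts, max_jump):
    """Map each particle index in prev_pts to its unambiguous match in next_pts."""
    links = {}
    for i, (x0, y0) in enumerate(prev_pts):
        # Candidates in the next frame within the maximum jump distance.
        near = [j for j, (x1, y1) in enumerate(next_pts)
                if math.hypot(x1 - x0, y1 - y0) <= max_jump]
        # Exactly one candidate extends the track; none or several end it.
        if len(near) == 1:
            links[i] = near[0]
    return links

# Hypothetical particle positions (arbitrary units) in two successive frames.
prev_pts = [(0.0, 0.0), (10.0, 10.0), (20.0, 0.0)]
next_pts = [(0.5, 0.2), (20.3, 0.1), (19.8, -0.4)]
links = link_frames(prev_pts, next_pts, max_jump=1.0)
print(links)
```

Here the first particle has a single neighbor and is linked, the second has none, and the third has two candidates, so the last two tracks end at this frame.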
For a track with a track segment length n, the mean squared displacement z of the track is expressed as
$$z = \frac{1}{n}\sum_{i=1}^{n} r_i^{2},$$
where r_i is the two-dimensional displacement of the ith track segment of the track. Assuming 2D Brownian motion, z is related to the diffusion coefficient by
$$z = 4D\,\Delta t,$$
where D is the diffusion coefficient of the tracked particle and Δt is the time interval of the image frames 51,52,57,58 . The acquired diffusion coefficient D of the track is then converted to the hydrodynamic diameter using the Stokes-Einstein equation,
$$d = \frac{k_B T}{3\pi\eta D},$$
where k_B is the Boltzmann constant, T is the temperature, η is the viscosity of the medium and d is the hydrodynamic diameter of the tracked particle. The hydrodynamic diameters obtained from all tracks make up the size distribution of the measured sample.
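As a worked illustration of this direct conversion, the sketch below turns the per-frame displacements of one track into a hydrodynamic diameter, assuming 2D Brownian motion (z = 4DΔt) and the Stokes-Einstein relation. The function name, the frame rate of 30 frames per second and the water-like viscosity are illustrative assumptions, not values taken from the study.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K

def hydrodynamic_diameter(displacements, dt, temperature, viscosity):
    """Direct conversion of one track to a diameter (SI units throughout).

    displacements: list of (dx, dy) per track segment, in metres.
    dt: frame interval in seconds; viscosity in Pa*s; temperature in K.
    """
    n = len(displacements)
    # Mean squared displacement z of the track.
    z = sum(dx * dx + dy * dy for dx, dy in displacements) / n
    # 2D Brownian motion: z = 4 * D * dt.
    diffusion = z / (4.0 * dt)
    # Stokes-Einstein: d = k_B * T / (3 * pi * eta * D).
    return K_B * temperature / (3.0 * math.pi * viscosity * diffusion)

# Round trip for an assumed 100 nm particle in a water-like medium.
d_true = 100e-9
dt, temp, eta = 1.0 / 30.0, 298.15, 8.9e-4
d_coeff = K_B * temp / (3.0 * math.pi * eta * d_true)
step = math.sqrt(4.0 * d_coeff * dt)  # every segment has the ideal length
d_est = hydrodynamic_diameter([(step, 0.0)] * 10, dt, temp, eta)
print(d_est)
```

In a real measurement the segment lengths fluctuate, so the estimate from a short track scatters around the true diameter.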

Size Distribution Estimation Methods for NTA. Uncertainty in the Mean Squared Displacement Measurement in NTA. Measurement of the mean squared displacement z in NTA assumes that a particle is tracked for long enough that the measured mean squared displacement is close enough to the ideal mean squared displacement. However, as the track segment length is finite, the acquired mean squared displacement z of a track has a statistical uncertainty that is inversely proportional to the square root of the track segment length n 54,63 . Therefore, tracks with a small track segment length make the measured size distribution significantly broader than the true size distribution, and they should be excluded to obtain a narrower size distribution.
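The scaling of this uncertainty with track segment length can be checked with a small simulation. The sketch assumes that each squared 2D displacement of a Brownian particle is exponentially distributed with mean θ = 4DΔt (a standard property of 2D Brownian motion, not a statement from the study); the function name and the number of simulated tracks are arbitrary choices.

```python
import random
import statistics

def relative_spread_of_msd(n, theta=1.0, tracks=4000, seed=0):
    """Relative standard deviation of the track MSD z for segment length n."""
    rng = random.Random(seed)
    # Each squared displacement is drawn from an exponential law with mean theta.
    zs = [sum(rng.expovariate(1.0 / theta) for _ in range(n)) / n
          for _ in range(tracks)]
    return statistics.pstdev(zs) / statistics.fmean(zs)

for n in (5, 20, 80):
    # Compare the simulated spread with the 1/sqrt(n) prediction.
    print(n, relative_spread_of_msd(n), n ** -0.5)
```

The simulated spread follows 1/sqrt(n), which is why short tracks broaden the size distribution obtained by direct conversion.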

Maximum Likelihood Estimation (MLE) with an Assumed Arbitrary Distribution with Parameters.
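As particles observed by NTA undergo Brownian motion, the probability distribution of the mean squared displacement z of a tracked particle of size d with a track segment length n can be described by a gamma distribution as follows 58 :
$$P(z \mid n, d) = \frac{(n/\theta)^{n}\, z^{\,n-1}\, e^{-nz/\theta}}{(n-1)!}, \qquad \theta = 4D\,\Delta t = \frac{4 k_B T\,\Delta t}{3\pi\eta d}.$$
An observed set of tracks Φ of a sample measured by NTA can be related to a particle size distribution f(d) by the integration
$$P(z \mid n) = \int_0^{\infty} P(z \mid n, d)\, f(d)\, \mathrm{d}d.$$
In principle, f(d) could be recovered from the observed tracks through the conditional probability Q(d|n, z) that a track with track segment length n and mean squared displacement z originates from a particle of size d; however, the inversion relation cannot be used because Q(d|n, z) depends on f(d) 60 .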
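The gamma distribution of the track mean squared displacement can be written down numerically as follows. This is a sketch under the assumption that the distribution has shape parameter n and mean θ = 4DΔt, which follows from averaging n exponentially distributed squared displacements; the function names `theta_of_size` and `log_p_msd` are hypothetical.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K

def theta_of_size(d, dt, temperature, viscosity):
    """Expected MSD per frame, theta = 4 * D * dt, for diameter d (SI units)."""
    return 4.0 * K_B * temperature * dt / (3.0 * math.pi * viscosity * d)

def log_p_msd(z, n, theta):
    """Natural log of the gamma density of z with shape n and mean theta."""
    return (n * math.log(n / theta) + (n - 1) * math.log(z)
            - n * z / theta - math.lgamma(n))

# Midpoint-rule check that the density integrates to one for n = 5, theta = 1.
step = 0.001
total = sum(math.exp(log_p_msd(step * (i + 0.5), 5, 1.0)) * step
            for i in range(10000))
print(total)
```

Working with the logarithm avoids numerical underflow when many track likelihoods are multiplied together.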
Due to the difficulty of finding Q for the inversion, finite track length adjustment (FTLA) can be used to determine f(d) by maximizing the likelihood of the observed list of tracks Φ with an assumed size distribution for f(d) as illustrated in Fig. 7 57 . For an assumed size distribution with a few parameters, for example a log-normal distribution with its geometric mean and geometric standard deviation set as parameters, an "ideal" set of particles d_j (with j = 1-999) is drawn, which comprises the 1000-quantiles of the size distribution, i.e., the particle sizes d_j that satisfy C(d_j) = j/1000 (j = 1-999) for the cumulative size distribution C(d) of f(d). For each particle size d_j, the likelihood of producing a track with mean squared displacement z and track segment length n is given by the gamma distribution described above evaluated at d = d_j, denoted P(z | n, d_j). Hence, the likelihood of producing such a track with a given particle set d_j (j = 1-999) is given as
$$P(z \mid n, \{d_j\}) = \frac{1}{999}\sum_{j=1}^{999} P(z \mid n, d_j),$$
and the likelihood of producing the observed list of tracks Φ is
$$L(\Phi) = \prod_{k=1}^{K} \frac{1}{999}\sum_{j=1}^{999} P(z_k \mid n_k, d_j),$$
where z_k and n_k are the mean squared displacement and the track segment length of the kth track of Φ, respectively, and K is the total number of the observed tracks. For practical reasons, the maximization is performed on the logarithm of the likelihood, as maximizing a positive function also maximizes its logarithm 58 :
$$\ln L(\Phi) = \sum_{k=1}^{K} \ln\!\left[\frac{1}{999}\sum_{j=1}^{999} P(z_k \mid n_k, d_j)\right].$$
With the log-likelihood, an optimal parameter set for the size distribution is sought for its maximum value, which determines the size distribution for the observed tracks. As this approach assumes an arbitrary size distribution, we can expect a decrease in the likelihood for the benefit of a simpler solution 60 .

Figure 7. Schematic representation of the size determination methods based on maximum likelihood estimation. By assuming a size distribution described by adjustable parameters, the FTLA method looks for the parameter values that maximize the likelihood of the given NTA tracks. The iterative correction method refines the size distribution so that it approaches the maximum likelihood with each iteration.

Maximum Likelihood Estimation by Iterative Correction. For a size distribution f(d), the log-likelihood of producing the observed list of tracks Φ can be expressed as
$$\ln L(\Phi) = \sum_{k=1}^{K} \ln\!\left[\sum_{b} P(z_k \mid n_k, d_b)\, f(d_b)\right],$$
where b is the bin number for the segmented range of particle size d 58 .
Then its differentiation with respect to the size distribution is given as
$$\frac{\partial \ln L(\Phi)}{\partial f(d_b)} = \sum_{k=1}^{K} \frac{P(z_k \mid n_k, d_b)}{\sum_{b'} P(z_k \mid n_k, d_{b'})\, f(d_{b'})}.$$
At the maximum of the likelihood, the differential becomes zero under the normalization constraint that the f(d_b) sum to one, leading to
$$f(d_b) = \frac{1}{K}\sum_{k=1}^{K} \frac{P(z_k \mid n_k, d_b)\, f(d_b)}{\sum_{b'} P(z_k \mid n_k, d_{b'})\, f(d_{b'})},$$
which is solved by iteration. The agreement between the histogram of the observed mean squared displacements and that predicted from the estimated size distribution is quantified by a chi-squared value χ^2 evaluated over the track segment lengths n and the mean squared displacement bins, where m is the bin number for the mean squared displacement and N_n is the number of tracks with track segment length n, and in this study the iteration terminates when the change in χ^2 becomes smaller than 1% of the previous value 58 .

Conversion of Number-Weighted Distribution of NTA. The size distribution measured by DLS is weighted by the intensity of the scattered light, which is dependent on the particle size, whereas that from NTA is weighted by the number. In comparing the two quantities, one must be converted to match the weighting of the other. The number-weighted size distribution f(d) from NTA can be converted into an intensity-weighted distribution f_I(d) given by
$$f_I(d) = \frac{f(d)\, I(q,d)}{\int_0^{\infty} f(d')\, I(q,d')\, \mathrm{d}d'},$$
where q is the scattering vector and I(q,d) is the form factor of the measured particles 66,67 . For vesicles, we assume a thin-shell hollow sphere model 48,68,69 , so that
$$I(q,d) \propto d^{4}\, P(q,d),$$
where P(q,d) is the structural factor for the vesicle approximated by the Rayleigh-Gans-Debye (RGD) approximation given by
$$P(q,d) = \left[\frac{\sin(qd/2)}{qd/2}\right]^{2}. \tag{19}$$

Experimental Setup. For the DLS measurements, a ZetaPals particle size analyzer (Brookhaven Instruments, Holtsville, NY) with a 658.0 nm monochromatic laser was used. For each sample, three independent runs of 1 min were performed. To avoid unnecessary reflection, all measurements were taken at a scattering angle of 90°, and the measured intensity autocorrelation function was fitted to yield the intensity-weighted size distribution of particles in solution. The deconvolution of the autocorrelation function was done using the cumulants method, which was applied to calculate the intensity-weighted log-normal profile of the size distribution expressed by the average effective diameter and its polydispersity. NTA measurements were made with an LM10HS (Nanosight Limited, Amesbury, UK) equipped with a scientific CMOS camera, a 20x objective lens, a blue laser module (405 nm, LM12 version C) and NTA software version 3.1. A 1-ml disposable syringe was used to inject the samples into the instrument chamber. The video data for NTA measurements were collected for 30 seconds, repeated three times for each sample. The detection threshold of the NTA software was set to 5 and the maximum jump distance and the minimum track segment length were both set to auto.
Detected tracks were then translated into a size distribution using three different methods, i.e., direct conversion of the detected particle tracks, maximum likelihood estimation with an assumed distribution (the FTLA method) and maximum likelihood estimation by iterative correction. For the conversion, valid tracks were acquired from the detected tracks by the software, with tracks of a small track segment length excluded from the selection.
FTLA assumes a certain shape for the size distribution to maximize likelihood, and we assumed a log-normal distribution in this study so that the mean and standard deviation parameters could be optimized to produce maximum likelihood 57 . For the calculation, 1000-quantiles are generated from a log-normal distribution while varying its mean size and standard deviation, which represent an "ideal" set of particles for the assumed size distribution 57 .
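The FTLA calculation described above can be sketched as follows. This is a simplified illustration, not the implementation used in the study: it assumes a gamma density with shape n and mean θ for the track mean squared displacement, writes θ(d) = c/d with a hypothetical constant c that lumps together temperature, viscosity and frame interval, and replaces the unspecified optimizer with a plain grid search over the two log-normal parameters.

```python
import math
import random
from statistics import NormalDist

def log_p_msd(z, n, theta):
    """Natural log of the gamma density of z with shape n and mean theta."""
    return (n * math.log(n / theta) + (n - 1) * math.log(z)
            - n * z / theta - math.lgamma(n))

def lognormal_quantiles(geo_mean, geo_sd, count=1000):
    """Sizes d_j with C(d_j) = j / count for j = 1 .. count - 1."""
    nd = NormalDist()
    return [geo_mean * geo_sd ** nd.inv_cdf(j / count) for j in range(1, count)]

def ftla_log_likelihood(tracks, geo_mean, geo_sd, c, count=1000):
    """tracks: list of (z, n); c: assumed constant with theta(d) = c / d."""
    sizes = lognormal_quantiles(geo_mean, geo_sd, count)
    total = 0.0
    for z, n in tracks:
        logs = [log_p_msd(z, n, c / d) for d in sizes]
        top = max(logs)  # log-sum-exp shift to avoid underflow
        total += top + math.log(sum(math.exp(v - top) for v in logs) / len(sizes))
    return total

def ftla_fit(tracks, c, geo_means, geo_sds, count=1000):
    """Grid search for the parameter pair with the highest log-likelihood."""
    grid = [(gm, gs) for gm in geo_means for gs in geo_sds]
    return max(grid, key=lambda p: ftla_log_likelihood(tracks, p[0], p[1], c, count))

# Synthetic monodisperse sample: size 100 (arbitrary units), so theta = c / d = 1.
rng = random.Random(1)
tracks = [(rng.gammavariate(10, 1.0 / 10), 10) for _ in range(200)]
best = ftla_fit(tracks, c=100.0, geo_means=[80.0, 100.0, 120.0],
                geo_sds=[1.05, 1.3], count=200)
print(best)
```

The example uses 200-quantiles only to keep the run short; passing `count=1000` reproduces the 1000-quantile set described in the text.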
In the iterative method, the size distribution f(d) is refined by the iteration
$$f^{(j+1)}(d_i) = \frac{1}{N}\sum_{k=1}^{N} \frac{P(z_k \mid n_k, d_i)\, f^{(j)}(d_i)}{\sum_{i'=1}^{M} P(z_k \mid n_k, d_{i'})\, f^{(j)}(d_{i'})},$$
where f^{(j)}(d_i) is the normalized fraction of those particles in the size distribution of the jth iteration whose size is between d_i and d_{i+1}, P(z_k | n_k, d_i) is the probability of observing the mean squared displacement z_k of the kth track with track segment length n_k for particle size d_i, and N is the number of tracks in the given set of tracks. The initial size distribution f^{(0)}(d) is set to a uniform distribution so that f^{(0)}(d) = 1/M, where M is the number of bins for the size distribution. The iterations are performed until the change in the chi-squared value between the histogram of the displacement of the observed tracks and that of the estimated size distribution from the iteration is less than 1% of the previous value 58 .
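A minimal sketch of such an iterative refinement is given below. It assumes the multiplicative maximum-likelihood update in which each size bin is reweighted by the average probability that a track originated from it, starting from a uniform distribution. For brevity it runs a fixed number of iterations instead of the chi-squared stopping rule used in the study, and θ(d) = c/d with a hypothetical constant c.

```python
import math
import random

def log_p_msd(z, n, theta):
    """Natural log of the gamma density of z with shape n and mean theta."""
    return (n * math.log(n / theta) + (n - 1) * math.log(z)
            - n * z / theta - math.lgamma(n))

def iterative_correction(tracks, sizes, c, iterations=100):
    """tracks: list of (z, n); sizes: bin sizes d_i; returns fractions f(d_i)."""
    m = len(sizes)
    f = [1.0 / m] * m  # uniform starting distribution, f = 1 / M
    rows = []
    for z, n in tracks:
        logs = [log_p_msd(z, n, c / d) for d in sizes]
        top = max(logs)
        # Per-track rescaling cancels in the ratio below and avoids underflow.
        rows.append([math.exp(v - top) for v in logs])
    for _ in range(iterations):
        new_f = [0.0] * m
        for row in rows:
            denom = sum(p * fi for p, fi in zip(row, f))
            for i in range(m):
                new_f[i] += row[i] * f[i] / denom
        f = [v / len(tracks) for v in new_f]
    return f

# Synthetic monodisperse sample of size 100 (arbitrary units), theta = 1.
rng = random.Random(2)
tracks = [(rng.gammavariate(20, 1.0 / 20), 20) for _ in range(300)]
f = iterative_correction(tracks, sizes=[50.0, 100.0, 200.0], c=100.0)
print(f)
```

Each update keeps the fractions normalized, so no separate renormalization step is needed.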
In comparing the results from DLS and NTA, the mean size of NTA from the acquired size distribution is compared with the mean hydrodynamic size R z of DLS while the relative variance (RV, the variance divided by the square of the mean size) of NTA is compared with the PI of DLS.
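The conversion of a number-weighted distribution into an intensity-weighted one, together with the mean size and RV used for the comparison, can be sketched as below. The sketch assumes that the scattered intensity of a thin-shell vesicle scales as d^4 times the RGD factor [sin(qd/2)/(qd/2)]^2, and the example value of q assumes water (refractive index 1.33) with the 658 nm laser at 90°; both are illustrative assumptions, and the three-bin distribution is made up.

```python
import math

def rgd_factor(q, d):
    """Thin-shell RGD factor [sin(qd/2) / (qd/2)]^2, equal to 1 at qd = 0."""
    x = q * d / 2.0
    return 1.0 if x == 0.0 else (math.sin(x) / x) ** 2

def intensity_weighted(sizes, f_number, q):
    """Reweight number fractions by the assumed intensity d^4 * P(q, d)."""
    weights = [f * d ** 4 * rgd_factor(q, d) for d, f in zip(sizes, f_number)]
    total = sum(weights)
    return [w / total for w in weights]

def mean_and_rv(sizes, f):
    """Mean size and relative variance (variance / mean^2) of a distribution."""
    mean = sum(d * p for d, p in zip(sizes, f))
    var = sum((d - mean) ** 2 * p for d, p in zip(sizes, f))
    return mean, var / mean ** 2

# Scattering vector in 1/nm for an assumed medium index of 1.33.
q = 4.0 * math.pi * 1.33 / 658.0 * math.sin(math.radians(90.0) / 2.0)
sizes = [60.0, 100.0, 140.0]   # made-up bin sizes in nm
f_num = [0.3, 0.5, 0.2]        # made-up number fractions
f_int = intensity_weighted(sizes, f_num, q)
print(mean_and_rv(sizes, f_num), mean_and_rv(sizes, f_int))
```

Because the weighting grows steeply with size, the intensity-weighted mean shifts toward the larger bins of the number-weighted distribution.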

Data Availability
The data that support the findings of this study are present in the paper and the Supplementary Information. Additional information is available from the authors upon reasonable request.