Label-free nanofluidic scattering microscopy of size and mass of single diffusing molecules and nanoparticles

Label-free characterization of single biomolecules aims to complement fluorescence microscopy in situations where labeling compromises data interpretation, is technically challenging or even impossible. However, existing methods require the investigated species to bind to a surface to be visible, thereby leaving a large fraction of analytes undetected. Here, we present nanofluidic scattering microscopy (NSM), which overcomes these limitations by enabling label-free, real-time imaging of single biomolecules diffusing inside a nanofluidic channel. NSM facilitates accurate determination of molecular weight from the measured optical contrast and of the hydrodynamic radius from the measured diffusivity, from which information about the conformational state can be inferred. Furthermore, we demonstrate its applicability to the analysis of a complex biofluid, using conditioned cell culture medium containing extracellular vesicles as an example. We foresee the application of NSM to monitor conformational changes, aggregation and interactions of single biomolecules, and to analyze single-cell secretomes.


Scattering intensity
Elastic scattering of light is a fundamental process in light-matter interactions. For threedimensional nano-objects, such as biomolecules or nanoparticles, whose size is much smaller than the wavelength of the incident light (l), this process is known as Rayleigh scattering. From here forward we explicitly refer to biomolecules only. However, we highlight that the solution we have developed is generic and also applies to other nano-objects equally well. The total scattering intensity (power), that is collected by the optical system, is determined by the incident intensity (I0), the collection efficiency ( ), and the scattering cross-section of a biomolecule ( ! ), defined as 1 where = 2 ⁄ is the wavenumber of light in the medium with refractive index n, and ! is the polarizability of the biomolecule, where ! is the refractive index of the biomolecule, V is its volume, and is the depolarization factor, depending on its shape (for a sphere = 1 3 ⁄ , for an ellipsoid ranges between 0 and 1 2 ⁄ depending on the ratio of its semiaxis).
A nanochannel, which is an object of finite (subwavelength) size in only two dimensions, can scatter light as well. It can be approximated as a cylinder with cross sectional area (A) and length that is much larger than √ . The refractive index of the inside, ni, is lower than the refractive index of the outside, no. The scattering intensity is constant along the length of such a nanochannel, and the total scattering intensity can be written as where L is the length of the nanochannel from which the scattered light is collected and ' is the scattering cross section of the nanochannel. For a nanochannel having circular crosssection, ' can then be determined by the analytical expression presented by Mie 1 . For a nanochannel with √ ≪ (Rayleigh limit), and light incidence angle perpendicular to its long axis, the solution can be simplified to 1 where ko is the wavenumber in the medium outside the nanochannel, ( = 2 ( ⁄ , and is the one-dimensional polarizability of the nanochannel, where is the depolarization factor that depends on the shape of the nanochannel (for a cylinder with aspect ratio equal 1, = 0 for TM polarization, = 1 2 ⁄ for TE polarization).

Fig. S1. Schematic of the assumed scenario: a cylindrical nanochannel with a single biomolecule inside.
When a biomolecule is present inside such a nanochannel (Fig. S1), the combined scattering by the nanochannel and the biomolecule becomes a complex canonical problem of scattering by a sphere embedded inside a cylinder. The analytical solution to this problem was recently presented, based on the expansion of cylindrical waves into spherical ones and vice versa 2 . However, it leads to a series of rather complex expressions, from which it is difficult to draw a more generalized conclusion. Therefore, we instead employ the Clausius-Mossotti relation that describes the relation between the microscopic quantity -polarizability -and the macroscopic quantity -dielectric function. It states that the contribution of biomolecules enclosed in a cavity can be described in terms of a bulk optical property of the material inside the cavity -the effective permittivity 3 where nV is the volume density of the biomolecules, , = ⁄ , where N is the number of biomolecules in the volume V.
Using the expression for the polarizability of a cylinder (Equation 6), where the refractive index of the inside is assumed to be * ≈ √ * (Equation 7), the total polarizability of a nanochannel comprising a single biomolecule can then be written as for a cylinder with circular cross-sectional and TE polarization.
For biomolecules with low polarizability that are sparsely distributed inside the nanochannel,

Equation 11
In order to confirm the conclusions of the analytical theory presented above, we have carried out finite-difference time-domain (FDTD) simulations. They were performed using the Lumerical FDTD Solutions software package. The simulated structure -a cylindrical nanochannel with radius rc -was placed into a 3D simulation cell of the size L ´ 0.5 ´ 0.5 µm 3 . The refractive indexes of the inside and outside of the nanochannel correspond to water (ni = 1.33 RIU) and SiO2 (no = 1.46 RIU), respectively. Biomolecules were simulated as dielectric spheres with refractive index nm = 1.43 and radius rm. Periodic boundary conditions were set in the x direction and perfect matching layer absorbing boundary conditions were set in the y and z directions. The mesh step was 1 nm. Light was introduced as a linearly polarized plane wave via a total-field/scattered-field source with normal incidence. If not stated otherwise, the wavelength of the incident light was l = 500 nm. The scattered light was collected from 4 near-field monitors surrounding the nanochannel in all directions.   (Equation 10 and Equation 11), presented as dependency of the relative difference of the scattering cross section of a nanochannel containing a biomolecule and an empty nanochannel, ( − )⁄ , on a variety of properties of the nanochannel and the biomolecule: polarizability (radius) and linear density of biomolecules ( = ⁄ ) (Fig. S2A), crosssectional dimension of the nanochannel (Fig. S2B), and axial position of the biomolecule inside the nanochannel (Fig. S2B, C, D). The overall trend captured by both approaches is in excellent quantitative agreement (Fig. S2A). However, the FDTD results show an additional dependency on the axial position of the biomolecule inside the nanochannel that is neglected by the analytical model and that originates from the phase difference between the light scattered by the nanochannel and the biomolecule (Fig. S2B, C, D). As a result, when a biomolecule freely diffuses inside the nanochannel, the scattering cross section randomly fluctuates over time, with the frequency depending on the diffusivity of the molecule (D). However, the mean value over all possible locations is very close to the analytical model (Fig. S2B). This means that if the mean travel distance, √2 Δ , within one integration time step (Δ ) is much larger than the dimensions of the nanochannel cross section, the phase differences are effectively averaged, and the obtained scattering cross section value corresponds very well to the analytical model.  The total scattering intensity, -= " -, that is collected from a region of a nanochannel of the length # = 3 (2 ( ) ⁄ containing a biomolecule can be written as Equation 12 using Equation 10 and Equation 11, where ' and ! correspond to the total scattering intensity of the nanochannel (Equation 1) and the biomolecule alone (Equation 4), respectively. Here, it has to be noted that the scattering intensity collected from a typical biomolecule (Fig. S3A) is several orders of magnitude lower than the scattering intensity collected from a typical nanochannel (Fig. S3B), ' ≫ ! , because the volume contributing to the detected intensity is much larger than the molecular volume, # ≫ ! . Therefore, the scattering signal stemming from the biomolecule that resides inside the nanochannel is not directly distinguishable (Fig.  S3C), -≈ ' . However, the scattering intensity integrated from the differential image ( Fig.  S3D) that is obtained by subtracting the scattering image produced by the empty nanochannel from a scattering image of the nanochannel with the biomolecule inside, -− ' ≈ −2[ ' ! , can be several orders of magnitude higher than the scattering intensity collected from the biomolecule alone, had it been outside the nanochannel, 2[ ' ! ≫ ! .

Integrated optical contrast
The intensity profile of a nanochannel image (defined as the collected scattering intensity integrated along the y-axis) is constant along the x-axis and can be written as ' ( ) = ( ' . The intensity profile of the differential image contains an intensity modulation stemming from the biomolecule that resides inside the nanochannel (Fig. S3E). The integrated optical contrast (iOC) that we here define as represents an integrated value of the intensity profile of the differential image normalized by the intensity profile of the empty nanochannel. As ∫ -( )

Equation 15
For small biomolecules ( ! ≪ ' ), this can be simplified to

Equation 16
that can be expressed in terms of the refractive indexes and the cross-sectional area of a nanochannel using Equation 6 as where for TE polarization and for TM polarization.
Here, it has to be noted that the intensity profile of the nanochannel can be modulated by the profile of the incident beam (I0 is not constant along the nanochannel). Nevertheless, the normalization by the intensity profile of the empty nanochannel from the same spot, as suggested in Equation 13, corrects for that. Moreover, as the biomolecule moves, the created image is altered by the motion blur resulting from time-averaging within one frame. However, as iOC represents the integrated value over the length of the nanochannel (Fig. S3E, Equation  13), it is not dependent on the shape of the intensity modulation stemming from the biomolecule and thus it is not altered by the motion blur. Moreover, iOC does not vary with the distance of the nanochannel from the focal plane, the wavelength, or NA of the imaging system. All of the aforementioned facts thus guarantee a high robustness and high precision of the measurement.
In order to confirm the conclusions of the analytical theory presented above, we have again carried out FDTD simulations. Compared to the model presented in the previous chapter, the size of the simulation cell was fixed to 3 ´ 3 ´ 0.5 µm 3 and near-field data collected in the backward direction with respect to the incident field were used to simulate the dark-field image by the following procedure. First, near field monitor data were decomposed into series of plane waves that propagate at different angles using a built-in function of the software package. Any plane waves with angles outside of the specific NA were then discarded. Finally, the remaining light was re-focused onto the image plane using the chirped Z-transform. Fig. S3 shows corresponding examples of simulated dark-field images of a biomolecule (Extended Data Fig.  3A), a nanochannel (Fig. S3B), and a biomolecule inside a nanochannel (Fig. S3C). Fig. S3D shows a differential image obtained by subtraction of the image of the nanochannel containing a biomolecule from the image of the empty nanochannel. iOC was then calculated by integrating the intensity profile of the normalized differential image (Fig. S3E) using Equation 13.

Light scattering of a biomolecule inside a coated nanochannel
Some NSM applications require surface modification of the channel walls, e.g., coating with a supported lipid bilayer (SLB) to prevent positively charged molecules from binding to the channel walls 4 . These modifications change the optical properties of the nanochannel and thus the optical contrast of the biomolecules inside it. Therefore, the translation of iOC to MW has to take the effect of such a coating into account when it is used.
To theoretically analyze the impact of the coating, a coated nanochannel can be described as a core-shell cylinder with one-dimensional polarizability 1

Equation 20
where ni, ns, no are the refractive indexes of the inner core, the shell, and the outside medium, respectively, A is the total cross-section area, Ai is the cross section area of the inner core, = * ⁄ is the fraction of the total cross-section area occupied by the inner core, and is the depolarization factor ( = 0 for TM polarization, = 1 2 ⁄ for TE polarization).
The one-dimensional polarizability for TM polarization can therefore be written as  where the second term in the denominator can be neglected for low f and small differences between the refractive indexes, When a biomolecule is present inside such a nanochannel, its contribution can be described in terms of increased effective permittivity of the inner core (Equation 7) and the total polarizability of a nanochannel containing a biomolecule that can then be written as -

Equation 24
For biomolecules with low polarizability that are sparsely distributed inside the nanochannel, and for a small difference between the refractive indexes, , ! ( * & − ( & ) ≪ 1, the second term in the denominator in Equation 24 can be neglected and the polarizability of the nanochannel containing the biomolecule (Equation 8 and Equation 9) can be written as where = 1 for TE polarization and for TM polarization. Note that the expression for the total polarizability of a core-shell cylinder (Equation 25) corresponds to the total polarizability of a cylinder without a shell (Equation 10), and therefore, Equation 11 to Equation 17 are valid for the core-shell cylinder as well. The value of the parameter d in Equation 17 is, however, different, as it accounts for the presence of the additional layer. It can be written as

Equation 26
where ' ' 7 ⁄ is the ratio between intensity of the scattered light from a cylinder with and without a coating (additional layer).

Polarizability of a protein
The fact that the refractive index of a protein solution increases linearly with the (mass) concentration of the protein (c) and that the specific refraction increments ⁄ have almost the same value for a large number of different proteins ( ⁄ = 0.185 mL • g 9: ) has been pointed out by many studies 5 .
Using the Clausius-Mossotti relation

Equation 27
that describes the relation between the polarizability of biomolecules present in solution ( ! ), volume density of the biomolecules ( , ), refractive index of the medium in which the biomolecules are present ( ), and the (effective) refractive index of the solution * , the relation between refractive index increment ( ⁄ ), molecular weight (MW), and the polarizability of a biomolecule normalized by MW can therefore be expressed as
Interestingly, the same value of ! ⁄ can be derived from a calculation of the properties over 171,729 different proteins presented by Young et al. that determined the mean specific volume of a protein (defined as volume per mass) as Vsp = 0.7446 mL g -1 and the mean refractive index of a protein as nm = 1.5867 RIU. Using the electrostatic approximation (Equation 6), the polarizability of a protein can then be expressed as

Equation 29
Since iOC is linearly proportional to the polarizability of a biomolecule (Equation 17), and since the polarizability of a biomolecule is linearly proportional to the molecular mass (Equation 28 and Equation 29), the molecular mass of a biomolecule inside a nanochannel can indeed be directly determined from iOC as

Equation 30
where = ! ⁄ = 0.46 Å ) • Da 9: and d, A are constants related to the properties of the nanochannels that can be determined prior to the measurements. In this study, the parameter A was determined from SEM images of the nanochannel cross section (Extended Data Fig. 3).
Parameter d = −10.25 was calculated from Equation 18 and Equation 19, where * = 1.33 RIU (water), ( = 1.46 RIU (SiO2) and even representation of TE and TM polarization was assumed, d = 0.5( d .0 + d ./ ). For nanochannels coated with a lipid bilayer, a modified parameter d 7 = d[ ' ' 7 ⁄ was used (Equation 26), where ' ' 7 ⁄ = 1.7 for Channel VI was determined from the intensity of scattered light from the nanochannel before and after the deposition of the lipid bilayer.

Error estimate of size determination due to partial-slip boundary condition
The size of individual particles is related to the diffusion constant through the Stokes-Einstein relation corrected for the hindrance effect due to the small nanochannel size as

Equation 31
where K is the hindrance factor 6 , is the radius of a circle defined by an area A. This expression assumes that the flow velocity at the boundary of the nanochannel vanishes, known as the "no-slip" condition. Recently, it was demonstrated 7 that SLB do not obey the no-slip condition. This can be taken into account by introducing a "slip length" , which denotes the distance below the surface of the bilayer at which the flow profile vanishes by extrapolating the flow profile. For bilayers, this was found to be around l = 5 nm. In order to account for this effect, we first note that the hindrance factor is obtained by averaging the influence of the hindrance on the nanoparticle diffusion over a cross section of the nanochannel,

Equation 32
Where B is the bulk diffusivity, is the radius of the particle, || is the diffusion constant parallel to the boundary 7

Equation 33
To lowest order, the effect of the slip length can be taken into account by extending the dimension of the channel to ´= + , while taking into account that the channel wall at = still constitutes the physical boundary. The integral boundaries therefore do not change, but the integrand changes to || ( + − ; )/ B . The effective hindrance factor can then be approximated as

Equation 34
The resulting effective hindrance factor relevant for Channel V (AV = 225 ´ 200 nm 2 ) is shown as a function of particle radius in Fig. S5, normalized to the hindrance factor assuming no-slip condition, ˆ⁄ . For particles up to 70 nm diameter (range of the BNPs detected in Channel V), it ranges between 0.96 -1.01. This suggests that the estimated error in size determination of BNPs presented in the context of the analysis of the conditioned cell culture medium ( Fig.  4C-F) is lower than 4% and therefore the effect of the partial-slip boundary condition can be safely ignored.

Removal of the background
To visualize the biomolecules and their trajectory inside a nanochannel in the form of kymographs (Fig. 2), the data acquired by the CMOS camera were processed using the following procedure.
In the first step, the raw images of the nanochannel in every frame were integrated along the short axis of the nanochannel and averaged using convolution with a Gaussian function with 18 pixels width. This step results in time series of an intensity profile of the nanochannel, * ( ), where i stands for i-th time frame and x is a spatial coordinate.
In the second step, the nanochannel's profile was corrected for spatial and intensity instabilities. At first, to correct for the spatial instabilities (vibrations of the system), the image profile in every i-th step spatially shifted by Δ * was estimated using spline interpolation (Matlab). After that, to correct for the intensity instabilities, the normalized differential image profile ( E ‹ ( )) was calculated as

Equation 35
from which the low frequency modulation was filtered using

Equation 36
where the denominator in Equation 36 stands for the convolution of E ‹ ( ) with a step function ℎ( ) with width 5 = 200 pixels, defined as The spatial shifts in every i-th step, Δ * , were found in an iterative procedure based on the Newton-Raphson method that finds a minimum of the function

Equation 38
where G(H is the length of the field of view. More specifically, a refined n-th guess, ∆ * I , is estimated from two previous guesses ∆ * I9: and ∆ * I9& , as Equation 39 ( where In the third step, the background (intensity profile of an empty nanochannel), * J , is estimated from corrected image profiles, as where a window of size At = 100 frames was used, whose size was truncated at the endpoints. Kymographs are then obtained by subtracting the background followed by normalization by the background, as

Equation 43
In the fourth step, the positions of the biomolecule are found using the particle tracking algorithm, described in detail in SI section "Particle tracking" and a matrix * ( ) was created for which * ( ) = * # ( ) when the i-th time frame and x position correspond to a found particle, and * ( ) = 0 otherwise. In the last step, all the four steps are repeated, but the intensities that were identified by the particle tracking algorithm as responses corresponding to biomolecules are excluded from the process by diving the time series of an intensity profile of the nanochannel, * ( ), by the matrix * ( ). The last step is repeated iteratively until convergence is achieved, i.e., the difference between * # ( ) derived in the subsequent iteration steps is not larger than the standard deviation of # .

Particle tracking
To find the positions of a biomolecule and link them to a trajectory, we have used the particle tracking algorithm described in 8 . In order to improve the performance for biomolecules with low iOC, we have modified some parts of the algorithm, as described below. To detect possible biomolecules, all the minima in every time frame of the kymograph ( # ) were found and their zero order intensity moments (m) and positions (x) were determined using a procedure described in 8 . In the next step, the found minima were used to create all possible short trajectories with temporal length of 8 frames for which the difference of the positions between two subsequent frames were lower than 3[2∆ M*!*-, where ∆ is the time difference between two subsequent frames and M*!*-is an optional parameter setting the highest diffusivity of a biomolecule that would be found by the algorithm. In all presented cases we have used a M*!*-= 50 µm 2 /s. The mean value of m ( • N -ddddddd ), standard deviation of m , and the diffusivity ( • K -) of every trajectory K that is composed of 8 found minima (possible biomolecules, * ) is calculated according to

Equation 44
To evaluate the characteristics of the noise, we followed the same procedure for a kymograph with inversed intensity (− # ), i.e., we find all minima, create all possible short trajectories ( K  I ) and calculate the mean value and the standard deviation of m that correspond to the noise ( . From the set of the values • N Idddddddd we determine its mean value and standard deviation, and define a limiting value ME!E-dddddddd = mean˜• N Idddddddd š − 4std˜• N Idddddddd š. In a similar fashion, from a set of values ! • K Iwe determine its mean value and standard deviation and define a limiting value ! M*!*-= mean˜!• K I -š + 4std• ! • K I --. All found possible trajectories for which • N -ddddddd > ME!E-dddddddd and ! • K -> ! M*!*are then considered to be noise and are discarded. All the remaining trajectories (with temporal length of 8 frames) are combined into all possible trajectories of longer temporal lengths and their cost function is calculated as The complete set of single biomolecule trajectories is then selected from longest trajectories with lowest cost function while assuming that every found minima is assigned to only one biomolecule. Trajectories that were shorter than 40 frames were considered to correspond to noise and thus discarded.
In the last step of the algorithm, parts of the trajectories that correspond to the biomolecules that are bound to the surface (not diffusing) are found using the following procedure. We assume that if the biomolecule is bound to the surface (does not change its position), the standard deviation of the found position, corresponds to the (yet unknown) localization error, 5 = , that is related to the noise in the kymographs. The standard deviation of the spatial displacement in time then corresponds to P5 = √2 . Therefore, the ratio of the two standard deviations corresponds to 5 P5 ⁄ = 1 √2 ⁄ . On the other hand, if the biomolecule is diffusing, the standard deviation of the displacement in time corresponds to P5 = 2 ∆ + , where D is the diffusivity of a biomolecule and ∆ is the time difference between two subsequent frames, and the standard deviation of the position corresponds to 5 = ∆ √2 + √2 where N is the number of frames. Therefore, the ratio of the two standard deviations then increases with number of frames and for ≫ ∆ ⁄ corresponds to 5 P5 ⁄ = √2 2 ⁄ . It can be seen that

P5
⁄ differs between the two states of the biomolecule (diffusing or bound) and can thus be used as merit to distinguish those two different states. Therefore, for every part of a trajectory with temporal length of N = 30 frames, 5 P5 ⁄ is calculated and if the derived value is lower than the threshold limit 0.9, the part of the trajectory is assigned to a biomolecule that is bound to the surface and it is discarded from the further analysis. . Both of these approaches resulted in comparable results, however, the second one lead to an improved processing speed. The diffusivity was obtained from the statistical analysis of the spatial displacement in time determined by the particle tracking algorithm as 9 It has to be noted, that the second factor corrects for both motion blur and the localization error 9 .

Machine learning (ML) analysis
To track and analyze single biomolecules within the nanochannels using a method that is independent from the standard analysis (SA) presented in the SI sections "Removal of the background" and "Particle tracking" and therefore validate its results, we implemented a machine-learning (ML) analysis workflow 10 whose results are summarized in Extended Data Fig. 4. It is comprised of five parts: 1. Pre-processing of raw image data into kymographs. 2. Image segmentation using a U-net neural-network 11 to identify the position of the single biomolecules. 3. Object detection using the YOLOv3 algorithm 12 to identify the trajectories of the single biomolecules. 4. Property calculation using a custom fully convolutional neural network (FCNN) with residual connections to determine the properties of the single biomolecules ( and ). 5. Conversion from to and from to 8 .
All intermediate results of the ML algorithm are described in the sub-sections below, and the whole pipeline is summarized in Fig. S6. to molecular weight and from to hydrodynamic radius 8 plotted in a 2D histogram for illustration purposes.

Data pre-processing from raw data to kymograph
The raw image data (Fig. S6A) was pre-processed to transform it into kymographs (Fig. S6B). First, the intensity of the raw CMOS image data was normalized according to where ( , ) is the intensity at position and time-frame , and 〈 ( , )〉 represents its time average. Second, a low-pass-filtered version of ̅ ( , ) was calculated by using two normalized sliding windows of sizes 200 × 1 and 1 × 200, and subtracted from ̅ ( , ). Finally, to obtain the kymographs used to calculate , the resulting image was down-sampled by a factor of 4 in the length dimension through mean pooling. To obtain the kymographs used to calculate , the image was instead normalized by its standard deviation before being down-sampled by a factor of 4 in the length dimension through mean pooling.

Image segmentation with U-net
Image segmentation is used to make the value of each pixel of a given image more representative of a property of interest. In our case, we transform a kymograph where the value of each pixel represents intensity (Fig. S6B), to a segmented image where the value of each pixel represents the probability S of the existence of a biomolecule in that location and time (Fig. S6C). We perform this transformation using a U-net 11 implemented using the Python software package DeepTrack 2.0 10 . The details of the U-net architecture are shown in Fig. S7A. The U-net was trained as part of a conditional Generative Adversarial Network (GAN) 13 . This GAN consists of two neural networks (Fig. S7B): a generator network (the U-net), which generates image segmentations based on input kymographs; and a discriminator neural network, which attempts to figure out whether the generated segmentations are the real trajectories of single biomolecules in the input kymographs. During the training process, the two networks compete against each other, so that the generator generates increasingly realistic trajectories, and the discriminator learns increasingly subtle ways to distinguish generated from real trajectories. Using this GAN for training was essential to achieve the required accuracy especially for smaller biomolecules, for which the low signal-to-noise ratio often results in trajectories being entirely overpowered by noise in several subsequent time frames. The U-net was trained on simulated kymographs (generated as described in subsection 1 and pre-processed as described in subsection 1) for which the corresponding ground-truth single biomolecule trajectory is known (using the ADAM optimizer 14 with an exponentially decaying learning rate of 10 9$ , a decay rate of 0.9, and 50 decay steps). The U-net was trained using 300,000 simulated kymographs with particle trajectories in the ranges 1 · 10 9? µm ≤ ≤ 3 · 10 9) , 1 ≤ D ≤ 100 µm & /s. The input data during training are simulated images (kymographs) of size 128´2048 with a random number of trajectories and of output segmented images of equivalent size. The model is also train-validated every 120 epochs (=~30000 simulated kymographs) against 150 simulated kymographs (of size 128´512, 128´1024, 128´2048) with experimentally measured channel noise, using an 80-20 train-validation split. 16 .

Trajectory identification with YOLOv3
As third stage of our analysis, we implemented a YOLOv3 12 (Fig. S6D) object detection algorithm in PyTorch 17 to identify single biomolecule trajectories by separating a given kymograph into smaller ones, each corresponding to a single biomolecule trajectory. If a kymograph cannot be separated in this manner (for instance, as a result of two biomolecule trajectories strongly overlapping with each other), the YOLO algorithm instead counts the number of inseparable single biomolecule trajectories in the given kymograph and outputs this value. The network architecture behind this algorithm is visualized in Fig. S8. This algorithm analyses the segmented images obtained by the single-biomolecule-tracking Unet described in subsection 2 to fit a minimally-dimensioned rectangular bounding box around each separate single biomolecule trajectory. If the minimal bounding boxes of two single biomolecule trajectories overlap more than the set threshold -= 60%, their bounding boxes are merged to a minimal rectangular bounding box around both of them that is labelled as containing two biomolecules. If three or more trajectories overlap, their bounding boxes are merged and labelled as three + biomolecules. The YOLOv3 algorithm was trained on simulated kymographs, generated in the same way as for the U-net (described in subsection 1 above), but where the input is a perfectly segmented kymograph and the outputs are minimally sized bounding boxes around each separate trajectory. Trajectories are considered separate in the training data if they are more than /16 frames apart, where is the total amount of frames in the kymograph. Images are continuously generated during training, such that each subsequent image is entirely unique and the risk of overtraining is avoided. The algorithm was trained using approximately 2 million such unique segmented images, where each image was generated with a diffusion in the range ≈ 1 to 100 µm & /s (the optical contrast is not relevant here because we employ perfectly segmented simulated images) with the standard YOLO error function 12 using the ADAM optimizer 14 with a learning rate of 0.001. The input data during training are simulated images (segmented kymographs) of size 128´8192, down-sampled to 128´128 to improve performance, and the output is a list of YOLO-labels containing class, position, occurrence probability and class probability of each trajectory in the input image.

Fig. S8. YOLOv3 for trajectory identification. The YOLO algorithm is built upon a DarkNet-53 backbone 18 , which consists of a neural network architecture comprised of 53 convolutional layers. This backbone feeds into three necks, each consisting of a sequence of 5 convolutional layers, which in turn feed into corresponding heads, which output bounding boxes at three different scales through YOLO detection layers 12 . Each neck is up-sampled and fed into its neighboring neck to make sure the last yolo detection layer benefits from information collected at every previous stage. Here, "Conv" represents a single convolutional layer, is the number of filters in each block,
is the kernel size, and is the activation function. The network is visualized with Netron 15 .

Property calculation with FCNN
To calculate and for each single biomolecule trajectory (Fig. S6E), we use a custom neural network architecture, which consists of a FCNN connected with residual blocks. The architecture itself is a sequence of convolutional layers followed by a max pooling layer, including short skip connections to preserve information, as shown in Fig. S9. The full architecture consists of seven of these blocks, culminating in a head module where the algorithm produces slightly different outputs depending on whether we are calculating or .
To calculate the , the head of the neural network returns three outputs: (1) The raw output UVWXYZ from the last convolutional layer of the neural network. (2) The mask, which is a downsampled representation of the original kymograph normalized by its pixel sum. This effectively gives us a weighted matrix, where each matrix element is the probability that the corresponding down-sampled region in the kymograph contains a biomolecule trajectory. (3) The calculated for each kymograph, obtained by multiplying UVWXYZ by the mask to produce a value of for each pixel in the down-sampled kymograph weighted by the probability of said pixel containing a trajectory, and finally summing over the entire down-sampled image to obtain a single mean value of for the entire kymograph. Calculating works as for , except that we use the down-sampled U-net biomoleculetracking algorithm's pre-segmented image as the mask, instead of re-calculating it using the FCNN. This is done for two reasons: because the intensity network cannot be trained on segmented images since is entirely dependent on the noise in an image, and because the diffusion calculation is inherently sensitive to the quality of the single biomolecule trajectory tracking so that a more powerful segmentation model is needed. To train the intensity-and diffusivity-calculating FCNN models, we employed a curriculum learning scheme with intermittent checkpoints to be used for later ensemble modelling prediction. Specifically, the intensity-calculating model was initially trained only on a narrow range of high iOC trajectories, representing the highest SNR and in principle easiest case for the model to begin learning correlations, and then slowly curriculum-learned down to the lowest range of relevant iOC values. In each narrow range of iOC values, checkpoint models which are more accurate in that particular narrow range of values are saved separately. Upon model inference, an initial prediction is made with a model trained on the entire range of iOC values with the scheme described above, and then a second model trained on the narrower range of values makes a second prediction on the same trajectory to achieve higher accuracy. The process is equivalent for diffusivity, with the difference being that the range of D values being trained on increases rather than decreases during curriculum learning. These were all trained using the ADAM optimizer 14 with a learning rate of 0.0001 on approximately 300,000 simulated kymographs in the same range as those employed for the Unet. The input during training of the FCNNs are simulated images (kymographs) of size 128´2048 with a single particle trajectory, and the output is a single value of either iOC or D of said trajectory, as well as the mask, which is a learnt downsampled representation of the original kymograph. We note here that the ensemble modelling technique adds yet another barrier against false signal detection, since if the epistemic uncertainty inherent in the data is high, which may be the case for very short trajectories or regions of unusually high noise, the two predictions will differ significantly and hence the prediction should be discarded. Of course, as is common in ensemble modelling 19 , one can implement an even larger and finer-grained ensemble for even higher prediction accuracy and more accurate uncertainty estimation.  S9. FCNN for property calculation. The custom FCNN architecture is comprised of "Conv" blocks, which are singular convolutional layers, and a sequence of "Conv*" blocks, which consist of a sequence of convolutional layers followed by a max pooling layer and a skip connection, as exemplified in the legend. This sequence culminates into a custom property layer that calculates and . Here, is the number of filters in each block, is the kernel size, and is the activation function. The network is visualized with Netron 15 .

ML analysis results
In Extended Data Fig. 4 we show the results using the ML analysis, which are in agreement with the results obtained using the SA analysis presented in Fig. 3. Note that, since the ML analysis retrieves longer trajectories that the SA analysis maximizing the available data for each data point, this results in fewer data points for the ML results than for the SA results, even though the total amount of data is the same for both methods. Since the ML analysis is completely independent from the SA analysis, the agreement between these two analyses further strengthens the confidence that these results are correct. The ML method is fully automatized and operates 1-2 orders of magnitude faster than the SA method.

ML analysis of cell conditioned medium
To analyze the cell conditioned media, the overall ML algorithm is the same as described above, with the exception that the YOLO is replaced with a thresholding function on the segmented kymographs and the U-net/FCNN models are trained in slightly different schemes. In general, YOLOv3 struggles to identify object boundaries in situations where several small objects are in very close proximity or even overlap, as has been shown consistently with, e.g., flocks of birds or other small animals 20 . As a result of the very high concentration of lipoprotein trajectories in the segmented kymographs, this limitation adversely affects EV analysis as lipoprotein trajectories may inadvertently be included in the bounding boxes of otherwise isolated EV trajectories and lead to inaccurate results. To avoid this issue, we implemented a simple thresholding function, which works by first identifying the starting point of each new trajectory (under flow) in a segmented kymograph (i.e., time " where S > ℎ ℎ at " = 0). Beginning with the first trajectory, we investigate all points in a square region of size -× -at ( " , " ) where S > ℎ ℎ . If none of these points are of equal t and unequal x, or vice versa, we consider the box to only contain a single trajectory, and the process is repeated in a box of size -× -at ( * , * ) where * = " + -, * = t( * ) in iteration . This forms a sequence of boxes [ " , : , . .. , * ] which are considered to contain a single particle trajectory, until box *F: either reaches the end of the kymograph or if two points in the box are of equal t but unequal x, or vice versa. If the latter occurs, the single particle trajectory is saved as contained in the aforementioned sequence of boxes, and the algorithm repeats by forming a new sequence either from box *F& or from the start of a new trajectory, if the end of the kymograph (lengthwise) has been reached. Hence, regions in which multiple trajectories cannot be separated by more thanpixels are discarded. Since the optical contrast of EV trajectories is orders of magnitude higher than that of lipoproteins, the scattered light from lipoprotein trajectories in the near vicinity of EVs becomes vanishingly small after the preprocessing steps described in Data pre-processing from raw data to kymograph, and hence each EV trajectory corresponds to a single EV. Additionally, the U-net and FCNN models are trained equivalently to the procedure described in Image segmentation with U-net and Property calculation with FCNN, with the following exceptions. First, to account for the induced flow in the channels, the models are trained on simulated particle trajectories within a large range of flow velocities and hence learn to ignore the effect of general flow on diffusion calculation. We note here that the accuracy of the diffusion calculation is dependent on flow velocity, as higher flow velocities in general lead to lower total measurement time of each separate biomolecule. This effect is however minimal in the low flow rates (~10 / ) used for cell conditioned media analysis. Second, the range of iOC values the models are trained on is considerably larger: ≈ 10 9> to 10 9: µm, and D values slightly smaller -≈ 0.1 to 30 µm & /s. Additionally, to improve accuracy below = 2.5 • 10 9) µm, we use the standard FCNN model described in Property calculation with FCNN in this regime. Third, since the channel size is larger and the trajectories of shorter length (in time), the models are trained on simulated kymographs of size 512x512 rather than 128x2048.

Evaluation of the data processing
To evaluate the performance of the data processing described in the SI sections "Removal of the background" and "Particle tracking", we have carried out benchmarking tests on ground truth data that were composed of experimentally recorded background signal, i.e., a time sequence of an intensity profile of a nanochannel filled with water (no biomolecules inside, " ) combined with a generated response of biomolecules ( C ) with defined properties (iOCdef and Ddef) that was simulated by the following procedure. At time zero, the positions of 10 biomolecules were randomly selected from the range of positions from -100 µm to +100 µm and their Brownian motion was recorded in time. At each time step (Δ ), the position of each biomolecule was calculated from the previous position, as * < = *9: is a normally distributed random number with a mean of zero and a standard deviation of one. When a position of a biomolecule reached a value that was either lower than -100 µm or higher than +100 µm, it was reflected back in a specular fashion. The optical response of a biomolecule was then simulated as a gauss function, whose central position, width, and magnitude were defined by * < , width of the diffraction limited spot ( Q4R ), and iOCdef, respectively. To match the frame rate of the recorded data (200 frames per second) and to mimic the continuous illumination, every time frame of the simulated response with temporal length of 5·10 -3 s was averaged over 100 time-steps (Δ = 5·10 -5 s) of the generated biomolecule positions, as where x is a space coordinate that corresponds to the space coordinate of the recorded data. For each combination of values of iOCdef = 10 -4 , 2·10 -4 , 5·10 -4 , 10 -3 , 2·10 -3 µm, and Ddef = 10, 20, 50 µm 2 /s, 10 different responses with a temporal length of 10000 frames were generated (selected examples are shown in Fig. S10). We note that the movement of a biomolecule within one time frame results in motion blur -broadening and shallowing of the intensity dip (insets Fig. S10).
The generated response was then combined with recorded background signal (Fig. S11) as = " • C and kymographs were created according to the procedure described in SI section "Removal of the background" (Fig. S12), positions of biomolecules were found using the algorithm described in SI section "Particle tracking algorithm". In the first step of the algorithm, the positions of all minima in the kymograph were found and their zero order intensity moments (m) were calculated. Since the m-values are calculated as a sum of intensities of a fixed fraction of a peak, and since the shape of the peak varies due to motion blur, mvalues are also slightly dependent on the diffusivity (Fig. S13). Specifically, the mean value of m corresponding to D = 50 µm 2 /s is 15% lower compared to the mean value of D = 10 µm 2 /s. Nevertheless, motion blur has a minimal effect on determination of the iOC value, as it is calculated as integral value of the whole intensity dip that always remains constant.    The determined values of iOC and D of all detected trajectories are shown as a scatter plot in Fig. S14A. The mean values of iOC and D (Fig. S14B) correspond to the position of the found peaks of the corresponding histograms of iOC (Fig. S14C) and D (Fig. S14D), where each trajectory is represented by N values of iOC and D, respectively. Clearly, the values identified by our procedure correspond to the ground truth very well, thereby corroborating the used data analysis protocol as a whole.
We also highlight that none of these tests revealed any false signals -i.e., no trajectories other than those simulated were found by the particle tracking algorithm. In addition, analysis of experimentally recorded background signal from a channel filled with pure PBS buffer, without the target molecules, did not reveal any false signal either (Fig. S11). That is the consequence of the fact that the used particle tracking algorithm discards all trajectories that have lower signal than 4 STD of the noise. In the case of machine learning, the models are trained on a wide variation of simulated noise distributions and train-validated against experimental channel noise to ensure that they do not find false signals in the experimentally relevant conditions that they have been validated against. Furthermore, we highlight that if the experimental conditions and noise distribution were to substantially change, the model could quickly be retrained by transfer-learning using the first few seconds of measurement of an empty channel and thereby always prevent false signal detection.

Noise analysis
To evaluate the level of the noise in the system, we measured the standard deviation of the intensity in a kymograph corresponding to the background recorded from a nanochannel filled with pure PBS buffer (Fig. S11) over 500 s for a range of time frame averaging values (temporal length of one video frame). The camera was operated at 5400 frames per second.
The experimental values are compared with the expected level of shot noise calculated as where \ & is the well depth of photoelectrons in a single pixel ( \ & = 33 9 for the Andor Zyla CMOS camera), 8 corresponds to spatial averaging (the signal was averaged over a diffraction-limited spot, 8 = 20 × 20 pixels) andcorresponds to the time frame averaging. Fig. S15 shows that the experimental values are very close to the theoretical shot-noise limit, which confirms that the measurement is shot-noise limited.

Resolution in molecular weight and hydrodynamic radius
The following section describes specific parameters and their interplay that influence the resolution of NSM in MW and RS. The values of MW and RS are inferred from the measured iOC and D, respectively. Therefore, we first discuss the factors that influence the resolution in iOC and D.
The uncertainty in iOC determination for each trajectory (standard deviation of the estimate for iOC, *_`) is expected to follow *_`= *_" k 1 ,

Equation 48
where *_" is uncertainty in iOC determination from a single frame and N is the number of frames a single trajectory is composed of. The uncertainty in D determination for each trajectory (standard deviation of the estimate for D, Q ) can be written (to first order in 1⁄ ) as 9 where = & ( Δ ) − 2 ⁄ , is the localization error, and = 1 6 ⁄ is a contribution corresponding to a motion blur for continuous illumination.
The resolution in iOC and D can be defined as the full-width-at-half-maximum of the peak in the histogram, where each detected trajectory from a data set corresponding to one type of a molecule is represented by N values of iOC and D, respectively. Since the number of frames varies stochastically, the width of the derived peak (and thus the resolution) depends on the distribution of the N values within one data set that can be described in terms of a probabilistic density function ( a ). a depends on the diffusivity, temporal length of a time frame, and the length of the field of view, and can be characterized by the mean value of N, ‹ ≈ ( ∆ ) 9 where x stands for either iOC or D. At first, to predict the shape of ℎ( ) that dictates the resulting resolution, we have simulated a for scenarios that correspond to experimental conditions using the following procedure. At time frame = 1, we have assumed a biomolecule at the position : = 0. After that, the positios of the biomolecule in the subsequent time frame was calculated from the previous position, as * = *9: + √2 Δ , where R is a normally distributed random number with a mean of zero and a standard deviation of one, and where Δ = 5 ms, and D =5 -60 µm 2 /s is the diffusivity of the biomolecule. If * < 0 or * > G(H = 17.5 µm, the biomolecule was considered to be out of the field of the view and the temporal length of a trajectory, N = i, was recorded. The process was repeated for 10 4 molecules. a was then plotted as the histogram of recorded N (an example for D = 20 µm 2 /s is shown on Fig.  S16A) and the mean value of N ( ‹ ) was calculated. Here we also take into account that all trajectories that have N < 40 are discarded by the particle tracking algorithm and are considered as noise. The obtained dependency of ‹ on D (Fig. S16B) was then fitted to the function ‹ = J , = 1540, = 0.57.

Equation 51
Subsequently, the expected histogram ℎ( ) for a general value ( − ̅ ) 5 " ⁄ was calculated from the simulated a using Equation 50 (Fig. S16C) and the FWHM of the peak (corresponding to 5 5 ⁄ ) was determined. Alternatively, the FWHM can be estimated from the gaussian fit, which can be highly relevant especially when evaluating statistics from a low number of trajectories. Interestingly, even though the shape of the peak is not strictly gaussian, the derived values of FWHM from the gaussian fit are very close to the real value (Fig. S16D). Therefore, we argue here that the gaussian fit can be used to estimate the resolution from the experimental data. The obtained dependency of 5 5 ⁄ on ‹ (Fig. S16D) was then fitted to the function respectively. ⁄ .
Having established the theoretical background, we now discuss the specific resolution in iOC and D obtained from three data sets analyzed by the standard analysis (SA), described in Methods: "Removal of the background" and "Particle tracking algorithm": simulated data corresponding to biomolecules defined by the combinations of values iOCdef and Ddef (Fig.  S12, more details in section "Evaluation of the data processing") and the experimental data obtained for different proteins and DNAs in Channel I and II (Fig. 2 in the main text). It can be seen that the resolution in both iOC and D exhibits a complex behavior. The resolution in iOC (Fig. S17A) increases both with diffusivity and iOC of a pertinent biomolecule, whereas the resolution in D (Fig. S17B) increases with diffusivity and decreases with increasing iOC of a pertinent biomolecule. To explain this behavior, we further discuss the main factors that influence both resolutions.
The resolution in iOC is directly proportional to the factor *_" that relates to the noise in the kymograph and that can be calculated from *_` (Fig. S17A) and ‹ (Fig. S17E), using the phenomenological model (Equation 53). The values determined for each data set (Fig. S17C) are compared with the value corresponding to the ideal system that is limited by the shot noise and that can be estimated as , where w = 1.18 µm (corresponding to 40 pixels) is the length of the nanochannel from which the iOC is calculated, -= 25 frames and 8 = 40 × 20 pixels corresponds to temporal and spatial averaging, and \ & = 33 9 is the well depth of a single pixel. It can be seen that *_" corresponding to the biomolecules from the data sets increases with iOC and the theoretical limit corresponding to the shot noise was reached only in the case of a biomolecule with the lowest iOC (200 bp DNA). That can be attributed to the additional noise introduced by the background removal algorithm. The resolution in D is directly proportional to the factor Q " that linearly increases with diffusivity and that is also related to the localization precision of the particle tracking algorithm (Equation 49), and that can be calculated from Q (Fig. S17B) and ‹ (Fig. S17E) using the phenomenological model (Equation 54). The values determined for each data set (Fig. S17D) are compared with the values corresponding to the ideal system with no localization error ( = 0). It can be seen that for high iOC the determined values for the data sets are very close to the case of the ideal system that indicates that for this regime, the localization error is negligible. However, with decreasing iOC and decreasing D, the values of Q " can be up to about 6 times higher compared to the ideal system which suggests that the localization error reduces the resolution in D in this regime. However, this behavior is expected for signals approaching the standard deviation of the noise in the system. Moreover, the resolution in both iOC and D are influenced by the distribution of the temporal lengths of the trajectories a pertinent data set is composed of and that can be characterized by ‹ . The values of ‹ derived for each data set (Fig. S17E) are compared with the expected values calculated using the phenomenological model (Equation 51). It can be seen that the derived values are lower than expected and show additional dependency on iOC that is not expected. This can be attributed to the fact that the trajectories with higher iOC suffer from higher variation in iOC than expected (see Fig. S17D) and therefore failing to fulfil the condition set on variation of iOC expected from an identical molecule. As a result, a single trajectory is separated into multiple shorter ones.
In addition, it has to be noted that the behavior for both simulated and experimentally obtained data sets are quantitively and qualitatively comparable, which confirms that the experimental data sets fulfil all the conditions that were applied to the simulated data set (e.g., free movement of the molecules). The exception is the ferritin and ADH systems, whose *_` (Fig. S17A) is about two times higher than for simulated molecules with comparable properties. This can, however, be attributed to the presence of additional populations containing different amounts of iron atoms in the case of ferritin (more details in the main text) and a possible population of dimers in the case of ADH. The determined value of iOC can be translated into molecular weight using Equation 1 in the main text. The resolution in MW, /c , is then linearly dependent on the resolution in iOC and linearly increases with the cross-sectional area of the nanochannel, /c = *_`• ( d /c ) ⁄ . We note here that these predictions assume that the shot noise level remains the same, i.e., the intensity and temporal averaging remain the same. In other words, downsizing of the nanochannel cross section improves the optical performance, but only to the level where the intensity of the scattered light saturates the camera at its maximal frame rate. In addition, downsizing the nanochannel cross section presents a challenge for nanofabrication, sets limits on the maximal size of molecules that can enter the channel and be analyzed, and increases the risk of clogging. However, in the section "Surface passivation by supported lipid bilayer" we show that the risk for clogging can be minimized by surface modification of the nanochannel walls by, e.g., a lipid bilayer, to avoid adsorption of molecules on the surface. In addition, in the section "Analysis of extracellular vesicles in conditioned cell culture medium" we show that the nanochannel dimensions can be tailored to accommodate even BNPs, such as EVs, without any obvious problems related to clogging. Looking forward, an analysis covering a wide range of molecular weights could be enabled by a series of nanochannels with varying sizes used in parallel. To prevent the clogging of the smaller nanochannels by larger molecules present in the sample, we propose that on-chip sorting systems 21 could be utilized.
The determined value of D can be translated into hydrodynamic radius using Equation 2 in the main text. The resolution in Rs, D ' , is then linearly dependent on the resolution in D and is indirectly proportional to the diffusivity of the biomolecule, D '~Q • 9: . Since the Q~ (Equation 49 and Equation 54) for systems with negligible localization error, the resolution in hydrodynamic radius for most of the studied molecules is around 2 nm (Fig. S18B). For the two smallest DNAs (200 kDa and 400 kDa) the resolution is worse (about 3 and 3.5 nm), due to the higher localization error specific for the regime of lower iOC and D.

The ferritin system
It is reported in the literature that the MW of ferritin may vary between 450 and 702 kDa, depending mostly on the amount of stored iron 22 . At the same time, exact and unbiased characterization of the MW of ferritin is challenging with existing methods, since its outer dimensions are independent of iron content 22 . Analysis is complicated further by the presence of reported up to five types of dimers with different interaction strengths 23 . This is unfortunate because accurate determination of iron content in ferritin from patients has recently been suggested as biomarker for clinical analysis of iron deficiency/overload related to infection, inflammation, and cancer 24 .
We investigate the ferritin system in more detail in Fig. S19, and include data from experiments in two additional nanochannels (Channels III and IV, Extended Data Fig. 3C, D) with different cross-section areas (AIII = 100 ´ 15 nm 2 , and AIV = 145 ´ 27 nm 2 ) analyzed by SA. The two-dimensional histogram of inferred MW and RS (Fig. S19A) and corresponding onedimensional histogram of MW (Fig. S19B) reveal three different populations, marked as M1-3, with MW ranging between ~ 500 and 900 kDa. Notably, they all have similar RS ~ 6 nm, agreeing well with reported values for ferritin monomers 25,26 . In addition, a small number of molecules with MW > 900 kDa can be seen (marked "Dim" in Fig. S19A). They exhibit a mean RS ~ 8 nm, agreeing well with the value reported for ferritin dimers 26 . This observation is further corroborated by plotting RS histograms for four different MW ranges that reveal a single RS population for the three MW populations "M1-3", with very good agreement between the different nanochannels ( Fig. S19C-E), and additional RS populations only for the data set obtained with Channel II (Fig. S19F), which had the largest cross-section. We thus conclude that the "M1-3" populations correspond to monomers with different amounts of iron, since this alters their polarizability 27 and thus iOC and inferred MW, and that the "Dim" population corresponds to dimers. The reason why the "Dim" population is only present in the largest Channel II is likely that dimers are effectively filtered by Channels I, III and IV, due to their smaller cross-sections.

Characterization of the conditioned cell culture medium using Nanoparticle Tracking Analysis
In order to validate the results of the NSM analysis of the conditioned cell culture medium, we have carried out a particle size distribution measurement of the same sample using Nanoparticle Tracking Analysis (NTA, Fig. S20), with which we identified a population of extracellular vesicles (EVs) whose mode value of RS = 37.9 nm is in very good agreement with the mode value identified by NSM (RS = 34 nm, Fig. 5D). The absence of the population of smaller particles (lipoproteins) in the NTA data is the consequence of these small particles being below the limit of detection of the method.