Fast custom wavelet analysis technique for single molecule detection and identification

Many sensors operate by detecting and identifying individual events in a time-dependent signal, which is challenging when signals are weak and background noise is present. We introduce a powerful, fast, and robust signal analysis technique based on a massively parallel continuous wavelet transform (CWT) algorithm. The superiority of this approach is demonstrated with fluorescence signals from a chip-based, optofluidic single particle sensor. The technique is more accurate than simple peak-finding algorithms and several orders of magnitude faster than existing CWT methods, allowing for real-time data analysis during sensing for the first time. Performance is further increased by applying a custom wavelet to multi-peak signals, as demonstrated using amplification-free detection of single bacterial DNAs. A 4x increase in detection rate, a 6x improved error rate, and the ability to extract experimental parameters are demonstrated. This cluster-based CWT analysis will enable high-performance, real-time sensing when the signal-to-noise ratio is hardware limited, for instance with low-cost sensors in point-of-care environments.


Additional Single-Peak Signal Analysis
Similar to the multi-peak events discussed in the main manuscript, single-peak events detected by the PCWA algorithm for 200 nm fluorescent beads (Supplementary Fig. 1a) contain not only the (temporal) location and magnitude of the peaks, but also the t (scale) values. Here, t represents the width of the Ricker wavelet that matches the fluorescence peak. This t value can be converted to the particle's flow velocity using the known spatial

Nanopore Translocation Detection
Nanopore sensors have emerged as ultrasensitive tools for detection and analysis of individual nanoparticles with numerous applications such as next-generation sequencing 1-3. They operate on the principle that individual particles moving through a nanoscopic membrane generate a characteristic modulation of an ionic current across the membrane. In this way, time-dependent electrical signals due to single events are produced analogous to the optical fluorescence signals discussed in the main manuscript.
We evaluated the PCWA algorithm by analyzing electrical signals recorded from the optofluidic chip augmented for use as a nanopore sensor. Supplementary Figure 2a illustrates the experimental setup, in which single SARS-CoV-2 RNAs were driven through the nanopore by an applied voltage VNP. The optofluidic nanopore device shown in Supplementary Fig. 2a consists of a hollow-core waveguide that delivers the target molecules, here SARS-CoV-2 RNAs bound to microbeads, from inlet reservoir (1) to the nanopore capture region by means of the applied electrokinetic voltage (VEK). A trap-assisted capture rate enhancement (TACRE) technique 4 employs the optical force from a light beam in the liquid-core waveguide to locally trap multiple beads holding target molecules underneath the nanopore. This increases the target concentration at the nanopore location and the rate of detection upon thermal release from the beads by orders of magnitude. Translocations of individual released nucleic acid molecules through the nanopore are detectable from the current change between reservoirs (2)
Supplementary Fig. 2 (caption fragment): with ~2,000 events detected in a 2.6 s long trace. c-d Zoomed-in windows showing the locations of the events detected using PCWA and two other CWT peak detectors 6,7.

Mass Spectroscopy (MS) Peak Detection
To benchmark the accuracy and recall performance of the PCWA method, we ran a peak detection task on a simulated mass spectra dataset 8. This dataset provides simulated protein spectra with noisy raw data alongside the true locations of the peaks. In the case of a mass spectrometry trace, a false positive is a detected peak that is not located within ±1% of the M/Z value of the true peak. Supplementary Figure 3

Multi-Spot Gaussian (MSG) Wavelet
Custom MSG wavelets are built up by adding N Gaussian functions representing the multi-spot excitation pattern in the analyte channels and encapsulating them with two negative Gaussian functions to provide a behavior similar to the Ricker wavelet. The side negative peaks also ensure that standard wavelet requirements (zero mean and square norm of one) are fulfilled.
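The construction described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function name `msg_wavelet`, the parameters `n_spots`, `spacing`, `sigma`, and `n_points`, the placement of the negative lobes one spot spacing outside the outermost spots, and the choice to give them the same width as the positive spots are all assumptions made here. The amplitude of the negative lobes is chosen so that the sampled wavelet has zero mean, and the result is then scaled to a square norm of one.

```python
import numpy as np


def msg_wavelet(n_spots, spacing, sigma, n_points=1024):
    """Sketch of a multi-spot Gaussian (MSG) wavelet.

    All parameters are illustrative assumptions: n_spots positive Gaussians
    of width sigma, separated by spacing, flanked by two negative Gaussians
    placed one spacing outside the outermost spots.
    """
    # Centers of the N positive spots, symmetric about zero.
    centers = (np.arange(n_spots) - (n_spots - 1) / 2.0) * spacing
    # Assumed positions of the two negative side lobes.
    side_centers = np.array([centers[0] - spacing, centers[-1] + spacing])

    # Time axis wide enough to contain the side lobes (5 sigma margin).
    half_width = side_centers[1] + 5.0 * sigma
    t = np.linspace(-half_width, half_width, n_points)
    dt = t[1] - t[0]

    def gaussian(center):
        return np.exp(-0.5 * ((t - center) / sigma) ** 2)

    positive = sum(gaussian(c) for c in centers)
    negative = sum(gaussian(c) for c in side_centers)

    # Scale the negative lobes so that the sampled wavelet sums to zero.
    side_amplitude = positive.sum() / negative.sum()
    wavelet = positive - side_amplitude * negative

    # Normalize to unit square norm (discrete approximation of the integral).
    wavelet /= np.sqrt(np.sum(wavelet ** 2) * dt)
    return t, wavelet


# Example with illustrative values (not taken from the experiments).
t, w = msg_wavelet(n_spots=6, spacing=1.0, sigma=0.15)
```

In a CWT, such a wavelet would be stretched or compressed along the time axis for each scale, so only the relative spacing and width of the spots matter in this sketch.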

Reference Simulations for Multi-Peak Signal Analysis
The experimental single DNA data in the main manuscript do not consist of known events, so the accuracy of the different algorithms cannot be compared against a known ground truth. To eliminate this uncertainty, an additional benchmark analysis was done on a simulated multi-peak signal. The simulated trace was created by randomly placing 9 The same three methods were then used for detection and identification of multi-peak events and the results are summarized in Supplementary