Image-Based Single Cell Sorting Automation in Droplet Microfluidics

The recent boom in single-cell omics has brought researchers one step closer to understanding the biological mechanisms associated with cell heterogeneity. Rare cells that have historically been obscured by bulk measurement techniques are being studied by single cell analysis and providing valuable insight into cell function. To support this progress, novel upstream capabilities are required for single cell preparation for analysis. Presented here is a droplet microfluidic, image-based single-cell sorting technique that is flexible and programmable. The automated system performs real-time dual-camera imaging (brightfield & fluorescent), processing, decision making and sorting verification. To demonstrate capabilities, the system was used to overcome the Poisson loading problem by sorting for droplets containing a single red blood cell with 85% purity. Furthermore, fluorescent imaging and machine learning was used to load single K562 cells amongst clusters based on their instantaneous size and circularity. The presented system aspires to replace manual cell handling techniques by translating expert knowledge into cell sorting automation via machine learning algorithms. This powerful technique finds application in the enrichment of single cells based on their micrographs for further downstream processing and analysis.


System overview
Experimental setup. A schematic overview of the system is shown in Fig. 1A. A custom built, modular microscopic imaging setup is employed in this study for sorting droplets containing the target samples, cells in this case. The sample is illuminated with a 470 nm LED (M470L3, Thorlabs, UK) and imaged using 10× and 40× objectives onto a camera (Mako U-130B, Allied Vision Technologies, Germany) for image processing. The Optical Scattering N 30 Hz Y Solenoid Sorting based on # of cells in a droplet 96 Electrical Impedance Y 100 Hz Y -Uses low conductivity buffer for droplets, can detect single cells. 7 Brightfield Imaging Y 100 Hz Y DEP Can detect droplets colonised by bacteria. 62 Optical Scattering Y -Y -Can detect down to single bacteria. 10 Optical Scattering Y 240 Hz Y DEP Can detect droplets colonised by bacteria. 31 Optical Scattering Y 160 Hz Y DEP Works with chlorophyll-containing cells owing to autofluorescence. 11 Brightfield Imaging Y 10 Hz Y DEP Uses template-matching algorithm to identify circular or elliptic cells. 57 Brightfield Imaging Y 250 Hz Y -Automated inlet controls droplet volume and average # of cells. 89 Brightfield Imaging Y 0.2 Hz N Solenoid Sorts based on cell morphology. 61 Multi-Colour Imaging N 100 Hz N Piezoelectric Machine learning ready, highly capable system, sorts based on cell morphology. 58 Dual-Colour Imaging Y&N 4 Hz Yes DEP Machine learning ready, real-time sorting verification, sorts based on cell morphology. This work Table 1. Significance of this work compared to relevant studies involving cell detection and single-cell preparation for analysis. DEP: Dielectrophoresis, Y: Yes, N: No, Droplets: Does the study involve droplets?, Label-free: Is the detection label-free?
fluids are infused into the device using syringe pumps (Legato 130, KD Scientific, USA). Image processing and control of peripheral equipment are performed using bespoke LabVIEW software. Droplet sorting is carried out using high-voltage fields generated by an amplifier (HVA Series, Ultravolt, USA) which receives the control signal from a data acquisition system (NI-DAQ, USB-6211, National Instruments). Image capture from the first camera is synchronised using a laser-scatter based detection system detailed below. An additional camera is used to visualise the sorting region of the device at a slightly lower resolution for validation and quality control, which is of paramount importance for cell sorting. Imaging from these two cameras is referred to as dual-imaging in the manuscript. Dual-imaging is achieved by a 50:50 beamsplitter mounted on a kinematic stage (B4C, Thorlabs, UK) along with two cameras (Fig. 1A). This allows for real time image acquisition and verification for the sorting channels. By using a demagnifying lens (Fig. 1A) and adjusting the kinematic stage, the centre of the reflected image can be moved around the field of view and in this case, positioned to visualise the outlets to record images of which droplets are sorted into which outlets. There are three regions of interest (ROI) (see Materials & Methods) defined for the real time imaging camera 1 (Fig. 1A); one for counting droplets before they arrive the sorting Y-junction and one each for imaging droplets in the sorted and waste channels for verification and counting. Camera 2 is used for capturing high-resolution images of droplets at a specific location before the sorting electrodes for processing and decision making. In order to ensure the droplet is repeatedly positioned at the optimum location in the device, a laser based backscatter droplet detection system is used; an 850 nm laser diode is positioned at the desired location within the device coaxial with the transmission light. The backscattered light is directed to a silicon photodiode detector (DET36A, Thorlabs, UK) via a 50:50 non-polarising beamsplitter cube (Fig. 1A). The output signal from the detector is fed into an analog input of an Arduino (Mega 2560, Arduino) via a voltage divider circuit. The custom programmed Arduino looks for a change in the signal caused by change in backscatter from the droplet-oil interface owing to refractive index mismatch. When a droplet is detected, the Arduino outputs a trigger signal to camera 2 to capture an image of the droplet. The laser light is blocked from being imaged on the cameras using two IR blocking filters (FES0800, Thorlabs, UK). With this technique, the location of the droplet is well defined which is important for image analysis as well as the timing of the electric field to sort droplets. The full width at half maximum of the laser spot is measured as 66 μm (see Material & Methods), slightly smaller than droplets with an average width of 80 μm.
To allow target droplets to be isolated from the others, two outlets from the device have been implemented. The target outlet has a smaller cross section and thus a higher fluidic resistance, resulting in droplets normally flowing down the waste channel of the device. When a keep decision is made, LabVIEW instructs the NI-DAQ to generate 20 square wave pulses (0-3 V, 500 Hz, 40 ms total duration) which are passed to the amplifier which amplifies it to 1.32 kV and applies to the electrodes within the device. This creates a high voltage AC field in the channel and a dielectrophoretic effect that pulls the target droplet to the high electric field intensity zone steering it to the keep channel, known as dielectrophoresis (DEP) 42 . The electric field is applied only for a short time (40 ms) resulting in the following droplets travelling into the waste channel and ensuring only one droplet is selected per pulse.
Sorting performance metrics. Sorting performance measurements carried out in this study are mostly based on the discussions in the book by Lee et al. 64 . In order to understand these metrics, a few important parameters are defined. N x i and N o i are the numbers of target and waste droplets at the input, respectively (Fig. 1B). They add up to the total number of droplets input into the device, N i . N x i is the number of target droplets detected by the imaging algorithm. N x k and N x w are the target droplets while N o k and N o w are the waste droplets collected in the keep and waste outlets, respectively (Fig. 1B). They are also known as true positives, false negatives, false positives and true negatives in order of appearance. Finally, N k and N w are the total number of droplets in the keep and waste outlets. Efficiency is defined as 64 : Efficiency measures the efficiency of the sorting system, it is defined as the ratio of the number of droplets containing single cells in the keep outlet to the number of droplets detected as containing single cells. Occasionally, the electrical field timing or intensity prevents a successful steer.
Purity is the ratio of droplets containing single cells in the keep outlet to the total number of droplets steered into the keep outlet 64 Yield indicates how many targets from the inlet are captured in the keep outlet 64 . In the context of this study, it is the number of droplets containing single cells captured in the keep outlet divided by the statistical estimation of the number of droplets that should contain single cells. image analysis and decision making. When a droplet image ( Fig. 2A) is captured, it undergoes several steps of image manipulation for cell identification and extraction of other relevant parameters. Firstly, the on-demand recorded background image (Fig. 2B) is subtracted from the captured image to remove static features such as the channel walls and debris (Fig. 2C). Next, an intensity threshold (10 for 8 bit images) is applied to the image to isolate the entire droplet in the foreground; this manipulation generates salt and pepper noise as indicated by red circles in Fig. 2D. These are then removed with 'IMAQ Remove Particle VI' 67 to finally the get the outline of the droplet (Fig. 2E). From this binary image mask, numerous parameters are extracted 68 ; important among those are droplet area, circularity, length and width.
In parallel, the subtracted image goes through a higher threshold (12 for 8 bit images) that only highlights the edges of the droplet. Then, 'IMAQ MagicWand VI' 69 (similar to Microsoft Paint's 'Fill with colour' tool) is used to fill the inside of the droplet excluding any cells that may be in the droplet. This also leads to some salt and pepper noise shown by the red circles Fig. 2F which is removed by dilating the image twice ( Fig. 2G) using 'IMAQ Morphology VI' 70 . This binary image mask is used to obtain the number of cells in the droplet as well as the size of the cells using ' Area of Hole' parameter 68 , which counts the number of pixels in the dark region inside the droplet. This value can later be converted to μm 2 using pixels-to-microns ratio of the imaging system. It should be noted that the size of non-circular cells will significantly vary depending on the instantaneous orientation of the cell during image capture.
Finally, 'IMAQ MagicWand VI' 69 output (Fig. 2F) is inverted (Fig. 2H) and both the border and particles are removed to obtain cell binary mask as shown in Fig. 2I. 'IMAQ Particle Analysis VI' 71 automatically calculates various visual parameters pertaining to the cell(s) using this binary mask (Fig. 2I). A list of these parameters along with graphical examples can be found on their website 68 . In this study, some of these parameters were used to demonstrate how cells of interest could be selected using these readily available tools. Importantly amongst these, this study uses 'Number of Particles' to determine the number of cells in a droplet, ' Area' to calculate droplet volume, 'Ellipse Ratio' & 'Waddel Disk Diameter' for selecting circular single cells and rejecting erroneous masking. Furthermore, 'Heywood Circularity Factor' & ' Area' were used as input for supervised machine learning to sort between single cells and clusters. This binary image (Fig. 2I) could also be used to extract cell intensity characteristics from the original image ( Fig. 2A). For non-spherical cells that are constantly rotating inside the droplet, Scientific RepoRtS | (2020) 10:8736 | https://doi.org/10.1038/s41598-020-65483-2 www.nature.com/scientificreports www.nature.com/scientificreports/ 'Max Feret Diameter' (the longest line that could be drawn on the particle) 68 could be used to find the biggest diameter which should provide an insight into size of the cell population. All the above-mentioned, image-based properties of cells or droplets can be used individually or in combination with each other to enrich cell populations of interest which make this system versatile and powerful. For example, inherent droplet volume variability during droplet production has been shown to produce misleading results especially in droplet digital polymerase chain reaction (ddPCR) 72 . With the presented system, droplets could be sorted for their size to improve the accuracy of ddPCR measurements.
The decision making is performed real time with access to all the parameters explained above. Two main strategies are used in the decision making; logic gating and supervised machine learning. Logic gating is used for simpler scenarios such as single-cell encapsulation in droplets -the logic gate is programmed to keep all droplets where the number of detected cells is equal to one. This can be extended to include a certain lower and upper limit for apparent cell or droplet size, a cell circularity threshold and as many more as required. In supervised machine learning, all or some of the available parameters are fed as input to the classifier along with their keep or waste labels determined by experts in their field. Once the classifier is trained with the initial set of data, it can sort new samples that aren't suitable for logic gating or with better purity compared to logic gating. www.nature.com/scientificreports www.nature.com/scientificreports/ throughput. Throughput was a function of the inlet flow rates in this study. During experiments, LabVIEW recorded the total number of droplets detected by the system and the total duration of the experiment. The ratio of these values was used to determine the throughput for each experiment. At 3.5 Hz, where experiments were conducted, sorting efficiency of 100% was achieved while at 4.5 Hz, sorting efficiency dropped to 80%, therefore we report a throughput of 4 Hz as shown in Table 1.
Real-time image-based sorting is inherently a low throughput technique as it involves the acquisition and processing of image data on the order of megapixels. Since the images are acquired during flow; the most important parameters for a blur-free image are exposure time (t e ), pixel size (s p ) and flow velocity. If the target is not allowed to traverse more than one pixel during the exposure time; the maximum velocity can be found by V max = s p /t e . This is the first rate-limiting step in this setup. This can be improved by increasing the pixel size -sacrificing resolution -or decreasing the exposure time -sacrificing light -both of which are equally precious.
In this work, the pixel size was measured by acquiring a still image of a calibration target (R1L3S2P, Thorlabs, UK) and measuring the number of pixels between 10 divisions (100 μm). Pixel size for the brightfield camera (Mako U-130B, Allied Vision Technologies, Germany) was 250 nm and the pixel size for the fluorescent camera (340M-USB, Thorlabs, UK) was 390 nm. V max was calculated from V max = s p /t e as 1 mm/s and 0.8 mm/s, for the cameras, respectively. During optimisation experiments, the flow rate was increased until the image quality and sorting efficiency was acceptable. At 75 μL/hr of oil and 2 μL/hr of aqueous flow, the throughput was measured by the LabVIEW algorithm as 3.5 Hz. The corresponding flow velocity was calculated as 8 mm/s using the cross-sectional area at the detection region.
The second rate-limiting step in this study was the selected DEP actuation duration, 40 ms, limiting the throughput to 25 Hz. There was uncertainty in the droplet location when it was detected, also in the time between droplet detection and DEP triggering as well as in droplet velocity. 40 ms was chosen to minimise erroneous sorting events. Sciambi and Abate 73 showed that droplets could be sorted at 30 kHz using DEP, therefore DEP sorting is not considered a future barrier in image-based droplet sorting.
The third rate-limiting step was the LabVIEW program. Timing experiments carried out with LabVIEW including the usage of supervised machine learning algorithms showed a 20 ms (50 Hz) acquisition-to-decision time on an average PC. This could be improved by using a field-programmable gate array (FPGA) to acquire images and to process them for sorting decision. The work by Nitta et al. 58 focus on maximising throughput and reports a throughput of 100 Hz by using state-of-the-art equipment including an FPGA.
The fourth rate-limiting step was the cameras that were used in this study. For brightfield imaging, Mako U-130B (Allied Vision Technologies, Germany) with 168 fps maximum frame rate at full resolution was used. For fluorescent imaging, 340M-USB (Thorlabs, UK) with 200 fps maximum frame rate at full resolution was used. For completeness, next factors were the NI-DAQ (USB-6211, National Instruments), the Arduino (Mega 2560, Arduino), and the amplifier (HVA Series, Ultravolt, USA); all of which can achieve throughputs on the order of kHz and more.

Results & Discussion
cell concentration. A crucial parameter for single-cell encapsulation in droplets is the concentration of cells, CC, in the aqueous phase. This variable directly affects the number of empty droplets, droplets containing single cells, two cells and so on. λ, the mean number of cells in a droplet across many, can be calculated by multiplying cell concentration and mean droplet volume, V: Doublets, triplets and beyond are highly undesirable false positives for single-cell studies, therefore techniques like inDrop 74 and Drop-seq 75 reduce the cell concentration to achieve a λ between λ = 0.1 and λ = 0.5 where most droplets are empty, between 10% and 30% have single cells and less than 10% has doublets (Fig. 3A).
With the system presented here, the inlet cell concentration was adjusted to render λ ≈ 1 during red blood cell (RBC) sorting. As shown in Eq. 5, droplet volume is required for calculating λ; area of the droplet, recorded throughout the experiments using the binary image in Fig. 2E, is multiplied with the height of the microfluidic channel to obtain droplet volume (cylindrical assumption). Normalised (Probability Density Function) droplet volume from the reported experiments with RBCs are shown in Fig. 3B. The mean droplet volume across all experiments was calculated as 155pL.
The image analysis algorithm can identify single target cells encapsulated in a droplet with precision. In this way, it presents an opportunity to maximise the probability of single-cell encapsulation, P(1). Empty droplets, P(0), doublets, P(2), triplets, P(3), and even more cells in single droplets are biased to the waste outlet. The probability of encapsulating k cells in the next droplet, P(k), is given by 76 : The peak for P(1) occurs when λ = 1 where 36.8% of droplets are expected to contain single cells (Fig. 3A). Cell concentration (CC) was calculated using a haemocytometer before loading the input cell solution to a syringe. During experiments, droplet volumes were detected by the custom LabVIEW software and averaged to obtain mean droplet volume (V). CC and V were used in Eq. 5 to calculate the expected probability values for four independent experiments (also see subsection 'Data Map' in 'Materials & Methods'). The results are denoted 'Expected' in the legend ('Expected' because the final CC is lower due to cell sedimentation) and shown as box plots in Fig. 3A. λ turned out to be more than planned as this is an iterative process which involves pipetting of (2020) 10:8736 | https://doi.org/10.1038/s41598-020-65483-2 www.nature.com/scientificreports www.nature.com/scientificreports/ microliters of whole blood into the buffer solution. It is important to note that working with λ < 1 is more favourable than working with λ > 1 as this just increases the chance of doublets versus empty droplets.
Sorting performance results. To validate the system is capable of sorting droplets containing a single cell with high accuracy, four 1-hour sorting experiments were carried out on different days with disposable microfluidic chips. In each of these experiments, human blood was diluted to the desired cell concentration and infused as the aqueous input for the sorting microfluidic chip (see Materials & Methods).
As discussed above, the use of dual-imaging setup allows real-time verification of the sorting by tracking where each droplet leaves the sorting region. This capability was used to evaluate the droplet sorting ability of the system. Whenever a target droplet is detected, and the pulse is applied, the timestamp is recorded. If a droplet is then detected in the keep outlet within 200 ms, the system records this event as a successful sorting event. Efficiency is the parameter used for reporting the incidents where there were timing issues, or the electrical signal failed to induce a strong DEP (see Eq. 1). As can be observed from the Efficiency results in Fig. 4, this does not happen very often (i.e. ~100%) indicating that the sorting system is working up to its design standards. Out of 3126 actuation events, 25 droplets were missorted during experiments due to DEP handling errors.
Whenever a droplet ends up in the keep channel without being identified as containing a single cell (i.e. without electrical signal), this is recorded as a false positive, N o k . Purity (Real-time) (Fig. 4) measures the ratio of sorting events to the total number of droplets in the keep channel (see Eq. 2) and is >95%. To quantify cell identification errors and find the real number of true positives, all droplet images were looked at by humans and tagged as containing single or multiple cells. The results of this are shown as Purity (Verified) in Fig. 4 which is ≈85%. This value also includes the false positives recorded by the software (i.e. Purity (Real-time) ≥ Purity (Verified)).
It was observed that the majority of cell identification errors were caused by the boundary effect. For example, when a second cell is located at the very boundary of the droplet, it can become merged with the droplet boundary during image processing. The differing refractive indices of the oil and aqueous solution lead to refraction at the boundary which gives the droplet its dark edge as observed in Fig. 2A. This dark edge can mask objects at the boundary. Although this is important for detecting the droplet itself, it makes it difficult for the program to see the cells that are on the edge of the droplet at the imaging moment. By adjusting the focus of the lens, the dark ring could be made bigger or smaller which gives the user the option of optimising towards Purity or Yield.
Yield is the parameter for measuring the system's performance in identifying all the targets (see Eq. 3). The calculation for the number of target droplets at the input, N x i , is carried out by multiplying P(1) with the total number of droplets in the experiment to find how many of them are expected to contain single cells. That is then compared to how many droplets are identified as having single cells in them. Yield of the system is around 20% which is unexpectedly low (Fig. 4). This is believed to be due to cell sedimentation in the syringe which decreases the cell concentration entering the device and consequently P(1). On top of that, some droplets with single cells in them are not detected by the system (lensing effect) contributing to this decrease in Yield. In another set of experiments, Yield was increased to 40% by increasing flow rate (decreases cell sedimentation) but the system is not optimised at these flow rates so these results are not shown.
Finally, Enrichment is the parameter used to compare the initial state of the sample to the final state (see Eq. 4). Enrichment varies a lot more as it's defined as a ratio of ratios. The reported system was shown to enrich samples by 10 times (Fig. 4) and more. Enrichment can be improved greatly as the throughput and therefore yield www.nature.com/scientificreports www.nature.com/scientificreports/ is increased. For example, Enrichment as high as 300 was observed when yield was doubled up with the higher flow rate (results not shown).

Application -transmission imaging.
Configuring the sorting software for a specific application is very easy. To demonstrate decision making based on cell micrographs, two experiments were performed. In the first experiment, the sorting automation software was configured to capture all single red blood cells (RBCs) regardless of their shape and size (Fig. 5A). In the second one, logic gating was used to reject RBCs that had an Ellipse Ratio higher than 4. Ellipse Ratio of a perfect circle is 1 and it increases as the object is more elongated. It's calculated as the ratio of major to minor axis length. Due to the unique shape of RBCs (biconcave disc), their instantaneous orientation inside droplets vary from a rod-like ellipse to a circle. While it may seem pointless to sort RBCs from RBCs, we are using this variation to demonstrate the system's capability of decision-making via morphological differences in cell micrographs.
The Waddel Disk Diameter 68 (in μm) and Ellipse Ratio 68 of single RBCs encapsulated in droplets that the automation software decided to keep during these experiments are shown in Fig. 5A along with the chosen threshold. The red circle data points indicate all RBCs and the green triangular data points show selected RBCs. As can be seen from the results, the system rejected cells where the ellipse ratio was larger than 4 in the hunt for non-erroneous, circular single cells. Three special data points marked with a circle, square and a triangle ( Ratio to 4, the system was instructed to sort for circular single cells (Fig. 5A) and discard less circular ones which exhibit erroneous masking of the cells within droplets (Fig. 5I).
This application study demonstrates that the presented label and aerosol-free, transmission imaging based technique could detect visual differences in cell populations. This capability has been used, for example, in separating yeast cells based on their morphologies 61 which is critical in understanding cell division cycles. Yeasts that are budding, given that their orientation is right, would have a much larger ellipse ratio compared to ones that aren't. Setting up similar constraints in the presented setup would allow them to be sorted easily and efficiently. While the system is limited by instantaneous orientation of the imaged cells, this could be alleviated by aligning such non-circular cells using inertial forces 77 .
Application -fluorescent imaging and machine learning. In another application study, slight modifications to the system (see Materials & Methods) allowed camera 1 to carry out fluorescent imaging while camera 2 continued imaging in transmission mode for sorting verification. In this mode, K562 cells stained with Hoechst, a fluorescent dye that binds to DNA, were imaged within droplets in real-time using the presented microfluidic system.
Existence of fluorescent nuclei against the dark background was easily detected with the high signal-to-noise ratio similar to Droplet Detection with Region of Interest discussed in Materials & Methods section. After detection of cells, an adaptive intensity threshold (see Image Analysis and Decision Making), T f , was applied to the fluorescent images according to T f = 0.25I max where I max is the maximum intensity of the acquired image. The appearance and number of cell nuclei in a single droplet was accessed via this thresholding. www.nature.com/scientificreports www.nature.com/scientificreports/ To prevent confusion, the wording for this section will be as follows: labelling is carried out by humans, classification is done by supervised machine learning (MATLAB) and sorting by image manipulation and boolean logic gating (i.e if there's one cell in the droplet) in LabVIEW.
In the first experiment, the system sorted all the droplets it identified as containing a single cell. However, it wasn't successful all the time; there were cell clusters amongst these selections. The importance of clusters in cancer metastasis has been recently highlighted 78 . In one study, a microfluidic chip for capturing circulating tumour cell clusters was reported 79 . In a recent study, clustering of MCF7 cells led to reduced detection efficiency 80 . In this study, to separate clusters from single cells logic gating was not enough, therefore machine learning (ML) principles were employed. Images from this first experiment were labelled as single cells (circle) or multiple cells (triangle) and used as ML Training data. The scatter plot for this experiment can be seen in Fig. 6A.
The informative, discriminating and independent features of this sample were selected as circularity and size (measured by area) as plotted in this figure. For circularity, Heywood Circularity Factor is used 68 . Heywood Circularity Factor of a perfect circle is 1 and it increases as the object is more irregular. Clusters were observed to be less circular (higher circularity factor) and larger in size. These features along with their respective labels were fed into MATLAB's supervised machine learning (Classification Learner) tool. A quadratic SVM (Support Vector Machine) fit was chosen for segregating clusters from single cells. The area in green (close to origin) is the area identified by machine learning as the optimum region for isolating single cells (see Fig. 6A). Misclassified data points are shown as hollow whereas correct predictions are filled. The circle and triangle insets exemplify some of the single cells and clusters imaged within droplets during flow.
In the second experiment (Fig. 6B), the same green and red areas identified by MATLAB was used to segregate single cells (circles) from cell clusters (triangles). This demonstrates how the expert classification from the first experiment (Fig. 6A -ML Training) could be used to drive the decisions successfully in another experiment (Fig. 6B -ML Testing); the detailed numerical results are outlined in the inset tables. Most importantly, these two experiments show how expert identification of a specific cell subpopulation could be used to train our sorter for further experiments. To generalise and summarise the workflow: 1. The system is initially run in imaging cytometry mode (Fig. 6A). 2. The cells of interest are labelled by an expert (Fig. 6A). 3. The labels are used to train supervised machine learning classifiers (Fig. 6A). 4. The system is run with the classifier to enrich cells of interest for collection and further downstream processing (Fig. 6B).

conclusion
Presented here is a two-phase microfluidic system capable of real-time, dual-camera imaging (brightfield & fluorescent) within droplets during flow. The acquired image is processed to decide for sorting cells based on adjustable/programmable properties. The sorting is carried out by applying a short, high voltage pulse to the electrodes embedded within the chip that induce a well-established technique, dielectrophoresis. This system incorporates a laser-triggered droplet detector ensuring the timing for the electrical actuation as well as enhancing the throughput. The system is thoroughly characterised by quantifying the sorting performance.
To showcase its capabilities, the system is programmed to address the Poisson loading problem by sorting for droplets containing a single cell: a highly desired preparation technique for single-cell sequencing. In this way, initial cell concentration could be optimised for maximising the probability of single-cell encapsulation without www.nature.com/scientificreports www.nature.com/scientificreports/ worrying about plural encapsulation. The sorting performance of the system is monitored real-time with the aid of the second camera that observes the outlets and saves images of sorted droplets for verification purposes. Overall, it's established that the system performs with high purity, efficiency and enrichment.
To further demonstrate powerful application with this technique, the system was slightly modified for high resolution fluorescent imaging of cells within droplets during flow. By performing an initial cytometry run, single cells of interest were identified and fed this information to supervised machine learning algorithms to optimise the selection. This selection was used in the second experiment to sort the cells of interest, single cells from clusters in this case. The presented system offers a reliable technique for partitioning cell sub-populations based on their morphologies and visage. These partitions could then be further characterised for understanding their biological functions 81 . The former is often performed by a trained expert who manually picks the cells of interest using a pipette.
Overall, the developed system performs well in sorting cell populations, efficiently producing highly pure and enriched samples at the outlets for further observations such as single-cell RNA sequencing. Moreover, the microfluidic chip contains a secondary aqueous inlet that could be used for introducing chemicals like cell lysis buffer for examining and imaging cells going through rapid changes, such as activated immune cells 82 . The presented system will not be able to compete with high-throughput sorting techniques such as fluorescent activated cell sorting as the full image takes longer to acquire and process, however, it can be used as a complementary technique when small cell populations need to be analysed or when labelling is undesired or when the image data presents a unique opportunity for sorting cell populations. The presented platform could be strengthened by additional sensing modules to aid in the decision making such as differential detection photothermal interferometry 83 , digital holographic microscopy 84 , whispering gallery mode resonators 85 , refractive index cytometry 86 or impedance cytometry 87 .

Materials & Methods
Device fabrication. Microfluidic channels were patterned onto silicon wafers using SU-8 2025 (MicroChem Corp., USA) photolithography 88 . Channel height was measured via cross-section microscopy as ≈30 μm. PDMS was cast onto the moulds, degassed, cured, peeled, ported and bonded onto glass slides or coverslips as described elsewhere 89 . The chips were heated to 150 C for indium gallium arsenide (InGaAs) solder wicking into the electrode channels via capillary action 90 .

RBc preparation. Experimental protocols involving red blood cells (RBCs) were approved by the
Heriot-Watt School of Engineering and Physics Sciences Ethics committee. Whole blood was obtained by finger prick from healthy volunteers in accordance with Heriot-Watt University ethical guidelines with informed consent. RBC samples were prepared by adding 3 μL of whole blood into 2.65 mL of buffer solution consisting of 28.1% OptiPrep TM in Phosphate Buffered Saline (PBS -Mg -Ca) with 1% Pluronic. This buffer was produced to have a density of 1.095 g/mL to match that of RBCs 91 to prevent sedimentation as well as cell adhesion and clumping. The final sample was further diluted (1:10) and loaded into a haemocytometer for counting: ≈1.7 × 10 6 cells/mL (diluted), ≈17 × 10 6 cells/mL (input sample). Finally, samples were loaded into a syringe for performing experiments.
K562 preparation. K562s (human erythromyeloblastoid leukemic cells) were cultured with standard culturing techniques 92 . Hoechst 33258 stain was added to K562 cells suspended in culture media in a test tube (1:100 (v:v)). The tube was kept in the dark for 10 minutes while cells were counted in a counting chamber (haemocytometer) (≈10 6 cells/mL). The sample was then centrifuged at 6000 RPMs for 2 minutes and cell culture supernatant was removed before they were resuspended in a buffer solution consisting of 16% OptiPrep TM in Phosphate Data map. In this section, the details of the data used in figures will be outlined. Every experiment has a unique ID associated with it. All experiments except 1140 and 1150 are carried out with the brightfield configuration using red blood cells. Experiments 1140 and 1150 are carried in the fluorescent configuration with K562 cells. Figure 3A Figure 4A is the box plot of sorting performance parameters from experiments 1085, 1090, 1127 and 1130. A total of 41884 droplets were screened, 3126 droplets were detected to contain single cells. Figure 5A, circle (red) data points are from experiment 1085 containing 823 single cell detections. Figure 5A, triangle (green) data points are from experiment 1089 containing 397 detections with the logic gating applied. Figure 6 data is from experiments 1140 and 1150, the details of which are shown in the inset table. fluorescent imaging. For fluorescent imaging, the following changes were carried out in the experimental setup. Referring to Fig. 1; LED 470 nm was replaced with LED 660 nm, LED 350 nm was turned on for excitation, 50:50 beamsplitter before the cameras was replaced with Dichroic Mirror Long Pass 650 nm and camera 1 was replaced with 340M-USB (Thorlabs, UK).

Depth of focus.
Depth of focus, DOF, is defined as the maximum z-height where the sample being imaged is accepted to be in focus. It's given by the Berek formula 93  where λ is the wavelength of the excitation light, NA is the numerical aperture of the lens and M is the total magnification of the system. Resolving Power can be subjective, however, it is usually given as 340 μm 94 to estimate the DOF of the system. As can be seen from the Berek formula, higher NA decreases DOF; it's a trade-off between resolution in the x-y plane versus the z-plane, therefore, DOF should be considered in experimental system design. In our study, the microchannel height was measured as 34 μm; the DOF for brightfield and fluorescent imaging were calculated as 75 μm, and 25 μm, respectively.
Droplet detection with region of interest. Droplets can be optically detected using LabVIEW real time image acquisition and analysis tools. One such tool is called a region of interest (ROI). During real time acquisition of frames from a camera, ROIs are selected within the acquired frame to carry out further analysis. Built in function 'IMAQ Quantify 2 VI' 95 is used to extract the mean intensity, I m , of the current ROI. The lensing effect caused by the mismatch in the refractive indices of the fluid mediums gives rise to a dark ring forming around the droplet-oil interface. When this ring resides within the ROI, the mean intensity significantly drops below the automatically-set threshold, T. The intensity within the ROI is averaged over time, I m to determine the threshold for droplet detection, = .
T I 0 98 m . This accounts for minor changes such as lighting, ROI, etc. to reliably detect droplets as they pass through the ROI. The multiplier could also be modified to account for ROI size and droplet frequency changes.
An important consideration, which justifies the usage of laser droplet detection system, is the discrepancy between camera frame rate and droplet velocity. When that happens, a droplet might appear before an ROI in one frame and after in the other which makes it a ghost droplet for the ROI detection system. This might be compensated by increasing the size of the ROI up to a certain limit. The laser detection system ensures a stable location for the droplet for image manipulation and actuation timing purposes.
Laser spot size. Laser spot size is calculated by acquiring a non-saturated image of the laser itself without any other lighting. The intensity values from a line drawn along the axis of droplets are loaded into MATLAB's Curve Fitting Tool and fitted with a Gaussian to calculate the full width at half maximum in pixels. This is later converted to microns using previously calibrated pixel size of 390 nm (40×).

Data availability
All connected data to this manuscript is available from the authors through reasonable request.