Machine learning issues and opportunities in ultrafast particle classification for label-free microflow cytometry

Machine learning offers promising solutions for high-throughput single-particle analysis in label-free imaging microflow cytometry. However, the throughput of online operations such as cell sorting is often limited by the large computational cost of the image analysis, while offline operations may require the storage of an exceedingly large amount of data. Moreover, the training of machine learning systems can be easily biased by slight drifts of the measurement conditions, giving rise to a significant but difficult-to-detect degradation of the learned operations. We propose a simple and versatile machine learning approach to perform microparticle classification at an extremely low computational cost, showing good generalization over large variations in particle position. We present proof-of-principle classification of interference patterns projected by flowing transparent PMMA microbeads with diameters of 15.2 μm and 18.6 μm. To this end, a simple, cheap and compact label-free microflow cytometer is employed. We also discuss in detail the detection and prevention of machine learning bias in training and testing due to slight drifts of the measurement conditions. Moreover, we investigate the implications of modifying the projected particle pattern by means of a diffraction grating, in the context of optical extreme learning machine implementations.

Flow cytometers are instruments able to analyze and characterize large numbers of suspended biological cells and microparticles one by one, while these are flowing at high speed through a measuring device 1 . In traditional flow cytometers, the moving particles are illuminated, usually by a laser, and the corresponding forward and/or side-scattering intensities are measured, together with the fluorescent emission of selectively attached probes. These devices are widely used to investigate the structure and the chemical composition of large populations of cells in many applications concerning life science and clinical diagnosis. Moreover, they also find diverse applications in industrial and environmental engineering fields, e.g. in measuring bacteria viability 2 or water quality 3 .
Although flow cytometers have been constantly innovated upon over the last few decades, their usage is still limited by high cost, complexity and size 4 . Let us now follow a path through some of the recent approaches proposed by the scientific and engineering community to overcome these limitations, in order to contextualize the presented work.
To begin with, the integration of microfluidic systems on a chip allows for a great reduction in cytometers' cost and size, which is particularly appealing for point-of-care applications 4 . Furthermore, the integration with other lab-on-chip devices provides the opportunity for increased automation and for scalable parallelization of particle analysis, potentially multiplying the overall device throughput [5][6][7] . While the use of fluorescent labels in microflow cytometry provides a powerful instrument to discriminate between different cell populations at high throughput (even exceeding 100,000 cells/s 6 ), the application of fluorescent stains (also called labels) often hinders live cell analysis, e.g. because of cytotoxicity, and requires dedicated effort and cost 8 . Two increasingly common approaches to enable accurate and relatively fast label-free analysis while improving detection sensitivity are given by electrical impedance detection and imaging flow cytometry 4 . This work mainly focuses on the latter, whose main advantage is the acquisition of detailed spatial information that can be used both for morphology-based detection and for human visualization as in traditional microscopy. On the other hand, the operational speed of camera-based cytometers is limited by the acquisition frame rate, providing single-channel throughputs up to around 1000 cells/s when single cells are captured 9 . This limitation can be overcome, at the cost of increasing system and instrumentation complexity, by encoding optical spatial information into a temporal sequence that is measured by a single photodetector.
Scientific Reports | (2020) 10:20724 | https://doi.org/10.1038/s41598-020-77765-w
An application of this technique, named Serial Time-Encoded Amplified Microscopy (STEAM), combines the wide spectral bandwidth of a femtosecond pulse laser with both temporal and spatial dispersive optical elements, achieving label-free single cell imaging at a very high throughput, up to ∼ 100,000 cells/s [10][11][12] .
The automatic analysis of digital images is a powerful and versatile tool, but it is usually computationally expensive and memory-hungry due to the high data dimensionality given by the number of pixels. In high-throughput imaging cytometers, the huge number of stored images and the required processing time are an issue 9 , even more so when compact and cheap applications, e.g. point-of-care, are targeted. Furthermore, online image analysis often requires so much computational power that real-time cell sorting cannot easily be performed. Several machine learning approaches have recently been proposed to automatically analyze the large amounts of data generated by label-free imaging flow cytometry 8,[13][14][15][16][17][18][19] , although in most of them the image processing is carried out offline. Exceptions are 15,16,20 , where single-particle classifications respectively took < 1 ms , 0.2 ms and 3.6 ms when accelerated by a GPU. These were applied to images of 21 × 21 and 32 × 32 pixels respectively in the first two works, while in the third the original time-stretch-microscope resolution (which was not explicitly mentioned) was reduced by a factor of 40. However, these execution times are still far from enabling real-time classification at state-of-the-art high throughputs of around 100,000 cells/s, especially if higher resolutions are required to distinguish specific cell features.
The employment of lensless microscopy constitutes a further step towards significantly cheaper and more compact imaging flow cytometers 17,21 . Since these devices contain no hardware focusing components, an image reconstruction is performed in software, usually taking from a few tenths of a second to several seconds depending on the algorithm and image resolution 16,22 . The idea of bypassing the computationally expensive image reconstruction and performing the machine learning classification directly on the acquired interference pattern was proposed in the past 23 and recently applied experimentally 16 .
In this work, we present an experimental proof-of-principle study of some key machine learning issues and opportunities regarding fast particle classification with label-free imaging flow cytometry. To do so, we employ a lensless microflow cytometer for real-time label-free particle classification in its minimalist form, both in terms of components and of computational cost. Comprising a simple visible laser, a pinhole, a microfluidic channel with a pumping mechanism and a camera, it only requires a weighted sum of the pixel values to classify a particle from its background-subtracted 2D interference pattern. A simple-to-train machine learning linear classifier (logistic regression) is employed, which does not require any feature extraction based on domain knowledge. In spite of their simplicity, linear classifiers can be as powerful as other state-of-the-art classifiers when applied to high-dimensional representations of input data (a 2D interference pattern in this case). Extreme Learning Machines 24,25 (ELM) and Reservoir Computing 26,27 (RC) are two widespread machine learning approaches based on this principle, employed for time-independent and time-dependent processing respectively. Indeed, complex classification tasks, such as the separation of cell types with similar morphology, can in principle be improved by simply interposing suitable optical diffractive layers between the microfluidic channel and the camera 28 , without increasing the classification time. Therefore, we also demonstrate a method to appropriately evaluate the change in classification performance when interposing a diffraction grating, which can be directly generalized to the interposition of other arbitrary diffractive layers, setting the ground for hardware-based improvement of the proposed classification technique.
In this paper, we also place special emphasis on detecting and preventing a particularly deceptive and often underestimated type of overfitting (called here measurement bias), which occurs when the influence of the experimental conditions on the training samples is exploited by a machine learning model to wrongly learn how to carry out a classification task. If the samples used to test the classification performance are biased by the measurement conditions in a similar way, a traditional cross-validation would generally fail to detect the problem and would instead provide misleadingly high performance evaluations. A machine learning-based cytometer employed for particle classification is likely to be affected by measurement bias when, during the training sample acquisition, the particles belonging to different classes are not mixed but are analysed at different times. Nevertheless, this option is often preferable in practice, because it avoids the need to include a dedicated and accurate ground truth provider system (e.g. based on fluorescent label detection) in the cytometer. In this work we propose and demonstrate a training and validation approach that makes it possible to detect and prevent such a measurement bias.
For our proof-of-principle study, we consider the classification of PMMA microparticles with different diameters: (15.2 ± 0.5) µm (class A) and (18.6 ± 0.6) µm (class B), where the error is given by the nominal standard deviation of the particle diameter. In the "Results" section, we first describe the main measurement and machine learning aspects. After that, our method to detect and prevent measurement bias is presented and demonstrated. The classification results obtained for different fields of view of the cytometer and for different image resolutions (including an execution time evaluation) are discussed. We also study the effect of interposing different diffractive layers between the camera and the microfluidic channel. Additionally, we compare the obtained results with those presented in three other relevant works. In the "Discussion" section we summarize the work and discuss the general conclusions. The "Methods" section is dedicated to the technical details.

Results
Interference patterns acquisition and machine learning classification. Employing a CMOS image sensor, we acquired the interference patterns obtained by shining red laser light on transparent PMMA microparticles (with diameters of (15.2 ± 0.5) µm and (18.6 ± 0.6) µm ) flowing in a 100 µm × 100 µm microfluidic channel (Fig. 1a,b). We also performed some measurements interposing a double-axis holographic diffraction grating between the microfluidic channel and the camera to modify the imaged pattern (see subsection "Classification performance when mixing with diffractive layers"). The setup configuration with no diffraction grating interposed will be referred to as NDG, while the configuration comprising the diffraction grating will be referred to as DG. Figure 1c,d show examples of the acquired background pattern for the two configurations. The classification process is schematized in Fig. 2. We performed background subtraction on each image by subtracting the previously acquired one. (Because of our flow rates, the probability of having two consecutive frames containing significant particle signal is low.) To ensure that the background subtraction did not introduce any significant artificial particle signal into the sample set, we discarded those images that directly followed an accepted one (image "acceptance" is described below). Since the CMOS sensor operated in free-run mode, many of the acquired images contained the background illumination pattern without particles or with only a weak signal from particles far away from the illumination center. Instead of considering these unimportant images as an additional class for the machine learning classifier, we chose the simpler option of discarding them.
To do this, we needed to measure the strength of the particle signal: for each background-subtracted pattern we calculated the sum of all the squared pixel values, which from now on will be referred to as the overall perturbation P. Examples of background-subtracted images with their respective P values are shown in Fig. 1e-j for the NDG and DG configurations respectively. Only those images whose P value is larger than a chosen acceptance threshold θ P were accepted as samples used to train and test the machine learning classification. The criteria and the motivation for the choice of θ P will be explained in detail later in this article. Finally, it should be stressed that the particle class could not be straightforwardly determined by human examination.
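For illustration, the computation of the overall perturbation P and the acceptance test can be sketched in a few lines of Python/NumPy (the function and variable names here are illustrative, not those of the actual implementation):

```python
import numpy as np

def overall_perturbation(frame, background):
    """Sum of squared pixel values of the background-subtracted image.

    `frame` and `background` are consecutive camera frames (2-D unsigned
    integer arrays); casting to float avoids unsigned wrap-around on
    subtraction.
    """
    diff = frame.astype(np.float64) - background.astype(np.float64)
    return float(np.sum(diff ** 2))

def accept(frame, background, theta_p):
    """Keep the frame as a particle sample only if P exceeds the threshold."""
    return overall_perturbation(frame, background) > theta_p

# toy example: a flat background and a frame with a weak local perturbation
bg = np.full((8, 8), 100, dtype=np.uint16)
fr = bg.copy()
fr[3:5, 3:5] += 50
print(accept(fr, bg, theta_p=5000.0))  # 4 pixels * 50^2 = 10000 > 5000 -> True
```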
Similarly to 28 , in this work we trained and tested a simple linear classifier based on logistic regression, directly applied to the pixel values of background-subtracted images. Its task was to classify the acquired interference patterns according to the microbead diameter. We employed L2 regularization to reduce overfitting and we optimized its strength by means of k-fold cross-validation, with number of folds N s = 11 (see the next subsection for details). When images with high resolutions ( > 10,000 pixels) were employed as classification samples, a feature selection procedure was applied to reduce both the risk of overfitting and the training time. In particular, we discarded those pixels that showed low class separation, i.e. where the value distributions corresponding to the considered classes showed a small difference (see "Feature selection" in the "Methods" section for more details).

Figure 1. (a) A PMMA microfluidic channel (cross section 100 µm × 100 µm ) is illuminated by laser radiation (HeNe laser, λ = 632.8 nm ) focused on a pinhole. The resulting beam passes through a double-axis holographic diffraction grating (only in one of the employed configurations) and is captured by a CMOS camera. (b) Schematic of the illuminated microfluidic channel region. The larger the particle distance from the field of view center, the weaker the acquired particle signal (measured by the perturbation quantity P). (c-l) Respectively for the NDG (top row) and DG (bottom row) configurations, examples of background pattern (1st column), background-subtracted particle patterns with increasing intensity (2nd to 4th columns) and class separation colormaps (last column). (e,h) are well below the respective acceptance thresholds, in this case θ NDG P ∼ 7200 and θ DG P ∼ 5100 (for a particle ratio R = 0.04 ). (f,i) are just above and (g,j) are well above the respective acceptance thresholds. Grey arrows suggest a qualitative link between these examples and the particle position w.r.t. the FoV shown in (b).
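The described readout, an L2-regularized logistic regression applied directly to pixel values, can be sketched in Python/NumPy. The data below are synthetic stand-ins for the background-subtracted patterns, and the plain gradient-descent training is an illustrative simplification of the actual optimizer:

```python
import numpy as np

rng = np.random.default_rng(0)

# stand-in data: flattened background-subtracted patterns (real ones come from the camera)
n, d = 200, 32 * 26
X = np.vstack([rng.normal(0.0, 1.0, (n, d)),   # class A
               rng.normal(0.4, 1.0, (n, d))])  # class B (small synthetic offset)
y = np.concatenate([np.zeros(n), np.ones(n)])

def train_logreg(X, y, l2=1e-2, lr=0.1, epochs=200):
    """Plain gradient-descent logistic regression with L2 regularization."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))    # sigmoid of the weighted sum
        grad_w = X.T @ (p - y) / len(y) + l2 * w  # L2 penalty on the weights
        grad_b = np.mean(p - y)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

w, b = train_logreg(X, y)
pred = (X @ w + b > 0).astype(float)  # inference: one weighted sum per image
print(f"training accuracy: {np.mean(pred == y):.2f}")
```

Note that inference reduces to a single weighted sum plus a threshold, which is what keeps the per-particle computational cost so low.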

Effects and prevention of measurement bias. Supervised machine learning algorithms can learn how to carry out a certain task on a given sample population, e.g. classification of cells in digital pictures, by analyzing a set of training samples for which the solution to the task (i.e., the training label) is given. Therefore, the performance of such algorithms when applied to unseen samples (generalization) is obviously limited by how comprehensively the training samples represent the target sample population. When the noise in the samples is uncorrelated with the labels, generalization can usually be improved by increasing the number of training samples or through regularization techniques, i.e. by reducing the overfitting. This is a very well-known practice and in this case the presence of overfitting can be easily detected by testing the algorithm on samples that were not used in the training stage. Less known and more deceptive is the case where the noise and the training labels are correlated, e.g. in classification problems where samples from different classes are acquired or measured under significantly different experimental conditions. In this case, which we will refer to as measurement bias, the machine learning training is most likely biased by the measurement conditions, which are mistakenly considered as a distinguishing trait of the classes. This leads to a worsening of the classification performance under new measurement conditions, i.e. to a decrease in generalization. The elusiveness lies in the fact that measurement bias leads to misleadingly high estimated accuracies and cannot be detected if the training and test samples are measured under the same biasing conditions. To apply this more concretely to our case of an imaging microflow cytometer: to train a label-free white blood cell classifier, for practical reasons, monocytes and granulocytes might be kept separated and their images (used as training and test samples for the machine learning algorithm) might be acquired in different measurement sessions, often leading to measurement bias because of drift between sessions. Indeed, many factors may produce significant drifts in measurement parameters, such as fluctuations of the light source properties, displacement or distortion of the optical beam (e.g. due to thermal expansion of some elements), refractive index changes of the optical components (e.g. due to slow water absorption by the microfluidic channel walls) and so on.
It should be stressed that in this case background subtraction might mitigate but cannot completely remove the measurement bias, as it is demonstrated in the next paragraph. Indeed, the background signal is given by the unperturbed laser beam impinging on the camera screen while the particle signal is mainly given by a spatial optical path perturbation of the same laser beam. These two signals are combined in a strongly nonlinear way by the image sensor measurement and therefore they cannot be decoupled by a simple linear operation such as background subtraction.
Another approach to remove measurement bias is to mix the two kinds of cells and determine their class (i.e. their label) during the image acquisition using an auxiliary system, e.g. a fluorescent label detector. However, including such a system adds complexity, especially considering that training an accurate classifier requires even more accurate ground truth data. This is therefore not what we considered in this paper.
In order to provide an experimental demonstration of the negative effects of measurement bias, we performed ad hoc sample measurements according to the following chronology:

(1) A train (20 min), B train (20 min), A test (2 min 15 s), B test (2 min 26 s).

The classification error obtained on the test sessions was considerably larger than the error estimated by cross-validation within the training sessions (Fig. 3, compare left with middle). This means that the classifier training was influenced by the measurement conditions, leading to an overestimated generalization capability when samples from the same measurement session were employed for testing. Such an effect is also responsible for a large variance in performance evaluation, ascribed to the fluctuations of the measurement conditions during the measurement sessions. In this work we developed a simple method to solve this problem, i.e. to effectively decouple the training sample labels from slow fluctuations of the measurement parameters, avoiding measurement bias. In particular, we acquired the samples according to the following measurement sessions chronology (duration of 2 min each):

(2) A 1 , B 1 , A 2 , B 2 , . . . , A Ns , B Ns ,

i.e. using intertwined class measurements to provide training, validation and test samples to the classification algorithm. In all cases, the measurement sessions were performed at different times on the same day.

Figure 2. Schematic of the machine learning classification pipeline. Intensity patterns are acquired by the image sensor in free-run mode. The difference between consecutive images is calculated (background subtraction), and if the squared sum of its pixels is lower than a chosen acceptance threshold value θ P the image is considered as background and discarded. A linear classifier (trainable weighted sum) is applied to accepted background-subtracted images. If the outcome is positive, the analyzed particle is classified as belonging to class A, and to class B otherwise.
Considering a number of sessions per class N s = 11 , we then employed a nested cross-validation over measurement sessions as the validation algorithm. Here h k is a hyperparameter (the L2 regularization strength in our case) to be optimized by choosing among given options corresponding to k = 1, 2, . . . , N h , and h is the chosen hyperparameter value; θ is the set of readout parameters (weights and intercept) determined by the training; p refers to a performance evaluation (the estimated accuracy in this case) of the machine learning classifier; and p final is the final evaluation of the whole algorithm, including the hyperparameter selection. The generalization of the algorithm to multiclass and multiple-hyperparameter cases is straightforward. The main concept here is that the training, validation and test datasets are not only always disjoint, as in traditional cross-validation, but were also acquired in different measurement sessions. Applying the proposed intertwined measurements and validation algorithm, we obtained better classification performance (Fig. 3, compare right with middle). Moreover, we obtained an evaluation of the accuracy average and variance generalized to different measurement sessions. As explained in the next subsection, we checked whether the measurement bias was still affecting our results by means of a suitable test (the UM test). The number of sessions per class N s should be chosen high enough to ensure that the measurement bias is removed and that a satisfactory generalization capability of the trained classifier is achieved. Generally, N s is limited by the difficulty and the time required to perform a high number of measurement sessions to provide training samples. Therefore, the optimal N s is highly application-dependent.
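The session-wise validation idea can be sketched as follows. This is a simplified illustration of the principle, not the exact algorithm used in this work; `train_fn` and `eval_fn` stand in for the logistic-regression training and the accuracy evaluation, and the demo uses a trivial nearest-class-mean classifier:

```python
import numpy as np

def session_cross_validate(X, y, sessions, hyperparams, train_fn, eval_fn):
    """Nested cross-validation over measurement sessions.

    Each sample carries the index of the measurement session it came from;
    training, validation and test sets are built from *disjoint sessions*,
    so a classifier exploiting session-specific drift cannot score well.
    `train_fn(X, y, h)` returns fitted parameters theta;
    `eval_fn(theta, X, y)` returns an accuracy.
    """
    ids = np.unique(sessions)
    accs = []
    for test_s in ids:                       # hold out one session for testing
        rest = ids[ids != test_s]
        val_s, train_s = rest[0], rest[1:]   # one validation session, rest train
        tr = np.isin(sessions, train_s)
        va = sessions == val_s
        te = sessions == test_s
        # pick the hyperparameter on the validation session only
        best_h = max(hyperparams,
                     key=lambda h: eval_fn(train_fn(X[tr], y[tr], h), X[va], y[va]))
        theta = train_fn(X[tr], y[tr], best_h)
        accs.append(eval_fn(theta, X[te], y[te]))
    # p_final: mean and spread of the accuracy over held-out sessions
    return float(np.mean(accs)), float(np.std(accs))

# demo with synthetic data and a nearest-class-mean "classifier" (hyperparameter unused)
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (60, 5)), rng.normal(3, 1, (60, 5))])
y = np.concatenate([np.zeros(60), np.ones(60)])
sessions = np.concatenate([np.repeat(np.arange(6), 10)] * 2)  # 6 intertwined sessions

train_fn = lambda X, y, h: (X[y == 0].mean(0), X[y == 1].mean(0))
eval_fn = lambda th, X, y: float(np.mean(
    (np.linalg.norm(X - th[1], axis=1) < np.linalg.norm(X - th[0], axis=1)) == y))
p_final, spread = session_cross_validate(X, y, sessions, [None], train_fn, eval_fn)
print(f"accuracy over held-out sessions: {p_final:.2f} +/- {spread:.2f}")
```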
Classification performance vs. field of view. According to how displaced the flowing particle is w.r.t. the laser beam center, the acquired interference patterns may vary in intensity, position and shape (e.g. Fig. 1c,e,f), making the particle analysis more or less difficult. The range of such displacement for which it is still possible to perform the particle classification is called field of view (FoV) of the cytometer. In this case, the time interval between two consecutive image acquisitions is much longer than the travel time of a particle through the FoV, implying that a fraction of the flowing particles are not measured. Thus, the larger the FoV the higher the number of particles that are analysed w.r.t. the total number of flowing particles and therefore the higher the maximum sensitivity of the cytometer. Usually, the sensitivity of particle detection can be enhanced by employing an effective microfluidic focusing system 4 , even though there is a trade-off between fabrication complexity, sensitivity and throughput. In any case, the particle displacement along a microfluidic channel always constitutes an important source of variability.
In this work, we estimated the classification performance considering different unidimensional FoV values along the microfluidic channel direction. The transverse channel dimensions were neglected since the illumination was considered to be relatively uniform over the channel cross sections. As intuitively schematized in Fig. 1b, the larger the distance of a particle from the illumination center, the smaller the P value of the obtained image. This implies that the FoV is determined by the choice of the acceptance threshold θ P . Still, two particles belonging to different classes and in the same position will lead to two images with different P values. Therefore, in order to have the same FoV for different classes of particles, the applied θ P should ideally be class-dependent. However, this is only feasible in the training stage, where the classes (labels) are known, while in the test stage a common acceptance threshold has to be used for all the acquired images. Since this mismatch between training and test sample populations may be detrimental to classification performance, in this work we chose to use a common θ P for the two classes in both training and test. In practice, the applied acceptance threshold θ P was chosen so that a desired value of the particle ratio R, defined as the ratio of the number of accepted particle images to the total number of acquired images, is obtained. The reason is that the particle ratio can be used as a more objective bridge quantity in the classification comparison with the cases where diffractive optical layers are interposed between the microfluidic channel and the camera (this is explained in the subsection "Classification performance when mixing with diffractive layers"). For each value of R, the FoV for each class can be estimated (see "Calculation of acceptance threshold and field of view given a chosen particle ratio" in the "Methods" section).
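In practice, choosing θ P for a desired particle ratio R amounts to taking an empirical quantile of the P values recorded over a measurement session. A sketch, with synthetic P values standing in for the measured ones:

```python
import numpy as np

def threshold_for_particle_ratio(P_values, R):
    """Acceptance threshold theta_P such that a fraction R of all acquired
    frames has overall perturbation P above it (i.e. is accepted).

    `P_values` are the P values of every acquired frame in a measurement
    session, pure background frames and particle frames alike.
    """
    return float(np.quantile(P_values, 1.0 - R))

# toy example: mostly background frames (low P) plus a few particle frames
rng = np.random.default_rng(0)
P = np.concatenate([rng.uniform(0, 100, 960), rng.uniform(5000, 10000, 40)])
theta_p = threshold_for_particle_ratio(P, R=0.04)
print(np.mean(P > theta_p))  # close to the requested ratio of 0.04
```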
We evaluated the performance of the presented classification algorithm considering sample sets obtained through different choices of R = 0.02, 0.04, 0.06, 0.08 (Fig. 4a). The employed image resolution is 127 × 102 pixels, corresponding to a down-sampling by a factor of 5 w.r.t. the camera resolution. A feature selection algorithm (see "Feature selection" in the "Methods" section) was applied to remove the noisiest pixels and therefore to decrease the risk of overfitting, leaving a total of 10363 features, i.e. ∼ 80% of the pixels. The sample set corresponding to R = 0.04 provides the best classification performance (low error average and variance) due to a trade-off between the quality and the number of samples. Indeed, a lower R, or equivalently a higher acceptance threshold θ P , means we only keep the highest-quality samples in the center of the laser beam, reducing the FoV. This results in a lower sample variability (which should make the classification easier), but also in a lower number of available samples (which makes it more difficult to train the classifier). It should be stressed that the optimal R value is application-specific. In particular, R should be chosen so that the classification accuracy is maximized, while trying to achieve the target cytometer throughput. Moreover, the number of available training samples and the classifier complexity (e.g. given by the image resolution) play two major roles in the choice of R, because of the need to avoid overfitting.
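The feature selection step can be illustrated as follows. The separation score used here (difference of class means over the summed standard deviations) is one plausible simple criterion and stands in for the one detailed in the "Methods" section:

```python
import numpy as np

def select_pixels(X_a, X_b, keep_frac=0.8):
    """Keep only the pixels with the highest class separation.

    X_a, X_b: (n_samples, n_pixels) arrays of flattened background-subtracted
    patterns, one array per class. Returns the indices of the kept pixels.
    """
    sep = np.abs(X_a.mean(0) - X_b.mean(0)) / (X_a.std(0) + X_b.std(0) + 1e-12)
    n_keep = int(keep_frac * sep.size)
    keep = np.argsort(sep)[-n_keep:]   # indices of the most separating pixels
    return np.sort(keep)

# toy example: only the first 20 of 100 "pixels" carry class information
rng = np.random.default_rng(0)
A = rng.normal(0, 1, (300, 100)); A[:, :20] += 2.0
B = rng.normal(0, 1, (300, 100))
keep = select_pixels(A, B, keep_frac=0.2)
print(np.all(np.isin(np.arange(20), keep)))  # informative pixels survive
```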
We furthermore double-checked whether the classifier would still be biased by the measurement conditions, in spite of our intertwined class measurements. This was done by training it on the same dataset but with half of the measurement sessions in list (2) mislabeled. In this way, the characteristic features given by the different sizes of the beads (corresponding to the true classes) were equally present in both the nominal classes (those presented to the training algorithm). Thus, if the classifier only learns the particle-related features and is therefore not biased, it should provide the same accuracy as a random guess ( ∼ 50% in the two-class case). This uniform mislabelling (UM) test indeed shows errors around ∼ 50% in Fig. 4b, indicating that no significant bias is detected.
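A sketch of how such UM-test labels can be constructed; the specific choice of which sessions are flipped is our illustrative assumption, the only requirement being that each nominal class ends up containing both true classes in equal proportion:

```python
import numpy as np

def uniform_mislabel(y, sessions):
    """Flip the class label of every other measurement session of each class,
    so that the true bead size is equally represented in both nominal classes."""
    y_um = y.copy()
    for c in (0, 1):
        class_sessions = np.unique(sessions[y == c])
        for s in class_sessions[::2]:      # every other session of this class
            y_um[sessions == s] = 1 - c
    return y_um

y = np.repeat([0, 1], 40)               # true class per sample
sessions = np.repeat(np.arange(8), 10)  # sessions 0-3: class A, 4-7: class B
y_um = uniform_mislabel(y, sessions)
# each nominal class now contains both bead sizes in equal proportion
print(np.mean(y[y_um == 0]), np.mean(y[y_um == 1]))
```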

Classification performance and time vs. image resolution.
In imaging flow cytometry, the resolution of the acquired images is a key parameter, not only because of the obvious relation with the price and the frame rate of the employed image sensor, but also because it greatly influences the execution time of the particle analysis/classification and therefore the throughput limit of online operations, such as cell sorting.
We evaluated the performance of our particle classification technique for different resolutions of the employed images and we estimated the corresponding execution (inference) times. Different sample sets were obtained by downsampling the acquired images by approximate factors of 2, 5, 10, 20, 40, 100 and 400. Thus, the original resolution of 632 × 508 pixels was decreased respectively to 316 × 254 , 127 × 102 , 64 × 51 , 32 × 26 , 16 × 13 , 7 × 6 and 2 × 2 . Note that for the two highest resolutions, respectively 87.1% and 20% of the pixels were discarded by means of feature selection (see "Feature selection" in the "Methods" section), in order to limit overfitting and the computational cost of training the classifier. For the remaining resolutions no feature selection was performed, i.e. all the pixel values were employed as features for machine learning.
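A simple strided subsampling reproduces every down-sampled resolution quoted above, so presumably something equivalent was used; as a sketch:

```python
import numpy as np

def downsample(img, f):
    """Down-sample a 2-D image by keeping every f-th pixel (strided
    subsampling). This simple scheme reproduces the quoted resolutions."""
    return img[::f, ::f]

img = np.zeros((632, 508))  # original camera resolution
for f in (2, 5, 10, 20, 40, 100, 400):
    print(f, downsample(img, f).shape)
# e.g. 5 -> (127, 102), 20 -> (32, 26), 400 -> (2, 2), matching the text
```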
Using the previously determined optimal particle ratio value R = 0.04 , we obtained classification errors below 10% for image resolutions of 127 × 102 , 64 × 51 and 32 × 26 pixels (Fig. 5a). The error is only slightly worse using 16 × 13 pixels, but it abruptly increases for 7 × 6 and 2 × 2 pixels, showing that these resolutions are too low to provide the classifier with enough particle information. In particular, this shows that the classification task could not be carried out by considering only the total forward-scattering intensity, as opposed to bead size discrimination in traditional flow cytometers. This suggests that our classification system presents much less stringent requirements on the alignment of flowing particles with the laser beam. Selecting 12.9% of the pixels from the higher-resolution images ( 316 × 254 pixels) also leads to a small but significant degradation of the classification performance. Setting R = 0.02, 0.06 or 0.08, similar performance trends with an overall degradation were obtained. It should be stressed that the relation between the classification error and the image resolution depends on the addressed classification task and cannot be generalized.
The average execution time of the classification algorithm inference (i.e. background subtraction + application of acceptance threshold + machine learning inference, see Fig. 2) was evaluated for different image resolutions by running a Python script on an ordinary laptop (Intel Core i5-8250U, 1.60 GHz × 8). Ultrafast image classification was achieved, with computational times per particle on the order of 100 µs down to 10 µs depending on the resolution (Table 1). It should be stressed that these values could easily be decreased further by, e.g., employing multi-core computing, a graphics processing unit (GPU) or dedicated hardware.
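The timed inference pipeline can be reproduced in spirit with a short NumPy script; this is a sketch in which the readout weights are random placeholders, and the measured timings will of course depend on the machine:

```python
import time
import numpy as np

def classify(frame, background, weights, intercept, theta_p):
    """One inference step: background subtraction, acceptance test, weighted sum."""
    diff = frame.astype(np.float64) - background.astype(np.float64)
    if np.sum(diff * diff) < theta_p:
        return None                    # background frame, discarded
    return 1 if diff.ravel() @ weights + intercept > 0 else 0

# median execution time over repeated runs, in the spirit of Table 1
h, w = 127, 102
frame = np.random.randint(0, 255, (h, w), dtype=np.uint8)
background = np.zeros((h, w), dtype=np.uint8)
weights = np.random.randn(h * w)       # placeholder for the trained weights
times = []
for _ in range(1000):
    t0 = time.perf_counter()
    classify(frame, background, weights, 0.0, theta_p=1.0)
    times.append(time.perf_counter() - t0)
print(f"median inference time: {np.median(times) * 1e6:.1f} us per particle")
```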
Classification performance when mixing with diffractive layers. From a machine learning perspective, one might intuitively assume that applying a simple linear classifier to the raw pixel values of an image would generally provide much weaker classification power w.r.t. common approaches based on feature extraction and deep learning. Actually, linear classifiers and regressors can provide state-of-the-art performance when applied to random high-dimensional nonlinear transformations of the input, as happens in widespread approaches like Extreme Learning Machines 24,25 (ELM) and Reservoir Computing 26,27 (RC). Indeed, the relation between the optical particle features and the detected interference pattern (input and output) is mathematically nonlinear, and the high number of pixels in an image sensor can potentially provide a high-dimensional mapping. Therefore, modulating and controlling the interference pattern projection, e.g. through interposed diffractive layers, can provide an extremely fast and power-efficient source of computational power, as experimentally demonstrated in 29,30 . Moreover, we previously demonstrated numerically 28 that random diffractive layers that resemble diffraction grating structures can significantly improve the performance of a linear classifier in non-trivial classification of cell structures. However, by interposing diffractive layers between the particle and the image sensor, the automatic discrimination of particle images from background images is likely to be influenced. In particular, this can modify the cytometer sensitivity and the class balance in the training sample sets. In this subsection we present a method to avoid these issues and to guarantee a valid performance comparison, laying the groundwork for hardware-based improvement of the proposed classification technique.
For the sake of simplicity, we present the comparison between two simple configurations: no diffraction grating (NDG), and one interposed double-axis holographic diffraction grating (DG, see Fig. 1). The acceptance thresholds θ_P^NDG and θ_P^DG must be chosen properly to make the two cases comparable. We want to compare both cases at a fixed maximum sensitivity of the cytometer, i.e. when the FoV is the same in both configurations. Generally, the introduction of a diffractive layer changes the intensity of the acquired particle signal in a nonlinear way, so that θ_P^NDG = θ_P^DG or even θ_P^NDG ∝ θ_P^DG would lead to different FoVs. However, as we discuss in the calculation of the field of view in the "Methods" section, there is a one-to-one correspondence between the particle flow rate R_f, the particle ratio R and the field of view. If we can guarantee in our experiments that the particle flow rate R_f is constant, the requirement of having a fixed field of view translates into a requirement of having a fixed particle ratio R. This allows us to set the acceptance thresholds for both configurations by looking at the experimentally determined relationship between the particle ratio R and the acceptance threshold θ_P (see Fig. 5c). We also checked that the particle flow rate R_f did not change significantly from one measurement session to another, and therefore that the relation between R and θ_P remained constant. This was experimentally confirmed by comparing two measurements in NDG configuration performed at significantly distant times (3 days from one another, see Fig. 5d).

Figure 5. (a,b) Box plots of the classification error for R = 0.04, evaluated on particle images of different resolution (x axis), with and without a holographic double-axis diffraction grating interposed between the camera and the microfluidic channel. Classification errors lower than 10% were obtained for image resolutions down to just 32 × 26 pixels. Generally, the interference patterns processed by the diffraction grating provide particle classification with similar or slightly higher errors. Note that the number of samples used to evaluate the first two points was further reduced by feature selection. (c,d) Particle ratio R as a function of the acceptance threshold θ_P for different measurement sessions. (c) Comparison between the configuration without interposed diffraction grating (NDG, blue dots) and with diffraction grating (DG, red dots). The diffraction grating changes the relation in a nonlinear way. (d) Comparison between measurement sessions (both in NDG configuration) performed with a time distance of 3 days. The curves do not change significantly from one measurement session to another, indicating stability in our measurements.

Table 1. Execution time per particle of the proposed classification algorithm for different image resolutions, evaluated on a laptop (Intel Core i5-8250U, 1.60 GHz × 8) using a Python script (NumPy library). The reported time values are averaged (median) over 10000 iterations of the following steps: computation of the difference between the target and the background image after conversion to float-type matrices; application of the acceptance threshold to the sum of the squared elements of the difference matrix; weighted sum of the difference matrix (i.e. machine learning inference).
Generally, the DG configuration provided similar or (in most cases slightly) inferior classification performance w.r.t. the NDG configuration (for example, compare Fig. 5a,b). We ascribe the higher error rates mainly to the significant intensity attenuation by the diffraction grating, leading to a lower signal-to-noise ratio. This issue, however, can be easily overcome in a more mature cytometer implementation, e.g. by enclosing the system in a box or by screening the sensor with an optical filter to reduce noise due to environmental illumination. The main challenge of the classification task studied here is the variability due to the microbead displacement w.r.t. the illumination center, which can in principle be alleviated arbitrarily by decreasing the cytometer FoV. In that case, we expect that a properly designed diffractive layer may improve the classification performance, especially when the particle types are distinguished by differences in internal structure, such as in sorting of white blood cells 28 .
On the other hand, the fact that the classification performance is not significantly disrupted by the heavy deformation of the particle interference pattern due to the diffraction grating (visual examples are in Fig. 1d,h-j,l) demonstrates the robustness of the proposed cytometer. Indeed, the classifier can be trained without any problem even when the acquired images are altered, e.g. by fabrication defects, misalignment or blurring, as long as the particle information regarding the difference between classes is not lost. This is relevant in practice, as motion blur is a common problem in imaging flow cytometry 4 and often limits the achievable throughput.
Comparison with other works. In this subsection we compare the classification performance of our method with that presented in three other comparable works reporting online label-free classification (Table 2). It should be specified that the throughput of our setup is quite low (∼ 2.7 classified cells per second for R = 0.04), since our work mainly focuses on general machine learning aspects of label-free imaging flow cytometry rather than on developing a high-throughput device. We should also stress that it is difficult to estimate and compare the complexity of the respective classification tasks, since not only do the particle characteristics play a crucial role, but also cytometer properties such as the FoV, the presence of an image focusing system or the control of measurement bias.
In particular, it should be stressed that a wider FoV not only introduces the challenge of generalizing the classification to a higher variability in particle position, but also implies a smaller contrast of the particle signal w.r.t. the background illumination. In this regard, in 20 the reported FoV is 25 µm, much smaller than what we estimated for this work (∼ 0.3 mm, Table 3). While in 16 there seems to be no mention of it, in 15 the FoV is comparable with ours, but the actual machine learning classification is applied to cropped and centered particle images, so that the variability in particle position does not complicate the classification. Furthermore, it is interesting to note that our classification algorithm is not specifically built to extract position-invariant features, as opposed to the classifiers used in the other works described here. Finally, a distinguishing trait of this work is that the classifier could learn and operate on images that could not be straightforwardly classified or recognized by human inspection (e.g. see patterns in Fig. 1).
This said, the presented bias-free classification is at least 15 times faster than in the aforementioned works, even though it is computed with a common laptop and without GPU acceleration.

Conclusion
We discussed some important machine learning aspects of fast particle classification with label-free imaging flow cytometry. To do so, we employed a simple, cheap and compact cytometer and demonstrated ultrafast classification of particle interference patterns, which can enable online high-throughput analysis (e.g. for cell sorting) at a low computational cost. Proof-of-principle experiments were performed by acquiring and classifying interference patterns projected by transparent PMMA microparticles with diameters of (15.2 ± 0.5) µm and (18.6 ± 0.6) µm, which could not be easily classified by human inspection. In particular, we discussed and demonstrated the following fundamental aspects:

• Detection and treatment of the deceptive bias that can affect machine learning models, arising from the correlation between the ground truth information (necessary for training and testing) and the experimental conditions that may influence the measurements.

• Direct application of a linear classifier to background-subtracted images of particle interference patterns, allowing simple and robust machine learning classification of particles with high position variability at an extremely low computational cost.

• A method to properly evaluate the change in classification performance when a diffractive layer (a double-axis holographic diffraction grating film in this case) is interposed between the camera and the microfluidic channel, making sure that the field of view (i.e. the sensitivity) and the class balance of the training sample sets remain unchanged.

Table 2. Comparison of machine learning-related aspects of three other works (reporting online label-free classification via particle imaging) and our work. CNN is the acronym for Convolutional Neural Network, while mAP is the abbreviation of mean Average Precision.
A diffraction layer interposed between the camera and the microfluidic channel can in principle improve particle classification according to the Extreme Learning Machine (ELM) paradigm 28, even though in this case similar or slightly worse performance was achieved. Nevertheless, we think that an experimental demonstration of the classification improvement due to an interposed diffractive layer should be attempted by thoroughly exploring different configurations and considering a more morphology-based classification task, such as white blood cell sorting. Quantitatively speaking, the best achieved performance in terms of classification accuracy and execution time is an accuracy above 90% (on 32 × 26 pixel images) with an estimated execution time of 13 µs (using a common laptop) and a field of view of ∼ 300 µm along the microfluidic channel. It should be noted that the accuracy could be enhanced by simply employing a smaller field of view and by acquiring a sufficient number of samples to properly train the classifier. As mentioned, suitable measurements, validation algorithms and tests were devised and employed to obtain a correct training and evaluation of the classification performance, which would otherwise have been biased by slight drifts of the measurement conditions. The proposed particle classification algorithm is at least one order of magnitude faster than the state-of-the-art, represented by three other works regarding fast online classification in label-free flow cytometry 15,16,20, where GPU acceleration was instead employed.
The low computational cost of the proposed classification method could enable ultrafast (∼ 100,000 particles/s) online particle analysis if applied to time-stretch microscopy 11,14, removing or alleviating the issue of storing large amounts of data and allowing fast online operations in these systems, such as cell sorting. Another possible high-throughput application is to perform the cell analysis in parallel by employing multiple particle streams, where the computational cost would otherwise be a bottleneck 5,7.
Finally, the all-round simplicity and the low cost of the presented flow cytometry approach make it suitable for compact point-of-care applications, where both the training and the use of the cytometer should not require high technical expertise.

Methods
Measurement details. The employed PMMA microbead mixtures were obtained by diluting the original mixtures (5% solid content volume) in a solution of water with a small quantity of surfactant and a water purification tablet, reaching a solid content volume fraction of 0.024%. The mixtures were pumped through a 100 µm × 100 µm straight PMMA microfluidic channel at a constant rate of ∼ 0.003 ml/s, using three different syringes (one at a time) for the two particle classes and the flushing water respectively, to avoid particle contamination. Between each measurement session, the microfluidic channel and tubes were flushed with water to remove possible residual microbeads.
The microfluidic channel was illuminated by focusing HeNe laser radiation (constant emitted power of 3.5 mW) on a pinhole (diameter of 25 µm) tightly clamped to the microfluidic slide, in order to prevent it from moving during measurements and to reduce vibration noise. When employed, the holographic diffraction grating film was directly attached to the front side of the microfluidic slide. A schematic of the employed setup is shown in Fig. 1a. Images of 632 × 508 pixels were acquired in free-run mode by a Ximea MQ013MG-0N camera, at a frame rate of ∼ 138 fps and with 29 µs exposure time.

Table 3. Correspondence between the chosen particle ratio R values (same for both particle classes), the number of images accepted as samples for classification (with strong enough particle signal) and the estimated FoV of the classification process. Left and right tables regard respectively the configurations without and with a diffraction grating interposed between the microfluidic channel and the camera (NDG and DG configurations).
Machine learning pipeline. The whole image processing presented in this work was executed in Python. In particular, the machine learning pipeline was built on top of the scikit-learn library 31 and the following functions were employed: model_selection.GroupKFold to implement the two nested cross-validation loops; preprocessing.StandardScaler to normalize the features before each training or inference step; linear_model.LogisticRegression with "l2" penalty, "liblinear" solver and "balanced" class weight, as linear classifier. The only optimized hyperparameter was the inverse of the L2 regularization strength, C, chosen among 13 values equidistant on a logarithmic scale from 10^−5 to 10. The downsampling to the desired image resolutions was performed employing the "block_reduce" function from the scikit-image Python library. The classification error rate reported in the box plots represents the fraction of misclassified test samples w.r.t. the total number of test samples, i.e. the complement of the classification accuracy.
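A minimal sketch of such a pipeline, assuming synthetic data and illustrative fold counts (the nested GroupKFold loops, the scaler, the classifier and the 13-value C grid follow the description above):

```python
import numpy as np
from sklearn.model_selection import GroupKFold
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def make_clf(C):
    # scaler + linear classifier, with the settings described in the text
    return make_pipeline(
        StandardScaler(),
        LogisticRegression(penalty="l2", solver="liblinear",
                           class_weight="balanced", C=C),
    )

# Hypothetical data: flattened background-subtracted images, two classes,
# grouped so that samples from one group never span train and test sets.
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 16 * 13))
y = rng.integers(0, 2, size=120)
groups = np.repeat(np.arange(20), 6)

C_grid = np.logspace(-5, 1, 13)   # 13 log-spaced values from 1e-5 to 10

test_errors = []
for tr, te in GroupKFold(n_splits=3).split(X, y, groups):
    # inner loop: select C using the outer training set only
    def inner_score(C):
        s = []
        for itr, iva in GroupKFold(n_splits=3).split(X[tr], y[tr], groups[tr]):
            clf = make_clf(C).fit(X[tr][itr], y[tr][itr])
            s.append(clf.score(X[tr][iva], y[tr][iva]))
        return np.mean(s)

    best_C = max(C_grid, key=inner_score)
    clf = make_clf(best_C).fit(X[tr], y[tr])
    test_errors.append(1.0 - clf.score(X[te], y[te]))  # error = 1 - accuracy

print(f"nested-CV error rate: {np.mean(test_errors):.2f}")
```

The group-aware splits are what prevent measurement-session bias from leaking between training and test folds.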
Feature selection. The feature selection, applied only in the two highest image resolution cases, consists in selecting only a fraction of the pixels, in particular those that show the highest class separation. Given a pixel, the class separation tells how stochastically larger or smaller the values corresponding to one class are w.r.t. the ones belonging to the other class. To obtain a measure of this quantity that is robust against outliers and non-normality, we exploited a simple non-parametric statistic: the Mann-Whitney U 32. In particular, the following normalized (from 0 to 1) expression was considered:

|U − (n_A n_B + 1)/2| / [(n_A n_B + 1)/2]  (4)

where U is the aforementioned statistic, calculated through the Python function scipy.stats.mannwhitneyu (with the "alternative" parameter set to "two-sided"), and n_A and n_B are the numbers of samples belonging to class A and class B respectively. Rather than selecting few important features, the proposed feature selection method is more suited to discarding unimportant noisy features from a large set, such as pixels that do not contain particle information, in a computationally cheap and statistically robust way. Moreover, visualising the class separation colormap may provide interesting insight into the interference pattern areas that are most class-dependent (Fig. 1g,n).
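The per-pixel class separation can be sketched as follows, on synthetic data where only a few central pixels are class-dependent (the array shapes, the injected signal and the 25% selection quantile are illustrative assumptions):

```python
import numpy as np
from scipy.stats import mannwhitneyu

def class_separation(a, b):
    """Normalized (0 to 1) Mann-Whitney class separation for one pixel,
    following the expression in the text: |U - (nA*nB + 1)/2| / ((nA*nB + 1)/2)."""
    U = mannwhitneyu(a, b, alternative="two-sided").statistic
    half = (len(a) * len(b) + 1) / 2
    return abs(U - half) / half

# Synthetic example: 100 images of 8x8 pixels per class; only the central
# pixels carry class information (hypothetical data, for illustration only).
rng = np.random.default_rng(0)
A = rng.normal(0.0, 1.0, size=(100, 8, 8))
B = rng.normal(0.0, 1.0, size=(100, 8, 8))
B[:, 3:5, 3:5] += 2.0                    # class-dependent central pixels

sep = np.array([[class_separation(A[:, i, j], B[:, i, j])
                 for j in range(8)] for i in range(8)])

# keep, e.g., the 25% of pixels with the highest class separation
mask = sep >= np.quantile(sep, 0.75)
print("selected pixels:", mask.sum())
```

Plotting `sep` as an 8 × 8 colormap reproduces, in miniature, the class separation maps of Fig. 1g,n.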
Calculation of acceptance threshold and field of view given a chosen particle ratio. The relation between particle ratio R and acceptance threshold θ_P was graphically obtained by plotting the count of accepted images divided by the total number of images for many values of θ_P (Fig. 5c). It was then straightforward to select an acceptance threshold corresponding to a chosen particle ratio. The field of view (FoV) can be derived from the acceptance threshold, knowing the aforementioned particle flow rate R_f, the exposure time τ and the fluid velocity v. In particular, let us start by finding the probability that an image contains enough particle information, i.e. that a particle is at least partially present in a given FoV during an exposure time interval τ. Let us call t_in and t_out the times at which a particle respectively enters and exits the FoV, and let us call τ_start and τ_end the start and end times of the camera exposure. The conditions for capturing the signal of a particle in the FoV are then t_in < τ_end and t_out > τ_start. We can substitute t_out = t_in + FoV/v, FoV/v being the time that a particle takes to travel through the FoV, obtaining τ_start − FoV/v < t_in < τ_end. Since the density of particles in the mixture is quite low, we can consider the passage of particles as independent events. Therefore, the imaging of the pattern from k particles in the FoV can be described by the Poisson process counting the occurrence of k events t_in, with a time rate R_f, in a time interval τ + FoV/v, with probability

Pr(k, τ + FoV/v, R_f) = [R_f (τ + FoV/v)]^k exp[−R_f (τ + FoV/v)] / k!  (5)

In our case τ = 29 µs, and we can calculate R_f by multiplying the flux rate (0.2 ml/min) by the estimated particle concentration, which depends on the particle class (1.6 × 10^4 and 0.91 × 10^4 particles/ml respectively for classes A and B), since the mixtures have a common solid content volume.
Note that we are assuming that the number of particles that remain stuck somewhere before reaching the illumination area is negligible w.r.t. the total number of passing particles. Even if we deem this assumption sufficiently accurate in our case, we should keep in mind that the estimated R_f is an upper limit for the true particle flow rate; from the next calculation steps it will be evident that this implies a lower-limit estimate of the true FoV. To provide an example calculation, assuming a reasonable FoV = 100 µm, respectively for classes A and B we obtain (keeping 2 significant digits): Pr_A(k = 0) = 0.98, Pr_B(k = 0) = 0.99, Pr_A(k = 1) = 0.017, Pr_B(k = 1) = 0.0098, Pr_A(k = 2) = 0.00016, Pr_B(k = 2) = 0.000048. These results are qualitatively consistent with both our visual checks and our assumption that the particles do not significantly interact during their passage through the microfluidic channel (statistical independence). The particle ratio R can be estimated by R = 1 − Pr(0, τ + FoV/v, R_f), with reference to equation (5). Thus, by inverting it, we can finally estimate the FoV corresponding to a chosen value of R:

FoV = v [−ln(1 − R)/R_f − τ]  (6)

For each chosen value of R and for each particle class, we report in Table 3 the number of classification samples (accepted images) and the FoV estimates. The corresponding estimated FoV is quite large: ∼ 0.3 mm. It should
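The example calculation above can be reproduced numerically under the stated assumptions (τ = 29 µs, a flux rate of 0.2 ml/min, the 100 µm × 100 µm channel cross-section used to derive the fluid velocity, and the class-A concentration):

```python
import math

tau = 29e-6                     # exposure time [s]
flux = 0.2 / 60                 # flux rate: 0.2 ml/min -> [ml/s]
conc_A = 1.6e4                  # class-A concentration [particles/ml]
Rf = flux * conc_A              # particle flow rate [particles/s]

# fluid velocity = volumetric flux / channel cross-section (1 ml = 1e-6 m^3)
v = (flux * 1e-6) / (100e-6 * 100e-6)   # [m/s]

def pr(k, fov):
    """Poisson probability of k particles in the FoV during one exposure."""
    T = tau + fov / v           # effective time window
    return (Rf * T) ** k * math.exp(-Rf * T) / math.factorial(k)

fov = 100e-6                    # assumed FoV of 100 um
print(round(pr(0, fov), 2), round(pr(1, fov), 3))   # 0.98 0.017, as in the text

# inverting R = 1 - Pr(0, tau + FoV/v, Rf) gives the FoV for a chosen R
R = 0.04
fov_est = v * (-math.log(1 - R) / Rf - tau)
print(f"FoV ~ {fov_est * 1e3:.2f} mm")              # ~0.25 mm
```

The class-A probabilities match the values quoted above, and the inverted FoV for R = 0.04 comes out at a few tenths of a millimeter, consistent with the ∼ 0.3 mm estimate.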