The rapid and accurate identification of live microorganisms is of great importance for a wide range of applications1,2,3,4,5,6,7,8, including drug discovery screening assays1,2,3, clinical diagnoses4, microbiome studies5,6, and food and water safety7,8. Waterborne diseases affect more than 2 billion people worldwide9, causing a substantial economic burden; for example, the treatment of waterborne diseases costs more than $2 billion annually in the United States (US) alone, with 90 million cases recorded per year10.

Among waterborne pathogen-related problems, one of the most common public health concerns is the presence of total coliform bacteria and Escherichia coli (E. coli) in drinking water, which indicates fecal contamination. Analytical methods used to detect E. coli and total coliforms are based on culturing the obtained samples on solid agar plates (e.g., the US Environmental Protection Agency (EPA) 1103.1 and EPA 1604 methods) or in liquid media (e.g., Colilert test), followed by visual recognition and counting by an expert, as described in the EPA guidelines11,12,13. While the use of liquid growth media for the detection of fecal coliform bacteria provides high sensitivity and specificity, it requires at least 18 h for the final read-out. The use of solid agar plates is a relatively more cost-effective method and provides flexibility for the volume of the sample to be analysed, which can vary from 100 mL to several litres to enhance the sensitivity. However, this traditional culture-based detection method requires the colonies to grow to a certain macroscopic size for visibility, which often takes 24–48 h in the case of bacterial samples. Alternatively, molecular detection methods14,15 based on, e.g., the amplification of nucleic acids, can reduce the assay time to a few hours, but they generally lack the sensitivity for detecting bacteria at very low concentrations, e.g., 1 colony forming unit (CFU) per 100–1000 mL, and are not capable of differentiating between live and dead microorganisms16. Furthermore, there is no EPA-approved nucleic acid-based analytical method17 for detecting coliforms in water samples.

Overall, there is a strong and urgent need for an automated method that can achieve rapid and high-throughput colony detection with high sensitivity (routinely achieving, e.g., 1 CFU per 100–1000 mL in less than 12 h) to provide a powerful alternative to the currently available EPA-approved gold-standard analytical methods, which (1) are slow, taking ~24–48 h, and (2) require experts to read and quantify samples. To address this important need, various other approaches18,19,20 have been investigated for the detection of total coliform bacteria and E. coli in water samples, including solid phase cytometry21, droplet-based micro-optical lens array measurements22, fluorimetry23, luminometry24, and fluorescence microscopy25. Although these methods provide high sensitivity and some time savings, they cannot handle large sample sizes (e.g., ≥100 mL) or cannot perform the automated classification of bacterial colonies.

To provide a highly sensitive and high-throughput system for the early detection and classification of live microorganisms and colony growth, we present a time-lapse coherent imaging platform that uses two different deep neural networks (DNNs) for its operation. The first DNN is used to detect bacterial growth as early as possible, and the second DNN is used to classify the type of growing bacteria based on the spatiotemporal features obtained from the coherent images of an incubated agar plate (see Fig. 1). In this live bacteria detection system, which is integrated with an incubator, lens-free holographic images of the agar plate sample are captured by a monochromatic complementary metal–oxide–semiconductor (CMOS) image sensor that is mounted on a translational stage. The system rapidly scans the entire area of two separate agar plates (~56.52 cm2) every 30 min and utilizes these time-resolved holographic images for the accurate detection, classification, and counting of the growing colonies as early as possible (see Fig. 2a). This unique system enables high-throughput periodic monitoring of an incubated sample by scanning a 60-mm-diameter agar plate in 87 s with an image resolution of <4 μm; it continuously calculates differential images of the sample of interest for the early and accurate detection of bacterial growth. The spatiotemporal features of each nonstatic object on the plate are continuously analysed using deep learning to yield the count of bacterial growth and to automatically identify the type(s) of bacteria growing on the different parts of the agar plate.

Fig. 1: High-throughput bacterial colony growth detection and classification system.

a Schematic of the device. b Photograph of the lens-free imaging system. c Detailed illustration of various components of the system

Fig. 2: Schematics demonstrating the workflow of the microorganism monitoring system.

a Bacterial sample preparation workflow. b Steps of the image and data processing algorithms for the automated detection of the growing colonies and classification of their species. The scale bars for the holographic images of the growing colonies (E. coli and K. aerogenes) and a static particle (dust) are 100 µm

We demonstrated the efficacy of this platform by performing the early detection and classification of three types of bacteria, i.e., E. coli, Klebsiella aerogenes (K. aerogenes), and Klebsiella pneumoniae (K. pneumoniae), and achieved a limit of detection (LOD) of ~1 CFU/L in ≤9 h of total test time. Moreover, we achieved detection time savings of more than 12 h compared to the gold-standard EPA methods26, which usually require at least 24 h to obtain a result. We also quantified the growth statistics of these three different species and provided a detailed growth analysis of each type of bacteria over time. Our detection and classification neural network models were built, trained and validated with ~16,000 individual colonies resulting from 71 independent experiments and were blindly tested with 965 individual colonies collected from 15 independent experiments that were never used in the training phase. In our blind testing, the trained models demonstrated an 80% detection sensitivity within 6–9 h, a 90% detection sensitivity within 7–10 h, and a >95% detection sensitivity within 12 h, while maintaining ~99.2–100% precision at any time point after 7 h, also achieving correct identification of 80% of all three species within 7.6–12 h. In terms of the species-specific accuracy of our classification network, within 12 h of incubation, we achieved ~97.2%, ~84.0%, and ~98.5% classification accuracy for E. coli, K. aerogenes, and K. pneumoniae, respectively. These results confirm the transformative potential of our platform, which not only enables the highly sensitive, rapid and cost-effective detection of live bacteria (with a cost of $0.6 per test, including a culture plate) but also provides a powerful and versatile tool for microbiology research.


We demonstrated our system by monitoring bacterial colony growth within 60-mm-diameter agar plates and quantitatively analysed the capabilities of the platform for early detection of the bacterial growth and classification of bacterial species. To demonstrate its proof-of-concept, we aimed to automatically detect, classify, and count E. coli and coliform bacteria in water samples using our deep learning-based platform. Throughout our training and blind testing experiments, we used water suspensions spiked with coliform bacteria, including E. coli, K. aerogenes, and K. pneumoniae, and chlorine-stressed E. coli. A chromogenic agar medium designed for the specific detection and counting of E. coli and other coliform bacteria in food and water samples was used as a culture medium for specificity (see the “Methods” section for details). This chromogenic medium results in a blue colour for E. coli colonies and a mauve colour for the colonies of other coliform bacteria (e.g., K. aerogenes and K. pneumoniae). In addition, the medium inhibits the growth of different bacteria (e.g., Bacillus subtilis) or yields colourless colonies in the presence of other bacteria in the sample27.

Following the sample preparation method illustrated in Fig. 2a, the sample is placed inside the lens-free imaging system with the agar surface facing the image sensor. After an initialization step, the platform automatically captures time-lapsed holographic images of two separate Petri dishes (covering a total sample area of 28.26 × 2 = 56.52 cm2) every 30 min over a duration of 24 h starting from the incubation time; these individual holograms are digitally stitched together and rapidly reconstructed to reveal the bacterial growth patterns on the agar surface (see the “Methods” section). The reconstructed images of the sample captured at different time points are computationally processed using a differential image analysis method to automatically detect and classify bacterial growth and colonies using two different trained DNNs (see Fig. 3), which will be detailed next.

Fig. 3: Images captured using the microorganism monitoring system.

a Whole agar plate image of mixed E. coli and K. aerogenes colonies after 23.5 h of incubation. b Example images (i.e., amplitude and phase) of the individual growing colonies detected by a trained deep neural network. The time points of detection and classification of growing colonies are annotated with blue arrows. The scale bar is 100 µm

Design and training of neural networks for bacterial growth detection and classification

We designed a two-step framework for bacterial growth detection and classification. The first step selects colony candidates with differential image analysis and refines the results with a detection DNN. We designed a pseudo-3D (P3D) DenseNet28 architecture to process our complex-valued (i.e., phase and amplitude) time-lapse image stacks (see the “Methods” section). In each time-lapse imaging experiment, we used 4 time-consecutive frames (4 × 0.5 = 2 h) as a running window for the differential image analysis to extract individual regions of interest (ROIs) containing objects that changed their amplitude and/or phase signatures as a function of time. These initially detected objects that were extracted by the differential analysis algorithm were either growing colonies or surface impurities, e.g., from spreading the sample on the agar surface, evaporation of air bubbles in the agar plate, or coherent light speckles. We then used a DNN-based detection model to eliminate the nonbacterial objects and only kept the growing colonies (i.e., the true positives), as illustrated in Fig. 2b. We used sensitivity (or true positive rate, TPR) and precision (or positive predictive value, PPV) measurements to quantify our results. Sensitivity is defined as

$${\mathrm{TPR}} = {\mathrm{TP}}/{P}$$

where TP refers to the number of true positive predictions from our system, and P refers to the total number of colonies resulting from manual plate counting after 24 h (i.e., the ground truth). Precision is defined as

$${\mathrm{PPV}} = {\mathrm{TP}}/\left( {{\mathrm{TP}} + {\mathrm{FP}}} \right)$$

where FP refers to the number of false positive predictions from our system.
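As a minimal sketch of the two definitions above (an illustration, not the authors' code), the function names and the counts below are hypothetical. The second call shows why precision is unstable early in an experiment: when only one true colony has been detected, a single false positive halves the PPV.

```python
def sensitivity(tp, p):
    """TPR = TP / P, where P is the 24 h manual plate count (ground truth)."""
    if p <= 0:
        raise ValueError("ground-truth colony count must be positive")
    return tp / p


def precision(tp, fp):
    """PPV = TP / (TP + FP); returns None when nothing has been detected yet."""
    detected = tp + fp
    return None if detected == 0 else tp / detected


# Made-up counts for illustration only; these are not results from the paper.
early_tpr = sensitivity(tp=80, p=100)   # 80 of 100 ground-truth colonies found
early_ppv = precision(tp=1, fp=1)       # one real colony, one false positive
late_ppv = precision(tp=124, fp=1)      # same single false positive, many colonies
print(early_tpr, early_ppv, late_ppv)
```

Returning None for the empty case is a design choice of this sketch; the source does not state how precision is reported before the first detection.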

In total, 13,712 growing colonies (E. coli, K. aerogenes, and K. pneumoniae) and 30,000 non-colony objects captured from 66 separate agar plates were used in the training phase. Another 2597 colonies and 13,078 non-colony objects from 5 independent plates were used as a validation dataset to finalize our network models, which achieved a TPR of ~95% and a PPV of ~95% once the network converged, which took ~4 h of training time. Examples of the training loss and detection accuracy curves are shown in Supplementary Fig. S1.

The second step further classifies the species of the detected colonies with a classification DNN model following a similar network architecture. To accommodate the different growth rates of bacterial colonies, we used a longer time window in this classification neural network, containing 8 consecutive frames (8 × 0.5 = 4 h) for each sub-ROI. Since the bacterial growth detection network uses a shorter running time window of 2 h, there is a natural 2-h time delay between the successful detection of a growing colony and the classification of its species. The network was trained with 7919 growing colonies, which contained 3362 E. coli, 1880 K. aerogenes, and 2677 K. pneumoniae colonies, and it was validated with 340 E. coli, 205 K. aerogenes, and 988 K. pneumoniae colonies from 6 independent plates and reached a validation classification accuracy of ~89% for E. coli, ~95% for K. aerogenes, and ~98% for K. pneumoniae when the network model converged (Supplementary Fig. S2).
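The two running windows and the resulting 2-h delay can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' pipeline: the `changed_pixels` test (a threshold on the change of the complex field between the first and last frame of a window) and the synthetic stack are hypothetical, and the window duration follows the text's convention of number of frames × 0.5 h.

```python
import numpy as np

FRAME_INTERVAL_H = 0.5       # one scan every 30 min (from the text)
DETECTION_FRAMES = 4         # 4 x 0.5 h = 2 h window (from the text)
CLASSIFICATION_FRAMES = 8    # 8 x 0.5 h = 4 h window (from the text)


def window_duration_h(n_frames, interval_h=FRAME_INTERVAL_H):
    """Window length in hours, using the text's convention n_frames x interval."""
    return n_frames * interval_h


def running_windows(stack, n_frames):
    """Yield (end_index, window) for every run of n_frames consecutive frames."""
    for end in range(n_frames, stack.shape[0] + 1):
        yield end, stack[end - n_frames:end]


def changed_pixels(window, threshold):
    """Hypothetical candidate test: flag pixels whose complex field (amplitude
    and/or phase) differs between the first and last frame of the window."""
    return np.abs(window[-1] - window[0]) > threshold


# Synthetic 10-frame complex-valued stack (16 x 16 pixels) with a static background.
stack = np.ones((10, 16, 16), dtype=np.complex64)
# A made-up "colony" appears at frame index 5 and changes one pixel's field.
stack[5:, 8, 8] = 0.5 * np.exp(1j * 0.8)

# Number of frames acquired when the detection window first sees a change.
first_hit = next(end for end, window in running_windows(stack, DETECTION_FRAMES)
                 if changed_pixels(window, threshold=0.2).any())

# Extra time the longer classification window needs after a detection.
delay_h = window_duration_h(CLASSIFICATION_FRAMES) - window_duration_h(DETECTION_FRAMES)
print(first_hit, delay_h)
```

In the actual system, the candidates found this way are passed to the detection DNN, which rejects non-bacterial objects; that step is not modelled here.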

After these network models were finalized through the training and validation data, we tested their generalization capabilities with an additional set of experiments that were never seen by the networks before; the results of these blind tests are detailed next.

Blind testing results for the early detection of bacterial growth

First, we blindly tested the performance of our system in the early detection of bacterial colonies with 965 colonies from 15 plates that were not presented during the network training or validation stages. We compared the predicted number of growing colonies on the sample within the first 14 h of incubation against a ground truth colony count obtained from plate counting after 24 h of incubation time. Each of the 3 sensitivity curves (Fig. 4a–c) was averaged across repeated experiments for the same species, e.g., 4 experiments for K. pneumoniae, 7 experiments for E. coli, and 4 experiments for K. aerogenes, so that each data point was calculated from ~300 colonies. The results demonstrated that our system was able to detect 80% of the true positive colonies within ~6.0 h of incubation for K. pneumoniae, ~6.8 h of incubation for E. coli, and ~8.8 h of incubation for K. aerogenes. In addition, our platform further detected 90% of the true positives after ~1 additional hour of incubation and >95% of the true positive colonies of all 3 species within 12 h. The results also reveal that the early detection sensitivities in Fig. 4a–c are dependent on the length of the lag phase of each tested bacterial species, which demonstrates interspecies variations. For example, K. pneumoniae started to grow earlier and faster than E. coli and K. aerogenes, whereas K. aerogenes did not reach a detectable growth size until 5 h of incubation. Furthermore, when the tails of the sensitivity curves were examined, some of the E. coli colonies showed late “wake-up” behaviour, as highlighted by the purple arrow in Fig. 4b. Although most of the E. coli colonies were detected within ~10 h of incubation time, some of them did not emerge until ~11 h after the start of the incubation phase.
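The "time to reach 80% (or 90%, 95%) sensitivity" figures quoted above can be read off a cumulative detection curve as sketched below. The function name and the curve values are hypothetical and chosen only to illustrate the calculation; they are not data from Fig. 4.

```python
def time_to_sensitivity(times_h, detected_counts, ground_truth, target):
    """Return the first time point at which detected / ground_truth >= target,
    or None if the target sensitivity is never reached."""
    for t, n in zip(times_h, detected_counts):
        if n / ground_truth >= target:
            return t
    return None


# Made-up cumulative detections, sampled every 30 min from 4.0 h to 7.5 h,
# for a hypothetical plate with 100 colonies counted at 24 h.
times_h = [4.0 + 0.5 * i for i in range(8)]
detected = [0, 5, 20, 50, 82, 91, 96, 98]

t80 = time_to_sensitivity(times_h, detected, ground_truth=100, target=0.80)
t90 = time_to_sensitivity(times_h, detected, ground_truth=100, target=0.90)
t99 = time_to_sensitivity(times_h, detected, ground_truth=100, target=0.99)
print(t80, t90, t99)
```

Because the denominator is the 24 h count, colonies that have not yet emerged count as misses at early time points, so such a curve gives a lower bound on the detector's performance.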

Fig. 4: Sensitivity and precision analysis.

Sensitivity of growing colony detection using our trained neural network for a K. pneumoniae, b E. coli, and c K. aerogenes. d Precision of growing colony detection using our trained neural network for all three species. The pink arrow indicates the time for late “wake-up” behaviour for some of the E. coli colonies. e Characterizing the growth speed of chlorine-stressed E. coli using our system. There was an ~2 h delay in colony formation for chlorine-stressed E. coli (orange curve) compared to the unstressed E. coli strain (blue curve). The error bars show the standard deviation values across multiple plates

We also quantified the false positive rate of our platform with the PPV curve shown in Fig. 4d, which was averaged across all the experiments covering all the species, i.e., 965 colonies from 15 agar plates. The precision can be low at the beginning of the experiments (the first 4 h of incubation) because the number of detected true positive colonies is very small, especially for K. aerogenes, so that even a single false positive detection can dramatically affect the precision calculation. Nevertheless, the precision quickly rises to ~100% within 6 h of incubation and is maintained at 99.2–100% for all the tested species after 7 h of incubation.

We should emphasize here that the results presented in Fig. 4 represent the lower limits of the detection capabilities of our system since we calculated these sensitivities with regard to the number of true positive colonies after 24 h of incubation, whereas some of these colonies actually did not exist at the early stages due to delayed growth; stated differently, in some cases there were no colonies present at the early stages of the incubation period. We also note that the rising sensitivity curves in our results stand for the emergence of new bacterial colonies, in addition to the growth of colonies. Even though the sensitivity curves converge to flat lines after 12 h, the colonies continue to grow exponentially until much later. Therefore, our system detects emerging colonies at an early stage, when they first appear, forming microscale features invisible to the naked eye.

These observations also indicate that our system can be used effectively for high-throughput quantitative studies to better understand microorganism behaviour under different conditions, such as the evaluation of the differences in growth rates between stressed bacteria (e.g., under nutrient deprivation or chlorine treatment) and normal bacteria29,30,31,32,33. There are several reasons to detect and enumerate chlorine-stressed or injured coliform bacteria. First, the detection of injured E. coli or total coliform bacteria is directly related to the sensitivity of the detection platform33. For an effective and sensitive detection platform, false-negative results should be avoided for public health safety. Another important reason is that the detection of injured E. coli or low numbers of E. coli in water samples is correlated with outbreaks of Salmonella, a foodborne pathogen causing 1.2 million illnesses and ~500 deaths per year in the United States34, which makes such detection an indirect indicator of contamination in irrigation water35. To evaluate the capabilities of our system to detect injured bacteria, we prepared and imaged 3 agar plates containing chlorine-stressed E. coli (see the “Methods” section) and characterized their growth using our detection workflow, as summarized in Fig. 4e. Our results indicate that we can detect colony formation for chlorine-stressed E. coli on average with an ~2 h delay compared to the regular E. coli strain.

Blind testing results on the classification of growing bacteria

In addition to providing significant detection time savings while also achieving very good sensitivity and precision for the early detection of bacterial growth, our method also provides the automated classification of the corresponding species of the detected bacteria using a trained neural network. Therefore, an additional advantage of our system is its capability to further classify the total coliform subspecies, which is not possible with traditional agar plate counting methods. For example, both K. pneumoniae and K. aerogenes colonies appear mauve in our agar plates. However, since our classification neural network does not rely solely on the byproducts of colorimetric reactions, it can successfully distinguish between different species based on their unique spatiotemporal growth signatures acquired by our platform at the microscale.

Figure 5 shows our blind testing results on species classification using the same experiments reported in the blinded early detection tests, containing 965 colonies of 3 different species from 15 agar plates. In these results, if a colony was not detected in the previous step (i.e., a false negative event compared to the 24 h reading), then it was naturally not sent to the classification neural network. We defined the recovery rate as the number of colonies correctly classified into their corresponding species using our system divided by the total number of colonies counted after 24 h. As the classification of each individual colony is an independent event, we calculated the recovery rate for each bacterial species (reported in Fig. 5a–c) using all of the colonies detected in the previous step, i.e., 336, 280, and 339 colonies of E. coli, K. aerogenes, and K. pneumoniae, respectively. The shaded area in each curve represents the highest and lowest recovery rates found in all the corresponding experiments at each time point. The classification neural network correctly classified ~80% of all of the colonies within ~7.6, ~8, and ~12 h for K. pneumoniae, E. coli, and K. aerogenes, respectively. We once again emphasize that the results presented in Fig. 5a–c represent the lower limits of the classification capabilities of our system since ground truth is acquired after 24 h of incubation. In reality, at various earlier time points within the incubation period, there was no growth for certain regions of the plates, which exhibited significantly delayed growth. To further demonstrate the classification performance of our trained neural network in a manner that is decoupled from the sensitivity of the previous detection network, we report the classification confusion matrix in Fig. 5d for all the colonies that were sent to the classification network for blind testing at 12 h after the start of the incubation. 
The trained network achieved classification accuracies of ~97.2%, ~84.0%, and ~98.5% for E. coli, K. aerogenes, and K. pneumoniae, respectively.
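The difference between the recovery rate (Fig. 5a–c) and the per-species classification accuracy read from the confusion matrix (Fig. 5d) can be sketched as below. The function names and all counts are hypothetical and for illustration only; the confusion matrix is assumed to have one row per true species.

```python
import numpy as np


def recovery_rate(n_correct, n_ground_truth):
    """Correctly classified colonies divided by the 24 h ground-truth count."""
    return n_correct / n_ground_truth


def per_class_accuracy(confusion):
    """Diagonal of the row-normalized confusion matrix (rows = true species)."""
    confusion = np.asarray(confusion, dtype=float)
    return np.diag(confusion) / confusion.sum(axis=1)


# Made-up counts for illustration only; these are not the values in Fig. 5d.
confusion = [[90, 5, 5],
             [10, 80, 10],
             [0, 2, 98]]
accuracy = per_class_accuracy(confusion)

# A hypothetical species with 100 colonies at 24 h, of which 90 were detected
# (and therefore sent to the classifier) and 81 were assigned the right species.
recovery = recovery_rate(n_correct=81, n_ground_truth=100)
classifier_only = 81 / 90
print(accuracy, recovery, classifier_only)
```

In this example the recovery rate (0.81) is lower than the classifier-only accuracy (0.90) because undetected colonies count against the former but never reach the classifier, which is why the confusion matrix is reported separately.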

Fig. 5: Classification analysis.

Classification performance of our trained neural network for a K. pneumoniae, b E. coli, and c K. aerogenes colonies. The green shaded area in each curve represents the highest and lowest recovery rates found in all the corresponding experiments at each time point. d The blind testing confusion matrix of classifying all the colonies that were sent to our trained neural network after 12 h of incubation. A diagonal entry of 1.0 means a 100% classification accuracy for that species. The numbers of colonies that were tested by the classification network in d are 325 (E. coli), 334 (K. pneumoniae), and 256 (K. aerogenes)

Limit of detection as a function of the total test time

We further quantified the detection limit of our system and compared its performance against both Colilert®-18, which is an EPA-approved method, and traditional plate counting (Supplementary Table S1, Supplementary Fig. S3). To compensate for the CFU loss during the sample transfer from the water suspension to the filter membrane, we introduced a signal amplification step by preincubating the water sample under test, mixing it with a growth medium for 5 h at 35 °C before the filtration step (see the “Methods” section for details). For each measurement, two agar plates were prepared and monitored at the same time for comparison, one of which was for the sample amplified with a 5-h preincubation step before filtering, while the other was for the sample directly filtered and transferred to the agar plate (see Supplementary Fig. S3). Both plates were incubated for the same amount of time at each imaging time point to provide a fair comparison between the two. The measurements were repeated using different concentrations of E. coli suspensions; these concentrations were compared to the average of three replicates of the same samples prepared using the Colilert®-18 method (Supplementary Fig. S3). As shown in Fig. 6a, our system is able to surpass the sensitivity of Colilert®-18 within ~8 h in total (including the time for signal amplification, sample concentration, and time-lapse imaging, altogether) and reach >2 times the sensitivity of Colilert®-18 in ~9 h. We also quantified the LOD of our system by preparing and imaging 3 agar plates without bacteria, which show on average <1 CFU count from our setup throughout the test period from 5 to 14.5 h (Fig. 6c), revealing a detection limit of µ + 3σ = ~2 CFU per test, where µ and σ refer to the mean and standard deviation of the detected CFU count, respectively. 
Due to the effective signal amplification enabled by the preincubation step, even with the lowest bacterial concentration of ~1 CFU/L, our system was able to detect 2 CFU at 8.5 h and 12 CFU at 9 h; in comparison, for the same contaminated water sample, Colilert®-18 achieved 1.4 ± 1.6 CFU/L after 18 h of incubation. Furthermore, for all the concentrations that we tested (~1–160 CFU/L), our system successfully detected more than 2 CFU per test in ≤9 h of test time, including all the necessary steps, i.e., the time for signal amplification, sample concentration, and time-lapse imaging; these results reveal that our system with a preincubation step achieves a detection limit of ~1 CFU/L within ≤9 h of total test time.
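The µ + 3σ criterion used above for the blank plates can be sketched as follows. The blank counts are made up for illustration and are not the measurements behind Fig. 6c; the use of the sample standard deviation is also an assumption, since the text does not state which estimator was used.

```python
import math
import statistics


def detection_limit(blank_counts):
    """Return mu + 3*sigma of the CFU counts reported for bacteria-free plates."""
    mu = statistics.mean(blank_counts)
    sigma = statistics.stdev(blank_counts)  # sample standard deviation (assumed)
    return mu + 3 * sigma


# Hypothetical false-positive counts from blank plates at several time points.
blank_counts = [0, 1, 0, 0, 1, 0]

lod = detection_limit(blank_counts)
lod_cfu = math.ceil(lod)  # round up to a whole number of colonies
print(lod, lod_cfu)
```

With these illustrative values the mean is below 1 CFU and the threshold rounds up to 2 CFU per test, the same form as the criterion quoted in the text: a plate is called positive only when the count exceeds what blank plates produce by chance.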

Fig. 6: Quantification of the LOD of our system.

a The CFU count from our system is plotted against the CFU/L counts of the spiked samples, calculated independently using the Colilert®-18 method after 18 h of incubation. CFU counts acquired with our platform at different time points are coloured from blue to yellow, which corresponds to 5–14.5 h of total test time, including the signal amplification step that involves liquid culture media (5 h). b Without signal amplification, the LOD is decreased due to the low transfer rate from the filter membrane to the agar surface (see Supplementary Figs. S3 and S4). c As a control experiment, we prepared and imaged 3 agar plates that showed <1 CFU count from our setup throughout the test period from 5 to 14.5 h. d The LOD of our system is ~11 CFU/L at 8.5 h and ~1 CFU/L at ≤9 h

We also observe in Fig. 6b that without the signal amplification enabled by preincubation, the detection performance is negatively affected due to the low transfer rate of bacteria from the container to the agar plate (also see Supplementary Fig. S4). In general, the sensitivity and LOD of our method might be further improved by increasing the preincubation time of the water-broth mixture at the cost of an increase in the total time to achieve automated detection and classification.


We demonstrated a new platform for the early detection and classification of bacterial colonies, which is fully compatible with the existing EPA-approved methods and can be integrated with them to considerably improve the analysis of agar plates36. The presented approach can automatically detect bacterial growth as early as 3 h and can detect 90% of bacterial colonies within 7–10 h (and >95% within 12 h), with a precision of 99.2–100%. The system also correctly classifies ~80% of all of the tested bacterial colonies within 7.6, 8.8, and 12 h for K. pneumoniae, E. coli, and K. aerogenes, respectively. These results present a total time savings of more than 12 h compared to the gold-standard methods (e.g., Colilert test and Standard Method 9222B), which require 18–24 h. The presented learning-based bacteria detection and classification framework can potentially be further advanced by training it with a larger number of sample types20, and it can also be applied to other bacteria sensing applications beyond water quality monitoring. In addition to the automated detection of live bacteria and species classification, the rich spatiotemporal information embedded in the holographic images can be used for more advanced analysis of water samples and microbiology research in general.

Another advantage of this system is its high-throughput imaging capability of agar plates. Our prototype performs a 242-tile scan within 87 s per agar plate, corresponding to a raw image scanning throughput of ~49 cm2/min. To leave sufficient data redundancy for image postprocessing, we set a relatively large overlap of 30% on each side of the acquired holographic image, which reduces the effective imaging throughput of our platform to ~24 cm2/min. As our system is based on lens-free holographic microscopy, it does not require mechanical axial focusing at each position and instead autofocuses onto the object plane computationally. We characterized the spatial resolution of our system by imaging a resolution test target, as shown in Supplementary Fig. S5, achieving a linewidth resolution of ~3.5 µm, roughly equivalent to the performance of a 4× objective lens with a numerical aperture (NA) of ~0.1. Compared to our system, which takes 87 s to scan an agar plate, a traditional lens-based bright-field microscope using a 4× objective lens would take approximately 128 min to scan a plate with the same diameter (60 mm), owing to the requirement for mechanical axial focusing (see Supplementary Table S2). In addition, the holographic imaging that is at the heart of this system provides better performance for early colony detection over bright-field imaging. Since bacteria can be considered phase objects, growth-related changes in a holographic image are enhanced compared to the bright-field images, enabling the earlier detection of bacterial growth and more sensitive measurements (see Fig. 3b).
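The throughput figures above can be cross-checked with a short calculation. The sketch assumes that a 30% overlap on each side leaves a unique fraction of (1 − 0.3)² per tile; this interpretation is ours, although it reproduces the quoted ~24 cm2/min. The per-tile area is derived here and is not a number stated in the text.

```python
import math

PLATE_DIAMETER_CM = 6.0          # 60-mm-diameter agar plate (from the text)
SCAN_TIME_S = 87.0               # scan time per plate (from the text)
N_TILES = 242                    # tiles per plate scan (from the text)
RAW_THROUGHPUT_CM2_MIN = 49.0    # raw scanning throughput (from the text)
OVERLAP = 0.30                   # overlap on each side (from the text)

# Area of one plate; the text rounds pi to 3.14 and reports 28.26 cm2.
plate_area_cm2 = math.pi * (PLATE_DIAMETER_CM / 2) ** 2

# Raw imaged area in one 87 s scan, and the implied area of a single tile.
raw_area_per_scan_cm2 = RAW_THROUGHPUT_CM2_MIN * SCAN_TIME_S / 60.0
tile_area_cm2 = raw_area_per_scan_cm2 / N_TILES

# Assumed model: overlap removes 30% along each axis, leaving 0.7 * 0.7 = 49%.
effective_throughput_cm2_min = RAW_THROUGHPUT_CM2_MIN * (1.0 - OVERLAP) ** 2

print(round(plate_area_cm2, 2), round(tile_area_cm2, 3),
      round(effective_throughput_cm2_min, 1))
```

Under this assumption, roughly half of the raw pixels are redundant, which is the cost of leaving enough overlap for stitching the tiles in postprocessing.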

Another important advantage of our system is the minimum requirement for optical alignment; the presented platform is tolerant towards structural changes, such as variations in the sample-to-sensor distance or the illumination angle. Our computational refocusing capability also enables the screening of thick samples, e.g., melted agar plates37. An example of a 3D sample is illustrated in Supplementary Fig. S6, where E. coli colonies are formed at different depths inside the solid culture medium with a thickness of ~5 mm. For example, the colony marked with “A” grew at ~2170 µm measured from the surface of the agar, whereas the colony marked with “B” was on the agar surface. Our system localizes colonies growing at different depths within a 3D culture medium using a single hologram measurement at each scanning position. However, it is a nontrivial task to image a 3D sample using a conventional lens-based microscope because of the time required for mechanical focusing and the refractive index mismatch between the culture medium and the air, which degrades the image resolution as a result of aberrations. Therefore, the corresponding bright-field microscopy images of the whole plates could only be acquired after 24 h of incubation.

Our platform also employs a modular design that is scalable to a larger sample size and a smaller tile-scan time interval. The monitoring field of view (FOV) of this platform is fundamentally limited by the image acquisition time and the stage moving speed. With further optimization of the hardware and control algorithms, an imaging throughput of >50 cm2/min can be reached. Alternatively, several image sensors can be installed and connected to a single computer for high-throughput parallel imaging38. In our proof-of-concept implementation, the image processing for each time interval takes ~20 min and fits well into the 30 min measurement period between scans. If a shorter time interval is desired, the image processing procedure, which is implemented in the MATLAB and Python/PyTorch programming environments, can be further accelerated by programming in C/C++. With the help of graphics processing units (GPUs), one can expect >10-fold time savings in computation39.

This unique platform is integrated with an incubator to keep the agar plates at a desired temperature. The incubator is a thermal glass plate that contains uniform lines of optically clear indium tin oxide electrode for heating the sample placed on top. The incubator is driven by a lightweight controller. Throughout the experiments, we set the temperature at the agar surface, where the bacteria grew, to ~38 °C so that all of the tested bacterial species could grow and develop colonies. This temperature was not optimized to promote the growth of a specific species. Therefore, the adjustment of the incubation environment, temperature and humidity can potentially be used to further accelerate colony growth and help us achieve earlier detection and identification of specific bacterial colonies. Another important parameter for the growth of microorganisms is the humidity. Our system can also be integrated with a controlled humidity chamber for better control and analysis of the growth dynamics of various microorganisms40.

In summary, we presented a deep learning-based live bacteria monitoring system for the early detection of growing colonies and the classification of colony species. We demonstrated a proof-of-concept device using 3 types of bacteria, i.e., E. coli, K. aerogenes, and K. pneumoniae, and achieved >12 h time savings for both the early detection and the classification of growing species compared to the gold-standard EPA-approved methods. With an LOD of ~1 CFU/L achieved in ≤9 h, we believe that this versatile system will not only benefit water and food quality monitoring but also provide a powerful tool for microbiology research.

Materials and methods

Sample preparation

Safety practices

We handled all the bacterial cultures and performed all the experiments at our Biosafety Level 2 laboratory in accordance with the environmental, health, and safety rules of the University of California, Los Angeles.

Studied organisms

We used E. coli (Migula) Castellani and Chalmers (ATCC® 25922™) (risk level 1), K. aerogenes Tindall et al. (ATCC® 49701™) (risk level 1), and K. pneumoniae subsp. pneumoniae (Schroeter) Trevisan (ATCC® 13883™) (risk level 2) as our culture organisms.

Preparation of the poured agar plates

We used CHROMagar™ ECC (product no. EF322, DRG International, Inc., Springfield, NJ, USA) chromogenic substrate mixture as the solid growth medium for the detection of E. coli and total coliform colonies. CHROMagar™ ECC (8.2 g) was mixed with 250 mL of reagent grade water (product no. 23-249-581, Fisher Scientific, Hampton, NH, USA) using a magnetic stirrer bar. The mixture was then heated to 100 °C on a hot plate while being stirred regularly. After cooling the mixture to ~50 °C, 10 mL of the mixture was dispensed into Petri dishes (60 mm × 15 mm) (product no. FB0875713A, Fisher Scientific, Hampton, NH, USA). The agar plates were allowed to solidify, were sealed using parafilm (product no. 13-374-16, Fisher Scientific, Hampton, NH, USA), and were covered with aluminium foil to keep them in the dark before use. The plates were stored at 4 °C and were used within two weeks of preparation.

Preparation of the melted agar plates

CHROMagar™ ECC (3.28 g) was mixed with 100 mL of reagent grade water using a magnetic stirrer bar, and the mixture was heated to 100 °C. After the mixture cooled to ~40 °C, 1 mL of the bacterial suspension was mixed with the agar and dispensed into Petri dishes. The plates were incubated either in a benchtop incubator (product no. 51030400, ThermoFisher Scientific, Waltham, MA, USA) or in our imaging platform (for monitoring the bacterial growth digitally).

We used tryptic soy agar to culture E. coli at 37 °C and K. aerogenes at 35 °C and nutrient agar to culture K. pneumoniae at 37 °C. Twenty grams of tryptic soy agar (product no. DF0369-17-6, Fisher Scientific, Hampton, NH, USA) or 11.5 g of nutrient agar (product no. DF0001-17-0, Fisher Scientific, Hampton, NH, USA) were suspended in 500 mL of reagent grade water using a magnetic stirrer bar. The mixture was boiled on a hot plate and then autoclaved at 121 °C for 15 min. After the mixture cooled to ~50 °C, 15 mL of the mixture was dispensed into Petri dishes (100 mm × 15 mm) (product no. FB0875713, Fisher Scientific, Hampton, NH, USA), which were then sealed with parafilm and covered with aluminium foil to keep them in the dark before use. The Petri dishes were stored at 4 °C until use.

Preparation of the chlorine-stressed E. coli samples

We used E. coli grown on tryptic soy agar plates and incubated for 48 h at 37 °C. Disposable centrifuge tubes (50 mL) were used as sample containers, and the sample size was 50 mL. Five hundred millilitres of reagent grade water was filtered for sterilization using a disposable vacuum filtration unit (product no. FB12566504, Fisher Scientific, Hampton, NH, USA). A fresh chlorine suspension was prepared in a 50 mL disposable centrifuge tube to a final concentration of 0.2 mg/mL using sodium hypochlorite (product no. 425044, Sigma Aldrich, St. Louis, MO, USA), mixed vigorously, and covered with aluminium foil41. Sodium thiosulfate (10% [w/v]) (product no. 217263, Sigma Aldrich, St. Louis, MO, USA) in reagent grade water was prepared, and 1 mL of the solution was filtered using a sterile disposable syringe and a syringe filter membrane (product no. SLGV004SL, Fisher Scientific, Hampton, NH, USA) for sterilization. Water suspensions were prepared by spiking E. coli into filtered water samples. Fifty microlitres of the chlorine suspension was added to the test water sample (giving a final chlorine concentration of 0.2 ppm), and the chlorine exposure time was tracked with a timer. The chlorination reaction was stopped after 10 min of exposure by adding 50 µL of the sodium thiosulfate solution to the test water sample and vigorously mixing. CHROMagar™ ECC plates were inoculated with 200 µL of the chlorine-stressed suspension, dried in the biosafety cabinet for at most 30 min, and then placed on the setup for lens-free imaging. In addition, three TSA plates and one ECC ChromoSelect Selective Agar plate (product no. 85927, Sigma Aldrich, St. Louis, MO, USA) were inoculated with 1 mL of the control sample (not exposed to chlorine) and of the chlorine-stressed (0.2 ppm) E. coli water sample and dried under a biosafety cabinet for approximately 1–2 h with occasional gentle mixing of the Petri dishes.
After drying, the plates were sealed with parafilm and incubated at 37 °C for 24 h. After incubation, the bacterial colonies grown on the agar plates were counted, and the E. coli concentrations of the control samples and chlorine-stressed E. coli samples were compared. If the achieved reduction in colony count was between 2.0 and 4.0 log, then the images of CHROMagar™ ECC plates captured using the lens-free imaging platform were used for further analysis.
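The dosing arithmetic and the 2.0–4.0 log acceptance rule described above can be checked with a short calculation. The following Python sketch is only an illustration: the function names and the colony counts in the example are hypothetical and are not taken from the experiments, and 1 mg/L is treated as 1 ppm, which is the usual approximation for dilute aqueous solutions.

```python
import math

def final_concentration_ppm(stock_mg_per_ml, spike_ul, sample_ml):
    """Chlorine concentration (mg/L, i.e. ~ppm) after spiking the stock into the sample."""
    mass_mg = stock_mg_per_ml * spike_ul / 1000.0          # spike volume converted from µL to mL
    volume_l = (sample_ml + spike_ul / 1000.0) / 1000.0    # total volume converted from mL to L
    return mass_mg / volume_l

def log_reduction(control_cfu, stressed_cfu):
    """log10 reduction of the colony count caused by the chlorine exposure."""
    return math.log10(control_cfu / stressed_cfu)

def accept_plate(control_cfu, stressed_cfu, low=2.0, high=4.0):
    """True if the reduction falls inside the 2.0-4.0 log window used for further analysis."""
    return low <= log_reduction(control_cfu, stressed_cfu) <= high

# 50 µL of a 0.2 mg/mL stock added to a 50 mL sample gives roughly 0.2 ppm.
dose_ppm = final_concentration_ppm(0.2, 50.0, 50.0)

# Hypothetical counts: 1e5 CFU on the control plates vs 1e2 CFU after chlorination (3 log).
keep = accept_plate(1e5, 1e2)
print(round(dose_ppm, 3), keep)
```

Under these assumptions, the 50 µL spike reproduces the stated 0.2 ppm dose to within the small volume change caused by the spike itself.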

Preparation of the culture plates for lens-free imaging

A bacterial suspension in phosphate-buffered saline (PBS) (product no. 20-012-027, Fisher Scientific, Hampton, NH, USA) was prepared every day from a solid agar plate incubated for 24 h. The concentration of the suspension was measured using a spectrophotometer (model no. ND-ONE-W, Thermo Fisher), and the suspension was then diluted in PBS to a final concentration of 1–200 CFU per 0.1 mL. One hundred microlitres of the diluted suspension was spread on a CHROMagar™ ECC plate using an L-shaped spreader (product no. 14-665-230, Fisher Scientific, Hampton, NH, USA). The plate was covered with its lid, inverted, and incubated at 37 °C in our optical platform (Fig. 2).

Preparation of a concentrated broth

A total of 180 g of tryptic soy broth (product no. R455054, Fisher Scientific, Hampton, NH, USA) was added to 1 L of reagent grade water and heated to 100 °C with continuous mixing using a stirrer bar. The suspension was then cooled to 50 °C and filter sterilized using a disposable filtration unit. The broth concentrate was stored at 4 °C and used within 1 week after preparation.
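As a quick check of why the broth is prepared as a concentrate, the dilution that occurs when 0.2 L of it is later added to a 1 L water sample (see the comparison measurements) can be computed directly. The helper name below is hypothetical, and the calculation assumes that the 180 g is dissolved to a volume of approximately 1 L.

```python
def diluted_concentration(stock_g_per_l, stock_volume_l, water_volume_l):
    """Concentration after mixing a stock solution into a volume of water."""
    return stock_g_per_l * stock_volume_l / (stock_volume_l + water_volume_l)

# 0.2 L of the 180 g/L concentrate mixed into a 1 L water sample (1.2 L in total).
broth_g_per_l = diluted_concentration(180.0, 0.2, 1.0)
print(f"{broth_g_per_l:.1f} g/L")
```

The result is about 30 g/L in the 1.2 L sample, i.e., a six-fold dilution of the concentrate. As general background that is not stated in the source, this is the strength at which tryptic soy broth is commonly used.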

Preparation of samples for comparison measurements

We evaluated the performance of our method in comparison to Colilert®-18, which is an EPA-approved enzyme-based analytical method for several types of regulated water samples (e.g., drinking water, surface water, and ground water) to detect E. coli42 and for plate counting using TSA plates and ECC ChromoSelect Selective Agar plates (Supplementary Fig. S3). Two bottles of 1 L reagent grade water were filtered using disposable vacuum filtration units and 0.2 L of the concentrated broth was added into one of the 1 L sample bottles. The bottles were covered with aluminium foil and stored in a biosafety cabinet overnight. A glass vacuum filtration unit was used for the filtration of the 1 L water samples. The components of the unit were covered with aluminium foil and sterilized using an autoclave. The disposable nitrocellulose filter membranes (product no. HAWG04705, EMD Millipore, Danvers, MA, USA) used in the glass filtration unit were also sterilized using the autoclave. A bacterial suspension was prepared by spiking bacteria into 50 mL reagent grade water using a disposable inoculation loop from a TSA plate containing E. coli colonies. The suspension was mixed gently to obtain a uniform distribution of bacteria. Three TSA plates, 3 ECC ChromoSelect Selective Agar plates, and 4 CHROMagar™ ECC plates were removed from the refrigerator and were kept at room temperature for 30 min.

Three 120 mL disposable vessels with sodium thiosulfate (product no. WV120SBST-200, IDEXX Laboratories Inc., Westbrook, ME, USA) were each filled with 100 mL of filter-sterilized reagent grade water. A 0.1 mL aliquot of the bacterial suspension was then spiked, sequentially, into each of the following: a 1 L water sample, a 1.2 L water sample (1 L water + 0.2 L concentrated broth), 3 bottles of 100 mL water samples, 3 TSA plates and 3 ECC ChromoSelect Selective Agar plates. The timer was started immediately after adding the spike into the suspensions.

First, the suspensions on the TSA plates and ECC ChromoSelect Selective Agar plates were spread using L-shaped disposable spreaders. Then, the water sample with broth was mixed for approximately one minute and stored at 35 °C for 5 h. One Colilert®-18 reagent (product no. 98-27164-00, IDEXX Laboratories Inc., Westbrook, ME, USA) was added into each 100 mL bacterial suspension, and the mixture was shaken. The content of each bottle was poured into a Quanti-Tray 2000 bag (product no. 98-21675-00, IDEXX Laboratories Inc., Westbrook, ME, USA), and after removing bubbles in each well, the bag was sealed using a Quanti-Tray Sealer (product no. 98-09462-01, IDEXX Laboratories Inc., Westbrook, ME, USA). The three sealed bags, labelled with the experimental details, were incubated at 35 °C for 18 h. Next, 30 mL of filtered reagent grade water was used to moisten the membrane in the glass filtration unit, and then an E. coli-contaminated 1 L water sample was filtered at a pressure of 50 kPa. The bottle was rinsed using 150 mL of sterilized reagent grade water, and the solution was filtered on the unit (Supplementary Fig. S7). The funnel was rinsed twice using 50 mL of sterilized reagent grade water. After the filtration was complete, the membrane was removed and placed face down onto a CHROMagar™ ECC plate. Gentle pressure was applied on the membrane using tweezers to remove any air bubbles between the agar and the membrane. Then, a 30 g weight was placed on the membrane to provide continuous pressure during the transfer of bacteria from the membrane to the agar plate (Supplementary Fig. S8). After 5 min of incubation, the membrane was gently peeled off from the agar surface and placed face up onto another agar plate. The agar plate containing the membrane was incubated in the benchtop incubator at 35 °C, and the agar plate containing the transferred bacteria was incubated on the lens-free imaging platform for time-lapse imaging.
After 5 h of incubation, the bottle containing the 1.2 L suspension was filtered using the same procedure as described above for the filtration of the 1 L sample. The agar plate containing the transferred bacteria was incubated on the second sample tray of the lens-free imaging setup for time-lapse imaging, while the agar plate containing the membrane was incubated in the benchtop incubator.

Design of the high-throughput time-resolved microorganism monitoring platform

Our platform consists of five modules: (1) a holographic imaging system, (2) a mechanical translational system, (3) an incubation unit, (4) a control circuit, and (5) a controlling program. Each module is explained in detail below.

  1. Holographic imaging system

    We used fibre-coupled partially coherent laser illumination (SC400-4, Fianium Ltd., Southampton, UK), with the wavelength and intensity controlled through an acousto-optic tunable filter (AOTF) device (Fianium Ltd., Southampton, UK). The device was remotely controlled with a customized program written in the C++ programming language and run on a controlling laptop computer (product no. EON17-SLX, Origin PC). The laser light was transmitted through the sample, i.e., the agar plate that contains the bacterial colonies, and formed an inline hologram on a CMOS image sensor (product no. acA3800-14um, Basler AG, Ahrensburg, Germany) with a pixel size of 1.67 μm and an active area of 6.4 mm × 4.6 mm. The CMOS image sensor was connected to the same controlling laptop computer through a universal serial bus (USB) 3.0 interface and was software-triggered within the same C++ program. The exposure time at each scanning position was precalibrated according to the intensity distribution of the illumination light and ranged from 4 to 167 ms. The images were saved as 8-bit bitmap files for further processing.

  2. Mechanical translational system

    The mechanical stage was customized with a pair of linear translation rails (Accumini 2AD10AAAHL, Thomson, Radford, VA, USA), a pair of linear bearing rods (8 mm-diameter, generic), and linear bearings (LM8UU, generic), with 3D-printed parts (Objet30 Pro, Stratasys, Minnesota, USA) used for the joints and housing. The 2D horizontal movement was powered by two stepper motors (product no. 1124090, Kysan Electronics, San Jose, CA, USA), one for each direction, and these motors were individually controlled using stepper motor controller chips (DRV8834, Pololu, Las Vegas, NV, USA). To minimize the backlash effect, the whole Petri dish was scanned following a raster scan pattern.

  3. Incubation unit

    The incubation unit was built with the top heating plate of a microscope incubator (INUBTFP-WSKM-F1, Tokai Hit, Shizuoka, Japan), and it was housed in a 3D-printed frame. The Petri dish containing the sample was placed on the heating plate with the surface having bacteria facing downwards. The temperature was controlled by a paired controller that maintained a temperature of 47 °C on the heating plate, resulting in a temperature of 38 °C inside the Petri dish.

  4. Control circuit

    The control circuit consisted of three components: a microcontroller (Arduino Micro, Arduino LLC) communicating with the computer through a USB 2.0 interface, two stepper motor driver chips (DRV8834, Pololu, Las Vegas, NV, USA) externally powered by a 4.2 V constant voltage power supply (GPS-3303, GW Instek, Montclair, CA, USA), and a metal–oxide–semiconductor field-effect transistor-based digital switch (SUP75P03-07, Vishay Siliconix, Shelton, CT, USA) for controlling the CMOS sensor connection.

  5. Controlling program

    The controlling program included a graphical user interface and was developed using the C++ programming language. External libraries including Qt (v5.9.3), AOTF (Gooch & Housego), and Pylon (v5.0.11) were integrated.
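To illustrate the raster scan used by the mechanical stage, the following Python sketch generates tile positions in which every row is traversed in the same direction, so that each position is approached from the same side and direction reversals during imaging are avoided. It is not the authors' C++ control code. The function name is hypothetical, the 60 mm square scan area is an assumed bounding box for the 60 mm Petri dish, and the step size assumes that the ~30% tile overlap mentioned in the image stitching subsection means a step of 70% of the sensor's 6.4 mm × 4.6 mm active area.

```python
import math

def raster_positions(width_mm, height_mm, tile_w_mm, tile_h_mm, overlap=0.3):
    """Top-left tile positions (x, y) in mm; every row is scanned from x = 0 upwards."""
    step_x = tile_w_mm * (1.0 - overlap)
    step_y = tile_h_mm * (1.0 - overlap)
    n_cols = math.ceil((width_mm - tile_w_mm) / step_x) + 1
    n_rows = math.ceil((height_mm - tile_h_mm) / step_y) + 1
    positions = []
    for row in range(n_rows):
        # Unlike a serpentine scan, the x axis returns to 0 before each new row,
        # so all imaging moves along x are made in the same direction.
        for col in range(n_cols):
            positions.append((col * step_x, row * step_y))
    return positions

tiles = raster_positions(60.0, 60.0, 6.4, 4.6, overlap=0.3)
print(len(tiles), "tile positions; first three:", tiles[:3])
```

The tile count produced by this sketch depends entirely on the assumed overlap and scan area and should not be read as the number of tiles acquired in the experiments.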

Data acquisition

We prepared inoculated agar plates of pure bacterial colonies (see the Sample Preparation subsection under the “Methods” for details) and captured images of an entire agar plate at 30-min intervals. The illumination light was set to a wavelength of 532 nm and an intensity of ~400 μW. To maximize the image acquisition speed, the captured images were first saved into a computer memory buffer and then were written to a hard disk by another independent thread. At the end of each experiment (i.e., after 24 h of incubation), the sample plate was imaged using a benchtop scanning microscope (Olympus IX83) in reflection mode, and the resulting images were automatically stitched to a full-FOV image, used for comparison. Subsequently, the plate was disposed of as solid biohazardous waste. We populated the data (i.e., time-lapse lens-free images) corresponding to ~6969 E. coli, ~2613 K. aerogenes, and ~6727 K. pneumoniae individual bacterial colonies to train and validate our models. Another 965 colonies of 3 different species from 15 independent agar plates were used to blindly test our machine learning models.
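The buffering strategy described above, in which acquisition only appends frames to memory while an independent thread writes them out, follows a standard producer-consumer pattern. The sketch below illustrates it in Python with the standard library; the actual acquisition software was written in C++, and the `grab` callable, the file naming, and the in-memory dictionary standing in for the hard disk are all hypothetical.

```python
import queue
import threading

def writer(buffer, sink):
    """Consumer thread: drain the memory buffer and 'write' each frame (here: into a dict)."""
    while True:
        item = buffer.get()
        if item is None:          # sentinel: acquisition has finished
            break
        name, data = item
        sink[name] = data         # stand-in for a slow write to disk

def acquire(n_tiles, grab):
    """Producer: grab tiles as fast as possible while the writer thread saves them."""
    buffer = queue.Queue()        # unbounded in-memory buffer
    saved = {}
    thread = threading.Thread(target=writer, args=(buffer, saved))
    thread.start()
    for i in range(n_tiles):
        buffer.put((f"tile_{i:04d}.bmp", grab(i)))   # returns immediately; no disk wait
    buffer.put(None)
    thread.join()                 # wait until every buffered frame has been saved
    return saved

frames = acquire(5, lambda i: bytes([i]) * 4)   # dummy 4-byte "images"
print(sorted(frames))
```

Because the acquisition loop never waits for storage, the scan speed is set by the exposure and stage motion rather than by the write speed, at the cost of holding unsaved frames in memory.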

Image processing and analysis

The acquired lens-free images were processed using custom-developed image processing and deep learning algorithms. Five major image processing steps were used for the early detection and automated classification and counting of colonies. These steps are described in detail below.

Image stitching to obtain the image of the entire plate area

Following the acquisition of holographic images using the multi-threading approach, all the images within a tile-scan of the whole Petri dish per wavelength were merged into a single full-FOV image. During a tile scan, the images were acquired with ~30% overlap on each side of the image to calculate the relative image shifts against each other. For each image, the relative shifts against all four of the neighbouring images were calculated using a phase correlation43 method, followed by an optimization step that minimized an objective function, as defined by

$$\arg \mathop {{\min }}\limits_{T_{VF}} {\sum\limits_{A \in V\backslash \{ F\}}}\left( {{\sum\limits_{B \in V\backslash \{ F\}}} {\left\| {\vec t_{AF} - \vec t_{BF} - \vec p_{AB}} \right\|^2}}\right)$$

where V is the set of all tile images, \(F \in V\) is a fixed image, e.g., the image captured at the centre of the sample Petri dish, \(\vec t_{AB}\) stands for the relative position of image A with respect to image B, and \(\vec p_{AB}\) is the local shift between images A and B, calculated by the phase correlation method using the overlapping regions of the two neighbouring images, which can be formulated as

$$\vec p_{AB} = \left( {\Delta x,\Delta y} \right) = \arg \mathop {{\max }}\limits_{(x,y)} {\cal{F}}^{ - 1}\left\{ {\frac{{{\cal{F}}\{ A\} \cdot {\cal{F}}\{ B\} ^ \ast }}{{\left| {{\cal{F}}\{ A\} \cdot {\cal{F}}\{ B\} ^ \ast } \right|}}} \right\}$$

where \({\cal{F}}\) is the Fourier transform operator and \({\cal{F}}^{ - 1}\) is the inverse Fourier transform operator. The optimal configuration \(T_{VF} = \left\{ {\vec t_{AF}:A,F \in V} \right\}\) represents the relative positions of all the images with respect to the fixed image F, and it was used as the global position of each tile image for full-FOV image stitching. To eliminate tiles with a low signal-to-noise ratio that lead to incorrect local shift estimation values, a correlation threshold of 0.3 was applied during the optimization, meaning that if the cross-correlation coefficient of the overlapped parts of two images was below 0.3, the shift calculation was discarded. Once the positions of all of the tiles were obtained, they were merged into a full-FOV image of the whole Petri dish using linear blending. We defined a full-FOV image of the whole Petri dish as a “frame”. All the frames were normalized so that the mean value was 50, and they were saved as unsigned 8-bit integer (0–255) arrays.
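The two stitching steps above, i.e., the pairwise shift estimation by phase correlation and the global placement of the tiles, can be sketched with NumPy as follows. This is a minimal illustration and not the authors' implementation. The function names are hypothetical, the shift is assumed to be an integer number of pixels, the sign convention (the shift that maps B onto A) is an assumption, and the global step is written as an ordinary least-squares problem in which the fixed tile is pinned at the origin and pairs rejected by the 0.3 correlation threshold are simply left out of `pair_shifts`.

```python
import numpy as np

def phase_correlation_shift(a, b):
    """Integer (dy, dx) such that a is approximately np.roll(b, (dy, dx), axis=(0, 1))."""
    cross = np.fft.fft2(a) * np.conj(np.fft.fft2(b))
    cross /= np.maximum(np.abs(cross), 1e-12)        # keep only the phase term
    corr = np.fft.ifft2(cross).real
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Peaks beyond half the image size correspond to negative shifts (FFT wrap-around).
    return tuple(int(p) if p <= s // 2 else int(p) - s for p, s in zip(peak, corr.shape))

def solve_tile_positions(n_tiles, pair_shifts, fixed=0):
    """Least-squares tile positions t_A from pairwise shifts p_AB ~ t_A - t_B, with t_fixed = 0."""
    free = [i for i in range(n_tiles) if i != fixed]
    column = {tile: j for j, tile in enumerate(free)}
    rows, rhs = [], []
    for (a, b), shift in pair_shifts.items():
        row = np.zeros(len(free))
        if a != fixed:
            row[column[a]] = 1.0
        if b != fixed:
            row[column[b]] = -1.0
        rows.append(row)
        rhs.append(shift)
    solution, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs, dtype=float), rcond=None)
    positions = np.zeros((n_tiles, 2))
    positions[free] = solution
    return positions

# Synthetic check: shift a random image by a known amount and recover the shift.
rng = np.random.default_rng(0)
tile_b = rng.random((64, 64))
tile_a = np.roll(tile_b, (5, -3), axis=(0, 1))
print(phase_correlation_shift(tile_a, tile_b))
```

In practice, only the overlapping strips of neighbouring tiles would be passed to the shift estimator, and redundant pairwise measurements let the least-squares step average out individual estimation errors.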

Colony candidate selection by differential analysis

When a new frame was acquired at time t, it was cross-registered to the previous frame at time t − 1 and then digitally back-propagated to the sample plane44,45 to obtain the complex light field

$$\widetilde B_t = {\mathrm{P}}(F_t,{\mathbf{z}})$$

where \(F_t\) is the frame at time t, \({\mathbf{z}}\) is a surface normal vector of the sample plane obtained by digital autofocusing46 at 50 randomly spaced positions, and \({\mathrm{P}}\) denotes the angular spectrum-based back-propagation operation44,45, which can be calculated by multiplying the spatial Fourier transform of the input signal and the following transfer function

$$H_k(\nu _x,\nu _y) = \left\{{\begin{array}{*{20}{c}} {\exp \left[ { - j \cdot 2\pi \frac{{n \cdot z}}{\lambda }\sqrt {1 - \left( {\frac{\lambda }{n}\nu _x} \right)^2 - \left( {\frac{\lambda }{n}\nu _y} \right)^2} } \right]} & {\left( {\nu _x^2 + \nu _y^2 \le \left( {\frac{n}{\lambda }} \right)^2} \right)} \\ 0 & {{\mathrm{otherwise}}} \end{array}}\right.$$

where n is the refractive index of the medium, λ is the illumination wavelength, and \(\nu _x\) and \(\nu _y\) are the spatial frequencies. This operation was followed by an inverse 2D Fourier transform. The resulting complex-valued reconstruction provides both the amplitude and phase images of the illuminated objects. To accommodate the large FOV of a stitched frame (36,000 × 36,000 pixels), digital back-propagation was performed with 2048 × 2048-pixel blocks, which were then merged together.
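The angular spectrum operation defined by the transfer function above can be written compactly with NumPy. The following is a minimal sketch for a single block and a single scalar propagation distance. The function name, the treatment of z as a signed scalar rather than a plane described by a surface normal, and the example distance of 500 µm are assumptions made for illustration; the wavelength and pixel size in the example are those given in the text.

```python
import numpy as np

def angular_spectrum_propagate(field, z, wavelength, pixel_size, n=1.0):
    """Propagate a complex field by a signed distance z using the transfer function above."""
    ny, nx = field.shape
    fx = np.fft.fftfreq(nx, d=pixel_size)            # spatial frequencies nu_x
    fy = np.fft.fftfreq(ny, d=pixel_size)            # spatial frequencies nu_y
    FX, FY = np.meshgrid(fx, fy)
    arg = 1.0 - (wavelength / n * FX) ** 2 - (wavelength / n * FY) ** 2
    propagating = arg >= 0                            # same as nu_x^2 + nu_y^2 <= (n / lambda)^2
    H = np.zeros(field.shape, dtype=complex)          # evanescent components are set to 0
    H[propagating] = np.exp(
        -1j * 2 * np.pi * n * z / wavelength * np.sqrt(arg[propagating])
    )
    # Multiply the spectrum by H, then apply the inverse 2D Fourier transform.
    return np.fft.ifft2(np.fft.fft2(field) * H)

# Example: propagate a random 64 x 64 complex block by 500 µm and back again.
rng = np.random.default_rng(1)
block = rng.standard_normal((64, 64)) + 1j * rng.standard_normal((64, 64))
there = angular_spectrum_propagate(block, 500e-6, 532e-9, 1.67e-6)
back = angular_spectrum_propagate(there, -500e-6, 532e-9, 1.67e-6)
amplitude, phase = np.abs(back), np.angle(back)      # the two channels used downstream
```

With a 1.67 µm pixel and 532 nm illumination, every sampled spatial frequency lies inside the propagating band, so propagating by z and then by −z returns the original block, which is a convenient sanity check for an implementation of this kind.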

Four consecutive frames were taken, i.e., from t − 3 to t, and a differential image was calculated as

$$D_t = {\mathrm{HP}}\left[ {{\mathrm{LP}}\left( {\frac{1}{3}\mathop {\sum}\limits_{\tau = t - 2}^t {\left| {\tilde B_\tau - \tilde B_{\tau - 1}} \right|} } \right)} \right]$$

where \(D_t\) is the differential image at time t, \(\tilde B_t\) represents the complex light field obtained by back-propagating frame t, and LP and HP represent low-pass and high-pass image filtering, respectively. The HP filter removes the differential signal from a slowly varying background (unwanted term), and the LP filter removes the high-frequency noise-introduced spatial patterns. The LP and HP filter kernels were empirically set to 5 and 100, respectively.

Following the differential image calculation, we selected regions in the differential image with >50 connected pixels that are above an intensity threshold, which was empirically set to 12. These regions are marked as colony candidates, as they give a differential signal over a period of time (covering four consecutive frames). However, some of the differential signals come from nonbacterial objects, such as a water bubble or surface movement of the agar itself. Therefore, we also used two DNNs to select the true candidates and classify their species.
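The candidate selection described above can be sketched as an averaged frame-to-frame difference followed by thresholding and a connected-region size filter. The Python sketch below is a simplified illustration, not the authors' code: the LP and HP filtering steps are omitted because the filter types are not specified in the text, 4-connectivity is an assumption, and the function names are hypothetical. The threshold of 12 and the >50 pixel criterion are the values given in the text.

```python
from collections import deque
import numpy as np

def differential_image(fields):
    """Mean absolute difference between consecutive complex fields (frames t-3 ... t)."""
    diffs = [np.abs(fields[i + 1] - fields[i]) for i in range(len(fields) - 1)]
    return sum(diffs) / len(diffs)            # LP/HP filtering intentionally left out

def candidate_regions(diff, intensity_threshold=12, min_pixels=50):
    """Pixel lists of 4-connected regions above the threshold with more than min_pixels pixels."""
    mask = diff > intensity_threshold
    visited = np.zeros(mask.shape, dtype=bool)
    regions = []
    for seed in zip(*np.nonzero(mask)):
        if visited[seed]:
            continue
        visited[seed] = True
        frontier, pixels = deque([seed]), []
        while frontier:                        # breadth-first flood fill from the seed
            y, x = frontier.popleft()
            pixels.append((int(y), int(x)))
            for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                yy, xx = y + dy, x + dx
                inside = 0 <= yy < mask.shape[0] and 0 <= xx < mask.shape[1]
                if inside and mask[yy, xx] and not visited[yy, xx]:
                    visited[yy, xx] = True
                    frontier.append((yy, xx))
        if len(pixels) > min_pixels:
            regions.append(pixels)
    return regions

# Toy differential image: one 10 x 10 blob (kept) and one 3 x 3 blob (rejected).
toy = np.zeros((40, 40))
toy[5:15, 5:15] = 20.0
toy[30:33, 30:33] = 20.0
print(len(candidate_regions(toy)))
```

For frames of the size reported here, a library routine for connected-component labelling would be far faster than this pure-Python flood fill, which is written for clarity only.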

DNN-enabled detection of growing bacterial colonies

Following the colony candidate selection process outlined earlier, we cropped out candidate regions of 160 × 160 pixels (~267 µm × 267 µm) across the four back-propagated consecutive frames and separated the complex field into amplitude and phase channels. Therefore, each candidate region is represented by a 2 × 4 × 160 × 160 array. This four-dimensional (phase/amplitude–time–xy) data format differs from the traditional three-dimensional data used in image classification tasks and requires a custom-designed DNN architecture that accounts for the additional dimension of time. We designed our DNN by following the block diagram of DenseNet28 and replaced the 2D convolutional layers with P3D convolutional layers47, as shown in Supplementary Fig. S9. Our network was implemented in Python (v3.7.2) with the PyTorch library (v1.0.1). The network was randomly initialized and optimized using an adaptive moment estimation (Adam) optimizer48 with a starting learning rate of \(1 \times 10^{-4}\) and a batch size of 64. To stabilize the accuracy of the network model, we also set a learning rate scheduler that halved the learning rate every 20 epochs. Approximately 16,000 growing colonies and 43,000 non-colony objects captured from 71 agar plates were used in the training and validation phases. The network model with the highest validation accuracy was selected. Data augmentation was also applied by random 90° rotations and flipping operations in the spatial dimensions. The whole training process took ~5 h using a desktop computer with dual GPUs (GTX1080Ti, Nvidia). The decision threshold value after the softmax layer was set to 0.5 during training, i.e., positive for softmax value >0.5 and negative for softmax value <0.5, which implies equal penalty to false-positive and false-negative events. We adjusted the threshold value to 0.99, empirically based on the training dataset before blind testing, to favour fewer false-positive events.
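The data layout, spatial augmentation, and decision threshold described above can be illustrated without the network itself. In the NumPy sketch below, the function names and the ordering of the two channels (amplitude first, then phase) are assumptions; the array shape, the 90° rotation and flip augmentation, and the 0.99 threshold follow the text.

```python
import numpy as np

def candidate_array(crops):
    """Complex crops of shape (T, H, W) -> real array (2, T, H, W): amplitude, then phase."""
    crops = np.asarray(crops)
    return np.stack([np.abs(crops), np.angle(crops)], axis=0)

def augment(sample, rng):
    """Random multiple-of-90-degree rotation and random flip, applied to the spatial axes only."""
    k = int(rng.integers(0, 4))
    out = np.rot90(sample, k=k, axes=(2, 3))   # axes 2 and 3 are y and x
    if rng.random() < 0.5:
        out = np.flip(out, axis=3)
    return np.ascontiguousarray(out)

def is_colony(softmax_positive, threshold=0.99):
    """Blind-testing decision rule: accept only highly confident positive predictions."""
    return softmax_positive > threshold

rng = np.random.default_rng(0)
crops = rng.standard_normal((4, 160, 160)) + 1j * rng.standard_normal((4, 160, 160))
x = candidate_array(crops)
print(x.shape, augment(x, rng).shape, is_colony(0.995), is_colony(0.90))
```

The time axis is deliberately left untouched by the augmentation, since reordering the frames would destroy the growth signal that the spatio-temporal convolutions are meant to capture.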

DNN-enabled classification of the bacterial colony species

Once the true bacterial colonies are selected, they are allowed to grow for another 2 h so that 8 consecutive frames (i.e., 4 h of growth) are collected; these frames are then sent to the second DNN as a 2 × 8 × 288 × 288 array for the classification of colony species. For this classification task, the training data contain only the true colonies and their corresponding species (ground truth). The network follows a similar structure and training process as the detection model, as illustrated in Supplementary Fig. S9. The network was randomly initialized and optimized using the Adam optimizer48, with a starting learning rate of \(1 \times 10^{-4}\) and a batch size of 64. The learning rate was multiplied by 0.9 every 10 epochs. To avoid overfitting to a specific plate, we discarded colony images extracted from extremely dense samples (>1000 CFU per plate). As a result, approximately 9400 growing colonies were used in the training and validation of the classification model. The whole training process took ~15 h using a desktop computer with dual GPUs (GTX1080Ti, Nvidia).
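The two learning rate schedules, i.e., halving every 20 epochs for the detection network and multiplying by 0.9 every 10 epochs for the classification network, are both step-decay schedules. The short Python sketch below evaluates them directly; the function name is hypothetical. In PyTorch, this behaviour corresponds to `torch.optim.lr_scheduler.StepLR` with the matching `step_size` and `gamma`, although the text does not state which scheduler class was actually used.

```python
def step_decay(initial_lr, factor, step_epochs, epoch):
    """Learning rate at a given epoch when it is multiplied by `factor` every `step_epochs`."""
    return initial_lr * factor ** (epoch // step_epochs)

# Detection network: starts at 1e-4 and is halved every 20 epochs.
detection_lr = [step_decay(1e-4, 0.5, 20, e) for e in (0, 19, 20, 40)]

# Classification network: starts at 1e-4 and is multiplied by 0.9 every 10 epochs.
classification_lr = [step_decay(1e-4, 0.9, 10, e) for e in (0, 9, 10, 20)]

print(detection_lr)
print(classification_lr)
```

After 40 epochs, the detection schedule has dropped to a quarter of the starting rate, whereas the classification schedule decays much more gently and is still at 81% of the starting rate after 20 epochs.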

Colony counting

The respective ground truth information on the growing colonies in each experiment was created after the sample was incubated for >24 h. At the boundary of the plate, the agar always forms a curved surface owing to surface tension, thereby distorting the images of the colonies. Therefore, we limited the effective imaging area to a 50 mm-diameter circle in the centre of the agar plate. In cases where multiple colonies are closely spaced and eventually merge into one large colony (e.g., towards the end of the 24 h incubation period), we then used lens-free time-lapsed images to verify the true colony number when detected by our method to avoid overcounting.

Calculation of the imaging throughput

In Supplementary Table S2, we compared the imaging throughput of our system and a conventional lens-based scanning microscope in terms of the space-bandwidth product49 using the following formula:

$$N_{\mathrm{I}} = \alpha \cdot {\mathrm{FOV}} \cdot r^2/\delta ^2$$

where \(N_{\mathrm{I}}\) is the effective pixel count of a frame, δ is the half-pitch resolution, r is the digital sampling factor along the x and y directions, α = 2 represents the independent spatial information contained in the phase and amplitude images of the holographic reconstruction, and α = 1 represents the amplitude-only information contained in an image captured using the standard lens-based bright-field scanning microscope. In the lens-based microscope, we used a colour camera with a pixel size of 7.4 µm. Therefore, for a 4× objective lens, the image resolution is limited to ~3.7 µm, owing to the Nyquist sampling limit. Without loss of generality, we set r = 2 (ref. 50).
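The formula and the Nyquist argument above can be reproduced numerically. The following Python sketch uses hypothetical function names, and the FOV, resolution, and sampling factor in the comparison are placeholder values, since the actual entries are given in Supplementary Table S2 and are not repeated here. Only the 7.4 µm pixel size and the 4× magnification are taken from the text.

```python
def nyquist_limited_resolution_um(camera_pixel_um, magnification):
    """Smallest resolvable period: two effective pixels at the sample plane."""
    return 2.0 * camera_pixel_um / magnification

def effective_pixel_count(fov_um2, resolution_um, r, alpha):
    """N_I = alpha * FOV * r^2 / delta^2, with FOV in µm^2 and delta in µm."""
    return alpha * fov_um2 * r ** 2 / resolution_um ** 2

# 7.4 µm camera pixels behind a 4x objective give the ~3.7 µm limit quoted above.
limit_um = nyquist_limited_resolution_um(7.4, 4.0)

# Placeholder comparison: at equal FOV, resolution and sampling factor, a holographic
# reconstruction (alpha = 2) carries twice the pixel count of an amplitude-only image.
fov, delta, r = 1.0e8, 2.0, 2      # hypothetical values; r cancels in the ratio
ratio = effective_pixel_count(fov, delta, r, alpha=2) / effective_pixel_count(fov, delta, r, alpha=1)
print(round(limit_um, 2), ratio)
```

Because the pixel count scales with the FOV and with the inverse square of the resolution, the comparison between the two systems is dominated by those two quantities rather than by the factor α.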