Enabling reactive microscopy with MicroMator

Microscopy image analysis has recently made enormous progress, both in accuracy and in speed, thanks to machine learning methods and improved computational resources. This greatly facilitates the online adaptation of microscopy experimental plans using real-time information about the observed systems and their environments. Applications in which reactiveness is needed are multifarious. Here we report MicroMator, an open and flexible software for defining and driving reactive microscopy experiments. It provides a Python software environment and an extensible set of modules that greatly facilitate the definition of events with triggers and effects interacting with the experiment. We provide a pedagogic example performing dynamic adaptation of fluorescence illumination on bacteria, and demonstrate MicroMator's potential via two challenging case studies in yeast: single-cell control and single-cell recombination, both requiring real-time tracking and light targeting at the single-cell level.


Supplementary Note 1: Software Description
The software is designed around different modules, such as image analysis, microscope control, and a custom event system, which come together to let the user design and create interactive feedback loops. The other core features of MicroMator's modular design include a clear and flexible API, a set of reactive image analysis modules, a microscope module that uses Micro-Manager's generic and open-source framework via its Python library, pymmcore, and an extensive logging and metadata management system.
MicroMator is open source and can be adapted to the user's needs. The main interface point is the event system, which is described in more detail below. Users may also design their own modules to perform more complex tasks (see our model predictive control module, Stoched, as an example), though integrating such modules requires a more complete understanding of how MicroMator works. Here we present brief descriptions of the software itself, MicroMator Core, along with the modules we wrote to perform the experiments in the paper. The software is available on Gitlab, including the modules required to replicate the experiments presented here and in the main text. More information about using the software is available in the README.md file on the Gitlab page.

MicroMator Core
MicroMator Core is the system that links the different modules together in a unified framework. Importantly, it manages the events of the system. Events consist of Triggers and Actions, and are constantly checked by MicroMator Core. Users can write their own events that then interface with the rest of the system. A Trigger is a Python class with a method named check, which returns a Boolean indicating whether a given action should take place. Each event is evaluated throughout the experiment by calling trigger.check(); if the check returns True, the corresponding Action is called. Actions must have a method act, which describes what should happen when the event has been triggered. We use multithreading to constantly check whether each event is available (i.e., not currently doing something) or busy. If a check returns True while an event is busy, the event is queued. Note that we do not verify the satisfaction of real-time constraints: triggered events are queued and treated in order. If the user requires that an imaging routine lasting 4 minutes be triggered every 3 minutes, delays will accumulate and real-time issues will arise.
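A minimal sketch of this Trigger/Action/Event pattern is given below. The check and act method names follow the text, but the Event scaffolding, the example trigger, and the example action are illustrative, not the actual MicroMator API.

```python
import queue
import threading

# Illustrative Trigger/Action base classes; check() and act() follow the text.
class Trigger:
    def check(self, state) -> bool:
        raise NotImplementedError

class Action:
    def act(self, state) -> None:
        raise NotImplementedError

class EveryNFrames(Trigger):
    """Example trigger: fire on every n-th frame."""
    def __init__(self, n):
        self.n = n

    def check(self, state):
        return state["frame"] % self.n == 0

class LogFrame(Action):
    """Example action: record that the frame was handled."""
    def act(self, state):
        state["log"].append(f"frame {state['frame']} handled")

class Event:
    """Pairs a Trigger with an Action. Triggered work is queued and treated
    in order; as noted in the text, no real-time guarantees are enforced."""
    def __init__(self, trigger, action):
        self.trigger, self.action = trigger, action
        self.pending = queue.Queue()
        self.busy = threading.Lock()

    def poll(self, state):
        if self.trigger.check(state):
            self.pending.put(dict(state))      # snapshot the state at trigger time
        # run queued work only while the event is not busy
        while not self.pending.empty() and self.busy.acquire(blocking=False):
            try:
                self.action.act(self.pending.get())
            finally:
                self.busy.release()

state = {"frame": 0, "log": []}
event = Event(EveryNFrames(2), LogFrame())
for frame in range(4):
    state["frame"] = frame
    event.poll(state)
```

In this toy run, frames 0 and 2 satisfy the trigger, so the action runs twice.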
Importantly, MicroMator Core also handles all of the logging of the system. It records what each module is doing and saves all logs in a consistent format. The log files are critical for debugging the system. For example, in an experiment with multiple fields of view, each undergoing a different light stimulation, the logging system records what the software is doing at each position, which modules were used, and any module- or event-specific action. Next, we describe the modules included in MicroMator.

Microscope Module
The Microscope Module is in charge of acquiring the raw images, according to the acquisition protocol and the channels and positions the user has pre-set at the start of the experiment. Micro-Manager's GUI is used to define the experiment's initial settings, such as the number of time steps, the different channels, and the positions of interest.

Image Analysis Module
MicroMator currently includes two image analysis modules. One, called SegMator, uses DeLTA [1] and Trackpy [2] for image segmentation and cell tracking. The other implements image segmentation only and is based on Cellpose [3]. Both modules use modern deep learning approaches to segment bacterial or yeast cells from brightfield images, and both are based on a U-Net architecture, a convolutional neural network structure that excels at image segmentation [4,5]. While deep learning approaches are computationally expensive to train, evaluating a single 1024x1024 image is fast: dense fields of cells can be accurately segmented in under a couple of seconds. The user can thus segment and track cells in real time within multiple fields of view, which is required for most reactive microscopy applications. We also provide interfaces to CellStar [6].
The modules generate and iteratively update a Pandas dataframe as data is collected [7]. Depending on the experiment, this dataframe contains the identity of each cell, as well as its position, segmentation contour, size, fluorescence, and other single-cell data that can be used to react within the experiment.
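As a sketch, such a dataframe might look like the following; the column names are illustrative, not the exact SegMator schema.

```python
import pandas as pd

# Hypothetical per-cell dataframe; one row per cell per frame.
cells = pd.DataFrame({
    "cell_id":   [1, 2, 1, 2],
    "frame":     [0, 0, 1, 1],
    "x":         [102.4, 310.0, 103.1, 311.8],
    "y":         [88.0, 145.2, 88.9, 144.7],
    "area":      [215, 198, 220, 205],
    "fluo_mean": [1021.5, 980.2, 1100.3, 1015.8],
})

# Reactive decisions then reduce to dataframe queries, e.g. which cells
# exceed a fluorescence threshold at the latest frame:
latest = cells[cells["frame"] == cells["frame"].max()]
bright = latest.loc[latest["fluo_mean"] > 1050, "cell_id"].tolist()
```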

Single-Cell Control Module
This module, which we call Stoched, implements our model predictive control for EL222 population and single-cell models. It works by sorting the entire population at a given time into on and off cells according to the control program, described below in Supplementary Note 5. For population control, by definition all cells are assigned as either on or off. For the single-cell control experiments, however, different individual cells are assigned identities, and tracking is required. The module interfaces with individual cell identities through a main data file, a Pandas dataframe that is iteratively updated with tracking information, segmentation information, and fluorescence measurements. As mentioned above, this file is generated and updated in real time by the SegMator module [7]. It is the input to the control algorithms for both single-cell and population-based control. The module then interfaces with the DMD to apply light to selected cells.
For individual-cell control, each cell runs its own stochastic model. Because we took a computationally expensive approach to solving the model using the Finite State Projection (FSP) (see Supplementary Note 5 for mathematical details), it would be computationally infeasible to solve the control problem for each cell in series. We therefore developed Stoched so that, when there are multiple controllers, the control problem can be solved across multiple processes. We used Python's multiprocessing module to spawn a new process for each controller, which allowed us to solve the control problem for each cell within the 6 minutes between measurement updates.

Discord Bot Description
The Discord bot is a web app running on the microscope's computer and connected to the Discord communication software. It allows the user to check the state of the experiment from anywhere with an internet connection. Through chat commands, the user can inspect the last images taken by MicroMator and access MicroMator's log files to see whether an error occurred. Furthermore, the bot inspects the Python logs at every frame looking for exceptions and errors, and notifies the user on Discord if such errors are detected. Currently, the bot can read MicroMator data but cannot modify or cancel the experiment while it is running.

Metadata Management and Logging
MicroMator also features metadata management and stores all parameters in a few separate objects:
The Protocol class stores all information regarding the microscope acquisition loop.
The Channel class stores all channels the user needs for an experiment.
The Position class stores all positions the user saves for the experiment.
It is from these objects that the various modules get their parameters. When the experiment starts, MicroMator creates various folders and subfolders to store all the relevant information that is generated during the experiment, i.e. raw images, analyzed images, protocols, positions, and logs.
One advantage of MicroMator is its centralized logging: all included modules write extensively to a single log file, and user-created modules can also write to the same logger object.
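These parameter objects could be sketched as simple dataclasses; the attribute names here are assumptions for illustration, not MicroMator's exact fields.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Channel:
    name: str              # e.g. "Brightfield" or "RFP" (illustrative)
    exposure_ms: float

@dataclass
class Position:
    label: str
    x_um: float
    y_um: float
    z_um: float

@dataclass
class Protocol:
    n_timesteps: int
    interval_s: float
    channels: List[Channel] = field(default_factory=list)
    positions: List[Position] = field(default_factory=list)

protocol = Protocol(
    n_timesteps=100, interval_s=180.0,
    channels=[Channel("Brightfield", 50.0), Channel("RFP", 200.0)],
    positions=[Position("pos0", 0.0, 0.0, 10.0)],
)
```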

Writing your own Modules
If the default modules are not suited to the strain or the setup, users can create their own module using the MicroMator API. On the Gitlab page, we provide interfaces to CellStar [6] and to a simple numpy-based image analyzer as examples of interfaces to other modules.

Supplementary Note 2: Example of a simple reactive experiment with MicroMator
We describe an experiment aimed at imaging objects with poorly known fluorescence. The proposed strategy is to gradually increase the exposure time until a strong enough fluorescence signal is obtained. Concretely, we start by exposing fluorescent cells for 10 ms and repeatedly increase the exposure time by 50 ms until the mean pixel fluorescence value reaches the (arbitrary) threshold of 1600. Note that for this toy example, and in contrast with the case study of Fig. 2 in the main text, we simply consider the mean pixel fluorescence of the entire field of view; no cell segmentation is performed. This is implemented by having MicroMator use the Analysis module to compute the mean pixel fluorescence and by creating events that increase the exposure duration when needed. All the needed files can be found on the MicroMator Gitlab. We now present how to build an event object that checks the fluorescence and takes action if needed. The code for the three functions below can be found in the example folder of the MicroMator git repository, under the name reach_fluorescence_value_event_creator.py.

The trigger
The trigger is fairly simple: we fetch the result of the numpy analysis and compare it to the threshold, returning True if the fluorescence is below the threshold and False otherwise.

The effect
Here we modify the mmc.protocol object to raise the exposure by the chosen value. Note that this changes the parameters of the main acquisition loop.

The event creation function
The event creator file must contain this function, which returns a list of events. The events are built using the event class together with the trigger and effect described above. This function is launched at each time frame and effectively creates the event.
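Putting the three pieces together, a self-contained sketch of the trigger, effect, and event-creator pattern follows. The MicroMator-internal objects (the analysis result, mmc.protocol, the event class) are mocked with plain Python here, so all names are illustrative rather than the repository's actual code.

```python
THRESHOLD = 1600   # target mean pixel fluorescence
STEP_MS = 50       # exposure increment

class ReachFluorescenceTrigger:
    def __init__(self, analysis):
        self.analysis = analysis

    def check(self):
        # True while the mean pixel fluorescence of the whole field of view
        # is still below the threshold
        return self.analysis.mean_fluorescence() < THRESHOLD

class RaiseExposureEffect:
    def __init__(self, protocol):
        self.protocol = protocol

    def act(self):
        # modifies the parameters of the main acquisition loop
        self.protocol.exposure_ms += STEP_MS

class Event:
    def __init__(self, trigger, effect):
        self.trigger, self.effect = trigger, effect

    def poll(self):
        if self.trigger.check():
            self.effect.act()

def create_events(analysis, protocol):
    # launched at each time frame; returns the list of events
    return [Event(ReachFluorescenceTrigger(analysis), RaiseExposureEffect(protocol))]

# --- mock run: fluorescence responds linearly to exposure (toy model) ---
class MockProtocol:
    exposure_ms = 10.0

class MockAnalysis:
    def __init__(self, protocol):
        self.protocol = protocol

    def mean_fluorescence(self):
        return 10.0 * self.protocol.exposure_ms

protocol = MockProtocol()
analysis = MockAnalysis(protocol)
for _ in range(10):                      # ten acquisition frames
    for event in create_events(analysis, protocol):
        event.poll()
# exposure stops increasing once the mean fluorescence reaches 1600
```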

Supplementary Note 3: Cell Segmentation and Tracking

Cell Segmentation with DeLTA in SegMator
To reach good performance, DeLTA needs to be trained beforehand. This training of the network was done offline, outside of MicroMator. We started with the pretrained segmentation network for yeast provided in the DeLTA package [1], which is implemented with the popular neural network packages Keras and TensorFlow [8,9]. We developed our own preprocessing pipeline to implement the weight maps in Python, and created GUIs using the napari viewer [10] to create new masks for training. We used these tools to produce new training images for the network from our brightfield images. We found that relatively few (around 50) new training images were needed to adapt the network to our setup. This is likely because we started with a network already optimized for segmenting yeast, and because the on-the-fly Python generators provided by DeLTA [1] automatically apply random transformations to flip/shift/rotate the images and even simulate different illumination conditions. Such image augmentation makes the neural network more robust to out-of-sample images. Code and our pretrained model files are available on Gitlab (https://gitlab.inria.fr/InBio/Public/micromator/-/tree/master/segmator). The pipeline we used for training the network and typical analysis results are provided in Supplementary Fig. 2A and B, and the quality of the segmentation we obtained with DeLTA is documented in Supplementary Fig. 3A.

Cell Tracking with Trackpy in SegMator
After masks are created by the neural network, they are linked using the Trackpy software [2], which implements the particle-tracking algorithm of Crocker and Grier [11]. The tracking is done directly on the binary masks, using the centers of mass of the segmented yeast cells. We verified the quality of the tracking algorithm by tracking cells by eye in a movie to estimate how long cells are correctly tracked on average (Supplementary Fig. 2C). We found that some cells ''hop'' long distances between measurement times (every 3 minutes): these cells sometimes disappear completely (i.e., they are washed away) or reappear far from their previous position. We do not expect to accurately track such cells, and therefore do not count them as errors of the tracking algorithm. We found that the vast majority of cells were correctly tracked over the entire time horizon while some cells were not, as summarized in Supplementary Fig. 3B. We also found that tracking quality depends on the time between frames. For all of the EL222/mScarlet characterization and control experiments, we had three minutes between frames, which was sufficient for accurate tracking. However, for the recombination experiments we wished to limit the amount of light exposure, and we therefore took images every 6 minutes. While tracking was still possible when the field was sparse, tracking quality may be more severely impacted for dense fields with bulk movement. This can notably be seen in the relatively high fraction of errors attributed to tracking in the pie chart in Fig. 4c of the main text.
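The linking step can be illustrated with a minimal nearest-neighbour sketch. Trackpy's actual implementation of the Crocker-Grier algorithm is considerably more sophisticated (globally optimal matching, memory for temporarily lost particles), so the function below is only a conceptual stand-in.

```python
import numpy as np
from scipy.spatial.distance import cdist

def link_frames(coords_a, coords_b, search_range):
    """Greedily match centroids of frame A to frame B, shortest distance
    first; returns dict index_a -> index_b. Pairs farther apart than
    search_range are left unlinked (cells that 'hop' or are washed away)."""
    d = cdist(coords_a, coords_b)
    links, used = {}, set()
    for i, j in sorted(((i, j) for i in range(d.shape[0])
                        for j in range(d.shape[1])),
                       key=lambda ij: d[ij]):
        if i in links or j in used or d[i, j] > search_range:
            continue
        links[i] = j
        used.add(j)
    return links

a = np.array([[10.0, 10.0], [50.0, 50.0], [90.0, 10.0]])
b = np.array([[12.0, 11.0], [52.0, 49.0], [300.0, 300.0]])
links = link_frames(a, b, search_range=15.0)
# cells 0 and 1 are linked; the third moves too far and stays unlinked
```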

Cell Segmentation with Cellpose
We also provide an image analysis module that uses Cellpose [3] to segment cells. A key advantage of Cellpose is that it works ''out of the box'': no training of the neural network is required. In Supplementary Fig. 4, we provide typical analysis results obtained with this module and the ''Cyto'' model of Cellpose. The algorithm accurately separates cells from background. It also tends to oversegment cells, meaning that some bacteria are split into several ''cells''. This behavior is expected for Mycobacteria due to their striated aspect with multiple septa. However, since our application only requires a good estimate of the mean pixel intensities of the bacteria, this oversegmentation is not an issue.
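Since the application only needs mean pixel intensities per segmented cell, extracting them from a Cellpose-style label mask is straightforward; the tiny arrays below are illustrative.

```python
import numpy as np
from scipy import ndimage

# Toy fluorescence image and matching label mask (0 = background).
fluo = np.array([[100, 100,  10,  10],
                 [100, 100,  10,  10],
                 [ 10,  10, 200, 200],
                 [ 10,  10, 200, 200]], dtype=float)
labels = np.array([[1, 1, 0, 0],
                   [1, 1, 0, 0],
                   [0, 0, 2, 2],
                   [0, 0, 2, 2]])

cell_ids = [1, 2]
means = ndimage.mean(fluo, labels=labels, index=cell_ids)
```

Note that even if a cell were oversegmented into two labels, the mean intensity of each fragment would remain close to that of the whole cell, which is why oversegmentation is tolerable here.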

Supplementary Note 4: Analysis of DMD Precision Using Three Color Fluorescent Microscopy
To either image or activate single cells using the DMD, it is critical to understand the precision of the device. To do so, we targeted for illumination a subset of the cells and quantified the fluorescence in the targeted and in the non-targeted cells.
We also tested the effect of eroding the single-cell illumination masks on bleedthrough. More specifically, we used a yeast strain constitutively expressing mCerulean, mNeonGreen, and mScarletI (Supplementary Fig. 5A). Then, at each measurement time, we selected three groups of cells, each to be imaged in a given color, so that each cell was ''tagged'' in one color (Supplementary Fig. 5B). Because all cells express the three proteins, any accidental activation or bleedthrough in one imaging channel shows up as an off-target cell becoming illuminated. We measured the fluorescence values of all cells in each of the three channels using masks eroded to various degrees; a 75% erosion of the mask means that 25% of the area of the cell was retained for illumination. Sample microscopy images are shown in Supplementary Fig. 5D. We measured the fluorescence in the target cells and in the off-target cells for 75%, 66%, 50%, 33%, and 25% erosion levels. The measured single-cell fluorescence values correspond to mean cell pixel intensities. From these experiments, we determined that 33% erosion was sufficient to limit bleedthrough, which is critical when we photostimulate individual cells (Supplementary Fig. 5E).
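Eroding a mask until only a target fraction of the cell area remains can be sketched with scipy's binary erosion; the square ''cell'' and the helper below are illustrative, not the exact procedure used in the paper.

```python
import numpy as np
from scipy import ndimage

def erode_to_fraction(mask, keep_fraction):
    """Repeatedly erode a boolean mask until at most keep_fraction of the
    original pixels remain (e.g. keep_fraction=0.25 for a '75% erosion')."""
    target = mask.sum() * keep_fraction
    eroded = mask.copy()
    while eroded.sum() > target:
        nxt = ndimage.binary_erosion(eroded)
        if nxt.sum() == 0:          # never erode a cell away entirely
            break
        eroded = nxt
    return eroded

cell = np.zeros((21, 21), dtype=bool)
cell[3:18, 3:18] = True             # toy square "cell" of 225 pixels
small = erode_to_fraction(cell, 0.25)
```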

Supplementary Note 5: Model Predictive Control of Optogenetically Induced Protein Production
Characterization of Single-Cell EL222 Activation
Because we want to use the optogenetic system to control fluorescence values in individual cells, we performed a calibration experiment in which we randomly assigned cells in the field of view to receive 0, 200, 1200, or 1600 ms of photostimulation. Cell masks were eroded to 33% of their original area to limit bleedthrough. When new cells were born, they were assigned a stimulation using our cell-sorter event, such that the populations of cells in the bins remained approximately even. Cells were stimulated every three minutes. The data are presented in Supplementary Fig. 6. We used these experiments as a proof of concept: to show that we could differentially stimulate protein production in individual cells.

Stochastic Model of Protein Expression
We work specifically with the EL222 optogenetic system. In this system, EL222 proteins are produced and degraded/diluted at all times within the cell because of stochastic protein production and degradation and because of cell growth. Under blue-light photostimulation, EL222 molecules dimerize and become activated transcription factors, which bind the pEL222 promoter and activate protein production. We model light-induced protein production using a set of biochemical reactions describing the constitutive transcription and translation of EL222 molecules, which we call X, and the production of the fluorescent protein, Y, together with their degradation/dilution. B is a random variable describing a ''burst'' of EL222 expression and is geometrically distributed [12], with mean burst size 1/b. This bursting approximation is valid when mRNAs are short-lived compared to proteins, and allows us to simplify the model. The control input, u(t), is Boolean and represents the light being either on or off. We assume a delay τ between the time the light is applied and the time fluorescent protein becomes visible. From the propensities of these reactions, the infinitesimal generator A of the process can be written, and we then need to solve the corresponding chemical master equation (CME). Because the input is piecewise constant, A is constant on fixed time intervals: we pre-compute the generator for both states of the controller, defining the ''on'' generator matrix A_on and the ''off'' generator matrix A_off. Given an input sequence u(t), the FSP solution can then be evaluated iteratively on the constant time intervals, p(t_{k+1}) = exp(A_{u_k} Δt) p(t_k). The matrix exponential is a notoriously expensive operation; however, the product of the matrix exponential and the probability vector can be evaluated efficiently using Krylov subspace methods [13]. We implemented the Krylov subspace algorithm in Python based on ExpoKit [13].
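The piecewise-constant FSP propagation described above can be sketched as follows, using scipy's expm_multiply, which evaluates exp(A)·p without forming the matrix exponential, on a toy bursty birth-death generator. The rate constants, burst parameter, and truncation size are illustrative, not the paper's calibrated values.

```python
import numpy as np
from scipy.sparse import lil_matrix, csc_matrix
from scipy.sparse.linalg import expm_multiply

def bursty_generator(N, k_burst, b, g):
    """CME infinitesimal generator on the truncated space {0, ..., N} for
    geometric bursts of production (mean burst size 1/b) plus first-order
    degradation/dilution at rate g per molecule."""
    A = lil_matrix((N + 1, N + 1))
    for n in range(N + 1):
        for j in range(1, N + 1 - n):          # burst: n -> n + j
            A[n + j, n] = k_burst * b * (1 - b) ** (j - 1)
        if n > 0:                               # degradation: n -> n - 1
            A[n - 1, n] = g * n
        A[n, n] = -A[:, n].sum()                # columns sum to zero
    return A

N = 200
A_on = csc_matrix(bursty_generator(N, k_burst=0.05, b=0.5, g=0.01))
A_off = csc_matrix(bursty_generator(N, k_burst=0.0, b=0.5, g=0.01))

p = np.zeros(N + 1)
p[0] = 1.0                                      # start with zero molecules
dt = 180.0                                      # 3 min between light updates
for u in [1, 1, 0, 1]:                          # piecewise-constant light input
    p = expm_multiply((A_on if u else A_off) * dt, p)
mean_x = np.arange(N + 1) @ p
```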

Population Model of Protein Expression
Using standard mass-action kinetics, we obtain a set of ordinary differential equations (Eq. 4) describing the time evolution of the population mean. In this model, m, x, and y represent the concentrations of the EL222 mRNA, the EL222 protein, and the mScarletI protein, respectively.

Model Calibration
Our goal was to calibrate the model such that it would be sufficient to control either the population or single cells. We primarily used the stochastic model of the system to find model parameters matching the experimental trajectories shown in Supplementary Fig. 7. However, evaluating a trajectory-based likelihood function is very slow for these types of models, and we therefore used a combination of quantitative maximum-likelihood fitting and hand-tuning to obtain the parameters presented in Supplementary Fig. 7 and Supplementary Tables 1 and 2. Interestingly, we note that our estimate of the delay matches well the mScarlet-I maturation time estimated by McCullock and colleagues [14]. This suggests that the time needed for EL222 activation and EL222-driven gene expression is significantly smaller than the time needed for the maturation of the fluorescent reporter. It is also consistent with the observation by Benzinger and Khammash [15] that the EL222 transcription factor has fast kinetics.

Fluorescent Measurement Noise
We assume that fluorescence measurement noise is Gaussian. Therefore, if the number of fluorescent molecules in a given cell is known to be n_i, the fluorescence measurement is taken to be normally distributed as in Eq. 7, where µ_FP and σ²_FP denote the mean and variance of the fluorescence distribution. For this work, we calibrated the values of µ_FP and σ²_FP along with the kinetic parameters of the models, as described below; values were taken as µ_FP = σ²_FP = 50. For the stochastic single-cell control problem, we use Eq. 7 to estimate the number of molecules in the cell. From the population (deterministic) model standpoint, the Gaussian measurement noise fits well into the common framework of Kalman filtering, which we describe in the next section [16].

Kalman Filtering for Population State Estimation
For the population control of the mean, we used the deterministic model described in Eq. 4. This model describes the time evolution of the three species, x = [m, x, y], where y is the fluorescent protein in units of number of molecules. We discretized the system in time according to the measurement times, kΔt for k ∈ (0, N_t). To estimate the state at each kΔt from the measurement y(kΔt) ≡ ỹ_k, we used a Kalman filter. The process-noise covariance P_k follows from the assumption of Poisson noise across the population of N_c cells: the population mean is taken to be normally distributed, with independent variances stemming from Poisson noise. Because we only measure the fluorescence, the observation matrix C selects the fluorescent-protein component of the state. At measurement times, the state and process noise are updated as
x̂_{k|k} = x̂_{k|k−1} + K_k (ỹ_k − C x̂_{k|k−1}),    (10)
P_{k|k} = (I − K_k C) P_{k|k−1},    (11)
where K_k = P_{k|k−1} C^T (C P_{k|k−1} C^T + σ²_FP)^{−1} is the Kalman gain. The observation variance σ²_FP comes directly from Eq. 7.
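A compact numpy sketch of the predict/update cycle (Eqs. 10-11) follows; the discrete-time propagator A_d and the noise covariances are illustrative stand-ins, not the calibrated model.

```python
import numpy as np

# Toy 3-state linear model for x = [m, x_el222, y]; A_d, Q, R are assumed values.
A_d = np.array([[0.9,  0.0,  0.0],
                [0.1,  0.95, 0.0],
                [0.0,  0.1,  0.98]])
C = np.array([[0.0, 0.0, 1.0]])   # only the fluorescent protein y is observed
Q = np.eye(3) * 1.0               # process-noise covariance (assumed)
R = np.array([[50.0]])            # observation variance, sigma^2_FP

def kalman_step(x_est, P, y_meas):
    # predict
    x_pred = A_d @ x_est
    P_pred = A_d @ P @ A_d.T + Q
    # update (Eqs. 10-11): gain, then state and covariance corrections
    K = P_pred @ C.T @ np.linalg.inv(C @ P_pred @ C.T + R)
    x_new = x_pred + K @ (y_meas - C @ x_pred)
    P_new = (np.eye(3) - K @ C) @ P_pred
    return x_new, P_new

x_est, P = np.zeros(3), np.eye(3) * 100.0
for y_meas in [5.0, 8.0, 12.0]:
    x_est, P = kalman_step(x_est, P, np.array([y_meas]))
```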

Bayesian state estimation for single cells using the FSP
Because the Kalman filtering approach assumes that everything is Gaussian, whereas the process is non-Gaussian and we cannot invoke the central limit theorem as above, we now turn to a state-estimation approach that can be applied to single-cell fluorescence measurements. For simplicity, suppose we start from the estimated state of the cell, distributed according to p((k−1)Δt) = [p_0, p_1, ..., p_N]. This distribution can be propagated from (k−1)Δt to kΔt according to u(t) and Eq. 3. In the Bayesian sense, p(kΔt) is a prior distribution on the current state x. Because we have a discrete-state model and we aim to estimate the number of protein molecules Y, we must marginalize over the variables we do not observe. Recall that the stochastic model has only two variables, so the marginal prior on the observation, p(n_i, kΔt), is obtained by summing p(kΔt) over the states of X (Eq. 13). The likelihood of the measurement comes from Eq. 7, which gives the probability of ỹ_k conditioned on a given number of protein molecules in the cell, i.e., n_i ∈ {1, 2, 3, ..., N}.
We assume that µ_FP and σ²_FP are known. As mentioned above, Eq. 13 is a prior assumption about how many molecules of fluorescent protein a given cell is likely to have, conditioned on the previous estimate of the state at (k−1)Δt. The posterior estimate of the state is then p(n_i, kΔt | ỹ_k) ∝ p(ỹ_k | n_i) p(n_i, kΔt). The proportionality constant need not be computed, as we can simply renormalize the posterior estimate p(n_i, kΔt | ỹ_k).
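The marginalize-then-update cycle reduces to elementwise operations on the truncated distribution. In the sketch below, the Poisson-shaped prior and the Gaussian readout model (mean n·µ_FP, variance n·σ²_FP) are assumptions for illustration; the paper's Eq. 7 defines the actual measurement model.

```python
import numpy as np
from scipy.stats import norm, poisson

N = 400
n = np.arange(N + 1)                 # candidate molecule numbers
mu_fp, sigma2_fp = 50.0, 50.0        # calibrated values quoted in the text

prior = poisson.pmf(n, 10.0)         # illustrative marginal prior p(n, k*dt)
y_meas = 600.0                       # one single-cell fluorescence measurement

# likelihood p(y | n); the n-scaling of mean and variance is an assumption here
like = norm.pdf(y_meas, loc=n * mu_fp,
                scale=np.sqrt(np.maximum(n, 1) * sigma2_fp))

posterior = prior * like             # Bayes rule, up to a constant
posterior /= posterior.sum()         # renormalize instead of computing it
post_mean = n @ posterior
```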

Receding Horizon Model Predictive Control
For both the population model and the single-cell stochastic model, we implemented receding-horizon model predictive control. For a given target T defined at all kΔt, we defined a cost function for each method. The target has units of number of molecules of fluorescent protein, the same units as the state x.
For the standard population control, we defined a sum-of-squares cost function penalizing the squared deviation of the predicted fluorescent-protein level from the target, J = Σ_k (T_k − y_k)². For the stochastic individual-cell control, we chose a cost function that accounts for the potentially asymmetric (non-Gaussian) estimate of the state: the expected absolute deviation, J = Σ_k Σ_n |T_k − n| p(n, kΔt).
We perform receding-horizon MPC, in which we aim to optimize the cost function J over the finite time horizon H using the current state estimate, i.e., to find the light-input sequence u(t) that minimizes J over the horizon. For these systems, the photostimulation is either on or off between kΔt and (k+1)Δt; there are therefore 2^(H/Δt) possible light sequences over the horizon H. We optimized by exhaustively evaluating the cost (Eq. 18) for each possible light sequence. We tested this strategy in silico and obtained satisfying results, as shown in Supplementary Fig. 8.
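The exhaustive search over Boolean light sequences can be sketched as follows; the one-step dynamics, target, and horizon length are illustrative stand-ins for the calibrated model.

```python
import itertools
import numpy as np

def step(y, u, k_prod=10.0, g=0.1):
    """Toy one-step model: light-driven production plus first-order decay."""
    return y + k_prod * u - g * y

def best_sequence(y0, target, horizon_steps):
    """Exhaustively evaluate all 2^horizon_steps on/off sequences and
    return the one minimizing the sum-of-squares cost over the horizon."""
    best_u, best_cost = None, np.inf
    for u_seq in itertools.product([0, 1], repeat=horizon_steps):
        y, cost = y0, 0.0
        for u in u_seq:
            y = step(y, u)
            cost += (target - y) ** 2
        if cost < best_cost:
            best_u, best_cost = u_seq, cost
    return best_u

# receding horizon: apply only the first input, then re-plan at the next step
seq = best_sequence(y0=0.0, target=50.0, horizon_steps=4)
u_now = seq[0]
```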

Supplementary Note 6: Data Curation for Yeast Time-lapse Movies
We implemented an automated pipeline to remove cells from our analysis either because of their position in the field of view or because of poor segmentation or tracking quality. Importantly, this analysis was applied only to the characterization experiments: these cleaning strategies were applied after the induction experiments of Figure 3b of the main text and Supplementary Figures 6 and 7, not in real time. The cleaning steps are shown in our data-analysis Jupyter notebooks. In summary, we removed cells that met one of the following criteria:
cells were within 150 pixels of the top of the field of view, since this part of the field of view is imaged but not stimulated by the DMD;
cells were within 50 pixels of the border of the image;
cells were present in the data set for fewer than 10 frames;
cells had an area change of more than 100 square pixels between two consecutive frames.
Moreover, we discarded images collected after cells stopped growing in the microfluidic chip.
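These cleaning criteria translate directly into pandas filters. The dataframe and column names below are illustrative, and the 10-frame minimum is lowered so the toy data keeps at least one cell.

```python
import pandas as pd

# Toy per-cell dataframe; columns are illustrative, not the exact schema.
df = pd.DataFrame({
    "cell_id": [1, 1, 2, 2, 3, 4, 4],
    "frame":   [0, 1, 0, 1, 0, 0, 1],
    "x":       [500, 505, 30, 32, 600, 700, 702],
    "y":       [500, 504, 400, 401, 100, 600, 603],
    "area":    [120, 130, 115, 118, 140, 110, 260],
})
H = W = 1024
MIN_FRAMES = 2          # 10 in the paper; lowered for this toy dataset

keep = df[df["y"] >= 150]                                  # top band: imaged but not DMD-stimulated
keep = keep[(keep["x"] >= 50) & (keep["x"] <= W - 50) &
            (keep["y"] <= H - 50)]                         # image border
keep = keep[keep.groupby("cell_id")["cell_id"]
                .transform("size") >= MIN_FRAMES]          # short-lived tracks
jumpy = keep.loc[keep.groupby("cell_id")["area"]
                     .diff().abs() > 100, "cell_id"]       # area jumps > 100 px^2
keep = keep[~keep["cell_id"].isin(jumpy)]
```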