Introduction

With the technical advances in optics and the development of novel fluorescent reporters and probes, intravital microscopy is entering a new era of in vivo high-resolution real-time imaging, helping to answer biological questions under physiological conditions1,2,3,4. Unfortunately, naturally occurring periodic and random motion artifacts continue to pose one of the biggest challenges for intravital imaging5. Image degradation by motion artifacts is directly proportional to the spatial resolution: at lower resolutions there are fewer effects whereas motion can become the limiting factor at higher spatial resolutions. Medical imaging techniques (e.g. magnetic resonance imaging (MRI), X-ray computed tomography (CT) and ultrasound (US)) have driven the field of motion compensation but many of the solutions are poorly adaptable to intravital microscopy. Therefore, new methods are urgently needed to enable motion-free, high resolution intravital microscopy.

The major sources of motion are respiration and cardiac activity, even under deep anesthesia, while other sources such as peristalsis, muscle contraction and slow drift can be more easily avoided or corrected. When image acquisition is fast enough that motion occurs only within certain frames, simple frame rejection is one solution. Unfortunately, acquisition at such speeds is not always possible and, when it is, the short integration time necessary to avoid image distortions results in a poor signal-to-noise ratio. A number of different motion compensation methods have been described for different applications and organs6,7,8,9,10,11,12,13,14,15,16,17,18,19. Many of these require specific experimental setups depending on the organ of interest. Physical immobilization and mechanical restraints (e.g. through the use of a glass cover slip) are among the most commonly employed strategies. Despite their simplicity, the level of stabilization achieved rarely allows true high-resolution imaging. Moreover, the applied pressure necessary to achieve a sufficient degree of stabilization can severely impact physiological measurements. Alternative methods include active7,19 or passive8,10,11 acquisition schemes, sometimes in combination with specialized, custom-made stabilization holders. Other approaches based on image processing are also promising as they do not require any specific setup modification. Algorithms based on motion distortion models with a constant velocity assumption20 or on Lucas-Kanade registration modeling21 are effective in compensating in-frame motion distortions. However, because these methods are based on 2D plane motion models, they work best in 2D, i.e. when motion is restricted to the imaging plane. When inter-frame motion artifacts predominate, such as in high-speed acquisitions, artifacts can be successfully corrected by image registration with realignment of the video-frame sequence and removal of unmatched components. However, distortion-free images have to be collected prior to imaging using high-speed scanning systems18,21,22,23, a solution which is not always feasible.

Here we present a new generalizable method for motion compensation in intravital imaging based on image processing. The described method, which works for periodic motion artifacts, does not require any a priori knowledge or recording of the animal physiology, such as the cardiac and respiratory activity, nor information from distortion-free images. Moreover, no retrospective gating or prospective triggering schemes are necessary, which simplifies instrumentation. The reconstruction algorithm is capable of providing motion-compensated images of every organ independently of the physiological source (or sources) of motion. We first validate the algorithm in phantoms and then apply the method to in vivo imaging of kidneys, pancreas, heart and dorsal window chambers in the mouse. Further extrapolation of the method should also be useful for intraoperative clinical imaging where surgeons need to map extended moving areas.

Results

In contrast to widefield microscopy techniques where image data is acquired in one exposure via a CCD, laser scanning microscopy (LSM) relies on sequential point-by-point excitation, scanning along a preset path to cover the imaging field of view (Fig. 1a). In the frame of reference of the objective, the scanning path lies along a horizontal plane, while in the frame of reference of a moving organ it will belong to a curved surface modulated in time by the organ's motion components10. The acquired image is then not representative of a physical horizontal plane optically sectioning the organ of interest12, but will instead present motion-induced distortions (Fig. 1a). Our motion compensation method is based on automated recognition of distortion-free areas within an acquisition sequence, which subsequently allows reconstructions of artifact-free images truly reproducing horizontal imaging planes.

Figure 1

Motion artifacts during laser scanning microscopy acquisition.

(a) In LSM, acquisition is performed in “line-by-line” modality with the excitation beam tracing a path along the imaging field of view. Cardiac and respiratory activity will therefore induce geometric distortions in the final acquired images. The timescales of the two images are different. (b) If some degree of reproducibility is introduced in certain parts of the motion, images acquired at different time points will present “locally” different values of correlation. (c) Sequence of images of a tumor orthotopically implanted in the pancreas, acquired sequentially in a spontaneously breathing mouse. Images (512 × 512 pixels) have an acquisition time of 1.6 sec. Acquisition and source of motion are out of phase. A different color is assigned to each image. (d) By combining the parts of the images (segments) that present high values of the correlation coefficient, we can reconstruct a final motion artifact-free image. Here different colors show how segments are gathered from different images of the sequence. The size of these segments, which corresponds to the number of “views” or scanned lines, is not constant but varies depending on the timing between the image acquisition and the minimum movement periods TR and TC.

As motion cycles over time, it is possible to identify periods of minimum respiratory and cardiac movement, both at the end of expiration and inspiration (TR) and during the diastolic phase of the cardiac cycle (TC) (Fig. 1a). If breathing and cardiac activity traces are recorded during scanning acquisition, and several gating time windows coincident with TR and TC are chosen, the portions (“segments”) of the acquired images that coincide with the gated windows, and are therefore free of artifacts, can be extracted from all collected images. The presence of a mechanical stabilizer, working as a fixed boundary constraint, also enables reproducibility in position at the selected temporal windows. This guarantees that all the different segments can be combined to produce a final reconstructed image free of any motion-induced distortions, as typically done in sequential cardiorespiratory gating (SCG) segmented microscopy12.

While effective, this approach necessitates direct information about the organ's motion through the use of ECG traces and pressure waveforms. In terms of hardware, this requires a high-speed multichannel data acquisition system, a differential signal amplifier, and a mechanical animal ventilation system or a respiration sensor, in addition to specific support from the microscope system such as a triggering function or a real-time scan timing output. The basic idea behind the proposed method is to automatically identify all artifact-free segments within a sequence and combine them into a final “stabilized” image. The method's basic principle is illustrated in Fig. 1b. Here two LSM images of a beating heart are shown. Both images present significant motion artifacts mainly in the lower parts, while motionless areas reside in the upper parts. We can quantify the degree of similarity between the two images by taking the Correlation Coefficient (CC) of all the segments that are temporally overlapping (i.e. possess the same image coordinates). A CC bar graph, calculated on all segments, presents high values (i.e. more than 0.9) for the motionless areas indicated by the red lines, while the motion-distorted segments give rise to low values of CC. We can therefore envision applying this procedure to an entire sequence of images such as the one presented in Fig. 1c. Here a tumor orthotopically implanted in the pancreas of a spontaneously breathing mouse kept under anesthesia is imaged. Motion distortions due to the respiratory activity are clearly visible in all the images throughout the entire sequence, causing streak-like artifacts. With the help of CC bar graphs we can manually identify the motionless segments within each image of the sequence, combine them according to their time coordinates and finally obtain motion artifact-free image reconstructions (Fig. 1d).

Obviously this “cherry-picking” approach is impractical when dealing with large datasets. Fig. 2 thus illustrates the algorithm for automated image reconstruction. During the first phase, a sequence of multiple images Ij (j = 1..M) is gathered. Each image Ij is then divided into N segments Si,j (i = 1..N). Typically, we choose N to be 8, 16 or 32 views/segments (e.g. for a 512 × 512 image, 32 segments with size 512 × 16 pixels). All the segments Si,j which are temporally coincident with each other with respect to the frame scan (i.e. i = constant) are then collected in N groups. For each group we build a “segment correlation coefficient” table Ti (M × M) obtained by calculating the correlation coefficient (CC) of each segment with all the others. The diagonal contains the CC of each segment with itself, which takes the maximum value of 1. All other components have different values of CC depending on the degree of similarity over the entire sequence. In the tables, dark color indicates high values of correlation coefficient, i.e. similarity between segments. A similarity threshold TCC (TCC = 0.9) is then chosen, and all segments belonging to a table Ti with a CC value higher than TCC are selected to build a “similarity ensemble” Ei. The N ensembles Ei contain a number of elements ni which can be different for each individual ensemble. Each ensemble, however, contains the segments at a given position in the frame that present the highest values of similarity across the sequence and are therefore the ones lacking any motion artifact contribution. Starting from the first ensemble, each segment is chosen as a seed, and in the next adjacent ensemble the segment that best matches the boundary conditions is chosen. In this way, a path Ps (s = 1..p) through the ensembles is built, which corresponds to a selection, among all segments, of those that lack distortions and preserve boundary conditions. From the multiple paths of each seed, the one with the highest value of total boundary continuity is chosen as the final path.

Note that multiple final paths can be present depending on the number of stability points p within a physiological motion cycle. For example, during ventilator-assisted respiration two distinct points (p = 2) of minimal motion are present, corresponding to the fully inflated and fully deflated lung, giving rise to two different paths (P1, P2) with the corresponding two artifact-free images. In order to automatically find the p possible seeding segments we use a K-means clustering algorithm. From all ensembles, similar segments are grouped together in distinct clusters and for each cluster a representative is chosen as stability seed. A final path for each seed cluster is then traced temporally as described above. For each final path all segments are finally combined to reconstruct an artifact-free image. Because only respiratory and cardiac motion are present, the parameter p is equal to 2 when both are present or when cardiac motion is negligible, and to 1 when only cardiac motion is present, i.e. when respiratory activity is absent (e.g. by stopping the mechanical ventilator) or its effect is negligible (e.g. imaging distal to the lungs). The minimum number of images necessary to obtain a full reconstruction depends on three parameters (integration time, image dimension and segment size) and their relation to temporal motion. For example, for a typical protocol of 140 breaths per minute, an image size of 512 × 512 pixels and an acquisition time of 0.5 seconds, 20 images are required to reconstruct an artifact-free image. For spontaneous breathing 5–10 images are instead required.
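The reconstruction steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' Matlab routine: the function names are hypothetical, a segment is admitted to an ensemble when its CC with at least one other segment of the same group exceeds the threshold, and "boundary matching" is taken to be the mean absolute difference between the last line of one segment and the first line of the next (lower is better). Frames are plain lists of rows, and a single path (p = 1) is traced.

```python
import math

def cc(s1, s2):
    """Pearson correlation coefficient between two equally sized segments."""
    a = [v for row in s1 for v in row]
    b = [v for row in s2 for v in row]
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = math.sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))
    return num / den if den else 0.0  # assumption: a flat segment counts as uncorrelated

def split_segments(image, n_segments):
    """Divide an image (list of rows) into n_segments horizontal strips."""
    rows = len(image) // n_segments
    return [image[i * rows:(i + 1) * rows] for i in range(n_segments)]

def similarity_ensemble(group, t_cc=0.9):
    """Indices of segments whose CC with at least one other segment exceeds t_cc."""
    m = len(group)
    table = [[cc(group[j], group[k]) for k in range(m)] for j in range(m)]  # M x M table
    return [j for j in range(m) if any(table[j][k] > t_cc for k in range(m) if k != j)]

def boundary_mismatch(upper, lower):
    """Assumed boundary metric: mean absolute difference across the seam."""
    return sum(abs(u - l) for u, l in zip(upper[-1], lower[0])) / len(upper[-1])

def reconstruct(frames, n_segments, t_cc=0.9):
    """Return (reconstructed image, frame index chosen for each segment position)."""
    segs = [split_segments(f, n_segments) for f in frames]
    groups = [[segs[j][i] for j in range(len(frames))] for i in range(n_segments)]
    ensembles = [similarity_ensemble(g, t_cc) for g in groups]
    if any(not e for e in ensembles):
        raise ValueError("a segment position has no reproducible view; acquire more frames")
    best_path, best_cost = None, math.inf
    for seed in ensembles[0]:                      # try every seed of the first ensemble
        path, cost = [seed], 0.0
        for i in range(1, n_segments):             # greedy walk through adjacent ensembles
            prev = groups[i - 1][path[-1]]
            nxt = min(ensembles[i], key=lambda j: boundary_mismatch(prev, groups[i][j]))
            cost += boundary_mismatch(prev, groups[i][nxt])
            path.append(nxt)
        if cost < best_cost:                       # keep the path with best total continuity
            best_path, best_cost = path, cost
    image = [row for i, j in enumerate(best_path) for row in groups[i][j]]
    return image, best_path

# Synthetic demo: each frame has one clean strip and one distorted strip.
clean_top = [[10, 20, 30, 40], [40, 10, 20, 30]]
clean_bot = [[30, 40, 10, 20], [20, 30, 40, 10]]
frames = [
    clean_top + [[10, 20, 30, 40], [10, 20, 30, 40]],
    [[40, 30, 20, 10], [30, 20, 10, 40]] + clean_bot,
    clean_top + [[40, 30, 20, 10], [40, 30, 20, 10]],
    [[20, 40, 10, 30], [10, 30, 40, 20]] + clean_bot,
]
image, path = reconstruct(frames, n_segments=2)
print(path, image == clean_top + clean_bot)
```

Note that this ensemble criterion would also admit two segments that happen to be distorted in the same way; in practice that is unlikely when acquisition and motion are out of phase, which is the situation the text assumes.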

Figure 2

Automatic image reconstruction algorithm.

M sequentially acquired images are divided into N patches (“segments”). All M segments corresponding to the same position in the microscope frame of reference are collected, and the correlation coefficient is calculated between each view and all the others. A “segment correlation coefficient” table is thus calculated for each of the N segment positions. In the table, dark color represents high values of correlation coefficient, i.e. similarity between segments. For each “segment correlation coefficient” table, all views with a CC higher than a set threshold TCC are collected, giving rise to a “similarity ensemble” Ei. If the physiological cycle has multiple points of high stability (e.g. acquisition during ventilation), different segments can be found within the first “similarity ensemble” E1 using K-means clustering and then used as seeds for automatic image reconstruction.

The fundamental limit for the new algorithm is determined by the temporal acquisition of the single “view”. If motion occurs on the time scale during which the excitation laser beam traces a line, no segment can be considered as reproducible and the proposed algorithm will fail. It is therefore a necessary condition to have a motion free period over which stability in the acquisition is guaranteed. This imposes a limit in size on the generic segment, to the minimum of one line. Of note, choosing a proper image acquisition time can circumvent this limitation (shorter integration times and smaller image sizes imply larger segments with respect to the total image size).
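As a rough worked example of this limit, the sketch below estimates how long a single line, a 16-line segment and a full 512 × 512 frame take to acquire at the 2 microseconds/pixel integration time quoted elsewhere in the text. It assumes that scan time is simply pixels times dwell time, ignoring flyback and other scanner overhead; the function names and the 50 ms motion-free window are hypothetical values chosen only for illustration.

```python
import math

def line_time_s(pixels_per_line, dwell_s):
    """Time to scan one line, ignoring flyback and scanner overhead (assumption)."""
    return pixels_per_line * dwell_s

def lines_in_window(window_s, pixels_per_line, dwell_s):
    """Largest whole number of lines that fits inside a motion-free window."""
    return math.floor(window_s / line_time_s(pixels_per_line, dwell_s))

dwell = 2e-6                      # 2 microseconds per pixel
line = line_time_s(512, dwell)    # about 1.02 ms per line
segment = 16 * line               # about 16.4 ms for a 512 x 16 segment
frame = 512 * line                # about 0.52 s for a 512 x 512 frame
quiet_window = 0.050              # hypothetical 50 ms motion-free window

print(f"line {line * 1e3:.3f} ms, segment {segment * 1e3:.1f} ms, frame {frame:.3f} s")
print("lines per quiet window:", lines_in_window(quiet_window, 512, dwell))
```

Under these assumptions a segment is only usable if the motion-free window is longer than the segment's acquisition time, which is why shorter integration times or smaller images relax the constraint.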

To test the effectiveness of the reconstruction method we first imaged a biological phantom under “controlled motion” conditions (Fig. 3). A phantom consisting of a fixed heart embedded in agar and stained with Rhodamine Lectin to visualize the microvasculature was fixed on a speaker membrane (Fig. 3a). The speaker was continuously driven by a current waveform presenting a high frequency component such that the sample moved along the objective's vertical axis, perpendicular to the imaging plane. In a raw LSM acquired image (Fig. 3b) motion distortions are present in correspondence to the speaker displacements (red boxes) while stable artifact-free acquisition falls within temporal windows Tsg. Fig. 3c shows an image automatically reconstructed using the algorithm. A comparison with an image acquired without motion (no current), representing the “ground truth” image, is shown in Fig. 3d. The reconstructed image and the “ground truth” image are almost identical, with a high value of CC (Fig. 3f, 2 vs 3). This result shows that the reconstruction methodology can, in theory, produce motion artifact-free images starting from distorted images, without any a priori information regarding the motion components or the morphology of the sample at rest.

Figure 3

Phantom measurements for proof of principle.

(a) A fixed heart stained for microvasculature (Rhodamine lectin) is moved along the objective's axis by a loudspeaker oscillating at a frequency of 8 Hz. (b) The acquired image presents severe motion artifacts for all segments (red boxes) occurring in correspondence to the driving current (blue curve). Segments within a stabilized gating time window present no artifacts and, over time, high values of CC. (c) If images are acquired out of sync with the driving current, dynamically reconstructed images are automatically obtained. (d) Comparison with images obtained from the same sample and along the same imaging plane but with the sample in a static position (i.e. no current is driving the speaker) indicates optimal correlation between the two and supports the validity of our method. Comparison between segments at the same position in the microscope frame of reference, belonging to a “raw” image, a “dynamically” reconstructed image and a “static” one, shows different degrees of correlation. Segments belonging to adjacent images present a very low degree of correlation.

In vivo results

We next tested the algorithm in different in vivo imaging applications. Fig. 4a shows one example of renal imaging using a GFP-ubiquitin expressing mouse. The mouse was anesthetized and mechanically ventilated, while a tissue stabilizer was employed in order to introduce reproducibility in the breathing-induced motion. Despite the presence of the stabilizer, motion artifacts are evident throughout the entire sequence, resulting in severe image distortions. Using the new algorithm, automatic reconstructions were achieved within 10–20 frames. The resulting distortion-free image was then compared with an acquisition under static conditions. Here motion from the ventilator is eliminated by introducing a brief pause of a few seconds in the ventilator drive waveform, and the axial position of the objective is controlled in order to match the imaging plane used during respiratory activity. Direct comparison between the dynamically reconstructed image and the static one shows a high degree of correlation, again supporting the effectiveness of the proposed method.

Figure 4

Ventilator induced motion artifacts.

(a) An in vivo sequence of images of a kidney from a GFP-ubiquitin mouse is shown. A comparison between an automatically reconstructed motion artifact-free image and an image obtained with the ventilator transiently paused to eliminate residual motion artifact indicates a high degree of correlation between the two. Red boxes indicate all the individual segments automatically collected. (b) During a ventilation cycle, pressure-induced motion presents a high frequency component with the organs moving at high speed between two different positions of minimal motion (black trace) with low frequency components. (c) An in vivo sequence of images of a liver from a GFP-ubiquitin mouse (green) perfused with Rhodamine-dextran (red). The transition between two planes of high stability is evident in the sequence. Both planes can be automatically reconstructed using the self-seeding procedure.

During a ventilation cycle, pressure-induced motion presents a high frequency component with the organs moving at high speed between two different relatively stationary positions followed by low frequency components. The imaged organ will therefore move along a vertical axis transitioning alternatively between two stationary planes. Using the new algorithm, two different seeds can be automatically identified within the temporal sequence and two motion artifact-free images can be reconstructed corresponding to the expiration and inspiration phases (π1, π2). An in vivo image sequence of a liver from a GFP-ubiquitin mouse (green) perfused with Rhodamine-dextran (red) is shown in Fig. 4c. From the images it is evident that a transition between two stationary phases is present during ventilation. Automatic seed extraction in combination with the reconstructing algorithm gives rise to two images representing two different horizontal planes π1 and π2.

When N multiple stability points are present during the acquisition of a dataset, automatic seeding extraction guarantees that N horizontal planes free of any motion induced artifact are reconstructed. During ventilation, two parallel planes are easily isolated as demonstrated above. So if the imaged organ is translated (or alternatively the objective as is mostly the case, Fig. 5a) both stability planes will sample the imaged organ along the objective's axis, granting the possibility to automatically reconstruct a 3D dataset without any motion artifact. To demonstrate this concept we imaged in vivo a tumor orthotopically implanted into a mouse pancreas, while continuously translating the objective along the axial direction. An incremental step of 1.5 microns at a speed of 0.9 microns/sec was chosen. The whole reconstructed images were then combined to generate a 3D volume (Fig. 5b).

Figure 5

3D motion artifact free automatic reconstructions.

(a) While the sample is moving along a trajectory determined by physiological activity, the objective can be translated axially with a z-stage motion controller. (b) In a similar fashion as illustrated before, individual planes can be automatically reconstructed. Once combined, 3D motion compensated images can be automatically obtained. (c) Motion components induced by respiration or heart beating can even be detected in a mouse dorsal window chamber when imaging at high magnification. Here an image pre and post automatic reconstruction is shown. Arrows indicate cells that are morphologically distorted in the “raw” acquired images.

Breathing induced motion artifacts have a broad effect on all organs, with the degree of distortion depending on the degree of magnification. This is evident even when imaging within the dorsal skinfold window chamber (left panel Fig. 5c), which is typically firmly fixed during data collection. Here motion artifacts are still observable at high resolution but can be successfully removed through the use of the algorithm (right panel Fig. 5c).

Cardiac excursion is another major source of motion. We therefore tested whether the above approach could be used to reduce cardiac motion. An in vivo temporal sequence of acquired raw images of a beating heart is shown in Fig. 6a. Without motion compensation, there are severe image distortions as expected. In addition, due to the anatomical proximity of the heart to the lungs, respiratory artifacts are also substantial. Using the developed algorithm in combination with self-seeding extraction, two stabilized images representative of horizontal imaging planes were easily extracted, compensating simultaneously for both cardiac and respiratory motion. Because the time windows over which the heart is at a resting position are typically short, the total number of images required to obtain a stabilized reconstruction is higher (approximately 5 times) than in the case of respiratory activity only. When operating at 20 frames per second, this translates into a stabilized reconstruction within 2–5 seconds.

Figure 6

Artifact free automatic reconstructions of the beating heart in vivo.

A sequence of images of the beating heart is shown. Images are aligned such that time is along the horizontal axis. The ECG trace (red curve) indicates the point where the maximum motion occurs. If only cardiac motion is present, only segments within a specific temporal window coinciding with the end diastole will present a high degree of correlation over time. If only breathing motion is present, segments acquired within two distinct time windows of the respiratory cycle will present high values of correlation coefficient. If both physiological motion components are present, the probability to obtain segments with high values of correlation coefficient will be the product of the two. By using the illustrated automatic algorithm with self-seeding, two motion artifact-free images are obtained corresponding to two distinct planes π1 and π2. A comparison between reconstructed images and “raw” images is shown. All traces (ECG, ventilation) are reported for explanatory purpose and are not considered for the automatic reconstructions provided.

Discussion

Here we have presented an image processing algorithm capable of removing motion artifacts during intravital microscopy. The method is universally applicable to different laser scanning modalities (confocal, two photon, SHG, etc.) and offers artifact-free reconstructions truly representative of horizontal imaging planes. Moreover, 3D reconstructions of moving organs can be easily obtained through algorithm iteration. The method is not reliant on the periodicity of organ movements, a priori morphological information, or prospective triggering or retrospective gating. Rather, the method is based on simple raw data acquisition followed by image processing, facilitated by the use of a mechanical stabilizer. The advantages of the approach are obvious: the method is simple to adapt, inexpensive to implement and robust. For images 256 × 256 in size and with 2 microseconds/pixel integration time (4–12 raw images), rapid (2–5 seconds) reconstructions can be easily obtained, making the technique particularly useful to study cellular interaction, drug diffusion and organ morphology. Listed in Table 1, as an illustrative example, are the minimum numbers of raw images necessary to obtain a final artifact-free reconstruction in the presence of different physiological motions. Values are calculated for the case of acquired raw images 512 × 512 pixels in size and with 2 microseconds/pixel integration time. Ultimately, the minimum number of raw images depends on the acquisition parameters and the fluorescent signal present in the sample (i.e. concentration of endogenous fluorescent protein or contrast agent, quantum yield, excitation power, tissue penetration depth, etc.), which together determine whether images with a high signal-to-noise ratio can be collected. While we describe a first iteration, there could be further refinements. Multi-channel acquisition, for example, could be used to increase the amount of information present in each image, thus speeding up the reconstruction process. Also, the implementation of a scale invariant feature transform (SIFT) algorithm could be beneficial for handheld acquisition stabilization, where rotation and changes in the focal plane inevitably occur, and for the case of randomly occurring physiological motions. The developed method also works for intraoperative imaging of moving organs. Artifact-free images can be obtained during free-hand panning and stitched sequentially for whole organ mosaicking.

Table 1 Minimum number of raw images necessary to obtain a final artefact-free reconstruction for the different cases of physiological motions. The numbers are given for images 512 × 512 pixels in size and with 2 microseconds/pixel integration time. A reduced number of raw images (and shorter time) can be obtained if smaller sizes or shorter integration times are considered

Methods

Imaging setup

An Olympus FV1000-MPE laser scanning microscope system was used in confocal mode and two-photon mode. An XLPlan N 25× water-immersion objective with a numerical aperture of 1.05 and a working distance of 2 mm was used. A custom-made stabilizer was fabricated in-house. The detailed dimensions and structure of the stabilizer can be found in ref. 7, but it has not previously been used with the newly developed algorithm. The main function of the stabilizer is not to immobilize tissue but rather to introduce reliable, reproducible positions.

Mouse preparation

Experiments were approved by the Institutional Review Board. During surgical procedures and imaging, mice were anesthetized with 2% isoflurane in oxygen. The anesthetized mice were placed on a 37°C heating pad. For mechanical ventilation, mice were ventilated after intubation using a small animal ventilator (Harvard Apparatus INSPIRA ASV 55-7058). For imaging abdominal organs such as the liver, kidney and pancreas, the organ was externalized in a minimally invasive manner. For imaging the heart, a thoracotomy was performed and the heart was exposed while the animal was supported by the ventilator. The exposed organ was then held by the stabilizer and kept moist with saline during the experiment.

Image processing

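Raw images were processed in Matlab (The MathWorks, Natick, MA). The automatic image reconstruction described in the Results section (Fig. 2) was also implemented as a custom-designed routine in Matlab. The correlation coefficient (CC) between two segments (or views) S1 and S2 is defined as

$$\mathrm{CC}(S_1, S_2) = \frac{\sum_{x,y} \left(S_1(x,y) - \bar{S}_1\right)\left(S_2(x,y) - \bar{S}_2\right)}{\sqrt{\sum_{x,y} \left(S_1(x,y) - \bar{S}_1\right)^2 \, \sum_{x,y} \left(S_2(x,y) - \bar{S}_2\right)^2}}$$

where the average values are

$$\bar{S}_k = \frac{1}{N_{\mathrm{pix}}} \sum_{x,y} S_k(x,y), \qquad k = 1, 2,$$

and the sums run over all $N_{\mathrm{pix}}$ pixels (x, y) of a segment.

CC is in the range [−1,1]. A value of 1 indicates maximum match between two segments, while −1 corresponds to a maximum mismatch.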
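The correlation coefficient described here is the standard Pearson coefficient computed over all pixels of two equally sized segments. A minimal Python sketch follows; the function name is hypothetical, segments are represented as lists of rows, and returning 0 for a completely flat segment is an assumption made here to avoid division by zero.

```python
import math

def correlation_coefficient(s1, s2):
    """Pearson correlation coefficient between two equally sized segments.

    Each segment is a list of rows, each row a list of pixel intensities.
    """
    a = [v for row in s1 for v in row]
    b = [v for row in s2 for v in row]
    if len(a) != len(b) or not a:
        raise ValueError("segments must be non-empty and of equal size")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a) *
                    sum((y - mean_b) ** 2 for y in b))
    if den == 0:
        return 0.0  # assumption: a flat segment carries no structure to correlate
    return num / den

seg = [[1, 2, 3], [4, 5, 6]]
inverted = [[6, 5, 4], [3, 2, 1]]      # same pattern with reversed contrast
cc_same = correlation_coefficient(seg, seg)
cc_inverted = correlation_coefficient(seg, inverted)
print(cc_same, cc_inverted)
```

The two test segments illustrate the extremes of the range: a segment compared with itself gives the maximum match, and the contrast-reversed copy gives the maximum mismatch.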

Through k-means clustering the “similarity ensemble” E1 containing n1 segments is partitioned into p clusters {X1, X2, …, Xp}. First, p initial guesses m1, m2, …, mp are made by randomly selecting p segments. Second, the segments are classified in the following way: Xi = {Sj such that d(Sj, mi) ≤ d(Sj, mk), k = 1…p}, where d(S, m) is the distance between a segment S and a mean m. Third, each cluster mean is recalculated as the average of the segments assigned to it,

$$m_i = \frac{1}{|X_i|} \sum_{S_j \in X_i} S_j .$$

The second and third steps are repeated until the means no longer change.
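The three steps above can be sketched as follows. This is an illustrative Python version rather than the authors' Matlab code: the function names are hypothetical, the Euclidean distance between flattened segments is an assumption (the distance expression is not available in this text), and picking the segment closest to each cluster mean as the stability seed is likewise an assumed choice of "representative".

```python
import math
import random

def flatten(segment):
    """Turn a segment (list of rows) into a flat vector of floats."""
    return [float(v) for row in segment for v in row]

def distance(u, v):
    # Assumption: Euclidean distance between flattened segments.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def kmeans_segments(segments, p, rng=None, max_iter=100):
    """Partition segments into p clusters; return (labels, seed index per cluster)."""
    rng = rng or random.Random()
    vectors = [flatten(s) for s in segments]
    # Step 1: initial guesses are p randomly selected segments.
    means = [list(v) for v in rng.sample(vectors, p)]
    labels = None
    for _ in range(max_iter):
        # Step 2: assign every segment to its nearest mean.
        new_labels = [min(range(p), key=lambda i: distance(v, means[i])) for v in vectors]
        if new_labels == labels:      # assignments stable, so the means are stable too
            break
        labels = new_labels
        # Step 3: recompute each mean as the average of its members.
        for i in range(p):
            members = [v for v, lab in zip(vectors, labels) if lab == i]
            if members:               # keep the previous mean if a cluster empties
                means[i] = [sum(col) / len(members) for col in zip(*members)]
    # Assumed representative: the member closest to its cluster mean.
    seeds = [min((j for j, lab in enumerate(labels) if lab == i),
                 key=lambda j: distance(vectors[j], means[i]), default=None)
             for i in range(p)]
    return labels, seeds

# Four toy segments forming two well-separated groups (p = 2).
segments = [
    [[10, 10], [10, 10]],
    [[12, 10], [10, 10]],
    [[50, 50], [50, 50]],
    [[50, 50], [50, 52]],
]
labels, seeds = kmeans_segments(segments, p=2, rng=random.Random(0))
print(labels, seeds)
```

Because the initial guesses are random, k-means can in general settle in a poor local optimum; repeating the clustering from several initializations is a common safeguard, although the text does not say whether this was done.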