Identification of non-activated lymphocytes using three-dimensional refractive index tomography and machine learning

Yoon, Jonghee; Jo, YoungJu; Kim, Min-hyeok; Kim, Kyoohyun; Lee, SangYun; Kang, Suk-Jo; Park, YongKeun

doi:10.1038/s41598-017-06311-y


Article
Open access
Published: 27 July 2017


Jonghee Yoon^1,2^na1^nAff5,
YoungJu Jo^1,2^na1,
Min-hyeok Kim³,
Kyoohyun Kim^1,2,
SangYun Lee^1,2,
Suk-Jo Kang³ &
…
YongKeun Park^1,2,4

Scientific Reports volume 7, Article number: 6654 (2017)


Abstract

Identification of lymphocyte cell types is crucial for understanding their pathophysiological roles in human diseases. Current methods for discriminating lymphocyte cell types primarily rely on labelling techniques with magnetic beads or fluorescence agents, which add time and cost to sample preparation and carry a potential risk of altering cellular functions. Here, we present the identification of non-activated lymphocyte cell types at the single-cell level using refractive index (RI) tomography and machine learning. From the measurements of three-dimensional RI maps of individual lymphocytes, the morphological and biochemical properties of the cells are quantitatively retrieved. To construct cell type classification models, various statistical classification algorithms were compared, and the k-NN (k = 4) algorithm was selected. The algorithm combines multiple quantitative characteristics of the lymphocyte to construct the cell type classifiers. After optimizing the feature sets via cross-validation, the trained classifiers enable identification of three lymphocyte cell types (B, CD4+ T, and CD8+ T cells) with high sensitivity and specificity. The present method, which combines RI tomography and machine learning for the first time to our knowledge, could be a versatile tool for investigating the pathophysiological roles of lymphocytes in various diseases including cancers, autoimmune diseases, and virus infections.


Introduction

Lymphocytes consist of various cell types including B, helper (CD4+) T, cytotoxic (CD8+) T, and regulatory T cells, and play crucial roles in the adaptive immune system¹. Each lymphocyte cell type has different functions: B lymphocytes produce antibodies, and T lymphocytes recognize a specific antigen and execute effector functions. The lymphocyte population and function are tightly regulated to defend the host against harmful invaders or abnormal conditions^{1, 2}. Disturbances in lymphocyte function and regulation are related to various diseases including cancers^{3,4,5}, autoimmune diseases^{6, 7}, and virus infections^{8, 9}.

To understand the roles of different types of lymphocytes, several methods based on labelling techniques have been developed to identify and separate lymphocyte cell types. Because different types of non-activated lymphocytes have very similar cellular morphology, such as a large nucleus with small cytosolic regions and a round shape, it is impossible to discriminate lymphocyte cell types with conventional optical methods such as bright-field microscopy or phase contrast microscopy¹⁰. To overcome this difficulty, specific surface membrane proteins, known as surface markers, are recognized and tagged with magnetic beads or fluorescence molecules via antigen-antibody binding. Each type of lymphocyte can then be distinguished and separated by magnetic forces or fluorescence signals¹¹. Targeting surface markers is a precise and efficient approach to determine the cell types; however, labelling methods have potential risks of altering cellular functions by modifying membrane protein structures. In addition, labelling methods are limited in the number of cell types that can be identified simultaneously because of the limited multiplexing capability of the labelling agents¹².

Label-free approaches such as mass spectroscopy¹² and Raman spectroscopy¹⁰ have also been introduced to overcome the limitations of labelling methods because these spectroscopic methods exploit intrinsic biochemical properties of cells. Mass spectroscopy measures cellular biochemical properties which enable the profiling of lymphocyte proteins as well as the identification of lymphocyte cell types. However, it has a limitation in live-cell analysis due to the homogenization process of the cells. Raman spectroscopy measures molecular vibrations and also characterizes biochemical properties of a sample. Raman spectroscopy permits label-free live-cell analysis of lymphocytes with high accuracy; however, it requires a bulky optical system and long acquisition time (typically several seconds per cell) which limits its practical use.

Here, we present a method to identify lymphocyte cell types by exploiting optical diffraction tomography (ODT) and machine learning. ODT is a label-free imaging technique that measures a three-dimensional (3-D) refractive index (RI) tomogram of a sample which provides quantitative morphological and biochemical information^{13, 14}. ODT has been widely used to study various biological samples including red blood cells^{15,16,17,18,19,20,21,22}, white blood cells (WBC)^{23, 24}, hepatocytes²⁵, cancer cells^{16, 26,27,28,29,30,31,32}, neurons^{32, 33}, bacteria^{34,35,36}, phytoplankton³⁷, and hair³⁸. In our previous study, we reported that ODT enables the quantitative analysis of WBCs including lymphocytes and macrophages²³; we demonstrated that the two WBC subtypes could be discriminated using ODT. However, we were unable to simply identify lymphocyte cell types due to their nearly indistinguishable cellular morphology and biochemical characteristics.

In the present study, we use machine learning techniques to systematically interrogate the subtle differences between the lymphocyte cell types. Since RI is an intrinsic property of each biochemical component, the measured 3-D RI tomograms should encode the cell-type-specific fingerprints. However, it is difficult to manually discover such fingerprint information due to the complexity of 3-D tomograms. To address this difficulty, statistical classification methods construct classification models by combining multiple features in a data-driven manner, instead of relying on conventional hypothesis-driven investigations. This approach is especially powerful for high-dimensional data that are extremely difficult for humans to process manually because of their complexity and size³⁹, and thus machine learning techniques have been widely used to solve complex biological problems: identification of bacterial species^{40, 41}, discrimination of WBC subtypes^{42, 43}, investigation of pathophysiological conditions^{44,45,46}, and classification of kinetic cell states⁴⁷. Here we combine 3-D RI tomography and machine learning for the first time; we exploit statistical classification techniques to establish the cell type classifiers using the quantitative morphological and biochemical information extracted from the 3-D RI tomograms of individual lymphocytes. The trained classifiers enable identification of three lymphocyte cell types (B, CD4+ T, and CD8+ T cells) with high sensitivity and specificity.

Results

The overall procedures for identification of non-activated lymphocytes are summarized in Fig. 1. The present approach involves three steps: (i) measurement of the 3-D RI tomograms of individual lymphocytes (Fig. 1a), (ii) construction of the statistical cell type classifiers using the quantitative biochemical and morphological features extracted from the tomograms (Fig. 1b), and (iii) identification of the new individual lymphocytes using the established classifiers (Fig. 1c).

Figure 1a shows the procedure for reconstructing the RI tomograms of the individual lymphocytes. To reconstruct a 3-D RI tomogram, multiple 2-D holograms of a cell are measured at various illumination angles using an interferometric microscope⁴⁸ (Fig. 1d). A coherent laser beam is split into two arms by a beam splitter. One arm passes through a sample, and then the diffracted light from the sample is projected onto a camera plane through a microscope. At the camera plane, the sample beam interferes with the other arm to generate a spatially modulated hologram. The angle of the beam impinging onto the sample is controlled by a dual-axis galvanomirror. From the measured holograms, complex optical fields consisting of both the amplitude and quantitative phase images are retrieved using a field retrieval algorithm^{49, 50}. Then, a 3-D RI tomogram of a lymphocyte is reconstructed from the multiple amplitude and phase images via an optical diffraction tomography algorithm^{14, 51} (see Methods).

We obtained and sorted three lymphocyte cell types, B, CD4+ T, and CD8+ T cells, from mouse peripheral blood (see Methods). The measured 3-D RI tomograms of the individual lymphocytes are shown in Fig. 2. The cross-sectional slices of the representative 3-D tomograms of the three cell types are shown in Fig. 2a–c, respectively. To facilitate visualization, the tomographic data were rendered in 3-D using a customized transfer function in commercial software (TomoStudio^TM, Tomocube Inc., Republic of Korea) to resemble haematoxylin and eosin staining (Fig. 2d–f and Supplementary Videos 1–3). The measured RI distribution clearly visualizes the cellular boundaries and internal organelles such as nuclear membranes and nucleoli. The B cell shows a well-defined nucleus and nucleoli with RI values ranging from 1.34 to 1.41. We also note that the RI values of the cytosolic regions of the CD4+ and CD8+ T cells are higher than those of the B cell. Despite these slight differences in RI distribution, the cell-type-specific fingerprints for cell type identification could not be clearly defined through visual interrogation, mainly due to the cell-to-cell variations.

Next we extracted the quantitative characteristics of the individual lymphocytes from the 3-D RI tomograms as illustrated in Fig. 3 (n = 149, 95, and 112 for B, CD4+ T, and CD8+ T cells, respectively). Five quantitative parameters, three morphological (surface area, volume, and sphericity) and two biochemical (protein density and dry mass), were calculated from the tomograms (see Methods). The cellular surface area and volume of the lymphocytes were calculated from the segmented (RI threshold = 1.340) voxel information of the 3-D tomograms, and then the sphericity, a dimensionless parameter that indicates the roundness of the cellular morphology, was computed from the surface area and volume. The cellular protein density and dry mass were retrieved from the RI values, which are linearly proportional to the local concentration of non-aqueous molecules (mostly proteins).

To investigate the differences among the lymphocyte cell types, a statistical analysis was conducted on the quantitative parameters of the three lymphocyte cell types. First we studied the inter-type differences in the quantitative morphological features. The cellular surface areas of the B, CD4+ T, and CD8+ T lymphocytes were 145.87 ± 20.25, 167.23 ± 32.08, and 160.80 ± 19.12 μm², respectively (Fig. 3a). The B cells had significantly smaller cellular surface areas compared to the T cells (P < 0.001), while there was no significant difference between the CD4+ and CD8+ T cells. The cellular volumes of the lymphocytes showed a tendency similar to that of the surface areas. The cellular volumes of the B cells (133.43 ± 26.47 fL) were significantly smaller (P < 0.001) compared to those of the CD4+ T (155.73 ± 35.14 fL) and CD8+ T (152.77 ± 26.52 fL) cells (Fig. 3b), while the CD4+ and CD8+ T cells had similar cellular volumes. The sphericities were 0.86 ± 0.06, 0.84 ± 0.06, and 0.86 ± 0.05 for the B, CD4+ T, and CD8+ T cells, respectively (Fig. 3c). The sphericities of the CD4+ T cells were significantly smaller than those of the B cells (P < 0.01) and CD8+ T cells (P < 0.05). Note that all the lymphocyte cell types had high sphericity values, which reflect the round shapes of the lymphocytes.
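As a rough cross-check of the reported significance levels, a two-sample t statistic can be recomputed from the summary values alone. The sketch below assumes that the ± values are standard deviations and that the equal-variance (pooled) form of Student's t-test applies; neither detail is stated in the text, and the function name `pooled_t` is illustrative only.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample Student's t statistic (pooled variance) from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / standard_error, df


# Surface area in um^2: CD4+ T cells (n = 95) versus B cells (n = 149),
# using the means and spreads quoted in the text (spreads assumed to be SDs).
t_stat, df = pooled_t(167.23, 32.08, 95, 145.87, 20.25, 149)
print(f"t = {t_stat:.2f}, df = {df}")
```

Under these assumptions the statistic comes out above 6 with 242 degrees of freedom, which is comfortably beyond the value of roughly 3.3 needed for a two-sided P < 0.001 and therefore consistent with the reported result.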

We also compared the biochemical properties of the lymphocyte cell types. The cellular protein densities of the B, CD4+ T, and CD8+ T cells were 15.43 ± 1.88, 14.81 ± 2.54, and 16.66 ± 1.88 g/dL, respectively (Fig. 3d). The CD8+ T cells had significantly higher cellular protein densities compared to those of the others (P < 0.001). The cellular dry mass was 20.28 ± 2.97, 22.65 ± 4.49, and 25.19 ± 3.51 pg for the B, CD4+ T, and CD8+ T cells, respectively, and showed significant differences (P < 0.001) between all the cell types (Fig. 3e). The B cells had a smaller cellular dry mass compared to the T cells. Moreover, the CD8+ T cells were statistically heavier than the CD4+ T cells. To summarize, we observed several statistical differences in both morphological and biochemical features at the population level. However, we failed to manually establish cell type classifiers based on these parameters at the single-cell level, due to the high dimensionality of the feature space and the cell-to-cell variations.

To achieve accurate identification of individual lymphocytes, we employed a machine learning approach to combine and exploit multiple features encoded in the 3-D RI tomograms. First we enlarged the feature space by extracting the quantitative parameters calculated at 20 different threshold RI values (from 1.340 to 1.378 with an increment of 0.002) to reveal the information specific to intracellular components in addition to the overall morphology (Supplementary Fig. 1). Then we systematically investigated the 100-dimensional feature space (5 parameters per threshold value), which is impractical to explore manually, using statistical classification models. We employed the well-known k-nearest neighbours (k-NN) algorithm^{52, 53} (see Methods) with k = 4 after comparing several models (Supplementary Fig. 2).
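The size of the feature space follows directly from the threshold grid. A minimal sketch of the enumeration is given below; the parameter names and the `name@threshold` labelling scheme are illustrative choices, not taken from the study.

```python
# Five quantitative parameters evaluated at every RI threshold.
PARAMETERS = ["surface_area", "volume", "sphericity", "protein_density", "dry_mass"]

# 1.340, 1.342, ..., 1.378; integer arithmetic avoids accumulating float error.
thresholds = [round((1340 + 2 * i) / 1000, 3) for i in range(20)]

# One feature per (threshold, parameter) pair.
feature_names = [f"{p}@{t:.3f}" for t in thresholds for p in PARAMETERS]

print(len(thresholds), len(feature_names))
```

With 20 thresholds and 5 parameters each, the list contains the 100 features referred to in the text.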

The statistical classification, or supervised machine learning, was performed through the training and test stages as explained earlier (Fig. 1b,c). We randomly split the data of each lymphocyte cell type into a training set (70%; n = 104, 66, and 77 for B, CD4+ T, and CD8+ T cells, respectively) and a test set (30%; n = 45, 29, and 35 for B, CD4+ T, and CD8+ T cells, respectively). First we constructed the cell type classifiers by combining subsets of the features extracted from the training data set. We exhaustively searched for the optimal combinations of the features via cross-validation (see Methods) and the classifier with the best training accuracy was selected. The established classifiers exploit the cell-type-specific fingerprints recognized from the high-dimensional feature space. To test if these fingerprints are general to new lymphocytes, we identified the individual lymphocytes categorized as the test data. The test accuracy and its sub-parameters, called sensitivity (true positive results over all positive inputs) and specificity (true negative results over all negative inputs), were calculated by comparing the machine-predicted and true cell types.
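The per-class split and the two performance measures defined above can be sketched as follows. This is an illustration under assumptions: the function names, the rounding rule for the 70% cut, the random seed, and the toy labels are all hypothetical, and the code is not expected to reproduce the exact set sizes quoted in the text.

```python
import random


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Split sample indices into training and test sets separately for each class."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        cut = round(train_fraction * len(indices))  # rounding rule is an assumption
        train += indices[:cut]
        test += indices[cut:]
    return sorted(train), sorted(test)


def sensitivity_specificity(true_labels, predicted_labels, positive):
    """Sensitivity = TP / all positives; specificity = TN / all negatives."""
    tp = fn = tn = fp = 0
    for truth, prediction in zip(true_labels, predicted_labels):
        if truth == positive:
            if prediction == positive:
                tp += 1
            else:
                fn += 1
        elif prediction == positive:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Toy example with made-up labels (not data from the study).
true_cells = ["B", "B", "B", "T", "T", "T", "T"]
predicted_cells = ["B", "B", "T", "T", "T", "T", "B"]
sens, spec = sensitivity_specificity(true_cells, predicted_cells, positive="B")
print(f"sensitivity = {sens:.3f}, specificity = {spec:.3f}")
```

For a multiclass classifier, the same function can be applied once per cell type by treating that type as the positive class and all others as negative.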

Figure 4 and Tables 1–3 illustrate the identification performance for both training and test stages. We performed statistical classification on three different combinations of the lymphocyte cell types: (i) binary classification of B and T lymphocytes, (ii) binary classification of the two T lymphocyte types (CD4+ and CD8+), and (iii) multiclass classification of all three types of lymphocytes. First, the two T cell types were considered as one class to train a binary classifier of B and T cells (Fig. 4a). The accuracy of the optimized classifier was 93.15% and 89.81% for the training and test sets, respectively (selected features: surface area (RI threshold = 1.342, 1.368), volume (1.368), sphericity (1.342, 1.368), protein density (1.368), and dry mass (1.342, 1.368)). Second, the CD4+ and CD8+ T cells were also classified in a binary fashion (Fig. 4b). The accuracy was 87.41% and 84.38% for the training and test sets, respectively (selected features: surface area (1.342, 1.362) and sphericity (1.342, 1.362)). Lastly, the multiclass cell type classifier of the three lymphocyte cell types was constructed. The identification accuracies for the training and test sets were 80.65% and 75.93%, respectively (selected features: surface area (1.340), sphericity (1.370), protein density (1.370), and dry mass (1.370)). The small differences between training and test accuracy (i.e. negligible overfitting) suggest that the trained cell type classifiers make use of the general characteristics of each lymphocyte cell type; thus the trained classification models would accurately identify newly observed individual lymphocytes.

Table 1 Detailed performance of the B and T lymphocyte cell type classifiers.


Table 2 Detailed performance of the CD4+ and CD8+ T lymphocyte cell type classifiers.


Table 3 Detailed performance of the three lymphocyte cell types classifiers.


Discussion

We demonstrated the identification of non-activated lymphocyte cell types at the single-cell level using ODT and machine learning. ODT provides quantitative morphological and biochemical information on the individual lymphocytes by measuring their 3-D RI distribution. We extracted the quantitative parameters from the 3-D tomograms and observed significant differences between the cell types at the population level. However, we failed to construct accurate cell type classifiers at the single-cell level, mainly due to the high dimensionality of the feature space and the cell-to-cell variations. To overcome this limitation, the k-NN (k = 4) algorithm was employed as a statistical classification method to systematically extract and exploit the cell-type-specific fingerprints encoded in the 3-D RI tomograms. The optimized cell type classifiers can discriminate B and T cells with high accuracy of approximately 90%. The CD4+ and CD8+ T cells could be distinguished as well with an overall accuracy of over 80%. In addition, the simultaneous multiclass identification of the three lymphocyte cell types presented an overall accuracy of over 75%.

The identification performance shows that the classification models discriminate B and T lymphocytes more precisely compared to the T cell subtype identification (CD4+ and CD8+ T cells); these results imply that the differences in cellular morphology and biochemical properties between the B and T cells are more distinct than those between the CD4+ and CD8+ T cells. This observation is consistent with the previous knowledge of the lymphocyte-differentiation pathway⁵⁴. The B and T lymphocytes originate from hematopoietic stem cells and then mature in different organs. Thus, the lymphocytes have similar cellular phenotypes such as one large nucleus and spherical shapes; however, the B and T cells have entirely different cellular functions. Even though our method established the cell type classifiers by optimizing the features based on the statistical performance instead of biological relevance, the machine learning algorithm automatically recognizes and exploits the distinct differences in the morphological and biochemical properties between the lymphocyte cell types.

The present approach combines 3-D RI tomography and statistical classification, for the first time to our knowledge, which provides several advantages. First, the present method enables the identification of lymphocyte cell types by exploiting intrinsic optical properties of cells, which cannot be achieved by conventional optical microscopic techniques without using fluorescent methods. As ODT measures the intrinsic 3-D RI distribution of individual lymphocytes, it provides consistent and highly reproducible results. In contrast, labeling methods such as fluorescence microscopy techniques can provide molecular-specific localization information. However, their signals are generally qualitative, and may vary significantly depending on experimental protocols, skills, and equipment. This variability of labeling methods may decrease overall identification accuracy. Alternatively, existing bright-field or phase contrast microscopy can be used to measure 2-D morphological features, such as projected area, aspect ratio, and nucleus size. However, the qualitative image data obtained with conventional bright-field or phase contrast microscopy provide only limited information. We found that statistical classification based on morphological information only (surface area, cellular volume, and sphericity) lowers overall accuracy (Supplementary Fig. 3). This result clearly indicates the advantages of using the 3-D RI tomogram. Second, the present method uses a simple and cost-effective optical setup compared to fluorescence-activated cell sorting or other label-free techniques such as Raman spectroscopy. Recently, a 3-D holographic microscope has become commercially available, which simplifies the optical system and achieves over 100 tomograms per second by exploiting a digital micromirror device to reduce the required time for measuring a 3-D RI tomogram^{55, 56}. Thus, the present approach can be easily transferred to basic research facilities and clinics. Lastly, there is no limitation in applying the present method to discriminate other types of cells including RBCs, cancer cells, neurons, and glial cells. Because ODT has been widely used to measure various biological samples, the present approach can be readily used to identify various cell types.

One of the limitations of the current study is that antibody-labelled lymphocytes were used to demonstrate the proof of principle. Labeling and sorting procedures were unavoidable, because antibody labeling is the standard technique for confirming the subtype of each lymphocyte and thus for providing the reference labels for classification. Nonetheless, this approach still uses an intrinsic imaging contrast, the 3-D RI distribution, and the labeling agents used to target the surface markers have negligible effects on the measured 3-D RI signal. The number of CD4 antibodies bound per cell is about 48,000, whereas the total number of proteins in a eukaryotic cell is on the order of billions^{57, 58}. Thus, antibody staining accounts for less than 0.0005% of the total protein number, which can be ignored in the measured 3-D RI tomogram because the RI signal arises from the total protein distribution.
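The estimate above can be written out explicitly. As a sketch, let $N$ denote the total number of protein molecules per cell (a symbol introduced here for illustration only); the fraction $f$ contributed by the bound antibodies then depends on where $N$ falls within "billions":

```latex
f = \frac{4.8\times10^{4}}{N}, \qquad
f\big|_{N = 10^{9}} = 4.8\times10^{-5} \;(0.0048\%), \qquad
f\big|_{N = 10^{10}} = 4.8\times10^{-6} \;(0.00048\%)
```

Under this simple counting, the quoted bound of 0.0005% corresponds to $N \gtrsim 9.6\times10^{9}$, that is, to the upper end of the stated range; for any $N$ of the order of billions the antibody fraction remains far below 1%.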

From the algorithmic point of view, there are several points to be improved for practical use of the proposed technique. As described earlier, we compared several statistical classification algorithms including the k-NN (k = 4 and k = 6), linear discriminant analysis, quadratic discriminant analysis, naïve Bayes, and decision tree (Supplementary Fig. 2), and selected the k-NN (k = 4) as the classification model. While we tested several machine learning algorithms exploiting quantitative features with different thresholds widely used in ODT-based studies, employing advanced feature extraction methods and statistical classification models could enhance the overall identification performance. Unfortunately, designing powerful features for 3-D biological microscopy data, especially for 3-D RI tomograms, remains largely unexplored and is beyond the scope of this proof-of-concept study. The methods developed in different disciplines could be translated to facilitate this direction: scale-invariant feature transform for X-ray computerized tomography⁵⁹ or histogram-based features for lidar-based point clouds⁶⁰. However, these ‘shallow’ descriptors require laborious optimization procedures specific to the samples and imaging setups employed.

We expect that ‘deep’ learning, a state-of-the-art machine learning technique based on multilayered neural networks, could be a powerful and generic feature extraction strategy for 3-D biological microscopy. Recently, our group successfully combined 2-D holographic microscopy and deep learning for the label-free screening of multiple pathogens⁶¹. Upon extension to 3-D ODT, the remarkable flexibility and learning abilities of deep neural networks would let us fully exploit the complex information encoded in the 3-D RI tomograms and dramatically enhance the identification performance. An important step in this direction is to combine the proposed method with high-throughput imaging technologies⁶² to obtain training data sets large enough for deep learning.

In summary, we envision that ODT combined with machine learning will be a useful tool in biomedical research. ODT quantitatively provides the morphological and biochemical characteristics of the samples, and then machine learning enables the label-free identification of cell types using the measured quantitative information. The present method can be widely used in the study of immunology, cancer biology, and neuroscience.

Methods

Mice

C57BL/6J mice (gender- and age-matched, 6–8 weeks old) were purchased from Daehan Biolink (Republic of Korea). Animal care and experimental procedures were performed under approval of the Institutional Animal Care and Use Committee of KAIST (KA2010-21, KA2014-01 and KA2015-03). All the experiments in this study were carried out in accordance with the approved guidelines.

Flow cytometry for lymphocyte sorting

White blood cells were isolated from the blood harvested from the heart of mice. Erythrocytes were removed by ACK lysis. Cells were blocked with anti-CD16/32 and then stained for surface molecules. DAPI (4′,6-diamidino-2-phenylindole; Roche, Switzerland) was used for dead cell exclusion. Sorting was performed on an Aria II or III system (BD Biosciences, CA) using an 85-μm nozzle or an Astrios system (Beckman Coulter, CA) using a 70-μm nozzle. Antibodies for flow cytometry were purchased from BD Biosciences, eBioscience (CA), and Biolegend (CA). The antibodies used were CD3ε (clone 17A2), CD4 (GK1.5), CD8α (53-6.7), CD19 (1D3), CD45R (B220, RA3-6B2), and NK1.1 (PK136).

3-D refractive index tomography

To reconstruct the 3-D RI tomograms of lymphocytes, a Mach-Zehnder interferometric microscope was used¹⁵ (Fig. 1d). A laser beam from a diode-pumped solid-state laser (λ = 532 nm, 100 mW, Shanghai Dream Laser Co., China) is split into two arms using a beam splitter. One arm illuminates a sample with various illumination angles ranging from −60° to 60° in air at the sample plane with respect to the optic axis, which is systematically controlled with a dual-axis galvanomirror (GVS012, Thorlabs, NJ), and the other is used as a reference beam. The sample is placed between a condenser lens (UPLSAPO Water 60×, numerical aperture (NA) = 1.2, Olympus, Japan) and an objective lens (PLAPON Oil 100×, NA = 1.4, Olympus, Japan). For single-cell analysis, we locate a single cell in the field of view using a manual translation stage. The diffracted light from the sample is then collected by the objective lens and projected onto the camera plane. At the camera plane, the sample beam interferes with the reference beam, generating spatially modulated holograms, which are then captured by a CMOS camera (1024 PCI, Photron USA Inc., CA). To reconstruct a 3-D RI tomogram, a total of 300 holograms of a sample are measured at a frame rate of 1000 Hz while the illumination angle is changed, which takes 0.3 s. Then, the optical field information (amplitude and phase) of the measured holograms is retrieved using a field retrieval algorithm based on the Fourier transform^{49, 50}. From the retrieved amplitude and phase information, a 3-D RI tomogram is reconstructed using an optical diffraction tomography algorithm^{14, 51}. An iterative regularization algorithm with a non-negativity constraint was used to fill the missing cone information which results from the limited NA of the condenser and objective lenses⁶³. Details on reconstructing 3-D RI tomograms can be found elsewhere^{15, 64}. The experimental resolution of our setup, estimated by imaging a micro-bead, was 373 nm and 496 nm for the lateral and axial directions, respectively, which is consistent with the theoretical values⁶⁵.

Image processing and statistical analysis

Image processing was performed with Matlab (R2014b; MathWorks Inc., MA) and ImageJ (the National Institutes of Health, MD). The RI isosurfaces were rendered with commercial software (TomoStudio^TM, Tomocube Inc., Republic of Korea). Statistical analysis was done with GraphPad Prism (GraphPad Software Inc., CA). P-values were calculated by Student’s t-test.

Calculation of the quantitative structural and biochemical characteristics

The quantitative structural and biochemical parameters of the individual lymphocytes were calculated from the measured 3-D RI tomograms. To calculate the cellular surface area $S$ and volume $V$, the voxels with RI values higher than the threshold RI value were selected for segmentation from a 3-D RI tomogram of a lymphocyte. The surface area and volume were calculated from the number of voxels at the boundary and inner region of the segmented region, respectively. The sphericity, which is a dimensionless parameter that indicates the roundness of a lymphocyte, was obtained from the calculated surface area and volume as follows: $\text{Sphericity} = \pi^{1/3}(6V)^{2/3}/S$. The biochemical characteristics (protein density and dry mass) were obtained from the RI values due to the well-characterized linear relation between the RI value and the local concentration of non-aqueous molecules (i.e., proteins, lipids, and nucleic acids inside cells; mostly proteins). RI values were converted to the protein density $C$ with the following relation: $n = n_0 + \alpha C$, where $n$ and $n_0$ are the RI values of a voxel and the medium, respectively, and $\alpha$ is the refractive index increment (RII). Because it is known that most proteins have similar RII values, we used an RII value of 0.2 mL/g in this study. The total dry mass of a lymphocyte was calculated by integrating the protein density over the cellular volume. Details on calculating the quantitative information from 3-D RI tomograms can be found elsewhere^{23, 25}.
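The relations in this subsection can be illustrated with a short sketch. It is not the authors' Matlab implementation: the function names are hypothetical, the medium RI of 1.337 used in the example is an assumed placeholder (the value is not given in this section), and segmentation and surface extraction are taken as already done.

```python
import math


def sphericity(surface_area_um2, volume_um3):
    """Sphericity = pi^(1/3) * (6V)^(2/3) / S; equals 1 for a perfect sphere."""
    return math.pi ** (1 / 3) * (6 * volume_um3) ** (2 / 3) / surface_area_um2


def protein_density_g_per_dl(n_voxel, n_medium, rii_ml_per_g=0.2):
    """Invert n = n0 + alpha * C; C in g/mL is multiplied by 100 to give g/dL."""
    return (n_voxel - n_medium) / rii_ml_per_g * 100.0


def dry_mass_pg(voxel_ri_values, n_medium, voxel_volume_fl, rii_ml_per_g=0.2):
    """Sum the protein density over the segmented voxels (1 g/mL equals 1 pg/fL)."""
    return sum((n - n_medium) / rii_ml_per_g * voxel_volume_fl
               for n in voxel_ri_values)


N_MEDIUM = 1.337  # assumed medium RI, for illustration only

# Mean B-cell surface area (um^2) and volume (fL = um^3) quoted in the Results.
print(round(sphericity(145.87, 133.43), 2))
print(round(protein_density_g_per_dl(1.368, N_MEDIUM), 2))
```

Evaluating the sphericity at the mean B-cell surface area and volume gives about 0.87, close to the reported mean sphericity of 0.86; exact agreement is not expected because the mean of a ratio differs from the ratio of means.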

Machine learning

We investigated the 100-dimensional feature space as described in the main text. We selected the k-NN (k = 4) as the classification model after comparing several algorithms (Supplementary Fig. 2). The k-NN algorithm predicts the class of a newly observed data point by choosing the most frequent class label among its k nearest neighbours in the feature space. We standardized all features prior to training and testing because k-NN is sensitive to feature scaling. Since there is substantial redundancy between the features, and a minimal number of features is desirable to reduce overfitting, it was crucial to select the optimal feature set. We exhaustively searched all combinations of the morphological and biochemical features obtained at one or two different RI threshold values. The feature set with the highest cross-validation accuracy was selected. The optimized classifier was tested using data that were not utilized for training.
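The procedure described above (standardization, k-NN voting, and a cross-validated search over feature subsets) can be sketched in plain Python. Several details here are assumptions rather than the authors' settings: Euclidean distance, the tie-breaking rule, five folds, the cap on subset size, and the synthetic toy data. The search also ranges over generic feature subsets, which simplifies the threshold-based grouping used in the study.

```python
import itertools
import math
import random
from collections import Counter


def standardize(train_rows, rows):
    """Z-score each column of `rows` using the mean and SD of `train_rows`."""
    columns = list(zip(*train_rows))
    means = [sum(c) / len(c) for c in columns]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
           for c, m in zip(columns, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def knn_predict(train_x, train_y, query, k=4):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    nearest = sorted(zip(train_x, train_y),
                     key=lambda pair: math.dist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    top = max(votes.values())
    for _, label in nearest:  # ties go to the class with the closest member
        if votes[label] == top:
            return label


def cv_accuracy(x, y, feature_idx, k=4, folds=5, seed=0):
    """Cross-validated accuracy of k-NN restricted to the columns in `feature_idx`."""
    order = list(range(len(x)))
    random.Random(seed).shuffle(order)
    correct = 0
    for fold in range(folds):
        test_ids = order[fold::folds]
        held_out = set(test_ids)
        train_ids = [i for i in order if i not in held_out]
        train = [[x[i][j] for j in feature_idx] for i in train_ids]
        test = [[x[i][j] for j in feature_idx] for i in test_ids]
        train_z, test_z = standardize(train, train), standardize(train, test)
        train_y = [y[i] for i in train_ids]
        correct += sum(knn_predict(train_z, train_y, q, k) == y[i]
                       for q, i in zip(test_z, test_ids))
    return correct / len(x)


def best_feature_subset(x, y, max_size=2, **kwargs):
    """Exhaustively score every feature subset of up to `max_size` columns."""
    candidates = (c for size in range(1, max_size + 1)
                  for c in itertools.combinations(range(len(x[0])), size))
    return max(candidates, key=lambda c: cv_accuracy(x, y, c, **kwargs))


# Synthetic toy data: column 0 separates the classes, column 1 is pure noise.
rng = random.Random(1)
toy_x = ([[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(30)]
         + [[rng.gauss(10, 1), rng.gauss(0, 1)] for _ in range(30)])
toy_y = ["A"] * 30 + ["B"] * 30
best = best_feature_subset(toy_x, toy_y)
print(best, cv_accuracy(toy_x, toy_y, best))
```

Standardization is refitted on the training portion of each fold so that the held-out cells do not influence the scaling, which keeps the cross-validation estimate honest.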

References

Radtke, F., Wilson, A., Mancini, S. J. C. & MacDonald, H. R. Notch regulation of lymphocyte development and function. Nat Immunol 5, 247–253 (2004).
Article CAS PubMed Google Scholar
Moser, B. & Loetscher, P. Lymphocyte traffic control by chemokines. Nat Immunol 2, 123–128 (2001).
Article CAS PubMed Google Scholar
Alizadeh, A. A. et al. Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling. Nature 403, 503–511, doi:10.1038/35000501 (2000).
Rose, M. G. & Berliner, N. T-cell large granular lymphocyte leukemia and related disorders. Oncologist 9, 247–258, doi:10.1634/theoncologist.9-3-247 (2004).
de Visser, K. E., Eichten, A. & Coussens, L. M. Paradoxical roles of the immune system during cancer development. Nat Rev Cancer 6, 24–37, doi:10.1038/nrc1782 (2006).
Ueda, H. et al. Association of the T-cell regulatory gene CTLA4 with susceptibility to autoimmune disease. Nature 423, 506–511, doi:10.1038/nature01621 (2003).
von Boehmer, H. & Melchers, F. Checkpoints in lymphocyte development and autoimmune disease. Nat Immunol 11, 14–20 (2010).
Cabrera, R. et al. An immunomodulatory role for CD4(+)CD25(+) regulatory T lymphocytes in hepatitis C virus infection. Hepatology 40, 1062–1071, doi:10.1002/hep.20454 (2004).
Saez-Cirion, A. et al. HIV controllers exhibit potent CD8 T cell capacity to suppress HIV infection ex vivo and peculiar cytotoxic T lymphocyte activation phenotype. P Natl Acad Sci USA 104, 6776–6781, doi:10.1073/pnas.0611244104 (2007).
Hobro, A. J., Kumagai, Y., Akira, S. & Smith, N. I. Raman spectroscopy as a tool for label-free lymphocyte cell line discrimination. Analyst 141, 3756–3764 (2016).
Fischer, K. et al. Isolation and characterization of human antigen-specific TCR alpha beta(+) CD4(−)CD8(−) double-negative regulatory T cells. Blood 105, 2828–2835 (2005).
Wollscheid, B. et al. Mass-spectrometric identification and relative quantification of N-linked cell surface glycoproteins. Nat Biotechnol 27, 378–386 (2009).
Kim, D. et al. Refractive index as an intrinsic imaging contrast for 3-D label-free live cell imaging. bioRxiv, doi:10.1101/106328 (2017).
Kim, K. et al. Optical diffraction tomography techniques for the study of cell pathophysiology. J Biomed Photonics Eng 2, 020201 (2016).
Kim, K. et al. High-resolution three-dimensional imaging of red blood cells parasitized by Plasmodium falciparum and in situ hemozoin crystals using optical diffraction tomography. J Biomed Opt 19, doi:10.1117/1.Jbo.19.1.011005 (2014).
Kim, Y. et al. Common-path diffraction optical tomography for investigation of three-dimensional structures and dynamics of biological cells (vol 22, pg 10398, 2014). Opt Express 23, 18996–18996, doi:10.1364/Oe.23.018996 (2015).
Park, H. et al. Three-dimensional refractive index tomograms and deformability of individual human red blood cells from cord blood of newborn infants and maternal blood. J Biomed Opt 20, 111208, doi:10.1117/1.jbo.20.11.111208 (2015).
Park, H. et al. Characterizations of individual mouse red blood cells parasitized by Babesia microti using 3-D holographic microscopy. Sci Rep 5, doi:10.1038/Srep10827 (2015).
Lee, S. Y., Park, H. J., Best-Popescu, C., Jang, S. & Park, Y. K. The Effects of Ethanol on the Morphological and Biochemical Properties of Individual Human Red Blood Cells. PloS one 10, e0145327 (2015).
Hur, J., Kim, K., Lee, S., Park, H. & Park, Y. Melittin-induced alterations in morphology and deformability of human red blood cells using quantitative phase imaging techniques. bioRxiv 091991 (2016).
Lee, S., Park, H., Jang, S. & Park, Y. Refractive index tomograms and dynamic membrane fluctuations of red blood cells from patients with diabetes mellitus. Blood 128(22), 4813–4813 (2016).
Park, H. et al. Measuring cell surface area and deformability of individual human red blood cells over blood storage using quantitative phase imaging. Sci Rep 6 (2016).
Yoon, J. et al. Label-free characterization of white blood cells by measuring 3D refractive index maps. Biomed Opt Express 6, 3865–3875 (2015).
Kim, K., Yoon, J. & Park, Y. Simultaneous 3D visualization and position tracking of optically trapped particles using optical diffraction tomography. Optica 2, 343–346 (2015).
Kim, K. et al. Three-dimensional label-free imaging and quantification of lipid droplets in live hepatocytes. Sci Rep 6, 36815 (2016).
Kim, D. et al. Label-free high-resolution 3-D imaging of gold nanoparticles inside live cells using optical diffraction tomography. bioRxiv 097113 (2016).
Sung, Y. J. et al. Optical diffraction tomography for high resolution live cell imaging. Opt Express 17, 266–277, doi:10.1364/Oe.17.000266 (2009).
Kus, A., Dudek, M., Kemper, B., Kujawinska, M. & Vollmer, A. Tomographic phase microscopy of living three-dimensional cell cultures. J Biomed Opt 19, doi:10.1117/1.Jbo.19.4.046009 (2014).
Sung, Y., Choi, W., Lue, N., Dasari, R. R. & Yaqoob, Z. Stain-Free Quantification of Chromosomes in Live Cells Using Regularized Tomographic Phase Microscopy. Plos One 7, doi:10.1371/journal.pone.0049502 (2012).
Su, J. W., Hsu, W. C., Chou, C. Y., Chang, C. H. & Sung, K. B. Digital holographic microtomography for high-resolution refractive index mapping of live cells. J Biophotonics 6, 416–424, doi:10.1002/jbio.201200022 (2013).
Hsu, W. C., Su, J. W., Tseng, T. Y. & Sung, K. B. Tomographic diffractive microscopy of living cells based on a common-path configuration. Opt Lett 39, 2210–2213, doi:10.1364/Ol.39.002210 (2014).
Cotte, Y. et al. Marker-free phase nanoscopy. Nat Photon 7, 418–418, doi:10.1038/nphoton.2013.116 (2013).
Yang, S.-A., Yoon, J., Kim, K. & Park, Y. Measurements of morphological and biochemical alterations in individual neuron cells associated with early neurotoxic effects in Parkinson’s disease using optical diffraction tomography. Cytometry Part A, doi:10.1002/cyto.a.23110 (2017).
Bennet, M., Gur, D., Yoon, J., Park, Y. & Faivre, D. A Bacteria‐Based Remotely Tunable Photonic Device. Adv Opt Mater (2016).
Kim, T. et al. White-light diffraction tomography of unlabelled live cells. Nat Photon 8, 256–263 (2014).
Kim, T. I. et al. Antibacterial Activities of Graphene Oxide–Molybdenum Disulfide Nanocomposite Films. ACS Appl Mater Interfaces 9, 7908–7917 (2017).
Lee, S. et al. High-Resolution 3-D Refractive Index Tomography and 2-D Synthetic Aperture Imaging of Live Phytoplankton. J Opt Soc Korea 18, 691–697 (2014).
Lee, S. et al. Measurements of morphology and refractive indexes on human downy hairs using three-dimensional quantitative phase imaging. J Biomed Opt 20, 111207–111207 (2015).
Kotsiantis, S. B., Zaharakis, I. & Pintelas, P. Machine learning: a review of classification and combining techniques. Artif Intell Rev 26, 159 (2007).
Jo, Y. et al. Label-free identification of individual bacteria using Fourier transform light scattering. Opt Express 23, 15792–15805, doi:10.1364/Oe.23.015792 (2015).
Jo, Y. et al. Angle-resolved light scattering of individual rod-shaped bacteria based on Fourier transform light scattering. Sci Rep 4, 5090, doi:10.1038/srep05090 (2014).
Vercruysse, D. et al. Three-part differential of unlabeled leukocytes with a compact lens-free imaging flow cytometer. Lab on a Chip 15, 1123–1132 (2015).
Chen, C. L. et al. Deep Learning in Label-free Cell Classification. Sci Rep 6 (2016).
Ramos-Pollan, R. et al. Discovering Mammography-based Machine Learning Classifiers for Breast Cancer Diagnosis. J Med Syst 36, 2259–2269, doi:10.1007/s10916-011-9693-2 (2012).
Ozolek, J. A. et al. Accurate diagnosis of thyroid follicular lesions from nuclear morphology using supervised learning. Med Image Anal 18, 772–780, doi:10.1016/j.media.2014.04.004 (2014).
Bergner, N. et al. Identification of primary tumors of brain metastases by Raman imaging and support vector machines. Chemometr Intell Lab 117, 224–232, doi:10.1016/j.chemolab.2012.02.008 (2012).
Hejna, M., Jorapur, A., Song, J. S. & Judson, R. L. High accuracy label-free classification of kinetic cell states from holographic cytometry. bioRxiv 127449 (2017).
Lee, K. et al. Quantitative phase imaging techniques for the study of cell pathophysiology: from principles to applications. Sensors 13, 4170–4191 (2013).
Takeda, M., Ina, H. & Kobayashi, S. Fourier-transform method of fringe-pattern analysis for computer-based topography and interferometry. J Opt Soc Am 72, 156–160 (1982).
Debnath, S. K. & Park, Y. Real-time quantitative phase imaging with a spatial phase-shifting algorithm. Opt Lett 36, 4677–4679 (2011).
Wolf, E. Three-dimensional structure determination of semi-transparent objects from holographic data. Opt Commun 1, 153–156 (1969).
Miska, E. A. Microrna expression profiles classify human cancers. Cytometry B Clin Cytom 72b, 126–126 (2007).
Chen, X. W., Zhou, X. B. & Wong, S. T. C. Automated segmentation, classification, and tracking of cancer cell nuclei in time-lapse microscopy. Ieee T Bio-Med Eng 53, 762–766, doi:10.1109/Tbme.2006.870201 (2006).
Ogawa, M. Differentiation and Proliferation of Hematopoietic Stem-Cells. Blood 81, 2844–2853 (1993).
Shin, S. et al. Optical diffraction tomography using a digital micromirror device for stable measurements of 4D refractive index tomography of cells. SPIE BiOS. 971814-971814-971818 (2016).
Shin, S., Kim, K., Yoon, J. & Park, Y. Active illumination using a digital micromirror device for quantitative phase imaging. Opt Lett 40, 5407–5410 (2015).
Wang, L. et al. Discrepancy in measuring CD4 expression on T‐lymphocytes using fluorescein conjugates in comparison with unimolar CD4‐phycoerythrin conjugates. Cytometry B Clin Cytom 72, 442–449 (2007).
Milo, R. What is the total number of protein molecules per cell volume? A call to rethink some published values. Bioessays 35, 1050–1055 (2013).
Flitton, G. T., Breckon, T. P. & Bouallagu, N. M. Object recognition using 3d sift in complex ct volumes. BMVC (2010).
Behley, J., Steinhage, V. & Cremers, A. B. Performance of histogram descriptors for the classification of 3D laser range data in urban environments In Robotics and Automation (ICRA), 4391–4398 (2012).
Jo, Y. et al. Holographic deep learning for rapid optical screening of anthrax spores. bioRxiv, 109108 (2017).
Sung, Y. et al. Three-dimensional holographic refractive-index measurement of continuously flowing cells in a microfluidic channel. Phys Rev Appl 1, 014002 (2014).
Lim, J. et al. Comparative study of iterative reconstruction algorithms for missing cone problems in optical diffraction tomography. Opt Express 23, 16933–16948, doi:10.1364/Oe.23.016933 (2015).
Kim, K., Choe, K., Park, I., Kim, P. & Park, Y. Holographic intravital microscopy for 2-D and 3-D imaging intact circulating blood cells in microcapillaries of live mice. Sci Rep 6 (2016).
Lauer, V. New approach to optical diffraction tomography yielding a vector equation of diffraction tomography and a novel tomographic microscope. J Microsc 205, 165–176, doi:10.1046/j.0022-2720.2001.00980.x (2002).


Acknowledgements

This work was supported by KAIST, Tomocube Inc., and the National Research Foundation of Korea (2015R1A3A2066550, 2014M3C1A3052567, 2014K1A3A1A09063027 to Y.P., 2012M3A9B4027955 to S.K.). Y.J. acknowledges support from KAIST Presidential Fellowship.

Author information

Jonghee Yoon
Present address: Department of Physics, University of Cambridge, Cambridge, CB3 0HE, UK
Jonghee Yoon and YoungJu Jo contributed equally to this work.

Authors and Affiliations

Department of Physics, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, 34141, Republic of Korea
Jonghee Yoon, YoungJu Jo, Kyoohyun Kim, SangYun Lee & YongKeun Park
KAIST Institute Health Science and Technology, Daejeon, 34141, Republic of Korea
Jonghee Yoon, YoungJu Jo, Kyoohyun Kim, SangYun Lee & YongKeun Park
Department of Biological Sciences, KAIST, Daejeon, 34141, Republic of Korea
Min-hyeok Kim & Suk-Jo Kang
Tomocube, Inc., Daejeon, 34051, Republic of Korea
YongKeun Park

Authors

Jonghee Yoon
YoungJu Jo
Min-hyeok Kim
Kyoohyun Kim
SangYun Lee
Suk-Jo Kang
YongKeun Park

Contributions

Y.P. conceived of the idea and directed the work. J.Y. performed the optical experiments and processed the tomographic data. Y.J. designed and implemented the classification models. J.Y. and Y.J. optimized the cell type classifiers. M.K. isolated and sorted the lymphocytes from mouse peripheral blood. K.K. designed the optical system. J.Y., Y.J., M.K., S.K., and Y.P. analysed the data. All authors wrote and revised the manuscript.

Corresponding author

Correspondence to YongKeun Park.

Ethics declarations

Competing Interests

Y.P. has financial interests in Tomocube Inc., a company that commercializes optical diffraction tomography and quantitative phase imaging instruments and is one of the sponsors of this study.

Additional information

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Video 1

Video 2

Video 3

Supplementary information

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.


About this article

Cite this article

Yoon, J., Jo, Y., Kim, Mh. et al. Identification of non-activated lymphocytes using three-dimensional refractive index tomography and machine learning. Sci Rep 7, 6654 (2017). https://doi.org/10.1038/s41598-017-06311-y


Received: 20 February 2017
Accepted: 05 June 2017
Published: 27 July 2017
DOI: https://doi.org/10.1038/s41598-017-06311-y
