Introduction

Image segmentation has become an essential technique in fields from medical imaging1,2,3 and autonomous driving4 to robotic perception5 and image compression6,7,8. Through unsupervised segmentation of large data sets, trained algorithms can recognize and predict elements of new images. An appealing application of image segmentation is the thickness identification of two-dimensional (2D) materials from their digital optical microscopy images. Current flake detection methods rely heavily on trained researchers: a human-learning process in which flakes are identified by their contrast difference on a substrate after significant trial and error. Automatic thickness identification would relieve this tedious, time-consuming screening process and possibly improve identification accuracy.

The simplest implementation of image segmentation for 2D materials is thresholding. This is performed by analyzing image contrast, from reflectance or transmittance for example, and partitioning regions of an image based on contrast level differences. This technique has been widely and successfully employed in the identification and characterization of exfoliated 2D materials9,10,11,12,13,14,15,16,17,18,19. Thresholding techniques, while easily implemented, suffer from inaccuracy when contrast differences become relatively small (for example, a single layer of graphene on a silicon/silicon dioxide (Si/SiO\(_2\)) substrate) and can be highly dependent on precise experimental conditions, hindering universal application.
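As a point of reference, a minimal thresholding segmenter can be sketched in a few lines of Python; the cutoff values in the example are arbitrary placeholders, not calibrated contrast levels for any real substrate.

```python
import numpy as np

def threshold_segment(gray, thresholds):
    """Partition a grayscale image (values 0-1) by contrast level.

    `thresholds` is a list of contrast cutoffs; each pixel is labeled by
    the highest cutoff it meets (0 = below all cutoffs, i.e. substrate).
    """
    labels = np.zeros(gray.shape, dtype=int)
    for i, t in enumerate(sorted(thresholds), start=1):
        labels[gray >= t] = i
    return labels

# Toy 2x2 "image" with two contrast levels above the background.
img = np.array([[0.1, 0.4],
                [0.4, 0.8]])
segmented = threshold_segment(img, [0.3, 0.6])
```

Such a segmenter works only while the contrast gap between layers comfortably exceeds the noise, which is precisely the limitation noted above.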

Recently, a variety of more advanced machine learning techniques have emerged to automate and improve the process of identifying exfoliated 2D materials20,21,22,23,24,25,26,27,28,29,30. These include techniques based on neural networks and data clustering but have been primarily applied to opaque substrates and almost entirely to standard Si/SiO\(_2\). Transparent substrates are commonly used for exfoliation or experiments on 2D materials11,31,32,33,34,35. A method that can be applied to identify the thickness of any material on any substrate is highly desirable.

Here we present an open source program36 written in Python 3 to automatically identify the thickness of exfoliated 2D flakes, which can be universally applied to different materials and substrates. We combine three well-established clustering techniques to form a training script that segments the layers of a flake, manually label the layers, and then use that training to test the thicknesses of other flakes. The program achieves roughly 95% pixel accuracy for graphene and transition metal dichalcogenides on silicon/silicon dioxide and polydimethylsiloxane (PDMS) substrates. Importantly, no change to the program’s adjustable parameters is needed to identify different materials on different substrates, allowing simple and universal application to any material/substrate combination.

Overview of the program

An overview of the program applied to graphene on Si/SiO\(_2\) is shown in Fig. 1. The training stage begins with a set of optical microscopy images cropped to few-layer flakes whose thicknesses are determined using optical contrast methods9. Figure 1a shows a cropped optical image of a few-layer graphene flake on a Si/SiO\(_2\) (300 nm SiO\(_2\)) substrate with each layer thickness labeled (1–4 layers). A scatter plot of the red, green, and blue (RGB) channel values (normalized to a range 0–1) for each pixel in the image is shown in Fig. 1b. The data points have been colored according to their RGB values. At this stage, the scatter plot shows only broad features that can be generally associated with the substrate (pink) and few-layer graphene (purple) but no clear correspondence to individual thicknesses can be made. The raw image is preprocessed using a bilateral filter to reduce noise and a background normalization using a planar fit. The result, after compression to roughly 10,000 total pixels, is shown in Fig. 1c. Preprocessing reveals the individual clusters of data in the scatter plot (Fig. 1d) associated with the substrate and flake layers.
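The background normalization step can be sketched as a least-squares planar fit divided out of each color channel. Dividing (rather than subtracting) the fitted plane is an illustrative assumption, and the bilateral noise filter that precedes this step in the program is omitted here.

```python
import numpy as np

def normalize_background(img):
    """Flatten uneven illumination by dividing out a planar fit.

    Fits I(x, y) = a*x + b*y + c to each color channel by least squares
    and divides the channel by the fitted plane, so the substrate ends
    up near a uniform level across the image.
    """
    h, w, _ = img.shape
    ys, xs = np.mgrid[0:h, 0:w]
    A = np.column_stack([xs.ravel(), ys.ravel(), np.ones(h * w)])
    out = np.empty_like(img, dtype=float)
    for c in range(img.shape[2]):
        coef, *_ = np.linalg.lstsq(A, img[..., c].ravel(), rcond=None)
        plane = (A @ coef).reshape(h, w)
        out[..., c] = img[..., c] / plane
    return out
```

Applied to an image whose background is exactly planar, this returns a uniform substrate level, leaving flake regions as deviations from it.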

The location and distribution of these clusters in RGB space are found using a series of unsupervised clustering techniques summarized here and detailed further below. The centers are first located using mean shift and density-based spatial clustering. Once the centers are identified, fit characteristics such as the weight, mean position, and distribution of each cluster are found using a Gaussian mixture model. An image of the result of the fitting algorithm is shown in Fig. 1e. The pixels in the color plot have been colored according to the fit results and the scatter plot (Fig. 1f) shows these fit assignments in more detail. Once the cluster characteristics are extracted for several training images, a master catalogue is created that ties the fit clusters to the predetermined flake thicknesses in our training images.

To determine the accuracy of the training, we test the master catalogue on a set of images with identified thicknesses. An example of the testing results for graphene on Si/SiO\(_2\) is shown in Fig. 1g–j. In the testing stage, untrained optical images (Fig. 1g,h) are first preprocessed using the same procedure as in training but without cropping. The preprocessed images are then checked against the master catalogue for flake thickness assignment given each pixel’s location in RGB space. Figure 1i shows the result (cropped to show detail of the flake of interest) with layer thicknesses identified by the color bar. The associated scatter plot in Fig. 1j shows the corresponding clusters and layer assignments. In the following section we detail the implementation of the clustering algorithms before presenting results of our program applied to other materials and substrates.

Figure 1

Overview of the program composed of training and testing stages. (a,b) Raw optical image of a few-layer graphene flake on 300 nm of SiO\(_2\) (a) and its corresponding scatter plot of each pixel in RGB color space (b). The scatter plot data points are colored to match their RGB value. (c,d) The result after preprocessing the image in (a). The preprocessing reveals well-defined clusters in the associated scatter plot (d). (e,f) A color plot of pixel-cluster association (e) and corresponding scatter plot (f). Each pixel is colored based on its most probable data cluster identity. (g,h) Raw optical image of few-layer graphene flake with unknown thickness (g) used for testing and its corresponding RGB scatter plot (h). (i,j) Crop of (g) around the flake of interest (i) and corresponding scatter plot after the testing stage (j).

Unsupervised clustering algorithms

Our training script incorporates three clustering algorithms to identify the center of the data clusters and fit their distributions. Without explicitly knowing the number of clusters (layers) in the image, the script begins with an unsupervised method of determining the seed number. We use mean shift37,38,39 and density-based40,41 algorithms to first find these cluster centers which are then fed to a Gaussian mixture model for fitting arbitrary ellipsoidal distributions.

Mean shift is an unsupervised machine learning algorithm that locates centers of high data density. The algorithm begins by populating color space with an array of points, referred to as “mean points” (\(\vec {\rho }_k\)). Figure 2a shows the same data as in Fig. 1d but with red closed circles indicating the initial positions of the equally spaced mean points (an \(8\times 8\times 8\) array). The next step groups all data points within a defined radius (\(\epsilon\)) of each mean point together. We define \(\epsilon\) to be just large enough that the neighborhood of each mean point overlaps those of its nearest neighbors. The average location of the data pixels within \(\epsilon\) of a given mean point becomes the new position of that mean point (\(\vec {\rho }_k'\)) after one iteration of the algorithm. This is calculated by:

$$\begin{aligned} \vec {\rho }_k'=\frac{1}{M}\sum _i{\vec {x}_i} \text { for } |\vec {x}_i-\vec {\rho }_k|<\epsilon , \end{aligned}$$
(1)

where \(\vec {x}_i\) is the position of each data point and M is the total number of data points within \(\epsilon\) of \(\vec {\rho }_k\). In this way, each mean point gradually shifts towards higher densities of data. Figure 2b shows one iteration of the algorithm. Several points have moved to their new mean positions according to the data within \(\epsilon\) of each \(\vec {\rho }_k\).
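A minimal numpy sketch of this update, following Eq. (1), is given below; the \(8\times 8\times 8\) seeding grid and the efficiency optimizations described next are omitted, so this is a simplified form rather than the published implementation.

```python
import numpy as np

def mean_shift_step(points, means, eps):
    """One iteration of Eq. (1): move each mean point to the average of
    the data points within radius eps of it.  Mean points with no data
    in range stay put (the full program deletes them)."""
    new_means = means.copy()
    for k, rho in enumerate(means):
        mask = np.linalg.norm(points - rho, axis=1) < eps
        if mask.any():
            new_means[k] = points[mask].mean(axis=0)
    return new_means

def mean_shift(points, means, eps, tol=1e-6, max_iter=100):
    """Iterate Eq. (1) until every mean point stops moving, i.e. has
    converged to its local density maximum."""
    for _ in range(max_iter):
        new_means = mean_shift_step(points, means, eps)
        if np.allclose(new_means, means, atol=tol):
            break
        means = new_means
    return means
```

Seeded with a coarse grid over RGB space, the surviving mean points settle onto the density maxima visible in Fig. 2c.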

Mean shift is computationally slow, having to calculate the distance between every data point (\(\approx\) 10,000 pixels) and every mean point (initially 512). To increase efficiency, mean points that have no data within \(\epsilon\) after the first cycle, and thus make no contribution towards a data cluster, are deleted. Additionally, mean points may approach their local maximum at different rates: as soon as the number of data points within a mean point’s radius starts to decrease, it is turned off and no longer involved in future calculations. Figure 2c shows the final state of the algorithm where all mean points have converged to their local density maxima. After this, outliers are removed before moving to the next algorithm.

Once mean shift is complete, several mean points will themselves be clustered in color space and some mean points will have converged to outliers. Due to the ellipsoidal shape of the clusters in RGB space after preprocessing, the mean points will tend to lie along lines. An efficient algorithm for grouping these lines is Density-Based Spatial Clustering of Applications with Noise (DBSCAN)40,41. DBSCAN groups data together by following the trajectory of nearby points. The algorithm starts by “visiting” a random mean point. A radius (we find \(\epsilon /2\) works well) around it is checked for other mean points. If none are found, the starting point is labeled an outlier. If there are neighbors, they are grouped together. One of the other points in this group is visited next, checking the same radius around itself to find new points to add to the group. This repeats until no new points are added to the group and every point within the group has been visited. Once the group is finished, a new group starts at a randomly chosen mean point and the process repeats. The centers of each group are found by averaging their respective mean points. Figure 2d shows the result after running the DBSCAN algorithm on the mean points in Fig. 2c.
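The grouping step can be sketched as follows. This is a simplified DBSCAN without the core/border-point distinction of the full algorithm40,41, applied to the converged mean points with the \(\epsilon/2\) radius mentioned above.

```python
import numpy as np

def dbscan_group(points, radius):
    """Group points by following trajectories of neighbors within
    `radius`.  Returns one label per point; -1 marks outliers that have
    no neighbors at all."""
    n = len(points)
    labels = np.full(n, -1)
    group = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        neighbors = [j for j in range(n) if j != start
                     and np.linalg.norm(points[j] - points[start]) < radius]
        if not neighbors:
            continue  # isolated point: leave as outlier
        labels[start] = group
        queue = list(neighbors)
        while queue:  # visit each grouped point and absorb its neighbors
            j = queue.pop()
            if labels[j] != -1:
                continue
            labels[j] = group
            queue += [m for m in range(n) if labels[m] == -1
                      and np.linalg.norm(points[m] - points[j]) < radius]
        group += 1
    return labels

def group_centers(points, labels):
    """Cluster centers: the average of the mean points in each group."""
    return np.array([points[labels == g].mean(axis=0)
                     for g in range(labels.max() + 1)])
```

Each resulting center seeds one ellipsoid in the Gaussian mixture model described next.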

Figure 2

Mean shift and DBSCAN clustering for identification of cluster centers. (a) The same data as in Fig. 1d with red closed circles showing the initial positions of the mean points in the mean shift algorithm. (b) Scatter plot after one cycle of mean shift; outlier mean points have been deleted and others have moved towards their local density maxima. (c) The final state of the mean points after they have converged to their maxima. (d) The RGB pixels plotted with the identified cluster centers (colored closed circles) after the DBSCAN algorithm.

The combination of mean shift and DBSCAN presents an unsupervised method of determining how many clusters are in a given image and their centers. This information can then be used to seed a more powerful clustering technique for data with ellipsoidal distributions. The popular K-means clustering technique42,43,44, for example, is undesirable here as it assumes spherical clusters. Instead, we use a multivariate Gaussian Mixture Model (GMM)2,45,46,47,48,49 that allows fitting of data with arbitrary normal distributions. This expands the application of the program by automatically handling new material and substrate combinations that may have different cluster distributions in RGB space.

In the GMM, each fitting ellipsoid has three characteristics developing throughout the process: the weight (\(\phi _k\), defining the number of data points near ellipsoid k), the centroid (\(\vec {\mu }_k\), defining the mean of the data points belonging to ellipsoid k), and the covariance matrix (\(\Sigma _k\), defining the shape and orientation of ellipsoid k in RGB space). These characteristics are used to calculate the probability \(\gamma _{ik}\) of a data point \({\vec {x}_i}\) belonging to ellipsoid k. This probability is given by:

$$\begin{aligned} \gamma _{ik}=\frac{\phi _k{\mathcal {N}}(\vec {x}_i,\vec {\mu }_k,\Sigma _k)}{\sum _{j=1}^K\phi _j{\mathcal {N}}(\vec {x}_i,\vec {\mu }_j,\Sigma _j)}, \end{aligned}$$
(2)

where K is the total number of clusters and \({\mathcal {N}}\) is the three-variable (for 3-dimensional RGB space) Gaussian distribution given by:

$$\begin{aligned} {\mathcal {N}}(\vec {x}_i,\vec {\mu }_k,\Sigma _k)=\left[(2\pi )^3 |\Sigma _k| e^{(\vec {x}_i-\vec {\mu }_k)^T\Sigma _k^{-1}(\vec {x}_i-\vec {\mu }_k)}\right]^{-\frac{1}{2}}. \end{aligned}$$
(3)

The weights, means, and covariance matrices used in these relations are calculated through:

$$\begin{aligned} \phi _k= & {} \frac{1}{N}\sum _{i=1}^N\gamma _{ik}, \end{aligned}$$
(4)
$$\begin{aligned} \vec {\mu }_k= & {} \frac{\sum _{i=1}^N\gamma _{ik}\vec {x}_i}{\sum _{i=1}^N\gamma _{ik}}, \end{aligned}$$
(5)
$$\begin{aligned} \Sigma _k= & {} \frac{\sum _{i=1}^N\gamma _{ik}(\vec {x}_i-\vec {\mu }_k)(\vec {x}_i-\vec {\mu }_k)^T}{\sum _{i=1}^N\gamma _{ik}}. \end{aligned}$$
(6)

First we initialize each of the fitting ellipsoids by setting all initial weights to 1/K. The centroids are taken directly from the results of DBSCAN (\(\vec {\mu }_k=\vec {\rho }_k\)). The covariance matrices are initialized from the centroids using Equation 6 with \(\gamma _{ik}=1\). Figure 3a shows the initialization of the fitting ellipsoids for our example few-layer graphene data set from Figs. 1d and 2. The ellipsoids have been scaled to a 95% confidence level.

An unsupervised machine learning algorithm, referred to as expectation-maximization (EM), is used to further optimize the ellipsoid parameters and fit the data. The expectation step determines \(\gamma _{ik}\) based on the initialized weights, centroids, and covariance matrices calculated above. The maximization step uses these probabilities to re-calculate each cluster’s weight, centroid, and covariance matrix. These two steps iterate and the ellipsoid parameters gradually converge. Figure 3b shows the algorithm results after two cycles and Fig. 3c shows the results after 30 cycles. After 30 cycles, the ellipsoids resemble the distributions of the data, with several small tight ellipsoids corresponding to the substrate and 1–4 layers of graphene, and two larger ellipsoids (purple and blue) accounting for noise. The maximum change of all clusters’ weights between maximization steps (\(\Delta \phi _k < 0.0001\)) is used to define convergence and end the algorithm. Figure 3d shows the results of the algorithm after convergence (61 cycles total) for this data set. Note that the large purple and blue ellipsoids are a product of overfitting the data (fitting 7 ellipsoids to 5 data clusters). These ellipsoids do not contribute to the master catalogue but are important for fitting data points associated with thicker layers (\(>4\)) and outliers. Additionally, the overfitting allows the primary ellipsoids to confine themselves to the cores of their data clusters. Once convergence has been reached, only ellipsoids that fit well to known layer thicknesses are added to a catalogue.
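One expectation-maximization cycle over Eqs. (2) and (4)–(6) can be sketched in numpy as follows, writing the covariance update in its outer-product form. This is a minimal sketch: initialization and the \(\Delta \phi _k\) convergence check wrap around this step in the full program.

```python
import numpy as np

def gaussian(x, mu, sigma):
    """Three-variable normal density of Eq. (3), vectorized over rows of x."""
    d = x - mu
    inv = np.linalg.inv(sigma)
    maha = np.einsum('ij,jk,ik->i', d, inv, d)
    norm = np.sqrt((2 * np.pi) ** 3 * np.linalg.det(sigma))
    return np.exp(-0.5 * maha) / norm

def em_step(x, phi, mu, sigma):
    """One expectation-maximization cycle for K ellipsoids in RGB space."""
    K = len(phi)
    # Expectation: responsibilities gamma_ik of Eq. (2).
    dens = np.stack([phi[k] * gaussian(x, mu[k], sigma[k])
                     for k in range(K)], axis=1)
    gamma = dens / dens.sum(axis=1, keepdims=True)
    # Maximization: re-estimate weights, centroids, and covariances.
    Nk = gamma.sum(axis=0)
    phi = Nk / len(x)                       # Eq. (4)
    mu = (gamma.T @ x) / Nk[:, None]        # Eq. (5)
    sigma = np.stack([                      # Eq. (6), outer-product form
        ((x - mu[k]).T * gamma[:, k]) @ (x - mu[k]) / Nk[k]
        for k in range(K)])
    return phi, mu, sigma
```

Iterating `em_step` from the DBSCAN-seeded centroids reproduces the behavior of Fig. 3: the ellipsoid parameters drift toward the data clusters and eventually stabilize.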

The training process is repeated for multiple flakes of the same material and substrate, saving their ellipsoid characteristics into the same catalogue. A master catalogue is then created by averaging together the characteristics of ellipsoids with like-thickness. This master catalogue is the tool with which we can test other images to determine their flake layer thicknesses (Fig. 1i,j).
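The testing lookup can be sketched as assigning each preprocessed pixel to the catalogue ellipsoid with the highest weighted Gaussian probability, as in Eq. (2). The `(thickness, phi, mu, sigma)` tuple layout below is a hypothetical structure for illustration, not the program's actual catalogue format.

```python
import numpy as np

def assign_thickness(x, catalogue):
    """Label each pixel (row of x) with the layer thickness of the
    catalogue ellipsoid that gives it the highest weighted probability.

    `catalogue` is a list of (thickness, phi, mu, sigma) tuples averaged
    over the training images."""
    scores = []
    for _, phi, mu, sigma in catalogue:
        d = x - mu
        inv = np.linalg.inv(sigma)
        maha = np.einsum('ij,jk,ik->i', d, inv, d)
        norm = np.sqrt((2 * np.pi) ** 3 * np.linalg.det(sigma))
        scores.append(phi * np.exp(-0.5 * maha) / norm)
    best = np.argmax(np.stack(scores, axis=1), axis=1)
    thicknesses = np.array([t for t, *_ in catalogue])
    return thicknesses[best]
```

Because each pixel is scored independently against a fixed catalogue, this step is fast compared with training, consistent with the testing times reported in the Discussion.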

Figure 3

Cluster fitting with a Gaussian mixture model (GMM). The scatter plot from Fig. 1d is superimposed with 95% confidence ellipsoids based on the fit characteristics of the GMM-EM algorithm. (a) The initialized ellipsoids show little correspondence to the underlying data. (b) After two cycles of expectation-maximization, the ellipsoids better resemble the data clusters. (c) After 30 cycles, some ellipsoids have nearly converged on their data clusters. (d) The convergence condition is reached after 61 cycles.

General application to other materials and substrates

Our script can be universally applied to the identification of other 2D material thicknesses on opaque and transparent substrates. This generality is achieved by analyzing all three dimensions of the color-space data and fitting the resulting clusters of arbitrary shape with our GMM-EM algorithm. Importantly, no change in the adjustable parameters (\(\epsilon\) or the GMM convergence criterion) is required for the following results.

Figure 4 displays the power of this generality by identifying the layer thickness of two additional materials, molybdenum disulfide (MoS\(_2\)) and molybdenum diselenide (MoSe\(_2\)), on opaque (Si/SiO\(_2\)) and transparent (polydimethylsiloxane, PDMS) substrates. MoS\(_2\) on Si/SiO\(_2\) (Fig. 4a–e) presents clusters very similar to those of graphene on Si/SiO\(_2\) but separated further in RGB space (Fig. 4b). Our testing results identify only layer thicknesses of 1 and 2 (Fig. 4d,e); further training for this material/substrate combination would improve them. From the covariance matrices we note that while all the data clusters are technically triaxial ellipsoids (none of the semi-axes are equal), the clusters for materials on Si/SiO\(_2\) are roughly prolate spheroids, with one semi-axis (blue) an order of magnitude larger than the other two (red and green).

MoS\(_2\) on PDMS (Fig. 4f–j) presents clusters again extending along the blue axis, though not as strongly as for materials on Si/SiO\(_2\). The clusters are similarly well separated in RGB space as they are for MoS\(_2\) on Si/SiO\(_2\). Testing for this set identifies monolayer, bilayer, and trilayer thicknesses (Fig. 4j). Finally, MoSe\(_2\) on PDMS presents the most spherical ellipsoids of our investigation, still slightly extended along the blue axis (Fig. 4k–o), and mono- through trilayer thicknesses are easily identified (Fig. 4o).

Discussion

Our investigation focuses on the development of a program that can be universally applied to different 2D materials and substrates. This requirement invariably adds computation time compared with other recent segmentation methods22,50. For example, the training time reported in Ref.50 for the entire program is roughly 31 h. Computation times for the training stage of our program depend on the image composition. A single-layer image can take about 10 min, but images with multilayer flakes (more clusters) can take as long as 5 h. Our program results here are from training sets of roughly 10 images, corresponding to about 10 h of computation time. However, this is a one-time cost: once the master catalogue is trained for a particular material and substrate combination, it can be used repeatedly in the testing step, which is far more efficient.

Image testing requires roughly one minute to identify the layer thicknesses of a new image. Computation time is sufficiently short for testing because image pixels are simply compared with the master catalogue. This speed would allow in-situ identification of flakes from images taken during human inspection of a substrate, where moving between images can itself take several minutes. The time may also be sufficient for an automated scanning system such as that presented in Ref.51. Improvements in computation time may be sought through further image compression or possibly by reducing the testing step to two dimensions of the three-dimensional RGB space, possibly blue and either green or red, similar to algorithms presented in Ref.23. However, the dropped color dimension would have to be identified for each particular material/substrate combination.

For each material/substrate combination investigated in this study, the pixel accuracy was determined by creating a ground truth image and comparing it, pixel by pixel, with the testing images (see Fig. S1 in the Supplemental Materials for details). Pixel accuracy was slightly better for materials on PDMS but overall, the program achieves an average accuracy of 95% for the materials and substrates investigated in this study. This pixel accuracy is comparable to that achieved in studies based on much larger training sets; Reference50 reports a pixel accuracy of 97% from a training set of 917 images. Based on these results, normalized confusion matrices for each combination were calculated, showing the individual layer accuracies as well. Finally, we note that a clear advantage of our approach is its simplicity: the program relies on well-known and proven clustering techniques and achieves relatively high pixel accuracy from small training sets.
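These two metrics can be sketched as follows; `pixel_accuracy` and `normalized_confusion` are illustrative names, not functions from the published program.

```python
import numpy as np

def pixel_accuracy(pred, truth):
    """Fraction of pixels whose predicted layer thickness matches the
    ground truth image."""
    return np.mean(np.asarray(pred) == np.asarray(truth))

def normalized_confusion(pred, truth, n_classes):
    """Row-normalized confusion matrix: entry [i, j] is the fraction of
    ground-truth class-i pixels that were predicted as class j."""
    cm = np.zeros((n_classes, n_classes))
    for t, p in zip(np.ravel(truth), np.ravel(pred)):
        cm[t, p] += 1
    rows = cm.sum(axis=1, keepdims=True)
    return np.divide(cm, rows, out=np.zeros_like(cm), where=rows > 0)
```

The diagonal of the normalized confusion matrix gives the per-layer accuracy, while `pixel_accuracy` gives the overall figure quoted above.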

Conclusion

Summarizing, we have presented a program for the automatic identification of flake thicknesses that can be universally applied to a variety of 2D materials and substrates. The algorithm analyzes data clusters in the RGB space of preprocessed optical microscopy images. It can accurately identify mono- and few-layer thicknesses with a pixel accuracy of 95%. We anticipate the program will be of use for a wide variety of materials and substrates amid continued interest in the properties and characteristics of 2D materials.

Figure 4

General thickness identification of 2D materials on opaque and transparent substrates. (a–c) An example training process for MoS\(_2\) flakes exfoliated onto Si/SiO\(_2\) substrates. (a) Raw optical image before preprocessing. (b) RGB scatter plot of the identified clusters with pixels colored according to the cluster they belong to. (c) Reconstructed image of the MoS\(_2\) flake in (a) after the training step. (d,e) Testing process for MoS\(_2\) on Si/SiO\(_2\). (d) Raw optical image of an MoS\(_2\) flake on Si/SiO\(_2\). (e) Layer identification after testing the image in (d). (f–h) An example training process for MoS\(_2\)/PDMS. (i,j) An example testing process for MoS\(_2\)/PDMS. (k–m) An example training process for MoSe\(_2\)/PDMS. (n,o) An example testing process for MoSe\(_2\)/PDMS.