Rapid and flexible segmentation of electron microscopy data using few-shot machine learning

Automatic segmentation of key microstructural features in atomic-scale electron microscope images is critical to improved understanding of structure–property relationships in many important materials and chemical systems. However, the present paradigm involves time-intensive manual analysis that is inherently biased, error-prone, and unable to accommodate the large volumes of data produced by modern instrumentation. While more automated approaches have been proposed, many are not robust to a high variety of data, and do not generalize well to diverse microstructural features and material systems. Here, we present a flexible, semi-supervised few-shot machine learning approach for segmentation of scanning transmission electron microscopy images of three oxide material systems: (1) epitaxial heterostructures of SrTiO3/Ge, (2) La0.8Sr0.2FeO3 thin films, and (3) MoO3 nanoparticles. We demonstrate that the few-shot learning method is more robust against noise, more reconfigurable, and requires less data than conventional image analysis methods. This approach can enable rapid image classification and microstructural feature mapping needed for emerging high-throughput characterization and autonomous microscope platforms.


INTRODUCTION
Material microstructures govern the functionality of many important technologies, including catalysts, energy storage devices, and emerging quantum computing architectures. Scanning transmission electron microscopy (STEM) has long served as a foundational tool to study microstructures because of its ability to simultaneously resolve structure, chemistry, and defects at atomic-scale resolution for a range of materials classes [1][2][3] . STEM has helped elucidate the nature of microstructural features ranging from complex dislocation networks to secondary phases and point defects, leading to refined structure-property models 2,4,5 . Traditionally, STEM images have been analyzed by a domain expert manually or semi-automatically, utilizing a priori knowledge of the system to identify microstructural features of interest. While this approach is suitable for measuring a limited number of microstructural features in small data volumes, it is impractical for samples possessing high density, rare, or noisy features 6,7 . Moreover, manual and semi-automatic approaches are difficult to scale to include multiple data modalities and cannot be performed at high speed, hindering our ability to perform in situ, complementary or correlative studies harnessing the full potential of modern instruments 8 . At a more fundamental level, variability in how such measurements are conducted and a lack of standardized approaches contributes to the broader issue of reproducibility in experimentation 9 . Though these limitations apply to all materials classes, they are particularly pronounced for complex oxides, whose properties are heavily influenced by even trace amounts of unwanted defects [10][11][12] . Hence, there is an urgent need to develop approaches to characterize microstructural features with greater accuracy, speed, and statistical rigor than is possible with existing methodologies.
A central challenge in quantitatively describing microscopy image data (i.e., micrographs) is the wide variety of possible microstructural features and data modalities. The same instrument that is used to examine interfaces at atomic-resolution one day may be used to examine nanoparticle morphologies or grain boundaries the next. A common goal in any study employing electron microscopy, in particular STEM, is to extract quantitative and semantically-meaningful microstructural descriptors that can be linked to underlying physical models 13,14 . For example, estimating the area fraction of a specific phase or abundance of a feature through image segmentation is an important part of understanding synthesis products and phase transformation kinetics [15][16][17][18][19] . Although several image segmentation methods exist (e.g., Otsu 20 , the watershed algorithm 21 , k-means clustering 22 ), these are often not easily generalizable to different material systems, image types, and may require significant tailored image preprocessing.
Machine learning (ML) methods, specifically convolutional neural networks (CNNs), have recently been adopted for the recognition and characterization of microstructural data across length scales [23][24][25][26] . Classification tasks have been performed to either assign a label to an entire image that represents a material or microstructure class (e.g., dendritic, equiaxed, etc.) [26][27][28][29] , or to assign a label to each pixel in the image so that they are classified into discrete categories 25,[30][31][32] . The latter classification type, categorization of pixels in an image to identify local features (e.g., line defects, phases, crystal structures), is referred to as segmentation. Still, many challenges remain in the practical application of segmentation methods, such as the large data set size required for training and the difficulty of developing methods that are generalizable to a wide variety of data. In deep-learning and computer vision approaches, like a CNN, learning models of image categories has typically required large databases of labeled training examples 33 , such as the large image data set available through the ImageNet database 34,35 . However, recent research has demonstrated effective and lightweight learning techniques that require relatively few labeled training examples for the purpose of automatic segmentation and denoising 36,37 . The ability to analyze data sets on the basis of limited training data, as often encountered in microscopy 38,39 , is an important frontier in materials and data science. Unfortunately, even one labeled example oftentimes includes tedious manual annotation of thousands or even millions of pixels. The architecture developed by Pelt and Sethian 37 , while much less complex than existing networks and requiring fewer training points, still uses eight manually annotated images of 512 × 512 × 512 cubic pixel tomographic reconstructions. 
Additionally, the classification or segmentation task is also dependent upon these labels, i.e., new labels must be constructed in order to change the classification/ segmentation output.
Motivated by the ability of humans, and especially children, to learn novel visual concepts given sufficient previous knowledge 40 , one-shot or few-shot approaches allow human-level performance with fewer and less intensively labeled images (i.e., shots) and little to no training 41,42 , but there are limited studies on such methods in the materials science domain 36 . While many characterization tools may provide just a few data points, a single electron micrograph (and potentially additional imaging/spectral channels) may encompass many microstructural features of interest. The one-shot or few-shot learning concept also has significant implications for the study of transient or unstable materials, as well as those where limited samples are available for analysis due to long lead-time experimentation (such as corrosion or neutron irradiation studies). In other cases, there exist data from previous studies that may be very limited or poorly understood, to which advanced data analysis methods could be applied 43 .
In this work, we present a rapid and flexible approach to recognition and segmentation of STEM images using few-shot machine learning. Three oxide materials systems were selected for model development (epitaxial heterostructures of SrTiO 3 (STO)/Ge, La 0.8 Sr 0.2 FeO 3 (LSFO) thin films, and MoO 3 nanoparticles) due to the range of microstructural features they possess, and their importance in semiconductor, spintronic, and catalysis applications 44,45 . While the three systems selected are all oxides imaged in the STEM-HAADF mode, these data encompass a wide range of morphologies, length scales, and microstructural features. It should also be mentioned that the features of interest in these systems and data sets (interfaces, nanoparticles) are commonly analyzed in many other material systems, such as metals and alloys. See Supplementary Note 1 for example applicability of this method to a broader data set. We demonstrate that with only 5-8 sub-images (referred to here as chips) that represent examples of a specific microstructural feature (e.g., a crystal motif or particular particle morphology), our model yields segmentation results comparable to those produced by a domain expert for all systems studied here. The successful image mapping can be attributed to the low noise sensitivity and high learning capability of few-shot machine learning in comparison to other segmentation methods (e.g., Otsu thresholding, watershed, k-means clustering, etc.). The few-shot approach rapidly identifies varying microstructural features across STEM data streams, which can inform real-time image data collection and analysis. More broadly, our findings underscore the power of image-driven machine learning to enable improved microstructural characterization for materials discovery and design.

Fig. 1 Few-shot model architecture. The raw STO/Ge image (a) is broken into several smaller chips (b) and a few user defined chips are used to represent desired segmentation classes in the support set (c). Each chip then acts as a query and is compared against a prototype (d), defined by the support set, and categorized according to the minimum Euclidean distance between the query and each prototype, yielding the segmented image (e). Scale bar = 5 nm.

RESULTS AND DISCUSSION
A deep-learning approach known as few-shot learning was developed for superpixel semantic segmentation of STEM images, i.e., the classification of superpixels in a STEM image. The premise of this few-shot learning model is to use very few labeled examples (<10) per class for the model to identify regions of an image that correspond to each class. This superpixel segmentation approach additionally allows us to leverage the repeating patterns typical in material microstructures and use sub-images for informing the few-shot network, bypassing the need for even a single fully labeled training image. The general approach to image segmentation using few-shot learning is schematically described in Fig. 1. This methodology involves breaking an input image into a grid of sub-images (referred herein as chips), model initialization, inference, and output of a segmented micrograph. The process of chipping relies on domain-specific knowledge of the materials microstructure, as indicated in the annotations in Fig. 1a. Most computer vision segmentation techniques provide pixel level classification, referred to as semantic segmentation. While technically more precise in terms of granularity, these methods can be more error prone, e.g., in noisy images or in instances of artifacts 46 . In cases where pixel to pixel variability ranges widely, a chip may better capture microstructural features of interest as a whole versus specific pixels. Additionally, recent developments in few-shot texture segmentation 47 may provide feasible routes to semantic segmentation for STEM images.

Preprocessing
To separate and measure distinct phases that have varying contrast in the STEM images, preprocessing of original image data was required. A histogram equalization (HE) technique designed to enhance local image qualities without introducing global artifacts, termed contrast limited adaptive HE (CLAHE) 48,49 , was selected for use in this work. The details of the CLAHE implementation are described in Table 1. CLAHE was first performed on original images and then the processed image was sectioned into a set of smaller sub-images, as shown in Fig. 1b. The chip size varied between 95 × 95 pixels and 32 × 32 pixels; however, all chips are resized to 256 × 256 pixels in the ResNet101 embedding module. The variable size allowed for each chip to be large enough to capture a microstructural motif and small enough to provide granularity between adjoining spatial regions, as shown in Fig. 1. The final preprocessing step is an enhancement technique 50 that marks the position and size of atomic columns using a Laplacian of Gaussians (LoG) blob detection routine 51 . This step was used on the LSFO system to enhance the extremely subtle differences between classes.
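The chipping step described above can be illustrated with a minimal sketch. It assumes non-overlapping square chips and drops partial chips at the image edges; the function name `chip_image` and the toy image are hypothetical and not taken from the authors' code. The CLAHE and LoG steps are omitted here, since in practice they would come from an image-processing library.

```python
def chip_image(image, chip_size):
    """Split a 2D image (list of rows) into non-overlapping square chips.

    Returns a dict mapping (row, col) grid positions to chips. Partial chips
    at the right and bottom edges are dropped (an assumption of this sketch).
    """
    height, width = len(image), len(image[0])
    n_rows, n_cols = height // chip_size, width // chip_size
    chips = {}
    for r in range(n_rows):
        for c in range(n_cols):
            top, left = r * chip_size, c * chip_size
            chips[(r, c)] = [row[left:left + chip_size]
                             for row in image[top:top + chip_size]]
    return chips


# Toy 6 x 8 "image" whose pixel value encodes its own (row, column) position.
image = [[10 * y + x for x in range(8)] for y in range(6)]
chips = chip_image(image, chip_size=3)
```

Keeping the grid position as the dictionary key makes it straightforward to paint each chip's predicted label back onto the full image later.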

Model architecture
The few-shot model inputs the preprocessed STEM image, typically with high resolution on the order of 3000 × 3000 pixels, that has been broken down into a series of smaller chips, x ik , typically not larger than 100 by 100 pixels. A handful of these chips are used as examples, or a support set, to define each of one or several classes. While most image applications for few-shot learning use disparate x ik to define a support set for each class (S k ) 52-56 , here S k was created by breaking the original image into a grid of smaller sub-images ( Fig. 1b). A subset of chips was labeled for each class. The set of N labeled examples for k = 1, ..., K classes makes up the support set defined by:

$$S = \{(x_1, y_1), \ldots, (x_N, y_N)\},$$

where x i is the i-th labeled chip and y i is the corresponding true class label (Fig. 1c); S k denotes the subset of S labeled with class k. A Prototypical Network 57 was selected in this work, given its lightweight design and simplicity. This model is based on the premise that each S k may be represented by a single prototype, c k . To compute c k , each x ik is passed through an embedding function f ϕ , which maps a D-dimensional image into an M-dimensional representation through learnable parameters ϕ. The transformed chip, or f ϕ (x ik ) = z ik , then creates the prototype for class k as the mean vector of the embedded support points c k , as follows:

$$c_k = \frac{1}{|S_k|} \sum_{x_{ik} \in S_k} f_\phi(x_{ik}) = \frac{1}{|S_k|} \sum_{i} z_{ik}.$$

After class prototypes are created, an untrained Prototypical Network classifies a new data point, or query q i , by first transforming the query through the embedding function and then calculating a distance, e.g., Euclidean distance, between the embedded query vector and each of the class prototype vectors (Fig. 1d). After the distances are computed, a softmax normalizes the distances into class probabilities, where the class with the highest probability becomes the label for the query 57 . The final output of the model, for each q i , is the respective class label (Fig. 1e).
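The prototype and nearest-prototype steps described above can be sketched in a few lines. This is an illustrative example only: the two-dimensional vectors stand in for embeddings that would really come from f ϕ (ResNet101 in this work), and the class names `film` and `substrate` are hypothetical.

```python
import math


def prototype(embedded_chips):
    """Mean vector of the embedded support chips z_ik for one class."""
    n = len(embedded_chips)
    return [sum(values) / n for values in zip(*embedded_chips)]


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def classify(query, prototypes):
    """Return (label, probabilities) for one embedded query chip."""
    distances = {k: euclidean(query, c) for k, c in prototypes.items()}
    # Softmax over negative distances: closer prototypes get higher
    # probability. Subtracting the minimum keeps the exponentials stable.
    d_min = min(distances.values())
    weights = {k: math.exp(-(d - d_min)) for k, d in distances.items()}
    total = sum(weights.values())
    probs = {k: w / total for k, w in weights.items()}
    return max(probs, key=probs.get), probs


# Hypothetical 2-D embeddings standing in for f_phi(x_ik).
support = {
    "film": [[0.9, 0.1], [1.1, -0.1], [1.0, 0.0]],
    "substrate": [[-1.0, 0.2], [-1.0, -0.2]],
}
prototypes = {k: prototype(z) for k, z in support.items()}
label, probs = classify([0.8, 0.05], prototypes)
```

Because the softmax is monotonic in the negative distance, the predicted label is always the class of the nearest prototype; the probabilities are useful mainly as a confidence indicator.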

Model Inference
In order to quantify phase fractions in a STEM image (which can range from nm to μm in spatial dimension) each chip is used as a query point, q i , so that the entire set of query points, Q, makes up the full image. The size of Q is set by the size of each chip relative to the size of the full image (for a fixed image size, smaller chips yield more query points), as shown in Table 2. All q i first go through the embedding function and distances to each prototype are computed using the selected distance function. The network then produces a distribution over each of the K classes by computing a softmax over the distances and assigns a class label according to the highest normalized value 57 . In the current implementation of the Prototypical Network, query chips must be assigned to one of the pre-selected class prototypes. To account for unknown features, the user is advised to add an additional support class. The model-specific implementation and parameters are given in Table 2. While the selection of model parameters is often tedious, specific model parameters in the few-shot context are generally straightforward, since it is often possible to leverage pretrained models for the embedding architecture. Here, a residual network with 101 layers, ResNet101 58 , was used as the embedding architecture. ResNet was specifically selected owing to its success in several related image recognition tasks 58 . Any number of networks may be used for encoding, including more lightweight architectures such as smaller MS-D-Nets 37 , U-Nets 59 , or highly specialized networks such as DefectSegNet 31 . ResNet is a popular model that is widely available in several programming languages with model weights available, making learned knowledge easily transferable to STEM-specific tasks. Model weights for ResNet101 are available from PyTorch 60 pytorch/vision v0.6.0, as trained on the image database ImageNet 61 .
We note that other networks optimized for microscope images may yield performance gains in training and inference, as well as permit greater batch sizes for improved regularization. Additionally, the Euclidean distance metric was used, since this metric generally performs well across a wide variety of benchmark data sets and classification tasks 57 . These pretrained models come with specified parameters and trained model weights. However, any embedding architecture may be used, especially those well-suited for segmentation 62 .
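For reference, the softmax over distances used to assign a class to each query can be written out explicitly. The form below follows the standard Prototypical Network formulation, with d denoting the chosen distance function (Euclidean here); the hat notation for the predicted label is introduced only for this illustration.

```latex
p_\phi(y = k \mid q_i) =
  \frac{\exp\bigl(-d(f_\phi(q_i), c_k)\bigr)}
       {\sum_{k'=1}^{K} \exp\bigl(-d(f_\phi(q_i), c_{k'})\bigr)},
\qquad
\hat{y}_i = \arg\max_{k} \, p_\phi(y = k \mid q_i)
          = \arg\min_{k} \, d(f_\phi(q_i), c_k).
```

The second equality shows why the classification reduces to a minimum-distance rule: the softmax does not change the ranking of the classes.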
The similarity module can be any few-shot or meta-learning architecture as well; however, Prototypical Networks are generally simple and easy to implement. Parameters not necessarily specific to the models, namely chip size and batch size, should take the size of each distinct microstructural feature into consideration in addition to computational memory capacity. A chip should generally encompass a single microstructural motif, and choosing its size may take trial and error depending on the size of the full image and the magnification. The batch size is simply the number of chips to evaluate at once. Generally, a machine with at least 16 GB of RAM and a 2.7 GHz processor can reasonably compute model predictions at a rate of about 1 chip per 0.5 s, with a batch size of 100 chips measuring 64 × 64 pixels. The compute time naturally depends on processing power in addition to the chip size and the number of parameters in the embedding module. Training a few-shot model, rather than performing simple inference as shown here, requires at least one GPU and may take several days to reach convergence given a sizeable database, such as ImageNet 61 , which contains 14 million images. This manuscript discusses only the inference case, using an untrained few-shot model to make judgments about an image.
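As a rough worked example of how chip size drives the number of query points and the inference time, the sketch below applies the quoted rate of about 1 chip per 0.5 s. It assumes non-overlapping chips with partial edge chips discarded; pairing a 64 × 64 chip with a 2048 × 2048 image is an assumption made for illustration, and the function names are hypothetical.

```python
def count_query_chips(image_height, image_width, chip_size):
    """Number of whole, non-overlapping chips that fit in the image."""
    return (image_height // chip_size) * (image_width // chip_size)


def estimate_inference_seconds(n_chips, seconds_per_chip=0.5):
    """Rough wall-clock estimate at the quoted rate of ~1 chip per 0.5 s."""
    return n_chips * seconds_per_chip


# 2048 x 2048 image with 64 x 64 chips (an illustrative pairing only).
n_chips = count_query_chips(2048, 2048, 64)
seconds = estimate_inference_seconds(n_chips)

# Halving the chip edge length quadruples the number of query points.
n_chips_small = count_query_chips(2048, 2048, 32)
```

Under these assumptions the image yields 1024 chips and roughly 512 s of inference, while 32 × 32 chips would give 4096 query points.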

Classification
The segmentation output of few-shot classification using the Prototypical architecture for three oxide systems is shown in Fig. 2. The model output is a superpixel classification, i.e., every pixel that belongs to a chip receives the same label and corresponding color, much in the same way other computer vision applications approach segmentation 63 . Here, the support set classes define the set of possible output labels. The percentage of chips belonging to each class, shown in Fig. 2 (right), can be scaled from percentages to area using pixel scale conversions for a total area estimate for each distinct micrograph.
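Converting chip labels into class percentages and physical areas is simple bookkeeping, sketched below. The label list, the chip size, and the 0.5 nm per pixel scale are invented for illustration and do not correspond to any image in this work.

```python
from collections import Counter


def class_fractions(labels):
    """Fraction of chips assigned to each class."""
    counts = Counter(labels)
    total = len(labels)
    return {k: n / total for k, n in counts.items()}


def class_areas_nm2(labels, chip_size_px, nm_per_px):
    """Total area per class from the chip edge length and pixel scale."""
    chip_area = (chip_size_px * nm_per_px) ** 2
    return {k: n * chip_area for k, n in Counter(labels).items()}


# Hypothetical labels for 8 chips; the pixel scale is made up.
labels = ["A", "A", "B", "A", "B", "B", "A", "A"]
fractions = class_fractions(labels)
areas = class_areas_nm2(labels, chip_size_px=64, nm_per_px=0.5)
```

Here class A covers 62.5% of the chips, which corresponds to 5 chips of 1024 nm² each.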
The STO/Ge system presents a particular challenge for most image analysis techniques in that the contrast varies irregularly across the whole image. The sample also contains multiple interfaces and is representative of typical thin film-substrate imaging data. The selected LSFO image shows a secondary phase in the perovskite-structured matrix, and the secondary phase appears to have a gradient from top to bottom, which drastically diminishes the very subtle differences between the two microstructures. Separation of the two interpenetrating microstructural domains is necessary to understand the synthesis process and resulting properties, such as electrical conductivity. While preprocessing can adjust for some of these irregularities, traditional threshold-based segmentation techniques such as Otsu's method 20 and watershedding 21 are not robust enough for a consistent solution, and even adaptive methods can fail against a gradient, particularly when applied generally across multiple images. Another approach to phase discovery has been successfully demonstrated based on the use of a sliding FFT 18 . While powerful, this approach requires periodicity in the image and is less amenable to lower magnification images of the kind shown in Fig. 2c. In this case, a convolution of contrast, edge, and sampling parameters makes straightforward interpretation of the FFT difficult. In addition, this approach is less well-suited to classifying amorphous materials, which may possess correlated but not periodic order that is difficult to quantify locally. The few-shot technique is not completely immune to these irregularities, as seen in Fig. 2a, where the segmented micrograph contains a handful of misclassified chips. These misclassifications reveal some sensitivity to support set selection, which is an important topic for additional study.
In general, we find that classification based on support sets containing canonical features (well-resolved, uniform) outperforms classification based on support sets containing outlier (noisy, incomplete) features. In the LSFO system, few-shot is slightly more inconsistent in identifying the (green) microstructural features, as shown in Fig. 2b. However, these issues may be corrected with a post hoc spatial smoothing. For example, a chip completely surrounded by one label within some radius could have its class probability weighted toward that label. Adjusting the chips that define the support set is another option. Despite some irregularities, the few-shot method is much more robust to noise within the full input image when compared against other segmentation techniques. Additionally, this few-shot method is easily generalizable to several different material systems, since a single support set defined by one image can be applied without adjustment to multiple images of the same type, for an unmatched time savings in the analysis of image series. This feature is particularly important in the case of large area mapping, as shown for the MoO 3 nanoparticles, where it is necessary to collect image montages to survey the wide variety of possible particle morphologies.
Here again the few-shot method successfully distinguishes several nanoparticle orientations from the carbon support background, with minimal instances of inaccurate labeling. We note the ability of the few-shot approach to accommodate the visual complexity of S 1 seen in Fig. 2c, with a range of shapes, contrast, and sizes defining this flat category. While S 1 here is defined with several more chips than the others, the model is able to reasonably perform a segmentation task impossible using contrast based methods alone. Overall, we find that the model generalizes well to different material systems containing varying microstructural features.
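The post hoc spatial smoothing mentioned above could take many forms. The sketch below shows one simple variant, assumed here for illustration: a chip is relabeled only when every neighbor within a given radius carries the same, different label. This is a hard relabeling rather than the probability weighting suggested in the text, and the function name `smooth_labels` is hypothetical.

```python
from collections import Counter


def smooth_labels(grid, radius=1):
    """Relabel a chip when all neighbors within `radius` share another label."""
    n_rows, n_cols = len(grid), len(grid[0])
    smoothed = [row[:] for row in grid]
    for r in range(n_rows):
        for c in range(n_cols):
            neighbors = [
                grid[rr][cc]
                for rr in range(max(0, r - radius), min(n_rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(n_cols, c + radius + 1))
                if (rr, cc) != (r, c)
            ]
            if not neighbors:
                continue
            label, count = Counter(neighbors).most_common(1)[0]
            # Only overwrite isolated chips that are completely surrounded.
            if count == len(neighbors) and label != grid[r][c]:
                smoothed[r][c] = label
    return smoothed


# Toy label grid with one isolated "B" chip inside an "A" region.
grid = [
    ["A", "A", "A", "B"],
    ["A", "B", "A", "B"],
    ["A", "A", "A", "B"],
]
smoothed = smooth_labels(grid)
```

Neighbors are read from the original grid rather than the partially smoothed one, so the result does not depend on scan order. A rule this strict leaves real boundaries untouched, but any smoothing of this kind risks erasing a genuine isolated feature.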

Comparison to other methods
Initially, several image analysis techniques were explored in an effort to quantify microstructural features of interest in specific micrographs, i.e., segmentation. It was immediately obvious that no single segmentation method would perform well in the absence of preprocessing steps, such as contrast adjustments, smoothing, and sharpening. Ideally, the aim of preprocessing in these analyses is to globally minimize artificial contrast textures and locally emphasize object edges, a critical noise reduction step for most segmentation routines 64 . Given that preprocessing and segmentation are often inseparable 65 , we examine comparable segmentation methods in the context of both segmentation and preprocessing together. In an effort to compare the few-shot approach with more widely-used segmentation methods, an example image from the STO/Ge system was analyzed using techniques with varying noise sensitivity and segmentation capabilities, with results shown in Fig. 3.
The simplest approach to segmentation falls under a family of thresholding techniques shown in the first row of Fig. 3. The three methods shown in the top row are designed to separate pixels in an image into two or more classes, based on an intensity threshold. The threshold in these methods is determined using information about the distribution of pixel intensities either globally (top row left) or locally using a neighborhood of pixels (top row center and right). The neighborhood methods are commonly more sensitive to noise, while Otsu's more global technique appears to separate foreground pixels (light) from background (dark) relatively well.
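To make the global thresholding idea concrete, the following is a minimal pure-Python sketch of Otsu's method, which picks the threshold that maximizes the between-class variance of the intensity histogram. It is a textbook version written for illustration, not the implementation used to produce Fig. 3, and the toy pixel values are invented.

```python
def otsu_threshold(pixels, n_levels=256):
    """Return the intensity threshold maximizing between-class variance.

    `pixels` is a flat iterable of integer intensities in [0, n_levels).
    Pixels with intensity > threshold are treated as foreground.
    """
    hist = [0] * n_levels
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of background pixels (intensity <= t)
    sum0 = 0    # intensity sum of background pixels
    for t in range(n_levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        mu0, mu1 = sum0 / w0, (sum_all - sum0) / w1
        between_var = w0 * w1 * (mu0 - mu1) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


# Toy bimodal intensities: dark background near 20, bright features near 200.
pixels = [18, 20, 22, 19, 21, 198, 200, 202, 201, 199]
threshold = otsu_threshold(pixels)
mask = [p > threshold for p in pixels]
```

Because the criterion depends only on the intensity histogram, two regions with similar brightness but different atomic structure receive the same label, which is the limitation discussed in the comparison.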
Moving beyond simple thresholding, we begin to look towards separating pixels into classes other than background and foreground. The segmentation methods shown in the middle row of Fig. 3 typically have the ability to separate intensities into multiple classes, again defined by the distribution of pixel intensities in the image. Two classes are specified for these routines in order to demonstrate the premise that, ideally, the image could be segmented according to the two distinct microstructures. These approaches also typically involve blurring filters and/or morphological operations 66 in order to remove pixels that are not a part of a larger group or shape. While shape edges are more defined in the middle row of Fig. 3 than in the top row, we note that the resulting segmentation still appears to be background/foreground and misses the distinction between microstructures. One obvious limitation of a direct implementation of these methods is that the resulting classes will always be based on intensity and not on the size or shape of the underlying microstructural features. It may be possible to layer these methods with a shape detection routine where shapes of approximately the same size may be clustered into the same class. However, we found that clustering shapes after foreground/background segmentation was not able to distinctly separate microstructural features in an unsupervised manner, i.e., without tedious manual intervention.
Rather than adding a shape clustering routine to an already segmented image, we implemented cluster based methods on either the raw image, or neighborhoods of the raw image, in the bottom row of Fig. 3. A common unsupervised K-Nearest Neighbors (KNN) clustering method is shown in the bottom row right, where we again see clustering results based on pixel intensity or background/foreground separation. Bottom row middle shows the first non-intensity based approach. An average structural similarity index measure (SSIM) is computed pairwise for 100 × 100 pixel non-overlapping neighborhoods as a measure of similarity between regions. The average SSIM for each neighborhood follows a bimodal distribution that can be grouped into two classes, as shown in bottom row center of Fig. 3. However, the cutoff in SSIM must be manually determined. Lastly, the few-shot segmentation technique described in this manuscript is shown in bottom row left, where we see perfect segmentation between the two regions of distinct microstructure. We have included few-shot as a clustering technique, since a neighborhood is compared to a prototype in a way analogous to how clustering techniques compare to a centroid.
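The neighborhood-based SSIM comparison can be sketched as follows. Note the simplification, assumed for this example: standard SSIM averages the statistic over local sliding windows, whereas this version computes it once per pair of chips. The constants follow the usual SSIM defaults (0.01 and 0.03 times the data range), and the toy chips are invented.

```python
def ssim_global(a, b, data_range=255.0):
    """Single-window SSIM between two equally sized chips (flat pixel lists).

    Standard SSIM averages this statistic over local sliding windows; using
    one window per chip is a simplification assumed for this sketch.
    """
    n = len(a)
    mu_a, mu_b = sum(a) / n, sum(b) / n
    var_a = sum((x - mu_a) ** 2 for x in a) / n
    var_b = sum((y - mu_b) ** 2 for y in b) / n
    cov = sum((x - mu_a) * (y - mu_b) for x, y in zip(a, b)) / n
    c1, c2 = (0.01 * data_range) ** 2, (0.03 * data_range) ** 2
    return ((2 * mu_a * mu_b + c1) * (2 * cov + c2)) / (
        (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2))


def mean_pairwise_ssim(chips):
    """Average SSIM of each chip against every other chip."""
    scores = []
    for i, chip in enumerate(chips):
        others = [ssim_global(chip, other)
                  for j, other in enumerate(chips) if j != i]
        scores.append(sum(others) / len(others))
    return scores


# Two chips with the same stripe pattern and one with the pattern inverted.
chips = [[10, 200, 10, 200], [12, 198, 12, 198], [200, 10, 200, 10]]
scores = mean_pairwise_ssim(chips)
```

The chip whose pattern differs from the others receives the lowest average score. Splitting such scores into two groups still requires choosing a cutoff, which is the manual step noted in the text.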
The results presented here have been assessed qualitatively, given that no standard comparison data sets exist to address the fidelity of few-shot learning in standard quantitative practice. This gap also points directly to the high cost of manually labeling data and perhaps opens an opportunity for few-shot assisted annotation. We have attempted a comparison to hand-labeled data, as shown in Supplementary Note 4, which shows overall good classification with some variation at feature boundaries, but further work is needed to more rigorously assess performance. Additionally, we have shown only one type of few-shot architecture here, though a number of other architectures and configurations may well outperform the simplicity of the Prototypical Network, including Siamese Networks 67 , Relation Networks 68 , Conditional Networks 56 , etc. However, we have found that a Relational-Conditional Network may be more sensitive to noise and less adept to out of distribution samples. (See Supplementary Note 2 for a Relational-Conditional Network result).
In summary, here we developed a flexible few-shot learning approach to STEM image segmentation that can significantly accelerate mapping and identification of phases, defects, and other microstructural features of interest in comparison to more traditional image processing methods. We studied three different materials systems (STO/Ge, LSFO, and MoO 3 ), with varying atomic-scale features and hence diversity in image data for model development. Segmented images using the few-shot learning approach show good qualitative agreement with original micrographs.
When compared to other techniques, we find that noise sensitivity and/or labeling capability remain challenges for adaptive segmentation and clustering algorithms. We note that in the present study we have not examined the robustness of the few-shot approach to image artifacts or high noise levels because of a lack of established ground truth for quantitatively benchmarking the present data sets. These topics merit additional future study. In addition, the effect of orientational disorder was not systematically considered, but recent work 69,70 has shown that rotationally invariant variational autoencoders may help address such disorder. However, analysis of a larger MoO 3 image set, shown in Supplementary Note 3, demonstrates the qualitative robustness of the approach to both feature intensity variation and rotation. The few-shot techniques explored in this manuscript provide powerful resources to combat these issues and remain flexible enough to accommodate a suite of materials. While few-shot machine learning has been increasingly successful in rapidly generalizing to new classification tasks containing only a few samples with supervised information, it is a known problem that the empirical risk minimizer can be slightly unreliable 71 , leading to uncertainty in the reliability of the model for any given support set. We can mitigate some of this uncertainty with careful selection of the support set, for instance by avoiding mistakes in the support set that would drive the model toward a non-optimal solution. Another way we can alleviate some of the uncertainty in solutions is to actually train the few-shot model with large volumes of labeled data, when available, so that new classification tasks already have the benefit of optimization, even in completely new and different material systems.
A postprocessing routine of spatial structural statistics may also provide a means for smoothing spurious segmentation predictions from an otherwise uniform material region; however, it is possible that a true signal could be smoothed out as an artifact with such approaches.
Aside from uncertainties in performance, the benefits in model generalizability also limit the amount of physically meaningful information we can extract from the model itself. One route for future exploration is the use of multiple data streams, such as spectral (electron energy loss spectroscopy (EELS) and energy-dispersive X-ray spectroscopy (EDS)), as well as diffraction (4D-STEM) in a multi-modal few-shot model. Insights in the multimodal context would help to extract more physically meaningful information and likely enhance the segmentation performance. In theory, this same model should generalize to a wide variety of STEM images and our preliminary results on other systems are indicative of this. Yet still unknown are the effects of the size and variety of the examples in a support set for a given class. Simulation studies are being designed to help answer these questions and more, including the possibility of using other models for embedding and the benefit of model training. Recent work in few-shot for texture segmentation 47 also indicates that pixel level annotation may be possible, pushing the accuracy of this approach beyond chip-level resolution. Ideally, few-shot segmentation can also help curate and annotate these large training sets, something that is otherwise extremely costly to do manually and at scale. In summary, this approach offers a potentially powerful means to standardize and automate the analysis of materials microstructures for a single image, paving the way for high-throughput characterization architectures.

Experimental materials and methods
The three experimental systems were prepared as follows. SrTiO 3 films were deposited onto Ge substrates using molecular beam epitaxy (MBE), as described elsewhere 44 . La 0.8 Sr 0.2 FeO 3 films were deposited onto SrTiO 3 (001) substrates using MBE, according to a procedure described elsewhere 45 . Cross-sectional STEM samples of the thin films were prepared using a FEI Helios NanoLab DualBeam Focused Ion Beam (FIB) microscope and a standard lift out procedure. Bulk MoO 3 particles were drop cast onto a lacey carbon grid from suspension in ethanol. High-angle annular dark field (STEM-HAADF) images of the STO/Ge were collected on a probe-corrected JEOL ARM-200CF microscope operating at 200 kV, with a convergence semi-angle of 20.6 mrad and a collection angle of 90-370 mrad. STEM-HAADF images of the LSFO and MoO 3 were collected on a probe-corrected JEOL GrandARM-300F microscope operating at 300 kV, with a convergence semi-angle of 29.7 mrad and a collection angle of 75-515 mrad. The original image data analyzed in this work varied between 3042 × 3044 pixels (for STO/Ge), 2048 × 2048 pixels (for LSFO), and 512 × 512 pixels (for MoO 3 ). Because of its beam sensitivity, the STO/Ge images shown were collected using a frame-averaging approach; a series of 10 frames were acquired with a 1024 × 1024 px sampling and 2 μs px −1 , then non-rigid aligned and upsampled 2 × using the SmartAlign plugin 72 . Tens of images were collected from each material system and a range of selected defect features were used in this study.

Computational methods
The specific implementation of the preprocessing techniques and parameters for the few-shot model are described in Tables 1 and 2, respectively. All methods were implemented using the Python programming language v.3.6 available at http://www.python.org. Each image was processed using a 16 GB RAM 2.7 GHz Intel Core i7 MacBook Pro.

DATA AVAILABILITY
The raw images, support sets, and query sets for each of our analyzed images, as well as additional classification of MoO 3 nanoparticles, are available on FigShare 73 .

CODE AVAILABILITY
The exact codebase for this work is unavailable to the public due to proprietary reasons. However, the Prototypical Network code 57 is available on GitHub (https://github.com/jakesnell/prototypical-networks).