Introduction

The symmetry properties of crystalline and molecular systems associated with the long-range periodicity of their presumably ideal ‘lattices’ serve as a cornerstone for deriving electronic, magnetic, and optical functionalities of technologically relevant materials. Experimentally, these properties are usually accessed via scattering techniques that provide information about site occupancies spatially averaged over the probing volume. The knowledge of average structural parameters underpins classical physical descriptions based on concepts of order parameters, average compositions, and symmetry-averaged thermal and phonon properties. At the same time, there is a growing realization that the effects of local structure, exemplified by disorder, can often lead to novel functionalities absent in averaged models.1,2,3,4,5 For example, one interesting scenario of disorder occurs when there is a distinction between the local symmetry associated with individual building blocks and the global symmetry imposed by the underlying lattice.1, 2 The resultant interplay between local and average symmetry opens new pathways to understand and optimize optical, magnetic, electronic, and thermal properties of certain disordered systems.2, 5

Exploring and controlling different types of disorder in both periodic and non-periodic structures is therefore crucial for applications and basic science alike. In the last decade, progress in scanning probe and transmission electron microscopies (SPM and STEM) has allowed real-space, unit-cell-scale mapping of electronic and structural orders in materials, making them well-suited tools for analyzing such disordered systems.6 For example, subfields of SPM such as non-contact atomic force microscopy and scanning tunneling microscopy are known to provide unprecedented, angstrom-resolved visual insight into the nature of chemical bonds7 and the spatial behavior of the electronic density of states on a surface,8 respectively. Such capabilities result in an ever-growing stream of high-quality (high-resolution) experimental data that requires adequate analytical methods for extracting the relevant physical and chemical information.9

Concurrent with advances in real-space experimental microscopic measurements, contemporary ab-initio modelling allows detailed study of atomic and molecular structures and of their electronic, magnetic, and optical properties.10,11,12 However, many interesting functionalities in disordered molecular and/or atomic systems are defined on length scales at which the number of possible molecular or atomic configurations (and hence the computational cost) grows exponentially. Similarly, effects of local symmetry breaking and disorder are often manifested in minute (~pm level) distortions of the bonding geometry or effective molecular shapes.13,14,15,16 This suggests the need for a pathway that integrates experiment with certain elements of theory and allows automated, highly efficient inspection and interpretation of experimental images consisting of a large number of individual atomic and/or molecular units (~\(10^2\)–\(10^3\)), extracting the full information and linking both minute deviations in local structure and large-scale assembly properties in a statistically significant manner.

Here we use an approach based on a synergy of Markov random field, convolutional neural network, and ab-initio simulations to perform a full decoding of the various orders associated with symmetries of individual building blocks (molecules) on the underlying lattice (substrate). We apply this method to explore molecular interactions in a 2D film of bowl-shaped sumanene molecules on a gold substrate, where the individual molecule at each lattice point can reside in multiple (structural and rotational) configurations. The full decoding obtained at the nanoscale level allows us to directly construct both the relevant pair density functions, a centerpiece in the analysis of the disorder-property relationship paradigm,1 and more complex structural descriptors. This in turn allows us to explore how individual blocks may form certain short-range orders, to analyze potential (spatial) correlations between multiple order parameters, and to use the obtained information for constructing a reaction path for molecular conformational changes in the self-assembly.

Results

Model experimental system

As a model system, we chose a self-assembly of bowl-shaped π-conjugated sumanene molecules17 (hereafter, buckybowls) on a gold (111) surface. Unlike most planar molecules, the buckybowls are characterized by an additional structural degree of freedom associated with their bowl-up (U) and bowl-down (D) conformations (Fig. 1a, b).18, 19 The raw experimental STM image in Fig. 1c shows a nanoscale area of the gold (111) surface covered with a self-assembled ad-layer of buckybowls (see Supplementary Note 1). The global two-dimensional fast Fourier transform (FFT) of the data from Fig. 1c reveals the presence of two hexagonal patterns (inset in Fig. 1c) that are rotated by 30° with respect to each other, with lattice constants that differ by a factor of ≈\(\sqrt 3 \). Such a reciprocal space structure indicates an alternation of the STM tunneling current at every third molecule in the self-assembly. This can be explained by the formation of the so-called 2U1D structure18 in which every third molecule appears in the bowl-down state and is associated with an increase in STM tunneling current (see Supplementary Figure 1), whereas the rest of the molecules reside in the bowl-up state. The relatively weak intensity of the FFT peaks associated with the 2U1D structure (inner hexagon in the inset of Fig. 1c) suggests only a partial formation of the 2U1D structure across the field of view. In addition to the degrees of freedom associated with U and D states, each buckybowl can reside in several azimuthal rotational states. A simple visual inspection of several randomly selected areas of the STM image (such as the one illustrated in Fig. 1d), as well as the application of more advanced statistical tools such as principal component analysis20 (Fig. 1e), suggests that the likely number of rotational classes that need to be considered for this dataset is four (see Supplementary Note 2 and Supplementary Note 4, as well as Supplementary Figure 2 and Supplementary Figure 3). While the presence of three rotational states on the (111) surface was expected from earlier studies,18 we attribute the occurrence of an additional rotational class to imperfections of the molecular film and/or the underlying substrate.
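The factor of ≈\(\sqrt 3 \) and the 30° rotation between the two FFT hexagons can be checked with a short geometric sketch. It assumes an ideal hexagonal molecular lattice with a unit lattice constant (an arbitrary, illustrative value) and a superlattice in which every third molecule is distinct; the function name `reciprocal` and the particular choice of superlattice vectors are ours, not taken from the original analysis.

```python
import math

def reciprocal(a1, a2):
    """Return 2D reciprocal vectors b1, b2 with a_i . b_j = 2*pi*delta_ij."""
    det = a1[0] * a2[1] - a1[1] * a2[0]
    b1 = (2 * math.pi * a2[1] / det, -2 * math.pi * a2[0] / det)
    b2 = (-2 * math.pi * a1[1] / det, 2 * math.pi * a1[0] / det)
    return b1, b2

# Primitive hexagonal lattice of molecules (lattice constant 1, assumed units).
a1 = (1.0, 0.0)
a2 = (0.5, math.sqrt(3) / 2)

# Superlattice containing three molecules per cell (every third one distinct):
# A1 = a1 + a2, A2 = -a1 + 2*a2.
A1 = (a1[0] + a2[0], a1[1] + a2[1])
A2 = (-a1[0] + 2 * a2[0], -a1[1] + 2 * a2[1])

b1, _ = reciprocal(a1, a2)
B1, _ = reciprocal(A1, A2)

# Ratio of reciprocal vector lengths (outer hexagon / inner hexagon).
ratio = math.hypot(*b1) / math.hypot(*B1)
# Relative rotation of the two hexagons, reduced modulo the 60 degree symmetry.
angle = math.degrees(math.atan2(B1[1], B1[0]) - math.atan2(b1[1], b1[0])) % 60.0

print(ratio, angle)
```

The superlattice reciprocal vectors are shorter by \(\sqrt 3 \) and rotated by 30°, which is why the 2U1D ordering shows up as the inner hexagon in the FFT.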

Fig. 1

Description of the system and physical priors. Schematics of a sumanene molecule on a gold substrate and b bowl-up (U) and bowl-down (D) conformational states. c Experimental STM image (raw data) of sumanene ad-layer structure on gold (111) surface. The image resolution is 910 px × 910 px. Inset shows a global fast Fourier transform of the STM image. The inner hexagon marked with yellow circles is due to the formation of the 2U1D structure in certain parts of the image. d Visual inspection of different conformational states (U and D) and rotational states from a selected area of the STM image. The D states are marked by blue circles. The U states are marked by triangles whose orientation reflects the presence of different rotational states. e Principal component analysis on the dataset containing all the 938 molecules extracted from c. The first eigenvector is the mean value, whereas the remaining five eigenvectors correspond to the largest variances in the dataset (see also Supplementary Note 4) and can be linked to the presence of different rotational classes (schematically indicated in the bottom right corner) observed e.g. in d

Due to the relative proximity of the ground state energies of the U and D conformations,18, 19 it is likely that certain perturbations will induce a transition between the two structural states, yielding a deviation from the ideal 2U1D periodic structure in the molecular film. Indeed, visual inspection of e.g. Fig. 1d shows the presence of disorder (i.e., distortion of the periodicity associated with the 2U1D structure) in both U and D structural states, as well as in the distribution of the molecules’ rotational classes. Recent experimental and theoretical studies suggest that a controllable formation of various architectures in a 2D buckybowl ad-layer is feasible through manipulation via the SPM tip19, 21 or physical adsorption of certain chemical species,18 which could potentially lead to the realization of an information storage molecular device or of systems for molecular-level mechanical transduction. Furthermore, due to the presence of multiple rotational states in addition to bowl up/down structural conformations, such a system can be viewed as an ideal playground for probing the interplay between multiple order parameters, the molecular analog of multiferroic systems.22 However, one of the critical issues in these efforts lies in being able to identify (‘read out’) all the individual building blocks in different molecular formations on a scale of hundreds to thousands of molecules. This requires tools that enable classification of non-strictly periodic structures in the STM images as well as extraction of information on the ‘internal’ structure of individual units (molecules) in an automated and reliable fashion. Unfortunately, while average image analysis methods such as the Fourier transform (FFT) and principal component analysis (PCA) of the STM data described above are useful in establishing physical priors, these methods alone are not sufficient for obtaining accurate information on the spatial distribution of the structural and rotational molecular states in such systems and on the possible spatial correlation between the corresponding order parameters. Below we demonstrate how the adoption of a Markov random field model and convolutional neural networks, aided by density functional theory (DFT) simulations of STM images, allows us to classify bowl-up/-down structural states and different rotational classes in an automated and accurate fashion (Fig. 2).

Fig. 2

Workflow overview. a Block chart diagram of the workflow for classification of U and D states, and different rotational classes. b Graphical Markov model structure used for analysis of a molecular self-assembly of buckybowls. Here a Markov network on a regular lattice acts as a prior over hidden variables (associated with the state of the molecule) in a model which is coupled to an array z of experimental observations (STM signal). c Schematics of the convolutional neural network (cNN). The 12 convolutions of size (21px × 21px) are generated by applying 12 kernels of size (5px × 5px), with sigmoid activation function, to the input image. The filters are shifted across the image with a step size of 1px. These convolutions are subsampled into 12 maps of size (7px × 7px) using the average pooling technique. The second convolution layer is formed by applying 6 kernels, with sigmoid activation function, to the input from the previous layer. At the end of the network, a fully connected layer contains four neurons corresponding to the different rotational classes

Molecular self-assembly as Markov network

A Markov random field (MRF) is a mathematical model that allows representing the long-range order of a system by defining only local interactions.23, 24 We illustrate its application using as an example the molecular system of buckybowls that can reside in different structural conformations (U and D states); later we also apply this scheme to the analysis of multiple azimuthal rotational states of buckybowls. The posterior probability distribution for the possible molecular states can be described using Bayes’ formula as:

$$P\left( {X = x{\rm{|}}Z = z} \right) = \frac{{P\left( {Z = z{\rm{|}}X = x} \right)P\left( {X = x} \right)}}{{P\left( {Z = z} \right)}}$$
(1)

where P(X = x|Z = z) describes the probability of molecule belonging to a state x given the observation z (information from the experimental image) and is proportional to likelihood P(Z = z|X = x) of the particular configuration leading to observed outcome multiplied by the prior distribution probability P(X = x) of such configuration in the absence of observation and based purely on our assumptions about the model. The P(Z = z) plays a role of a normalization constant. The priors P(x) can be represented via MRF, which makes use of an undirected graph G = (V, E), where V = {1, … n} is the vertex set associated with random variable X, and E is a set of edges joining pairs of vertices. The underlying assumption of Markov property is that state of an element is explicitly dependent only on the states of its neighbors,

$$P\left( {X(i){\rm{|all}}\,{\rm{lattice}}\,{\rm{points}}\,{\rm{except}}\,i} \right) = P\left( {X(i){\rm{|nearest}}\,{\rm{neighbors}}\,{\rm{of}}\,i} \right),$$
(2)

Importantly, the explicit Markov structure implicitly carries longer-range dependencies; hence, it directly translates into the physics of our problem. Note that these priors are directly linked to the fundamental physics of the system, namely the presence of short-range interactions in molecular assembly which are now explicitly taken into account during image analysis.

We represent our STM data on buckybowls in a form of graph in which each molecule is represented as a node (vertex), and edges are connections to each molecule’s nearest neighbors (Fig. 2b). The k-d tree method with Euclidean distance metric is used to identify up to 6 nearest neighbors for each molecule. The posterior distribution P(X = x|Z = z) of an MRF can be factorized over individual molecules such that25
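As an illustration of the graph construction, the following sketch builds a nearest-neighbor graph for molecule centers on an ideal hexagonal lattice. It is a minimal stand-in, not the original implementation: it uses a brute-force distance search instead of a k-d tree (which gives the same neighbors for small inputs), and the helper names `hex_lattice` and `neighbor_graph` as well as the distance cutoff of 1.2 lattice constants are assumptions introduced here so that molecules at the image border end up with fewer than 6 neighbors.

```python
import math

def hex_lattice(nx, ny, a=1.0):
    """Centers of an nx-by-ny patch of an ideal hexagonal lattice (hypothetical data)."""
    return [(a * (i + 0.5 * j), a * j * math.sqrt(3) / 2)
            for j in range(ny) for i in range(nx)]

def neighbor_graph(points, k=6, cutoff=1.2):
    """Map each node to up to k nearest neighbors closer than `cutoff`.

    Brute-force O(n^2) search with the Euclidean metric; a k-d tree would
    return the same neighbors more efficiently for large images.
    """
    graph = {}
    for i, p in enumerate(points):
        dists = sorted((math.dist(p, q), j)
                       for j, q in enumerate(points) if j != i)
        graph[i] = [j for d, j in dists[:k] if d <= cutoff]
    return graph

points = hex_lattice(5, 5)
graph = neighbor_graph(points)

print(len(graph[12]), len(graph[0]))  # interior molecule vs. corner molecule
```

With a pure distance cutoff the resulting graph is symmetric (if j is a neighbor of i, then i is a neighbor of j), which is convenient for message-passing inference on the graph.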

$$P\left( {x{\rm{|}}z} \right) = \frac{1}{Z} * \mathop {\prod}\limits_{\left\langle {ij} \right\rangle } {{{\rm{\Psi }}_{ij}}\left( {{x_i},{x_j}} \right)} \mathop {\prod}\limits_i {{{\rm{\Psi }}_i}\left( {{x_i},{z_i}} \right)} $$
(3)

where \({{\rm{\Psi }}_i}\left( {{x_i},{z_i}} \right)\) represents unary potential given observation z, \({{\rm{\Psi }}_{ij}}\left( {{x_i},{x_j}} \right)\) are pairwise potentials over the connected neighbors, and Z is the partition function over the posterior MRF. The potentials are defined based on our knowledge about physicochemical properties of the molecular system. For analysis of molecules conformational changes, each node in our model can reside either in U state or D state. We then assign the unary potentials \({{\rm{\Psi }}_i}\left( {{x_i},{z_i}} \right)\) over molecular states based on the proximity of a particular molecule’s intensity in the STM image to the threshold value between the states T. The simplest threshold value is the mean value of all intensities after normalization and outlier removal (Supplementary Note 3). Therefore, node probabilities are calculated as

$${{\rm{\Psi }}_i}\left( {{x_i} = 1,{z_i} = {I_i}} \right) = \frac{1}{{1 + Exp\left[ {S * \left( {T - {I_i}} \right)} \right]}}{\rm{,}}$$
(4a)
$${{\rm{\Psi }}_i}\left( {{x_i} = 2,{z_i} = {I_i}} \right) = 1 - {{\rm{\Psi }}_i}\left( {{x_i} = 1,{z_i} = {I_i}} \right)$$
(4b)

where \({I_i} \in [0,1]\) is the intensity of a given molecule i, and S is a parameter that controls the growth rate of the logistic function. This results in two logistic functions, which assign molecular intensities far from the threshold to their corresponding class with a probability close to 1 but provide more flexibility in the region around the threshold value itself. We proceed to assigning pairwise potentials \({{\rm{\Psi }}_{ij}}\left( {{x_i},{x_j}} \right)\) for our molecular system. The ideal 2U1D configuration described above is characterized by six U molecules surrounding each D molecule, such that a D molecule is never allowed to have a nearest neighbor in the same bowl conformation. As we are interested in distortions of the ideal structure, this condition is relaxed by introducing a disorder parameter p. The resulting probabilities used in our MRF model are summarized in Table 1. Finding an exact solution to the MRF model is intractable in our case, as it would require examining all \(2^n\) combinations of state assignments, where n is the number of molecules (about 1000 for the examined images). However, one can obtain a close approximate solution by using the max-product loopy belief propagation method, a message-passing algorithm for performing inference on MRF graphs, with unary and pairwise potentials as an input25, 26 (see also Supplementary Note 5). We note that by tuning the graph structure and/or the form of the potentials one can easily apply this approach to other molecular order parameters (such as lateral rotations, as we will show later in the paper) or even to different molecular architectures.
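The decoding step can be sketched on a toy cluster of seven molecules (one central molecule and its six neighbors). The unary potentials follow Eqs. (4a) and (4b); everything else is an assumption made for illustration: the values T = 0.5 and S = 20, the identification of state 0 with the brighter D state, and in particular the pairwise table, where a D–D neighbor pair is simply down-weighted by 1/p relative to all other pairs (this is a hypothetical stand-in, not the entries of Table 1).

```python
import math

def unary(intensity, T=0.5, S=20.0):
    """Eqs. (4a)-(4b): logistic potentials for the two bowl states."""
    psi1 = 1.0 / (1.0 + math.exp(S * (T - intensity)))
    return [psi1, 1.0 - psi1]

def max_product_lbp(graph, unaries, pairwise, n_iter=20):
    """Approximate MAP states by max-product loopy belief propagation."""
    n_states = len(pairwise)
    # msgs[(i, j)][s] is the message from node i to node j about state s of j.
    msgs = {(i, j): [1.0] * n_states for i in graph for j in graph[i]}
    for _ in range(n_iter):
        new = {}
        for (i, j) in msgs:
            out = []
            for xj in range(n_states):
                best = 0.0
                for xi in range(n_states):
                    val = unaries[i][xi] * pairwise[xi][xj]
                    for k in graph[i]:
                        if k != j:
                            val *= msgs[(k, i)][xi]
                    best = max(best, val)
                out.append(best)
            total = sum(out) or 1.0  # normalize to avoid numerical underflow
            new[(i, j)] = [v / total for v in out]
        msgs = new
    decoded = {}
    for i in graph:
        belief = list(unaries[i])
        for k in graph[i]:
            belief = [b * m for b, m in zip(belief, msgs[(k, i)])]
        decoded[i] = max(range(n_states), key=belief.__getitem__)
    return decoded

# Toy graph: node 0 is the center, nodes 1..6 form the surrounding ring.
graph = {0: [1, 2, 3, 4, 5, 6]}
for i in range(1, 7):
    graph[i] = [0, (i - 2) % 6 + 1, i % 6 + 1]

# Hypothetical normalized intensities; node 1 lies just above the threshold.
intensities = [0.9, 0.52, 0.1, 0.15, 0.1, 0.2, 0.1]
unaries = [unary(v) for v in intensities]

p = 7.0  # disorder parameter; state 0 = D, state 1 = U (assumed convention)
pairwise = [[1.0 / p, 1.0], [1.0, 1.0]]

decoded = max_product_lbp(graph, unaries, pairwise)
print(decoded)
```

In this toy case plain thresholding would label the ambiguous molecule (node 1) as D, whereas the pairwise penalty on D–D neighbors makes the U assignment more probable given its clearly bowl-down neighbor.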

Table 1 Assignment of pairwise potentials based on our knowledge about the molecular system

Classification of azimuthal rotations via convolutional neural network

To determine the azimuthal rotational state of each molecule in the image in an automated fashion, we employ an approach based on a convolutional neural network (cNN).27 cNN-based image analysis has been successfully used in recent years in various areas of science and engineering ranging from cancer detection to satellite imaging,28, 29 but has yet to be applied to atomically resolved and molecularly resolved imaging.30 The schematics of the cNN adopted for the present study is shown in Fig. 2c. It consists of two convolutional layers interspersed with a subsampling/pooling layer, and a fully connected layer. A convolutional layer is formed by running learnable kernels (‘filters’) of the selected size over the input image (or the image in the previous layer), whereas the subsampling layer uses the average pooling technique to reduce the size of the data. The fully connected layer at the end of the network contains as many neurons as the number of classes/states to be predicted. The learning of kernels is performed through a convolutional implementation of the backpropagation algorithm.31
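The layer sizes quoted in the caption of Fig. 2c can be cross-checked with simple shape arithmetic. The sketch below assumes unpadded (‘valid’) convolutions and non-overlapping pooling windows; under these assumptions the 21px × 21px first-layer maps imply a 25px × 25px input patch and a 3 × 3 pooling window. The kernel size of the second convolutional layer is not stated in the text, so the 5px × 5px value used here is purely hypothetical.

```python
def conv_out(size, kernel, stride=1):
    """Output side length of an unpadded convolution."""
    return (size - kernel) // stride + 1

def pool_out(size, window):
    """Output side length of non-overlapping pooling."""
    return size // window

input_size = 25                      # inferred: 25 - 5 + 1 = 21
conv1 = conv_out(input_size, 5)      # 12 maps of 21 x 21
pool1 = pool_out(conv1, 3)           # 12 maps of 7 x 7 (window of 3 inferred)
conv2 = conv_out(pool1, 5)           # 6 maps; the 5 x 5 kernel is an assumption
n_features = 6 * conv2 * conv2       # inputs to the fully connected layer
n_classes = 4                        # one output neuron per rotational class

print(conv1, pool1, conv2, n_features, n_classes)
```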

The cNN is trained on a set of synthetic STM images (25,000 samples) obtained from DFT simulations of different rotational classes (see more details in the next section). Note that information on the bowl conformational states (U and D) is inferred from the MRF analysis and is therefore not treated by the cNN (when it is, the adopted cNN scheme produces a much poorer overall accuracy in U/D state classification compared to the MRF analysis).

Generation of synthetic STM data

To ascertain the applicability and robustness of our machine learning and pattern recognition methods for general STM data, we start by constructing a synthetic dataset on a model system. We generate synthetic STM images with a Markov chain Monte Carlo sampler using inputs from DFT calculations of the electronic charge density distribution in the molecules. We work under the commonly adopted assumption that a ‘realistic’, experimental STM image can be viewed as a ‘distorted’ DFT simulation of the charge density distribution in the system (Fig. 3a).32, 33 For the molecular system under consideration, one possible type of distortion is an admixture of another azimuthal rotational state to a given structural configuration of an individual molecule. Indeed, if two (or more) states are separated by a relatively low activation barrier, such as in the case of the buckybowl’s distinct rotational states,19 the system may switch between these states during the acquisition of the STM tunneling current over the molecule. As a result, the STM image will be a dynamical average of two (or more) states.34, 35 This effect may be especially pronounced for room-temperature measurements, small tip-sample separation distances, or high setpoint current density. In addition, the blurring effect associated with the presence of the STM tip and a signal-dependent Poisson noise were incorporated in our model (Fig. 3a). Here, blurring denotes the convolution with the STM probe function, whereas the Poisson noise is associated with the tunneling statistics.
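The three distortions described above (admixture of a second rotational state, blurring by the probe, and Poisson noise) can be sketched for a single molecule image as follows. This is only an illustration under stated assumptions: the two 5 × 5 input patterns are toy placeholders rather than DFT-simulated images, the 3 × 3 kernel is a hypothetical probe function, `counts` is an assumed number of tunneling events at unit intensity, and the Markov chain Monte Carlo step that arranges molecules on the lattice is not reproduced.

```python
import math
import random

def admix(img_a, img_b, ratio):
    """Dynamical average: (1 - ratio) of state A plus ratio of state B."""
    return [[(1 - ratio) * a + ratio * b for a, b in zip(ra, rb)]
            for ra, rb in zip(img_a, img_b)]

def blur(img, kernel):
    """Apply a small symmetric normalized kernel, clamping indices at the edges."""
    h, w, r = len(img), len(img[0]), len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    acc += kernel[dy + r][dx + r] * img[yy][xx]
            out[y][x] = acc
    return out

def poisson(lam, rng):
    """Draw a Poisson variate with Knuth's method (fine for modest means)."""
    limit, k, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def add_poisson_noise(img, counts, rng):
    """Signal-dependent noise: the mean count scales with pixel intensity."""
    return [[poisson(counts * v, rng) / counts for v in row] for row in img]

# Toy stand-ins for two rotational states (vertical and horizontal bar).
state_a = [[1.0 if x == 2 else 0.1 for x in range(5)] for y in range(5)]
state_b = [[1.0 if y == 2 else 0.1 for x in range(5)] for y in range(5)]

KERNEL = [[1 / 16, 2 / 16, 1 / 16],
          [2 / 16, 4 / 16, 2 / 16],
          [1 / 16, 2 / 16, 1 / 16]]

rng = random.Random(0)
clean = blur(admix(state_a, state_b, ratio=0.2), KERNEL)
noisy = add_poisson_noise(clean, counts=50, rng=rng)
```

Varying `ratio` in such a sketch gives a family of distorted images, which is the kind of input needed to study how classification accuracy depends on the admixed proportion.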

Fig. 3

Generating and analyzing synthetic data. a Schematics for generating synthetic data for each molecule. Left: DFT-simulated STM images of bowl-up and bowl-down configurations. Right: Same images corrupted by admixing an azimuthal rotational state that is close in energy, blurring (effect of STM tip) and Poisson noise. b–d Application of the graphical Markov model to synthetic data generated by a Markov chain Monte Carlo sampler using inputs from density functional theory calculations of the electronic charge density distribution in molecules. The number of molecules in the synthetic dataset is 1225. b Intensity distribution for synthetic data with the two logistic functions overlaid (see text for details). c Real space distribution of molecules in the synthetic dataset. d MRF-based decoding of D and U states from the image in c. The D and U states are denoted by red and blue, respectively. The total error (ratio of misidentified states) was 0.33 % or about 4 (out of 1225) molecules. One of the misidentified states is denoted by a circle in the inset in d

Testing our methods on synthetic data

The MRF approach results in a remarkably accurate identification of molecular D and U states in scenarios where the distribution of the STM intensities in the synthetic data closely resembles the experimental one (note that a different rotation angle with respect to the substrate results in a variation of the STM signal intensity19). This is illustrated on the synthetic dataset described in Fig. 3b, c, for which only 4 out of a total of 1225 molecules are misidentified by our classification scheme (Fig. 3d). The overall error rate (ratio of misidentified molecules) as a function of the p-value and the intensity distribution is shown in Fig. 4. Generally, this approach vastly outperforms simple mean-value and/or median-value thresholding (Fig. 4a) and allows accurate classification of U and D states for a wide range of intensity distributions (Fig. 4b) where no estimate of the p-value is available a priori. In the case of the data described in Fig. 3b, c, for example, an increase of the p-value by a factor of 3 would result in a total error increase of less than ~1% (see Fig. 4a).

Fig. 4

Analysis of MRF error rate. a Comparison of error rates, determined as a proportion of misidentified bowl-up and bowl-down conformations, for MRF analysis with two different p values (7 and 20), median thresholding, and mean thresholding. b Error rate as a function of standard deviation of normalized STM intensity distributions and an optimization parameter (p-value). The arrow shows the value of these parameters in the synthetic data used in Fig. 3b, c

We proceed to extracting information about the azimuthal rotational state of the individual U and D units in the synthetic STM image (Fig. 5a). The dependence of classification accuracy on the hypothetical admixture ratio of a proximate rotational state (potentially located ‘in-between’ the rotational classes used in our classification scheme) and the cNN error are shown in Fig. 5b, d, respectively. Fig. 5b shows that one can obtain a reliable classification of the molecules’ rotational states even for a relatively large ratio of the selected admixed state.

Fig. 5

Convolutional neural network (cNN) decoding. a Convolutional neural network (cNN) produced decoding of rotational states. Thick-line tri-pointed stars describe the rotational classes of D states, whereas thinner lines show rotational classes of U states (both U and D states were classified via MRF prior to applying cNN). b The dependency of cNN accuracy (probability of a correct state assignment) on the admixed proportion of a different rotational state. The two rotational cases considered are shown at the top of the image c Comparison of error rate for cNN only and cNN + MRF approaches. d cNN learning error

The unique aspect of the proposed approach is that it is possible to incorporate certain physical constraints into the cNN-based analysis to obtain more accurate decoding results. Here, we incorporate the effect of steric repulsion between molecules. In this case, we can use the cNN-calculated probabilities of azimuthal rotational states as prior probability distributions for another MRF model. Consider, for example, that two nearest-neighbor molecules are highly unlikely to have the same azimuthal rotational state if they reside in the same conformational state (either U or D) in the self-assembly.18 We may therefore assign a 1% probability for each class to have a neighbor of its own class and equal 33% probabilities to have a neighbor of each of the other 3 rotational classes (see Supplementary Table 1). Total probabilities are then normalized to sum to 1. Then, similar to the earlier description, we perform decoding using loopy belief propagation in order to acquire a more accurate solution (Fig. 5c).
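The effect of this refinement can be illustrated for a single pair of neighboring molecules in the same bowl conformation. In the sketch below the 4 × 4 compatibility table uses the 1% and 33% values quoted above; the two cNN output vectors `q1` and `q2` are made-up numbers, and the exhaustive search over the 16 joint assignments is a simplification that replaces loopy belief propagation for this two-node case.

```python
from itertools import product

N_CLASSES = 4

# Pairwise compatibility for neighbors in the same bowl conformation:
# 1% for identical rotational classes, 33% for each of the other three.
compat = [[0.01 if a == b else 0.33 for b in range(N_CLASSES)]
          for a in range(N_CLASSES)]
# Normalize each row so that its entries sum to 1.
compat = [[v / sum(row) for v in row] for row in compat]

# Hypothetical cNN class probabilities for two neighboring molecules.
q1 = [0.60, 0.30, 0.05, 0.05]
q2 = [0.55, 0.35, 0.05, 0.05]

# Independent decoding: each molecule takes its most probable class.
independent = (q1.index(max(q1)), q2.index(max(q2)))

# Joint decoding: maximize q1[a] * q2[b] * compat[a][b] over all pairs.
best = max(product(range(N_CLASSES), repeat=2),
           key=lambda ab: q1[ab[0]] * q2[ab[1]] * compat[ab[0]][ab[1]])

print(independent, best)
```

Taken separately, both molecules would be assigned class 0; with the steric prior, the less certain molecule is moved to its second most probable class.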

Application to experimental data

Having verified that our algorithm is capable of working on synthetic data that mimics ‘laboratory conditions’, we move to applying it to the real experimental STM images of buckybowls on the gold (111) substrate described in Fig. 1. An FFT mask with a Hamming window is first applied to the STM image to remove a large-scale periodic contribution from the substrate. The MRF decoding of U and D states and the cNN-based decoding of azimuthal rotational states are summarized in Fig. 6a. The physical priors and classes for MRF and cNN were taken from the FFT and PCA analysis, respectively (see Fig. 1). A simple visual inspection of the results in Fig. 6c confirms the high accuracy of our method for experimental data.

Fig. 6

Application of MRF and cNN to experimental data of buckybowls on gold (111). a Decoding of bowl-up/down states (p = 7) and rotational states for the experimental image described in Fig. 1c. For the cNN analysis, an additional MRF refinement of rotational classes was performed using the same probabilities as for the synthetic data. b Histogram of STM intensities for all identified molecules. c Zoomed-in area from the red rectangle in a, where numbers denote the accuracy of state determination. d Spatial correlation analysis based on local Moran’s I statistics. The molecules are color-coded such that molecules in D and U states are presented as red and blue circles, respectively; the size of the marker is proportional to the value of local Moran’s I. e–g Pair correlation functions constructed from decoded experimental data for all molecular states e, only bowl-down states f, and one of the rotational classes g. See also Supplementary Figure 4

Once a full decoding is performed, it becomes possible to construct a pair distribution function (PDF) for the molecular states of interest. In turn, these provide further insight into the nature of (dis-)order in the molecular film, such as whether the disorder is correlated or random.1 The PDFs for all the molecular states, the bowl-down molecular conformations, and one of the rotational classes are shown in Fig. 6e–g. The molecules clearly display a well-defined long-range positional order, as evident from Fig. 6e. On the other hand, an analysis of the PDFs for different molecular bowl conformations suggests that neither the long-range 2U1D nor the perfect long-range 3U orientational order suggested previously18, 19 is realized for the given system. Indeed, the former must result in a disappearance (or a very strong suppression) of the peak at ≈11 Å in Fig. 6f, whereas for the latter the PDF of U states would closely resemble the all-molecules PDF in Fig. 6e; neither, however, was observed. Interestingly, our analysis also shows a close resemblance in the behavior of the PDFs for structural and rotational states (Fig. 6f, g, respectively) within the first several coordination ‘spheres’, implying a certain correlation between the two associated orders, that is, bowl-up/down switching is associated with the formation of a certain rotational (dis-)order in the inverted molecules.
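The argument about the nearest-neighbor peak can be reproduced on an ideal lattice. The sketch below estimates a two-dimensional PDF from point coordinates and compares all molecules with the D sublattice of a perfect 2U1D arrangement. It assumes a unit lattice constant (so the nearest-neighbor peak sits at r = 1 rather than at ≈11 Å), applies no edge correction, and uses a function name, bin width, and patch size chosen only for illustration.

```python
import math

def pair_distribution(points, r_max, dr, area):
    """Estimate g(r) on bins centred at k*dr; no edge correction is applied."""
    n = len(points)
    counts = [0] * (int(r_max / dr) + 1)
    for a in range(n):
        for b in range(a + 1, n):
            k = int(math.dist(points[a], points[b]) / dr + 0.5)
            if k < len(counts):
                counts[k] += 2  # count the pair once for each of its members
    density = n / area
    g = [0.0]  # the bin centred at r = 0 holds no pairs
    for k in range(1, len(counts)):
        shell_area = 2 * math.pi * (k * dr) * dr
        g.append(counts[k] / (n * density * shell_area))
    return g

nx = ny = 12
sites = [(i, j) for j in range(ny) for i in range(nx)]
xy = {(i, j): (i + 0.5 * j, j * math.sqrt(3) / 2) for i, j in sites}
area = nx * ny * math.sqrt(3) / 2  # one unit cell per molecule

all_points = [xy[s] for s in sites]
# Perfect 2U1D: every third molecule is bowl-down, here those with (i - j) % 3 == 0.
down_points = [xy[(i, j)] for i, j in sites if (i - j) % 3 == 0]

g_all = pair_distribution(all_points, r_max=2.5, dr=0.1, area=area)
g_down = pair_distribution(down_points, r_max=2.5, dr=0.1, area=area)

print(g_all[10] > 0, g_down[10], g_down[17] > 0)
```

For the ideal 2U1D arrangement the D–D function has no weight in the bin around r = 1 and its first peak appears near \(\sqrt 3 \), so a clear nearest-neighbor peak in the measured bowl-down PDF signals departures from perfect 2U1D order.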

We further explore the nature of disorder in the molecular self-assembly by searching for local correlations between molecular bowl inversion and the azimuthal rotation of the neighboring molecules. To obtain such an insight, we construct a spatial correlation map describing a possible interplay between these two different orders. Specifically, we adopt a method based on the calculation of the so-called Moran’s I, which can measure the spatial association between the distributions of two variables at nearby locations on the lattice.36, 37 The presence of the spatial weight matrix in the definition of Moran’s I allows us to impose constraints on the number of neighbors to be considered (see Supplementary Note 6). The results for the spatial correlation between the bowl-up/down configuration and different rotational classes for the first ‘coordination sphere’ are shown in Fig. 6d, where the size of the circles reflects the value of Moran’s I across the field of view. Generally, the map in Fig. 6d implies a spatial variation in the coupling between the two associated order parameters, which could also be sensitive to the presence of defects. The average value of Moran’s I for all molecules is 0.310, whereas the average values for the correlation of rotational classes with bowl-up and bowl-down molecular conformations are 0.246 and 0.426, respectively. This result indicates that a bowl-up-to-bowl-down flip associated with the occurrence of an ‘additional’ D molecule requires a larger change in the rotational state of the neighboring molecules (compared to a flip in the reverse direction) in order to compensate for the formation of an energetically unfavorable (‘extra’) bowl-down state.
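For readers unfamiliar with the statistic, a local bivariate Moran’s I can be sketched as follows. The definition used here is one common form (the standardized value of the first variable at a site multiplied by the neighbor-averaged standardized value of the second variable, with row-normalized weights over nearest neighbors); it is an assumption and may differ in detail from the definition in Supplementary Note 6. The six-site ring and the binary encodings of the bowl and rotational states are hypothetical.

```python
import math

def zscores(values):
    """Standardize values to zero mean and unit (population) variance."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / std for v in values]

def local_bivariate_moran(x, y, graph):
    """I_i = z_x[i] * mean of z_y over the neighbors of i."""
    zx, zy = zscores(x), zscores(y)
    result = []
    for i in range(len(x)):
        nbrs = graph[i]
        lag = sum(zy[j] for j in nbrs) / len(nbrs) if nbrs else 0.0
        result.append(zx[i] * lag)
    return result

# Ring of six sites, each connected to its two nearest neighbors.
graph = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}

bowl = [1, 1, 1, 0, 0, 0]            # 1 = bowl-down, 0 = bowl-up (toy data)
rot_clustered = [1, 1, 1, 0, 0, 0]   # rotational indicator following the bowl state
rot_alternating = [1, 0, 1, 0, 1, 0]

clustered = local_bivariate_moran(bowl, rot_clustered, graph)
alternating = local_bivariate_moran(rot_alternating, rot_alternating, graph)

mean_clustered = sum(clustered) / len(clustered)
mean_alternating = sum(alternating) / len(alternating)
print(mean_clustered, mean_alternating)
```

Positive values indicate that a site and its neighborhood deviate from the mean in the same direction, while negative values indicate an anticorrelated neighborhood; restricting `graph` to the first coordination shell plays the role of the spatial weight matrix.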

Based on the findings above, we propose a two-stage ‘reaction path’, schematically depicted in Fig. 7, that explains the different correlation values of rotational states with neighboring bowl-down and bowl-up structural conformations. Specifically, in the first stage, the creation of an ‘extra’ bowl-down state elevates the energy of the system, which is then relaxed in the second stage of the ‘reaction’ via an adjustment of the rotational states of nearby molecule(s). The latter is associated with the obtained values of Moran’s I. It is crucial to note that, unlike previous studies which only considered the bowl inversion process for an isolated single molecule,18 our analysis provides a deeper knowledge of local interaction processes that involve a lateral switching of neighboring molecules in the self-assembled layer. Observation of such an interplay between molecular rotations and structural conformations provides important clues for understanding local degrees of freedom in the molecular ad-layer, which is crucial in terms of its potential applications in multi-level molecular memory devices.

Fig. 7

Schematics of the proposed reaction path for bowl inversion and its effect on the neighboring molecules. In the first stage, the inversion of the bowl-up (green circle) to bowl-down (large red circle) state elevates the energy of the system. In the second stage, the system gets relaxed by adjusting the rotational state of nearby molecule (shown by orange circle)

Discussion

To summarize, we have developed a multi-stage pattern recognition approach which encompasses ab-initio simulations, Markov random field and convolutional neural networks for a detailed characterization of surface molecular architectures in the typical field of view (~\(10^2\)–\(10^3\) molecules) of an STM experiment.

We now comment on several potential limitations of the methods and possible ways to overcome them. First, the physical priors used as input in both MRF and cNN could in future be extracted (in addition to, or even instead of, FFT and PCA analysis) from state-of-the-art ab-initio analysis and molecular dynamics (MD) simulations, thus potentially providing more accurate decoding results. In this regard, it should be noted that low probabilities of class determination for certain molecules, if present, would suggest that some of the molecular states in the experimental system were not accounted for by theory. In such a case, one must return to the ab-initio modelling stage and reconsider the initial assumptions or adjust parameters. We envision that such a process of adjusting (putting constraints on) ab-initio or MD parameters could be automated in future, although this would require an infrastructure capable of performing DFT/MD on-the-fly. Second, it would also be interesting to apply the so-called domain-adversarial training of neural networks,38 which allows theoretically predicted classes to be altered based on the observed data. The underlying idea of this approach is that the theoretical and experimental datasets are similar yet different in such a way that traditional neural networks may not capture the correct features from the labeled data alone. Finally, we foresee that in future the choice of the value p in MRF analysis during the inference of bowl-up/down structural states, as well as during the refinement of cNN results, could in principle be optimized using a statistical distance approach.39

Regarding possible further applications of our method we note that in addition to analysis of individual static images of molecular structures, the same analysis can be applied to each individual frame in the STM “movies” of molecular motions (e.g. under external perturbation field) thus providing an invaluable input to molecular dynamics simulations. Furthermore, because our pattern recognition analysis is general in nature, it can be extended to microscopic measurements of structural, electronic, and magnetic orders, as well as their possible spatial correlations, in a variety of condensed matter systems such as, for example, skyrmion lattices.40

Data availability

All the relevant data is available from the authors upon request.