Abstract
The design and implementation of dopant-based silicon nanoscale devices rely heavily on knowing precisely the locations of phosphorus dopants in their host crystal. One potential solution combines scanning tunneling microscopy (STM) imaging with atomistic tight-binding simulations to reverse-engineer dopant coordinates. This work shows that such an approach may not be straightforwardly extended to double-dopant systems. We find that the ground (quasi-molecular) state of a pair of coupled phosphorus dopants often cannot be fully explained by the linear combination of single-dopant ground states. Although the contributions from excited single-dopant states are relatively small, they can lead to ambiguity in determining individual dopant positions from a multi-dopant STM image. To overcome that, we exploit knowledge about dopant-pair wave functions and propose a simple yet effective scheme for finding double-dopant positions based on STM images.
Introduction
Precise spatial placement of phosphorus dopants in silicon is essential for fabricating atom-scale quantum devices in silicon with potential applications in quantum computing^{1} and quantum simulation^{2}. Dopant placement is achieved via hydrogen-based lithography, allowing dopants to be located in small patches of dangling (unpassivated) bonds, formed on the silicon surface with a scanning tunneling microscope (STM), and then covered by a protective silicon overgrowth. At the current stage of technology, neither the dopant position nor the number of dopants in a patch can routinely be controlled precisely enough. Detailed knowledge of the dopant positions in a fabricated device is necessary to understand the connection between device performance and dopant arrangement when aiming at systems of dopants, coupled dopants, and dopant clusters in chains and arrays^{2,3}. As demonstrated in recent works^{4,5,6,7}, the position of a single buried P dopant close to a Si surface can be determined from the structure of its STM image^{4}. However, modeling this structure is far from trivial and often ambiguous^{8}, even for a single dopant. Moreover, STM imaging of buried dopants needs to be extended to study multi-dopant structures^{1,2}, presenting further challenges. Such STM image simulations, obtained with a machine learning approach^{5}, have been presented recently. A more fundamental understanding of imaging multi-dopants is needed.
Double dopants are the simplest multi-dopant structures. In fabricated device structures, double-dopant structures in various unintended spatial configurations are expected to occur when another specific double-dopant structure is the intended target and even when single dopants and multi-(more than two)-dopants are the intended targets. Coupled P dopant pairs buried in Si are a solid-state analog to diatomic molecules^{9}. This should provide a context for building a better understanding. However, unlike real diatomic molecules, double-dopant artificial molecules in Si will be fabricated with different separations and placement in the host crystal lattice. There will be many versions of the same double-dopant pair that must be imaged and identified. Moreover, the simple picture of a double dopant as a diatomic molecule is further complicated by the valley physics of the Si host, leading to strong oscillatory behavior of exchange integrals that couple the pair^{9}.
Here, we present the results of atomistic calculations for nearly two thousand random placements of P dopant pairs in the Si host. The quasi-molecular wave function of the double dopant is then decomposed in the basis of wave functions of the two single dopants that make up the pair. This analysis reveals that the simple picture of a double-dopant ground state understood in terms of a symmetric combination of uncoupled-dopant ground-state wave functions often fails, especially, but not only, for closely spaced dopants, due to non-negligible contributions from higher excited, single-dopant levels. This picture gets further complicated for double-dopant systems near the Si surface, as in fabricated, experimental devices. This, in turn, leads to a situation where the STM-image simulation based on combining contributions from the images of two single dopants fails to describe a double-dopant STM image. Thus, when using a simple model involving only the single-dopant ground states, the complicated quasi-molecular character of double-dopant states produces ambiguity in determining multi-dopant positions from STM images. However, when we extend the model to account for the first excited dopant states, combined with additional optimizations discussed in the text, we overcome this problem. We propose a conceptually simple and computationally efficient yet effective algorithm to determine double-dopant positions from their STM images.
Results
To understand how well the double-dopant STM image can be represented by a combination of the STM images of the two dopants, we first investigate how well the double-dopant ground-state wave function can be represented by the symmetric combination of the ground-state wave functions of the two (uncoupled) single dopants. In Fig. 1, we show the decomposition of the double-dopant ground state into orthogonalized wave-function components of single (uncoupled) P dopants as a function of the inter-dopant separation distance (see “Methods”) for 561 different, random realizations of the double dopant corresponding to different spatial placements of P dopants buried in the Si host. The double-dopant ground-state wave function \(|\Psi _0\rangle \) is strongly dominated by contributions from the ground states of the separate dopants (\(|\langle \Psi _0|\varphi _1^\prime \rangle |^2\) and \(|\langle \Psi _0|\psi _1^\prime \rangle |^2\), where \(\varphi _1^\prime \) and \(\psi _1^\prime \) are the single-dopant contributions and \(^\prime \) indicates the orthogonalized single-dopant wave functions), altogether reaching close to 100% for large inter-dopant separations. Such a result strongly supports understanding a double-dopant wave function in terms of a simple combination of two ground-state dopant orbitals of A\(_1\) symmetry (e1 points in Fig. 1).
However, for reduced inter-dopant distance, contributions from these lowest components get reduced, typically varying from 75% to 95% for distances between 6 and 10 lattice constants (l.c.) (i.e., 3.25 to 5.43 nm). In this range of inter-dopant distances, a non-negligible contribution from excited single-dopant states T\(_2\) (e2, e3, and e4) and E (e5, e6) starts to emerge, reaching over 20% in several cases.
We note that the contributions from the 2s manifold of single-dopant states (e7…e12) do not play a significant role in any of the considered cases, although these states are important for many-body properties of dopants^{10,11}, and are studied here for completeness.
For even lower inter-dopant spacing (below 3 l.c. or 1.6 nm), single-dopant ground-state contributions can be as low as 65–70%. Moreover, the decomposition of the quasi-molecular wave function into all considered (e1…e12) components reaches only about 95% of the total wave function. Therefore, even higher (3s, 4s, …) single-dopant multiplets would be necessary to decompose the double-dopant wave function into single-dopant states completely. However, the possibility of performing such an analysis was constrained by the intrinsic limits of the eigenvector solvers available to us (see the “Methods”).
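The decomposition described above can be illustrated with a short numerical sketch. The orthogonalization procedure itself is specified in the “Methods” and is not reproduced here, so the symmetric (Löwdin) orthogonalization used below is an assumption, as are the function names and the random toy vectors that stand in for tight-binding eigenvectors.

```python
import numpy as np

def lowdin_orthogonalize(basis):
    """Symmetrically orthogonalize the columns of `basis` (n_sites x n_states).

    Assumption: Loewdin orthogonalization; the paper's own procedure may differ.
    """
    overlap = basis.conj().T @ basis                  # S_ij = <b_i|b_j>
    evals, evecs = np.linalg.eigh(overlap)
    inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.conj().T   # S^(-1/2)
    return basis @ inv_sqrt

def decomposition_weights(psi, basis):
    """Return |<psi|b'_k>|^2 for each orthogonalized basis state b'_k."""
    ortho = lowdin_orthogonalize(basis)
    psi = psi / np.linalg.norm(psi)
    return np.abs(ortho.conj().T @ psi) ** 2

# Toy data: four hypothetical single-dopant states on 200 "sites".
rng = np.random.default_rng(0)
basis = rng.normal(size=(200, 4))
basis /= np.linalg.norm(basis, axis=0)
# A "double-dopant" state dominated by states 0 and 2 plus a small remainder.
psi = 0.7 * basis[:, 0] + 0.7 * basis[:, 2] + 0.05 * rng.normal(size=200)

weights = decomposition_weights(psi, basis)
print("weights per basis state:", weights)
print("fraction captured by the basis:", weights.sum())
```

A captured fraction below 1 plays the same role as the roughly 95% total reported for closely spaced dopants: part of the state lies outside the span of the chosen single-dopant states.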
To investigate the problem further, Figs. 2 and 3 show the charge densities for an inter-dopant distance of approximately 3.3 nm (6 l.c.), corresponding to the two extreme cases displayed in Fig. 1 for that dopant separation, with overlaps with the linear combination of single-dopant ground states of 80% and 90%. In the top row of each figure, the charge densities of the individual uncoupled dopants are shown. The ground state A\(_1\) (e1), T\(_2\) (e2, e3, e4), and E (e5, e6) states are shown. The T\(_2\) and E states have similar densities. The A\(_1\) states are less spread out. These figures also compare the full tight-binding (TB) calculation involving two coupled dopants (\(\Psi _{12}\)) with the model assuming a linear combination of the individual dopant ground states only (\(\Psi _{LC}=\alpha \varphi _1+\beta \psi _1\), with \(\alpha \) and \(\beta \) obtained by fitting to the full model). Notably, both cases (Figs. 2 and 3) have very similar inter-dopant spacing (3.27 vs. 3.31 nm), differing by only 0.4 Å (related to a different dopant placement within the unit cell). In Fig. 3, the simplified model produces a small residual difference between the two approaches. However, Fig. 2 shows notable differences both in 1D (middle row) and in 2D (bottom row) plots. The difference is manifested both in the charge density plots in the region between the dopants and in the (squared modulus) wavefunction difference between the two models. The susceptibility of various properties^{9} of a double-dopant system to small changes of atomic position is a known phenomenon resulting from the complicated multi-valley character of dopant wavefunctions. Here, it manifests itself in the more complicated character of the quasi-molecular double-dopant wave function. In some cases, the double-dopant wavefunction can be well described by a linear combination (\(\alpha \varphi _1+\beta \psi _1\)) of single-dopant functions.
This approximation is, however, inadequate in other cases (with nearly the same inter-dopant separation). In both figures, the difference between the charge densities of the simplified model and the full TB calculation is small at the dopant sites. However, the (square-modulus) wavefunction difference can be significant at the dopant sites, again showing the complicated valley effects on the wavefunction interference. Fig. 2 illustrates that even a relatively large (80%) contribution of single-dopant ground states to a full double-dopant wavefunction may produce a notably different outcome should other terms be neglected.
This will profoundly impact understanding double-dopant STM pictures, as discussed later. More generally, this means that higher-energy single-dopant states must be accounted for in the modeling on equal footing with single-dopant ground states. The importance of higher single-dopant levels is also apparent in many-body studies of dopant charging energies^{10,11}. The importance of excited single-dopant states is also consistent with the recent work of one of our co-authors^{12} that presented calculations of dopant pairs, including inter-orbital couplings of excited single dopants. The knowledge of coupling between various pairs of orbitals will also be essential for constructing multi-orbital models (such as a Hubbard model) for chains and arrays of multiple donors, thus going far beyond the context of STM simulations.
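The linear-combination model \(\Psi _{LC}=\alpha \varphi _1+\beta \psi _1\) compared with the full calculation in Figs. 2 and 3 can be sketched as a two-parameter least-squares fit. The fitting criterion actually used in this work is not stated in this section, so ordinary least squares on the wave-function coefficients is an assumption, and the vectors below are toy data rather than tight-binding results.

```python
import numpy as np

def fit_linear_combination(phi_1, psi_1, psi_12):
    """Fit psi_12 ~ alpha*phi_1 + beta*psi_1 by least squares.

    Returns (alpha, beta, overlap), where overlap is the normalized
    squared overlap between the fitted model and psi_12.
    """
    design = np.column_stack([phi_1, psi_1])
    coeffs, *_ = np.linalg.lstsq(design, psi_12, rcond=None)
    psi_lc = design @ coeffs
    overlap = abs(np.vdot(psi_lc, psi_12)) ** 2 / (
        np.vdot(psi_lc, psi_lc).real * np.vdot(psi_12, psi_12).real
    )
    return coeffs[0], coeffs[1], overlap

# Toy example: a "full" state that contains a small admixture of a third state.
rng = np.random.default_rng(0)
phi_1, psi_1, excited = (rng.normal(size=300) for _ in range(3))
psi_12 = 0.7 * phi_1 + 0.7 * psi_1 + 0.3 * excited

alpha, beta, overlap = fit_linear_combination(phi_1, psi_1, psi_12)
print(f"alpha={alpha:.3f}, beta={beta:.3f}, overlap={overlap:.3f}")
```

An overlap noticeably below 1, as in this toy case, is the situation illustrated by Fig. 2: the two ground states alone leave out a part of the full state.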
Near-surface dopants
The deep-buried dopant cases considered so far provide a better, more fundamental understanding of dopants in Si. However, in dopant-based quantum devices that STM can image, the dopants must be shallow-placed donors just below the surface and can be affected by surface proximity. Therefore, in Fig. 4, we repeat the analysis presented in Fig. 1, this time for 1128 random placements of dopant pairs at most 3 l.c. (1.63 nm) below the \(2\times 1\) (dimer) reconstructed and hydrogen-passivated Si surface.
As shown in Fig. 4, the surface substantially modifies the wave-function character of quasi-molecular double-dopant ground states. Here, even for the large inter-dopant distance of 5 nm considered, A\(_1\) components comprise from 75% to 95% of the double-dopant wave function, with most results grouping between 80% and 90%. Moreover, because the presence of the surface breaks the bulk symmetry, one of the T\(_2\) states (e2 in Fig. 4) makes an important contribution, reaching up to approximately 10% to 15%. (We emphasize that states of dopants close to the surface no longer possess exact bulk A\(_1\), T\(_2\), or E symmetry.) Since this decomposition is performed with a basis defined by single, shallow dopants, the proximity of the surface modifies not only single-particle contributions to dopant states but also the basis expansion (spectral composition) of double-dopant states.
STM image relevance
From the eigenstate decomposition, we know that the largest contributions to the double-dopant quasi-molecular ground-state wave function typically are the A\(_1\) terms originating from the single-dopant wave functions. Based on the above, one could expect that the STM image of a double-dopant system could be adequately reconstructed from two single-dopant A\(_1\) wave functions. However, Fig. 5 shows this may not necessarily be true. To simulate an STM image from a dopant wave function, we follow Chen’s^{13} approach, in which the STM image is effectively created as a functional of the wave function on a plane above the Si surface. STM images strongly depend on the orbital character of the STM tip^{4,8}. Specifically, the STM image is built on a plane at the tip distance from the Si top surface and is obtained by combining contributions for s, p, and d tip-orbitals (see the “Methods” section). In particular, the contribution of p and d tip orbitals can and will affect the resulting pictures significantly^{4,8}. Here, we take the tip composition with a notable contribution from p and d orbitals, as this combination has been found in our previous work^{8} to reproduce very well the experimental STM image of a single dopant.
Figure 5 shows a decomposition of the double-dopant wavefunction, this time using values of the STM-like wave-function functional near the top 2D surface, which is relevant for the image construction, instead of the 3D wave function used before. Effectively, this is a projection restricted to the 2D STM-image plane to determine the overlap relevant to the STM image. This difference is shown schematically in Fig. 6. We reiterate that, in Figs. 1 and 4, the dopant wave function has been projected in the basis of single-dopant states. This projection captures the overlap everywhere. Here, in Fig. 5, we emphasize that a two-dimensional wave-function image (i.e., before applying the absolute square in Eq. (1) from the “Methods” section) is decomposed into terms originating from the single-dopant states, capturing only the contribution from the image plane.
Comparing Figs. 4 and 5 shows that there is a broader spread in the contributions from the single-dopant ground states. This is especially noticeable for small dopant separation. The contribution can be almost as low as 50% even though the sum more completely exhausts the 2D projection. The presence of tip-orbitals with non-trivial (other than s) spatial dependence apparently further enhances the role of T\(_2\) states and higher spectral components, reaching over 20% of the STM image weight in many cases. Again, this is especially true for small dopant separations, suggesting a contribution that limits the STM resolution.
STM image ambiguity
The above findings profoundly affect STM image simulation, as shown in Fig. 7. The top row on the left part of Fig. 7 demonstrates how a double-dopant image can be created from two (a and b) separate, single-dopant ground-state (e1) wave functions to form a simulated image (c) that matches with very high (pixel-by-pixel 92.3%) accuracy (see the “Methods” section for the definition of the pixel-by-pixel comparator) the STM simulation done with a full tight-binding calculation for the double-dopant system and shown in (d). It must be emphasized here that (c) was not obtained by combining STM images (a, b) of individual dopants but by combining underlying wave functions of individual dopants to obtain the best possible fit to a full double-dopant simulation (Fig. 6). Thus, this allows for the onset of complicated interference patterns that occur in Si due to its multi-valley character.
The correct dopant positions can lead to a good match with the exact image. However, other choices for the dopant positions can give a better match. As shown in the bottom row of Fig. 7, one can find another, erroneous spatial combination of single-dopant positions (e, f) that will lead to an STM simulation (g) with even better pixel-by-pixel (94.4%) accuracy with respect to the double-dopant STM image (d). Importantly, the STM image from full TB calculations involving alternative positions bears no resemblance to (c), (d), or (g). Thus, combining single-dopant, ground-state wave functions into a double-dopant wave function may lead to ambiguity as it can produce an STM image that better matches the target picture despite the incorrect dopant positions. As shown by previous work^{5,8}, dopants occupying different spatial positions can have very different STM images. This can be seen by comparing (a), (b), (e) and (f). However, this sensitivity to lattice position is obscured in the double-dopant STM images found using the ground-state wave functions of the two individual dopants.
Spatial metrology of double dopants
Determining double-dopant positions from their STM images is a formidable task. Even assuming that calculated pictures accurately match experimental positions^{8}, the most straightforward approach, i.e., running calculations for all possible cases of dopants occupying, e.g., a 10 nm \(\times \) 10 nm \(\times \) 5 nm box, would involve approximately 25 \(\times 10^3\) atoms, and thus lead to a prohibitive number of over 6 \(\times 10^8\) full tight-binding calculations. Recently, a solution utilizing machine learning has been proposed^{5} that reduces this complexity substantially but still involves \(10^5\) full calculations at the training stage (although this number can be reduced using spatial symmetries). Dramatic efficiencies, in terms of many fewer TB calculations, can be achieved if the wave functions of the individual dopants can be used to simulate double-dopant STM images. Even greater efficiencies could be achieved for imaging multi-(i.e., greater than 2)-dopants. As we have shown, significant issues arise if only the ground-state single-dopant wave functions are used to simulate double-dopant wave functions and images. We now show that significant improvement in simulating double-dopant STM images with single-dopant wave functions can be achieved by including contributions from single-dopant excited states.
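The size of the brute-force search space quoted above follows from simple counting, as the sketch below shows. It assumes the standard Si lattice constant of 0.543 nm and 8 atoms per conventional cubic cell; the helper name is hypothetical. The figure of over 6 \(\times 10^8\) corresponds to the square of the number of sites, while counting each unordered pair once would give roughly half of that.

```python
SI_LATTICE_CONSTANT_NM = 0.543   # conventional cubic cell edge of Si
ATOMS_PER_CUBIC_CELL = 8         # diamond cubic structure

def count_lattice_sites(lx_nm, ly_nm, lz_nm):
    """Approximate number of Si lattice sites in a box of the given size."""
    cell_volume = SI_LATTICE_CONSTANT_NM ** 3
    return ATOMS_PER_CUBIC_CELL * lx_nm * ly_nm * lz_nm / cell_volume

n_sites = count_lattice_sites(10.0, 10.0, 5.0)
ordered_pairs = n_sites ** 2                     # each dopant placed independently
unordered_pairs = n_sites * (n_sites - 1) / 2    # each pair counted once

print(f"sites in box:    {n_sites:.3g}")
print(f"ordered pairs:   {ordered_pairs:.3g}")
print(f"unordered pairs: {unordered_pairs:.3g}")
```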
To study the problem further and aim for a practical, computationally efficient way to determine the double-dopant positions from their images, we use a fitting approach combining not only the single-dopant ground states (e1) but also higher (e2–e6) dopant states to obtain the best possible fit. Fig. 8 (e1) shows the efficiency of such fits measured as the success ratio, i.e., how often a fit for the correct positions also produces the STM image that best matches the double-dopant image. 665 randomly selected, shallowly placed, double-dopant systems are studied. The dopant positions are randomly selected from a uniform distribution centered at the middle of the computational box to avoid boundary effects, with inter-dopant separations varying from 3 to 12 lattice constants, i.e., from 1.6 to 6.5 nm. A threshold is used in Fig. 8 to limit the separation of double dopants, e.g., 5 corresponds to an inter-dopant distance equal to or larger than five lattice constants. As shown in Fig. 8, due to problems discussed earlier, fitting the STM image with single-dopant (e1) states is successful for only 65% of all considered cases (including small and large inter-dopant separations), and at most 82% for the largest separation shown on the plot.
Based on Fig. 5, we propose a simple, brute-force scheme in which one first aims to find a solution with single-dopant ground states (e1) only. A full tight-binding calculation verifies the result to exclude the possibility of finding a spurious solution discussed earlier in Fig. 7. At this stage, more than 64% of dopant positions are correctly resolved, consistent with earlier discussions. If no match is found with e1 states, the rest of the cases (i.e., 35.6%) are searched with another fit that now includes both e1 and e2 states (we note that e2 is non-degenerate due to the presence of the reconstructed surface). Finally, a potential result is accepted (or rejected) by a second full tight-binding calculation.
Including the e2 state in the fitting process (at the price of a moderate increase in computational complexity), i.e., fitting a total of 4 expansion coefficients, can be done straightforwardly because the nearby surface lifts the degeneracy of the T\(_2\) states. Adding the e2 state to the fitting leads to significantly increased efficiency, exceeding the 90% threshold for inter-dopant distances equal to 5 l.c. (2.7 nm) or more. It must be emphasized here that these fits are obtained without multiple tight-binding calculations performed for a double-dopant system but by combining only several dozen^{8} (precomputed) wave functions from a single-dopant calculation. The time-consuming double-dopant tight-binding calculation is thus performed only twice, and for validation only (an accuracy check after the fitting process).
Because this is important, let’s rephrase it: we start with a target STM image of a double dopant. Next, we do a fit using the two e1 states to generate possible STM images. We adjust the coefficients in the simulation to get the best fit of the simulated image to the target. We do this for multiple possible dopant positions to get the best of the best fits. This gives us a best guess for the dopant positions. Next, we check with a full TB calculation using the best-guess dopant positions. If this exact calculation for the best positions matches the target (using a pixel-by-pixel comparator), we call it a success. If the best guess for positions does not match, then we add the e2 states and repeat. We can use test cases to guide the choice of a matching criterion for success.
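The steps just described can be summarized as a short control-flow sketch. All callables passed in (`fit_score`, `full_tb_image`, `compare`) are hypothetical placeholders for the fitting routine, the full double-dopant tight-binding simulation, and the pixel-by-pixel comparator; the toy data only exercise the logic and do not represent real images.

```python
def locate_pair(target, candidates, fit_score, full_tb_image, compare,
                threshold, stages=(("e1",), ("e1", "e2"))):
    """Two-stage search: fit with e1 only, verify, then retry with e1 and e2.

    Returns (best_pair, states_used) on success, or None if both stages fail.
    At most one full tight-binding calculation is run per stage.
    """
    for states in stages:
        # Cheap step: best fit over all candidate positions for this basis.
        best = max(candidates, key=lambda pair: fit_score(pair, target, states))
        # Expensive step: one full calculation to accept or reject the guess.
        if compare(full_tb_image(best), target) >= threshold:
            return best, states
    return None

# --- Toy demonstration with made-up 4-pixel "images" ---
images = {"A": (0, 1, 0, 0), "B": (1, 1, 0, 1)}
target = images["B"]                     # the true pair is "B"
scores = {("A", ("e1",)): 0.94, ("B", ("e1",)): 0.92,
          ("A", ("e1", "e2")): 0.90, ("B", ("e1", "e2")): 0.97}
tb_calls = []

def fit_score(pair, target, states):
    return scores[(pair, states)]

def full_tb_image(pair):
    tb_calls.append(pair)                # count the expensive calculations
    return images[pair]

def compare(image, reference):
    return sum(a == b for a, b in zip(image, reference)) / len(reference)

result = locate_pair(target, ["A", "B"], fit_score, full_tb_image, compare, 0.9)
print(result, "full TB calculations:", len(tb_calls))
```

In this toy case the e1-only fit prefers the wrong pair, the verification rejects it, and the second stage with e1 and e2 recovers the correct pair after two full calculations.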
Such a scheme, despite the large search space, is computationally effective since it avoids multiple time-consuming tight-binding computations for many double-dopant pairs and should provide >90% efficiency, with only two full double-dopant tight-binding calculations (as well as clear information on whether it succeeded or not) for experimentally relevant cases of double dopants with spacings greater than 2.7 nm. Although the inclusion of higher dopant states (e3, e4, e5, and e6) can somewhat improve the accuracy (especially for closely-spaced systems), we found it comes with prohibitive computational complexity (especially when e5 and e6 are included).
The presented algorithm can be further optimized: even with the simple fit using only e1, most mismatched cases are expected to lie close to the correct positions. In fact, a significant fraction of incorrectly assigned test cases happened when only one of the dopants was misassigned (see Supporting information). This happens, on average, in over 1/2 of all missed cases and in about 80% of the cases when the dopants are separated by more than 5 l.c. When we know that one dopant has been misassigned, our algorithm can utilize this information by performing secondary (including both e1 and e2) fits starting in the vicinity of the dopant wrongly assigned in the simple fit, thus considerably speeding up the search. Our scheme is also much simpler than an alternative machine (deep) learning approach involving complicated neural networks^{5} and a resource-greedy learning process, although at the price of somewhat lower efficiency.
We also note that the range of distances studied in Fig. 8 forms the biggest challenge. For even larger distances and (effectively) decoupled dopants, the theoretical accuracy starts to approach 100%, yet with all the limitations and pitfalls occurring for single dopants.^{8}
For the smallest (\(\le \)2 l.c.) inter-dopant spacings, where the quasi-molecular wave functions are the most complicated, the search space is relatively small. It is further reduced substantially by taking into account the underlying (reconstructed) lattice symmetries^{5}, so that at most \(10^4\) full TB calculations should be needed to build a simple library of images of closely spaced dopants that could be directly compared with experiment to resolve dopant positions. Although large, such a number is still smaller than the number of learning cases (\(10^5\)) necessary at the stage of neural network learning, which must handle both large and small inter-dopant separations. Such a library approach for closely spaced dopants could work provided that problems with spatial metrology of single dopants^{8}, that is, matching theory with experiment, are resolved, and provided that factors like noise can be effectively handled.
To conclude, if an inverse approach is used to determine the positions of dopant atoms forming a double dopant, and such an approach is based on utilizing single-dopant ground-state wave functions, erroneous results can occur. This problem stems from the complicated, multi-orbital character of the double-dopant wave function. The accuracy of the solution can be increased significantly by adding e2 states to the modeling at a moderate computational cost.
Discussion
To summarize, we performed a large series of STM simulations for double phosphorus dopants in silicon using a state-of-the-art, empirical tight-binding approach with d-orbitals. Our model achieves high-quality STM simulations for single dopants, in excellent agreement with recent theoretical and experimental works. Here, we used this theory to study a statistically meaningful ensemble of 1689 double dopants placed at different sites of a host Si lattice. We aimed to understand how the double-dopant ground state, the quasi-molecular wave function, can be re-expressed in terms of single-dopant wave functions. As a result, already for the buried dopant (deep below the Si surface), we found that the double-dopant wave function has components originating from excited dopant T\(_2\) and E states. This result immediately suggests that modeling a multi-dopant system^{2} using a theory neglecting the presence of higher dopant states may be inaccurate for some cases.
This effect is even more pronounced for shallow-buried double dopants that are affected by the presence of reconstructed and passivated silicon surfaces. In this case, the double-dopant wave function has up to 20% contribution from single-dopant excited states. The surface modifies not only the individual dopant wave functions but also the expansion coefficients of the quasi-molecular ground state calculated in the basis of single-dopant states. The simplest approach, which uses single-dopant ground states to simulate the STM image of a double dopant, can fail routinely. It is possible to find incorrect solutions corresponding to erroneous positions of two dopants that match the double-dopant STM image better than the image obtained for the correct placement of two dopants. However, the image simulation can be improved, especially for distantly spaced dopants, by including the first excited (e2) dopant states in modeling, resulting in over 90% success rate for dopants separated from each other by more than 5 l.c. (2.7 nm) and in a success rate of more than 80% for all dopant separations studied. Most of the improvements in simulating STM images and extracting dopant positions can be achieved by including only the e1 and e2 single-dopant states. The computationally expensive inclusion of higher states (e3–e6) into the fitting, especially for closely spaced dopants, does not allow for reaching the 90% accuracy threshold.
Double-dopant wave functions have complicated multi-orbital character, which stems from mixing between ground and excited single-dopant states occupying the different sites of the double dopant. This is important for simulating STM images and accurately extracting dopant positions for double-dopant pairs. In other contexts, this further indicates the importance of carefully modeling inter-orbital hopping integrals, consistent with our recent work^{12}, and using such results to build accurate Fermi-Hubbard models for carrying out analog quantum simulation on dopant arrays.
Methods
Tight-binding calculations
The ground state of single and double dopants is obtained with the nearest-neighbor, empirical tight-binding method accounting for d-orbitals^{14,15,16,17} with reconstructed surface-atom positions^{18}, and with explicit surface passivation that accounts for the presence of hydrogen atoms^{19}. Here, for Si we use the sp\(^3\)d\(^5\)s* parametrization of Boykin et al.^{20}, accounting for multi-band and multi-valley couplings. The details of the sp\(^3\)d\(^5\)s* tight-binding calculations were discussed thoroughly in our earlier papers^{15,16,21,22,23}.
The computational domain is a cubic box of 30 lattice constants (approximately 16.2 nm) in each spatial direction, which is large enough for the STM simulations to converge. It uses a relatively small (0.22 million) number of atoms in the computational box. Since we found that the STM image simulation does not depend on spin-orbit interaction, we neglect the spin-orbit mixing term in the Hamiltonian, allowing us to work with a real Hamiltonian matrix with significant benefits in terms of computational efficiency and time. Thousands of separate atomistic simulations were performed on a 128 CPU-core system using the Jacobi-Davidson solver as implemented in the SLEPc/PETSc library.
Each phosphorus dopant is represented by a dynamically-screened electrostatic potential \((\varepsilon (r)\,r)^{-1}\) with central-cell correction values tuned so the energy levels of the lowest six dopant bound states match the respective experimental values. We have used the dynamic dielectric screening^{4,24,25} model of Ref.^{25} with \(\epsilon _{\infty } = 11.4 \epsilon _0\), with central-cell correction equal to \(3.755\) eV, reproducing the binding energy (\(45.585\) meV), in excellent agreement with the experimental value of \(45.58\) meV^{26}. We have also incorporated separate central-cell shifts of p and d orbital energies (\(\Delta E_\text {p} =1.195\) eV and \(\Delta E_\text {d}=1.211\) eV, respectively) to reproduce better the energies of excited dopant levels, again with excellent (within several µeV) agreement with experimental values (T\(_2=33.9\) meV, E\(=32.6\) meV). Additionally, we accounted for strain introduced by incorporating phosphorus into the silicon lattice, which causes the extension of the Si-P bond by 1.7%. The effect of strain was incorporated in the Hamiltonian by rescaling the Si-P hopping matrix elements using Harrison’s law. We note, however, that neither the screening model nor the inclusion of strain has any visually discernible effect on resulting STM images. Although this leads to a different conclusion as compared to Ref.^{27}, this should not come as a surprise since a static screening model with \(\epsilon _{\infty } = 11.4 \epsilon _0\) and without strain (with central-cell correction of \(3.689\) eV, \(\Delta E_\text {p}=1.146\) eV, and \(\Delta E_\text {d}=1.099\) eV) also provides excellent agreement with experimental dopant level energies. We emphasize that in any (static or non-static) screening model, the actual choice of \(\epsilon _{\infty }\) seems to play a crucial role.
Moreover, contrary to a model studied in Ref.^{28}, our static screening model already provides a good value of the squared magnitude of the ground-state wave function at the donor nuclear site \(\left| \psi \left( r_0\right) \right| ^2\) equal to \(0.495 \times 10^{30}\,\text {m}^{-3}\), as compared to \(0.43 \times 10^{30}\,\text {m}^{-3}\) given in the experiment^{29}. Including dynamic screening and strain provides a result even closer to the experiment and equal to \(0.466 \times 10^{30}\,\text {m}^{-3}\). Finally, we note that allowing for a separate central-cell shift of s* orbitals (with a central-cell correction of \(4.418\) eV, \(\Delta E_\text {p} = 1.858\) eV, \(\Delta E_\text {d} = 1.874\) eV, and \(\Delta E_{\text {s}*} = 1.839\) eV) allowed us to reproduce the experimental result exactly, emphasizing the need for future studies of tight-binding dopant models. However, this work does not focus on modeling hyperfine properties. For consistency with our previous work (Ref.^{8}), we use the dynamic screening model with strain but without s* optimization throughout this paper.
2D STM image simulation
For STM image simulation (as shown in Fig. 7), we have augmented the tight-binding basis^{4,23} with Slater-type orbitals (STO) to model the atomic orbitals^{30}. We modify the s* orbital exponent, resulting in excellent agreement with experimental images from Ref.^{4}. Finally, we note that an STM image is simulated by summing up contributions from the STOs associated with atoms on the silicon surface and below, with a cutoff radius of 2 nm.
A single tip orbital cannot capture all features in the experimental STM image^{4,8}. The STM image value \(I(\textbf{r})\) from a general tip orbital and the dopant-state wave function in the imaging plane \(\psi (\textbf{r})\) with contributions from s, p\(_z\) and d\(_{z^2-\frac{1}{3}r^2}\) tip orbitals, according to Chen’s approach^{13}, is directly proportional to
where the contributions from the s, p\(_z\), and d\(_{z^2-\frac{1}{3}r^2}\) orbitals are given by \(c_\text {s}^2\), \(c_\text {p}^2\), and \(c_\text {d}^2\), respectively, with \(c_\text {s}^2 + c_\text {p}^2 + c_\text {d}^2 = 1\), and z is the vertical direction perpendicular to the surface. The parameter \(\kappa \), quantifying the vacuum decay of the Slater orbitals, is assumed to have a constant value of 1.3 Å\(^{-1}\) = 0.013 pm\(^{-1}\), in agreement with the methodology presented in Ref.^{4} and our earlier work^{8}.
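A minimal numerical sketch of a tip-weighted functional of this kind follows. It uses the s, p\(_z\) and d\(_{z^2-\frac{1}{3}r^2}\) terms of Chen’s derivative rule (\(\psi \), \(\partial \psi /\partial z\), and \(\partial ^2\psi /\partial z^2 - \frac{1}{3}\kappa ^2\psi \)). Two points are assumptions rather than forms taken from this paper: the terms are combined coherently with positive amplitudes \(\sqrt{c^2}\), and they are normalized by powers of \(\kappa \). The function names are hypothetical.

```python
import math


def chen_functional(psi, dpsi_dz, d2psi_dz2, kappa, c2_s, c2_p, c2_d):
    """Tip-weighted amplitude at one point of the imaging plane.

    psi, dpsi_dz, d2psi_dz2: wave function and its first and second
        z-derivatives (real or complex numbers).
    kappa: vacuum decay constant, in the inverse length unit of the derivatives.
    c2_s, c2_p, c2_d: squared tip-orbital weights, expected to sum to one.
    """
    if abs(c2_s + c2_p + c2_d - 1.0) > 1e-9:
        raise ValueError("squared tip weights must sum to 1")
    # Positive square roots are an assumption about the relative signs.
    c_s, c_p, c_d = math.sqrt(c2_s), math.sqrt(c2_p), math.sqrt(c2_d)
    s_term = psi                                 # s tip: psi itself
    p_term = dpsi_dz / kappa                     # p_z tip: first z-derivative
    d_term = d2psi_dz2 / kappa**2 - psi / 3.0    # d tip: second z-derivative
    return c_s * s_term + c_p * p_term + c_d * d_term


def stm_intensity(psi, dpsi_dz, d2psi_dz2, kappa, c2_s, c2_p, c2_d):
    """Image value, taken as the squared modulus of the functional."""
    return abs(chen_functional(psi, dpsi_dz, d2psi_dz2, kappa, c2_s, c2_p, c2_d)) ** 2


# Toy check with psi(z) = exp(-kappa * z), evaluated at z = 0.
kappa = 1.3  # 1/Angstrom, as quoted in the text
psi0, dpsi0, d2psi0 = 1.0, -kappa, kappa**2
mixed = stm_intensity(psi0, dpsi0, d2psi0, kappa, 0.146, 0.725, 0.129)
pure_s = stm_intensity(psi0, dpsi0, d2psi0, kappa, 1.0, 0.0, 0.0)
print(mixed, pure_s)
```

For a pure s tip the sketch reduces to \(|\psi |^2\), which is a convenient sanity check before mixing in the derivative terms.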
2D charge density simulation
Based on the wave function in LCAO form, i.e., the coefficients \(c_{i\alpha }\) (where i is the index of an atom and \(\alpha \) is the index of the orbital in the sp\(^3\)d\(^5\)s* basis set), it is possible to calculate the charge corresponding to each atom as \(\sum _\alpha |c_{i\alpha }|^2\). For the strain-free systems analyzed in this paper (apart from the surface reconstruction effects), one can superimpose a regular three-dimensional grid on the diamond cubic lattice of silicon atoms^{11}. By summing the charge corresponding to the grid points along the z axis, this approach allows one to calculate a two-dimensional charge density map without using any auxiliary orbital set, as shown in Figs. 2 and 3.
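The column summation described above can be sketched as follows. The input layout (integer grid indices per atom plus a list of LCAO coefficients) and the name `charge_map_2d` are illustrative assumptions, not the data structures used in the actual calculations.

```python
from collections import defaultdict


def charge_map_2d(atoms):
    """Sum per-atom charges along z to obtain a 2D charge density map.

    atoms: iterable of ((ix, iy, iz), coeffs), where (ix, iy, iz) are integer
        indices of the atom on a regular 3D grid and coeffs are its LCAO
        coefficients (real or complex), one per orbital.
    Returns a dict mapping (ix, iy) to the charge summed over the column.
    """
    density = defaultdict(float)
    for (ix, iy, _iz), coeffs in atoms:
        # Charge on one atom: sum over orbitals of |c_{i alpha}|^2.
        density[(ix, iy)] += sum(abs(c) ** 2 for c in coeffs)
    return dict(density)


# Toy example: two atoms stacked along z and one atom in a neighboring column.
toy_atoms = [
    ((0, 0, 0), [0.6, 0.0]),
    ((0, 0, 1), [0.0, 0.8j]),
    ((1, 0, 0), [0.0, 0.0]),
]
toy_map = charge_map_2d(toy_atoms)
print(toy_map)
```

Because only tight-binding coefficients enter, the map requires no auxiliary orbital set; its resolution is fixed by the grid spacing.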
Wave function representation
Each single- or multi-dopant eigenstate can be associated with a representation. In our analysis, we will use two different representations.

1. Tight-binding coefficients of a given eigenstate. In this representation (used in Figs. 1, 2, 3, 4), each eigenstate is a vector of coefficients whose size equals the number of orbitals per atom × the number of atoms. The scalar product is defined as for regular vectors.

2. Functional of the wave function on a given surface. In this representation (used in Figs. 5 and 7), for each eigenstate, we calculate the values of the wave function and its derivatives on a regular grid on a given surface perpendicular to the z direction. The resulting representation consists of the values of the specific functional corresponding to an STM tip of mixed s, p\(_z\), and d orbitals (squared coefficients of 14.6%, 72.5%, and 12.9%, respectively), according to Chen’s approach^{13}. As a result, the STM image can be obtained by taking the squares of the representation values. The scalar product is defined as pixel-by-pixel multiplication followed by summation over the entire surface.
Both representations allow us to model the eigenstates’ charge densities on two-dimensional surfaces. In representation 1, as described in section 2D charge density simulation above, atoms of the strain-free system can be overlaid on a regular, three-dimensional grid^{11} and the charge (sum of squared TB coefficients corresponding to each atom) summed in the direction perpendicular to the surface on which we need to visualize the charge distribution (bottom rows of Figs. 2 and 3). In the second representation, as described in section 2D STM image simulation, the charge is calculated as the squared values of the STM-like functional associated with the Slater orbitals. The latter variant can be visualized on a grid of arbitrary resolution (as in Fig. 7).
Given two 2D images I(x, y) and J(x, y) corresponding to the same representation and \(L^2\)-normalized (\(\sum _{(x,y)} I(x,y)^2 = \sum _{(x,y)} J(x,y)^2 = 1\)), the difference between the two images, which serves as a measure of accuracy, can be calculated using a pixel-by-pixel least-squares comparator, i.e., \(\sum _{(x, y)} \left( I(x,y) - J(x,y) \right) ^2\).
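The comparator can be sketched in a few lines. Images are assumed to be given as nested lists of real pixel values of equal shape; the function names are hypothetical.

```python
import math


def l2_normalize(image):
    """Scale a 2D image so that the sum of its squared pixel values is one."""
    norm = math.sqrt(sum(v * v for row in image for v in row))
    if norm == 0.0:
        raise ValueError("cannot normalize an all-zero image")
    return [[v / norm for v in row] for row in image]


def image_difference(image_i, image_j):
    """Pixel-by-pixel least-squares difference of two L2-normalized images."""
    if len(image_i) != len(image_j) or any(
        len(r) != len(s) for r, s in zip(image_i, image_j)
    ):
        raise ValueError("images must have the same shape")
    a, b = l2_normalize(image_i), l2_normalize(image_j)
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# Two images differing only in overall scale, and one with no overlap.
img_a = [[1.0, 0.0], [0.0, 0.0]]
img_a_scaled = [[2.0, 0.0], [0.0, 0.0]]
img_b = [[0.0, 3.0], [0.0, 0.0]]
same = image_difference(img_a, img_a_scaled)
disjoint = image_difference(img_a, img_b)
print(same, disjoint)
```

For normalized images the comparator equals \(2 - 2\sum _{(x,y)} I(x,y)\,J(x,y)\), so for non-negative images it ranges from 0 (identical up to scale) to 2 (no overlap).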
Contribution analysis of two-dopant systems
Regardless of the representation chosen, one can compare the ground state of the two-dopant system with the eigenstates of two systems consisting of individual dopants. However, since the latter come from two separate diagonalizations (one for each dopant position), they are not pairwise orthogonal. As a result, the squared scalar products between the two-dopant ground state and the single-dopant eigenstates would not add up to one and, therefore, would not be suitable for a contribution analysis.
Therefore, an orthogonalization scheme must be used. With this in mind, the double-dopant representation \(\Psi _0\) can be approximated as a linear combination of n representations of each dopant
where \(\varphi '\) and \(\psi '\) are Gram-Schmidt orthogonalized representations corresponding to consecutive eigenstates of the first and second dopant, respectively:
where \(\varphi '_n\) and \(\psi '_n\) are renormalized after the \(n\)th step.
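One explicit form of the expansion and of the orthogonalization consistent with this description is sketched below. It assumes that the states are processed in the interleaved order \(\varphi _1, \psi _1, \varphi _2, \psi _2, \ldots \), with \(\propto \) indicating the renormalization after each step; this ordering is an assumption.

\[
\Psi = \sum _{i=1}^{n} \langle \varphi _i' | \Psi _0 \rangle \, \varphi _i' + \sum _{i=1}^{n} \langle \psi _i' | \Psi _0 \rangle \, \psi _i'
\]

\[
\varphi _n' \propto \varphi _n - \sum _{i=1}^{n-1} \left( \langle \varphi _i' | \varphi _n \rangle \, \varphi _i' + \langle \psi _i' | \varphi _n \rangle \, \psi _i' \right) , \qquad
\psi _n' \propto \psi _n - \sum _{i=1}^{n} \langle \varphi _i' | \psi _n \rangle \, \varphi _i' - \sum _{i=1}^{n-1} \langle \psi _i' | \psi _n \rangle \, \psi _i'
\]

In this form, \(\psi _n'\) is additionally orthogonalized against \(\varphi _n'\) of the same step, which is the only place where the two dopants are treated differently.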
The similarity between \(\Psi _0\) and \(\Psi \) is then calculated as \(\langle \Psi _0 | \Psi \rangle = \sum _{i=1}^n \left| \langle \Psi _0 | \varphi _i' \rangle \right| ^2 + \sum _{i=1}^n \left| \langle \Psi _0 | \psi _i' \rangle \right| ^2\), giving the quantitative contributions of \(\varphi _i'\) and \(\psi _i'\) to the two-dopant ground state \(\Psi _0\). In the same way, one can calculate the similarity between \(\Psi _0\) and any given pair of \(n\)th states \(\varphi _n\) and \(\psi _n\). The results are the same as would be obtained with any equivalent method, e.g., one based on inverting the covariance matrix formed between all eigenstates. Moreover, the apparent asymmetry between \(\varphi \) and \(\psi \) in Eq. (3) does not affect the results, as we always account for both dopants for each n.
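A minimal sketch of such a contribution analysis is given below. It assumes that every state is available as a plain coefficient vector, processes the single-dopant states in the interleaved order \(\varphi _1, \psi _1, \varphi _2, \psi _2, \ldots \), and uses the modified (sequential) variant of Gram-Schmidt, which is equivalent in exact arithmetic. The function names and the toy vectors are hypothetical.

```python
import math


def inner(a, b):
    """Scalar product <a|b> of two equal-length vectors (complex allowed)."""
    return sum(complex(x).conjugate() * y for x, y in zip(a, b))


def contributions(ground, states_first, states_second, tol=1e-12):
    """Squared projections of `ground` onto orthonormalized single-dopant states.

    Returns (label, weight) pairs in the order phi_1, psi_1, phi_2, psi_2, ...
    """
    basis, weights = [], []
    for n, (phi, psi) in enumerate(zip(states_first, states_second), start=1):
        for label, state in ((f"phi_{n}", phi), (f"psi_{n}", psi)):
            vec = [complex(v) for v in state]
            for b in basis:  # remove components along all accepted vectors
                overlap = inner(b, vec)
                vec = [v - overlap * w for v, w in zip(vec, b)]
            norm = math.sqrt(inner(vec, vec).real)
            if norm < tol:  # linearly dependent state adds nothing new
                weights.append((label, 0.0))
                continue
            vec = [v / norm for v in vec]
            basis.append(vec)
            weights.append((label, abs(inner(vec, ground)) ** 2))
    return weights


# Toy example: two normalized but non-orthogonal "single-dopant" states.
ground = [1 / math.sqrt(2), 1 / math.sqrt(2), 0.0]
phi_1 = [1.0, 0.0, 0.0]
psi_1 = [0.6, 0.8, 0.0]  # overlap with phi_1 is 0.6

naive_total = sum(abs(inner(s, ground)) ** 2 for s in (phi_1, psi_1))
weights = contributions(ground, [phi_1], [psi_1])
total = sum(w for _, w in weights)
print(naive_total, weights, total)
```

In the toy example, the naive squared overlaps add up to more than one, whereas the orthogonalized contributions add up to exactly the fraction of the ground state captured by the chosen states.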
The resulting contributions are shown for many different spatial dopant configurations in Figs. 1 and 4 (representation 1) as well as in Fig. 5 (representation 2).
Data availability
The data supporting this study’s findings are available within the article. Further requests can be made to the corresponding author.
References
1. He, Y. et al. A two-qubit gate between phosphorus donor electrons in silicon. Nature 571, 371–375 (2019).
2. Wang, X. et al. Experimental realization of an extended Fermi-Hubbard model using a 2D lattice of dopant-based quantum dots. Nat. Commun. 13, 6824. https://doi.org/10.1038/s41467-022-34220-w (2022).
3. Kiczynski, M. et al. Engineering topological states in atom-based semiconductor quantum dots. Nature 606, 694–699. https://doi.org/10.1038/s41586-022-04706-0 (2022).
4. Usman, M. et al. Spatial metrology of dopants in silicon with exact lattice site precision. Nat. Nanotechnol. 11, 763–768. https://doi.org/10.1038/nnano.2016.83 (2016).
5. Usman, M., Wong, Y. Z., Hill, C. D. & Hollenberg, L. Framework for atomic-level characterisation of quantum computer arrays by machine learning. npj Comput. Mater. 6, 19. https://doi.org/10.1038/s41524-020-0282-0 (2020).
6. Brázdová, V. et al. Exact location of dopants below the Si(001):H surface from scanning tunneling microscopy and density functional theory. Phys. Rev. B 95, 075408. https://doi.org/10.1103/PhysRevB.95.075408 (2017).
7. Sinthiptharakoon, K. et al. Investigating individual arsenic dopant atoms in silicon using low-temperature scanning tunnelling microscopy. J. Phys. Condens. Matter 26, 012001. https://doi.org/10.1088/0953-8984/26/1/012001 (2013).
8. Różański, P. T., Bryant, G. W. & Zieliński, M. Scanning tunneling microscopy of buried dopants in silicon: Images and their uncertainties. npj Comput. Mater. 8, 182. https://doi.org/10.1038/s41524-022-00857-w (2022).
9. Koiller, B., Hu, X. & Das Sarma, S. Exchange in silicon-based quantum computer architecture. Phys. Rev. Lett. 88, 027903. https://doi.org/10.1103/PhysRevLett.88.027903 (2001).
10. Tankasala, A. et al. Two-electron states of a group-V donor in silicon from atomistic full configuration interactions. Phys. Rev. B 97, 195301. https://doi.org/10.1103/PhysRevB.97.195301 (2018).
11. Różański, P. T. & Zieliński, M. Exploiting underlying crystal lattice for efficient computation of Coulomb matrix elements in multi-million atoms nanostructures. Comput. Phys. Commun. 287, 108693. https://doi.org/10.1016/j.cpc.2023.108693 (2023).
12. Gawełczyk, M. & Zieliński, M. Bardeen’s tunneling theory applied to intraorbital and interorbital hopping integrals between dopants in silicon. Phys. Rev. B 106, 115426. https://doi.org/10.1103/PhysRevB.106.115426 (2022).
13. Chen, C. J. Tunneling matrix elements in three-dimensional space: The derivative rule and the sum rule. Phys. Rev. B 42, 8841–8857. https://doi.org/10.1103/PhysRevB.42.8841 (1990).
14. Jancu, J.-M., Scholz, R., Beltram, F. & Bassani, F. Empirical spds* tight-binding calculation for cubic semiconductors: General method and material parameters. Phys. Rev. B 57, 6493–6507. https://doi.org/10.1103/PhysRevB.57.6493 (1998).
15. Zieliński, M. Including strain in atomistic tight-binding Hamiltonians: An application to self-assembled InAs/GaAs and InAs/InP quantum dots. Phys. Rev. B 86, 115424. https://doi.org/10.1103/PhysRevB.86.115424 (2012).
16. Zieliński, M. Valence band offset, strain and shape effects on confined states in self-assembled InAs/InP and InAs/GaAs quantum dots. J. Phys. Condens. Matter 25, 465301 (2013).
17. Chadi, D. J. Spin-orbit splitting in crystalline and compositionally disordered semiconductors. Phys. Rev. B 16, 790–796. https://doi.org/10.1103/PhysRevB.16.790 (1977).
18. Craig, B. I. & Smith, P. V. The structure of the Si(100)2\(\times \)1:H surface. Surf. Sci. 226, L55–L58. https://doi.org/10.1016/0039-6028(90)90144-W (1990).
19. Tan, Y. P., Povolotskyi, M., Kubis, T., Boykin, T. B. & Klimeck, G. Tight-binding analysis of Si and GaAs ultrathin bodies with subatomic wavefunction resolution. Phys. Rev. B 92, 085301. https://doi.org/10.1103/PhysRevB.92.085301 (2015).
20. Boykin, T. B., Klimeck, G. & Oyafuso, F. Valence band effective-mass expressions in the sp\(^3\)d\(^5\)s* empirical tight-binding model applied to a Si and Ge parametrization. Phys. Rev. B 69, 115201. https://doi.org/10.1103/PhysRevB.69.115201 (2004).
21. Jaskólski, W., Zieliński, M., Bryant, G. W. & Aizpurua, J. Strain effects on the electronic structure of strongly coupled self-assembled InAs/GaAs quantum dots: Tight-binding approach. Phys. Rev. B 74, 195339. https://doi.org/10.1103/PhysRevB.74.195339 (2006).
22. Zieliński, M., Korkusinski, M. & Hawrylak, P. Atomistic tight-binding theory of multiexciton complexes in a self-assembled InAs quantum dot. Phys. Rev. B 81, 085301. https://doi.org/10.1103/PhysRevB.81.085301 (2010).
23. Różański, P. T. & Zieliński, M. Linear scaling approach for atomistic calculation of excitonic properties of 10-million-atom nanostructures. Phys. Rev. B 94, 045440. https://doi.org/10.1103/PhysRevB.94.045440 (2016).
24. Nara, H. Screened impurity potential in Si. J. Phys. Soc. Jpn. 20, 778–784. https://doi.org/10.1143/JPSJ.20.778 (1965).
25. Pantelides, S. T. & Sah, C. T. Theory of localized states in semiconductors. I. New results using an old method. Phys. Rev. B 10, 621–637. https://doi.org/10.1103/PhysRevB.10.621 (1974).
26. Ramdas, A. K. & Rodriguez, S. Spectroscopy of the solid-state analogues of the hydrogen atom: Donors and acceptors in semiconductors. Rep. Prog. Phys. 44, 1297–1387. https://doi.org/10.1088/0034-4885/44/12/002 (1981).
27. Usman, M., Voisin, B., Salfi, J., Rogge, S. & Hollenberg, L. Towards visualisation of central-cell-effects in scanning tunnelling microscope images of subsurface dopant qubits in silicon. Nanoscale 9, 17013–17019. https://doi.org/10.1039/C7NR05081J (2017).
28. Usman, M. et al. Donor hyperfine Stark shift and the role of central-cell corrections in tight-binding theory. J. Phys. Condens. Matter 27, 154207. https://doi.org/10.1088/0953-8984/27/15/154207 (2015).
29. Feher, G. Electron spin resonance experiments on donors in silicon. I. Electronic structure of donors by the electron nuclear double resonance technique. Phys. Rev. 114, 1219–1244. https://doi.org/10.1103/PhysRev.114.1219 (1959).
30. Slater, J. C. Atomic shielding constants. Phys. Rev. 36, 57–64. https://doi.org/10.1103/PhysRev.36.57 (1930).
Acknowledgements
P.R. and M.Z. acknowledge support from the Polish National Science Centre based on Decision No. 2015/18/E/ST3/00583.
Author information
Contributions
M.Z. supervised and acquired funding for the project and performed initial calculations. P.R. implemented the method for data analysis and performed the calculations, and M.Z. developed the brute-force method. P.R. and M.Z. prepared the figures. P.R., G.W.B., and M.Z. wrote the manuscript and participated in the discussion.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Różański, P.T., Bryant, G.W. & Zieliński, M. Challenges to extracting spatial information about double P dopants in Si from STM images. Sci Rep 14, 18062 (2024). https://doi.org/10.1038/s41598-024-67903-z
DOI: https://doi.org/10.1038/s41598-024-67903-z