Certification of quantum states with hidden structure of their bitstrings

The rapid development of quantum computing technologies has already made it possible to manipulate a collective state of several dozen qubits, which creates a strong demand for efficient methods for characterization and verification of large-scale quantum states. Here, we propose a numerically cheap procedure to distinguish quantum states, which is based on a limited number of projective measurements in at least two different bases and on computing inter-scale dissimilarities of the resulting bit-string patterns via coarse-graining. The information one obtains through this procedure can be viewed as a 'hash function' of a quantum state: a simple set of numbers which is specific to a concrete wave function and can be used for certification. We show that it is enough to characterize quantum states with different structure of entanglement, including the chaotic quantum states. Our approach can also be employed to detect phase transitions in quantum magnetic systems.


INTRODUCTION
Theoretical description of objects invisible to the human eye represents one of the most challenging and, at the same time, most intriguing problems in physics throughout its history. For example, despite incessant improvement of optical instruments and the ability to look into more and more distant corners of the Universe, in many cases one can infer the existence of a planet only indirectly, by analyzing its tiny influence on the orbits of neighboring visible planets [1] and stellar brightness [2].
In the opposite limit of the atomic scale, the situation is even more complicated. When the object of our principal interest is a many-body quantum state (a wave function or density matrix), we have to infer its existence and properties indirectly on the basis of measurements. Moreover, in contrast to observation of celestial objects whose collective motion could be completely described with the laws of classical mechanics, a measurement in the quantum world does not provide complete information about a system due to the uncertainty principle [3], and characterizing quantum matter from such limited probes represents a non-trivial methodological and technical problem.
The conventional technique to analyze the quantum state of a multi-component physical system is quantum tomography, which is based on the idea of complete [4] or partial [5] reconstruction of the wave function or density matrix from a number of measurements. Complexity of the tomographic procedure is mainly related to the number of qubits involved and the complexity of the quantum state itself, about which one might or might not have some prior expectations. In many cases, it could be non-trivial to choose a set of observables which is tomographically complete (or sufficient for partial reconstruction) and, at the same time, experimentally accessible [5].
The main fundamental limitation of quantum tomography is that one needs to store and manipulate the to-be-reconstructed quantum state on a classical computer, which makes characterization of systems that comprise more than a few dozen qubits unfeasible. Taking into account that quantum states of 53 qubits can already be generated on modern quantum devices [6], and a significant increase of this number is expected in the coming years, seeking an approach that overcomes this limitation appears to be a problem of high importance.
A natural way to reduce the memory required for state reconstruction is to store it in the implicit form of a compact variational ansatz. One of the most promising approaches of this kind is the recently proposed neural-network version of quantum tomography [7,8], which represents the wave function as a Neural Quantum State [9] and reconstructs it via the learning procedure. While this approach has many benefits, such as the very high expressibility of neural-network ansätze [10,11], it does not resolve all the problems of quantum tomography. Some quantum states, such as those defined by wave functions with random or uniform distributions of amplitudes over the Hilbert space basis, require an exponentially large number of measurements (of the order of the Hilbert space dimension) for reconstruction. The situation cannot be improved by employing neural networks, since there are no features that the neural network can detect in the measured data, learn and generalize [7].
Thus, it is important to find a way to bypass the resource-consuming routine of conventional quantum tomography, at least in certain contexts. A typical problem where there is a chance to avoid this procedure is certification of a state prepared on a quantum information processing device. In this case, there are strong prior expectations of what this state should be. Thus, instead of its complete reconstruction, one could hope to read out a simple signature serving as a fingerprint of the many-body state (in a spirit similar to hash functions in computer science [12,13]) to make sure that the state is, with high probability, indeed the correct one (see ref. 14 for the usage of hash functions in quantum tomography).
In this paper, we introduce such a signature that can be constructed by means of a reasonable number of simple von Neumann measurements of the quantum state and does not require computing correlation functions. Conceptually, this can be viewed as going along the line of the very recent approach of classical shadow tomography [15,16], though the signature we employ is different. To accomplish that, we heavily rely on the concept of multi-scale structural complexity of classical patterns that has been recently defined by some of the authors of this paper [17]. To avoid possible terminological confusion with the well-established notion of quantum complexity, here we call it dissimilarity (since it is based on counting how much different spatial scales of an object differ from each other). The detailed description of the protocol is given in the Methods section, and here we outline the main idea.
Assume we have access to a many-body quantum state. To do benchmark tests, in this paper we use both numerical wave functions (e.g., resulting from exact diagonalization) and physical quantum states generated on the IBM quantum simulator [18]. With no loss of generality, we will be considering spin-1/2 systems. A single-shot projective measurement of such a state results in a string of bits of length N, i.e., the measured spin projections on a chosen direction, $|S_i\rangle = |0110\ldots010\rangle$ (0 for spin-down and 1 for spin-up), where N is the number of qubits. Performing the measurement many times (denote this number by $N_{\mathrm{shots}}$) and collecting the outcomes in a string, we obtain a bit-string array of length $L = N \times N_{\mathrm{shots}}$. This array can then be viewed as a one-dimensional pattern, and its inter-scale dissimilarity can be computed. For that, we do several steps of coarse-graining (we label the steps with index k) and for each pair of subsequent scales compute how distinct the corresponding coarse-grained strings are. The distinction is assumed to be large if the overlap of arrays at two subsequent scales is small. For two neighboring scales, we call these measures partial dissimilarities, $\mathcal{D}_k$, and their sum over all scales, $\mathcal{D} = \sum_k \mathcal{D}_k$, gives the total inter-scale dissimilarity. Different schemes of coarse-graining can be employed, and here we resort to the simplest option: we fix a filter of width Λ (usually Λ = 2), and at step k we substitute all the pixels within a window of size $\Lambda^k$ with the average value of pixels in this window at the previous step, Fig. 1. Despite the probabilistic nature of the measurements, in all the tested cases dissimilarity turns out to be a statistically robust signature of the state.
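The coarse-graining flow outlined above can be illustrated with a short NumPy sketch. It is not the implementation used in this work: the precise definition of the partial dissimilarities is given in the Methods section, and the overlap-based expression used here, $\mathcal{D}_k = |O_{k+1,k} - \frac{1}{2}(O_{k,k} + O_{k+1,k+1})|$ with $O_{a,b}$ the mean element-wise product of scales a and b, is an assumption modeled on the structural complexity of ref. 17, so the overall normalization may differ. The function names and the toy GHZ-like input are hypothetical.

```python
import numpy as np

def coarse_grained(bits, window):
    """Replace every aligned block of `window` entries by the block average."""
    blocks = np.asarray(bits, dtype=float).reshape(-1, window)
    return np.repeat(blocks.mean(axis=1), window)

def partial_dissimilarities(bits, lam=2, n_steps=6):
    """Return [D_0, ..., D_{n_steps-1}] for a 1D array of 0/1 outcomes.

    ASSUMPTION: D_k = |O(k+1,k) - (O(k,k) + O(k+1,k+1)) / 2|, where O(a,b)
    is the mean element-wise product of the arrays at scales a and b.
    """
    bits = np.asarray(bits, dtype=float)
    if bits.size % lam**n_steps:
        raise ValueError("array length must be divisible by lam**n_steps")
    # Scale k is the original array averaged over aligned windows of lam**k.
    scales = [coarse_grained(bits, lam**k) for k in range(n_steps + 1)]
    overlap = lambda a, b: float(np.mean(a * b))
    return [
        abs(overlap(scales[k + 1], scales[k])
            - 0.5 * (overlap(scales[k], scales[k])
                     + overlap(scales[k + 1], scales[k + 1])))
        for k in range(n_steps)
    ]

# Toy input: 1024 shots of a 16-qubit GHZ-like state in the sigma_z basis,
# i.e. every shot is either sixteen 0s or sixteen 1s.
rng = np.random.default_rng(0)
shots = rng.integers(0, 2, size=1024)
array = np.repeat(shots, 16)            # length L = N * N_shots
d_k = partial_dissimilarities(array, lam=2, n_steps=6)
total = sum(d_k)                        # total inter-scale dissimilarity
```

With this toy input, the contributions at scales finer than one bitstring are exactly zero, and the first non-zero term appears once the averaging window exceeds N = 16.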
If this procedure is performed in a single basis, it does not reveal any information on the phase structure of the quantum state, since measurement outcomes are defined solely by the probability distribution on the Hilbert space basis, $|\Psi(S_i)|^2$. Also, unique characterization of a many-body quantum state with a single number is clearly impossible. However, if such bit-string arrays are constructed in two or more different Hilbert space bases, one obtains a sequence of numbers that implicitly contains information on both the amplitude and the phase structure of the state. The more bases are involved, the less likely it is that two different quantum states would share the same dissimilarity signature (in a different context, the tomographic advantage of using several bases was discussed in ref. 19).
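A one-qubit example makes the role of the second basis concrete. The sketch below is purely illustrative (the two states and the choice of the Hadamard gate as the basis change are assumptions made for this example, not part of the protocol): two states that differ only by a relative phase give identical outcome probabilities in the $\sigma_z$ basis and become distinguishable after a basis rotation.

```python
import numpy as np

plus = np.array([1.0, 1.0]) / np.sqrt(2)     # (|0> + |1>)/sqrt(2)
minus = np.array([1.0, -1.0]) / np.sqrt(2)   # (|0> - |1>)/sqrt(2)
hadamard = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)

def outcome_probabilities(state, basis_change=None):
    """Born probabilities of 0/1 outcomes after an optional basis-change gate."""
    if basis_change is not None:
        state = basis_change @ state
    return np.abs(state) ** 2

# Same statistics in the sigma_z basis: the relative phase is invisible.
p_z_plus = outcome_probabilities(plus)
p_z_minus = outcome_probabilities(minus)

# Different statistics once the measurement basis is rotated.
p_x_plus = outcome_probabilities(plus, hadamard)
p_x_minus = outcome_probabilities(minus, hadamard)
```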
In this paper, we do not go beyond measurements in two bases, and this seems enough to characterize several important families of quantum states. As a warm-up, we consider the families of Dicke and Schrödinger cat states, which have compact analytical representations, and demonstrate how the concept of bit-string inter-scale dissimilarity can be used for dimensional reduction and visualization of specific signatures of wave functions. We also reveal the connection between the dissimilarity measure and the von Neumann bipartite entanglement entropy, which plays a central role in quantum information theory. Then, we test our approach by using it for certification of random quantum states characterized by complete delocalization in the Hilbert space, which we do both numerically and analytically. We also show that the proposed approach scales nicely and requires the same experimental effort to certify 16-qubit and 53-qubit states. Using the transverse-field Ising, the Shastry-Sutherland, and the bond-alternating XXZ Heisenberg models as examples, we show that the inter-scale dissimilarity can be used as a universal tool for detecting quantum phase transitions in many-body systems, including the topological ones. Finally, we discuss how the concept of inter-scale dissimilarity can be used for dimensional reduction and visualization of many-body quantum states.

Notable entangled quantum states
To demonstrate the idea of bit-string arrays and inter-scale dissimilarity, we begin with the Schrödinger cat states defined by a superposition of merely two basis vectors in the Hilbert space,

$$|\Psi_\theta\rangle = \cos\left(\tfrac{\theta}{2}\right)|0\rangle^{\otimes N} + \sin\left(\tfrac{\theta}{2}\right)|1\rangle^{\otimes N}.$$

Parametrized by angle θ, this family of states interpolates between the trivial product state $|0\rangle^{\otimes N}$ at θ = 0 and the famous Greenberger-Horne-Zeilinger (GHZ) state

$$|\mathrm{GHZ}\rangle = \frac{1}{\sqrt{2}}\left(|0\rangle^{\otimes N} + |1\rangle^{\otimes N}\right)$$

at θ = π/2. These states can be realized with the quantum circuit [20] shown in Fig. 2a. First, with the rotational gate $U_\theta$ one prepares the state $\cos(\tfrac{\theta}{2})|0\rangle + \sin(\tfrac{\theta}{2})|1\rangle$ of one of the qubits in the system and takes it as a control qubit to perform a controlled-NOT operation on the second qubit. This operation results in a two-qubit entangled state $\cos(\tfrac{\theta}{2})|00\rangle + \sin(\tfrac{\theta}{2})|11\rangle$. Repeating it N − 1 times, one eventually entangles all the qubits and obtains the target Schrödinger cat state.
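The preparation sequence can be checked with a small statevector calculation. The sketch below is an illustration under stated assumptions rather than the circuit of Fig. 2a itself: the rotation is taken to be a real rotation around the y-axis (one possible form of $U_\theta$), the controlled-NOT gates are wired as a chain from qubit q to qubit q + 1, and the function names are hypothetical.

```python
import numpy as np

def apply_cnot(psi, control, target):
    """Apply a controlled-NOT to a statevector stored with shape (2,)*N."""
    out = psi.copy()
    index = [slice(None)] * psi.ndim
    index[control] = 1                        # select the control = |1> branch
    branch = psi[tuple(index)]                # the control axis is dropped here
    axis = target if target < control else target - 1
    out[tuple(index)] = np.flip(branch, axis=axis)   # X on the target qubit
    return out

def cat_state(n_qubits, theta):
    """Build cos(theta/2)|0...0> + sin(theta/2)|1...1> gate by gate."""
    psi = np.zeros((2,) * n_qubits)
    psi[(0,) * n_qubits] = 1.0                # start from |00...0>
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    u_theta = np.array([[c, -s], [s, c]])     # ASSUMED form of the rotation
    psi = np.tensordot(u_theta, psi, axes=([1], [0]))   # rotate qubit 0
    for q in range(n_qubits - 1):             # N - 1 controlled-NOT gates
        psi = apply_cnot(psi, control=q, target=q + 1)
    return psi

psi = cat_state(4, theta=0.7)
amp_all_zeros = psi[0, 0, 0, 0]
amp_all_ones = psi[1, 1, 1, 1]
```

Only the two amplitudes on the all-zeros and all-ones basis vectors are non-zero, and they equal cos(θ/2) and sin(θ/2), respectively.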
Fig. 1 Protocol for computing dissimilarity of a quantum state. a First, one prepares a state on a quantum device and chooses the measurement basis by applying rotational gates U 0 to individual qubits. b In this paper, we work with $\sigma_z$ and random bases whose Bloch sphere representations are shown in the picture. We say that the set of measurements is performed in a random basis if, for each shot of measurement, a random vector belonging to the highlighted sector of the Bloch sphere is uniformly sampled and the corresponding parameters of gate U 0 are applied. c A number of measurements is performed and their outcomes (bitstrings of length N) are then stacked together in a one-dimensional binary array of length $N \times N_{\mathrm{shots}}$ that serves as a classical representation of the quantum state. d The array is coarse-grained in several steps (indexed with k). Different schemes can be employed, but here we use plain averaging with fixed filter size Λ. In the picture, blue and white squares in the top line correspond to '0' and '1' bits in the array shown in (c), and black rectangles depict the blocks where averaging occurs at every step of coarse-graining. Overlap-based dissimilarities $\mathcal{D}_k$ between subsequent arrays are computed and summed up to the overall dissimilarity $\mathcal{D}$. See Methods section for more details.
In the $\sigma_z$ basis, projective measurements of such states can only result in either the 0000…0 or the 1111…1 bitstring. Clearly, the first steps of coarse-graining (Fig. 2b) affect only the internal content of individual bitstrings of length N, where it simply maps 0000…0 → 0000…0 and 1111…1 → 1111…1. Thus the randomly assembled array of bitstrings remains intact, and partial dissimilarities $\mathcal{D}_k = 0$ for k such that $\Lambda^k < N$ (for k < 4 when we take N = 16 and Λ = 2). At $\Lambda^k \geq N$, the coarse-graining flow starts mixing individual bitstrings, and non-trivial contributions to the dissimilarity emerge. In the random basis (Fig. 2c), $\mathcal{D}_k$ take finite values at all scales k, though due to the trivial structure of basis vectors defining $\Psi_\theta$, partial dissimilarities do not depend on θ at $\Lambda^k < N$.
Importantly, each state presented in Fig. 2b, c reveals a distinct set of $\mathcal{D}_k$ which can be used to distinguish states from each other. Schrödinger cat states are the simplest example of many-body entangled wave functions, but in what follows we will show that the same idea can be exploited when dealing with much more complex states. It should be stressed once more that, while individual bitstrings are assembled into the array in a random order set by the outcomes of consecutive projective measurements, the partial dissimilarities and their total sum are robust upon repeatedly performing the set of measurements.
Another type of entangled states that are instructive to consider is the family of Dicke states [21],

$$|\Psi_D\rangle = \binom{N}{D}^{-1/2} \sum_j P_j\left(|0\rangle^{\otimes (N-D)} \otimes |1\rangle^{\otimes D}\right),$$

where the sum goes over all possible permutations of qubits. By increasing D from 1 to N/2, one increases the number of basis vectors involved in the quantum state. Recently, these states have been experimentally realized [22], and their verification is a challenging task if the number of qubits is large [23]. As a proof of concept, in this paper we study Dicke states of 16 qubits, and initialize them on the quantum simulator using the Least Significant Bit procedure [24].
Partial dissimilarities of the 16-spin Dicke states computed in the $\sigma_z$ basis and in the random basis with filter size Λ = 2 are shown in Fig. 3. One can see that two different bases encode information about two ranges of scales. For any given parameter D, when bit-string arrays are constructed from measurements in the $\sigma_z$ basis, $\mathcal{D}_k$ take non-zero values only for k < 4. This follows from the fact that all the Hilbert space basis vectors possessing non-zero amplitudes have an equal number of spin-up entries, so that after 4 steps of averaging every bitstring reduces to exactly the same number, and all the patterns are destroyed. In contrast, in the random basis, states with different D can be distinguished from $\mathcal{D}_k$ at larger spatial scales, k ≥ 4.
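The argument about the $\sigma_z$ basis can be verified directly on sampled data. The following sketch assumes the standard form of a Dicke state, an equal-weight superposition of all bitstrings with exactly D ones, so that a $\sigma_z$ measurement returns a uniformly random arrangement of D ones among N positions. The function name and parameter values are hypothetical.

```python
import numpy as np

def sample_dicke_bitstrings(n_qubits, n_excited, n_shots, seed=0):
    """Sample sigma_z outcomes of a Dicke state with `n_excited` ones.

    ASSUMPTION: all bitstrings with exactly `n_excited` ones are equally
    probable, so each shot is a random permutation of a fixed template.
    """
    rng = np.random.default_rng(seed)
    template = np.zeros(n_qubits, dtype=int)
    template[:n_excited] = 1
    rows = np.tile(template, (n_shots, 1))
    return rng.permuted(rows, axis=1)     # shuffle every row independently

shots = sample_dicke_bitstrings(n_qubits=16, n_excited=4, n_shots=8)
array = shots.reshape(-1)                 # one-dimensional bit-string array

# Averaging over a full bitstring (window 2**4 = 16) gives the same value
# D/N for every shot, so nothing is left to distinguish at coarser scales.
block_means = array.reshape(-1, 16).mean(axis=1)
```

Every entry of `block_means` equals D/N = 0.25 here, which is why the pattern carries no structure beyond the scale of a single bitstring in this basis.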
Fig. 2 [...] Here, Λ = 2. The peak at k = 4 is due to the fact that the coarse-graining window becomes of the size of the system at this scale, $\Lambda^k = N$. d Visualization of bit-string arrays. In these images, individual bitstrings are horizontal lines of 16 bits that are concatenated to form a long string (stacked here vertically for the purpose of presentation). The left picture shows an example of an array sampled from a cat state with θ = π/2 in the $\sigma_z$ basis, and the right one shows an array measured in the random basis. Here k = 0 represents the texture of the measured array per se, and k > 0 show its evolution upon coarse-graining.

Since both families of states smoothly interpolate between the regimes of low and high entanglement, it is interesting to study whether there are any relations between the introduced measure of inter-scale dissimilarity and quantum correlations. To do that, we consider the von Neumann entanglement entropy

$$S = -\mathrm{Tr}\left(\rho_A \log \rho_A\right), \qquad \rho_A = \mathrm{Tr}_B\,|\Psi\rangle\langle\Psi|,$$

where the system is divided into two equal parts A and B of N/2 qubits, compute its dependence on either θ or D (depending on the family), and plot it alongside the inter-scale dissimilarity of bit-string arrays computed in the $\sigma_z$ basis. The result is shown in Fig. 4. The Dicke and the Schrödinger cat states are quite different in that variation of parameter D modifies the structure of the wave function support in the Hilbert space basis, while θ only changes the balance between two basis vectors bearing non-zero amplitudes. Nevertheless, in both cases dissimilarity nicely captures the dependence of entropy on the parameters labeling the state within the family. Although the precise analytical correspondence between these two concepts is still to be revealed, it could be a good indication that it is possible to employ dissimilarity to estimate entanglement entropy, which is generally very difficult to reconstruct from experimental measurements, especially when dealing with multi-qubit systems inaccessible to quantum tomography. In a certain way, it is similar to the approach proposed in ref. 25, where it was shown that, with the help of neural networks, entanglement can be reconstructed from visual pattern representations of quantum states.
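For reference, the bipartite entropy of a pure state is straightforward to evaluate numerically from the Schmidt decomposition. The sketch below does this for the Schrödinger cat family with an equal split into two halves; the natural logarithm is an assumption (the base only rescales the entropy), and the function names are hypothetical.

```python
import numpy as np

def cat_amplitudes(n_qubits, theta):
    """Amplitudes of cos(theta/2)|0...0> + sin(theta/2)|1...1>."""
    psi = np.zeros(2 ** n_qubits)
    psi[0] = np.cos(theta / 2)        # |00...0>
    psi[-1] = np.sin(theta / 2)       # |11...1>
    return psi

def bipartite_entropy(psi, n_qubits):
    """Von Neumann entropy (natural log) for an equal A|B split."""
    half = n_qubits // 2
    matrix = psi.reshape(2 ** half, 2 ** (n_qubits - half))
    schmidt = np.linalg.svd(matrix, compute_uv=False)
    probs = schmidt ** 2              # eigenvalues of the reduced density matrix
    probs = probs[probs > 1e-15]      # drop numerical zeros before the log
    return float(-np.sum(probs * np.log(probs)))

thetas = np.linspace(0.0, np.pi / 2, 5)
entropies = [bipartite_entropy(cat_amplitudes(16, t), 16) for t in thetas]
```

The entropy grows monotonically from zero for the product state at θ = 0 to log 2 for the GHZ state at θ = π/2.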

Random quantum states
Our next goal is to demonstrate that the dissimilarity measures can serve as a signature not only of highly structured states with simple analytical representations, but also of rather generic many-body states. To do that, we consider Haar-random wave functions uniformly sampled from the Hilbert space and characterized by the Porter-Thomas distribution of bit-string probabilities $p = |\langle x_1, \ldots, x_N|\psi\rangle|^2$, which have recently been used to demonstrate quantum supremacy [6]. These states play an important role in quantum chaos theory [26], quantum information theory [27] and information processing, including the research domains of superdense coding of quantum states [28] and data hiding [29], and even transport phenomena [30]. Complete tomography of a given random state is an extremely complicated task, since the minimal number of measurements needed to reconstruct a random quantum state is of the order of the Hilbert space dimension [4,7]. Here we show that, to certify whether a state belongs to the Haar-random class, one can resort to computing inter-scale dissimilarities of relatively short bit-string arrays.
As was shown in refs. 6,30,31, random quantum states can be initialized with shallow pseudo-random circuits that can differ in the number and types of gates, and practical realization of these circuits on a real quantum device depends on its architecture. In this work, we generate random quantum states of a 16-qubit system on the IBM quantum simulator with the protocol proposed in ref. 30, which guarantees an accurate approximation of the Haar-random state with the compact circuit shown in Fig. 5a. More specifically, the circuit is formed in cycles, each having one- and two-qubit gate layers. Within the first layer, for each qubit in the system one randomly chooses from the $\sqrt{X}$, $\sqrt{Y}$ and T gates, where $\sqrt{X}$ ($\sqrt{Y}$) is a π/2 rotation around the x-axis (y-axis) of the Bloch sphere, and the non-Clifford gate is $T = \mathrm{diag}(1, e^{i\pi/4})$. In turn, the second layer comprises controlled-Z gates, diag(1, 1, 1, −1), whose topology is randomly chosen from the set of configurations with fixed couplings between qubits, as described in ref. 30.
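The gate set of one cycle can be written down explicitly. In the sketch below the π/2 rotations are taken in the convention $\exp(-i\pi\sigma/4)$, which fixes the global phase and is an assumption; the helper that draws a random single-qubit layer is hypothetical and ignores any additional placement rules of the protocol of ref. 30.

```python
import numpy as np

# pi/2 rotations around x and y (global-phase convention is an assumption).
SQRT_X = np.array([[1, -1j], [-1j, 1]]) / np.sqrt(2)    # exp(-i*pi/4 * X)
SQRT_Y = np.array([[1, -1], [1, 1]]) / np.sqrt(2)       # exp(-i*pi/4 * Y)
T_GATE = np.diag([1, np.exp(1j * np.pi / 4)])           # non-Clifford T gate
CZ = np.diag([1, 1, 1, -1]).astype(complex)             # controlled-Z gate

PAULI_X = np.array([[0, 1], [1, 0]], dtype=complex)

def random_single_qubit_layer(n_qubits, rng):
    """Pick one of the three single-qubit gates independently for each qubit."""
    names = rng.choice(["sqrtX", "sqrtY", "T"], size=n_qubits)
    return [str(name) for name in names]

rng = np.random.default_rng(0)
layer = random_single_qubit_layer(16, rng)

# Two pi/2 rotations around x compose to a pi rotation, i.e. X up to a phase.
x_up_to_phase = SQRT_X @ SQRT_X
```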
In both the $\sigma_z$ and the random bases, the inter-scale partial dissimilarities of the array generated by sampling 8192 bitstrings from a random quantum state follow the same decaying profile, Fig. 5b. Such a profile is a robust signature of typical Haar-random states. It can be shown that, for a chosen filter size Λ, the dependence of $\mathcal{D}_k$ on the step index k obeys a simple analytical law in the averaging coarse-graining scheme, Eq. (4). To derive this law from Eq. (9), the central limit theorem must be employed, as elaborated in the "Methods" section. This dependence is easy to reconstruct from a limited number of simple projective measurements, and it serves as a signature of the class of typical Haar-random states.
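The decaying profile can be reproduced qualitatively on a classical computer. The sketch below draws a Haar-random 16-qubit state as a normalized complex Gaussian vector, samples 8192 bitstrings in the computational basis, and evaluates the mean squared change of the array between successive coarse-graining scales. This quantity is used here only as a proxy that is assumed to be proportional to the overlap-based partial dissimilarity; it is not Eq. (4) itself, and all names are hypothetical.

```python
import numpy as np

def haar_random_state(n_qubits, rng):
    """Normalized complex Gaussian vector, which is Haar-distributed."""
    dim = 2 ** n_qubits
    psi = rng.normal(size=dim) + 1j * rng.normal(size=dim)
    return psi / np.linalg.norm(psi)

def sample_bit_array(psi, n_qubits, n_shots, rng):
    """Sample bitstrings with Born probabilities and flatten them."""
    probs = np.abs(psi) ** 2
    outcomes = rng.choice(probs.size, size=n_shots, p=probs)
    bits = (outcomes[:, None] >> np.arange(n_qubits)) & 1
    return bits.reshape(-1).astype(float)

def mean_square_change(array, lam, k):
    """Mean squared difference between scales k and k + 1 (proxy for D_k)."""
    fine = np.repeat(array.reshape(-1, lam**k).mean(axis=1), lam**k)
    coarse = np.repeat(array.reshape(-1, lam**(k + 1)).mean(axis=1),
                       lam**(k + 1))
    return float(np.mean((coarse - fine) ** 2))

rng = np.random.default_rng(1)
psi = haar_random_state(16, rng)
array = sample_bit_array(psi, 16, 8192, rng)
profile = [mean_square_change(array, 2, k) for k in range(5)]
ratios = [profile[k + 1] / profile[k] for k in range(4)]
```

In this sketch each step of coarse-graining reduces the proxy by a factor close to 1/Λ, i.e., the profile decays geometrically with k.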
To go beyond the simple 16-qubit case and perform an ultimate test of the method, we have applied it to the real experimental data generated on the Google Sycamore quantum processor [6]. For systems of 16, 32 and 53 qubits, we have taken 8192 bitstrings measured in the $\sigma_z$ basis and calculated partial dissimilarities, which turned out to perfectly fit Eq. (4). The result for the prominent example of the 53-qubit system is presented in Fig. 5c.
In a real-world scenario, the bit-string arrays are clearly subject to gate errors and other sources of noise, and we have to understand how these imperfections are reflected in the dissimilarity signatures of the state. Previous studies [6,31] have demonstrated that random quantum states are hypersensitive to gate errors, which is considered to be a defining property of quantum chaos. When the error rates increase, the distribution of probabilities of the bitstrings generated by a random circuit deviates from the Porter-Thomas law $\Pr(p) = 2^N e^{-2^N p}$ and converges to equal probabilities of all the bitstrings: $\Pr(p) = \delta(2^{-N} - p)$. To quantify this deviation, the authors of refs. 6,31 have introduced the cross-entropy benchmarking procedure. It allows one to estimate, with a limited number of measurements, how close a sampler (a given quantum circuit) is to one of the two limiting cases: the ideal random quantum circuit with the Porter-Thomas distribution of probabilities and the uniform sampler with identical probabilities $p(x_1, \ldots, x_N) = 2^{-N}$. In this respect, it is important to check whether outputs of quantum circuits with the Porter-Thomas and the uniform probability distributions can be distinguished by calculating the inter-scale dissimilarity.
To elaborate on this point, we prepared a quantum circuit consisting of only the Hadamard gates that generates a 16-qubit state with uniform probabilities in the $\sigma_z$ basis:

$$|X\rangle = H^{\otimes N}|0\rangle^{\otimes N} = 2^{-N/2}\sum_{x_1,\ldots,x_N}|x_1,\ldots,x_N\rangle.$$

The obtained dissimilarity profile of the generated uniform state fully coincides with that obtained for the chaotic quantum circuits (Fig. 5b), with the overall dissimilarity $\mathcal{D}^z = 0.25$. Thus, from $\sigma_z$ basis measurements we cannot distinguish these two states that are fully delocalized in the Hilbert space. However, in the random basis they have different profiles of $\mathcal{D}_k$ and overall $\mathcal{D}$. While the chaotic quantum circuit is characterized by an isotropic character of the dissimilarity that is independent of the measurement basis (so, the $\mathcal{D}_k$ profile is exactly the same in the random and the $\sigma_z$ bases), the $|X\rangle$ state in the random basis reveals its trivial nature. Namely, its $\{\mathcal{D}^r_k\}$ and the resulting dissimilarity $\mathcal{D}^r = 0.204$ coincide with those obtained for $|0\rangle^{\otimes 16}$ in the random basis (as shown in Fig. 3b). This suggests that the inter-scale dissimilarity can be used to quantify deviations from truly chaotic quantum states, which would be interesting to verify experimentally.

Phase transitions in magnetic systems
Since the inter-scale bit-string dissimilarity appears to be a rather unique signature of a many-body state, it is natural to expect that it should be sensitive to crossing phase boundaries in the parametric spaces of many-body quantum systems. If so, one can hope that it can be used as a sensitive indicator of phase transitions and directly used for constructing quantum phase diagrams, which is a crucial task in understanding the phenomenology of correlated materials and in designing materials. The common practice is to distinguish different phases of a quantum or classical many-body system by calculating the order parameter [32] and low-order correlation functions such as susceptibility, scalar chirality and others. However, in many cases devising the order parameter is a non-trivial analytical problem, especially in the case of topological phases [33]. Besides that, a quantum system may have a rich variety of different electronic and magnetic phases depending on internal (interactions) and external (temperature, pressure, magnetic field) parameters, and there could be no universal operator that can probe the whole phase diagram.
To overcome this problem, a lot of effort has been put into designing alternative approaches based on neural networks [34–36], unsupervised machine learning techniques [37], and quantum information theory concepts [38,39]. These methods usually rely on manipulating eigenstates of the quantum system on a classical computer, which puts natural limitations on the size of systems that can be studied in this way. Also, it can be time- and resource-demanding to conduct, e.g., the learning procedure.
The progress in developing quantum simulators and quantum computing devices suggests a distinct way for large-scale representation of quantum systems and analysis of their phase diagrams. Instead of solving the Hamiltonian numerically, one can imitate it in, e.g., an optical experiment. For example, by varying the depth of the potential in optical lattices, one can change the ratio between hopping integrals and on-site Coulomb interaction in the simulated strongly-correlated electronic or bosonic system, and scan through its parametric space in this way. Recent advances in this field include simulation of the electronic metal-to-Mott-insulator transition [40,41] and destruction of the antiferromagnetic long-range order with temperature and doping [42]. Analysis of such experiments is then conducted by means of a limited set of site-resolved measurements performed on the system, and the relevant information should be extracted from these measurements, whose number is much smaller than the Hilbert space dimension. We refer the reader to refs. 43,44 for an interesting machine learning-based approach to the analysis of optical lattice experiments, and in what follows we discuss how the concept of bit-string arrays and their inter-scale dissimilarity can aid reconstruction of phase diagrams of simulated quantum matter.
As some of us have shown in ref. 17, the classical prototype of inter-scale dissimilarity, the structural complexity of patterns, can be used to detect phase transitions in classical systems without any prior knowledge of the order parameter, and in an extremely numerically cheap unsupervised manner. Now, we will show how it can be extended to the quantum case and help reconstruct quantum phase diagrams of many-body systems from simple projective measurements. We will be using the transverse-field Ising and the Shastry-Sutherland models as examples.
The simplest example of a quantum phase transition is the paramagnet-to-ferromagnet transition in the ferromagnetic Ising model in a transverse magnetic field, given by the Hamiltonian

$$H = J \sum_{i} S^z_i S^z_{i+1} + h \sum_{i} S^x_i,$$

where J and h are the exchange interaction between nearest-neighbor spins and the external magnetic field along the x-axis, respectively, and we consider the case of a one-dimensional chain of 16 spins with periodic boundaries. The critical value of the magnetic field is known to be $h_c = 0.5|J|$, and reproducing this value is the first benchmark test for our method before we consider more sophisticated examples.

Fig. 5 Preparation of chaotic states. a Fragment of a quantum circuit generating a chaotic quantum state according to the protocol proposed in ref. 30. b Partial dissimilarities (red circles) of bit-string arrays resulting from 8192 projective measurements of a 19-layer-deep quantum chaotic circuit with 16 qubits in the $\sigma_z$ basis (in the random basis, the profile of $\mathcal{D}_k$ is exactly the same). Here, filter size Λ = 2. The dashed line shows the analytical fit with Eq. (4). c Partial dissimilarities of bit-string arrays in the $\sigma_z$ basis resulting from 8192 projective measurements of the state produced by the 53-qubit Sycamore quantum processor by Google. These data were taken from ref. 6, and different filter sizes Λ were used to compute $\mathcal{D}_k$. Dashed lines correspond to Eq. (4). Random basis measurements are not available in this case.
In the regime of weak magnetic field, the 16-spin system's ground state obtained with the exact diagonalization approach [45] is a superposition of two fully polarized states, $|\uparrow\rangle^{\otimes N}$ and $|\downarrow\rangle^{\otimes N}$, which is nothing but the entangled GHZ state discussed above. In the $\sigma_z$ basis, the bit-string array generated by projective measurements is a random sequence of 000…0 and 111…1 blocks. In turn, at very high magnetic fields the qubits are pointing in the same direction along the x axis, and the state is just a trivial product state that can be obtained from $|0000\ldots0\rangle$ by rotating all the qubits with the same Hadamard gate.
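Both limiting regimes can be seen in a small exact-diagonalization sketch. The code below assumes the Hamiltonian $H = J\sum_i S^z_i S^z_{i+1} + h\sum_i S^x_i$ with spin-1/2 operators $S = \sigma/2$, and uses a chain of 8 sites instead of 16 so that a dense matrix stays cheap; function names and parameter values are hypothetical.

```python
import numpy as np

def spin_operator(single, site, n_sites):
    """Embed a 2x2 operator acting on `site` into the full chain."""
    op = np.array([[1.0]])
    for j in range(n_sites):
        op = np.kron(op, single if j == site else np.eye(2))
    return op

def ising_ground_state(n_sites, coupling, field):
    """Ground state of the ASSUMED H = J sum Sz Sz + h sum Sx (periodic)."""
    sz = np.array([[0.5, 0.0], [0.0, -0.5]])
    sx = np.array([[0.0, 0.5], [0.5, 0.0]])
    dim = 2 ** n_sites
    ham = np.zeros((dim, dim))
    for i in range(n_sites):
        j = (i + 1) % n_sites                     # periodic boundary
        ham += coupling * (spin_operator(sz, i, n_sites)
                           @ spin_operator(sz, j, n_sites))
        ham += field * spin_operator(sx, i, n_sites)
    energies, vectors = np.linalg.eigh(ham)       # eigenvalues in ascending order
    return vectors[:, 0]

# Outcome probabilities in the sigma_z basis for weak and strong fields.
p_weak = np.abs(ising_ground_state(8, coupling=-1.0, field=0.05)) ** 2
p_strong = np.abs(ising_ground_state(8, coupling=-1.0, field=5.0)) ** 2
ghz_weight = p_weak[0] + p_weak[-1]    # weight on all-up and all-down strings
```

At weak field nearly all of the probability sits on the two fully polarized bitstrings, whereas at strong field the distribution over bitstrings is close to uniform.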
Figure 6 shows the overall dissimilarity as a function of the magnetic field. One can see that in both the $\sigma_z$ and the random bases, the dissimilarity steadily decreases with increasing h, and the corresponding derivative $\mathcal{D}'(h)$ reveals the well-known transition point at h = 0.5 (we take J = −1). The phase transition is also reflected in the partial dissimilarities $\mathcal{D}_k$ corresponding to individual renormalization steps. At low magnetic fields, the state is close to GHZ and there is clearly little inter-scale dissimilarity at small k: on the fine scale, coarse-graining of $|0000\ldots0\rangle$ does not bring any dissimilarity, and the main contributions to $\mathcal{D}$ come from larger k, i.e., from the spatial scales covering several N-qubit blocks. In contrast, at larger fields finer scales start playing a more important role. For each k, the phase transition at h = 0.5 is visible in the derivative $\mathcal{D}'_k(h)$.

A much less trivial test of the method is to check whether it can reveal transition points in highly frustrated spin systems with richer phase diagrams. For that, we consider the Shastry-Sutherland model [46] with competing antiferromagnetic interactions on the orthogonal dimer lattice, which plays a crucial role in understanding physical properties of the SrCu$_2$(BO$_3$)$_2$ system [47–49]. The corresponding Hamiltonian contains intra- and inter-dimer interactions, which are denoted $J_1$ and $J_2$, respectively (Fig. 7):

$$H = J_1 \sum_{\langle i,j \rangle_{\mathrm{intra}}} \mathbf{S}_i \cdot \mathbf{S}_j + J_2 \sum_{\langle i,j \rangle_{\mathrm{inter}}} \mathbf{S}_i \cdot \mathbf{S}_j.$$

As was previously shown, the system features a gapped singlet ground state at $J_2 = 0$, a gapless long-range antiferromagnetic Néel state at $J_2 \gg J_1$, but also a plaquette phase in between, in the range of $0.67 < J_2/J_1 < 0.76$. While, strictly speaking, the quantum phase transition is defined in the thermodynamic limit of infinite lattices, its precursor could be detected already in a small system [38]. For example, in the case of the Shastry-Sutherland model it has been suggested that, by analyzing the spin gap and spin-spin correlation functions, one can extract the singlet-plaquette and plaquette-Néel transitions from exact diagonalization studies of small clusters [50]. We are going to show that it can also be done with the inter-scale dissimilarity measure, which is agnostic about the nature of the phase transition and much easier to implement on quantum simulators and quantum computers.
We have performed an exact diagonalization study [45] of a 16-spin Shastry-Sutherland supercell, the smallest cluster on which the model can be defined. Its energy spectrum is presented in Fig. 7. One can see that up to $J_2 = 0.66J_1$ the ground state of the system is the singlet state separated from the first excited state by a non-zero spin gap, and its energy is independent of the inter-dimer coupling value $J_2$. At $J_2 = 0.66J_1$ a quantum phase transition takes place. The previous studies [50] have shown that increasing the supercell size does not change the position of the critical point. The inter-scale dissimilarity (Fig. 8) naturally captures this transition: for $J_2 < 0.66J_1$, $\mathcal{D}$ of the ground state computed from 8192 measurements is a constant, $\mathcal{D} = 0.25$, and an abrupt transition occurs at the critical point in both the $\sigma_z$ and the random bases. The corresponding partial dissimilarities at $J_2 = 0$ and $J_2 = J_1$ are shown in Fig. 8c.
In the thermodynamic limit, the cases of $J_2 = 0$ and $J_2 = 1$ correspond to the magnetic phases with and without a spin gap between the ground and the first excited state. In the finite-size system, it means that non-trivial signatures of phase transitions could be encoded not only in the ground state, but also in the excitation spectrum. At $J_2 < 0.55J_1$, the first excited state has a three-fold degeneracy: it is of triplet type with total spin values $S_z = 0, \pm 1$. Above the transition point, it is replaced with a two-fold degenerate state with zero total spin. This state reconfiguration causes the difference in magnetization profiles for the inter-dimer order parameter above and below the point of $J_2 = 0.55J_1$ when the external magnetic field is applied. According to the previous studies [51], the magnetization features a plateau at 1/8 of the full moment for $J_2 = 0.65$, but not for $J_2 = 0.4$.
At the point $J_2 = 0.76 J_1$ (Fig. 7), the plaquette-Néel phase transition takes place. The stability of this point upon varying the system size was previously confirmed by different methods 47,50,52.
From Fig. 8, one can see that all three transitions, at $J_2 = 0.55 J_1$, $J_2 = 0.66 J_1$, and $J_2 = 0.76 J_1$, are accurately reflected in the inter-scale dissimilarity of the bit-string arrays sampled in the $\sigma^z$ and random bases from the first excited state of the Shastry-Sutherland model. We also show that the partial dissimilarities of the ground state calculated for $J_2 = 0$ and $J_2 = 1$ have specific distinguishable profiles. We believe this to be a strong argument in favour of the universality of the suggested approach to automatic construction of phase diagrams of many-body systems simulated on quantum devices.
While dissimilarity is computed for one-dimensional bitstrings, to account for the two-dimensional nature of the system we group the spins in 2d clusters before flattening, as shown in the inset of Fig. 8a. We have also checked that several other ways to enumerate sites and form bitstrings do not reveal any additional phase transitions in comparison with the results presented in Fig. 8, although the concrete shape of the dissimilarity curve as a function of the model parameters can be different. We admit that, in the general case, the approach to flattening and concatenation can in principle affect the results.
So far we have been computing the inter-scale dissimilarity of arrays composed of 8192 measured bitstrings. However, a much smaller number of measurements suffices to detect phase transition points in many-body quantum systems. We found that, in the $\sigma^z$ basis, the partial dissimilarities $D_k$ of the Ising model ground states remain almost the same when we do 256 measurements instead of 8192. In the random basis, the minimal number of measurements that allows one to reveal the ferromagnetic-paramagnetic transition is about 1024. In turn, the abrupt changes in the inter-scale dissimilarity of the Shastry-Sutherland model states could be revealed with a mere 16 measurements. Thus, the method we propose allows one to accurately reconstruct phase diagrams of quantum spin Hamiltonians by using small-size supercells and a limited number of measurements.

Topological quantum phases
While the inter-scale dissimilarity seems to be sensitive enough to detect quantum phase transitions, it should be tested whether it can also be used to identify boundaries of topologically non-trivial phases. This task is more challenging since transitions of this type are governed by changes in the global properties of the system and often cannot be probed at the level of local correlation functions. An archetypal example of a system hosting topological order is the one-dimensional bond-alternating XXZ model 53,54, which exhibits time-reversal, inversion, and rotational symmetries. The corresponding Hamiltonian has the form
$$H = J \sum_{i \in O} \left( S^x_i S^x_{i+1} + S^y_i S^y_{i+1} + \delta\, S^z_i S^z_{i+1} \right) + J' \sum_{i \in E} \left( S^x_i S^x_{i+1} + S^y_i S^y_{i+1} + \delta\, S^z_i S^z_{i+1} \right),$$
where $O$ and $E$ denote, respectively, the odd and the even bonds of the one-dimensional spin chain with open boundaries (denoted with red and blue in Fig. 9a). Depending on the value of the anisotropy parameter $\delta$ and the ratio of the exchange interactions, $J'/J$, the XXZ model was shown to host three different phases 53: trivial, topological and antiferromagnetic. Each phase is defined by the corresponding value of the partial-reflection many-body topological invariant. The latter is a non-trivial correlation function, which can be constructed only with some prior knowledge of the properties of the system. In this regard, the recent development of unsupervised approaches 54 to detect different phases of topological quantum systems is of timely interest.
Here, we consider the 16-spin chain, sample bitstrings of length 8 from the central part of the system, and use them to compute the dissimilarities. In this way, we manage to reproduce the phase diagram of the model, which has been previously obtained by calculating the topological invariant introduced in ref. 53. Only the central spins are measured in order to minimize the boundary effects caused by the small chain length. In contrast to the Ising and the Shastry-Sutherland models, for which $D_z$ and $D_r$ were found to behave similarly at all critical points, in the case of the bond-alternating XXZ model we observe that each of the two dissimilarities reveals its own phase transition. The random-basis dissimilarity $D_r$ indicates the transition between the trivial (I) and the antiferromagnetic (III) phases. In turn, the $\sigma^z$-basis dissimilarity $D_z$ senses the boundary between the antiferromagnetic (III) and the topological (II) phases. Here, to obtain smooth and accurate phase boundaries, we took $N_{\mathrm{shots}} \sim 10^6$. This is much larger than what was required in the previously considered cases. Still, from the point of view of real experiments 6, such a number of measurements is absolutely realistic.
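The caption of the XXZ phase-diagram figure defines the phase boundaries as the maxima of the dissimilarity derivative with respect to $J'$ at fixed $\delta$. A minimal sketch of that step is given below. The function name, the use of central finite differences, and the sample numbers are illustrative assumptions, not the authors' implementation.

```python
def boundary_from_dissimilarity(couplings, dissimilarities):
    """Return the coupling at which |dD/dJ'| is largest.

    Assumption: the derivative is estimated with central finite differences
    on a grid of at least three points; the source does not specify the scheme.
    """
    best_coupling, best_slope = None, -1.0
    for i in range(1, len(couplings) - 1):
        slope = abs((dissimilarities[i + 1] - dissimilarities[i - 1])
                    / (couplings[i + 1] - couplings[i - 1]))
        if slope > best_slope:
            best_coupling, best_slope = couplings[i], slope
    return best_coupling


# Made-up dissimilarity curve with a single sharp change around J' = 2.
j_grid = [0.0, 1.0, 2.0, 3.0, 4.0]
d_curve = [0.10, 0.10, 0.25, 0.40, 0.40]
print(boundary_from_dissimilarity(j_grid, d_curve))
```

Repeating this for every value of $\delta$, separately for $D_z$ and $D_r$, would trace out two boundary lines of the kind described above.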
Interestingly, to reconstruct the phase diagram using classical shadow tomography, which is in principle a more complete protocol than the inter-scale dissimilarity approach, additional post-processing steps are required (such as Principal Component Analysis), and it is still problematic to precisely identify the phase boundaries 54. This indicates that the dissimilarity approach could be more advantageous in the context of studying phase transitions in experiments with programmable quantum simulators.

Multi-basis dissimilarity map
So far, we have analyzed a number of distinct examples of quantum states and demonstrated that their inter-scale dissimilarities (both overall and partial) computed in different measurement bases can be regarded as easily measurable signatures. To make this discussion more concise, it is natural to consider all the states within a single unifying context.
To accomplish that, we introduce the concept of a dissimilarity map. For the sake of clearer visualization, assume that we characterize each quantum state with only two numbers: its overall dissimilarities $D_z$ and $D_r$ measured in the $\sigma^z$ and random bases, respectively. Each state is then represented by a point in a two-dimensional space. Figure 10 shows several classes of states plotted on such a map. One can see that states belonging to different families group into recognizable lines. The dissimilarity map can then be thought of as an approach to dimensional reduction that embeds higher-dimensional data in a plane (if more bases were used, it would be a three- or four-dimensional space instead). Some states still share the same location on the map, like the singlet state of the Shastry-Sutherland model $\Psi_s$ and the chaotic state $\Psi_{\mathrm{Haar}}$, both having $D_r = D_z = 0.25$. This is not unexpected, since a many-body state cannot be uniquely represented with only two numbers. However, by also taking into account their partial dissimilarity profiles (Figs. 5 and 8) we can distinguish the states. In this way, $D$ and $D_k$ computed in several (two or more) different bases together form a hash of the quantum state.

DISCUSSION
In this paper, we have shown that bit-string arrays resulting from projective measurements of many-body quantum systems should be viewed as objects possessing an internal hidden structure that contains important information about the measured quantum state. By computing inter-scale dissimilarities of the arrays, it is possible to define a specific characteristic of the state which serves as its 'hash' and can then be used to certify the state and to estimate its closeness to the desired target state.
Two measures have been introduced: the overall dissimilarity $D$ of the array in a chosen measurement basis, and the scale-dependent set of partial dissimilarities $D_k$, which are the building blocks of the quantum state signature. Since the bit-string array in a fixed basis is defined only by the probability distribution over the Hilbert space basis, $|\psi(S_i)|^2$, it does not distinguish between two wave functions with the same set of amplitudes but a different phase structure. Thus it is important to compute $D$ and $D_k$ in two or more different bases. Since the procedure of performing projective measurements and computing dissimilarities is experimentally simple and numerically cheap, it is easy to repeat it in several bases and construct a hash consisting of several numbers.
We would like to stress that, in fact, the use of at least two measurement bases to characterize a quantum system is not only practical, but also an important conceptual requirement directly related to Bohr's complementarity principle 55,56. According to this principle, when observing a quantum system one gains information not about the quantum state per se but rather about the results of its interaction with a classical measuring device. Formally, the result of this interaction is described by the von Neumann theory of measurements 57 as a projection of the system density matrix with only diagonal elements surviving in the basis dictated by the device. The use of at least two noncommutative projection operators corresponding to two complementary measurement devices is a necessary prerequisite of quantumness, as follows from a general 'separation-of-conditions principle' 58. The latter dictates a description of quantum quantities by, at least, two-index matrices rather than 'classical' strings.
Fig. 10 Dissimilarity map. Low-dimensional representation of the 16-qubit quantum states studied in this work with respect to their dissimilarity calculated in the $\sigma^z$ and random bases. $\Psi_0$, $\Psi_s$, $\Psi_{\mathrm{Haar}}$ denote the trivial $|0\rangle^{\otimes N}$, the singlet and the random quantum states, respectively.
It has to be admitted that uniqueness of this signature is not guaranteed, and one cannot exclude the possibility that two distinct quantum states have similar sets of $D$ and $D_k$. However, if the number of involved measurement bases is large enough, such a coincidence seems highly unlikely. Here, we have constructed merely two-dimensional dissimilarity maps for bit-string arrays obtained from measurements in the random and $\sigma^z$ bases, and this was already enough to characterize several important families of many-body quantum states. In the cases when two different wave functions were indistinguishable on the map (like the singlet and the chaotic states), they could be distinguished by their $D_k$ sets. If one is concerned about the issue of non-uniqueness, the method can be used as a cheap preprocessing scheme within a larger framework of certification. First, the dissimilarity signature is computed, and if it strongly deviates from the target state signature, the prepared state can be discarded right away. Only if the two states appear close enough should a more advanced analysis be performed.
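The two-stage screening described above can be sketched as follows. The Euclidean norm, the function names, and the tolerance value are assumptions made for illustration only; the text does not fix a particular norm or threshold.

```python
import math


def signature_distance(candidate, target):
    """Euclidean distance between two equally long lists of dissimilarities.

    Assumption: a signature is a flat list such as [D_z, D_r] or the
    concatenated partial dissimilarities D_k from all measured bases.
    """
    if len(candidate) != len(target):
        raise ValueError("signatures must have the same length")
    return math.sqrt(sum((c - t) ** 2 for c, t in zip(candidate, target)))


def passes_screening(candidate, target, tolerance):
    """Cheap first stage: keep the state only if its signature is close to the target."""
    return signature_distance(candidate, target) <= tolerance


target_signature = [0.25, 0.25]      # e.g. (D_z, D_r) of the desired state
prepared_signature = [0.24, 0.27]    # hypothetical measured values
if passes_screening(prepared_signature, target_signature, tolerance=0.05):
    print("close enough: proceed to a more advanced analysis")
else:
    print("discard the prepared state")
```

A suitable tolerance would in practice depend on the statistical scatter of the dissimilarities at the chosen number of shots.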
An important advantage of the proposed approach is its scalability. Owing to the simplicity of computing the inter-scale dissimilarities, this procedure can be conducted for a large number of qubits. By using a classical computer, one could potentially characterize states of quantum systems of several thousand qubits, which goes far beyond the abilities of available intermediate-scale quantum devices. For example, if one uses 128 Gb of RAM, the estimated sizes of quantum systems that can be characterized in this way lie in the range from 8192 to 1048576 qubits, if the number of bitstrings in the array is taken to be $2^{20}$ or $2^{13}$, respectively.
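The two limiting configurations quoted above correspond to the same total number of array elements, which the following arithmetic check makes explicit. The storage format is not stated in the text, so the 8 bytes per element used below is purely an assumption.

```python
def array_elements(n_qubits, n_shots):
    """Total length L = N * N_shots of the flattened bit-string array."""
    return n_qubits * n_shots


few_qubits = array_elements(8192, 2 ** 20)
many_qubits = array_elements(1048576, 2 ** 13)
assert few_qubits == many_qubits == 2 ** 33

BYTES_PER_ELEMENT = 8  # assumption: one 64-bit float per element
gib_per_scale = few_qubits * BYTES_PER_ELEMENT / 2 ** 30
print(f"{few_qubits} elements, {gib_per_scale:.0f} GiB per stored scale")
```

Under this assumption, the memory cost is set by the product $N \times N_{\mathrm{shots}}$ alone, so qubits can be traded against shots at a fixed budget.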
In this paper, we have analyzed two potential applications of the inter-scale dissimilarity signature: certification of quantum states and construction of phase diagrams. However, other research lines can be initiated, and we would like to briefly discuss them.
An important problem in quantum computing is to devise a quantum circuit that represents the desired target state. Usually, this is accomplished by optimizing the circuit architecture (topology, choice of gates), with the overlap between the circuit and the target wave function being the objective function. For a large number of qubits, computing the overlap at every iteration of the optimization could be quite costly. Instead, one can aim at achieving the desired dissimilarity signature $D_{\mathrm{target}}$ and minimize the norm $\lVert D_{\mathrm{target}} - D_{\mathrm{circuit}} \rVert$ which, as discussed before, does not require significant resources to be computed even for a large system.
Another possible application of this concept could be in the domain of quantum optics experiments in which the observer's eyes play the role of a photon detector 59,60 with a minimal detection threshold of a single photon 61. Such a fascinating sensitivity of human eyes to light has already become a basis for different scenarios of experiments 62 aimed at detecting entanglement. Such experiments require accumulation of statistics over 'seen' and 'not seen' events. Since human eyes are much slower in counting light pulses than real photon detectors, collecting large amounts of data in such a setting is challenging, and a method that allows one to harvest information from limited data could come in handy. Representing the two possible outcomes of a single measurement, 'seen' or 'not seen', as binary digits, one can construct an array that can be analyzed from the inter-scale dissimilarity point of view. As has been exemplified with Dicke and Schrödinger cat states, the latter can be used to estimate the entanglement entropy of the state.
Finally, it should be highlighted that by constructing the low-dimensional dissimilarity map for a number of quantum states one, in fact, performs automatic dimensional reduction and visualization of a high-dimensional dataset. This is a common task in machine learning which is often solved in an unsupervised manner by employing such methods as the self-organized Kohonen map, t-distributed stochastic neighbor embedding (t-SNE) 63,64, principal component analysis or uniform manifold approximation and projection algorithms 65 (see ref. 66 for a primer on how the latter can be used in the context of many-body quantum physics). These algorithms usually require some notion of distance between the original higher-dimensional data points and try to approximately preserve the relative distances when projecting points onto a lower-dimensional space (usually, two- or three-dimensional). By computing and visualizing dissimilarity signatures using two or three complementary measurement bases, one effectively solves the same problem for a dataset consisting of many-body quantum states. While it is possible to use the conventional dimensional reduction methods to classify and visualize quantum states by defining a fidelity-based distance between them 37, this would require storing and manipulating many-body states on a classical computer. Thus, using dissimilarity maps could be an easy-to-implement alternative that does not require many resources. Although it is not directly related to the distance between quantum states in the Hilbert space, it nevertheless consistently and neatly clusters quantum states belonging to different families without relying on any optimization scheme.

Calculating inter-scale dissimilarity of bit-string arrays
To assign a characteristic hash function to a quantum state we perform three steps (Fig. 1): (i) initialization of the quantum state on a real quantum device or simulator, (ii) a number of projective measurements in at least two different bases, and (iii) computing the inter-scale dissimilarities of the resulting bit-string arrays.
The initialization of a quantum state may be done by different means. For instance, one can use variational approaches 67-69 and adiabatic algorithms 70-72 to approximate the target state on a quantum device. When dealing with a small-scale quantum system, like the 16-qubit states studied in this paper, it is possible to initialize a state by taking the wave function coefficients obtained with exact diagonalization and employing the Least Significant Bit procedure 24 that features one-by-one disentanglement of qubits. Some particular quantum states can be directly generated with known quantum circuits, which is the case for the quantum chaos and the Schrödinger cat states. In this work, all the manipulations with quantum states were performed with the Qiskit package 18.
Once a quantum state is initialized on a device, we measure it in two or more bases. Here, we restricted ourselves to projective measurements in the $\sigma^z$ basis and the random basis, though using more bases can be beneficial for constructing unique hashes of many-body states. In other words, we sample $N_{\mathrm{shots}}$ basis vectors represented by bitstrings $\{x_i\}$ from the probability distribution $p(x_i) = |\psi(x_i)|^2$, where $N_{\mathrm{shots}}$ is a reasonably small number of measurements (16-8192 in the studied cases), and by doing this in two bases we should have access not only to the amplitudes, but also to the phases of the wave function. The measurement outputs in each basis are then arranged into a one-dimensional sequence of bitstrings, which can be regarded as a binary array of length $L = N \times N_{\mathrm{shots}}$. Random basis measurements are performed in the following way. Prior to every shot $i$ of measurement, a rotational gate $U_0^{(i)}$ parametrized by randomly generated angles $\theta_i$, $\phi_i$ and $\lambda_i$ is applied to each qubit (Fig. 1a). For the next shot, values $\theta_{i+1}$, $\phi_{i+1}$ and $\lambda_{i+1}$ are sampled, and a rotational gate $U_0^{(i+1)}$ is applied. The angles are generated in such a way that, once the procedure is repeated many times, the single-shot gates uniformly cover a segment of the Bloch sphere: $\theta \in [0, \pi/2]$, $\phi \in [0, \pi/2]$ and $\lambda \in [0, \pi/2]$. The reason why we choose one of the bases to be random in the sense described above is that it is expected to be the most unbiased one if we apply this protocol to diverse quantum states with completely different structures.
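For small systems, the sampling step can be emulated classically from a state vector. The sketch below uses only the Python standard library rather than Qiskit. It assumes that the rotational gate follows the common three-angle single-qubit convention $U(\theta, \phi, \lambda)$, that the same gate is applied to every qubit within a shot, and that the angles are drawn uniformly from $[0, \pi/2]$. All function names are hypothetical.

```python
import cmath
import math
import random


def u_gate(theta, phi, lam):
    """2x2 matrix of a generic single-qubit rotation U(theta, phi, lambda)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return ((c, -cmath.exp(1j * lam) * s),
            (cmath.exp(1j * phi) * s, cmath.exp(1j * (phi + lam)) * c))


def apply_to_qubit(psi, gate, q, n):
    """Apply a 2x2 gate to qubit q (qubit 0 is the leftmost bit) of an n-qubit state."""
    out = list(psi)
    mask = 1 << (n - 1 - q)
    for x in range(len(psi)):
        if not x & mask:  # x has bit q equal to 0, x | mask has it equal to 1
            a, b = psi[x], psi[x | mask]
            out[x] = gate[0][0] * a + gate[0][1] * b
            out[x | mask] = gate[1][0] * a + gate[1][1] * b
    return out


def sample_bitstrings(psi, n_shots, rng, random_basis=False):
    """Return n_shots bitstrings (lists of 0/1) drawn from |psi(x)|^2."""
    n = len(psi).bit_length() - 1
    shots = []
    for _ in range(n_shots):
        state = psi
        if random_basis:
            # Fresh angles for every shot; same gate on all qubits (assumption).
            gate = u_gate(*(rng.uniform(0.0, math.pi / 2) for _ in range(3)))
            for q in range(n):
                state = apply_to_qubit(state, gate, q, n)
        weights = [abs(amplitude) ** 2 for amplitude in state]
        x = rng.choices(range(len(psi)), weights=weights)[0]
        shots.append([(x >> (n - 1 - q)) & 1 for q in range(n)])
    return shots


rng = random.Random(0)
bell = [1 / math.sqrt(2), 0.0, 0.0, 1 / math.sqrt(2)]
z_array = [bit for shot in sample_bitstrings(bell, 8, rng) for bit in shot]
r_array = [bit for shot in sample_bitstrings(bell, 8, rng, True) for bit in shot]
```

Here `z_array` and `r_array` play the role of the flattened binary arrays of length $L = N \times N_{\mathrm{shots}}$ in the two bases.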
Having constructed the bit-string arrays, we analyze their structure using the concept of inter-scale dissimilarity. Recently 17, some of us have suggested a notion of structural complexity of classical patterns based on the idea of quantifying differences between distinct spatial scales of a pattern obtained with a multi-step renormalization (coarse-graining) protocol. Here, we formally apply this procedure to the bit-string arrays, viewing them as one-dimensional patterns.
Let us denote such an array as a vector $b^0$ of length $L$. At every step of coarse-graining $k$, a vector of the same length is constructed as
$$b^{k}_{i} = \frac{1}{\Lambda^{k}} \sum_{l=1}^{\Lambda^{k}} b^{k-1}_{\Lambda^{k}\left[(i-1)/\Lambda^{k}\right] + l}, \qquad (8)$$
where square brackets denote taking the integer part. This means that at each iteration the whole array is divided into blocks of size $\Lambda^k$, and the elements within a block are substituted with the same value resulting from averaging all elements of the block. Initially those elements are either 0 or 1, and for $k > 0$ they take real values (in fact, for the sake of a more convenient normalization, in our calculations we assumed that '0' bits have values equal to $-1$). The index $l$ enumerates elements belonging to the same block.
For simplicity, we usually assume that the bit-string length is an integer power of the filter size $\Lambda$: $\log_\Lambda N \in \mathbb{N}$. The dissimilarity between scales $k$ and $k + 1$ is then defined as
$$D_k = \left| O_{k+1,k} - \frac{1}{2}\left( O_{k,k} + O_{k+1,k+1} \right) \right|,$$
where $O_{m,n}$ is the overlap between vectors at scales $m$ and $n$:
$$O_{m,n} = \frac{1}{L} \sum_{i=1}^{L} b^{m}_{i}\, b^{n}_{i}.$$
There are two quantities of our principal interest: $D_k$, which contains scale-resolved information on the pattern structure of the generated bit-string array, and the overall dissimilarity, $D = \sum_k D_k$, where the sum goes over all the renormalization steps. $D$ and $\{D_k\}$ computed in several bases together comprise the hash function of the quantum state that can be used for its certification.
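The coarse-graining and dissimilarity computation can be sketched in a few lines. The sketch assumes the overlap is normalized by the array length, the '0' bits are mapped to $-1$ as mentioned in the text, and the total array length is an integer power of the filter size. Each scale is obtained by averaging the original array over blocks of size $\Lambda^k$, which gives the same result as averaging the previous scale because the blocks are nested. Function names are illustrative.

```python
def coarse_grain(values, block):
    """Replace each run of `block` consecutive elements by the mean of that run."""
    out = []
    for start in range(0, len(values), block):
        mean = sum(values[start:start + block]) / block
        out.extend([mean] * block)
    return out


def overlap(a, b):
    """Normalized overlap O = (1/L) * sum_i a_i * b_i (normalization assumed)."""
    return sum(x * y for x, y in zip(a, b)) / len(a)


def partial_dissimilarities(bits, lam=2):
    """Return [D_0, D_1, ...] for a flat list of 0/1 bits of length lam**K."""
    b0 = [2.0 * bit - 1.0 for bit in bits]  # bit 0 -> -1, bit 1 -> +1
    scales, block = [b0], lam
    while block <= len(b0):
        scales.append(coarse_grain(b0, block))
        block *= lam
    if block // lam != len(b0):
        raise ValueError("array length must be an integer power of lam")
    return [abs(overlap(scales[k + 1], scales[k])
                - 0.5 * (overlap(scales[k], scales[k])
                         + overlap(scales[k + 1], scales[k + 1])))
            for k in range(len(scales) - 1)]


d_k = partial_dissimilarities([0, 0, 1, 1, 0, 0, 1, 1])
print(d_k, sum(d_k))  # partial dissimilarities and the overall D
```

In this toy array the structure lives at the scale of two-bit blocks, so the first coarse-graining step changes nothing and the whole dissimilarity is concentrated in the second step.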

Convergence of dissimilarity with the number of measurements
The number of measurements required for the dissimilarity to converge depends on the state to be characterized. As mentioned in the main text, for the random Haar-typical states it is enough to make about $2^{13} = 8192$ measurements even if the Hilbert space dimension is really large (53 qubits). If the state is more structured, a larger number of measurements (relative to the Hilbert space dimension) might be needed.
For example, for the 16-qubit Dicke state with $D = 8$, the dissimilarity converges after about $2^{10}$ measurements (Fig. 11). In general, we expect that dissimilarity tends to converge with the size of the sampled set as a typical observable computed by Monte Carlo sampling from a sign-definite wave function (though it is not an observable itself), meaning that about Ñ ⋅ $10^4$ samples drawn from the probability distribution should be enough.

Dissimilarity of the random quantum state: analytical derivation
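The inter-scale dissimilarity of bit-string arrays resulting from projective measurements of random quantum states, Eq. (4), can be estimated analytically. First, let us note that $O_{k,k} = O_{k,k-1}$ if the averaging-based coarse-graining scheme (8) is adopted. Indeed, within the $n$-th window of size $\Lambda^k$:
$$\sum_{i=1}^{\Lambda^k} b^{k}_{(n-1)\cdot\Lambda^k+i}\, b^{k}_{(n-1)\cdot\Lambda^k+i} = \sum_{i=1}^{\Lambda^k} b^{k}_{(n-1)\cdot\Lambda^k+i}\, b^{k-1}_{(n-1)\cdot\Lambda^k+i},$$
where $b^{k}_{(n-1)\cdot\Lambda^k+i}$ are equal to each other for all $i$ within the window, and thus this multiplier can be taken out of the sum over $i$; the remaining sums of $b^k$ and $b^{k-1}$ over the window coincide because $b^k$ is the window average of $b^{k-1}$. Once summed up over all windows, the l.h.s. of this identity gives $O_{k,k}$, and the r.h.s. gives $O_{k,k-1}$.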
Thus, the expression for the partial dissimilarity $D_k$ can be rewritten as
$$D_k = \frac{1}{2}\left| O_{k+1,k+1} - O_{k,k} \right|.$$
For a random state, $O_{k,k}$ can be evaluated under the assumption that the binary elements $b^0_i$ of the bit-string array are sampled from some random distribution $p_0(x)$ (with $x = 0$ or 1) and are not correlated. In this case, the coarse-graining procedure can be viewed as follows. In step $k = 1$ (for $\Lambda = 2$), the renormalized probability distribution at every position in the array is defined over $x_1 = 0, 0.5, 1$ with $p_1(0) = p_0^2(0)$, $p_1(0.5) = 2 p_0(0) p_0(1)$, $p_1(1) = p_0^2(1)$. Repeating this for several steps, one can notice that the probability distribution $p_k(x_k)$ is defined over random variables which are obtained by averaging the original uncorrelated random variables $x$, and according to the central limit theorem $p_k \to \mathcal{N}(\mu, \sigma^2 \Lambda^{-k})$ as $k \to \infty$. Here $\mathcal{N}(\mu, \sigma^2 \Lambda^{-k})(x)$ is a normal distribution with $\mu$ and $\sigma^2$ being the mean and variance of the original distribution $p_0(x)$, respectively, and the factor $\Lambda^{-k}$ is due to the averaging scheme used.
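The scaling of the variance under block averaging can be checked numerically. The sketch below draws uncorrelated bits with $p_0(0) = p_0(1) = 1/2$ (so $\mu = 0.5$ and $\sigma^2 = 0.25$), uses $\Lambda = 2$, and compares the mean square of the block averages with $\mu^2 + \sigma^2 \Lambda^{-k}$, the second moment of a distribution with mean $\mu$ and variance $\sigma^2 \Lambda^{-k}$. The sample size and the seed are arbitrary choices.

```python
import random


def block_mean_second_moment(values, block):
    """Mean of the squared block averages for non-overlapping blocks."""
    means = [sum(values[i:i + block]) / block
             for i in range(0, len(values), block)]
    return sum(m * m for m in means) / len(means)


rng = random.Random(1)
x = [rng.choice((0, 1)) for _ in range(2 ** 16)]  # uncorrelated bits
mu, var, lam = 0.5, 0.25, 2

for k in range(5):
    measured = block_mean_second_moment(x, lam ** k)
    predicted = mu ** 2 + var * lam ** (-k)
    print(k, round(measured, 4), round(predicted, 4))
```

For uncorrelated data the agreement is expected already at small $k$, since the variance of a block average is exactly $\sigma^2 \Lambda^{-k}$ regardless of whether the distribution is close to normal.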
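Noticing that, on average, the product of a site value with itself is
$$\langle x_k^2 \rangle = \int x^2\, p_k(x)\, dx,$$
where the integral symbolically denotes a discrete finite sum at finite $k$, we can approximately rewrite $O_{k,k}$ as
$$O_{k,k} \simeq \int x^2\, \mathcal{N}(\mu, \sigma^2 \Lambda^{-k})(x)\, dx,$$
which leads us to
$$O_{k,k} \simeq \mu^2 + \sigma^2 \Lambda^{-k}.$$
In this way, we obtain for $k > 0$:
$$D_k \simeq \frac{\sigma^2}{2}\, \Lambda^{-k} \left( 1 - \Lambda^{-1} \right).$$
Although the central limit theorem formally holds for $k \to \infty$, it turns out that this estimate reproduces the numerically computed partial dissimilarities already starting with $k = 1$. For $k = 0$ it should be computed separately. Given $O_{0,0} \simeq \langle x^2 \rangle$, we obtain:
$$D_0 \simeq \frac{1}{2}\left| \mu^2 + \sigma^2 \Lambda^{-1} - \langle x^2 \rangle \right|.$$

On the continuity of dissimilarity upon deformations of the quantum state
Here, we would like to provide an idea of why the partial dissimilarities $\{D_k\}$ are continuous functions of the parent quantum state. As shown in the previous section, in order to calculate the dissimilarity, the only object needed is the overlap $O_{k,k}$:
$$O_{k,k} = \frac{\Lambda^k}{L} \sum_{i=1}^{L/\Lambda^k} \left( b^k_i \right)^2,$$
where $b^k_i$ is the average of the spin values in the $i$-th block. In turn, the overlap $O_{k,k}$ can be approximated as a certain expectation of the coarse-grained probability distribution $p_k(x)$: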

Fig. 2 Dissimilarity results for Schrödinger cat states. a Quantum circuit generating Schrödinger cat states. b, c Partial dissimilarities $D_k$ of 16-qubit Schrödinger cat states calculated in the $\sigma^z$ and the random bases, respectively. Here, $\Lambda = 2$. The peak at $k = 4$ is due to the fact that the coarse-graining window reaches the size of the system at this scale, $\Lambda^k = N$. d Visualization of bit-string arrays. In these images, individual bitstrings are horizontal lines of 16 bits that are concatenated to form a long string (stacked here vertically for the purpose of presentation). The left picture shows an example of an array sampled from a cat state with $\theta = \pi/2$ in the $\sigma^z$ basis, and the right one shows an array measured in the random basis. Here $k = 0$ represents the texture of the measured array per se, and $k > 0$ show its evolution upon coarse-graining.

Fig. 3 Dissimilarity results for Dicke states. Partial dissimilarities of Dicke states with different $D$ index calculated in the $\sigma^z$ (a) and the random (b) bases. The trivial state ($|0\rangle^{\otimes 16}$) profiles (dashed red lines) are given for comparison.

Fig. 4 Comparison of entanglement and dissimilarity. a Entanglement entropy $S$ (blue circles) and overall dissimilarity $D_z$ (white squares) of the Schrödinger cat states as functions of the angle $\theta$. b The same characteristics of the Dicke states as functions of the index $D$.

Fig. 6 Ising model results. a Dissimilarity of the Ising model ground state as a function of the transverse magnetic field in the $\sigma^z$ and the random bases; the inset shows the derivative of the dissimilarity in the $\sigma^z$ basis with respect to $h$. b Partial dissimilarities $D_k$ in the $\sigma^z$ basis at different coarse-graining steps $k = 1 \ldots 6$. The global change in the trend between $k = 3$ and $k = 4$ is related to the coarse-graining window reaching the size of the system, $\Lambda^k = N$, so that the averaging starts mixing different bitstrings.

Fig. 7 Exact diagonalization results for the Shastry-Sutherland model. Upper right inset: schematic representation of the Shastry-Sutherland model 16-spin supercell used in this work. Main plot and the inner inset: low-energy part of its spectrum as a function of the inter-dimer exchange interaction $J_2/J_1$. Arrows denote transitions between quantum states. The green line represents the ground state.

Fig. 8 Dissimilarity results for the Shastry-Sutherland model. Dissimilarity of the ground and the first excited states (with zero total spin) of the Shastry-Sutherland model as a function of the inter-dimer coupling in the $\sigma^z$ (a) and the random (b) bases. The inset in panel (a) shows how spins are grouped to obtain flattened bitstrings. c Comparison of the partial dissimilarity profiles obtained for the singlet ($J_2 = 0$) and the Néel ($J_2 = 1$) states in the $\sigma^z$ basis.

Fig. 9 Phase diagram of the XXZ model. a Schematic of the bond-alternating 16-spin XXZ model. The spins in the central part of the chain denoted with the dashed line are measured to calculate the dissimilarities. b Phase diagram constructed by using the dissimilarity calculated in the $\sigma^z$ and the random bases. The phase boundaries are defined as maximal values of the dissimilarity derivatives with respect to $J'$ for a given $\delta$. c Dissimilarities in the $\sigma^z$ (blue squares) and the random (red circles) bases calculated at the anisotropy value $\delta = 1.75$. The vertical lines denote the positions of the maximum of the dissimilarity derivatives with respect to $J'$.

Fig. 11 Dissimilarity convergence as a function of the number of measurements. (a) and (b) represent dissimilarities obtained for the $\sigma^z$ and random bases. $N_{\mathrm{shots}}$ is the number of measurements. The 16-spin Dicke state with $D = 8$ is considered. In each basis, two independent runs are performed.