A local and scalable lattice renormalization method for ballistic quantum computation

A recent proposal has shown that it is possible to perform linear-optics quantum computation using a ballistic generation of the lattice. Yet, due to the probabilistic generation of its cluster state, it is not possible to use the fault-tolerant Raussendorf lattice, which requires a lower failure rate during the entanglement-generation process. Previous work in this area showed proof-of-principle linear-optics quantum computation, while this paper presents a more practical approach that satisfies several key constraints. We develop a classical measurement scheme that purifies a large faulty lattice to a smaller lattice with entanglement faults below threshold. A single application of this method can reduce the entanglement error rate to $7\%$ for an input failure rate of $25\%$. Thus, we show that it is possible to achieve fault tolerance for ballistic methods.


I. INTRODUCTION
Several physical platforms are aiming at achieving quantum computing [1]. For example, a qubit can be implemented using superconductors [2], silicon [3,4], trapped-ion systems [5,6], or linear optics [7,8]. Major advances in the fidelity of these qubits have made the application of error-correction codes, such as the surface code [9], feasible. The surface code is of particular interest due to its high threshold [9] and 2D nearest-neighbor layout. While the surface code is suitable for qubit implementations which are relatively easy to control, linear-optics quantum computation [10] is based on a slightly different principle, where photons are entangled in a cluster state which is then consumed during the computation. This quantum one-way computer was proposed by Raussendorf et al. [11,12]. A high-level implementation of such a quantum computer can be divided into three steps [13,14]:
1. Photon sources: deliver GHZ triplets.
2. Entangling layer: generates the cluster state.
3. Measurements: measurements in different bases allow for universal computation.
While cluster states have also been studied for solid-state qubits [15][16][17], this scheme is better suited for photonics because it prioritizes measurements over the sequential application of quantum gates, and thus utilizes the ability to generate photons continuously. The original proposal [11,12] was not protected against errors, but a similar approach can introduce fault tolerance [18]. This approach uses the Raussendorf lattice as an underlying resource which protects the logical state against both noise and photon loss [19].
* daniel.herr@riken.jp
There are several approaches for creating a cluster state for quantum one-way computation [19]. A drawback of linear optics is that the process of creating entanglement is nondeterministic. Thus, there remains a non-zero probability that each entanglement operation fails and the resulting lattice misses edges. Some approaches, such as [20], try to remedy the probabilistic nature by adding redundancy to the entangling procedures. However, these approaches require many switches which rely on the outcome of previous entangling operations and add more noise to the system. Another approach is to simply use these non-deterministic gates and generate a faulty lattice. This ballistic approach [8] to linear-optics quantum computation recently gained attention due to improved theoretical entangling operations [21,22] which fail with 25% probability. It has been pointed out that this failure rate is still below the percolation threshold (37.3%) calculated in [23]. Therefore, information can be transported from one end to the other given a large enough faulty lattice [24].
In order to build a large-scale quantum computer, this ballistic approach to linear-optics quantum computing should generate the Raussendorf lattice, which then allows for fault-tolerant computation. Classical control software must now be developed in order to cope with 25% of faulty entanglement operations, while still retaining these error-correction capabilities. In this paper, we provide an example of such an algorithm which is based on Ref. [24] and acts as a preprocessing step. One should note that the approach in [24] has several limitations: (i) it works only in 1D; (ii) it is global; (iii) it is not fault-tolerant. Our approach has none of these drawbacks when it is combined with the usual Raussendorf lattice error-correction schemes [25][26][27]. It should be noted that the preprocessing is not inherently fault-tolerant, which results in a trade-off between the rate of missing bonds and an accumulation of errors due to imperfect measurements. The accumulation of these errors, however, only shifts the threshold of the Raussendorf lattice and can be remedied by higher fidelities of the experimental setup. This is the first purification procedure for a 3D fault-tolerant lattice, with previous work in 1D being done in [24]. Thus, we expect future improvements to the algorithm using better heuristics that will relax the requirements for the experimental setup.

II. BACKGROUND
In the following we will review relevant concepts. We will give the definition of graph states and explain how they can be modified. Then we will review the Raussendorf lattice, its creation, and its error-correction capabilities with a focus on faulty edges.

A. Graph States
Graph states [28] are a generalization of cluster states [29] and can be described using an undirected graph $G = (V, E)$ with vertices $V$ and edges $E$. Each vertex corresponds to a physical qubit initialized in the $|+\rangle$ state. On each edge, a controlled-phase gate is applied. This results in the final state
$$|G\rangle = \prod_{(a,b) \in E} CZ_{ab} \, |+\rangle^{\otimes |V|}.$$
Using measurements, the graph can be modified to another graph. The modification rules have been discussed in Ref. [28]. Our proposal relies on only two particularly easy measurement operations:
1. Z-measurement on qubit a: Remove {a} from the graph and break all connections it was involved in.
2. Y-measurement on qubit a: Remove {a} and add connections between its neighbors (more generally, edges among the neighbors of a are toggled). This method can be used to generate long-distance edges.
An example of these two rules on a square lattice is shown in Figure 1.
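The two rules can be illustrated with a minimal sketch on an adjacency-set representation of the graph. This is not the authors' implementation; the function names `measure_z` and `measure_y` are hypothetical, by-product Pauli operators are ignored, and the Y rule is written in its general form (toggle all edges among the neighbors, then delete the vertex), which reduces to adding an edge when the two neighbors on a path are not yet connected.

```python
from itertools import combinations

def measure_z(graph, a):
    """Z-measurement: delete vertex a together with all of its edges."""
    for b in graph.pop(a):
        graph[b].discard(a)

def measure_y(graph, a):
    """Y-measurement: toggle every edge among the neighbors of a, then delete a."""
    for b, c in combinations(sorted(graph[a]), 2):
        if c in graph[b]:
            graph[b].discard(c)
            graph[c].discard(b)
        else:
            graph[b].add(c)
            graph[c].add(b)
    measure_z(graph, a)

# Linear chain 0 - 1 - 2 - 3 stored as adjacency sets.
chain = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
measure_y(chain, 1)  # qubit 1 disappears, qubits 0 and 2 become connected
measure_z(chain, 3)  # qubit 3 is cut away from the graph
print(chain)
```

After both measurements only the edge between qubits 0 and 2 remains, which is the basic mechanism for bridging nodes along a path.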

B. Creation of the lattice
A special graph state with error-correction capabilities is given by the Raussendorf lattice [18]. It is a 3D lattice whose unit cell is shown in Figure 2.
To create the Raussendorf lattice in a ballistic way, GHZ states are needed as a resource. Three of these GHZ states can be entangled to a micro-cluster, using two probabilistic fusion gates [30]. There are two ways to apply these fusion gates, with different additional resources. The fusion gate given in Ref. [21] requires an additional pair of maximally entangled photons, whereas Ref. [22] requires four single photons. The creation of the micro-clusters follows [30] and all possible outcomes of the micro-cluster generation are shown in Figure 3.
The central node of each micro-cluster will correspond to a node in the final lattice, while its surrounding nodes are consumed in additional fusion operations to connect clusters with each other. It can be seen in Figure 3 that a failure during the generation of these micro-clusters results in non-local entanglement. However, such non-local edges become exponentially less likely with increasing distance.
After the creation of the micro-clusters, each of these needs to be entangled to its neighbors on the large lattice. This is where our proposal deviates from [30], since the underlying lattice we try to implement is the Raussendorf lattice and not the diamond lattice from the original proposal.
In Figure 4 the generation of this lattice is shown. Each micro-cluster will correspond to a single node after all fusion gates have been performed. The fusion of these micro-clusters happens with a probability of 75%. Thus, 25% of the time the creation of bonds in this lattice fails.

Figure 3. This figure shows the creation of micro-clusters as described in [30]. The areas shaded in grey indicate which two qubits are used for the fusion gates. Depending on the measurement outcome of these fusion gates, the structure will take one of the shapes on the right.

C. Error correction
Error detection and correction can be done using parity checks on particular nodes on the Raussendorf lattice. As an example, the total parity in the X-basis of the qubits colored in red (Figure 2) is conserved unless there has been an error. A self-similar lattice which is shifted by half a unit cell uses the photons shown in white to perform Z-parity checks.
These parity checks together enable the error-correction capabilities of the Raussendorf lattice. Furthermore, with the method of Ref. [26] the Raussendorf lattice is also protected against photon losses. The main idea in this approach is to use the linearity of the parity checks to form super cells and perform parity checks on these. This resulted in a trade-off between the error rate due to faulty measurements and the rate of photon loss. The best photon loss rates that could still be corrected were around 25% [26].
A lattice with faulty edges can be translated to a lattice with missing nodes by deliberately losing one of the photons at the end of a faulty edge. This is done by performing a measurement in the Z-basis on one of these photons. A recent paper [27] described this as an adaptive correction scheme where the measurement basis needs to be changed depending on the error. This adaptive scheme can tolerate a loss rate of 14.5% of all edges. Another approach is to keep measuring as usual and then, in the classical tracking software, treat both qubits that are involved in the faulty connection as lost photons. With this approach, the loss rate that can still be corrected is around 6.5%. Unfortunately, neither approach can correct for error rates of 25% and, thus, preprocessing in some form has to be performed.

III. GRAPH PURIFICATION: GENERAL IDEA
The general idea of our graph purification proposal is to develop a measurement scheme that translates a large Raussendorf lattice with many faults into a smaller Raussendorf lattice with fewer faults. Our procedure is based on Ref. [24], which investigated how path-finding procedures can help for quantum computation on a faulty lattice. It is not a quantum error-correcting code, so errors will accumulate during this step. Nevertheless, after this preprocessing, the original lattice has been translated to a lattice with fewer faults such that a general error-correction procedure can be used.
The main requirements for such an algorithm are:
1. The algorithm should be local, i.e. the algorithm's corrections should only rely on faults in the vicinity of the lattice. This is important since the lattice is generated continuously and only a part of it is physically available at any time.
2. The algorithm should give the corrections fast. This is important, because photons are fast and delays in computation translate to large sizes of the quantum computer with long optical fibers.
3. The algorithm should require as little overhead as possible in terms of photons.
4. Scalability: adding more photons should be possible (e.g. the algorithm should be parallelizable).

Figure 5. Here, all possible orientations of a single structure are shown. The node in the center will be used as the node for the purified lattice.
We will now present our scheme to purify the faulty lattice. It is based on the idea that while a 25% error rate is very high, it is still below the percolation threshold of the Raussendorf lattice. For a larger lattice, the probability to find paths from one node to another increases. Nodes from the large and faulty Raussendorf lattice are chosen and will make up the purified Raussendorf lattice. These nodes are then connected by finding paths through the faulty lattice. All photons on such paths have to be measured in the Y-basis and will therefore create edges between the chosen nodes. All other qubits are measured in the Z-basis and are thus removed from the lattice.
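The resulting assignment of measurement bases can be sketched as follows. This is an illustrative sketch only: the function name `measurement_plan`, the labels "Y", "Z" and "keep", and the integer qubit identifiers are assumptions made for this example and are not taken from the published code.

```python
def measurement_plan(all_qubits, kept_nodes, paths):
    """Assign a basis to every qubit: Y on paths, Z elsewhere, chosen nodes kept."""
    plan = {q: "Z" for q in all_qubits}   # default: remove the qubit
    for path in paths:
        for q in path:
            plan[q] = "Y"                 # path qubits bridge their neighbors
    for q in kept_nodes:
        plan[q] = "keep"                  # nodes of the purified lattice
    return plan

# Six qubits; nodes 0 and 3 are kept and joined by the path 0-1-2-3.
plan = measurement_plan(range(6), kept_nodes={0, 3}, paths=[[0, 1, 2, 3]])
print(plan)
```

In this toy case qubits 1 and 2 are measured in the Y-basis, qubits 4 and 5 in the Z-basis, and qubits 0 and 3 survive as neighbors in the purified lattice.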

IV. IMPLEMENTATION
The code implementing this scheme is open source and hosted on GitHub [31].
The implementation divides the original Raussendorf lattice into boxes. Each box has the same number of qubits along its edges. This size will be referred to as box size throughout the paper. In each box, one of the structures from Figure 5 is chosen. Each structure contains four handles which are the start or end points for the path-finding algorithm. To improve the performance of the algorithm, we added a heuristic method to select the best structure position: Of all possible structure positions the one with the highest number of neighbors from its handles is chosen. It should be noted that due to non-local connections, the neighbors of the structure are not necessarily nearest neighbors physically, but only neighbors due to the underlying graph.
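One possible reading of this selection heuristic is sketched below. It assumes that the score of a candidate position is the total number of graph neighbors of its handles; whether the published code counts neighbors in exactly this way is not stated here. The name `best_structure` and the data layout are hypothetical.

```python
def best_structure(candidates, neighbors):
    """Pick the candidate position whose handles have the most graph neighbors.

    candidates: dict mapping a position label to the list of its handle nodes.
    neighbors:  adjacency dict of the (faulty) graph inside the box.
    """
    def score(position):
        return sum(len(neighbors.get(h, ())) for h in candidates[position])
    return max(candidates, key=score)

# Toy data: two candidate positions with two handles each.
neighbors = {"a": ["x", "y"], "b": ["x"], "c": ["x", "y", "z"], "d": ["x"]}
candidates = {"left": ["a", "b"], "right": ["c", "d"]}
print(best_structure(candidates, neighbors))
```

Here the handles of "left" have three neighbors in total and those of "right" have four, so "right" is selected.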
After all the structures have been found, a path-finding algorithm needs to connect neighboring structures with each other. Our implementation uses the A* algorithm [32] with the Manhattan norm as its heuristic function. This heuristic should give a decent estimate of the remaining distance, but it is not guaranteed to be a lower bound on the actual distance because of non-local entanglement due to the fusion gates. Therefore, A* is not guaranteed to find the shortest path, but our algorithm does not require shortest paths.
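A minimal sketch of A* with a Manhattan heuristic on a faulty lattice is given below. It is a simplification under stated assumptions: a plain cubic grid stands in for the Raussendorf lattice, there are no non-local edges (so the heuristic is admissible here, unlike in the general case discussed above), and the helper names `grid_graph`, `manhattan` and `a_star` are hypothetical rather than taken from the C++ code.

```python
import heapq

def manhattan(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))

def grid_graph(size, missing):
    """Cubic grid of side `size`; `missing` holds failed bonds as frozensets."""
    nodes = [(x, y, z) for x in range(size) for y in range(size) for z in range(size)]
    neighbors = {n: [] for n in nodes}
    for n in nodes:
        for axis in range(3):
            m = n[:axis] + (n[axis] + 1,) + n[axis + 1:]
            if m in neighbors and frozenset((n, m)) not in missing:
                neighbors[n].append(m)
                neighbors[m].append(n)
    return neighbors

def a_star(neighbors, start, goal):
    """Return a list of nodes from start to goal, or None if no path exists."""
    frontier = [(manhattan(start, goal), 0, start)]
    came_from = {start: None}
    cost = {start: 0}
    while frontier:
        _, g, node = heapq.heappop(frontier)
        if g > cost[node]:
            continue  # stale queue entry, a cheaper route was found already
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = came_from[node]
            return path[::-1]
        for nxt in neighbors.get(node, ()):
            new_cost = g + 1
            if new_cost < cost.get(nxt, float("inf")):
                cost[nxt] = new_cost
                came_from[nxt] = node
                heapq.heappush(frontier, (new_cost + manhattan(nxt, goal), new_cost, nxt))
    return None

# The direct bond out of the start node failed, so the path has to detour.
missing = {frozenset(((0, 0, 0), (1, 0, 0)))}
lattice = grid_graph(3, missing)
path = a_star(lattice, (0, 0, 0), (2, 0, 0))
print(path)
```

With the failed bond the path contains five nodes instead of three, a small-scale version of the effect that makes average path lengths exceed the box size.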
The code is written in C++ and was logically divided into three classes of which one implements the lattice of a single box, another class combines all boxes to the larger lattice, and the last class finds paths between different structures.
The lattice implementation for each box is given by the class Graph. It stores the graph in a std::deque indexed by the unique identifier of each node; each entry is a std::vector of all neighbors of that node. These neighbors are stored as a std::pair where the first value is the id of the box and the second value gives the id of the node inside that box. Further important functions are Graph::generate_connections, which randomly generates the lattice using the rules for the fusion operations, and Graph::find_structure, which looks for a suitable position for the structure.
The large lattice class, Parallel, contains a std::vector of the class Graph. This vector contains all the information related to the lattice. The class handles all high-level operations, such as output of the purified lattice and calculations for the statistics. It further determines between which structures a path needs to be found. While our implementation is not yet parallel, the parallelization should be straightforward to implement in this class. Finally, the class Astar implements the path-finding algorithm A*.

V. RESULTS
In order to see how well our code performs, we ran this algorithm on lattices created with different success probabilities for the fusion gates, using different box sizes, and compared the rate of faults after our algorithm ran. The behavior of the output error rate with increasing size is plotted in Figure 6. One can see that for an initial failure rate of 25% it is possible to reach an output error rate of about 7% for the purified lattice. This is below the threshold rate of 14.5% [27] of the Raussendorf lattice. Thus, it should be possible to use this code as a preprocessor for fault-tolerant ballistic quantum computation. In Figure 7 the relation between input error rate and output error rate is shown. Every data point below the black curve shows an improvement over the input error rate. Thus, it makes sense to use this algorithm for input error rates below 32%. Figure 8 shows a histogram of the length distribution of the paths. The average path length is larger than the box size because the most direct path is not always available due to missing edges on the graph. Paths shorter than the box size are also possible due to non-local interactions and differences in structure positions. The average path length for a box size of 20 is given by 28.42 ± 0.05. We will use these values in the following analysis to estimate the effects of errors.
We performed a simple timing analysis by running the algorithm on a single core of an i7-4558U (2.8 GHz) CPU, to give a rough estimate of the speed of the algorithm. The results are plotted in Figure 9. For a box size of 20 and a 5 × 5 × 3 lattice of boxes, the algorithm needs on average 1.34 ± 0.05 s. However, it should be noted that not much effort was put into optimization and better performance can be expected from optimized implementations. The scaling of this algorithm is polynomial both in box size and number of boxes.

VI. DRAWBACKS OF THIS METHOD
The purification process is not inherently fault-tolerant so errors can accumulate. In the following, we want to discuss the effects on the error rate that our approach has.
To analyse these sources of error it makes sense to discuss the measurement procedure. A qubit state is encoded as a spatial mode, which can be measured using a pair of photon detectors. There are two ways in which a Hadamard operation can be implemented [10] to enable a change in measurement basis. Because our proposal requires changing the measurement basis depending on the lattice that has been created, the ability to add a Hadamard operator in reasonably short time has to be guaranteed. A simple Hadamard operation can be implemented by bringing together the two wave-guides for a length of π/2. In order to add a choice of measurement basis, one can use switches as shown in the first approach of Figure 10. The other approach divides the Hadamard operation into two by bringing together both wave-guides for a length of π/4 both times and adds a gate that creates a phase difference between the two wave-guides. If this phase difference is zero, the Hadamard operation is performed, but if the phase difference is π one will obtain the identity operation.
There are several sources of errors that can occur in such setups. One source of errors comes from imperfections due to switches, which will lead to photon loss. Another source of errors is imperfect rotations and Hadamard operations. These will lead to a shifted measurement basis and a small rotation in the state, which is teleported to the next qubit. The last source of errors is at the detectors. They could show false positives (the detector detects a non-existent photon) as well as false negatives (a photon is not registered at the detector).
For linear-optics applications the implementation of Hadamard operations is precise [33,34] and using the second approach of Figure 10, switches are not needed during measurement. Thus, in our analysis we will neglect both of these types of errors and concentrate on errors at the detectors.
All first-order errors at the detectors result in a nonsensical measurement: Both photon detectors are triggered at the same time or neither of the photon detectors is triggered. If such a case happens, it is clear that an error occurred but it is impossible to know the nature of the error. A second-order error will result in the opposite measurement outcome. Obtaining the wrong measurement outcome will result in a Pauli error because, in measurement-based quantum computation, a by-product Pauli operation has to be applied depending on the measurement outcome.
One possible correction scheme for first-order errors is to choose a measurement outcome randomly. With 50% probability this choice is incorrect and, due to the rules of measurement-based quantum computing, a wrong by-product Pauli operator is applied. In the end, additional Pauli errors appear on the purified lattice and the Raussendorf lattice has to locate and identify them.
The total error rate for each path can be calculated using
$$P_{\mathrm{err}} = 1 - f^{2L}.$$
At box size 20, the mean length of a path is $L = 29$ and, given a detector fidelity $f = 0.9999$, the resulting error rate is $P_{\mathrm{err}} = 0.57\%$ for each bond. The factor 2 in the exponent comes from the fact that each measurement involves 2 photon detectors.
To obtain the error rate per node we assume that if a measurement error occurs we attribute it to the node in the same box. For a single node there are on average $4 \cdot L/2$ qubits, since each of the 4 paths contributes a length of $L/2$ inside a single box. Using this in the exponent, the resulting error rate on each node is $P_{\mathrm{NodeErr}} = 1.15\%$. However, due to randomly applying one of two correctional gates for first-order errors, this error rate can be halved. The effective error rate per node is $P_{\mathrm{NodeErr}} \approx 0.58\%$.
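The arithmetic can be checked with a short script. It assumes that the per-path error rate has the form $1 - f^{2L}$, i.e. every one of the $2L$ detector events along a path must succeed with fidelity $f$, and that the per-node rate uses $4 \cdot L/2$ qubits in the same way. This form is inferred from the description of the exponent in the text, and the function names are hypothetical.

```python
def path_error(f, length):
    """Probability of at least one detector error along a path of `length` qubits."""
    return 1.0 - f ** (2 * length)      # 2 detectors per measured qubit

def node_error(f, length):
    """Same estimate for the 4 half-paths (4 * L/2 qubits) attributed to one node."""
    qubits = 4 * length / 2
    return 1.0 - f ** (2 * qubits)

f = 0.9999
p_bond = path_error(f, 28.42)   # measured mean path length at box size 20
p_node = node_error(f, 29)      # rounded path length used in the text
p_node_eff = p_node / 2         # random guess is right half of the time

print(f"bond: {100 * p_bond:.2f}%  node: {100 * p_node:.2f}%  effective: {100 * p_node_eff:.2f}%")
```

Under this assumption the quoted bond value of 0.57% is reproduced with the measured mean length 28.42 (the rounded $L = 29$ gives roughly 0.58%), and the node values 1.15% and 0.58% follow from $L = 29$.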
At box size 20, the rate of failed connections is 10%. To be below threshold the remaining measurement errors need to be below 0.6%, which can be achieved with a fidelity of about f = 0.9999. This is a very strong requirement for the measurement setup but with improvements in the preprocessing algorithm it can be relaxed.

VII. POSSIBLE IMPROVEMENTS
The algorithm seems to depend heavily on the type of structures and their position. We already used a heuristic that maximizes the possibilities for the first step of the path-finding algorithm but more advanced heuristics might improve the error rate of the purified lattice even further. Furthermore, a clustering algorithm should help in choosing good structure positions at the cost of an increased runtime. The effects of this should be included in future analysis. Changes to the distance heuristic for the A*-search might also affect the performance of this proposal, but it was not investigated here.
This method is easily parallelizable. Each processor could have its own set of boxes. Only information about the directly neighboring boxes needs to be exchanged with other processors. In Figure 11 the black box needs information about the boxes colored in gray only. Every process needs to find a structure position in each of its boxes. Each process needs to send the positions of its qubit structures which lie on the boundary surface to the processes to the left and below (opposite direction). The box in the back will be treated by the same process, so no communication is required. After every box has received the information of its two neighbors, it can continue to find three paths in the right, up, and back directions. The overhead of communication scales with the surface and not the volume, and each process only needs to know a small part of the whole lattice, such that memory problems can be avoided.

Figure 11. This shows how to parallelize the algorithm for a big lattice. The process responsible for the calculation of the black box only requires information about the three nearest-neighbour grey boxes. Only this information needs to be exchanged.

VIII. WORKFLOW
In the introduction we mentioned a high-level design of a linear-optics quantum computer. Here, we want to refine it by including our purification step. To this end we show in Figure 12 a possible quantum and classical flow of information and the actions that are taken due to that information. On the left of the figure, three-qubit GHZ states are created and used by fusion gates. It should be noted that additional GHZ states and photons are needed in that step, but for simplicity they are not shown here. Using the measurement results of the fusion gates, the classical path-finding algorithm can infer the graph state that was generated and find paths connecting nearest-neighbor structures. All photons are measured out except for the center nodes of each structure. The routing of these center nodes requires a switch each and their measurement can be adjusted due to measurement errors on the path. Finally, the last measurements are the actual measurements needed for the Raussendorf lattice, where syndrome extraction and the actual fault-tolerant quantum computation happen.

IX. CONCLUSION
We have presented a way to purify the 3D lattice obtained from the ballistic procedure proposed in [30] using ideas from [24]. This purification process can suppress entanglement errors due to probabilistic fusion gates and bring the error rate from 25% down below the Raussendorf lattice threshold. This procedure, however, comes at the cost that errors along generated paths can accumulate, and it requires higher precision in measurement operations. Nevertheless, this approach shows that fault-tolerant quantum computation using ballistic lattice generation is possible. Looking back at the requirements we posed, we can see that our proposal fulfils several of them:
1. The algorithm is local. Due to the exponential decay in large-distance edges, all connected nodes are located in the same box or neighbouring boxes.
2. The algorithm scales polynomially in lattice size, but our implementation should still be improved in terms of absolute speed.
3. The overhead in terms of qubits could be better: each box consists of about $20^3$ nodes, which are all consumed to generate one node in the purified lattice. Errors also accumulate with larger box sizes.
4. The algorithm is easily scalable, with only little communication required between different processes.
While our code works, many improvements can be made to this preprocessing step, such as using different measurement schemes to create entanglement with X-basis measurements. Thus, it is likely that the output error rate and therefore resource requirements are further reduced. Then, fair comparisons between different ways to generate the lattice such as [20] and the ballistic approach with preprocessing should be made in terms of overhead for the Raussendorf lattice.