A hierarchical approach for building distributed quantum systems

Davarzani, Zohreh; Zomorodi, Mariam; Houshmand, Mahboobeh

doi:10.1038/s41598-022-18989-w

Published: 14 September 2022

Zohreh Davarzani^1,2,
Mariam Zomorodi^1,3 &
Mahboobeh Houshmand⁴

Scientific Reports volume 12, Article number: 15421 (2022)

Abstract

In this paper, a multi-layer hierarchical architecture is proposed for distributing quantum computation. In distributed quantum computing (DQC), different units or subsystems communicate by teleportation in order to transfer quantum information. Quantum teleportation requires classical and quantum resources; hence, it is essential to minimize the number of communications among these subsystems. To this end, a two-level hierarchical optimization method is proposed to distribute the qubits among different parts. In Level I, an integer linear programming model is presented to distribute a monolithic quantum system into K balanced partitions with the minimum number of non-local gates. When a qubit is teleported to a destination part, it can be used by other gates there without first being teleported back to its source part. In Level II, a data structure is proposed for the quantum circuit and a recursive function is applied to minimize the number of teleportations. Experimental results show that the proposed approach outperforms the previous ones.

Introduction

In the last decade, rapid progress in the science and engineering of quantum devices has advanced quantum computation from single isolated quantum devices toward multi-qubit processors¹. As a result, quantum computation has grown rapidly and shown high performance in many areas. The standard approach in quantum computing is to design quantum computations as monolithic circuits.

Quantum computing has several advantages over classical computing. One of them is that quantum computers can be exponentially faster than classical ones for many computational problems². Yet, due to implementation complexity, designing a large-scale quantum computer poses many challenges. The computing power of a quantum system increases exponentially with the number of embedded qubits³, but a problem that requires more qubits is also more challenging for a quantum computer to solve.

Though advantageous, quantum computers have many shortcomings. One of them is that the information stored in qubits may encounter errors before fault-tolerant approaches are applied. This is because qubits interact with the outside world, which may lead to decoherence^4,5, and as the number of qubits increases, the quantum information becomes more fragile and more susceptible to errors⁶. Errors can also arise from applying an operation to a quantum state⁷. In principle, such errors could be avoided by isolating qubits from their surroundings, but this is not practical because qubits must communicate and be read and written. Several solutions address these challenges. Physical implementations such as systems of trapped atomic ions can be accurately controlled and manipulated, and a large variety of interactions and measurements of relevant observables can be engineered with high precision^8,9. The superconducting qubit modality has also been used to demonstrate prototype quantum algorithms with noisy, non-error-corrected qubits. Currently, this is one of the approaches for implementing medium- and large-scale quantum devices and quantum coherent interactions with low noise and high controllability^10,11. Another technology used to design large-scale quantum systems is photonic quantum computing. Quantum entanglement, teleportation, and quantum key distribution are derived from this technology because photons provide a quantum system with low noise and high performance¹².

One way to resolve these challenges is to divide a quantum system into several limited-capacity quantum systems, with the qubits distributed among them, which is referred to as a Distributed Quantum System^13,14,15.

Distributed quantum circuit

A distributed quantum system consists of several independent, limited-capacity quantum units that appear to users as a single quantum system. The units might differ from each other in terms of hardware and software. The hardware limitations of each quantum unit are described by a connected graph called a coupling map. The purpose of these limitations is to protect qubits from decoherence and noise⁷.

The distribution of qubits among different subsystems leads to some non-local gates, and to execute a non-local gate it is essential to bring all of its qubits into a single subsystem. According to the no-cloning theorem, independent copies of a qubit cannot be made in a quantum system. Instead, the teleportation protocol can be used to move qubits between subsystems¹⁶. This protocol requires an entangled pair of qubits shared between two nodes in order to teleport the state of a qubit from one node to the other. This operation is expensive and can lead to substantial latency due to the stochastic nature of the underlying processes^17,18,19. Therefore, minimizing communications among the quantum units of a distributed quantum system is essential for reducing the cost of the whole system. A proper distribution algorithm could decrease the communications between quantum units dramatically. With this in mind, this paper proposes an optimized distribution of quantum systems.

An abstraction of Distributed Quantum Computing is shown in Fig. 1. It is described through a set of (logical) layers, with the higher layers depending on the functionalities provided by the lower ones⁷. Starting from the top, there is the quantum algorithm in the form of a quantum circuit. This algorithm is completely independent and unaware of logical and physical hardware constraints. In the second layer, there is a distribution algorithm, which implements the circuit of the previous layer in a distributed way. This layer consists of two parts, called the load balancer and the optimizer, as follows:

The qubits must be distributed in a well-balanced manner among the limited-capacity quantum units. Therefore, a load balancing problem must be solved at this level.
Non-local operations require qubits to communicate with qubits on other units. Hence, a teleportation protocol is needed for units to communicate. Minimizing the number of teleportations among these units is required at this level.

At the next level, quantum units communicate with each other remotely via classical and quantum channels. Both local and non-local operations can be executed at this level. Local operations execute on qubits stored within the same quantum unit, and non-local operations execute on qubits stored on different quantum units. As mentioned above, a quantum teleportation protocol is necessary for units to communicate with each other. This protocol consists of several phases, such as EPR pair generation, local operations, measurement and classical communication¹⁵. Each teleportation uses two entangled qubits stored on different units, called an entanglement pair. Each entanglement pair is used to communicate a single qubit to another quantum device. Therefore, at the very bottom level, hardware for generating entanglement pairs is required for units to communicate with each other²⁰. Each quantum device may have its own hardware to create an entanglement pair, or a separate device may generate the pairs centrally²⁰.

In this work, a two-level hierarchical optimization model is proposed to design a large-scale distributed quantum circuit. A monolithic quantum circuit is distributed to K quantum units, and the objective is to minimize communication between the K partitions. In Level I, an integer linear programming approach is proposed to distribute the qubits to K parts in a well-balanced manner while minimizing the number of non-local gates, and hence the number of communications among these units. At this level, each non-local gate requires two teleportations because after a qubit is teleported to the destination, it is teleported back to its source. However, once one qubit of a non-local gate has been teleported from the source to the destination, it may be used at the destination by other non-local gates before being teleported back to its source. After the teleported qubit has been used in this way, it can be returned to its source. Applying this concept reduces the number of teleportations. To this end, a recursive approach is proposed in Level II. Through this hierarchical model, the required number of teleportations becomes smaller than the two teleportations per non-local gate required by Level I alone.

The remainder of this paper is organized as follows. “Related work” presents an overview of prior work. “The proposed algorithm” describes the proposed method in detail. Finally, “Experimental results” presents and discusses the experimental results.

Related work

Distributed Quantum Computing (DQC) has been studied for many years. Scaling small-sized quantum systems to large-scale ones has been the main goal of these studies. The first studies on DQC were reported in^21,22,23. That work proposed that several quantum systems physically located far from each other send the required information to a base station, and showed that the overall computation time improves in proportion to the number of such distributed quantum systems.

Moreover, DQC has been used in many applications. In²⁴, the authors considered two black boxes as two quantum devices that were prevented from communicating with others, and designed trusted quantum cryptography to share a random key securely, based on quantum physics. A practical application for quantum machine learning (QML) was presented in²⁵, in which distributed secure quantum machine learning allows a classical client to delegate a remote quantum machine learning task to a quantum server while preserving data privacy. In¹³, two main approaches, i.e. teledata and telegate, were discussed. In the telegate approach, gates are teleported so that they can be executed remotely without requiring the qubits to be nearby. In the teledata approach, qubits transfer their states to other systems without being moved physically.

Squash²⁶ proposed a gate partition method by using METIS²⁷ as the partitioning tool. Moghadam et al.²⁸ used the min-cut approach presented in²⁹ to divide the graph of the quantum circuit into smaller units. In³⁰, the authors used the modified version of the graph partitioning algorithm of³¹ to minimize interaction between qubits. The authors of³² presented an architecture for DQC. They partitioned the quantum circuit by the multilevel k-way hypergraph algorithm presented in³³.

Most recently, one strategy to scale up the number of qubits has been the quantum internet^34,35,36. The quantum internet is a network of quantum systems that are able to interconnect with each other remotely via quantum and classical links. Distributed quantum computing is used in this network. In fact, the quantum internet can be considered a virtual machine consisting of a number of qubits that scales with the number of quantum devices in the network. This concept may indicate the possibility of an exponential speed-up of quantum computing power^3,35. In³, the authors considered the challenges and open problems of quantum internet design. They highlighted the differences between quantum and classical networks and discussed the critical research challenges in designing quantum communication networks.

Yimsiriwattana et al.³⁷ first showed that, for any contiguous non-local CNOT gates that share a common control qubit, the control line needs to be distributed only once because it can be reused. This idea reduces the number of communications.

An automated method for distributing quantum circuits to K balanced partitions was investigated in²⁰. The authors reduced the problem to hypergraph partitioning. Their algorithm consisted of two steps, pre-processing and post-processing, for improving the circuit distribution. It handles any number of contiguous non-local CNOT gates that act on the same control qubit with target qubits in the same partition, and the authors noted that consecutive non-local CNOTs with this property can be executed with one teleportation.

Zomorodi et al. presented several works^38,39,40,41,42,43,44,45 on optimizing and partitioning quantum circuits. Davarzani et al.³⁸ presented a dynamic programming approach to distribute a quantum circuit to K parts to minimize the number of communications. Their approach consisted of two steps. In the first step, the quantum circuit was converted into a bipartite graph. In the next step, the bipartite graph was distributed to K parts by a dynamic programming approach. In that study, they tried to minimize the number of non-local CNOTs by converting the problem into a minimum K-cut problem.

In another study³⁹, an algorithm was proposed for DQC consisting of two separated, long-distance quantum systems. The authors examined different configurations for the execution of non-local gates and ran their proposed algorithm for each configuration to obtain the number of required teleportations. The minimum number of communications was then found among all the configurations. However, their proposed method had exponential complexity.

An approach based on a genetic algorithm was used in⁴⁰ to distribute a quantum circuit into two partitions. The main purpose of the algorithm was to determine which qubit of a non-local gate should be teleported to the other system and when the teleported qubit should be returned to its home partition. In another work⁴¹, we presented a two-phase algorithm based on NSGA-II that bi-partitions the qubits in the first phase and applies two heuristics to optimize the number of non-local gates in the second phase. The authors in^42,44 also discussed the issue of reducing communication cost in a distributed quantum circuit composed of up to three-qubit gates and presented a new heuristic method to solve it.

An automated windows-based method was proposed in⁴⁶. In that study, the gate teleportation and qubit teleportation concepts were combined to minimize communication cost efficiently.

The proposed algorithm

In this paper, we consider the problem of optimally distributing a given quantum circuit for evaluation over a set of subsystems, and propose a two-level optimizer that realizes a large-scale quantum circuit on these subsystems with the minimum number of required communications. The proposed method consists of two levels:

Level I: At this level, the number of subsystems and the quantum circuit are given as inputs, and the labelling $P:\{q_{i} |i\in \{1,...,N_{q}\}\}\rightarrow \{1,2,...,K\}$ of qubits to subsystems is produced as output. Here, we partition the given circuit across distributed quantum units to reach near-balanced partitions of qubits. For this purpose, an integer linear programming model is proposed to partition the quantum circuit into K parts. After the qubits are distributed, some gates become non-local, and each non-local gate requires two teleportations: one to move a qubit forward from the source unit to the destination unit and one to move it back. Therefore, the number of communications is twice the number of non-local two-qubit gates obtained by this partitioning model.
Level II: At this level, the partitioning obtained in Level I is taken as input and the minimum number of required teleportations is produced as output. As mentioned above, in the previous level, two teleportations are needed for each non-local gate, one forward and one backward. When one of the qubits of a non-local gate is teleported to the destination partition, more gates can be executed there without teleporting it back immediately, which reduces the number of teleportations. This idea is used at this level to optimize the number of teleportations. The details of these levels are as follows. We use the notation of Table 1 in this paper.
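
The Level I cost model described above can be illustrated with a minimal sketch. The circuit, the labelling and the helper name `nonlocal_gates` are hypothetical and chosen only for illustration; the sketch assumes each two-qubit gate is given as a pair of qubit indices.

```python
def nonlocal_gates(gates, P):
    """Indices of two-qubit gates whose qubits carry different partition labels."""
    return [j for j, (a, b) in enumerate(gates) if P[a] != P[b]]

# Hypothetical 4-qubit circuit: each gate is a pair of qubit indices.
gates = [(1, 2), (2, 3), (1, 4), (3, 4), (1, 3)]
# Hypothetical labelling P: qubit -> subsystem (K = 2).
P = {1: 1, 2: 1, 3: 2, 4: 2}

crossing = nonlocal_gates(gates, P)
# Level I charges one forward and one backward teleportation per non-local gate.
level1_teleportations = 2 * len(crossing)
```

Level II starts from this count and tries to lower it by letting a teleported qubit serve several gates before it returns.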

Table 1 The notation of the proposed algorithm.

Level I: the partitioning of quantum circuit

In this section, a K-way partitioning method is proposed to distribute a quantum circuit to K balanced partitions. This problem is NP-hard and is defined as follows:

Definition 1

Consider the undirected and weighted Graph $G = (V, E)$, where V denotes the set of n vertices and E the set of edges. The balanced graph partitioning problem takes the Graph G(V, E), Parameter K as the number of partitions and Parameter $\omega$ known as the load balance tolerance as inputs. We wish to partition the graph into K balanced disjoint parts or sub-graphs $(V_{1}, V_{2},...,V_{K})$ so that $V={V_{1} \cup V_{2}\cup ... \cup V_{K }}$. Two criteria must be satisfied as follows:

Minimum number of cuts: the number of cuts among all the different sub-graphs is minimized as Eq. (1):
$$\begin{aligned} min \sum \limits _{k=1}^{K} \sum \limits _{l=k+1}^K \sum \limits _{v_{1}\in V_{k},v_{2}\in V_{l}} C({v_{1},v_{2}}) \end{aligned}$$
(1)
where $C({v_{1},v_{2}})$ is the weight of edge $(v_{1},v_{2})$.
Load-balance: for all $k=1,2,...,K$:
$$\begin{aligned} |V_k|\le \frac{(1+\omega )|V|}{K} \end{aligned}$$
(2)
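
As a rough illustration of Definition 1, the following sketch evaluates the cut weight of Eq. (1) and the balance condition of Eq. (2) for a given partition. The graph, the partition and the function names are assumptions made for this example only.

```python
from collections import Counter

def cut_weight(edges, part_of):
    """Eq. (1): total weight of edges whose endpoints lie in different parts."""
    return sum(w for (u, v), w in edges.items() if part_of[u] != part_of[v])

def is_balanced(part_of, K, omega):
    """Eq. (2): every part holds at most (1 + omega) * |V| / K vertices."""
    limit = (1 + omega) * len(part_of) / K
    return all(size <= limit for size in Counter(part_of.values()).values())

# Hypothetical weighted graph on four vertices and a 2-way partition of it.
edges = {("a", "b"): 3, ("b", "c"): 1, ("c", "d"): 2, ("a", "d"): 1}
part_of = {"a": 0, "b": 0, "c": 1, "d": 1}

cut = cut_weight(edges, part_of)
balanced = is_balanced(part_of, K=2, omega=0.0)
```

A partitioner searches over `part_of` for the smallest `cut` among the assignments for which `is_balanced` holds.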

Since this is a combinatorial problem, heuristic approaches are mostly used for graph partitioning in order to obtain acceptable computation times. We reduce the problem of balanced distribution of a quantum circuit to the problem of balanced graph partitioning, so that qubits and gates are the nodes and edges of the graph, respectively. We propose an integer linear programming model for $K$-way partitioning of quantum circuits. Let the quantum circuit consist of two sets, i.e. $Q=\{q_{i} |i \in \{1,...,N_{q}\}\}$ and ${\mathcal {G}} =\{g_{j} |j\in \{1,...,N_{2qubit}\}\}$, where ${\mathcal {G}}$ is the set of two-qubit gates. Each $g_j$ operates on two qubits $q_{i_{1}}$ and $q_{i_{2}}$ and is written as $g_j(q_{i_{1}},q_{i_{2}})$. The binary variables of the proposed mathematical model are given in Eqs. (3) and (4):

$$\begin{aligned} f_{j}= & {} {\left\{ \begin{array}{ll} 1&{} \text {if }g_j\hbox { is\, a\, non-local\, gate}\\ 0&{} \text {otherwise} \end{array}\right. } \end{aligned}$$

(3)

$$\begin{aligned} p_{i,k}= & {} {\left\{ \begin{array}{ll} 1&{} \text {if }q_{i}\hbox { has\, been\, located \,on \,Part\, }k\\ 0&{} \text {otherwise} \end{array}\right. } \end{aligned}$$

(4)

The binary variable $f_{j}$ is set to one when the two-qubit gate $g_{j}(i_1,i_2)$ is non-local, i.e. when qubits $i_1$ and $i_2$ are located on different parts, and to zero otherwise (local gate). The binary variable $p_{i,k}$ indicates whether $q_{i}$ is located on Part k. The proposed model is given in Eqs. (5) to (9):

$$\begin{aligned}&\min \sum \limits _{j=1}^{N_{2qubit}} f_{j} \end{aligned}$$

(5)

$$\begin{aligned}&\sum \limits _{i=1}^{N_{q}} p_{i,k} \le (1+\omega )|Q| /K \quad \forall k=1,...,K \end{aligned}$$

(6)

$$\begin{aligned}&\sum \limits _{k=1}^{K} p_{i,k}=1 \quad \forall i=1,...,N_q \end{aligned}$$

(7)

$$\begin{aligned}&f_{j}\ge p_{i_2,k}-p_{i_1,k} \quad \forall k=1,...,K, \quad g_j(i_{1},i_{2}) \in \mathcal {G} \end{aligned}$$

(8)

$$\begin{aligned}&f_{j}\ge p_{i_1,k}-p_{i_2,k} \quad \forall k=1,...,K, \quad g_j(i_{1},i_{2}) \in \mathcal {G} \end{aligned}$$

(9)

S.t. $f_{j} \in \{0,1\}, p_{i,k}\in \{0,1\} \quad \forall i=1,...,N_{q}, j=1,...,N_{2qubit}, k=1,...,K$

Equation (5) defines the objective function, which is the number of non-local gates. The load-balancing criterion is enforced by Eq. (6). Equation (7) ensures that each qubit is assigned to exactly one unit. Equations (8) and (9) guarantee that non-local gates are counted correctly: if the two qubits of $g_j$ lie on different parts, then $p_{i_1,k}$ and $p_{i_2,k}$ differ for some k, which forces $f_j = 1$.
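
To make the role of each constraint concrete, the sketch below solves a tiny instance of the model in Eqs. (5) to (9) by exhaustive enumeration. This is a stand-in for an ILP solver, not the authors' implementation, and it is only feasible for very small circuits. The function name, the 0-based qubit and part indices, and the example circuit are assumptions of this sketch.

```python
from itertools import product

def solve_by_enumeration(n_qubits, gates, K, omega):
    """Minimise the number of non-local gates under Eqs. (6)-(9) by brute force."""
    limit = (1 + omega) * n_qubits / K
    best = None
    for labels in product(range(K), repeat=n_qubits):
        # Eq. (7) holds by construction: each qubit gets exactly one label.
        p = [[int(labels[i] == k) for k in range(K)] for i in range(n_qubits)]
        # Eq. (6): skip assignments that overload any part.
        if any(sum(p[i][k] for i in range(n_qubits)) > limit for k in range(K)):
            continue
        # Eqs. (8)-(9): the smallest feasible f_j is max_k |p[i1][k] - p[i2][k]|.
        f = [max(abs(p[a][k] - p[b][k]) for k in range(K)) for a, b in gates]
        # Eq. (5): keep the assignment with the fewest non-local gates.
        if best is None or sum(f) < best[0]:
            best = (sum(f), labels)
    return best

# Hypothetical circuit: 4 qubits (0-based), five two-qubit gates, K = 2 parts.
gates = [(0, 1), (0, 1), (2, 3), (1, 2), (0, 1)]
objective, labels = solve_by_enumeration(4, gates, K=2, omega=0.0)
```

The enumeration visits $K^{N_q}$ labellings, which is why an ILP solver or a heuristic is needed for realistic circuit sizes.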

The proposed model distributes the quantum circuit into K balanced units. This distribution maps the qubits of the circuit onto K subsystems. The output is a labelling $f: Q \rightarrow \{1,...,K\}$ of qubits that satisfies the two criteria given in Eqs. (1) and (2). This function maps the qubits to a set of labels $P=\{p_1,p_2,...,p_K\}$. These labels are the input of Level II.

Level II: the optimization level

After the partitioning of Level I, qubits are distributed into K units according to the labelling obtained at the previous level. As stated earlier, each non-local gate needs two teleportations to execute. In many cases, however, a qubit teleported from its source partition to the destination partition, known as the migrated qubit, can be used there by other gates without first being teleported back to its own partition. Only after that is the migrated qubit teleported back to its home partition. At this level, we propose a recursive approach that implements this idea and minimizes the total number of teleportations.

In this level, we present a data structure for representing quantum circuits. This structure is a two-dimensional matrix called $C_{N_{q}\times N_{g}}$ with $N_q$ rows and $N_g$ columns and defined as follows:

Qubits are located on the rows and numbered from one to $N_{q}$, where the ith row indicates Qubit $q_{i}$.
Gates are located on the columns and are numbered in the order of their executions in the quantum circuit.

Element $C_{i,j} \quad (1\le i\le N_{q}, 1\le j\le N_{g})$ consists of two components: (index, label). Here, index is the qubit that interacts with the ith qubit in the jth gate, and label is the role of that qubit, which is ‘control’ or ‘target’ in two-qubit gates or ‘non’ in one-qubit gates. These elements are constructed as follows:

For each two-qubit gate $g_{i} (q_{t},q_{c})$, $C_{t,i}=(q_c,\text{‘c’})$ and $C_{c,i}=(q_t,\text{‘t’})$.
For each one-qubit gate $g_{i} (q_{j})$, $C_{j,i}=(q_j,\text{‘non’})$.
All other elements are set to zero.
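
The construction rules above can be sketched as follows. The gate encoding is an assumption of this example: a two-qubit gate is a tuple `(t, c)` of target and control, a one-qubit gate is a tuple `(q,)`, and qubits are numbered from one as in the text. The function name and the small circuit are hypothetical.

```python
def build_circuit_matrix(n_qubits, gates):
    """Build the N_q x N_g matrix C of (index, label) entries; empty cells are 0."""
    C = [[0] * len(gates) for _ in range(n_qubits)]
    for col, gate in enumerate(gates):
        if len(gate) == 2:
            t, c = gate
            C[t - 1][col] = (c, "c")    # target row stores its control partner
            C[c - 1][col] = (t, "t")    # control row stores its target partner
        else:
            (q,) = gate
            C[q - 1][col] = (q, "non")  # one-qubit gate refers to itself
    return C

# Hypothetical circuit on 4 qubits: g1(q2, q1), g2(q3), g3(q4, q2).
C = build_circuit_matrix(4, [(2, 1), (3,), (4, 2)])
```

Reading a column of `C` gives the qubits of one gate, and reading a row gives the sequence of gates that act on one qubit.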

For example, we consider a quantum circuit with 4 qubits and 7 gates in Fig. 2a so that its corresponding matrix is given in Fig. 2b.

Algorithm I presents the main algorithm. It uses an array called run of size $N_{g}$, in which run[i] indicates whether the ith gate has been executed. The algorithm starts from the first gate, i.e. the first column of C (index s). This column falls into one of the following three cases:

Column s indicates a local two-qubit gate.
Column s is a one-qubit gate.
Column s indicates a non-local gate.

In the first two cases, no teleportation is required; the gate is executed and run[s] is set to one (Lines 5–6 of the main algorithm). Otherwise, Gate $g_s$ is a non-local gate and a teleportation is required to execute it. The teleportation cost is then increased by two (Line 10), where the additional teleportation accounts for transferring the qubit back to its source part. Next, Function $Find\_qubits(g_s)$ finds the two qubits of Gate $g_{s}$, called $index_{1,s}$ and $index_{2,s}$. The one that leads to the minimum number of teleportations is selected as $q\_teleport$ (Line 12). This qubit is teleported from its own part to the destination to execute Gate $g_s$. The algorithm then scans the rest of the circuit to find gates that can be executed without returning $q\_teleport$ to its source, so that the teleported qubit is reused by other gates that require $q\_teleport$.

Let Gate $g_{d}$ in Column d, written $g_{d} (index_{1,d},index_{2,d})$, be the first local two-qubit gate that shares Qubit $q\_teleport$ with Gate $g_s$. It must be checked whether this gate can be executed. The recursive Function $Execute(g_{s},g_{d},q\_teleport, run)$ determines whether Gate $g_{d}$ can be executed once $q\_teleport$ has been teleported. This function is shown in Algorithm II. Three states may occur in this function, as follows:

The function returns False when there is at least one non-executed, non-local gate between $g_{s}$ and $g_{d}$ on which the execution of $g_d$ depends. Let Column $k (s< k <d)$, written $g_{k} (index_{1,k},index_{2,k})$, be the first non-executed and non-local gate before Column d that has a common qubit with Gate $g_d$. This column has two non-zero rows, $index_{1,k}$ and $index_{2,k}$. The function returns False (Line 11 of Algorithm II) and stops if the following condition holds:
$$\begin{aligned}& \exists \, i,j \in \{1,2\}: \\& index_{i,k} = index_{j,d} \quad \& \& \\& P_{index_{\{1,2\} - i,k}} \ne P_{index_{\{1,2\}-j,d}} \quad \& \& \\& \biggl (C[k,index_{i,k}].label \ne C[d,index_{j,d}].label\quad \Vert \\& C[k,index_{i,k}].label = C[d,index_{j,d}].label = \text{‘t’}\biggr ) \end{aligned}$$
(10)

Equation (10) states that one of the qubits of $g_{k}$ is also a qubit of $g_{d}$, either with a different label or with the same label ‘t’, and that the other qubits of $g_{k}$ and $g_d$ are located on different partitions. In this case, another teleportation is required to execute $g_{k}$, and the function returns False as a result. Figure 3a illustrates this case. In this example, $q_{1}$ is teleported from $P_{1}$ to $P_{3}$ to execute $g_s$. Function Execute then checks whether Gate $g_{d}$ can be executed. It finds the non-executed, non-local Gate $g_{k}$ before Gate $g_{d}$, which shares qubit $q_{1}$ with $g_d$ under a different label. Since the execution of Gate $g_d$ depends on the execution of Gate $g_{k}$, and Gate $g_k$ is non-local, Gate $g_d$ cannot be executed and Function Execute returns False.

Sometimes Gate $g_{k}$ is a non-local gate that shares a qubit with Gate $g_{d}$, and that qubit has Label ‘c’ in both gates. In this case, the execution of Gate $g_{d}$ is independent of the execution of Gate $g_{k}$. Hence, leaving Gate $g_{k}$ unexecuted does not prevent the execution of Gate $g_d$, and the other gates preceding $g_{d}$ are examined (Lines 7–9). Equation (11) expresses this state.
$$\begin{aligned} index_{i,k}= index_{j,d}\quad \& \& \quad \bigl ( C[k,index_{i,k}].label =C[d,index_{j,d}].label = \text{‘c’} \bigr ) \quad \exists i,j \in \{1,2\} \end{aligned}$$
(11)
This state is shown in Fig. 3b.
There are no gates between Gate $g_{s}$ and Gate $g_{d}$ to prevent the execution of Gate $g_d$. In this case, this function returns True (Lines 13–14).
If Gate $g_{k}$ does not meet any of the conditions of Eqs. (10) and (11), Function $Execute(g_s,g_k,q\_teleport,run)$ is called recursively to consider if $g_{k}$ is executed or not (Line 19).
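
The reuse idea behind Level II can be sketched with a simplified greedy pass. This is not the authors' Algorithm I or Algorithm II: it replaces the recursive Execute function and the label-based conditions of Eqs. (10) and (11) with a conservative dependency rule, stated in the code, under which a gate waits for every earlier unexecuted gate that touches one of its qubits. All function names, the gate encoding and the example circuit are assumptions of this sketch.

```python
def is_local(gate, part):
    """One-qubit gates are always local; two-qubit gates need both qubits in one part."""
    return len(gate) == 1 or part[gate[0]] == part[gate[1]]

def blocked(d, gates, run):
    """Conservative rule (an assumption of this sketch): gate d waits for every
    earlier, not-yet-executed gate that touches one of its qubits."""
    qubits = set(gates[d])
    return any(not run[k] and qubits & set(gates[k]) for k in range(d))

def reuse_pass(s, q, gates, part, run):
    """Teleport qubit q to execute gate s; list the gates that can run before q returns."""
    a, b = gates[s]
    partner = b if q == a else a
    away = dict(part)
    away[q] = part[partner]          # q temporarily sits in its partner's partition
    flags = list(run)
    flags[s] = True
    executed = [s]
    for d in range(s + 1, len(gates)):
        if not flags[d] and is_local(gates[d], away) and not blocked(d, gates, flags):
            flags[d] = True
            executed.append(d)
    return executed

def count_teleportations(gates, part):
    run = [False] * len(gates)
    cost = 0
    for s, gate in enumerate(gates):
        if run[s]:
            continue
        if is_local(gate, part):     # one-qubit or local gate: no teleportation
            run[s] = True
            continue
        cost += 2                    # one teleportation out, one back
        # Try both qubits of the non-local gate and keep the more productive one.
        best = max((reuse_pass(s, q, gates, part, run) for q in gate), key=len)
        for d in best:
            run[d] = True
    return cost

# Hypothetical circuit: qubits 0,1 in part 0 and qubits 2,3 in part 1.
gates = [(0, 2), (0, 3), (1,), (0, 2), (0, 1)]
part = {0: 0, 1: 0, 2: 1, 3: 1}
cost = count_teleportations(gates, part)
```

In this example three gates are non-local, so the Level I count would be 6, while the greedy pass needs only 2 teleportations because qubit 0 stays in the second partition for all three of them. Because the dependency rule is stricter than Eq. (11), the sketch can report more teleportations than the method described in the text.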

The proposed method is explained by an example. Figure 4a shows the quantum circuit 2-4dec taken from Revlib⁴⁷. This circuit consists of six qubits and 27 gates. Our algorithm distributes this circuit into three partitions, each containing two qubits. First, Level I of the proposed method distributes this circuit as shown in Fig. 4b, and Array P is set to [3,3,2,1,2,1]. At this level, the number of non-local gates is 13, so the total number of communications is 26. Table 2 demonstrates the steps of Level II of our method on this circuit. In this table, Column 2 gives $g_s$, the status of $g_s$ (one-qubit/local/non-local gate) and the qubit that is teleported ($q\_teleport$). Column 3 gives the partition to which $q\_teleport$ is teleported (destination partition), $g_d$, and the partition to which $q\_teleport$ is teleported back (source partition), respectively. Array run, which indicates whether the ith gate has been executed, is shown in Column 4, and Array P is given in the last column. The steps of Level II are as follows:

Step 1: $g_1$ to $g_6$ are one-qubit gates and no teleportation is required. Then $run[i]=1,\{i=1,...,6\}$.
Step 2: $g_7(q_1,q_4)$ is a non-local gate and $q_1$ is teleported to $P_1$. $g_{10}$ is the first gate that shares qubit $q_1$ with $g_7$. Since $g_{10}$ depends on $g_9$ and $run[9]=0$, $g_{10}$ cannot be executed. Therefore only $g_7$ is executed and $run[7]=1$. Then $q_1$ is teleported back to $P_3$.
Step 3: $g_8(q_3,q_4)$ is a non-local gate and $q_3$ is teleported to $P_1$. $g_9$ is the first gate that shares qubit $q_3$ with $g_8$. Then $run[i]=1,i=\{8,9\}$. Other gates cannot be executed and $q_3$ is teleported back to its source partition ($P_2$).
Step 4: $g_{10}(q_1,q_4)$ is a non-local gate and $q_1$ is teleported to $P_1$. No other gate sharing a qubit with $g_{10}$ can be executed. Then $run[10]=1$ and $q_1$ is teleported back to $P_3$.
Step 5: $g_{11}(q_1,q_3)$ is a non-local gate and $q_1$ is teleported to $P_2$. No other gate sharing a qubit with $g_{11}$ can be executed. Then $run[11]=1$ and $q_1$ is teleported back to $P_3$.
Step 6: $g_{12}(q_3,q_4)$ is a non-local gate and $q_4$ is teleported to $P_2$. $g_{13}$ is the first gate that shares qubit $q_4$ with $g_{12}$. Therefore only $g_{13}$ is additionally executed and $run[i]=1,i=\{12,13\}$. Then $q_4$ is teleported back to $P_1$.
Step 7: $g_{14}(q_2,q_3)$ is a non-local gate and $q_2$ is teleported to $P_2$. $g_{17}$ and $g_{18}$ share qubit $q_2$ with $g_{14}$. These gates depend on $g_{15}$ and $g_{16}$, which are local gates and can be executed. Then $run[i]=1, i=\{14,...,18\}$ and $q_2$ is teleported back to $P_3$.
Steps 8 and 9: $g_{19}$ and $g_{20}$ are local gates and are executed. Then $run[i]=1,i=\{19,20\}$.
Step 10: $g_{21}(q_2,q_4)$ is a non-local gate and $q_2$ is teleported to $P_1$. Then $run[i]=1,i=\{21,...,25\}$. Then $q_2$ is teleported back to $P_3$.
Steps 11 and 12: $g_{26}$ and $g_{27}$ are local gates and are executed. Then $run[i]=1,i=\{26,27\}$.

As shown above, each of Steps 2, 3, 4, 5, 6, 7 and 10 requires two teleportations. The total number of teleportations for this circuit is therefore 14.
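
The arithmetic of this example can be written out explicitly. The symbols $T_{I}$ and $T_{II}$ are introduced here only for illustration and denote the teleportation counts after Level I and after Level II, using the 13 non-local gates and the seven teleporting steps reported above:

$$\begin{aligned} T_{I}&= 2 \times 13 = 26, \\ T_{II}&= 2 \times 7 = 14, \\ \frac{T_{I}-T_{II}}{T_{I}}&= \frac{12}{26} \approx 46\% \end{aligned}$$

For this circuit, Level II therefore removes 12 of the 26 teleportations counted at Level I.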

Table 2 The steps of distributing Circuit 2-4dec, with 6 qubits and 27 gates, into 3 partitions.

Experimental results

We implemented our method in MATLAB on a Core i7 CPU operating at 1.8 GHz with 8 GB of memory. We used many circuits to compare the performance of the proposed method with previous approaches: that of³⁹, the dynamic programming approach of³⁸, the evolutionary algorithm of⁴⁰, the automated approach of²⁰ and the windows-based method of⁴⁶. The benchmark circuits are taken from⁴⁸ (Circuits 1 to 10), Revlib⁴⁷ (Circuits 11 to 15 and 26 to 31), some quantum error-correction encoding circuits⁴⁹ (Circuits 16 to 25) and n-qubit Quantum Fourier Transform (QFT) circuits⁵⁰, where $n \in \{16, 32, 64, 128, 256\}$. The benchmark circuits contain gates of the gate library, synthesized following the method in⁵¹. In this paper, CNOT, CZ and one-qubit gates are considered as the gate library.

To put the quality of the results into perspective, a percentage deviation criterion, Dev, is employed as in Eq. (12):

$$\begin{aligned} Dev=\frac{T_{ap}-T_{best}}{T_{best}}\times 100 \end{aligned}$$

(12)

where $T_{best}$ is the best (smallest) number of teleportations obtained among all approaches and $T_{ap}$ is the number of teleportations obtained by the approach being evaluated.
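
Eq. (12) can be computed as in the short sketch below. The approach names and teleportation counts are hypothetical placeholders, not results from the paper.

```python
def dev(t_ap, t_best):
    """Eq. (12): percentage by which an approach exceeds the best teleportation count."""
    return (t_ap - t_best) / t_best * 100

# Hypothetical teleportation counts of two approaches on one circuit.
counts = {"approach_a": 14, "approach_b": 16}
t_best = min(counts.values())
devs = {name: dev(t, t_best) for name, t in counts.items()}
```

An approach that attains the best count has a Dev of zero, so larger values indicate results further from the best known one.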

Table 3 compares the number of teleportations of the proposed method with that of the windows-based approach of⁴⁶. In this table, the numbers of qubits, gates and partitions are given in Columns 3, 4 and 5. Columns 6, 7, 8 and 9 report the number of teleportations and Dev for the proposed method and for the method of⁴⁶, respectively. As shown in this table, except for Circuits 2-4dec, Cycle17_3, Ham15-D3, Ham7_106 and Parity247, Dev for our approach is zero, which shows that the proposed method performs at least as well as that of⁴⁶ in reaching the minimum number of teleportations for these circuits.

Table 3 The number of teleportations $(N_t)$ and Dev of the proposed algorithm in comparison with the method of⁴⁶.

It is also important to demonstrate how applying Level II to the partitioning of Level I reduces the number of communications. As mentioned earlier, in Level I of the proposed algorithm two teleportations are needed to execute each non-local gate, because after a qubit is teleported to the destination home, it is teleported back to its source. In contrast, Level II of the proposed algorithm allows a teleported qubit to be used optimally in the destination home before it returns, which saves a large number of quantum teleportations. Figure 5a shows the effectiveness of applying Level II to Level I in decreasing the number of teleportations for Circuits 1 to 15. The bottom (blue) bar indicates the number of communications required after applying Level II to these benchmarks, and the top (orange) bar indicates the extra teleportations needed without Level II. As shown in this figure, in all cases over 70% of non-local gates could be implemented locally, and Level II reduces the number of teleportations to less than half in all of the samples.
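The difference between the two counting schemes can be sketched in a few lines of Python. This is a simplified greedy illustration, not the recursive method of Level II: it assumes a circuit given as an ordered list of two-qubit gates, always moves the first qubit of a non-local gate, and ignores partition capacity. The names `level1_cost`, `greedy_reuse_cost` and `home` are hypothetical.

```python
from typing import Dict, List, Tuple

Gate = Tuple[int, int]  # a two-qubit gate given as (first qubit, second qubit)


def level1_cost(gates: List[Gate], home: Dict[int, int]) -> int:
    """Two teleportations (there and back) for every non-local gate."""
    return 2 * sum(1 for a, b in gates if home[a] != home[b])


def greedy_reuse_cost(gates: List[Gate], home: Dict[int, int]) -> int:
    """Count teleportations when a migrated qubit stays put while it is useful."""
    loc = dict(home)  # current partition of every qubit
    cost = 0
    for a, b in gates:
        if loc[a] == loc[b]:
            continue  # executable locally, possibly thanks to an earlier teleportation
        # Send any displaced operand back to its home partition first.
        for q in (a, b):
            if loc[q] != home[q]:
                loc[q] = home[q]
                cost += 1
        if loc[a] != loc[b]:
            loc[a] = loc[b]  # teleport the first qubit to its partner's partition
            cost += 1
    # Every qubit that is still away is finally teleported home.
    cost += sum(1 for q in loc if loc[q] != home[q])
    return cost


# Hypothetical example: qubits 0 and 1 live in partition 0, qubits 2 and 3 in partition 1.
home = {0: 0, 1: 0, 2: 1, 3: 1}
gates = [(0, 2), (0, 3), (1, 2), (0, 1)]
print(level1_cost(gates, home), greedy_reuse_cost(gates, home))
```

In this sketch every outbound teleportation is triggered by a gate that is non-local under the home partitioning and is paired with exactly one return, so the greedy count never exceeds the Level I count. A search over all configurations, as in Level II, can only do as well or better than this single greedy policy.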

In another test, we considered the impact of the number of subsystems on the number of teleportations. A near-balanced distribution of qubits over more subsystems requires more communications. Figure 5b demonstrates the effect of the number of units (K) on the required number of teleportations for Circuit Hwb50, with 56 qubits and 6430 gates. In this figure, qubits are distributed across {2, 3, ..., 7} units. As shown, increasing the number of partitions over which the qubits are distributed requires more communications among them, and hence more teleportations. The blue and orange lines show the number of teleportations obtained before and after applying Level II to Circuit Hwb50, respectively.

Second, we tested our method on another benchmark set (Circuits 16–25) and compared it with the method of⁴⁶. The results are reported in Table 4, where the best results are marked in bold. Except for three circuits, our approach outperformed that of⁴⁶.

Table 4 Comparison of the number of teleportations ($N_t$) of the proposed method with the approach of⁴⁶ on Circuits 16 to 25.

Third, we ran our method on the QFT circuit and compared it with the method of²⁰. We distributed the quantum circuit across {4, 6, 8, ..., 16} quantum devices. Here, $N_q$ and $N_g$ are 201 and 19900, respectively. Figure 6 shows the ratio of the number of teleportations to the total number of two-qubit gates for our approach and the approach of²⁰. As shown in this figure, this ratio grows as the number of partitions increases. Our approach also performed better than the method of²⁰ in all cases in terms of this ratio. Because the proposed approach considers all of the configurations for executing more non-local gates, it found a smaller number of communications than the approach of²⁰, which implements only groups of non-local gates with a common control qubit. When QFT is distributed over $K \in \{4,6,8\}$ partitions, the number of teleportations obtained by the proposed approach differs considerably from that of²⁰, whereas the two methods performed almost identically for $K \in \{10,12,14,16\}$.
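The growth of communication with the number of partitions can be illustrated with a rough proxy. The Python sketch below assumes the textbook QFT, which applies one controlled rotation to every pair of qubits (final swaps ignored), and a simple contiguous, near-balanced assignment of qubits to partitions. It reports the fraction of non-local two-qubit gates rather than the teleportation ratio of Figure 6, and the 64-qubit size and the function names are chosen only for illustration.

```python
from itertools import combinations


def contiguous_partition(n_qubits: int, k: int) -> list:
    """Assign qubits 0..n-1 to k contiguous blocks whose sizes differ by at most one."""
    base, extra = divmod(n_qubits, k)
    home = []
    for part in range(k):
        home.extend([part] * (base + (1 if part < extra else 0)))
    return home


def nonlocal_fraction(n_qubits: int, k: int) -> float:
    """Fraction of QFT two-qubit gates whose qubits lie in different blocks."""
    home = contiguous_partition(n_qubits, k)
    pairs = list(combinations(range(n_qubits), 2))  # one controlled rotation per pair
    cut = sum(1 for a, b in pairs if home[a] != home[b])
    return cut / len(pairs)


for k in (4, 6, 8, 10, 12, 14, 16):
    print(k, round(nonlocal_fraction(64, k), 3))
```

Because every pair of qubits interacts in this circuit model, splitting the qubits into more blocks leaves fewer pairs inside a block, so the non-local fraction rises with K under this assignment.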

In another test, we examined the effect of the load-balance tolerance ($\omega$) on the number of non-local gates in Level I. Figure 7 shows the number of non-local gates for various $\omega \in \{0.1,...,0.9\}$ on one sample circuit. As shown in this figure, $N_{non}$ decreases as the load-balance tolerance increases. According to Eq. (6), when the factor $\omega$ increases, qubits that communicate heavily with each other are placed in the same partition. Therefore, the number of non-local gates is reduced.
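The effect of the tolerance can be reproduced on a toy instance. The sketch below replaces the integer linear program of Level I with an exhaustive search, which is feasible only for a handful of qubits, and it assumes a balance constraint of the form "no partition holds more than $\lfloor (1+\omega) n / K \rfloor$ qubits"; this form is an assumption made for illustration and may differ from Eq. (6). The circuit and the function name `min_nonlocal_gates` are hypothetical.

```python
from itertools import product
from math import floor


def min_nonlocal_gates(n_qubits, gates, k, omega):
    """Smallest number of non-local gates over all partitions that respect the size cap."""
    cap = floor((1 + omega) * n_qubits / k)  # assumed form of the balance constraint
    best = None
    for home in product(range(k), repeat=n_qubits):
        sizes = [home.count(p) for p in range(k)]
        if min(sizes) == 0 or max(sizes) > cap:
            continue  # every partition must be used and none may exceed the cap
        cut = sum(1 for a, b in gates if home[a] != home[b])
        if best is None or cut < best:
            best = cut
    return best


# Hypothetical 6-qubit circuit: qubits 0-3 interact heavily,
# and qubits 4 and 5 form a short tail attached to qubit 3.
gates = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
for omega in (0.1, 0.5, 0.9):
    print(omega, min_nonlocal_gates(6, gates, k=2, omega=omega))
```

With a tight tolerance the four heavily interacting qubits must be split across the two partitions, which cuts three gates; once the cap allows four qubits in one partition they can be kept together and only one gate remains non-local. Enlarging the tolerance can only enlarge the set of feasible partitions, so the minimum never increases.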

Table 5 The teleportation cost of the proposed method ($N_t$) on Circuits 26 to 31 in comparison with the methods of^38,39,40.

Another set of test samples was taken from Revlib to compare the proposed method with the other approaches^38,39,40, including Alu_primitive, Parity, Flip_flop and Sym9_147 (Circuits 26 to 31). The numbers of qubits, gates and partitions are given in Columns 2, 3 and 4 of Table 5, respectively. Columns 5, 6 and 7 report the number of teleportations of^38,39,40. The last column shows the number of teleportations obtained by the proposed approach. As can be seen, the proposed method outperformed the other approaches.

Conclusion

In this paper, a two-level hierarchical architecture for distributed quantum computing was proposed to build large quantum systems in which the number of communications among quantum subsystems is minimized. In the first level, an integer linear programming model was proposed to distribute the qubits to K balanced subsystems. In the second level, we presented a new data structure for representing quantum circuits. In addition, based on the partitioning of the first level, when one of the qubits of a non-local gate is teleported from its source subsystem to the destination, it is used optimally by other gates in the destination subsystem before being teleported back to its own subsystem. Moreover, we proposed a recursive method to optimize the number of teleportations. Finally, we ran the proposed method on different benchmarks and showed that it produces better results than the previous approaches.

References

1. Krantz, P. et al. A quantum engineer’s guide to superconducting qubits. Appl. Phys. Rev. 6, 021318 (2019).
2. Huang, H.-L. et al. Experimental blind quantum computing for a classical client. Phys. Rev. Lett. 119, 050503 (2017).
3. Cacciapuoti, A. S. et al. Quantum internet: Networking challenges in distributed quantum computing. IEEE Netw. (2019).
4. Cacciapuoti, A. S., Caleffi, M., Van Meter, R. & Hanzo, L. Quantum teleportation for the quantum internet. in IEEE Transactions on Communications, When Entanglement Meets Classical Communications (2020).
5. Cacciapuoti, A. S. & Caleffi, M. Toward the quantum internet: A directional-dependent noise model for quantum signal processing. in ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 7978–7982 (IEEE, 2019).
6. Krojanski, H. G. & Suter, D. Scaling of decoherence in wide NMR quantum registers. Phys. Rev. Lett. 93, 090501 (2004).
7. Cuomo, D., Caleffi, M. & Cacciapuoti, A. S. Towards a distributed quantum computing ecosystem. arXiv preprint arXiv:2002.11808 (2020).
8. Blatt, R. & Roos, C. F. Quantum simulations with trapped ions. Nat. Phys. 8, 277–284 (2012).
9. Bruzewicz, C. D., Chiaverini, J., McConnell, R. & Sage, J. M. Trapped-ion quantum computing: Progress and challenges. Appl. Phys. Rev. 6, 021314 (2019).
10. Kjaergaard, M. et al. Superconducting qubits: Current state of play. Annu. Rev. Condens. Matter Phys. 11, 369–395 (2020).
11. Huang, H.-L., Wu, D., Fan, D. & Zhu, X. Superconducting quantum computing: A review. Sci. China Inf. Sci. 63, 1–32 (2020).
12. Slussarenko, S. & Pryde, G. J. Photonic quantum information processing: A concise review. Appl. Phys. Rev. 6, 041303 (2019).
13. Van Meter, R., Ladd, T. D., Fowler, A. G. & Yamamoto, Y. Distributed quantum computation architecture using semiconductor nanophotonics. Int. J. Quantum Inf. 8, 295–323 (2010).
14. Monroe, C. et al. Large-scale modular quantum-computer architecture with atomic memory and photonic interconnects. Phys. Rev. A 89, 022317 (2014).
15. Ahsan, M., Meter, R. V. & Kim, J. Designing a million-qubit quantum computer using a resource performance simulator. ACM J. Emerg. Technol. Comput. Syst. (JETC) 12, 1–25 (2015).
16. Bennett, C. H. et al. Teleporting an unknown quantum state via dual classical and Einstein-Podolsky-Rosen channels. Phys. Rev. Lett. 70, 1895 (1993).
17. Duan, L.-M., Lukin, M. D., Cirac, J. I. & Zoller, P. Long-distance quantum communication with atomic ensembles and linear optics. Nature 414, 413–418 (2001).
18. Sangouard, N., Simon, C., De Riedmatten, H. & Gisin, N. Quantum repeaters based on atomic ensembles and linear optics. Rev. Mod. Phys. 83, 33 (2011).
19. G Sundaram, R., Gupta, H. & Ramakrishnan, C. Efficient distribution of quantum circuits. in 35th International Symposium on Distributed Computing (DISC 2021) (Schloss Dagstuhl-Leibniz-Zentrum für Informatik, 2021).
20. Andrés-Martínez, P. Automated distribution of quantum circuits. Theor. Comput. Sci. 410, 2489–2510 (2018).
21. Grover, L. K. Quantum telecomputation. arXiv preprint arXiv:quant-ph/9704012 (1997).
22. Cirac, J., Ekert, A., Huelga, S. & Macchiavello, C. Distributed quantum computation over noisy channels. Phys. Rev. A 59, 4249 (1999).
23. Cleve, R. & Buhrman, H. Substituting quantum entanglement for communication. Phys. Rev. A 56, 1201 (1997).
24. Reichardt, B. W., Unger, F. & Vazirani, U. Classical command of quantum systems. Nature 496, 456–460 (2013).
25. Sheng, Y.-B. & Zhou, L. Distributed secure quantum machine learning. Sci. Bull. 62, 1025–1029 (2017).
26. Dousti, M. J., Shafaei, A. & Pedram, M. Squash 2: A hierarchical scalable quantum mapper considering ancilla sharing. arXiv preprint arXiv:1512.07402 (2015).
27. Karypis, G. & Kumar, V. A fast and high quality multilevel scheme for partitioning irregular graphs. SIAM J. Sci. Comput. 20, 359–392 (1998).
28. Moghadam, M. C., Mohammadzadeh, N., Sedighi, M. & Zamani, M. S. A hierarchical layout generation method for quantum circuits. in The 17th CSI International Symposium on Computer Architecture & Digital Systems (CADS 2013). 51–57 (IEEE, 2013).
29. Breuer, M. A. A class of min-cut placement algorithms. in Proceedings of the 14th Design Automation Conference. 284–290 (1977).
30. Wang, G. & Khainovski, O. A fault-tolerant, ion-trap-based architecture for the quantum simulation algorithm. Measurement 10, 10–4 (2010).
31. Stoer, M. & Wagner, F. A simple min-cut algorithm. J. ACM (JACM) 44, 585–591 (1997).
32. Sargaran, S. & Mohammadzadeh, N. Saqip: A scalable architecture for quantum information processors. ACM Trans. Architect. Code Optim. (TACO) 16, 1–21 (2019).
33. Karypis, G. & Kumar, V. Multilevel k-way hypergraph partitioning. VLSI Des. 11, 285–300 (2000).
34. Kimble, H. J. The quantum internet. Nature 453, 1023–1030 (2008).
35. Caleffi, M., Cacciapuoti, A. S. & Bianchi, G. Quantum internet: From communication to distributed computing! in Proceedings of the 5th ACM International Conference on Nanoscale Computing and Communication. 1–4 (2018).
36. Bourzac, K. 4 tough chemistry problems that quantum computers will solve [news]. IEEE Spectrum 54, 7–9 (2017).
37. Yimsiriwattana, A. & Lomonaco Jr, S. J. Generalized ghz states and distributed quantum computing. arXiv preprint arXiv:quant-ph/0402148 (2004).
38. Davarzani, Z., Zomorodi-Moghadam, M., Houshmand, M. & Nouri-baygi, M. A dynamic programming approach for distributing quantum circuits by bipartite graphs. Quantum Inf. Process. 19, 1–18 (2020).
39. Zomorodi-Moghadam, M., Houshmand, M. & Houshmandi, M. Optimizing teleportation cost in distributed quantum circuits. Theor. Phys. 57, 848–861 (2018).
40. Zahra Mohammadi, M. Z.-M., Houshmand, M. & Houshmandi, M. An evolutionary approach to optimizing communication cost in distributed quantum computation. arXiv (2019).
41. Ghodsollahee, I. et al. Connectivity matrix model of quantum circuits and its application to distributed quantum circuit optimization. Quantum Inf. Process. 20, 1–21 (2021).
42. Daei, O., Navi, K. & Zomorodi-Moghadam, M. Optimized quantum circuit partitioning. Int. J. Theor. Phys. 59, 3804–3820 (2020).
43. Dadkhah, D., Zomorodi, M., Hosseini, S. E., Plawiak, P. & Zhou, X. Reordering and partitioning of distributed quantum circuits. IEEE Access. 10, 70329–70341. https://doi.org/10.1109/ACCESS.2022.3186485 (2022).
44. Daei, O., Navi, K. & Zomorodi, M. Improving the teleportation cost in distributed quantum circuits based on commuting of gates. Int. J. Theor. Phys. 60(9), 3494–3513. https://doi.org/10.1007/s10773-021-04920-y (2021).
45. Dadkhah, D., Zomorodi, M. & Hosseini, S. E. A new approach for optimization of distributed quantum circuits. Int. J. Theor. Phys. 60(9), 3271–3285. https://doi.org/10.1007/s10773-021-04904-y (2021).
46. Nikahd, E., Mohammadzadeh, N., Sedighi, M. & Zamani, M. S. Automated window-based partitioning of quantum circuits. Phys. Scr. 96, 035102 (2021).
47. Wille, R., Große, D., Teuber, L., Dueck, G. W. & Drechsler, R. Revlib: An online resource for reversible functions and reversible circuits. in 38th International Symposium on Multiple Valued Logic (ISMVL 2008). 220–225 (IEEE, 2008).
48. Maslov, D. Reversible logic synthesis benchmarks page. http://www.cs.uvic.ca/maslov/ (2005).
49. Cross, A. W., DiVincenzo, D. P. & Terhal, B. M. A comparative code study for quantum fault-tolerance. arXiv preprint arXiv:0711.1556 (2007).
50. Fowler, A. G. & Hollenberg, L. C. Scalability of Shor’s algorithm with a limited set of rotation gates. Phys. Rev. A 70, 032329 (2004).
51. Barenco, A. et al. Elementary gates for quantum computation. Phys. Rev. A 52, 3457–3467 (1995).

Author information

Authors and Affiliations

Department of Computer Engineering, Ferdowsi University of Mashhad, Mashhad, Iran
Zohreh Davarzani & Mariam Zomorodi
Department of Computer Engineering, Payame Noor University, Tehran, Iran
Zohreh Davarzani
Department of Computer Science, Faculty of Computer Science and Telecommunications, Cracow University of Technology, Krakow, Poland
Mariam Zomorodi
Department of Computer Engineering, Mashhad Branch, Islamic Azad University, Mashhad, Iran
Mahboobeh Houshmand

Contributions

Z.D. wrote the main manuscript text. M.Z. and M.H. revised the manuscript, verified the results, and improved the writing.

Corresponding author

Correspondence to Mariam Zomorodi.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Davarzani, Z., Zomorodi, M. & Houshmand, M. A hierarchical approach for building distributed quantum systems. Sci Rep 12, 15421 (2022). https://doi.org/10.1038/s41598-022-18989-w

Received: 26 November 2021
Accepted: 23 August 2022
Published: 14 September 2022
DOI: https://doi.org/10.1038/s41598-022-18989-w
