Abstract
We consider the decomposition of a controlled-R_{ n } gate into a standard set of universal gates. For this problem, a method exists that uses a single ancillary qubit to reduce the number of gates. In this work, we extend this method in three directions. First, we find a decomposition that uses fewer gates than the best known results for the controlled-R_{ n }. We also confirm that the proposed method reduces the total number of gates of the quantum Fourier transform. Second, we propose another efficient decomposition that can be mapped to a nearest-neighbor architecture with only local CNOT gates. Finally, we find a method that reduces the depth to 5 gate steps in a nearest-neighbor architecture with only local CNOT gates.
Introduction
Due to recent advances in quantum device technology, an arbitrary single-qubit gate or a Z-rotation gate can be implemented with fairly high accuracy, and small quantum algorithms can be tested. However, even with the small gate error rates currently being realized, it is difficult to perform scalable quantum computation directly, since this requires that arbitrarily large computations be implemented. To overcome this problem, fault-tolerant computation is still needed^{1}. Therefore, for reliable quantum computation, all quantum operations of a quantum algorithm should be represented by a universal gate set that arises from a fault-tolerant protocol, such as Clifford + T gates^{2}.
We consider a standard set of universal gates consisting of Hadamard (denoted H), phase (S), π/8 (T), and controlled-NOT (CNOT) gates. Although quantum algorithms are known to have much lower computational complexity than classical algorithms for problems such as factoring large integers^{3}, when such quantum algorithms are decomposed into CNOT, H, S, and T gates, the result includes a huge number of gates. Thus, the advantages of quantum computing might be nullified. To enhance the benefits of quantum computation, it is important to use an efficient decomposition of quantum algorithms into universal gates. Here, we first consider the decomposition of single-qubit gates and two-qubit gates. Any single-qubit gate can be decomposed in terms of Hadamard gates and Z-rotation gates R_{ z }(θ)^{4,5}, and there are well-known methods to approximate R_{ z }(θ) efficiently^{6,7,8,9}. Next, we consider a controlled-R_{ n } gate as the simplest two-qubit gate to be decomposed into a universal set of gates. Controlled-R_{ n } gates represent the fundamental part of the quantum Fourier transform (QFT) and many other quantum algorithms. Thus, controlled-R_{ n } decomposition has a significant impact on the overall decomposition of a quantum algorithm. In this work, we propose efficient controlled-R_{ n } decomposition methods as a technique to help enhance the benefits of quantum computation.
Background
Approximation of the R_{ n } gate
An R_{ n } gate is defined as follows:
\[R_{n}=\begin{pmatrix}1 & 0\\ 0 & e^{2\pi i/2^{n}}\end{pmatrix}\]
The R_{2} gate is an S gate (or P gate), and the R_{3} gate is a T gate. The R_{2} and R_{3} gates are included in the universal set. However, R_{ n } for n ≥ 4 cannot be exactly decomposed with only a standard set of universal gates^{8}. Thus, we should approximate R_{ n } for n ≥ 4 to express it with the standard set.
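The identification of R_{2} with S and R_{3} with T can be checked numerically. The following is a minimal sketch, assuming the standard definition R_{ n } = diag(1, e^{2πi/2^{n}}); the helper name `r_gate` is hypothetical and introduced only for illustration.

```python
import numpy as np

def r_gate(n: int) -> np.ndarray:
    """Assumed definition: R_n = diag(1, exp(2*pi*i / 2**n))."""
    return np.diag([1.0, np.exp(2j * np.pi / 2**n)])

# Textbook phase (S) and pi/8 (T) gates.
S = np.diag([1.0, 1j])
T = np.diag([1.0, np.exp(1j * np.pi / 4)])

assert np.allclose(r_gate(2), S)  # R_2 is the S gate
assert np.allclose(r_gate(3), T)  # R_3 is the T gate

# Two R_{n+1} gates in sequence give one R_n gate.
assert np.allclose(r_gate(5) @ r_gate(5), r_gate(4))
```

The last assertion shows why the rotation angle halves with each increment of n, which is the property the controlled-R_{ n } decompositions below rely on.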
To approximate the R_{ n } gate, we use the gridsynth method^{9}. Given a precision ε > 0, approximating an R_{ n } gate means finding an operator U expressible in terms of H, S, T and Pauli operators such that
\[\Vert R_{n}-U\Vert \le \varepsilon ,\]
where the norm is the operator norm.
The gridsynth algorithm^{9} produces an efficient approximation of an R_{ n } gate in a probabilistic manner, so we estimate the average number of gates it requires. From Table 1, we take the average number of gates for an approximation of an R_{ n } gate to be 127, 253 and 379 for ε = 10^{−5}, 10^{−10} and 10^{−15}, respectively. Note that the average number of gates is independent of the rotation angle.
Zero ancillary qubit method (Method 1)
A controlled-R_{ n } gate is defined as follows:
\[\text{controlled-}R_{n}=\begin{pmatrix}1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & e^{2\pi i/2^{n}}\end{pmatrix}\]
Figure 1 shows the circuit of the controlled-R_{ n } gate with 2 CNOTs, 2 R_{n+1}s and 1 \({R}_{n+1}^{\dagger }\) gate. This is a well-known and fundamental method for the decomposition of a controlled-R_{ n }^{10}. When we approximate the controlled-R_{ n } with precision 10^{−10}, the total number of gates is 761 on average, from Table 1. Thus, the approximation of one controlled-R_{ n } requires an excessive number of gates.
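The Method 1 gate counts can be checked with a small numerical sketch. It assumes the standard definitions R_{ n } = diag(1, e^{2πi/2^{n}}) and controlled-R_{ n } = diag(1, 1, 1, e^{2πi/2^{n}}), and uses one gate ordering that is consistent with the counts quoted above; the ordering in Fig. 1 may differ. The function names are hypothetical.

```python
import numpy as np

def r_gate(n):
    """Assumed definition: R_n = diag(1, exp(2*pi*i / 2**n))."""
    return np.diag([1.0, np.exp(2j * np.pi / 2**n)])

def controlled_r(n):
    """Assumed definition: phase exp(2*pi*i / 2**n) on |11> only."""
    return np.diag([1.0, 1.0, 1.0, np.exp(2j * np.pi / 2**n)])

# CNOT with the first (most significant) qubit as control.
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)
I2 = np.eye(2, dtype=complex)

def method1(n):
    """2 CNOTs, 2 R_{n+1} and 1 R_{n+1}^dagger (one possible ordering)."""
    r = r_gate(n + 1)
    r_dag = r.conj().T
    # Matrix products act right to left: R_{n+1} on the target, CNOT,
    # R_{n+1}^dagger on the target, CNOT, R_{n+1} on the control.
    return (np.kron(r, I2) @ CNOT @ np.kron(I2, r_dag)
            @ CNOT @ np.kron(I2, r))

def method1_gate_count(c):
    """Three approximated R-type gates of c gates each, plus 2 CNOTs."""
    return 3 * c + 2

for n in range(2, 10):
    assert np.allclose(method1(n), controlled_r(n))

print(method1_gate_count(253))  # 761, the average quoted for precision 1e-10
```

The count 3c + 2 makes the cost of this method explicit: every one of the three R-type gates has to be approximated separately.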
One ancillary qubit method (Method 2)
Figure 2 shows the circuit of the controlled-R_{ n } gate using a single ancillary qubit. The circuit consists of 1 R_{ n }, 16 CNOTs, 4 Hs, 8 Ts and 6 T^{†}s.
As noted in ref.^{8}, one advantage of such a circuit is that it reduces the depth with only a small constant overhead. As mentioned earlier, R_{ n } and R_{n+1} require many gates, depending on the precision. For a precision of 10^{−10}, R_{ n } and R_{n+1} both require approximately 253 gates. Therefore, the approach in which a single ancillary qubit is employed appears to be beneficial.
We note that ref.^{11} offers an approach to implementing a controlled-U operation using an ancillary qubit containing an eigenstate of U. However, in this paper, we focus only on an approach using the \(|0\rangle\) state as the ancillary qubit. Thus, we have not considered the decomposition of the controlled-R_{ n } gate using the approach of ref.^{11}. As future work, we will analyze the decomposition of a controlled-U operation.
Controlled-T decomposition-based method (Method 3)
The previously known efficient decomposition of a controlled-T is shown in ref.^{12}. We observe that the middle T gate in ref.^{12} can be replaced with an R_{ n } gate. In this case, the controlled-R_{ n } gate can be decomposed into 4 Hadamard gates, 2 phase gates, 12 CNOT gates, 8 T gates, and 1 R_{ n } gate. This result is the best known to date and is the same as in ref.^{13}. If we use two ancillary qubits, the T-depth of the controlled-T decomposition can be reduced from 5 to 3^{13}. However, if we consider only one ancillary qubit, T-depth 5 and T-count 9 are the best results for the decomposition of the controlled-T gate.
Results
In this work, we improve the previous method in three respects: reducing the total number of gates, achieving an efficient layout, and achieving a smaller depth.
Smaller number of total gates (Improvement 1)
We propose an improvement whereby the controlled-R_{ n } consists of fewer gates in total while keeping a single R_{ n } gate.
Theorem 1.
The controlled-R_{ n } gate can be decomposed with at most one ancillary state \(|0\rangle\) into one R_{ n }, eight CNOTs, four Hs, four Ts and four T^{†}s.
The proof is given in Section Proofs. The corresponding decomposition is shown in Fig. 3.
The advantage of the proposed method is shown in Table 2. The data were estimated with the ScaffCC program^{14}. In particular, in the case of a controlled-T, using an ancillary qubit results in an exact decomposition of the controlled-R_{ n } rather than an approximation. Thus, the gap between Method 1 and Improvement 1 is larger. Method 3 is more efficient than Method 2 for the decomposition of a controlled-T. However, it consists of 12 CNOTs, 4 Hs, 1 P, 1 P^{†}, 5 Ts and 4 T^{†}s. That decomposition includes 27 gates, whereas our decomposition includes only 21 gates. In more detail, the T-count is the same for ref.^{12} and our method, and the advantage of our method is a reduction by 4 CNOT gates and 2 phase gates. The reduction in CNOT gates is important since CNOT gates are physically difficult to implement and the controlled-R_{ n } is a building block rather than a complete algorithm^{15,16}. Thus, its impact on quantum algorithms will be large. For example, according to the module count analysis of the ScaffCC program^{14} for Shor's algorithm, the controlled-T gate is used 641,990,656 times in total. This means that saving 6 gates in the decomposition of the controlled-T gate saves 3,851,943,936 gates in the computation of Shor's algorithm.
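The arithmetic behind these figures can be reproduced with a short tally. This sketch only restates the gate counts and the ScaffCC module count quoted above; the dictionary and variable names are hypothetical.

```python
# Gate tallies for a controlled-T, as quoted in the text.
METHOD_3 = {"CNOT": 12, "H": 4, "P": 1, "P_dag": 1, "T": 5, "T_dag": 4}
# For a controlled-T, the single R_n of Theorem 1 is R_3 = T, i.e. one gate.
IMPROVEMENT_1 = {"R_n": 1, "CNOT": 8, "H": 4, "T": 4, "T_dag": 4}

method3_total = sum(METHOD_3.values())
improvement1_total = sum(IMPROVEMENT_1.values())
saved_per_gate = method3_total - improvement1_total

# Number of controlled-T gates reported for Shor's algorithm.
controlled_t_calls = 641_990_656
total_saved = saved_per_gate * controlled_t_calls

print(method3_total, improvement1_total, saved_per_gate)  # 27 21 6
print(f"{total_saved:,}")  # 3,851,943,936
```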
Efficient layout (Improvement 2)
For practical quantum computing, we should consider the layout of quantum circuits. Since nonlocal two-qubit gate operations are not allowed in general, a long-range CNOT gate is implemented with the help of several adjacent SWAP gates. In the following theorem, we present an efficient decomposition of a controlled-R_{ n } gate without using nonlocal CNOT gates.
Theorem 2.
A controlled-R_{ n } gate can be implemented under the nearest-neighbor-interaction-only architecture with at most one ancillary state \(|0\rangle\) using one R_{ n }, twelve adjacent CNOTs, four Hs, four Ts and four T^{†}s.
The proof is given in Section Proofs. The corresponding circuit is shown in Fig. 4. Let us consider one long-range CNOT gate, where the control qubit is the first qubit and the target qubit is the third qubit. Naively, we can decompose such a CNOT gate into one adjacent CNOT gate and two SWAP gates. Each SWAP gate can be decomposed into three CNOT gates. Thus, the long-range CNOT can be implemented with 7 CNOT gates. More efficiently, the long-range CNOT can be implemented with only 4 CNOT gates^{17}. Thus, Method 2 consists of 1 R_{ n }, 28 adjacent CNOTs, 4 Hs, 8 Ts and 6 T^{†}s, while Improvement 2 consists of 1 R_{ n }, 12 adjacent CNOTs, 4 Hs, 4 Ts and 4 T^{†}s. Therefore, our method uses 16 fewer CNOT gates, 4 fewer T gates and 2 fewer T^{†} gates.
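Both long-range CNOT constructions mentioned above can be verified on three qubits in a line. The sketch below uses a hypothetical helper `cnot(control, target)` with 0-based qubit indices, so the first and third qubits of the text are qubits 0 and 2. The 4-CNOT sequence is the standard one and is assumed, not confirmed, to be the construction meant by ref.^{17}.

```python
import numpy as np

def cnot(control, target, n_qubits=3):
    """Permutation matrix of a CNOT; qubit 0 is the most significant bit."""
    dim = 2 ** n_qubits
    u = np.zeros((dim, dim), dtype=int)
    for basis in range(dim):
        bits = [(basis >> (n_qubits - 1 - q)) & 1 for q in range(n_qubits)]
        if bits[control]:
            bits[target] ^= 1
        out = int("".join(map(str, bits)), 2)
        u[out, basis] = 1
    return u

long_range = cnot(0, 2)  # control: first qubit, target: third qubit

# Naive version: SWAP the last two qubits, apply an adjacent CNOT, SWAP back.
# Each SWAP is three adjacent CNOTs, so the total is 3 + 1 + 3 = 7 CNOTs.
swap_12 = cnot(1, 2) @ cnot(2, 1) @ cnot(1, 2)
naive = swap_12 @ cnot(0, 1) @ swap_12

# Efficient version: four adjacent CNOTs (matrix products act right to left).
efficient = cnot(1, 2) @ cnot(0, 1) @ cnot(1, 2) @ cnot(0, 1)

assert np.array_equal(naive, long_range)
assert np.array_equal(efficient, long_range)

# With 4 adjacent CNOTs per long-range CNOT, the 28 adjacent CNOTs quoted for
# Method 2 are consistent with 4 of its 16 CNOTs being long-range.
assert (16 - 4) + 4 * 4 == 28
```

Replacing each long-range CNOT costs three extra CNOTs in the efficient version, which is why a circuit designed with only adjacent CNOTs from the start, as in Theorem 2, can save so many gates.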
Smaller depth (Improvement 3)
The depth of a circuit is the length of the critical path of the circuit. To ensure an efficient run time on a practical quantum computer, the depth of a circuit should be minimized. For this purpose, we propose a circuit with a smaller depth for the controlled-R_{ n }.
Theorem 3.
While maintaining the R_{ n }-type gate depth 1, the controlled-R_{ n } can be implemented with at most one ancillary state \(|0\rangle\) with a depth of 5 gates in \(\{\text{adjacent CNOT},R_{n+1},R_{n+1}^{\dagger}\}\).
The proof is given in Section Proofs. The corresponding circuit is shown in Fig. 5. Method 2 for the controlled-R_{ n } has a depth of 25, while this circuit only has a depth of 5. Although Method 1 only has a depth of 4, its depth after the approximation of the R_{ n }-type gates is nearly twice that of Improvement 3.
We note that, from Fig. 8(a) in ref.^{12}, the controlled-S gate can be decomposed with a depth of 5. However, that decomposition uses two long-range CNOTs. Thus, in order to represent the controlled-S gate with only adjacent CNOTs and R_{ n }-type gates, the long-range CNOTs should be transformed into several adjacent CNOTs, or the layout of the qubits should be changed. That is, more resources are required than in the method of Fig. 5. According to the module count analysis of the ScaffCC program^{14} for Shor's algorithm, the controlled-S gate is used 641,013,760 times in total. This means that reducing the depth of the controlled-S decomposition by one step affects 641,013,760 gate invocations in Shor's algorithm.
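To illustrate how the resource bounds of Theorem 3 can be met, the sketch below builds one circuit with depth 5 and R_{ n }-type depth 1. It assumes the ancilla sits between the control and the target on a line and that R_{ n } = diag(1, e^{2πi/2^{n}}). It is an illustrative construction consistent with the theorem statement and is not claimed to be the circuit of Fig. 5. All function names are hypothetical.

```python
import numpy as np

def r_gate(n):
    """Assumed definition: R_n = diag(1, exp(2*pi*i / 2**n))."""
    return np.diag([1.0, np.exp(2j * np.pi / 2**n)])

def controlled_r(n):
    return np.diag([1.0, 1.0, 1.0, np.exp(2j * np.pi / 2**n)])

def cnot(control, target, n_qubits=3):
    """Permutation matrix of a CNOT; qubit 0 is the most significant bit."""
    dim = 2 ** n_qubits
    u = np.zeros((dim, dim), dtype=complex)
    for basis in range(dim):
        bits = [(basis >> (n_qubits - 1 - q)) & 1 for q in range(n_qubits)]
        if bits[control]:
            bits[target] ^= 1
        u[int("".join(map(str, bits)), 2), basis] = 1
    return u

def layer(gates):
    """One time step: a single-qubit gate on every qubit, qubit 0 first."""
    out = np.eye(1, dtype=complex)
    for g in gates:
        out = np.kron(out, g)
    return out

def depth5_circuit(n):
    """Qubit order: 0 = control, 1 = ancilla (starts in |0>), 2 = target."""
    r = r_gate(n + 1)
    # Steps 1-2 write (control XOR target) into the ancilla, step 3 applies
    # all R-type gates in parallel, steps 4-5 reset the ancilla to |0>.
    rotations = layer([r, r.conj().T, r])
    return cnot(0, 1) @ cnot(2, 1) @ rotations @ cnot(2, 1) @ cnot(0, 1)

ANC0 = [0b000, 0b001, 0b100, 0b101]  # basis states with ancilla = 0
ANC1 = [0b010, 0b011, 0b110, 0b111]  # basis states with ancilla = 1

for n in range(2, 10):
    u = depth5_circuit(n)
    assert np.allclose(u[np.ix_(ANC0, ANC0)], controlled_r(n))
    assert np.allclose(u[np.ix_(ANC1, ANC0)], 0)  # ancilla returns to |0>

# Rough depth after approximation, assuming each R-type layer costs c steps.
c = 253
print(2 * c + 2, c + 4)  # Method 1: 508, this circuit: 257
```

The circuit works because the accumulated phase on basis state |a, b⟩ is φ(a + b − (a ⊕ b)) = 2φab with φ = 2π/2^{n+1}, which equals the controlled-R_{ n } phase. Under the stated assumption, the ratio 508/257 matches the statement that Method 1 is nearly twice as deep once the R_{ n }-type gates are approximated.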
Efficient decomposition of the quantum Fourier transform
The quantum Fourier transform (QFT) is the key ingredient for quantum factoring and many other quantum algorithms^{2}. The total number of gates of the QFT for n qubits (denoted QFT(n)) is obtained as
Now, we compare the total number of gates for the QFT by applying each decomposition method. QFT_{M1}(n), QFT_{M2}(n), QFT_{M3}(n) and QFT_{I1}(n) denote the total number of gates by Method 1, Method 2, Method 3 and Improvement 1, respectively, as follows:
where c denotes the average number of gates over 10,000 runs for an approximation of R_{ n } with angle π/2^{n−1} at the precision given in Table 1. For example, if the precision is ε = 10^{−10}, then c = 253. Thus, the benefit of Improvement 1 over Method 3 is obtained as
for n. In this paper, we only consider the error rate in the approximation of the R_{ n } gate, not the overall error rate in the approximation of the QFT. However, we can notice that Method 3 and Improvement 1 have the same number of R_{ n } gates, and Improvement 1 has a smaller total number of gates than Method 3. Thus, the overall error rate in the approximation of the QFT for Improvement 1 might be no greater than that for Method 3. From Table 3 and Equations (5–8) above, it is shown that Improvement 1 is more efficient than Method 1, Method 2 and Method 3.
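The comparison can be explored with a rough gate-count model. The sketch below is not a reproduction of the equations of this section. It assumes the textbook QFT circuit with n Hadamard gates and n − k + 1 copies of controlled-R_{ k } for k = 2, …, n, no final SWAP gates, a cost of one gate for R_{2} and R_{3}, and c gates for every R_{ k } with k ≥ 4. The per-gate overheads are taken from the gate counts quoted for each method; all names are hypothetical.

```python
def rotation_cost(k, c):
    """Assumed cost of R_k: exact for k <= 3 (S or T), else c gates."""
    return 1 if k <= 3 else c

# Gates other than the single R_k in each one-ancilla decomposition:
# Method 2: 16 CNOT + 4 H + 8 T + 6 T_dag; Method 3: 4 H + 2 P + 12 CNOT + 8 T;
# Improvement 1: 8 CNOT + 4 H + 4 T + 4 T_dag.
OVERHEAD = {"M2": 34, "M3": 26, "I1": 20}

def qft_gate_count(n, method, c=253):
    """Total gates of an n-qubit QFT under the assumptions stated above."""
    total = n  # one Hadamard per qubit
    for k in range(2, n + 1):
        copies = n - k + 1
        if method == "M1":
            # Method 1: 2 CNOTs and three R_{k+1}-type gates.
            per_gate = 2 + 3 * rotation_cost(k + 1, c)
        else:
            per_gate = OVERHEAD[method] + rotation_cost(k, c)
        total += copies * per_gate
    return total

for n in (4, 8, 16, 32):
    counts = {m: qft_gate_count(n, m) for m in ("M1", "M2", "M3", "I1")}
    print(n, counts)

# In this model, Improvement 1 saves 6 gates per controlled rotation over
# Method 3, i.e. 3 * n * (n - 1) gates in total, independent of c.
for n in range(1, 40):
    saving = qft_gate_count(n, "M3") - qft_gate_count(n, "I1")
    assert saving == 3 * n * (n - 1)
```

Under these assumptions the saving of Improvement 1 over Method 3 does not depend on the precision, since both methods use exactly one R_{ k } gate per controlled rotation.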
Discussion
We have investigated the decomposition problem for the controlled-R_{ n } gate since it is an important two-qubit gate. One method has been proposed that utilizes a single ancillary qubit to reduce the number of gates. In this work, we have extended this method for three purposes: to reduce the number of gates, to find a good mapping for an architecture with only nearest-neighbor interactions, and to minimize the critical path. Specifically, we have confirmed that the proposed method reduces the number of gates for the quantum Fourier transform.
As future work, we will consider three issues. First, we need to check whether the proposed methods are optimal. In addition, it would be interesting to investigate how much performance gain is possible for quantum algorithms such as Shor’s factoring algorithm, since it relies heavily on the quantum Fourier transform. For more general situations, we need to develop a decomposition method for controlled multi-qubit unitary transforms.
Proofs
Proof of Theorem 1.
Proof.
Let \(|\psi\rangle\) be an arbitrary two-qubit state. Then, \(|\psi\rangle\) can be represented as
where α_{ i } are complex numbers and \(\sum_{i=00}^{11}|\alpha_{i}|^{2}=1\). Thus,
Let a unitary operator U be an operator denoted by
where C_{ ij } denotes a CNOT gate with control qubit i and target qubit j. Then,
Thus,
Proof of Theorem 2.
Proof.
Let \(|\psi\rangle\) be an arbitrary two-qubit state. Then, \(|\psi\rangle\) can be represented as
where α_{ i } are complex numbers and \(\sum_{i=00}^{11}|\alpha_{i}|^{2}=1\). Let a unitary operator U be the operator denoted by
where C_{ ij } denotes a CNOT gate with control qubit i and target qubit j. Then,
Thus,
Proof of Theorem 3.
Proof.
Let \(|\psi\rangle\) be an arbitrary two-qubit state. Then, \(|\psi\rangle\) can be represented as
where α_{ i } are complex numbers and \(\sum_{i=00}^{11}|\alpha_{i}|^{2}=1\). Then,
where C_{ ij } denotes a CNOT gate with control qubit i and target qubit j.
Thus,
References
Preskill, J. Quantum Computing in the NISQ era and beyond, arXiv:1801.00862, 2018.
Nielsen, M. and Chuang, I. Quantum Computation and Quantum Information, Cambridge University Press, 2000.
Shor, P. W. Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer. SIAM J. Comput. 26, 1484–1509 (1997).
Kliuchnikov, V., Maslov, D. & Mosca, M. Practical Approximation of Single-Qubit Unitaries by Single-Qubit Quantum Clifford and T Circuits. IEEE Transactions on Computers 65, 161–172 (2016).
Kitaev, A. Y., Shen, A. H., and Vyalyi, M. N. Classical and Quantum Computation, ser. Graduate studies in mathematics, v. 47, Boston, MA, USA: American Mathematical Society, 2002.
Bocharov, A., Roetteler, M. & Svore, K. M. Efficient synthesis of probabilistic quantum circuits with fallback. Physical Review A 91, 052317 (2015).
Bocharov, A., Roetteler, M. & Svore, K. M. Efficient synthesis of universal repeat-until-success quantum circuits. Phys. Rev. Lett. 114, 080502 (2015).
Kliuchnikov, V., Maslov, D. & Mosca, M. Fast and efficient exact synthesis of single-qubit unitaries generated by Clifford and T gates. Quantum Information and Computation 13, 0607–0630 (2013).
Ross, N. J. & Selinger, P. Optimal ancilla-free Clifford+T approximation of Z-rotations. Quantum Information and Computation 16, 0901–0953 (2016).
Barenco, A. et al. Elementary gates for quantum computation. Physical Review A 52, 3457 (1995).
Kitaev, A. Quantum measurements and the Abelian Stabilizer Problem. quant-ph/9511026 (1995).
Amy, M., Maslov, D., Mosca, M. & Roetteler, M. A meet-in-the-middle algorithm for fast synthesis of depth-optimal quantum circuits. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 32, 818–830 (2013).
Selinger, P. Quantum circuits of T-depth one. Physical Review A 87, 042302 (2013).
Javadi-Abhari, A. et al. ScaffCC: A Framework for Compilation and Analysis of Quantum Computing Programs. ACM International Conference on Computing Frontiers (CF 2014), Cagliari, Italy, May 2014; https://github.com/epiqc/ScaffCC.
Shende, V., Bullock, S. & Markov, I. Synthesis of Quantum Logic Circuits. IEEE Transactions on Computer-Aided Design 25, 1000 (2006).
Sedlák, M. & Plesch, M. Towards optimization of quantum circuits. Central European Journal of Physics 6, 128 (2008).
Viamontes, G. F., Markov, I. L., and Hayes, J. P. Quantum Circuit Simulation, Springer, 2009.
Kliuchnikov, V. New methods for Quantum Compiling. UWSpace, http://hdl.handle.net/10012/8394 (2014).
Acknowledgements
We thank Ali Javadi-Abhari for his support in the use of the ScaffCC software. This work was supported by an Electronics and Telecommunications Research Institute (ETRI) grant funded by the Korean government [17ZH1200, Research and Development of Quantum Computing Platform and its Cost-Effectiveness Improvement].
Contributions
T. Kim wrote the manuscript, and B.S. Choi revised the manuscript. All authors reviewed the manuscript.
Ethics declarations
Competing Interests
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Kim, T., Choi, BS. Efficient decomposition methods for controlled-R_{ n } using a single ancillary qubit. Sci Rep 8, 5445 (2018). https://doi.org/10.1038/s4159801823764x