A Higher radix architecture for quantum carry-lookahead adder

In this paper, we propose an efficient quantum carry-lookahead adder based on the higher radix structure. For the addition of two n-bit numbers, our adder uses $O(n)-O(\frac{n}{r})$ qubits and $O(n)+O(\frac{n}{r})$ T gates to get the correct answer in T-depth $O(r)+O(\log \frac{n}{r})$, where r is the radix. The quantum carry-lookahead adder has already attracted some attention because of its low T-depth. Our work further reduces the overall cost by introducing a higher radix layer. An analysis of T-depth, T-count, and qubit count shows that the proposed adder is superior to existing quantum carry-lookahead adders. Even compared to the Draper out-of-place adder, which is very compact and efficient, our adder is still better in terms of T-count.


Introduction
As the field of quantum computing has gained momentum over the last few years, the need for optimizing quantum circuits has also grown. Quantum adders are among the most important basic components of quantum computing circuits. The continuous development of quantum adders has not only improved the efficiency of small basic quantum circuits such as multiplication circuits but also has a significant effect on some prominent large quantum circuits. On the one hand, an efficient quantum adder can increase the speed of quantum addition and multiplication operations and reduce the cost of the required resources. On the other hand, quantum adders are widely used in Shor's algorithm 1, which plays an important role in the field of public key cryptography. Therefore, an efficient quantum adder not only has significant financial benefits but also makes a crucial contribution to the development of quantum computing.
Even though adders have been analyzed extensively in classical computing, we observe that this niche is still not properly studied in the quantum paradigm. In the field of quantum adders, the quantum ripple carry adder (RCA) 2,3 was proposed first. However, the T-depth of quantum RCAs increases linearly with the number of input qubits, which means they need a long time to perform the operation. Quantum carry-lookahead adder (CLA) designs such as Draper's logarithmic adder 4, whose T-depth increases logarithmically with the number of input qubits, were then proposed to obtain further efficiency gains. In the field of classical adders, Gurkaynak et al. 5 found that increasing the radix of the CLA can effectively decrease its computation time. However, to the best of our knowledge, a higher radix implementation of the quantum CLA has not been explored yet.
The objective of this work is to explore the potential of the higher radix strategy in improving the performance of quantum arithmetic circuits. As we will see later, high fan-out is a challenge for quantum adders. Using the idea of separating the propagation and summation in the Manchester Carry Chain (MCC), we avoid this problem and propose an innovative quantum higher radix adder. Specifically, this circuit can be divided into two parts. Firstly, in the higher radix part, we use Gidney's Logical-And 6 and Selinger's multi-control Toffoli construction method 7 to implement a general quantum higher radix circuit. Secondly, in the MCC part, we choose the Brent-Kung structure as the carry path and Gidney's RCA as the sum path after carefully analysing the candidate carry propagation structures and sum paths. By integrating the quantum higher radix and MCC parts, we obtain a quantum higher radix adder.
This work improves the efficiency of quantum circuits at different scales by proposing an innovative higher radix adder circuit. We hope that our paper contributes to a deeper understanding of the potential of multi-control Toffoli gates and the higher radix strategy in improving the performance of quantum arithmetic circuits.
The remainder of this paper is organized as follows. Section 2 introduces prominent previous research works. Section 3 describes the implementation details of the higher radix layer and the whole structure of the quantum higher radix adder. Section 4 presents the evaluation results. Section 5 concludes the paper, and additional information, discussion, and examples can be found in the three subsequent appendices.

Method
In this section, we first introduce the $C^n$NOT gate, which is an important basic component of the quantum higher radix adder. Our method is then divided into three steps. In the first step, we describe how to construct the higher radix layer based on the $C^n$NOT gate. In the second and third steps, the Brent-Kung structure and Gidney's RCA are chosen as the carry path and sum path by analyzing five carry-propagate structures and two sum structures, respectively. At the end of this section, we describe how to construct the overall higher radix circuit in detail.

Basic Component: $C^n$NOT
$C^n$NOT is a basic component of the quantum higher radix adder. In order to construct a cheap $C^n$NOT gate, we use and optimize Gidney's Logical-And structure.

In Gidney's paper 6, the first formula in Equation (1) is used to define the special state |T⟩ in the Logical-And structure (Figure 1(a)). According to it, we can apply a Hadamard gate first and then a T gate on an ancilla in state |0⟩ to obtain |T⟩. However, we found that it is also possible to use an ancilla in state |1⟩ instead of |0⟩ to construct this structure. The related formula is shown in the lower part of Equation (1). In brief, we can perform operations such as NOT before the Logical-And structure, which can be used to reduce the qubits required for our quantum adder in later sections. After expanding the scope of application of Logical-And, we then describe the specific structure and decomposition method of the proposed multi-control Toffoli gate.
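As a quick numerical check of the state preparation described above, the following sketch applies H and then T to |0⟩ and to |1⟩ using plain Python complex arithmetic. The helper names are illustrative, and the amplitudes for the |1⟩ variant follow from the standard H and T matrices rather than being copied from Equation (1).

```python
import cmath
import math

# Standard single-qubit gates as 2x2 matrices (row-major lists).
H = [[1 / math.sqrt(2), 1 / math.sqrt(2)],
     [1 / math.sqrt(2), -1 / math.sqrt(2)]]
T = [[1, 0],
     [0, cmath.exp(1j * math.pi / 4)]]


def apply(gate, state):
    """Multiply a 2x2 gate by a 2-component state vector."""
    return [gate[0][0] * state[0] + gate[0][1] * state[1],
            gate[1][0] * state[0] + gate[1][1] * state[1]]


def prepare(basis_bit):
    """Apply H, then T, to the computational basis state |basis_bit>."""
    state = [1, 0] if basis_bit == 0 else [0, 1]
    return apply(T, apply(H, state))


t_from_zero = prepare(0)  # (|0> + e^{i*pi/4} |1>) / sqrt(2)
t_from_one = prepare(1)   # (|0> - e^{i*pi/4} |1>) / sqrt(2)
```

The two states differ only in the sign of the |1⟩ amplitude, which is consistent with the remark that a NOT may precede the Logical-And structure.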
As shown in Table 1, we found that as the number of control qubits increases, the $C^n$NOT gate can effectively reduce the average T-count, T-depth, and qubit count (QC) per control qubit. In order to reduce the cost required by the circuit, we use multi-control Toffolis in this paper. In 2013, Selinger 7 proposed a general method for constructing $C^n$NOT gates using Clifford gates and T gates. This paper optimizes Selinger's general method by referring to the Logical-And structure 6 and another Toffoli decomposition method 12. Figure 2 shows the specific structure of our multi-control Toffoli gate. Based on the Logical-And structure, we take $C^3$NOT as an example to show how to construct multi-control Toffoli gates. In Figure 2(a), we show how to construct an unpaired $C^3$NOT gate with Clifford + T gates. For a pair of $C^3$NOT gates, Figures 2(b) and 2(c) introduce the computation and uncomputation circuits, respectively. In each of Figures 2(a), 2(b) and 2(c), the first circuit from the left is the original design, the second circuit shows how to decompose it using Toffolis, and the third circuit uses the Logical-And structure to decompose the original circuit.
Specifically, using the computation and uncomputation structures of Logical-And to decompose all Toffoli gates except the middle one effectively reduces the T-count and T-depth of our multi-control Toffolis. As a result, the efficiency of the whole circuit is increased.
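The decomposition idea above can be sketched at the bit level. The following is a minimal classical model, assuming a simple ladder layout with n − 2 clean ancillas (an assumption made here for illustration, not necessarily the exact layout of Figure 2). The T-count helper assumes 4 T gates per Logical-And computation, none for its measurement-based uncomputation, and 7 T gates for the middle unpaired Toffoli; the resulting numbers are illustrative and are not taken from Table 1.

```python
def toffoli(bits, c1, c2, t):
    """Classical action of a Toffoli gate: flip bits[t] iff both controls are 1."""
    bits[t] ^= bits[c1] & bits[c2]


def cn_not(bits, controls, target, ancillas):
    """Flip bits[target] iff all control bits are 1 (hypothetical ladder layout).

    Ancillas must start at 0; they hold ANDs of growing control prefixes and
    are returned to 0 by the uncomputation pass.
    """
    n = len(controls)
    assert n >= 2 and len(ancillas) >= n - 2
    ladder = []
    prev = controls[0]
    for k in range(n - 2):
        ladder.append((prev, controls[k + 1], ancillas[k]))
        prev = ancillas[k]
    for gate in ladder:                        # paired ANDs: computation
        toffoli(bits, *gate)
    toffoli(bits, prev, controls[-1], target)  # middle, unpaired Toffoli
    for gate in reversed(ladder):              # paired ANDs: uncomputation
        toffoli(bits, *gate)


def t_count_cn_not(n, t_and=4, t_toffoli=7):
    """T-count of the ladder under the assumptions stated in the lead-in."""
    return t_and * (n - 2) + t_toffoli
```

For example, `cn_not([1, 1, 1, 0, 0], [0, 1, 2], 3, [4])` flips the target bit at index 3 and leaves the ancilla at index 4 equal to 0.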
As shown in Figure 3, there are five general decomposition methods for the middle unpaired Toffoli gate 7,12. According to Table 2, all decomposition methods use 7 T gates. Since the proposed higher radix adder already has a high QC, we do not wish to introduce more ancilla qubits by using decomposition methods that require them. Although Method 5 decomposes one unpaired Toffoli with a T-depth of only 1, it introduces 4 additional ancillae, which would greatly increase the number of qubits needed to construct a CLA, so we do not choose this decomposition method. Similarly, Method 4 also introduces extra ancilla qubits. Among Methods 1, 2, and 3, Method 3 is the most efficient because it has the smallest T-depth. Hence, we choose Method 3 to decompose all the unpaired Toffolis in our work.

Step 1: Higher Radix Layer
In this section, we introduce the higher radix strategy and the implementation details of how to apply it to quantum circuits. The radix is the bit-width of each CLA block in a carry-lookahead adder. For convenience, we use r to represent the radix. In the classical world, the higher radix adder has shown its advantages 5. By increasing the radix, an adder with higher fan-in and fan-out is constructed, which reduces the propagation time of the propagates (p) and generates (g) and improves the efficiency of the classical adder. We first take the radix-2 CLA as an example to calculate the sum of two binary numbers a and b. We obtain the carry c in two steps. In the first step, we calculate the p and g of each bit. For the i-th bit, we use the formulas 1 and 2 to compute the corresponding $p_i$ and $g_i$, respectively. In the second step, after propagating p and g to obtain the intermediate quantities P and G, we can calculate the carry c. Here, we assume that j is less than or equal to i. Equations (4), (5) and (6) describe how to propagate p and g from bit i to bit j. Using formula (7), we then obtain the final carry c. We use the standard notation: ⊕ for logical XOR, · for logical AND, + for logical OR, and ∘ for the propagation operator.
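The two-step carry calculation can be illustrated with a small classical sketch. It assumes the textbook definitions $p_i = a_i \oplus b_i$ and $g_i = a_i \cdot b_i$ and the usual propagation operator (P, G) ∘ (P', G') = (P · P', G + P · G'); the function names are illustrative and the input carry is taken to be 0.

```python
def bits_of(x, n):
    """Little-endian list of the n low bits of x."""
    return [(x >> i) & 1 for i in range(n)]


def propagate_generate(a_bits, b_bits):
    """Step 1: per-bit propagate p_i = a_i XOR b_i and generate g_i = a_i AND b_i."""
    p = [a ^ b for a, b in zip(a_bits, b_bits)]
    g = [a & b for a, b in zip(a_bits, b_bits)]
    return p, g


def combine(high, low):
    """Propagation operator: (P, G) of a span from its upper and lower parts."""
    p_hi, g_hi = high
    p_lo, g_lo = low
    return (p_hi & p_lo, g_hi | (p_hi & g_lo))


def carries(a, b, n):
    """Step 2: return p and [c_0, ..., c_n], where c_{i+1} is the carry out of bit i."""
    p, g = propagate_generate(bits_of(a, n), bits_of(b, n))
    c = [0]
    span = None
    for i in range(n):
        span = (p[i], g[i]) if span is None else combine((p[i], g[i]), span)
        c.append(span[1])  # with c_0 = 0, the carry out of bit i equals G over bits i..0
    return p, c


def add(a, b, n):
    """Sum bits are s_i = p_i XOR c_i; the top carry becomes bit n."""
    p, c = carries(a, b, n)
    s = [p[i] ^ c[i] for i in range(n)] + [c[n]]
    return sum(bit << i for i, bit in enumerate(s))
```

The loop applies the operator sequentially; a lookahead adder evaluates the same operator in a tree, which is what the carry path later in this section is about.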
As shown in Figure 4, we use multi-control Toffoli gates to construct the quantum propagation structures with radix 2, 3, and 4, respectively. As in the classical case, the higher radix strategy reduces the number of propagation layers of p and g, thereby reducing the T-count required for the addition and the operation time. However, due to the extremely high cost of fan-out larger than 2, the proposed quantum higher radix strategy only focuses on increasing the fan-in of the adder. To construct a quantum higher radix adder, we therefore need a higher radix layer based on multi-control Toffolis for the propagation of p and g. Figures 4(b) and 4(c) show the specific circuits for the higher radix structure with radix 3 and radix 4, respectively.
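To make the role of the multi-input AND terms concrete, the sketch below collapses per-bit p and g into one (P, G) pair per radix-r group. It is a classical model with illustrative function names; the expanded form shows the product terms that a multi-control Toffoli would compute, under the assumption that each term is the AND of one generate bit with all higher propagate bits in the group.

```python
def group_pg(p, g, r):
    """Collapse little-endian per-bit (p, g) lists into per-group (P, G) pairs."""
    groups = []
    for start in range(0, len(p), r):
        p_grp, g_grp = 1, 0
        for i in range(start, min(start + r, len(p))):
            g_grp = g[i] | (p[i] & g_grp)  # a generate survives only if higher bits propagate
            p_grp = p_grp & p[i]           # the group propagates only if every bit does
        groups.append((p_grp, g_grp))
    return groups


def expanded_generate(p, g):
    """Group generate of a single group written as an OR of multi-input AND terms."""
    total = 0
    for i in range(len(g)):
        term = g[i]
        for p_j in p[i + 1:]:
            term &= p_j  # one extra control per higher bit in the group
        total |= term
    return total
```

For a radix-3 group this expands to $g_2 + p_2 g_1 + p_2 p_1 g_0$, so the fan-in of the AND terms grows with r while the fan-out of each input does not need to.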

Step 2: Carry Path
In this section, we describe the details of our carry path. Through the calculation in the previous section, we have already obtained p and g for each bit. In order to get the carries, we need to select a particular carry-propagate structure as the carry path to propagate p and g. As shown in Figure 6, five different propagation structures have been proposed in the literature; they are discussed below.
• Sklansky. In 1960, J. Sklansky 13 proposed a conditional CLA adder with high fan-out nodes and minimal depth. Its structure is shown in Figure 6(a).
• Kogge-Stone. The Kogge-Stone structure 14, published in 1973, has a low depth but a high number of nodes. The structure is shown in Figure 6(b).
• Ladner-Fischer. As shown in Figure 6(a), the topology of Ladner-Fischer 15 looks the same as the Sklansky structure. Hence, it also has low depth but high fan-out nodes. However, there are some differences between these two structures in application.

• Brent-Kung. The Brent-Kung structure 8 is one of the most important propagation structures. Compared to other structures, it has a very small number of nodes as well as low fan-in and fan-out, despite having a large logic depth. Therefore, it is widely used in quantum CLA designs.
• Han-Carlson. The Han-Carlson structure was first proposed in 1987 16. In order to improve the overall efficiency of the propagation, it combines the Brent-Kung and Kogge-Stone structures.
The propagation operations are the main cost of the carry path. Furthermore, since qubits cannot be copied, carry-propagate structures with fan-out larger than 2 introduce additional cost. Therefore, the Brent-Kung structure, which has the smallest number of propagate operations and low fan-out among the structures in Figure 6, is selected as the carry path for the p and g propagation in our paper.
Figure 6. Carry-propagate structures, where the green node represents one propagate operation. (a) Sklansky and Ladner-Fischer.
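The trade-off between node count and depth can be checked with a small classical sketch that builds Brent-Kung and Kogge-Stone prefix networks as lists of levels and verifies that each computes all prefixes. The function names are illustrative, n is assumed to be a power of two, and the counts are numbers of propagate nodes in the classical graphs, not quantum gate counts.

```python
def brent_kung_levels(n):
    """Levels of (target, source) pairs for an n-input Brent-Kung network."""
    levels = []
    d = 1
    while d < n:  # up-sweep: build spans of width 2, 4, 8, ...
        levels.append([(i, i - d) for i in range(2 * d - 1, n, 2 * d)])
        d *= 2
    d //= 4
    while d >= 1:  # down-sweep: fill in the remaining prefixes
        levels.append([(i + d, i) for i in range(2 * d - 1, n - d, 2 * d)])
        d //= 2
    return levels


def kogge_stone_levels(n):
    """Levels of (target, source) pairs for an n-input Kogge-Stone network."""
    levels = []
    d = 1
    while d < n:
        levels.append([(i, i - d) for i in range(d, n)])
        d *= 2
    return levels


def run(levels, n):
    """Track which bit span (low, high) each position covers after every level."""
    spans = [(i, i) for i in range(n)]
    for level in levels:
        old = list(spans)  # all nodes of a level read the previous level's values
        for target, source in level:
            lo_s, hi_s = old[source]
            lo_t, hi_t = old[target]
            assert hi_s + 1 == lo_t  # the combined spans must be adjacent
            spans[target] = (lo_s, hi_t)
    return spans


def stats(levels):
    """Return (number of propagate nodes, depth in levels)."""
    return sum(len(level) for level in levels), len(levels)
```

For 16 inputs, `stats(brent_kung_levels(16))` gives 26 nodes at depth 7, while `stats(kogge_stone_levels(16))` gives 49 nodes at depth 4, which illustrates why a node-count-driven choice favors Brent-Kung.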

Step 3: Sum Path
We then discuss the implementation details of the sum path. As described in the previous sections, the final sum can be calculated by feeding the carries into the sum path. For the sum path, we can choose between the Carry Select adder (CSA) and the RCA.
• RCA. The general structure of the RCA is shown in Figure 7(a). As discussed in Section 2, various quantum RCAs have been proposed so far. In this paper, we choose Gidney's adder, which has the minimum T-count and T-depth among them, as our RCA structure.
• CSA. As shown in Figure 7(b), the Carry Select adder consists of two Ripple Carry adders and one select circuit.
In the first part, we build two quantum RCAs with the same structure. Their input carry bits are set to 0 and 1, respectively. Therefore, using these sub-circuits, we obtain the sums for input carry 0 and input carry 1. In both the classical and quantum worlds, this part can be computed in parallel with the carry path, thus effectively reducing the time cost.

In the second part, we construct a select sub-circuit. Suppose we know that the real input carry is c, which can only be 0 or 1. Depending on c, the sum calculated by the corresponding RCA is then chosen as the final result. More specifically, when c is equal to 0, we choose the sum of the quantum RCA whose input carry is 0 as the final sum. Similarly, when c = 1, the final result is the sum of the quantum RCA whose input carry is 1.
It is worth noting that the quantum CSA contains expensive CSWAP gates, which can be decomposed into Clifford+T gates. As shown in Figure 8, there are 2 decomposition methods. With Method 1 12, the CSWAP gate is decomposed into a Clifford+T circuit with 7 T gates and a T-depth of 4. With Method 2 (adopted from the Quipper documentation, https://www.mathstat.dal.ca/~selinger/quipper/doc/Quipper-Libraries-GateDecompositions.html), the CSWAP gate is decomposed into a circuit with 7 T gates and a T-depth of 3.
In this paper, we use CSA1 to denote the CSA structure whose CSWAPs are decomposed by Method 1. Similarly, CSA2 represents the CSA whose CSWAPs are decomposed by Method 2.
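The two-part CSA described above can be sketched classically as follows. This is a bit-level model with illustrative function names: both candidate sums are computed independently of the real carry, and the final choice stands in for the CSWAP-based select circuit of the quantum version.

```python
def ripple_add(a_bits, b_bits, carry_in):
    """Ripple carry addition of little-endian bit lists; returns (sum bits, carry out)."""
    s, c = [], carry_in
    for a, b in zip(a_bits, b_bits):
        s.append(a ^ b ^ c)
        c = (a & b) | (c & (a ^ b))
    return s, c


def carry_select_add(a_bits, b_bits, carry_in):
    """Part 1: compute both candidate sums. Part 2: select one using the real carry."""
    sum0, cout0 = ripple_add(a_bits, b_bits, 0)  # assumes input carry 0
    sum1, cout1 = ripple_add(a_bits, b_bits, 1)  # assumes input carry 1
    if carry_in:
        return sum1, cout1
    return sum0, cout0
```

Because Part 1 does not depend on the real carry, it can run while the carry path is still being evaluated; the price is a second ripple adder plus the select step.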
After determining the design details of these sum paths, we performed a systematic analysis of their performance. As shown in Table 3, the RCA structure is always cheaper than the CSA structures in terms of T-count, T-depth and QC. Therefore, we choose Gidney's RCA as our sum path. In this paper, we only use one higher radix layer. As shown in Figure 9(a) (this diagram was inspired by https://web.stanford.edu/class/archive/ee/ee371/ee371.1066/lectures/lect_04.pdf), the classical higher radix adder applies the higher radix strategy to every layer of the Brent-Kung tree, thus reducing the depth of computation from $\log_2 n$ to $\log_r n$. In this part, we explain why we do not follow this classical approach. According to Figure 10, if the strategy is used at every layer, RCAs with large T-depth are introduced in the sum path, which deprives our higher radix adder of its significant advantage of low T-depth. Therefore, we recommend using the higher radix strategy only for the first few layers of the quantum Brent-Kung tree.

Complete circuit
The structure of the proposed quantum higher radix circuit is shown in Figure 11. It can be divided into 7 stages.
• Notations. The binary bit-width of the addends is denoted as n, and r denotes the value of the radix. The binary expansion of the number a is denoted as $a = a_{n-1} a_{n-2} \cdots a_0$, where $a_{n-1}$ is the most significant bit and $a_0$ is the least significant bit. For circuit decomposition, paired Toffoli gates are decomposed into Logical-And, while unpaired Toffoli gates are decomposed using Method 3 shown in Figure 3. Thus, $TC_3$ is equal to 7 and $TD_3$ is equal to 3.
• Step 1. In this step, our task is to calculate p and g. Since we do not need the most significant carry, no operation is performed on the most significant group. Taking $a_i$ and $b_i$ as control qubits and an ancilla with initial state |0⟩ as the controlled qubit, we apply the CCNOT gate to compute $g_i$ and store it in the corresponding ancilla. After that, we apply a CNOT gate with $a_i$ as the control and $b_i$ as the controlled qubit. As a result, the corresponding $p_i$ is stored in the corresponding $b_i$ position. Because some Toffoli gates in the whole circuit are unpaired, we decompose those Toffoli gates using Method 3.
For convenience, α is introduced to denote the addend qubits in the most significant group. This step requires $TC_3 \cdot (n-\alpha)$ T-count, $TD_3$ T-depth, and extra ancilla qubits are 3
• Step 2. In the second step, we group the initially obtained p and g by using the higher radix structure. Specifically, we construct the corresponding higher radix structure according to the method shown in Figure 4, and then apply it to the corresponding $g_i$ and $p_i$ calculated in step 1 to obtain $g_{group}$ and $p_{group}$. Since the controlled qubit of the last Toffoli will be used to store the carry later, no uncomputation is performed on it. Hence, the last Toffoli is always unpaired; we decompose it using Method 3 and decompose the rest of the Toffolis into Logical-And.
For convenience, β and ρ are introduced to represent a complex intermediate variable for constructing multi-control Toffolis and the number of groups divided, respectively. In step 2, the required T-count is ρ
• Step 3. In step 3, we construct the Brent-Kung tree using the $p_{group}$ and $g_{group}$ processed by the higher radix structure to calculate the carry path. We tried using Logical-And here, but found no benefit. Hence, here all the Toffolis are decomposed by Method 3.
In this step, the required T-count is 2 , and the number of extra ancilla qubits is 2
• Step 4. In this step we uncompute the operation of calculating the intermediate p in step 3. We repeat the calculation of all Toffolis for the intermediate variable p in the reverse order.
In step 4, the required T-count is ). Here we do not need any extra ancilla qubit.
• Step 5. Here we uncompute Step 2. We just repeat the same Toffolis from Step 2 in reverse order, except for the last one.
Since we decompose them into Logical-And structures, no additional cost is needed in this step.
• Step 6. In step 6, we restore the original binary addends a and b by applying the NOT gate and Toffoli on p and g. In order to store the corresponding $b_i$ in the corresponding qubits, we apply the CNOT gate with $a_i$ as the control qubit and $b_i$ as the controlled qubit. Since the Logical-And used previously introduces a measurement operation, we then uncompute only the Toffoli gates that are applied to the least significant qubits of each group.
In this step, the required T-count is $\rho \cdot TC_3$, and the T-depth is $TD_3$. We do not need any extra ancilla qubits.

• Step 7. In this step, we construct Gidney's RCA 6 for every group to calculate the sum.
In step 7, the required T-count is $4 \cdot (n - \frac{n}{r})$, the T-depth is r, and the number of extra ancilla qubits is $\alpha - 1 + (r-2) \cdot \rho$. An example of an addition operation performed by the higher radix adder is shown in Figure 11. We use seven colors to divide the whole circuit into step 1 to step 7 from left to right. The radix of this adder is set to 3 and the inputs (i.e., addends) are two 15-bit binary numbers denoted by a and b. By using this quantum circuit we can correctly get the sum of these two numbers. In order to show the overall structure more clearly, we use • to represent the controlled-NOT operation of the Logical-And structure in Figure 11.
Interestingly, there are two special cases. When r ≥ n, our adder is Gidney's RCA. When the radix is equal to one, our adder is a simple CLA. In summary, the overall cost of our circuit is shown below.
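The data flow of the seven steps can be summarized by a classical functional model: compute per-group (P, G), resolve the carry into each group, then ripple inside each group. The sketch below is only a bit-level illustration with hypothetical function names; it evaluates the group carries sequentially instead of with a Brent-Kung tree and does not model ancillas or uncomputation. The helper `sum_path_t_count` simply evaluates the step-7 expression $4 \cdot (n - \frac{n}{r})$ under the assumption that r divides n.

```python
def higher_radix_add(a, b, n, r):
    """Classical model of a radix-r grouped adder for two n-bit numbers."""
    a_bits = [(a >> i) & 1 for i in range(n)]
    b_bits = [(b >> i) & 1 for i in range(n)]

    # Higher radix layer and carry path: one carry into each group.
    group_carry_in = [0]
    for start in range(0, n, r):
        p_grp, g_grp = 1, 0
        for i in range(start, min(start + r, n)):
            p_i = a_bits[i] ^ b_bits[i]
            g_i = a_bits[i] & b_bits[i]
            g_grp = g_i | (p_i & g_grp)
            p_grp &= p_i
        group_carry_in.append(g_grp | (p_grp & group_carry_in[-1]))

    # Sum path: an independent ripple carry adder per group.
    total = 0
    c = 0
    for k, start in enumerate(range(0, n, r)):
        c = group_carry_in[k]
        for i in range(start, min(start + r, n)):
            s = a_bits[i] ^ b_bits[i] ^ c
            c = (a_bits[i] & b_bits[i]) | (c & (a_bits[i] ^ b_bits[i]))
            total |= s << i
    total |= c << n  # carry out of the most significant group's ripple
    return total


def sum_path_t_count(n, r):
    """Step-7 T-count 4 * (n - n / r), assuming r divides n."""
    assert n % r == 0
    return 4 * (n - n // r)
```

In this model the most significant group never needs its group carry, since its ripple produces the final carry directly, and for r ≥ n the whole computation degenerates to a single ripple chain. For the 15-bit, radix-3 example, `sum_path_t_count(15, 3)` evaluates to 40.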

T-count
It can be observed that the circuit structure of the quantum higher radix adder varies with the radix. In the next section, we discuss how the radix affects the performance of our adder and compare it with other well-known work.

Results and Discussions
• Experiment 1: The effect of radix.
Figure 12 shows the T-count, T-depth, and QC for nine different higher radix adders with radix from 1 to 9, respectively. When the radix is fixed, the performance of our adder varies with the input size. As the input size increases, the overall cost grows, because a larger input means more operations and therefore a larger circuit. Larger circuits generally cost more in terms of T-depth, T-count and QC.
Interestingly, for a fixed input size, increasing the radix does not reduce the cost monotonically. In this paper, the higher radix adder with r equal to 1 is a CLA, while r equal to 2 represents the higher radix adder without multi-control Toffolis. Compared to the higher radix adder with r equal to 1, the adder with radix 2 has a lower T-count, T-depth and QC, which means that the higher radix layer can effectively optimize the circuit even without introducing multi-control Toffolis. When r is larger than 2, our adder is a hybrid of quantum RCA and CLA. T-count and T-depth first decrease and then increase as the radix increases in the range where r is less than the input size. Meanwhile, QC increases steadily, with fluctuations. When r is equal to the input size, QC drops abruptly. This dramatic change is caused by the transformation of our higher radix adder from the hybrid to Gidney's RCA.
It can be seen that the performance of the higher radix adder is significantly influenced by the radix. Therefore, by changing it, we can adapt the proposed adder to the specific requirements of different scenarios and minimize the overall cost. In general, when QC is more expensive, a small radix should be chosen to avoid introducing too many ancillas, while when T-depth or T-count is more expensive, a larger radix can be tried to make T-depth or T-count small.
• Selection of Best Radix.
The best radix for T-depth, T-count and QC is defined as the radix that leads to the lowest cost of our adder in terms of T-depth, T-count and QC, respectively. As shown in Figure 12, the fluctuations of T-depth, T-count, and QC as r increases are different. Hence, the corresponding best radix may be different for T-depth, T-count, and QC. According to the formulae for cost (given in Equations (11), (12) and (13)), the corresponding optimum radix can be determined. See Appendix C for more details.
The performance of our adder compared to other well-known quantum adders is shown in the following. We summarize their cost formulas in Tables 4 and 5. Based on them, the relevant data is visualized in Figures 13 and 14.
Firstly, we describe some important experimental details. In order to evaluate our adder more objectively, three Toffoli decomposition methods are used to decompose adders into three different versions. For adders which are denoted by , all the Toffoli pairs are decomposed using the Logical-And structure, and then the rest are decomposed using Method 3 mentioned in Section 3. For adders which are denoted by , all the Toffolis are decomposed by Method 3.
For adders which are denoted by •, only Gidney's RCAs 6 are decomposed using the Logical-And structure, and the rest are decomposed by Method 3. Among them, the adders which are denoted by and use fewer qubits because the Logical-And structure is not used before the sum path, which means some ancillas can be reused in the sum path after the uncomputation. For the adders which are denoted by •, the overall T-depth and T-count are reduced at the cost of QC, because they use Logical-And structures wherever possible.
Then we compare the proposed adder with other well-known works. Compared to quantum RCAs, our adder consistently has a significant advantage in terms of T-depth, despite having a higher T-count and QC. Compared to quantum CLAs, our adders which are denoted by and • have similar QC and T-depth, but significantly smaller T-count. Our adder which is denoted by significantly reduces the T-depth and further reduces the T-count at the cost of a slight increase in qubits. Draper's out-of-place adder 4 does not need to be complexly uncomputed like the in-place one. Therefore, we construct a simplified version of our higher radix adder for an objective comparison. According to Fig. 14, our adder slightly increases T-depth and QC, but significantly decreases T-count. Moreover, compared to the Takahashi adder 9, which is a special quantum CLA that introduces the idea of grouping, all the versions of the higher radix adder have similar QC and significantly smaller T-count. For T-depth, our adders which are denoted by and • are similar to the Takahashi adder, but our adder which is denoted by has a large reduction. Besides, the higher radix adder is also compared with the Takahashi combination adder 10, which also combines RCA and CLA. Although our QC is larger, the T-count and T-depth of our adder are reduced to different degrees. Our structure is also more general and flexible, and further improves the overall efficiency.
In general, the higher radix adder needs more qubits as a cost to significantly reduce the overall T-count and T-depth compared to other adders.
• Connecting with existing quantum adders.
Our work can be seen as a bridge to connect existing quantum adders.Figure 15 illustrates the general framework of it, whose key parts are the carry path and the sum path.
For the carry path, quantum CLAs can be used to compute specific carries. For the sum path, any quantum adder can be used to calculate the final result based on those carries. It is interesting to note that the cost contributions of the carry path and the sum path to the total circuit are adjusted by changing the radix. More specifically, when the radix is large, the number of groups produced by the higher radix layer is small. Hence, the carry chain is short, which means the sum path contributes more to the cost of the overall circuit than the carry path. On the contrary, when the radix is small, the carry chain is longer, which means the carry path accounts for a large portion of the total circuit.
According to this general framework, this work can be seen as a specific example based on Draper's CLA and Gidney's RCA. Specifically, our carry path uses the same Brent-Kung tree structure as Draper's CLA, and our sum path is Gidney's RCA. Apart from these two adders, other quantum adders can also be used to construct a higher radix adder.

To simplify the representation, we compare the costs of different higher radix adders with r from 3 to n − 1 and then use the lowest cost to represent the performance of our adder. The formula for ω

Table 5. Performance analysis of quantum out-of-place CLAs. Since Draper's out-of-place adder does not need to be complexly uncomputed like the in-place adder, it is unfair to compare our adder directly to it. Therefore, we construct a simplified version of our higher radix adder to compare with. The relevant formulas are shown below.

In order to help one construct the cheapest quantum adder in different scenarios quickly and easily, we summarize the performance of well-known quantum adders in Fig. 16. When QC is more expensive, it is more suitable to use adders with fewer ancillas, such as the Takahashi RCA. For T-count, RCAs such as Gidney's RCA can effectively reduce the overall cost. For reducing T-depth, it is recommended to integrate Draper's in-place CLA or other quantum CLAs within the higher radix framework described in Fig. 15.

Conclusion
The quantum adder is one of the most fundamental components in quantum computing. Therefore, designing a quantum adder with lower cost is of great significance for establishing more efficient and cheaper large-scale quantum circuits. This paper proposed an efficient quantum circuit for integer addition by introducing techniques from the classical higher radix carry-lookahead adder and the Manchester Carry Chain adder. In terms of T-depth and T-count, the proposed circuit is superior to all the existing quantum carry-lookahead adders except Draper's out-of-place CLA. Compared with Draper's out-of-place CLA, the proposed higher radix adder has significantly lower T-count with comparable QC and T-depth. Due to practical constraints, we only analyzed three main quantum circuit complexity metrics: T-count, T-depth, and QC.
In the future, one may be interested in how to automatically design the best adder based on specific cost constraints and how to accurately and quickly tune the radix to obtain the most efficient adder. Additionally, exploring the T-count, T-depth and QC limits of quantum addition is also a meaningful and challenging problem. Finally, comparing various adder designs under practical constraints, such as quantum error correction (QEC) and topological structures, is an important open problem.
• r = 5. To begin with, since we do not need to compute the carry of the most significant bit, the most significant p and g are ignored. Firstly, the higher radix structure divides the remaining p and g into groups of five each, yielding a $p_{group}$ and a $g_{group}$. In the carry path, according to the Brent-Kung structure, we do not need any operation. In this case, $c_5$ and $g_{group}$ are equal, both being 0. In the sum path, the ripple carry structure is used, group by group, to add a, b and the computed carry c to obtain the sum s.
Higher radix structure: Carry path:
• r = 1. When the radix is one, it is a special case. Since the higher radix structure does not work, it is essentially a CLA.

B Derivation Details of Cost Formulae
Here, we show the details of adding the sub-cost formulae of the seven steps to obtain the final cost formulae in Section 3.5. In Derivations (14), (15), and (16), respectively, we show the step-by-step derivation for the T-count and T-depth.

T-count
T-depth

C Optimum Radix
According to the cost formulae in Table 4, here we analyze in detail how to find the best radix in the case of large input size. First, we divide the radix into three cases: small, medium and large, and use $r_s$, $r_m$ and $r_l$ to denote them, respectively. Furthermore, n is used to denote the input size and r is used to denote the radix. It is assumed that O(r) is equal to O(1) in the $r_s$ case, O(r) is equal to $O(\sqrt{n})$ in the $r_m$ case, and O(r) is equal to O(n) in the $r_l$ case. Besides, r must be less than or equal to n.
For our adder which is denoted by , the asymptotic complexity of T-count, T-depth and QC is shown below.
For T-depth, the best radix is in the range of $r_s$ or $r_l$. For T-count, the best radix is in the $r_s$ range. For QC, the best radix is in the range of $r_l$.
For our adder which is denoted by •, the asymptotic complexity of T-count, T-depth and QC is shown below. For T-depth, the best radix is in the range of $r_l$. For T-count, the best radix is in the $r_s$ range. For QC, the best radix is in the range of $r_l$.
For our adder which is denoted by , the asymptotic complexity of T-count, T-depth and QC is shown below. For T-depth, the best radix is in the range of $r_l$. For T-count, the best radix is in the $r_s$ range. For QC, the best radix is in the range of $r_s$.

Figure 5. Chronology of publication of carry-propagate structures.

Figure 15. A general framework of higher radix adder.

• r = 4. Similar to radix 5, we show radix-4 addition below. Higher radix structure: Carry path:
• r = 3. Similarly, the following is the process for an adder with radix 3. Higher radix structure: Carry path:
• r = 2. The overall calculation process is shown as follows. Higher radix structure: Carry path:

Table 2. Summary table of Toffoli decomposition.

Table 3. Different structures of sum path (r = n). When calculating T-depth, we assume that Part 1 of the CSA has been completed while the carry path is being calculated. Therefore, for CSA1 and CSA2, the T-depth equals the T-depth of Part 2, which is the minimum T-depth of the sum path for CSAs.

Overall Structure of Quantum Higher Radix Adder One Higher Radix Layer
Comparison of the cost required by quantum higher radix adders with different radix and different sizes.

Table 4. Performance analysis of different quantum adders.