Generalizing the inverse FFT off the unit circle

This paper describes the first algorithm for computing the inverse chirp z-transform (ICZT) in O(n log n) time. This matches the computational complexity of the chirp z-transform (CZT) algorithm that was discovered 50 years ago. Despite multiple previous attempts, an efficient ICZT algorithm remained elusive until now. Because the ICZT can be viewed as a generalization of the inverse fast Fourier transform (IFFT) off the unit circle in the complex plane, it has numerous practical applications in a wide variety of disciplines. This generalization enables exponentially growing or exponentially decaying frequency components, which cannot be done with the IFFT. The ICZT algorithm was derived using the properties of structured matrices and its numerical accuracy was evaluated using automated tests. A modification of the CZT algorithm, which improves its numerical stability for a subset of the parameter space, is also described and evaluated.


SUMMARY OF THE SUPPLEMENTARY APPENDICES
Appendix A gives an example with a 3 × 3 CZT and ICZT. The example uses expanded matrix notation to visualize the structured matrix equations that describe the CZT and the ICZT. It also shows how the inverse matrix Ŵ^{-1} can be expressed as the difference between two products of triangular Toeplitz matrices.
Appendix B describes a popular method for multiplying a Toeplitz matrix by a vector in O(n log n) time that uses circulant matrix embedding. This is the default approach that is used by both the CZT algorithm and the ICZT algorithm.
Appendix C describes another method for multiplying a Toeplitz matrix by a vector that uses Pustylnikov's formula. It also runs in O(n log n) time and can be used to implement alternative versions of the CZT and the ICZT algorithms.
Appendix D shows how to multiply a circulant matrix by a vector in O(n log n) time using FFT and IFFT. It also shows how to multiply a skew-circulant matrix by a vector in O(n log n) time. These subroutines are used by the Toeplitz-vector multiplication algorithms in Appendices B and C.
Appendix E shows how to express the inverse of a symmetric Toeplitz matrix. It proves a special case of the Gohberg-Semencul formula, which uses only one generating vector instead of two vectors. It also proves that this generating vector, u, is equal to the first column of the inverse matrix Ŵ^{-1}.
Appendix F proves the formula for the elements of the generating vector u. This formula and the results from the previous appendix are used by the ICZT algorithm.
Appendix G describes alternative versions of the CZT and the ICZT algorithms that have improved numerical stability for chirp contours that are growing logarithmic spirals. These versions should be used in all cases when the transform parameter W has a magnitude that is less than 1, i.e., |W| < 1. Depending on the values of M, A, and W, these algorithms can reduce the numerical error by several orders of magnitude. Experimental results that confirm these findings are also included in this appendix.
Appendix H proves how changing the polar angle of the transform parameter A affects the polar angles of all elements in the CZT input vector x. In addition, it shows how a change in A affects the polar angles of the elements in the ICZT output vector x. This appendix also proves that these changes do not affect the norm of x, which was used to simplify the experimental design described in the Methods section.
Appendix I gives approximation formulas for the absolute numerical error of the CZT and the ICZT algorithms. It also gives error formulas for their sequential application, i.e., CZT followed by ICZT or ICZT followed by CZT. Empirical results that confirm these formulas are also provided in this appendix.

Figure S1. Illustration of the matrix shapes and the generating vector(s) for the different types of structured matrices used in this paper. The diagonal matrix requires only one vector, i.e., the main diagonal d. The Toeplitz matrix and most other matrices shown here can be generated using their first row, r, and their first column, c. The upper-triangular Toeplitz matrix can be fully described using only its first row, but keeping both r and c makes it possible to use algorithms intended for a more general Toeplitz matrix. Similarly, the lower-triangular Toeplitz matrix can be described using only its first column, but the algorithms described here use both r and c. This also applies to the symmetric Toeplitz matrix, for which r = c. The circulant and the skew-circulant matrices can be specified using either their first row or their first column, which leads to alternative versions of the algorithms (here only c is used). The Vandermonde matrix can be generated using only one vector, v, that is equal to its second column. This is possible because each row of the matrix is a (different) geometric progression that starts from 1, i.e., each element of the generating vector specifies the common ratio for the corresponding geometric progression. The DFT matrix (not shown) is another structured matrix that is a special case of a Vandermonde matrix (see Appendix D for an example).

SUPPLEMENTARY APPENDIX A: EXAMPLE
In the 3-by-3 case, the CZT can be expressed with the matrix equation X = W x, where W is the 3-by-3 Vandermonde matrix determined by the transform parameters A and W. Using Bluestein's substitution 5 , we can express the Vandermonde matrix W as a product of a diagonal matrix P, a Toeplitz matrix Ŵ, and another diagonal matrix Q, i.e., W = P Ŵ Q. In this case, the expanded matrix equation for the ICZT is x = Q^{-1} Ŵ^{-1} P^{-1} X. The inverse matrix Ŵ^{-1} is given in Eq. (79). The ICZT algorithm, however, does not construct this matrix explicitly. As proven in Theorem 5, Ŵ^{-1} can be expressed as the difference between two products of triangular Toeplitz matrices, scaled by 1/u_0, where the generating vector u = (u_0, u_1, u_2) is given by Eq. (78). For these values of the elements of u, Eq. (23) takes a concrete 3-by-3 form. The Toeplitz matrices A, A^T, D^T, and D are not constructed explicitly either. Instead, the vector x in Eq. (22), where Ŵ^{-1} is expressed as in Eq. (24), is computed in O(n log n) time by exploiting the structure of these matrices.
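The factorization W = P Ŵ Q can be checked numerically. The following Python sketch is our own illustration (not one of the paper's algorithms): it assumes P = diag(W^{j²/2}), Q = diag(W^{k²/2}), and Ŵ_{jk} = W^{−(j−k)²/2}, which follows from Bluestein's identity jk = j²/2 − (j−k)²/2 + k²/2, and verifies the product entry by entry for a 3-by-3 example.

```python
# Numerical check of Bluestein's substitution W^(jk) = W^(j^2/2) * W^(-(j-k)^2/2) * W^(k^2/2)
# for a 3-by-3 example (pure Python; matrices are lists of lists).
import cmath

n = 3
W = 1.2 * cmath.exp(0.4j)          # arbitrary transform parameter off the unit circle

V = [[W ** (j * k) for k in range(n)] for j in range(n)]                    # Vandermonde: V[j][k] = W^(jk)
P = [W ** (j * j / 2) for j in range(n)]                                    # diagonal of P
Q = [W ** (k * k / 2) for k in range(n)]                                    # diagonal of Q
W_hat = [[W ** (-((j - k) ** 2) / 2) for k in range(n)] for j in range(n)]  # symmetric Toeplitz

# Reassemble P * W_hat * Q and compare with V entry by entry.
for j in range(n):
    for k in range(n):
        assert abs(P[j] * W_hat[j][k] * Q[k] - V[j][k]) < 1e-12
```

The same identity underlies the CZT for any size, since the exponent identity holds for all integer j and k.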

SUPPLEMENTARY APPENDIX B: COMPUTING A TOEPLITZ-VECTOR PRODUCT USING CIRCULANT MATRIX EMBEDDING
This appendix describes a popular method for multiplying a square Toeplitz matrix by a vector 22 (see p. 202). This method can be extended to work with rectangular Toeplitz matrices as well. This is the default method that is used by both the CZT algorithm and the ICZT algorithm described in the main paper.
A. Example with M = N = 3

Let T be a 3-by-3 Toeplitz matrix that is generated by its first row r = (r_0, r_1, r_2) and its first column c = (c_0, c_1, c_2), where r_0 = c_0. We wish to compute the product of T with the vector x = (x_0, x_1, x_2)^T, i.e., y = Tx. To do this, we start by embedding the matrix T into an 8-by-8 circulant matrix T̂ (which is also a Toeplitz matrix). The vector x is padded with five zeros at the end, which results in an eight-element vector x̂. Then, the vector y is equal to the first three elements of ŷ = T̂x̂. The circulant matrix T̂ is generated by its first column ĉ, where ĉ = (c_0, c_1, c_2, 0, 0, 0, r_2, r_1). The number of zeros inserted after c_2 is equal to 8 − 5 = 3, where 8 is the size of the matrix T̂ and 5 is the number of free parameters in r and c (5 = 3 + 3 − 1, because r_0 = c_0). Given ĉ and x̂, the vector ŷ = T̂x̂ can be computed efficiently as described below. The last five elements of ŷ are computed but are then discarded, which is indicated with × in Eq. (26).

B. The General Case for a Rectangular Toeplitz Matrix
Let T be an M-by-N Toeplitz matrix that is generated by its first row r = (r_0, r_1, r_2, . . . , r_{N−2}, r_{N−1}) and its first column c = (c_0, c_1, c_2, . . . , c_{M−2}, c_{M−1}), assuming that r_0 = c_0. Also, let x be a column vector of length N.
To compute y = Tx, we start by embedding T into an n-by-n circulant matrix T̂, where n = 2^{⌈log_2(M+N−1)⌉}. This matrix is generated by its first column ĉ, which has n elements that are defined as follows: ĉ = (c_0, c_1, . . . , c_{M−1}, 0, 0, . . . , 0, r_{N−1}, . . . , r_2, r_1). (27) Note that r_0, which is equal to c_0, is not used in this formula. The number of zeros inserted in ĉ between c_{M−1} and r_{N−1} is equal to n − (M+N−1). The vector x is padded with n−N zeros at the end, which results in a vector x̂ of length n. Given ĉ and x̂, the product ŷ = T̂x̂ can be computed in O(n log n) time using two FFTs and one IFFT (see Algorithm S4 in Appendix D).
Because the original matrix T appears as a sub-matrix of T̂ in its upper-left corner and because x is padded with the appropriate number of zeros, the first M elements of ŷ are equal to y, i.e., y_k = ŷ_k for each k ∈ {0, 1, 2, . . . , M − 1}.
To summarize, the product of an M-by-N Toeplitz matrix T and a vector x of length N can be computed using the following four steps: 1) go from the Toeplitz matrix T, which is generated by the vectors r and c, to the circulant matrix T̂ by forming its first column ĉ using Eq. (27); 2) pad the vector x with n − N zeros at the end to get the vector x̂; 3) pass ĉ and x̂ to Algorithm S4, which computes ŷ; and 4) set y to the first M elements of ŷ. Note that neither the Toeplitz matrix T nor the circulant matrix T̂ is constructed explicitly; only their generating vectors are used to perform the necessary computations.
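The four steps above can be sketched in Python as follows. This is a minimal illustration of the embedding procedure, not a transcription of Algorithm S1 (the function names are ours, and a simple recursive FFT stands in for Algorithms S4-S6):

```python
# Sketch of Toeplitz-vector multiplication via circulant embedding.
# The Toeplitz matrix is never built; only r, c, and x are used.
import cmath

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k], out[k + n // 2] = even[k] + t, even[k] - t
    return out

def ifft(x):
    n = len(x)
    return [v.conjugate() / n for v in fft([v.conjugate() for v in x])]

def toeplitz_multiply_e(r, c, x):
    """y = T x for the M-by-N Toeplitz matrix with first row r and first column c."""
    M, N = len(c), len(r)
    n = 1
    while n < M + N - 1:                    # n = 2^ceil(log2(M+N-1))
        n *= 2
    c_hat = list(c) + [0] * (n - (M + N - 1)) + list(r[:0:-1])   # Eq. (27)
    x_hat = list(x) + [0] * (n - N)                              # zero-pad x
    y_hat = ifft([a * b for a, b in zip(fft(c_hat), fft(x_hat))])  # circulant multiply
    return y_hat[:M]                        # keep the first M elements

# Check against the direct O(MN) product for a small rectangular example.
r, c, x = [1, 2, 3], [1, 4, 5, 6], [1, -1, 2]
direct = [sum((c[i - j] if i >= j else r[j - i]) * x[j] for j in range(3)) for i in range(4)]
fast = toeplitz_multiply_e(r, c, x)
assert all(abs(a - b) < 1e-9 for a, b in zip(fast, direct))
```

The comparison against the direct product illustrates that the embedded circulant reproduces T in its upper-left corner.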
Algorithm S1 gives the pseudo-code for the procedure described above. The computational complexity of this algorithm is O(n log n). The function name is TOEPLITZMULTIPLYE, where 'E' stands for circulant 'embedding'. Algorithm S2 gives the pseudo-code for the helper function ZEROPAD that pads an array with zeros at the end such that the new length of the array is equal to n. This function is used in the next appendix as well.
There are several variations of the circulant embedding procedure that are equivalent to the one described above. In some of the other versions the first column ĉ of T̂ is equal to a cyclic shift of the elements of the vector ĉ that is defined in Eq. (27). For example, Pan 26 (see p. 66) constructs ĉ as follows: ĉ = (r_{N−1}, . . . , r_2, r_1, c_0, c_1, . . . , c_{M−1}, 0, 0, . . . , 0). (28) In this case, the n − (M+N−1) zeros are inserted at the end. The vector x is still padded with n−N zeros at the end. However, the result vector y is extracted from ŷ starting at offset N−1, i.e., y_k = ŷ_{N−1+k} for each k ∈ {0, 1, . . . , M−1}. For example, if M = N = 3, then Eq. (26) could be replaced with an analogous expression in which the original matrix T is still embedded in T̂, but it is no longer in the upper-left corner. Similarly, the result vector y, which is of length M, starts at offset N−1 = 2 in the vector ŷ.
Algorithm S1. Multiply a Toeplitz matrix by a vector using circulant embedding. Runs in O(n log n) time.

SUPPLEMENTARY APPENDIX C: COMPUTING A TOEPLITZ-VECTOR PRODUCT USING PUSTYLNIKOV'S DECOMPOSITION
This appendix describes another algorithm for multiplying a Toeplitz matrix by a vector that uses Pustylnikov's decomposition 23, 24 (see also pages 40 and 66 in references 25 and 26, respectively). This method can also be used by the CZT and the ICZT algorithms described in the main paper.

A. Example with M = N = 3
Let c = (c_0, c_1, c_2) be the first column and r = (r_0, r_1, r_2) be the first row of a 3-by-3 Toeplitz matrix T, where c_0 = r_0. Once again, we wish to compute the product of T with the vector x = (x_0, x_1, x_2)^T, i.e., y = Tx. The matrix T is square, but the number of its rows, or columns, is not a power of 2. Thus, the first step is to pad T and to embed it into a larger 4-by-4 Toeplitz matrix T̂. The matrix T̂ is generated by its first column and its first row, which can be derived from the first column c of T by padding it with a single zero and from the first row r of T by padding it with a single zero as well. The vector x is also padded with a zero at the end to produce the vector x̂, which is of size 4. This padding transforms Eq. (31) into Eq. (32). The padding is necessary because the multiplication algorithm relies on FFT and IFFT, which in this text are described as recursive algorithms that expect the size of their input vector to be a power of 2. The next step uses Pustylnikov's decomposition to express the padded Toeplitz matrix T̂ as the sum of two matrices T̂′ and T̂″, which are generated by the first columns ½(c_0, c_1+0, c_2+r_2, 0+r_1) and ½(c_0, c_1−0, c_2−r_2, 0−r_1), respectively. The matrix T̂′ is circulant. The matrix T̂″ is skew-circulant 28 . Some authors 26, 27 use the term f-circulant with f = −1 to refer to a skew-circulant matrix.
Using these two matrices, Eq. (32) can be computed in two steps: ŷ′ = T̂′x̂ and ŷ″ = T̂″x̂. Both products can be computed efficiently as described below. The partial results ŷ′ and ŷ″ are then added to obtain ŷ = ŷ′ + ŷ″; the result vector y is equal to the first three elements of ŷ. The last element of ŷ is not needed (and is not computed); it appears only because of the padding.

B. The General Case for a Rectangular Toeplitz Matrix
In the general case, the matrix T is an M-by-N Toeplitz matrix, the vector x is of size N, and the result vector y is of size M. The matrix T is generated by its first column c = (c_0, c_1, . . . , c_{M−1}) and its first row r = (r_0, r_1, . . . , r_{N−1}), where it is assumed that c_0 = r_0.
The first step is to go from the matrix T to the matrix T̂ of size n-by-n, where n = 2^{⌈log_2 max(M,N)⌉}. That is, T̂ is a square Toeplitz matrix and n is a power of two. The first column of T̂ can be constructed by padding c with n−M zeros. Similarly, the first row of T̂ can be constructed by padding r with n−N zeros. The vector x is also padded with n−N zeros to create the vector x̂.
Given the padded vectors c and r, which are now both of length n, we can use Pustylnikov's formula to calculate the first column c′ = (c′_0, c′_1, . . . , c′_{n−1}) of the circulant matrix T̂′ and the first column c″ = (c″_0, c″_1, . . . , c″_{n−1}) of the skew-circulant matrix T̂″ as follows: c′_0 = c″_0 = c_0 / 2 and, for each k ∈ {1, 2, . . . , n−1}, c′_k = (c_k + r_{n−k}) / 2 and c″_k = (c_k − r_{n−k}) / 2. Similarly to the example above, the product ŷ = T̂x̂ can be computed in two steps: ŷ′ = T̂′x̂ and ŷ″ = T̂″x̂. The product ŷ′ = T̂′x̂ is computed in O(n log n) time using Algorithm S4. The product ŷ″ = T̂″x̂ is also computed in O(n log n) time using Algorithm S7. Finally, the first M elements of ŷ′ and ŷ″ are added together to compute the result vector y, i.e., y_k = ŷ′_k + ŷ″_k for each k ∈ {0, 1, . . . , M−1}. Algorithm S3 gives the pseudo-code for the procedure described above. The function name is TOEPLITZMULTIPLYP, where 'P' stands for 'Pustylnikov'. The algorithm runs in O(n log n) time. Once again, neither T̂′ nor T̂″ is constructed explicitly; only the generating vectors c′ and c″ are computed, which correspond to their first columns. The last n−M elements of ŷ′ and ŷ″ are computed, but they are not used to compute y.
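The splitting itself can be checked numerically without any FFT machinery. The sketch below is our own illustration, assuming the split c′_k = (c_k + r_{n−k})/2, c″_k = (c_k − r_{n−k})/2, and c′_0 = c″_0 = c_0/2: it builds the dense circulant and skew-circulant parts and verifies that their sum reproduces the padded Toeplitz matrix from the 4-by-4 example.

```python
# Numerical check of Pustylnikov's splitting: padded Toeplitz = circulant + skew-circulant.
# The fast multiplies for the two parts are covered in Appendix D.

def toeplitz(r, c):
    n = len(c)
    return [[c[i - j] if i >= j else r[j - i] for j in range(n)] for i in range(n)]

def circulant(col):
    n = len(col)
    return [[col[(i - j) % n] for j in range(n)] for i in range(n)]

def skew_circulant(col):
    """Entries below the diagonal wrap around with a sign flip."""
    n = len(col)
    return [[col[i - j] if i >= j else -col[n + i - j] for j in range(n)] for i in range(n)]

# Padded 4-by-4 example from the text: c and r each get one trailing zero.
c = [1.0, 4.0, 5.0, 0.0]
r = [1.0, 2.0, 3.0, 0.0]
n = 4
c1 = [c[0] / 2] + [(c[k] + r[n - k]) / 2 for k in range(1, n)]   # circulant part
c2 = [c[0] / 2] + [(c[k] - r[n - k]) / 2 for k in range(1, n)]   # skew-circulant part

T, C, S = toeplitz(r, c), circulant(c1), skew_circulant(c2)
for i in range(n):
    for j in range(n):
        assert abs(C[i][j] + S[i][j] - T[i][j]) < 1e-12
```

The check passes because, for i ≥ j, the two parts add up to c_{i−j}, while above the diagonal the wrap-around terms combine to r_{j−i}.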
In contrast to the circulant embedding approach described in Appendix B, Algorithm S3 does not embed T into only one square matrix with n = 2^{⌈log_2(M+N−1)⌉}. Instead, it uses two square matrices, each with n = 2^{⌈log_2 max(M,N)⌉}.
Algorithm S3. Multiply a Toeplitz matrix by a vector using Pustylnikov's decomposition. Runs in O(n log n) time.

SUPPLEMENTARY APPENDIX D: MULTIPLYING A CIRCULANT OR A SKEW-CIRCULANT MATRIX BY A VECTOR USING FFT AND IFFT
The algorithm described in Appendix B relies on another algorithm for computing the product of a circulant matrix with a vector in O(n log n) time. In addition to this, the algorithm described in Appendix C needs to compute the product of a skew-circulant matrix with a vector in O(n log n) time. This appendix describes both of these algorithms for n-by-n matrices, where n is a power of two. It is assumed that appropriate padding has already been applied (i.e., see lines 10-21 in Algorithm S1 and lines 10-17 in Algorithm S3).

A. Multiplying a Circulant Matrix by a Vector
A circulant matrix is a square matrix with columns that are generated by successive circular shifts of its first column.
Let G be a 4-by-4 circulant matrix that is generated by its first column c = (c_0, c_1, c_2, c_3). The matrix G can be diagonalized 28 (see p. 73) using the DFT matrix F and the inverse DFT matrix F^{-1}. In the 4-by-4 case these two matrices have the form given in Eq. (45), where ω = e^{−2πi/4} = cos(−π/2) + i sin(−π/2) = −i. Both matrices are symmetric, i.e., each is equal to its own transpose. The matrix G can be expressed, i.e., diagonalized, using the following product of three matrices: G = F^{-1} Λ F. The diagonal matrix Λ contains the eigenvalues of G. Furthermore, the vector λ = (λ_0, λ_1, λ_2, λ_3)^T, which contains the eigenvalues of G, is equal to the product of the DFT matrix F with the column vector c, i.e., λ = Fc.
The product between the circulant matrix G and the vector x = (x_0, x_1, x_2, x_3)^T can then be expressed as y = Gx = F^{-1}(λ × (Fx)). The same approach works 28 for an n-by-n matrix G. Thus, the vector y can be computed efficiently using two FFTs and one IFFT as follows: y = IFFT(FFT(c) × FFT(x)), where × denotes element-by-element multiplication. Algorithm S4 implements this formula. It runs in O(n log n) time.
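The formula can be sketched in Python as follows. This is our own minimal illustration of the idea behind Algorithm S4 (the function names are ours, and a simple recursive FFT stands in for Algorithms S5 and S6):

```python
# Circulant-vector product in O(n log n) time: y = IFFT(FFT(c) x FFT(x)),
# where c is the first column of the circulant matrix G.
import cmath

def fft(x):
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k], out[k + n // 2] = even[k] + t, even[k] - t
    return out

def ifft(x):
    n = len(x)
    return [v.conjugate() / n for v in fft([v.conjugate() for v in x])]

def circulant_multiply(c, x):
    return ifft([a * b for a, b in zip(fft(c), fft(x))])

# Compare with the direct product G x, where G[i][j] = c[(i - j) % n].
c, x = [1, 2, 3, 4], [1, 0, -1, 2]
n = 4
direct = [sum(c[(i - j) % n] * x[j] for j in range(n)) for i in range(n)]
fast = circulant_multiply(c, x)
assert all(abs(a - b) < 1e-9 for a, b in zip(fast, direct))
```

The elementwise product in the transform domain corresponds to the diagonal matrix Λ = diag(Fc) in the factorization G = F^{-1} Λ F.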
A circulant matrix can also be generated by circular shifts of its first row instead of its first column. Thus, there is an alternative formula for y that diagonalizes G using its first row r = (c_0, c_{n−1}, c_{n−2}, . . . , c_1), i.e., y = n · IFFT(IFFT(r) × FFT(x)). This formula can also be computed in O(n log n) time using one FFT and two IFFTs. Note that in this case there is an extra multiplication by n for all elements of the result vector y. This is due to the normalization by 1/n in the definition of F^{-1} (see Eq. (45)). For the sake of completeness, Algorithm S5 and Algorithm S6 give the pseudo-code for the recursive Fast Fourier Transform (FFT) and its inverse (IFFT).

B. Multiplying a Skew-Circulant Matrix by a Vector
A skew-circulant matrix is a square matrix obtained by negating each element above the main diagonal of a circulant matrix 28 . A skew-circulant matrix is also often called an f-circulant matrix with f = −1, e.g., see references 26 and 27.
Let S be a 4-by-4 skew-circulant matrix that is generated by its first column c = (c_0, c_1, c_2, c_3). The matrix S can be expressed as the following product 26, 28 of three matrices: S = H^{-1} Ĝ H, where Ĝ is a circulant matrix and both H and H^{-1} are diagonal matrices.
The matrix H is the diagonal matrix H = diag(σ^0, σ^1, . . . , σ^{n−1}), where σ = e^{−iπ/n}. The inverse matrix H^{-1} is equal to diag(σ^0, σ^{−1}, . . . , σ^{−(n−1)}). The circulant matrix Ĝ is generated by its first column ĉ, which is equal to the product of H with the column vector c, i.e., ĉ = Hc. Combining Eq. (55) with Eq. (50) leads to the following formula for computing the product of the skew-circulant matrix S with the vector x: y = Sx = H^{-1}(Ĝ(Hx)). The same method can be applied if S is an n-by-n skew-circulant matrix 28 . In this case, Eq. (59) can be computed efficiently using two FFTs and one IFFT, i.e., y = H^{-1} · IFFT(FFT(Hc) × FFT(Hx)), where × denotes elementwise multiplication. Algorithm S7 implements this approach. It runs in O(n log n) time.
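The diagonal-scaling approach can be sketched in Python as follows (our own illustration of the idea behind Algorithm S7; the helper names are not from the paper):

```python
# Skew-circulant-vector product via diagonal scaling by powers of sigma = e^(-i*pi/n):
# y = H^{-1} IFFT(FFT(Hc) x FFT(Hx)), where c generates the skew-circulant matrix S.
import cmath

def fft(x):
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k], out[k + n // 2] = even[k] + t, even[k] - t
    return out

def ifft(x):
    n = len(x)
    return [v.conjugate() / n for v in fft([v.conjugate() for v in x])]

def skew_circulant_multiply(c, x):
    n = len(c)
    sigma = cmath.exp(-1j * cmath.pi / n)
    h = [sigma ** k for k in range(n)]               # diagonal of H
    hc = [h[k] * c[k] for k in range(n)]             # c_hat = H c
    hx = [h[k] * x[k] for k in range(n)]             # H x
    y = ifft([a * b for a, b in zip(fft(hc), fft(hx))])  # circulant part: G_hat (H x)
    return [y[k] / h[k] for k in range(n)]           # multiply by H^{-1}

# Compare with the direct product, using the sign-flipped wrap-around above the diagonal.
c, x = [1, 2, 3, 4], [1, 0, -1, 2]
n = 4
direct = [sum((c[i - j] if i >= j else -c[n + i - j]) * x[j] for j in range(n)) for i in range(n)]
fast = skew_circulant_multiply(c, x)
assert all(abs(a - b) < 1e-9 for a, b in zip(fast, direct))
```

The sign flips above the diagonal come out of the scaling because σ^n = e^{−iπ} = −1.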
Because H and H −1 are diagonal matrices, multiplying each of them by a vector reduces to scaling the elements of the vector, which can be done in O(n) time. This scaling does not affect the computational complexity of Algorithm S7.
Algorithm S7. Multiply a skew-circulant matrix by a vector.

SUPPLEMENTARY APPENDIX E: EXPRESSING THE INVERSE OF A SYMMETRIC TOEPLITZ MATRIX
This appendix shows how to express the inverse matrix Ŵ^{-1} using structured matrices so that any product between this matrix and a vector can be computed in O(n log n) time.
Let T be an n-by-n Toeplitz matrix generated by the vector r = (r_0, r_1, r_2, . . . , r_{n−1}) that specifies its first row and by the vector c = (c_0, c_1, c_2, . . . , c_{n−1}) that specifies its first column, where it is assumed that r_0 = c_0. The inverse matrix T^{-1} may not be Toeplitz. Nevertheless, the Gohberg-Semencul formula 19, 20 makes it possible to express T^{-1} using upper-triangular and lower-triangular Toeplitz matrices.
In other words, the formula states that there exists a vector u = (u_0, u_1, u_2, . . . , u_{n−1}) such that u_0 ≠ 0 and a vector v = (v_0, v_1, v_2, . . . , v_{n−1}) that satisfy the matrix equation given in Eq. (61). The Gohberg-Semencul formula uses two vectors u and v to describe the inverse of a square Toeplitz matrix. As proven below, however, the vector u is not unique in that formula and can be scaled by a non-zero constant without affecting the inverse matrix. In turn, this result is used to prove that if the Toeplitz matrix is symmetric, then its inverse can be generated using only one specifically scaled vector u. Furthermore, this vector is equal to the first column of the inverse matrix.
Theorem 1. (The vector u is not unique.) Let T be a nonsingular n-by-n Toeplitz matrix. Let u = (u_0, u_1, . . . , u_{n−1}) and v = (v_0, v_1, . . . , v_{n−1}) be two vectors that determine the inverse matrix T^{-1} in the Gohberg-Semencul formula, i.e., Eq. (62), where A, B, C, and D are specified in Eq. (61). Then, the Gohberg-Semencul formula also holds for all vectors û obtained by multiplying the vector u with a nonzero scaling coefficient. More formally, for each α ≠ 0, Eq. (63) holds, where Â = αA, D̂ = αD, and û = (û_0, û_1, . . . , û_{n−1}) = (αu_0, αu_1, . . . , αu_{n−1}) = αu. (64)

Proof. Because α ≠ 0, it can be factored out and canceled from both sides of Eq. (63), which leads to Eq. (62).

Theorem 2. Let T be a non-singular Toeplitz matrix and let T^{-1} be its inverse that is generated by the vectors u and v in the Gohberg-Semencul formula. Then, the first row of T^{-1} is equal to the reverse of the vector v.
Proof. Let a = (u_0, 0, . . . , 0) be the first row of A and let b = (0, 0, . . . , 0) be the first row of B in Eq. (61). Then, the first row of the inverse matrix T^{-1} can be expressed using Eq. (62). Because b is the zero vector, only the term that involves a contributes, and the resulting row is equal to the reverse of the vector v.

Theorem 3. Let T be a non-singular Toeplitz matrix and let T^{-1} be its inverse that is generated by the vectors u and v in the Gohberg-Semencul formula. Then, the first column of T^{-1} is equal to (v_{n−1}/u_0) · u.

Theorem 4. Let Ŵ be a symmetric non-singular Toeplitz matrix. Let u and v be two vectors that generate the inverse matrix Ŵ^{-1} using the Gohberg-Semencul formula such that u_0 = v_{n−1}. Then, the vector u is equal to the reverse of the vector v, i.e., u = (u_0, u_1, . . . , u_{n−1}) = (v_{n−1}, v_{n−2}, . . . , v_0). (68)
Proof. Because the matrix Ŵ is symmetric, its inverse is also symmetric. This implies that the first column of Ŵ^{-1} is equal to its first row. Combining the condition that u_0 = v_{n−1} with Theorem 2 and Theorem 3 shows that Eq. (68) holds.

Theorem 5. Let Ŵ be a symmetric non-singular Toeplitz matrix. Then, there is a vector u = (u_0, u_1, . . . , u_{n−1}) in which u_0 ≠ 0 such that Ŵ^{-1} can be expressed as follows:

Ŵ^{-1} = (1/u_0) (A A^T − D^T D),   (70)

where A is the lower-triangular Toeplitz matrix whose first column is u and D is the upper-triangular Toeplitz matrix whose first row is (0, u_{n−1}, u_{n−2}, . . . , u_1). Moreover, the vector u is equal to the first column of the inverse matrix Ŵ^{-1}.
Proof. Theorem 1 implies that u can be scaled by a factor α. In particular, if we set α = v_{n−1}/u_0, then the value of u_0 after scaling is equal to v_{n−1}. Then, Theorem 4 implies that the scaled vector u is equal to the reverse of the vector v, and Theorem 3 implies that the scaled vector is equal to the first column of Ŵ^{-1}.
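Theorem 5 can be checked numerically. The sketch below is our own illustration: it assumes the factored form Ŵ^{-1} = (1/u_0)(A A^T − D^T D), with A lower-triangular Toeplitz (first column u) and D upper-triangular Toeplitz (first row (0, u_{n−1}, . . . , u_1)), obtains u by solving Ŵu = e_0 with plain Gaussian elimination, and verifies that the reconstructed matrix inverts Ŵ.

```python
# Check the one-vector Gohberg-Semencul form for a small symmetric Toeplitz matrix
# W_hat[j][k] = W^(-(j-k)^2/2) with a real W > 1.

def solve(M, b):
    """Gaussian elimination with partial pivoting (small dense systems only)."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(M)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

n, W = 4, 1.5
W_hat = [[W ** (-((j - k) ** 2) / 2) for k in range(n)] for j in range(n)]
u = solve(W_hat, [1.0] + [0.0] * (n - 1))       # first column of W_hat^{-1}

A = [[u[i - j] if i >= j else 0.0 for j in range(n)] for i in range(n)]   # lower triangular
ut = [0.0] + u[:0:-1]                                                     # (0, u_3, u_2, u_1)
D = [[ut[j - i] if j > i else 0.0 for j in range(n)] for i in range(n)]   # upper tri, zero diagonal

# inv = (1/u_0) (A A^T - D^T D)
inv = [[(sum(A[i][k] * A[j][k] for k in range(n))
         - sum(D[k][i] * D[k][j] for k in range(n))) / u[0]
        for j in range(n)] for i in range(n)]

# The reconstructed matrix must invert W_hat.
for i in range(n):
    for j in range(n):
        prod = sum(W_hat[i][k] * inv[k][j] for k in range(n))
        assert abs(prod - (1.0 if i == j else 0.0)) < 1e-9
```

The first column of the reconstructed inverse also comes out equal to u, as Theorem 5 states.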
SUPPLEMENTARY APPENDIX F: EXPRESSING THE GENERATING VECTOR u
A general formula for the elements of the vector u can be derived from special cases for the first several values of n and then proven by induction for all n. Because u determines Ŵ^{-1}, this is sufficient to formulate the ICZT algorithm using structured matrix multiplication.
A. Special Cases of the Formula for the Generating Vector u
1) Special Case for n = 1: In this degenerate case both Ŵ and Ŵ^{-1} are equal to [1]. Also, u = (u_0) = (1).
2) Special Case for n = 2: Suppose that n = 2. Then, Ŵ is a 2-by-2 matrix that is defined by Eq. (9), which is well-defined for each W ∈ C \ {0}. By definition, the product of a matrix with its inverse is equal to the identity matrix, i.e., Ŵ Ŵ^{-1} = I. Theorem 5 implies that the first column of Ŵ^{-1} is equal to a two-element vector u = (u_0, u_1)^T. Therefore, the elements of u can be expressed as functions of the transform parameter W by solving the linear system Ŵ u = (1, 0)^T. This leads to the following formulas for u_0 and u_1: u_0 = W / (W − 1) and u_1 = −W^{1/2} / (W − 1). To summarize, the column vector u, which is equal to the first column of the inverse matrix Ŵ^{-1}, is given by these two expressions. Plugging u into Eq. (70) leads to the correct value for Ŵ^{-1}.
3) Special Case for n = 3: In this case, Ŵ is a 3-by-3 Toeplitz matrix. From Eq. (9) we know that each element of Ŵ is a function of the transform parameter W. Similarly to the 2-by-2 case, the vector u is a solution to the linear system Ŵ u = (1, 0, 0)^T. In this case, the elements of u can be expressed in terms of the transform parameter W as follows: u_0 = W^3 / ((W − 1)(W^2 − 1)), u_1 = −W^{3/2} / (W − 1)^2, and u_2 = W^2 / ((W − 1)(W^2 − 1)). Plugging this vector into Eq. (70) leads to the inverse matrix Ŵ^{-1} (see Appendix A). In contrast to the 2-by-2 case, the 3-by-3 inverse matrix is no longer Toeplitz because the elements along its main diagonal are not identical. Repeating this process for other values of n allowed us to derive a general formula for the elements of u. The next two sections state and prove this formula.

B. General Formula for the Generating Vector u
In the general case, for each k ∈ {0, 1, 2, . . . , n − 1}, the elements of the vector u can be computed using the following formula:

u_k = (−1)^k · W^{(2k² − (2n−1)k + n(n−1))/2} / ( ∏_{s=1}^{n−k−1} (W^s − 1) · ∏_{s=1}^{k} (W^s − 1) ),   (80)

which is proven in the next section. For each k, the value of u_k can be computed in O(1) time using precomputed values of the products that appear in the denominator of Eq. (80). These values can be computed in a separate loop that runs in O(n) time.
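The formula can be checked numerically by verifying that Ŵu = e_0, i.e., that u is the first column of Ŵ^{-1}. The following sketch is our own illustration (a real W > 1 is used to sidestep complex branch-cut issues):

```python
# Check that the closed-form u satisfies W_hat u = e_0, where
# W_hat[j][k] = W^(-(j-k)^2/2) is the symmetric Toeplitz matrix.
from math import prod

def generating_vector(n, W):
    u = []
    for k in range(n):
        num = (-1) ** k * W ** ((2 * k * k - (2 * n - 1) * k + n * (n - 1)) / 2)
        den = (prod(W ** s - 1 for s in range(1, n - k))      # s = 1 .. n-k-1
               * prod(W ** s - 1 for s in range(1, k + 1)))   # s = 1 .. k
        u.append(num / den)
    return u

n, W = 8, 1.1
u = generating_vector(n, W)
W_hat = [[W ** (-((j - k) ** 2) / 2) for k in range(n)] for j in range(n)]
for j in range(n):
    dot = sum(W_hat[j][k] * u[k] for k in range(n))
    assert abs(dot - (1.0 if j == 0 else 0.0)) < 1e-8
```

Precomputing the two running products, as the text suggests, turns the per-element cost into O(1) after an O(n) setup loop.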
The vector u can then be plugged into Eq. (70) to compute the inverse matrix Ŵ^{-1}. It can also be used as an input to O(n log n) algorithms that compute the product of Ŵ^{-1} with a vector, without actually storing Ŵ^{-1} in memory. This approach is used by the O(n log n) ICZT algorithm that is shown in Algorithm 2.

C. Proof of the General Formula for the Generating Vector u
This section shows that for each n ∈ N the vector u, which is defined by Eq. (80), is indeed the generating vector for the inverse matrix Ŵ^{-1}. Theorem 5 implies that this can be done by showing that u is the first column of Ŵ^{-1}.
In the proofs it is necessary to distinguish between the vector u = (u_0, u_1, . . . , u_{n−1}) of length n and the vector u = (u_0, u_1, . . . , u_{n−1}, u_n) of length n+1. Thus, the number of dimensions will be indicated with a left superscript for both u and its elements. Using this notation, the k-th element of ⁿu = (ⁿu_0, ⁿu_1, . . . , ⁿu_{n−1}) is given by:

ⁿu_k = (−1)^k · W^{(2k² − (2n−1)k + n(n−1))/2} / ( ∏_{s=1}^{n−k−1} (W^s − 1) · ∏_{s=1}^{k} (W^s − 1) ).   (81)
The following lemma expresses ⁿ⁺¹u_k in terms of ⁿu_k.

Lemma 6. Let n ∈ N and let k ∈ {0, 1, . . . , n−1}. Then, ⁿ⁺¹u_k = (W^{n−k} / (W^{n−k} − 1)) · ⁿu_k.

Proof. Replacing n with n+1 in Eq. (81) multiplies the power of W in the numerator by W^{n−k} and adds the extra factor (W^{n−k} − 1) to the first product in the denominator, which proves the lemma.
The next lemma expresses ⁿu_{k−1} in terms of ⁿu_k.

Lemma 7. Let n ∈ N and let k ∈ {1, 2, . . . , n−1}. Then, ⁿu_{k−1} can be expressed as follows:

ⁿu_{k−1} = −W^{(2n−4k+1)/2} · ((W^k − 1) / (W^{n−k} − 1)) · ⁿu_k.   (86)

Proof. Replacing k with k − 1 in Eq. (81) leads to Eq. (87).
Let p/2 be the power of W in the numerator of Eq. (81) and q/2 be the power of W in the numerator of Eq. (87). Then, q can be expressed in terms of p using the formula q = p + 2n − 4k + 1. Therefore, W^{q/2} = W^{(2n−4k+1)/2} · W^{p/2}. The first product in the denominator of Eq. (87) is almost the same as the corresponding product in Eq. (81), except for the extra term (W^{n−k} − 1). The second product in the denominator of Eq. (87) can be expressed using the corresponding product in Eq. (81), but it has one more term, i.e., 1/(W^k − 1). Combining Eqs. (89), (90), and (91) leads to Eq. (86).
The following lemma states a recursive formula for ⁿ⁺¹u_k, which is expressed in terms of ⁿu_k and ⁿu_{k−1}.
Lemma 8. For each n ∈ N and each k ∈ {1, 2, . . . , n−1}, the value of ⁿ⁺¹u_k can be expressed as follows:

ⁿ⁺¹u_k = (W^{(2k−1)/2} · ⁿu_{k−1} − W^n · ⁿu_k) / (1 − W^n).   (92)

Proof. Using Lemma 6, the left-hand side of Eq. (92) can be expressed in terms of ⁿu_k. Using Lemma 7, which expresses ⁿu_{k−1} in terms of ⁿu_k, the right-hand side of Eq. (92) can also be stated in terms of ⁿu_k. The rest of the proof shows that the two expressions are equal. After substituting Eq. (86), the numerator of the right-hand side simplifies to ⁿu_k · W^{n−k} (1 − W^n) / (W^{n−k} − 1). Dividing by (1 − W^n) yields (W^{n−k} / (W^{n−k} − 1)) · ⁿu_k, which matches the left-hand side given by Lemma 6 and completes the proof.
Let ⁿP(x) be the following polynomial:

ⁿP(x) = ⁿp_0 + ⁿp_1 x + · · · + ⁿp_{n−1} x^{n−1},   (98)

where the k-th coefficient ⁿp_k is obtained by multiplying the k-th element of the vector ⁿu by W^{−k²/2}. In other words,

ⁿp_k = ⁿu_k · W^{−k²/2}.   (99)

Let ⁿF(x) be another polynomial, defined in Eq. (100) as the Lagrange polynomial 30, 31 that interpolates the n points (x_0, y_0), (x_1, y_1), . . . , (x_{n−1}, y_{n−1}), with basis polynomials given by Eq. (101). In particular, if x_0 = W^0, x_1 = W^1, . . . , x_{n−1} = W^{n−1} and, furthermore, y_0 = 1 and y_1 = y_2 = · · · = y_{n−1} = 0, then only the first basis polynomial contributes to ⁿF(x) and Eq. (100) simplifies to only one product that will be denoted with ⁿL(x), i.e.,

ⁿL(x) = ∏_{s=1}^{n−1} (x − W^s) / (1 − W^s).   (103)

For n+1, the formula for ⁿ⁺¹P(x) uses the elements of ⁿ⁺¹u instead of the elements of ⁿu. Also, the formula for ⁿ⁺¹L(x) includes one more fractional term. More formally,

ⁿ⁺¹L(x) = ⁿL(x) · (x − W^n) / (1 − W^n).   (105)

We will use mathematical induction to show that ⁿP(x) = ⁿL(x) for each n ∈ N. The following lemma proves that ¹P(x) = ¹L(x), which is the base case of the induction.
Lemma 9. For each x ∈ C, ¹P(x) = ¹L(x) = 1.

Proof. The product that defines ⁿL(x) in Eq. (103) becomes degenerate when n = 1 and evaluates to 1, i.e., ¹L(x) = 1. Similarly, for n = 1, Eq. (98) simplifies to ¹P(x) = ¹p_0 = ¹u_0 · W^0. This formula evaluates to 1 because Eq. (81) implies that ¹u_0 = 1.

The following lemma proves an inductive formula that expresses the coefficients of the Lagrange polynomial ⁿ⁺¹L(x) in terms of the coefficients of ⁿL(x).
Lemma 10. Let ⁿℓ_k denote the k-th coefficient of the Lagrange polynomial ⁿL(x) that is defined in Eq. (103), i.e., ⁿL(x) = ⁿℓ_0 + ⁿℓ_1 x + · · · + ⁿℓ_{n−1} x^{n−1}. Then, for each n ∈ N and each k ∈ {0, 1, 2, . . . , n}, the value of ⁿ⁺¹ℓ_k in ⁿ⁺¹L(x) is given by the following formula:

ⁿ⁺¹ℓ_k = (ⁿℓ_{k−1} − W^n · ⁿℓ_k) / (1 − W^n),

where, by convention, ⁿℓ_{−1} = ⁿℓ_n = 0.

Proof. Expanding Eq. (105) in terms of the coefficients ⁿℓ_k leads to Eq. (111). The terms in Eq. (111) can be grouped by the powers of x. The coefficients of the resulting polynomial are equal to the coefficients ⁿ⁺¹ℓ_k of ⁿ⁺¹L(x), which proves the formula stated above.

The next lemma expresses the coefficients of ⁿ⁺¹P(x) in terms of the coefficients of ⁿP(x).
Lemma 11. For each n ∈ N and each k ∈ {0, 1, 2, . . . , n}, the coefficients of ⁿ⁺¹P(x) satisfy the following recurrence:

ⁿ⁺¹p_k = (ⁿp_{k−1} − W^n · ⁿp_k) / (1 − W^n),   (113)

where, by convention, ⁿp_{−1} = ⁿp_n = 0.

Proof. There are three cases.
1) Suppose that k = 0. Then, Eq. (99) implies that ⁿ⁺¹p_0 = ⁿ⁺¹u_0 · W^0 = ⁿ⁺¹u_0. Using Lemma 6, this derivation can be continued as follows: ⁿ⁺¹u_0 = (W^n / (W^n − 1)) · ⁿu_0 = (−W^n · ⁿp_0) / (1 − W^n), which proves the first case in Eq. (113).
2) Suppose that k ∈ {1, 2, . . . , n−1}. Then, Lemma 8 and Eq. (99) imply that ⁿ⁺¹p_k can be expressed as follows: ⁿ⁺¹p_k = ⁿ⁺¹u_k · W^{−k²/2} = ((W^{(2k−1)/2} · ⁿu_{k−1} − W^n · ⁿu_k) / (1 − W^n)) · W^{−k²/2}. Using basic algebra and the identity (2k − 1) − k² = −(k − 1)², this can be simplified as follows: ⁿ⁺¹p_k = (ⁿu_{k−1} · W^{−(k−1)²/2} − W^n · ⁿu_k · W^{−k²/2}) / (1 − W^n) = (ⁿp_{k−1} − W^n · ⁿp_k) / (1 − W^n).
3) Suppose that k = n. Then, using Eqs. (81) and (99), ⁿ⁺¹p_n can be expressed in terms of ⁿp_{n−1}. Because n² = (n − 1)² + 2n − 1, this derivation can be continued to show that ⁿ⁺¹p_n = ⁿp_{n−1} / (1 − W^n), which completes the proof.

The following theorem completes the inductive argument by showing that ⁿL(x) = ⁿP(x) for each integer n.
Theorem 12. For each n ∈ N and each x ∈ C, the values of ⁿL(x) and ⁿP(x) are equal, i.e., ⁿL(x) = ⁿP(x). (121)

Proof. The proof is by mathematical induction. Lemma 9 proved Eq. (121) for n = 1, which forms the base case of the induction. The inductive step is formed by Lemma 11 and Lemma 10, which showed that the coefficients of ⁿ⁺¹L(x) can be derived from the coefficients of ⁿL(x) using the same formula that derives the coefficients of ⁿ⁺¹P(x) from the coefficients of ⁿP(x). This implies that ⁿ⁺¹P(x) = ⁿ⁺¹L(x) whenever ⁿL(x) = ⁿP(x). Taken together, the base case and the inductive step complete the inductive proof and show that ⁿL(x) = ⁿP(x) for each n ∈ N and each x ∈ C.
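The identity ⁿL(x) = ⁿP(x) can also be checked numerically by comparing coefficients: expanding the Lagrange product one linear factor at a time must reproduce p_k = u_k W^{−k²/2}. The sketch below is our own illustration, using the closed-form expression for u_k from Eq. (80):

```python
# Compare the coefficients of L(x) = prod_{s=1}^{n-1} (x - W^s)/(1 - W^s)
# with p_k = u_k * W^(-k^2/2), where u_k is given by the closed formula.
from math import prod

n, W = 6, 1.3

# Coefficients of P(x) via the formula for u_k.
p = []
for k in range(n):
    num = (-1) ** k * W ** ((2 * k * k - (2 * n - 1) * k + n * (n - 1)) / 2)
    den = (prod(W ** s - 1 for s in range(1, n - k))
           * prod(W ** s - 1 for s in range(1, k + 1)))
    p.append((num / den) * W ** (-(k * k) / 2))

# Coefficients of L(x) by expanding the product one linear factor at a time.
ell = [1.0]
for s in range(1, n):
    root, scale = W ** s, 1 - W ** s
    nxt = [0.0] * (len(ell) + 1)
    for k, a in enumerate(ell):
        nxt[k] += -root * a / scale      # contribution of the constant term of (x - root)
        nxt[k + 1] += a / scale          # contribution of the x term
    ell = nxt

assert all(abs(a - b) < 1e-9 * max(1.0, abs(b)) for a, b in zip(ell, p))
```

Both coefficient lists have length n, as expected for a degree-(n−1) polynomial.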
The next theorem combines the results from this section to show that the vector u is indeed the first column of the inverse matrix Ŵ^{-1}. This result follows from Theorem 12 when x is a power of W, i.e., x ∈ {W^0, W^1, W^2, . . . , W^{n−1}}.

SUPPLEMENTARY APPENDIX G: CZT AND ICZT ALGORITHMS THAT REVERSE THE CONTOUR DIRECTION WHEN |W | < 1
This appendix describes alternative versions of the CZT and the ICZT algorithms that improve the numerical accuracy of the computed transforms. This is achieved by reversing the direction of the chirp contour when |W| < 1 and keeping the original direction when |W| ≥ 1. The parameters for the reversed contour are W′ = W^{−1} and A′ = A · W^{−(M−1)}.
This appendix also proves that this parameter adjustment does not affect the mathematical definitions of the CZT and the ICZT. In other words, the modified algorithms compute the same transforms, but the numerical precision of their results can improve by several orders of magnitude when the magnitude of the transform parameter W is less than 1.
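The parameter adjustment can be verified with a direct O(MN) evaluation of the CZT sums. The sketch below is our own illustration; it assumes the standard definition X_k = Σ_j x_j A^{−j} W^{jk} and confirms that the reversed-contour transform equals the original transform with its output order reversed:

```python
# Check that the CZT with W' = 1/W and A' = A * W^(-(M-1)) equals the
# original CZT with the output vector reversed.
import cmath

def czt_direct(x, M, A, W):
    """Direct O(MN) evaluation of X_k = sum_j x_j * A^(-j) * W^(jk)."""
    return [sum(xj * A ** (-j) * W ** (j * k) for j, xj in enumerate(x)) for k in range(M)]

M = 5
x = [1 + 1j, 2, -1, 0.5j]
A = 1.1 * cmath.exp(0.2j)
W = 0.9 * cmath.exp(-0.5j)                  # |W| < 1, so the contour should be reversed

X = czt_direct(x, M, A, W)
X_rev = czt_direct(x, M, A * W ** (-(M - 1)), 1 / W)
assert all(abs(a - b) < 1e-9 for a, b in zip(X, X_rev[::-1]))
```

The algebraic reason is that (A W^{−(M−1)})^{−j} (W^{−1})^{jk} = A^{−j} W^{j(M−1−k)}, so the k-th output of the reversed transform equals the (M−1−k)-th output of the original one.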
The following lemma shows that the Vandermonde matrix W associated with the parameter W can be mapped to the Vandermonde matrix W′ associated with W′ = W^{−1}.
Lemma 14. Let $W$ be a non-zero complex number and let $W$ be the M-by-N Vandermonde matrix defined by Eq. (3), i.e., the element in the $i$-th row and $j$-th column of $W$ is $W^{ij}$, for each $i \in \{0, 1, \ldots, M-1\}$ and each $j \in \{0, 1, \ldots, N-1\}$. Also, let $W'$ be the Vandermonde matrix specified by the same formula but using $W^{-1}$ instead of $W$, i.e., the element of $W'$ in row $i$ and column $j$ is $W^{-ij}$. Furthermore, let $S$ be the N-by-N diagonal matrix generated by the last row of the matrix $W$, i.e., $S = \mathrm{diag}\big(W^{(M-1)\cdot 0}, W^{(M-1)\cdot 1}, \ldots, W^{(M-1)(N-1)}\big)$. (129) Then, the matrix $W$ can be expressed as the following triple matrix product: $W = J\,W'\,S$, (130) where $J$ denotes the M-by-M exchange matrix, i.e., the matrix with ones on its anti-diagonal and zeros everywhere else.

Proof. It follows from the definitions of $W'$ and $S$ that each element of the matrix product $W'S$ is given by the following formula: $(W'S)_{ij} = W^{-ij}\,W^{(M-1)j} = W^{(M-1-i)j}$, for each $i \in \{0, 1, \ldots, M-1\}$ and each $j \in \{0, 1, \ldots, N-1\}$. This implies that the matrix product $W'S$ is equal to the matrix obtained by reversing the rows of the matrix $W$. This operation can be formalized using the exchange matrix $J$, i.e., $J\,W = W'\,S$. Because the matrix $J$ is its own inverse, Eq. (130) can be derived by pre-multiplying both sides of the previous equation by $J$, i.e., $W = J\,J\,W = J\,W'\,S$.
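Lemma 14 can be checked numerically with small dense matrices. The sketch below builds the two Vandermonde matrices directly from their element formulas (the sizes and the value of W are arbitrary):

```python
import numpy as np

# Numerical check of Lemma 14: W = J @ W' @ S.
M, N = 4, 5
W = 0.8 * np.exp(0.25j)

i, j = np.arange(M).reshape(-1, 1), np.arange(N)
V  = W ** (i * j)            # Vandermonde matrix for W:  V[i, j] = W**(i*j)
Vp = (1 / W) ** (i * j)      # Vandermonde matrix for W' = W**(-1)

S = np.diag(V[-1, :])        # diagonal matrix from the last row of V
J = np.eye(M)[::-1]          # M-by-M exchange matrix

print(np.allclose(V, J @ Vp @ S))  # True
```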
The next lemma shows how the diagonal matrix $A$, which is associated with the parameter $A$, can be mapped to the diagonal matrix $A'$ that is associated with $A' = A\,W^{-(M-1)}$.
Lemma 15. Let $A$ and $W$ be two non-zero complex numbers. Let $A$ be the diagonal matrix defined by Eq. (8), i.e., $A = \mathrm{diag}\big(A^{-0}, A^{-1}, \ldots, A^{-(N-1)}\big)$. Also, let $A'$ be the following N-by-N diagonal matrix: $A' = \mathrm{diag}\big((A')^{-0}, (A')^{-1}, \ldots, (A')^{-(N-1)}\big)$, where $A' = A\,W^{-(M-1)}$. Then, the matrix $A'$ can be expressed as the product of the matrix $S$ defined by Eq. (129) and the matrix $A$, i.e., $A' = S\,A$. (137)

Proof. Both $A$ and $S$ are diagonal matrices. This implies that both matrices are invariant with respect to transposition. Each diagonal element of the matrix product $SA$ is equal to the product of the corresponding elements of $S$ and $A$; all other elements of $SA$ are zero. Thus, the $k$-th diagonal element of $SA$ is $W^{(M-1)k}A^{-k} = \big(A\,W^{-(M-1)}\big)^{-k} = (A')^{-k}$. Therefore, $S\,A = A'$, which completes the proof.
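Lemma 15 admits the same kind of small dense check as Lemma 14 (again with arbitrarily chosen sizes and parameters):

```python
import numpy as np

# Numerical check of Lemma 15: A' = S @ A.
M, N = 4, 5
A = 1.2 * np.exp(0.4j)
W = 0.8 * np.exp(0.25j)
Ap = A * W ** (-(M - 1))         # A' = A * W^{-(M-1)}

j = np.arange(N)
Amat  = np.diag(A  ** -j)        # diag(A^0, A^-1, ..., A^-(N-1))
Apmat = np.diag(Ap ** -j)
S = np.diag(W ** ((M - 1) * j))  # last row of the Vandermonde matrix for W

print(np.allclose(Apmat, S @ Amat))  # True
```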
The following theorem proves that computing the CZT without contour reversal is mathematically equivalent to computing the CZT with contour reversal and then reversing the order of the elements in the output vector.
Theorem 16. Let $M$ and $N$ be two positive integers and let $A, W \in \mathbb{C} \setminus \{0\}$ be two non-zero complex numbers. Let $x = (x_0, x_1, x_2, \ldots, x_{N-1}) \in \mathbb{C}^N$ be a complex input vector for a CZT that is parametrized by $M$, $W$, and $A$. Let $X$ be the CZT output vector. More formally, $X_k = \sum_{n=0}^{N-1} x_n\,A^{-n}\,W^{nk}$ for each $k \in \{0, 1, \ldots, M-1\}$. (140) Let $A' = A\,W^{-(M-1)}$ and $W' = W^{-1}$ be the parameters for the reversed chirp contour. Let $X'$ be the output vector of a CZT that is parametrized by $M$, $W'$, and $A'$. That is, $X'_k = \sum_{n=0}^{N-1} x_n\,(A')^{-n}\,(W')^{nk}$. Then, $X$ can be obtained by reversing the order of the elements in $X'$. More formally, $X = J\,X'$. In other words, reversing the direction of the chirp contour reverses the order of the elements in the output vector.
Proof. Eq. (140) has the following matrix form: $X = W\,A\,x$, (143) where $W$ is the Vandermonde matrix defined by Eq. (3) and $A$ is the diagonal matrix defined by Eq. (8). Similarly, the vector $X'$ can be expressed as follows: $X' = W'\,A'\,x$. From Lemma 14 it follows that the matrix $W$ in Eq. (143) can be replaced with the matrix product $J\,W'\,S$, i.e., $X = J\,W'\,S\,A\,x$. From Lemma 15 it follows that $A = S^{-1}A'$. Therefore, $X = J\,W'\,S\,S^{-1}A'\,x = J\,W'\,A'\,x = J\,X'$, which proves that the vector $X$ can be obtained by reversing the elements of the vector $X'$.
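Theorem 16 can be verified numerically with a dense $O(MN)$ reference CZT, which stands in for the fast algorithm (the parameter values below are arbitrary):

```python
import numpy as np

def czt_dense(x, M, W, A):
    """Dense O(MN) reference CZT: X[i] = sum_j W**(i*j) * A**(-j) * x[j]."""
    i, j = np.arange(M).reshape(-1, 1), np.arange(len(x))
    return (W ** (i * j)) @ (A ** (-j) * x)

rng = np.random.default_rng(0)
M = N = 8
x = rng.standard_normal(N) + 1j * rng.standard_normal(N)
A = 1.1 * np.exp(0.3j)
W = 0.9 * np.exp(0.2j)

X  = czt_dense(x, M, W, A)                        # original contour
Xp = czt_dense(x, M, 1 / W, A * W ** (-(M - 1)))  # reversed contour

print(np.allclose(X, Xp[::-1]))  # True: X = J X'
```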
The next theorem proves that computing the ICZT without contour reversal is theoretically equivalent to computing the ICZT with reversal of both the contour and the input vector. The theorem is stated for the square case, i.e., when M = N .
Theorem 17. Let $N$ be a positive integer and let $A$ and $W$ be two non-zero complex numbers. Let $X = (X_0, X_1, \ldots, X_{N-1})$ be a complex vector that is used as an input to an ICZT parametrized by $N$, $W$, and $A$. Let $x$ be the result of this transform. More formally, $x = A^{-1}W^{-1}X$. Let $A' = A\,W^{-(M-1)}$ and $W' = W^{-1}$ be the parameters for the reversed chirp contour, where $M = N$. Let $x'$ be the output vector of the ICZT parametrized by $N$, $W'$, and $A'$ for which the input vector is set to $J\,X$. That is, $x' = (A')^{-1}(W')^{-1}J\,X$. Then, $x$ and $x'$ are equal, i.e., $x = x'$.
Proof. This result can be proven by expressing $x$ and $x'$ in matrix form and showing that the two expressions are equal. From Eqs. (130) and (137) it follows that $W' = J\,W\,S^{-1}$ and $A' = S\,A$. Thus, the vector $x'$ is equal to: $x' = (A')^{-1}(W')^{-1}J\,X = A^{-1}S^{-1}\,S\,W^{-1}J\,J\,X = A^{-1}W^{-1}X$. (149) Similarly, the vector $x$ can be derived by inverting the matrix expression for the CZT, i.e., $x = A^{-1}W^{-1}X$. (150) This completes the proof, because Eqs. (149) and (150) are equal.
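Theorem 17 can also be checked numerically. The sketch below uses `np.linalg.solve` on the dense transform matrix as a slow $O(N^3)$ stand-in for the fast ICZT algorithm (square case, arbitrary parameters):

```python
import numpy as np

def czt_matrix(N, W, A):
    """Dense N-by-N CZT matrix W A (square case)."""
    i, j = np.arange(N).reshape(-1, 1), np.arange(N)
    return (W ** (i * j)) @ np.diag(A ** (-j))

rng = np.random.default_rng(1)
N = 8
X = rng.standard_normal(N) + 1j * rng.standard_normal(N)
A = 1.1 * np.exp(0.3j)
W = 0.9 * np.exp(0.2j)
Ap, Wp = A * W ** (-(N - 1)), 1 / W   # reversed-contour parameters
J = np.eye(N)[::-1]                   # exchange matrix

x  = np.linalg.solve(czt_matrix(N, W, A), X)        # original contour
xp = np.linalg.solve(czt_matrix(N, Wp, Ap), J @ X)  # reversed contour, reversed input

print(np.allclose(x, xp))  # True: x = x'
```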
Algorithm S8 gives the pseudo-code for the alternative version of the CZT algorithm that reverses the direction of the chirp contour when $|W| < 1$. It is implemented as a wrapper around Algorithm 1. Similarly, Algorithm S9 shows the pseudo-code for the alternative version of the ICZT algorithm, which also reverses the direction of the chirp contour when $|W| < 1$. The computational complexity of both algorithms remains unchanged, i.e., it is still $O(n \log n)$.
For improved numerical accuracy, lines 6 and 7 in both Algorithm S8 and Algorithm S9 should be computed with higher precision if possible. The reason is that even a small numerical error in the computed values of $A'$ and $W'$ affects all subsequent operations.
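The wrapper logic of Algorithm S8 can be sketched as follows. This is not the paper's pseudo-code: `czt_dense` is a dense stand-in for the fast Algorithm 1, and the function only illustrates the branch on $|W|$ and the output reversal:

```python
import numpy as np

def czt_dense(x, M, W, A):
    """Dense O(MN) reference CZT (a stand-in for the fast Algorithm 1)."""
    i, j = np.arange(M).reshape(-1, 1), np.arange(len(x))
    return (W ** (i * j)) @ (A ** (-j) * x)

def czt_with_reversal(x, M, W, A):
    """Sketch of the Algorithm S8 wrapper: reverse the contour when |W| < 1."""
    if abs(W) >= 1:
        return czt_dense(x, M, W, A)
    # Reversed-contour parameters (ideally computed at higher precision).
    Wp = 1 / W                  # W' = W^{-1}
    Ap = A * W ** (-(M - 1))    # A' = A * W^{-(M-1)}
    return czt_dense(x, M, Wp, Ap)[::-1]   # undo the output reversal

x = np.arange(1.0, 5.0)
W = 0.9 * np.exp(0.2j)          # |W| < 1 triggers the reversal branch
X_wrap = czt_with_reversal(x, 4, W, 1.1)
X_ref  = czt_dense(x, 4, W, 1.1)
print(np.allclose(X_wrap, X_ref))  # True: same transform, different contour direction
```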
The following example illustrates Theorem 16. In the 3-by-3 case, the CZT can be expressed as follows: $X = W\,A\,x$. (151) Previously, it was shown that $W = J\,W'\,S$, $A = S^{-1}A'$, and $X' = W'\,A'\,x$. Thus, Eq. (151) can be stated as follows: $X = J\,W'\,S\,S^{-1}A'\,x = J\,W'\,A'\,x = J\,X'$. To summarize, if the CZT is computed for a reversed chirp contour, i.e., $X' = W'\,A'\,x$, then the output vector $X = W\,A\,x$ for the original contour can be computed by reversing the order of the elements in the vector $X'$, i.e., $X = J\,X'$.
The next example illustrates Theorem 17, which showed that $x' = (A')^{-1}(W')^{-1}J\,X = A^{-1}W^{-1}X = x$. For the 3-by-3 ICZT case, this result follows from $(A')^{-1} = A^{-1}S^{-1}$ and $(W')^{-1} = (J\,W\,S^{-1})^{-1} = S\,W^{-1}J$, and also from the fact that the matrix $J$ is its own inverse. In other words, when computing the ICZT for the reversed chirp contour, the elements of the input vector need to be reversed in order to get the same result as the ICZT for the original chirp contour.

[Figure S3. Absolute numerical error of the ICZT algorithm, computed without contour reversal (a) and with contour reversal (b). The difference between (a) and (b) is shown in (c). The surfaces were generated for M = 64 as functions of the transform parameters A and W, which were discretized as described in the Methods section. Each of the 5,200 points on each surface was computed by averaging the numerical errors over 10 random unit-length input vectors.]

Figure S2 compares the numerical accuracy of the CZT algorithm with and without contour reversal. The surface in Fig. S2a was computed with Algorithm 1, which does not reverse the contour. The surface in Fig. S2b was computed with Algorithm S8, which reverses the direction of the chirp contour when $|W| < 1$. The difference between these two surfaces is shown in Fig. S2c, where positive values indicate that the error in (a) is higher than the error in (b).
The numerical error is equal to $\|\hat{X} - X\|$, where $\hat{X}$ is the computed output vector and $X$ is the true but unknown output vector. In this experiment, $\hat{X}$ was computed using 128-bit floating-point numbers and $X$ was computed with 1024-bit numbers. In both cases, the mpmath library [34] was used to software-emulate these high-precision floating-point numbers. The values of $X$ for both Fig. S2a and Fig. S2b were computed with Algorithm 1, i.e., without contour reversal.

Figure S3 summarizes the results of a similar experiment for the ICZT. The error is given by $\|\hat{x} - x\|$, where $\hat{x}$ is the computed output vector and $x$ is the true but unknown output vector. The value of $\hat{x}$ was computed with 128-bit precision in Fig. S3a and Fig. S3b, using Algorithm 2 and Algorithm S9, respectively. For both of these figures, $x$ was computed with 1024-bit precision using only Algorithm 2, i.e., the ICZT without contour reversal. Figure S3c shows the pointwise difference between the two surfaces, i.e., (c) = (a) − (b). The difference is zero when $|W| > 1$. For contours with $|W| < 1$, however, the difference is positive and can be as high as 7 orders of magnitude in this case, i.e., when $M = 64$.
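The error-measurement protocol can be sketched with mpmath. The example below is a simplified version of the experiment: it runs a direct CZT summation (not the fast algorithm) at 113 and 997 significand bits, matching the stated precisions of 128-bit and 1024-bit floats, and measures the Euclidean distance between the two results. The input vector and parameter values are arbitrary choices:

```python
from mpmath import mp, mpc, sqrt

def czt_mp(x, M, W, A):
    """Direct O(MN) CZT summation at the current mpmath precision."""
    return [sum(W**(i * j) * A**(-j) * x[j] for j in range(len(x)))
            for i in range(M)]

def run(prec_bits):
    mp.prec = prec_bits
    x = [mpc(1) / (k + 1) for k in range(8)]
    return czt_mp(x, 8, mpc('0.9'), mpc('1.1'))

X_lo = run(113)   # 128-bit floats carry 113 significand bits
X_hi = run(997)   # 1024-bit floats carry 997 significand bits

mp.prec = 997
err = sqrt(sum(abs(a - b)**2 for a, b in zip(X_lo, X_hi)))
print(err)        # absolute numerical error of the low-precision run
```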
These results may be explained by an asymmetry in the condition number of the matrix $\hat{W}$ (see Eq. (9)). For $|W| < 1$, the condition number grows much faster than for $|W| > 1$.
In log space, the contour reversal can be expressed as:

$\begin{pmatrix} \log|W'| \\ \log|A'| \end{pmatrix} = \begin{pmatrix} -1 & 0 \\ -(M-1) & 1 \end{pmatrix} \begin{pmatrix} \log|W| \\ \log|A| \end{pmatrix}.$

The eigenvectors of this matrix are $(0, 1)$ and $\big(1, \frac{M-1}{2}\big)$. The surfaces in Figs. S2 and S3 are bent along two lines that correspond to these two eigenvectors. The first line is $\log_2 |W|^M = 0$; the second line is $\log_2 |A| = \frac{M-1}{2M}\log_2 |W|^M$.

To summarize, this appendix showed that the numerical accuracy of both the CZT and the ICZT can be improved by several orders of magnitude for chirp contours that are growing logarithmic spirals, i.e., when $|W| < 1$. In those cases, the revised algorithms reverse the direction of the chirp contour. Furthermore, this contour reversal must be accompanied by a reversal of the order of the elements in the vector $X$, which is either the CZT output vector or the ICZT input vector.
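The log-space reversal map and its eigenvectors can be checked numerically. The 2-by-2 matrix below, acting on $(\log|W|, \log|A|)$, is reconstructed from the relations $\log|W'| = -\log|W|$ and $\log|A'| = \log|A| - (M-1)\log|W|$; the stated eigenvectors follow from it:

```python
import numpy as np

M = 64
R = np.array([[-1.0, 0.0],
              [-(M - 1), 1.0]])   # acts on (log|W|, log|A|)

# The map sends (log|W|, log|A|) to (-log|W|, log|A| - (M-1) log|W|).
logW, logA = -0.3, 0.2
print(np.allclose(R @ [logW, logA], [-logW, logA - (M - 1) * logW]))  # True

# Eigenvectors (0, 1) and (1, (M-1)/2), with eigenvalues 1 and -1:
print(np.allclose(R @ [0, 1], [0, 1]))                                # True
print(np.allclose(R @ [1, (M - 1) / 2], [-1, -(M - 1) / 2]))          # True
```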
To make the contour reversal rules easier to remember, one can apply the following guidelines that use the four contour colors from Fig. 4. A blue contour should never be reversed because it is always a decaying logarithmic spiral. A green contour should be reversed only if it describes a growing spiral and left unchanged if it describes a decaying spiral. Similarly, a red contour should be reversed if it describes a growing spiral and left unchanged otherwise. Finally, a black contour (i.e., one that starts inside the unit circle and ends outside the unit circle) should always be reversed, which transforms it into a blue contour. In short, all chirp contours that are growing spirals should be reversed.

SUPPLEMENTARY APPENDIX H: LINK BETWEEN THE POLAR ANGLE OF THE PARAMETER A AND THE VECTOR x IN BOTH THE CZT AND THE ICZT

Theorem 18. Let $W$ be a non-zero complex number. Also, let $A$ and $B$ be two non-zero complex numbers such that $|A| = |B| = \mu$. Moreover, let $\alpha = \arg(A)$ be the polar angle of $A$ and let $\beta = \arg(B)$ be the polar angle of $B$, i.e., $A = \mu e^{i\alpha}$ and $B = \mu e^{i\beta}$. Finally, let $M$ and $N$ be two positive integers.
Let $x = (x_0, x_1, \ldots, x_{N-1}) \in \mathbb{C}^N$ be an input vector and let $X_A = (X_{A,0}, X_{A,1}, \ldots, X_{A,M-1}) \in \mathbb{C}^M$ be the complex vector of length $M$ that is equal to the CZT of $x$ parametrized by $M$, $W$, and $A$. More formally, $X_A = W\,A\,x$, where $W$ is the Vandermonde matrix generated by the parameter $W$ as defined by Eq. (3) and $A$ is the diagonal matrix generated by the first $N$ negative integer powers of the parameter $A$ starting from $A^{-0}$, i.e., $A = \mathrm{diag}\big(A^{-0}, A^{-1}, \ldots, A^{-(N-1)}\big)$. (158) Similarly, let $X_B = (X_{B,0}, X_{B,1}, X_{B,2}, \ldots, X_{B,M-1}) \in \mathbb{C}^M$ be the complex vector of size $M$ that is equal to the CZT of $x$ parametrized by $M$, $W$, and $B$. That is, $X_B = W\,B\,x$, where $B$ is a diagonal matrix generated by the first $N$ negative integer powers of the parameter $B$, i.e., $B = \mathrm{diag}\big(B^{-0}, B^{-1}, \ldots, B^{-(N-1)}\big)$. (160) Then, the vector $X_B$ can be expressed as the CZT of the vector $x' = D\,x$, where the transform is parametrized by $M$, $W$, and $A$ and where the diagonal matrix $D$ is defined as follows: $D = \mathrm{diag}\big(e^{i \cdot 0 (\alpha-\beta)}, e^{i \cdot 1 (\alpha-\beta)}, \ldots, e^{i(N-1)(\alpha-\beta)}\big)$. (161) In other words, $X_B = W\,A\,D\,x$. (162)

Proof. The proof follows from the definitions of the matrices $A$ and $B$. Combining Eqs. (158) and (155) leads to: $X_{A,k} = \sum_{n=0}^{N-1} x_n\,A^{-n}\,W^{nk}$. Similarly, combining Eqs. (160) and (156) leads to: $X_{B,k} = \sum_{n=0}^{N-1} x_n\,B^{-n}\,W^{nk}$. Because $B^{-n} = \mu^{-n}e^{-in\beta} = \mu^{-n}e^{-in\alpha}\,e^{in(\alpha-\beta)} = A^{-n}\,e^{in(\alpha-\beta)}$, it follows that $B = A\,D$. This implies that: $X_B = W\,B\,x = W\,A\,D\,x$, which proves Eq. (162).
Corollary 19. Changing the polar angle of the CZT transform parameter $A$ from $\alpha$ to $\beta$ is equivalent to changing the elements of the input vector $x$ by adding $k(\alpha-\beta)$ to the polar angle of each $x_k$. This change does not affect the norm of $x$.
Proof. This corollary follows from Eq. (162). That is, multiplying the input vector $x$ by the diagonal matrix $D$, which is generated by a vector of complex exponentials, changes only the polar angles of the elements of $x$ and leaves their magnitudes unchanged. More formally, let $x_k = |x_k|e^{i\theta_k}$ be the polar form of $x_k$ for each $k \in \{0, 1, 2, \ldots, N-1\}$. Then, $D_{kk}\,x_k = |x_k|\,e^{i(\theta_k + k(\alpha-\beta))}$. That is, the real term $k(\alpha - \beta)$ is added to the polar angle of $x_k$, which does not change the Euclidean norm of $x$.
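Theorem 18 and Corollary 19 can be illustrated numerically with a dense reference CZT (a stand-in for the fast algorithm; all parameter values below are arbitrary):

```python
import numpy as np

def czt_dense(x, M, W, A):
    """Dense O(MN) reference CZT: X[i] = sum_j W**(i*j) * A**(-j) * x[j]."""
    i, j = np.arange(M).reshape(-1, 1), np.arange(len(x))
    return (W ** (i * j)) @ (A ** (-j) * x)

rng = np.random.default_rng(2)
M = N = 8
x = rng.standard_normal(N) + 1j * rng.standard_normal(N)
W = np.exp(0.2j)
mu, alpha, beta = 1.1, 0.3, 0.7
A = mu * np.exp(1j * alpha)
B = mu * np.exp(1j * beta)

# D adds k*(alpha - beta) to the polar angle of x_k.
D = np.exp(1j * np.arange(N) * (alpha - beta))

print(np.allclose(czt_dense(x, M, W, B), czt_dense(D * x, M, W, A)))  # True
print(np.isclose(np.linalg.norm(D * x), np.linalg.norm(x)))           # True
```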
Theorem 20. Let $W$ be a non-zero complex number and let $A$ and $B$ be two non-zero complex numbers such that their magnitudes are equal, i.e., $|A| = |B| = \mu > 0$. Let $\alpha$ be the polar angle of $A$ and let $\beta$ be the polar angle of $B$. More formally, $A = \mu e^{i\alpha}$ and $B = \mu e^{i\beta}$. Let $x_A = (x_{A,0}, x_{A,1}, \ldots, x_{A,N-1})$ be the complex vector of length $N$ that is equal to the ICZT of $X$ parametrized by $N$, $W$, and $A$. That is, $x_A = A^{-1}W^{-1}X$, where $W$ is the Vandermonde matrix defined by Eq. (3) and $A$ is the diagonal matrix defined by Eq. (158).
Similarly, let $x_B = (x_{B,0}, x_{B,1}, \ldots, x_{B,N-1})$ be the complex vector of length $N$ that is equal to the ICZT of $X$ parametrized by $N$, $W$, and $B$, i.e., $x_B = B^{-1}W^{-1}X$, (170) where $B$ is the diagonal matrix defined by Eq. (160).
Then, $x_B = D^{-1}x_A$, where $D$ is the diagonal matrix defined by Eq. (161).
Proof. Previously, it was proven that $B = A\,D$. Thus, $B^{-1} = D^{-1}A^{-1}$. Plugging this expression into Eq. (170) completes the proof, i.e., $x_B = B^{-1}W^{-1}X = D^{-1}A^{-1}W^{-1}X = D^{-1}x_A$.

Corollary 21. Changing the polar angle of the ICZT transform parameter $A$ from $\alpha$ to $\beta$ is equivalent to changing the elements of the output vector $x$ by subtracting $k(\alpha-\beta)$ from the polar angle of each $x_k$. This does not affect the norm of $x$.
Proof. This corollary follows from Eq. (170), which implies that the polar angle of $x_{B,k}$ can be obtained from the polar angle of $x_{A,k}$ by subtracting $k(\alpha-\beta)$. In other words, $x_{B,k} = |x_{A,k}|\,e^{i(\theta_{A,k} - k(\alpha-\beta))}$, where $\theta_{A,k}$ is the polar angle of $x_{A,k}$. Therefore, $\|x_B\| = \|x_A\|$.
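Theorem 20 and Corollary 21 admit the same kind of numerical check, again using `np.linalg.solve` on the dense transform matrix as a stand-in for the fast ICZT (arbitrary parameters):

```python
import numpy as np

def czt_matrix(N, W, A):
    """Dense N-by-N CZT matrix W A (square case)."""
    i, j = np.arange(N).reshape(-1, 1), np.arange(N)
    return (W ** (i * j)) @ np.diag(A ** (-j))

rng = np.random.default_rng(3)
N = 8
X = rng.standard_normal(N) + 1j * rng.standard_normal(N)
W = np.exp(0.2j)
mu, alpha, beta = 1.1, 0.3, 0.7
A = mu * np.exp(1j * alpha)
B = mu * np.exp(1j * beta)
D = np.exp(1j * np.arange(N) * (alpha - beta))

xA = np.linalg.solve(czt_matrix(N, W, A), X)   # ICZT with parameter A
xB = np.linalg.solve(czt_matrix(N, W, B), X)   # ICZT with parameter B

print(np.allclose(xB, xA / D))                             # True: x_B = D^{-1} x_A
print(np.isclose(np.linalg.norm(xB), np.linalg.norm(xA)))  # True: norms match
```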

SUPPLEMENTARY APPENDIX I: APPROXIMATION FORMULAS FOR THE NUMERICAL ERROR
This appendix states formulas that approximate the absolute numerical error of the CZT algorithm, the ICZT algorithm, and their sequential applications. The formulas are given for the square case, in which $M = N$. They apply to the modified algorithms described in Appendix G, which reverse the direction of the chirp contour when $|W| < 1$. Finally, these formulas are suitable only for the case when the chirp contour is a logarithmic spiral that spans a 360° arc.
The error formulas are stated using the four terms in Eqs. (174)-(177), which are defined for each $k \in \{0, 1, 2, \ldots, N-1\}$. The first term corresponds to the factors used on line 7 of Algorithm 1. The second term is used on lines 8 and 11 of Algorithm 1 and on line 9 of Algorithm 2. The third term is used on line 16 of Algorithm 1. The fourth term is used on line 33 of Algorithm 2. All of these lines are inside loops that perform $N$ iterations (in the square case). Thus, each term specifies the $k$-th element of a vector. The error formulas use four aggregate terms, $T_1$, $T_2$, $T_3$, and $T_4$, that are computed by taking the logarithm of the Euclidean norm of these four vectors.

The error formulas that cover the ICZT algorithm also use three terms, $U_1$, $U_2$, and $U_3$, that are computed from the elements of the vector $u$, which is defined by Eq. (18). The term $U_1$ is equal to the logarithm of the Euclidean norm of a vector formed by all elements of the vector $u$ except $u_0$. It maps to lines 25 and 26 of Algorithm 2. The term $U_2$ is similar, but it uses all elements of the vector $u$, including $u_0$. It maps to lines 27 and 28 of Algorithm 2. Finally, $U_3$ is equal to the negative logarithm of the magnitude of $u_0$ (by definition, $u_0$ cannot be zero). It maps to line 30 of Algorithm 2.

All error formulas also include an offset $B$ that depends on $N$ and $p$, where $p$ is the number of precision bits in the IEEE-754 floating-point numbers [33] that are used to compute the transforms. For computations with 128, 256, 512, and 1024 bits, the value of $p$ is equal to 113, 237, 489, and 997, respectively. This offset determines the base level of the error surface.

For this evaluation, the terms in Eqs. (174)-(177) and the elements of the vector $u$ were computed with very high precision (1024 bits). This was done to reduce the effect of the numerical error in these terms on the overall numerical error. All other computations were performed using 128 bits.

A. Error Formula for the CZT Algorithm
The CZT maps the input vector $x$ to the output vector $\hat{X}$. The absolute numerical error is given by: $E = \|\hat{X} - X\|$, where $X$ is the true but unknown output vector of the transform. Figure S4a plots the decimal logarithm of $E$ as a function of $A$ and $W$. This surface can be approximated with the following error formula: $\log E \approx T_1 + T_2 + T_3 + B + \log\|x\|$. (187)
This formula fits the empirical data very well, as indicated by the R 2 values in the second column of Table S1. Furthermore, the quality of the fit improves as N increases.

B. Error Formula for the ICZT Algorithm
The ICZT algorithm maps $X$ to $\hat{x}$. Its absolute numerical error is given by: $E = \|\hat{x} - x\|$, where $\hat{x}$ is the computed output vector and $x$ is the true but unknown output vector. Figure S4b plots the decimal logarithm of $E$ as a function of $A$ and $W$. The error can be predicted using the following formula: $\log E \approx T_2 + T_4 + U_1 + U_2 + U_3 + B + \log\|X\|$. (189) The goodness of fit is reported in Table S1.

[Table S1. The R² values for fitting error models to empirically derived error surfaces. Each point on each surface was computed by averaging the numerical errors over 10 random unit-length input vectors. The discretizations for A and W are the same as in the main text. Perfect fits correspond to R² = 1. In all four cases, the R² values are very close to 1 and improve as N increases.]

[Figure S4 (partial caption): ... (c) the CZT followed by the ICZT; and (d) the ICZT followed by the CZT. These surfaces were computed using the modified algorithms that reverse the direction of the chirp contour when |W| < 1, i.e., using Algorithm S8 and Algorithm S9. The points in all surfaces represent the average error for 10 randomly generated unit-length input vectors. The ground-truth output vectors were approximated using 1024-bit floating-point numbers. All other computations used 128-bit numbers. Each surface consists of 5,200 points generated using the discretization described in the Methods section. That is, 52 evenly-distributed points in the range [0.5, 2.0] for |A| and 100 evenly-distributed points in the same range for |W|^M. All three axes in each of the four plots are scaled logarithmically.]

C. Error Formula for the CZT Followed by the ICZT
For the CZT-ICZT procedure, the absolute numerical error is equal to the Euclidean distance between the original input vector $x$ and the computed vector $\hat{x}$. More formally, $E = \|x - \hat{x}\|$. Figure S4c plots $E$ as a function of the transform parameters $A$ and $W$. The error values can be analytically approximated using the following formula: $\log E \approx T_1 + T_2 + T_4 + U_1 + U_2 + U_3 + B + \log\|x\|$. (191) The goodness of fit of this expression is evaluated in Table S1.
This formula can be viewed as a sum of the terms in Eqs. (187) and (189). Because the output scaling on line 16 of Algorithm 1 is immediately canceled by the input scaling on line 9 of Algorithm 2, the corresponding terms $T_3$ and $T_2$ are excluded from Eq. (191). The term $\log\|X\|$ is excluded as well. Also, the term $B$ is counted only once.

D. Error Formula for the ICZT Followed by the CZT
The absolute error of the ICZT-CZT procedure is given by: $E = \|\hat{X} - X\|$, where $X$ is the true output vector and $\hat{X}$ is the computed output vector. Figure S4d visualizes the decimal logarithm of the error, which can be approximated as follows: $\log E \approx 2T_2 + T_3 + U_1 + U_2 + U_3 + B + \log\|X\|$. (193) This formula can be viewed as a sum of the terms in Eqs. (187) and (189), except for $B$, which is included only once, and $\log\|x\|$, which is excluded. Also, the terms $T_1$ and $T_4$ are excluded because the output scaling performed on line 33 of the ICZT algorithm is immediately undone by the input scaling on line 7 of the CZT algorithm. This implies that the absolute numerical error is independent of $A$, which is confirmed by Fig. S4d. Furthermore, the last column of Table S1 shows that Eq. (193) approximates the empirical error very well.