Cube attacks on round-reduced TinyJAMBU

Lightweight cryptography has recently gained importance as the number of Internet of Things (IoT) devices connected to the Internet grows. Its main goal is to provide cryptographic algorithms that can be run efficiently in resource-limited environments such as IoT devices. To meet the challenge, the National Institute of Standards and Technology (NIST) announced the Lightweight Cryptography (LWC) project. One of the finalists of the project is the TinyJAMBU cipher. This work evaluates the security of the cipher. The tool used for the evaluation is the cube attack. We present five distinguishing attacks DA1–DA5 and two key recovery attacks KRA1–KRA2. The first two distinguishing attacks (DA1 and DA2) are launched against the initialisation phase of the cipher. The best result achieved for these attacks is a distinguisher with an 18-bit cube, where the cipher variant consists of the full initialisation phase together with 438 rounds of the encryption phase. The key recovery attacks (KRA1 and KRA2) are also launched against the initialisation phase of the cipher. The best key recovery attack can be applied to a cipher variant that consists of the full initialisation phase together with 428 rounds of the encryption phase. The attacks DA3–DA5 present a collection of distinguishers for up to 437 encryption rounds, whose 32-bit cubes are chosen from the plaintext, nonce, or associated data bits. The results are confirmed experimentally. A conclusion from the work is that TinyJAMBU has a better security margin against cube attacks than claimed by the designers.

www.nature.com/scientificreports/

The cube attack was introduced by Dinur and Shamir at EUROCRYPT 2009 8 . The attack sums output values of a black box polynomial $P$ over all possible values of a chosen collection of input variables. It aims to reduce the degree of $P$. The collection of input variables is called a cube $C$. The cube is uniquely determined by a set $I$ of input variable indices. A polynomial $P_{S(I)}$ obtained after summation over $C$ is called a superpoly. In 2009 Dinur and Shamir applied the cube attack against the Trivium stream cipher 8 . Since then, the attack has been used to analyse many other stream ciphers, see references [9][10][11][12][13][14][15][16][17][18] , for example. TinyJAMBU is a sponge-based AEAD stream cipher. When considering an AEAD stream cipher, the cube attack may be applicable to different cipher phases. A typical AEAD stream cipher has the following phases: initialisation, associated data processing, encryption, finalisation, decryption and verification. Application of cube attacks against different cipher phases requires specific security assumptions. In general, each attack aims to recover some secret information about the cipher. The following list identifies typical attacks against AEAD stream ciphers.
• Key recovery attacks (KRA): they aim to retrieve superpolies of cubes that include variables of the secret key. The attack is typically applied against the initialisation phase, where the key is loaded into the internal state together with some public variables. In the case of TinyJAMBU, key recovery cube attacks can be launched against any phase, because the key bits are input to the internal state of the cipher during all phases.
• State recovery attacks (SRA): they target superpolies that include internal state variables. They are applicable when the superpoly depends on a few internal state bits and some public variables at a particular time instance (clock).
• Distinguishing attacks (DA): they allow an adversary to differentiate a stream cipher from a truly random one. They work if there is a superpoly that becomes a constant (zero or one) after summing over all cube values. Such cubes are also called cube testers 9 .
• Known plaintext attacks (KPA): it is assumed that an adversary is able to read plaintexts and associated data but is not able to change them. Consequently, cubes chosen by the adversary can include neither plaintext nor associated data bits. They can, however, include initialisation vector/nonce bits. In this case, we deal with a chosen initialisation vector attack. TinyJAMBU nonces contain 96 bits. Thus, an adversary may select cubes from the nonce bits.
• Chosen plaintext attacks (CPA): it is assumed that an adversary can not only read plaintext and associated data but is also able to modify them at will. This means that the adversary can choose cubes that include both plaintext and associated data bits (apart from initialisation vector bits).
Our contributions. Note that the attacks presented in this work are against round-reduced versions of TinyJAMBU. We apply five distinct strategies for cube selection that allow us to construct appropriate distinguishers. Our five distinguishing attacks (DA1-DA5) can be launched against both the initialisation and encryption phases of the cipher. DA1 and DA2 are applied against the cipher initialisation phase. The other attacks (DA3-DA5) are implemented against the encryption phase. Table 1 shows a summary of our results. For the DA1 attack, it is possible to design distinguishers for cubes whose sizes range from 3 to 20 bits. They work if an adversary is able to observe the keystream after the full initialisation phase (with 2176 rounds). Note that after initialisation, TinyJAMBU employs a set of permutation rounds before producing the keystream. We extend DA1 by including additional (reduced) permutation rounds in the encryption phase of TinyJAMBU. The attack extension is referred to as DA2. For the DA2 attack, we find distinguishers by choosing random cubes from the full cube space of $2^{96}$; they use 15-bit and 25-bit cubes and work for a total of up to 2592 rounds. We also show a DA2 variant that selects cubes from a reduced cube space of $2^{32}$. This attack works for up to 2614 rounds with an 18-bit cube.
The DA3-DA5 attacks need 32-bit deterministic cubes. Our experimental results indicate that after 437 rounds, every output bit is affected by the 32-bit cube tester. In other words, all the output keystream bits are expected to depend on the 32-bit cube variables after 437 permutation rounds. Therefore, 437 encryption rounds can be considered as the upper bound for the 32-bit cube tester.
We have also applied two key recovery attacks, KRA1 and KRA2. These two attacks are implemented against the initialisation phase of the cipher. For KRA1, it is possible to recover eight bits of the secret key if an adversary is able to observe the keystream after the full initialisation phase (2176 rounds). KRA2 identified several linear superpolies for 2592-2604 rounds. Our results show that KRA2 works up to 2604 rounds when the target is recovering at least one bit of the secret key. To the best of our knowledge, our results obtained for TinyJAMBU are the first third-party analysis that produces experimentally verifiable outcomes.

Cube attack
A cube attack is a relatively recent cryptanalytic technique. To describe it, we follow the presentation given by Dinur and Shamir at EUROCRYPT 2009 8 . The idea behind the attack is to represent a keystream output bit by a polynomial over secret and public variables. In the cube attack, we assume that an adversary can evaluate the polynomial for chosen values of the public variables. The evaluation allows the adversary to reduce the degree of the polynomial. For AEAD stream ciphers, public variables include bits of the initialisation vector, associated data, and plaintext. It is assumed that the public variables can be chosen by the adversary in an arbitrary way. Unlike algebraic attacks, cube attacks treat the keystream polynomial as a black box. Suppose that an adversary is able to access a keystream polynomial of a cipher. The polynomial is defined over the binary field GF(2). It depends on both secret-key variables $K = \{k_0, \dots, k_{i-1}\}$ and public variables $V = \{v_0, \dots, v_{j-1}\}$. Consider a keystream polynomial $P$ of degree $deg$ over $i$ secret and $j$ public variables. Define a maxterm $t_I$ of the polynomial $P$ as a term all of whose variables are public. The variables of the term are identified by a collection of indices $I \subseteq \{0, \dots, j-1\}$. The variables indexed by $I$ are called a cube $C$. The polynomial $P$ can be written as

$$P(k_0, \dots, k_{i-1}, v_0, \dots, v_{j-1}) \equiv t_I \cdot P_{S(I)} + q(k_0, \dots, k_{i-1}, v_0, \dots, v_{j-1}), \tag{1}$$

where each term of $q(k_0, \dots, k_{i-1}, v_0, \dots, v_{j-1})$ does not contain at least one public variable from the maxterm $t_I$. $P_{S(I)}$ is called a superpoly of the index set $I$ if it does not contain any constant or any term that has a common factor with the maxterm $t_I$. We denote the cardinality of $I$ by $|I|$ and the size of a cube by $\ell_c$. Observe that $|I| = \ell_c$. Interestingly enough, if $|I| = deg - 1$, then the superpoly $P_{S(I)}$ is guaranteed to have degree at most one, i.e., to be linear.
Cube attacks work by summing the values of a polynomial $P$ over all $2^{|I|}$ possible Boolean values of the variables indexed by $I$ (or, equivalently, over all values of the cube). If the cube is big enough, i.e., $\ell_c = deg - 1$, then the summation reduces the degree of $P$ to at most one. This means that the superpoly $P_{S(I)}$ becomes linear. If we repeat the above procedure many times for different cubes, we can generate a system of linear equations involving the secret variables. After collecting a sufficient number of equations, we can solve the system and discover the secret variables/key. In general, the cube attack is run in two stages, namely pre-processing and online.
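The summation step can be illustrated on a small example. The sketch below uses a hypothetical toy polynomial (it is not related to TinyJAMBU, and the names `toy_poly` and `cube_sum` are introduced here for illustration only). The polynomial has degree 3, the cube has size 2, and the XOR over all four cube assignments leaves exactly the linear superpoly.

```python
from itertools import product

def toy_poly(k, v):
    """Hypothetical toy keystream polynomial over GF(2) (NOT TinyJAMBU).

    P = v0*v1*(k0 + k2 + 1) + v0*k1 + v1*v2 + k0*k1
    For the cube I = {0, 1}, the superpoly is k0 + k2 + 1.
    """
    return ((v[0] & v[1] & (k[0] ^ k[2] ^ 1))
            ^ (v[0] & k[1])
            ^ (v[1] & v[2])
            ^ (k[0] & k[1]))

def cube_sum(poly, key, cube, vlen):
    """XOR poly over all 2^|I| assignments of the cube variables.

    Public variables outside the cube are fixed to zero.
    """
    total = 0
    for bits in product((0, 1), repeat=len(cube)):
        v = [0] * vlen
        for idx, bit in zip(cube, bits):
            v[idx] = bit
        total ^= poly(key, v)
    return total

if __name__ == "__main__":
    # The cube sum agrees with the superpoly k0 + k2 + 1 for every key.
    for key in product((0, 1), repeat=3):
        assert cube_sum(toy_poly, key, (0, 1), 3) == key[0] ^ key[2] ^ 1
    print("cube sum over {v0, v1} matches the superpoly for all keys")
```

Terms that miss at least one cube variable are evaluated an even number of times with the same value and therefore cancel, which is why only the superpoly survives.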
Pre-processing stage. This stage is executed under the assumption that the description of the stream cipher is public. Consequently, our adversary has access to both public and secret variables and can manipulate them. Our goal is to identify cubes that generate linear superpolies in the secret key variables. Since the explicit form of the keystream polynomial $P(K, V)$ is not known, it is necessary to estimate the degree of $P(K, V)$. This should give us some idea about the cube sizes for which we can expect a linear superpoly. We can start from random cubes of small sizes. To choose a random cube $C$ of size $\ell_c$, we select a collection of indices $I_c \subseteq \{0, \dots, vlen - 1\}$ at random, where $vlen$ denotes the length of the initialisation vector $V$ and $\ell_c = |I_c|$. Consider the polynomial $P_C(K, V) = \sum_{C} P(K, V)$ that results from summing $P(K, V)$ over all values of the cube $C$. It is expected that if we have chosen a "right" cube, then $P_C(K, V) = P_{S(I)}$ is a linear combination of the secret variables $\{k_0, \dots, k_{klen-1}\}$, where $klen$ is the length of the secret key $K$. To identify the right cube, we need a linearity test for $P_C(K, V)$.
We use the BLR test 19 to check whether a polynomial $P_C(K, V)$ is linear. The test verifies whether the following relation holds:

$$P_C(K_0, V) + P_C(K_1, V) + P_C(K_2, V) = P_C(K_1 \oplus K_2, V), \tag{2}$$

where $K_0 = \{0\}^{klen}$ and $K_1$, $K_2$ are randomly chosen key values that are fixed for the duration of one test. If the BLR test is passed $n$ times, then we can conclude that $P_C(K, V)$ is linear with probability $1 - 2^{-n}$. By choosing a big enough $n$ (say $n = 100$), we can be practically certain that the polynomial is linear (with probability $1 - 2^{-100}$).
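A minimal sketch of the BLR check is given below. It assumes that the superpoly is available as a callable mapping a key (a list of bits) to one bit; in an actual attack this callable would be a cube summation over the cipher output. The functions `toy_linear` and `toy_quadratic` are hypothetical stand-ins, not superpolies of TinyJAMBU.

```python
import random

def blr_linear(f, klen, trials=100, rng=None):
    """Probabilistic BLR test over GF(2).

    Accepts f if f(0) + f(K1) + f(K2) == f(K1 xor K2) holds in every trial.
    Note that affine functions (including constants) pass the test.
    """
    rng = rng or random.Random(0)
    zero = [0] * klen
    for _ in range(trials):
        k1 = [rng.randint(0, 1) for _ in range(klen)]
        k2 = [rng.randint(0, 1) for _ in range(klen)]
        k12 = [a ^ b for a, b in zip(k1, k2)]
        if f(zero) ^ f(k1) ^ f(k2) != f(k12):
            return False  # a single failing trial proves non-linearity
    return True

def toy_linear(k):
    """Hypothetical affine superpoly: 1 + k0 + k3."""
    return 1 ^ k[0] ^ k[3]

def toy_quadratic(k):
    """Hypothetical non-linear superpoly: k0*k1 + k2."""
    return (k[0] & k[1]) ^ k[2]

if __name__ == "__main__":
    print("affine candidate passes:", blr_linear(toy_linear, 4))
    print("quadratic candidate passes:", blr_linear(toy_quadratic, 4))
```

A failed trial is conclusive, whereas a passed trial only raises confidence, so the number of trials controls the false-acceptance probability.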
Once we get a linear $P_C(K, V) = P_{S(I)}$, it can be written in its algebraic normal form (ANF) as follows:

$$P_{S(I)}(K) = \alpha_{-1} + \alpha_0 k_0 + \alpha_1 k_1 + \cdots + \alpha_{klen-1} k_{klen-1}, \tag{3}$$

where public variables from $V \setminus C$ are set to zero. We know the form of the above representation but we do not know the binary coefficients $\alpha_i$; $i = -1, 0, \dots, klen - 1$. We can determine the coefficients by running $klen + 1$ cube experiments: the all-zero key yields $\alpha_{-1}$, and the key in which only $k_i$ is set yields $\alpha_{-1} + \alpha_i$.

There is an interesting case when $P_{S(I)}$ stays constant (0 or 1) for all secret keys. Then the polynomial $P_{S(I)}$ is called a distinguisher, as it allows one to differentiate the cipher from a truly random one. Cubes that generate distinguishers are called cube testers 9 .
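The coefficient recovery can be sketched as follows, assuming the superpoly has already passed the linearity test and is available as a callable on key bits. The helper name `recover_coefficients` and the example superpoly are hypothetical; in the attack each call would be one cube summation during pre-processing.

```python
def recover_coefficients(superpoly, klen):
    """Recover the ANF of an affine superpoly with klen + 1 evaluations.

    Returns (alpha_minus1, [alpha_0, ..., alpha_{klen-1}]).
    """
    const = superpoly([0] * klen)          # all-zero key gives alpha_{-1}
    coeffs = []
    for i in range(klen):
        unit = [0] * klen
        unit[i] = 1                        # only k_i is set
        coeffs.append(superpoly(unit) ^ const)
    return const, coeffs

def example_superpoly(k):
    """Hypothetical affine superpoly: 1 + k1 + k4."""
    return 1 ^ k[1] ^ k[4]

if __name__ == "__main__":
    const, coeffs = recover_coefficients(example_superpoly, 6)
    print("alpha_-1 =", const, "alpha_i =", coeffs)
    # A cube tester corresponds to all alpha_i (i >= 0) being zero.
```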
Online stage. To execute this stage, it is assumed that an adversary has access to an implementation of the cipher. The adversary can manipulate public variables but cannot see secret ones. Furthermore, we suppose that the pre-processing stage has been executed successfully. In other words, the adversary has discovered $klen + 1$ linearly independent superpolies $P_{S(I_j)}$, where each $P_{S(I_j)}$ corresponds to its cube $C_j$. Thus, the adversary can write the following system of equations:

$$P_{S(I_j)}(K) = \alpha_{-1,j} + \alpha_{0,j} k_0 + \alpha_{1,j} k_1 + \cdots + \alpha_{klen-1,j} k_{klen-1}, \tag{4}$$

where $j = 1, \dots, klen + 1$. The values on the left-hand side are calculated by summing over the corresponding cubes. As the coefficients $\alpha_{i,j}$ have been determined at the pre-processing stage, the adversary can solve the system from Equation (4) using Gaussian elimination, for example. This concludes the cube attack, as the adversary has been able to calculate the secret key $K$.
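Solving the resulting linear system over GF(2) is routine. The following sketch shows one possible Gauss-Jordan elimination; the function name `solve_gf2` and the data layout are assumptions made for this illustration, with the right-hand side taken to be the observed cube sum XOR-ed with the constant $\alpha_{-1,j}$.

```python
def solve_gf2(rows, rhs, nvars):
    """Solve A*k = rhs over GF(2) by Gauss-Jordan elimination.

    rows[j][i] holds alpha_{i,j}; rhs[j] is (cube sum j) xor alpha_{-1,j}.
    Returns one solution (free variables set to 0), or None if inconsistent.
    """
    aug = [list(r) + [b] for r, b in zip(rows, rhs)]
    pivots = []
    r = 0
    for c in range(nvars):
        # Find a row at or below r with a 1 in column c.
        p = next((i for i in range(r, len(aug)) if aug[i][c]), None)
        if p is None:
            continue
        aug[r], aug[p] = aug[p], aug[r]
        # Clear column c in every other row.
        for i in range(len(aug)):
            if i != r and aug[i][c]:
                aug[i] = [x ^ y for x, y in zip(aug[i], aug[r])]
        pivots.append(c)
        r += 1
    # Remaining rows have all-zero coefficients; a 1 on the right is a conflict.
    if any(row[-1] for row in aug[r:]):
        return None
    sol = [0] * nvars
    for i, c in enumerate(pivots):
        sol[c] = aug[i][-1]
    return sol

if __name__ == "__main__":
    # Hypothetical 3-bit example: equations k0+k1, k1+k2, k0 for key (1, 0, 1).
    print(solve_gf2([[1, 1, 0], [0, 1, 1], [1, 0, 0]], [1, 1, 1], 3))
```

For the handful of single-variable superpolies typical of round-reduced attacks, the elimination is trivial, since each cube sum directly reveals one key bit.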

Overview of TinyJAMBU
TinyJAMBU 4 is a family of AEAD sponge-based stream ciphers. The family includes three members: TinyJAMBU-128, TinyJAMBU-192 and TinyJAMBU-256. As we investigate the resistance of TinyJAMBU-128 against cube attacks, our description is focused on TinyJAMBU-128 only.
Specification of TinyJAMBU-128. TinyJAMBU-128 uses a 128-bit key $K = \{k_0, \dots, k_{127}\}$ and a 96-bit nonce $V = \{v_0, \dots, v_{95}\}$. At the heart of the cipher, there is a 128-bit nonlinear feedback shift register (NFSR). The internal state of the NFSR at clock $t$ is denoted by $B^t = \{b^t_0, \dots, b^t_{127}\}$. The NFSR state is updated by a nonlinear combination of register bits and a bit of the cryptographic key. Unless specified otherwise, a block refers to a group of 32 bits. In particular, the third 32-bit block $\{b_{64}, b_{65}, \dots, b_{95}\}$ of the NFSR is referred to as the keystream. The block is XOR-ed with a plaintext block to produce the respective ciphertext block. The last 32-bit block $\{b_{96}, b_{97}, \dots, b_{127}\}$ of the NFSR absorbs via XOR all the cipher inputs, i.e. the nonce, associated data and plaintext blocks. The cipher also employs 3-bit constants denoted by FrameBits to indicate different phases of cipher operation.
TinyJAMBU-128 state update function. TinyJAMBU-128 follows a sponge 20 structure with iterations that use a keyed permutation P r . The permutation is implemented using NFSR, whose state update function is described by Algorithm 1. The function takes the five state bits ( b 0 , b 47 , b 70 , b 85 , b 91 ) and a bit of the key K and produces a feedback bit that becomes b 127 . The permutation P r calls Algorithm 1 r times.
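Since Algorithm 1 is not reproduced in this text, the following Python sketch indicates what one clock of the NFSR may look like. The text above only fixes the tap positions and the fact that the feedback becomes $b_{127}$; the exact combination used here (a NAND of $b_{70}$ and $b_{85}$, and the key bit index $i \bmod klen$) follows our reading of the public TinyJAMBU specification and should be treated as an assumption to be checked against Algorithm 1. The function names are illustrative.

```python
def state_update(state, key, i):
    """One NFSR clock (sketch). state: list of 128 bits, key: list of klen bits.

    Assumed feedback: b0 ^ b47 ^ NAND(b70, b85) ^ b91 ^ k[i mod klen].
    """
    nand = 1 ^ (state[70] & state[85])
    feedback = state[0] ^ state[47] ^ nand ^ state[91] ^ key[i % len(key)]
    # Shift the register by one position; the feedback becomes b127.
    return state[1:] + [feedback]

def permutation(state, key, rounds):
    """Keyed permutation P_r: apply the state update r times."""
    for i in range(rounds):
        state = state_update(state, key, i)
    return state

if __name__ == "__main__":
    zero_state, zero_key = [0] * 128, [0] * 128
    out = permutation(zero_state, zero_key, 2)
    print("last two bits after 2 rounds:", out[-2:])
```

Because every clock shifts the register by one position, bit $b_{i+1}$ at clock $t$ reappears as bit $b_i$ at clock $t+1$, which is the property that lets a cube found for one keystream bit be reused one round later.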
Operation phases of TinyJAMBU-128. In order to encrypt plaintext blocks, TinyJAMBU-128 goes through four phases, namely, initialisation, associated data processing, encryption and finalisation. For decryption of ciphertext blocks, the cipher proceeds through the same initialisation and associated data processing phases. The next phases are decryption and tag verification, which match the encryption and finalisation phases. As the work describes cube attacks against the first three phases, we briefly discuss them.
Initialisation. Algorithm 2 shows the pseudocode of the initialisation phase. It consists of two parts, namely, key and nonce setups. At the key setup, a cryptographic key $K$ is loaded into the NFSR by executing $P_{1024}$. During the nonce setup, the 96-bit nonce is absorbed into the NFSR in three 32-bit blocks. FrameBits are set to "1". For each nonce block, the NFSR state is first updated by running $P_{384}$ and then the nonce block is XOR-ed into the NFSR state. Note that the second version of TinyJAMBU-128 employs $P_{640}$ instead of $P_{384}$ during the nonce setup.
Associated data processing. After the NFSR state is initialised, the associated data $AD = \{d_0, \dots, d_{adlen-1}\}$ are processed block by block, where $adlen$ is the length (number of bits) of the associated data. Algorithm 3 details the steps of the associated data processing. The NFSR state is first updated by running the permutation $P_{384}$, which is followed by XOR-ing the 32-bit associated data block into $B_{\{96 \cdots 127\}}$. Note that if the length $adlen$ of associated data is not a multiple of 32, then additional steps are required to process the last partial block of associated data (refer to the original description of TinyJAMBU for details). FrameBits in this phase are set to "3". Similarly to the nonce setup, the second version of TinyJAMBU-128 applies $P_{640}$ instead of $P_{384}$ for the associated data processing.
Encryption. Algorithm 4 illustrates the encryption phase. Encryption directly follows the associated data processing phase. FrameBits are set to "5" during the encryption. Plaintext bits are processed block by block. Let M = {m 0 , · · · , m mlen−1 } denote the plaintext of length mlen.

Cube attack against TinyJAMBU
Observe that nonce, associated data and plaintext bits are used to constantly update the NFSR state. Clearly, the authors have intended to increase dependencies among all bits involved in the initialisation, associated data processing and encryption phases. Besides, the cryptographic key K is always used for each state update. Consequently, each output bit of the permutation P r can be seen as a complex function of all input bits. They include bits of the NFSR state, the key, the nonce, the associated data and the plaintext. As a significant part of the bits are public, there are many options for selecting cubes at the pre-processing stage. We implement five distinguishing attacks DA1-DA5 and two key recovery attacks KRA1-KRA2. They cover the three cipher phases: initialisation, the associated data processing and encryption. We need to pay attention to the third block B {64···95} of the NFSR state as it plays the role of keystream. We aim to identify bits of a keystream block that, when used in a cube, produce either linear superpolies or constants.
Algorithm 5 details the steps of the pre-processing stage of our generic cube attack against the cipher. Its goal is to identify cube testers or cubes with linear superpolies. As we do not have any information about appropriate cube sizes, we test different cube sizes $\ell_c$. For each cube size, the resulting superpolies of random cubes are tested for linearity. In Algorithm 5, the C++ built-in function rand() is used to select cubes at random. Note that the pre-processing stage needs to be performed only once. It has already been completed for the cubes identified for TinyJAMBU in this paper. Hence, an adversary does not need to repeat it and can proceed directly to the online stage. Algorithm 5 also shows the pseudocode for the steps performed at the online stage. The online stage needs to be performed every time the cipher is re-keyed (for a key recovery attack). The adversary needs to use a known plaintext attack or a chosen plaintext attack model, depending on the operation phase of TinyJAMBU to which the attack is applied. These are common assumptions within security models used for the analysis of ciphers, including cube attacks. Our implementations of cube attacks are applied to round-reduced variants of TinyJAMBU. All results have been experimentally verified (see Ref. 21).

The attacks DA1, DA2, KRA1 and KRA2 are specified in Table 2. We assume that they are applied at clock $t = 0$ and that cubes are chosen from the nonce bits only. As a 32-bit keystream block depends on key and nonce bits, we intend to find cubes (defined over nonce bits only) whose superpolies are linear and depend on some key bits. Consider DA1 and KRA1 from Table 2. We assume that the cipher goes through initialisation but skips the associated data processing and encryption phases. In other words, we can observe the keystream immediately after the last permutation round of the initialisation phase. We choose cubes at random from the first 64 bits of the nonce.
Note that due to our assumptions, the NFSR state does not go through any permutation rounds after the last 32 bits of the nonce are XOR-ed into the last block of the state. This means that the last 32 bits of the nonce do not get mixed into the keystream block. Consequently, the keystream does not contain any variables from the last 32 bits of the nonce. Trivially, if we include variables from the last 32 bits of the nonce in a cube, then we get a distinguisher, as the cube summation must give us a constant.
Note that according to the specification of TinyJAMBU-128, the cipher goes through 1024 rounds of permutation in the encryption phase before keystream bits can be observed. Thus, DA1 and KRA1 are extended to DA2 and KRA2, respectively (see Table 2). These attacks are against a cipher that includes $r_3$ additional (reduced) permutation rounds at the encryption phase. We assume that the cipher does not absorb any associated data, i.e., processing of associated data is skipped. Consequently, keystream bits depend on both the key and the nonce bits, and cubes can be selected from all 96 bits of the nonce. For DA2 and KRA2, the cipher uses the full initialisation phase, i.e., $r_1 = 1024$ and $r_2 = 384$. However, the number $r_3$ of encryption permutation rounds is reduced.
DA1 and KRA1 are performed to identify the dependency of the output function on the first 64 nonce bits immediately after the initialisation phase. These two attacks determine whether an adversary can perform an attack by selecting cubes from the first 64 nonce bits if the keystream were observed immediately after the initialisation. Hence, for these two attacks it is assumed that the keystream can be observed without going through the encryption phase. In contrast, DA2 and KRA2 are performed to identify the dependency of the output function on the nonce bits, including the last block of the nonce. For these two attacks, a reduced number of encryption permutation rounds is assumed to be performed after the initialisation phase, since the cubes may now contain the last block of the nonce. The last block of the nonce results in trivial distinguishers if these additional encryption permutation rounds are not performed. Note that the associated data processing phase is optional for TinyJAMBU; that is, if there is no associated data, then this phase is skipped. The associated data processing phase is assumed to be skipped for all four attacks, as no keystream is output during this phase.
Experimental results for DA1 and KRA1. We have implemented DA1 and KRA1 as described by Algorithm 5. For a given cube size, we choose $cmax = 5000$ random cubes. For each cube, we run 50 BLR linearity tests. We have found many cube testers, whose sizes range from $\ell_c = 3$ to $\ell_c = 20$. The total number of permutation rounds employed in the cipher is $1024 + 384 \times 3 = 2176$. A sample of the cube testers found is presented in Table 3.
We only list cubes up to size 12 in this table. The table details: cube size (the first column), a collection of cube indices (the second column), a collection of keystream bits corresponding to the cube tester (the third column) and the number of superpolies for the given cube (the fourth column). Some additional statistics are presented in Table 4. Note that the complexity of DA1 depends strongly on the size of the cube: it ranges from $O(2^3)$ to $O(2^{20})$.
We also found a small set of cubes that resulted in non-constant superpolies. These superpolies are used to implement the KRA1. These are listed in Table 5. The cubes for KRA1 range from l c = 3 to l c = 13 and can be used to recover eight bits of the secret key after 2176 rounds of the initialisation phase. The complexity of solving these equations is negligible.
The experiments demonstrate that the initialisation phase of the cipher provides a relatively low diffusion. This is due to the fact that the cipher iterates 384 times the permutation P after loading the second block of the nonce. This number is definitely too low. Note that the authors of the cipher have now increased this number to 640, which improves diffusion during the initialisation phase.
Experimental results for DA2 and KRA2. We have also conducted experiments for the DA2 and KRA2 attacks. In this case, we assume that the cipher includes the full initialisation phase together with a reduced number of permutation rounds P r 3 at the encryption phase. Note that after initialisation, the NFSR state goes through the permutation P r 3 . It means that the last 32 bits of the nonce bits get mixed with other bits before keystream bits become observable. This implies that in the attack, we can choose cubes from all 96 bits of the nonce.
The attack follows the steps given by Algorithm 5. For a given cube size, we choose cmax = 5000 random cubes. Given a cube, we determine its superpoly and check its linearity by running 50 BLR tests. We begin with r 3 = 384 rounds and then, we keep increasing the number r 3 by a multiple of 32, i.e. r 3 = 384, 416, 448, . . . . We refer to this as DA2 with random cubes selected over the full cube space. During our experiments, we are able to find many cube testers of size 15 for the permutation P 384 and one cube tester of size 25 for the permutation P 416 . We have also conducted experiments for the permutation P 448 with cube sizes up to ℓ c = 40 . However, we have failed to find any.
A sample of cube testers of size 15 and the only cube tester of size 25 are given in Table 6. Cube testers of size 15 are able to distinguish the cipher from a truly random one if the cipher uses no more than 2560 rounds of the permutation P. The best result we got for DA2 with random cube selection from the full cube space is the cube tester of size 25 that works for the cipher with 2592 rounds of the permutation P.
Next, we have tried to find cubes for an arbitrary number $r_3$, not necessarily a multiple of 32. As the last block of the nonce is the last to be XOR-ed into the NFSR state, one can argue that the bits of this block are not as thoroughly mixed with other bits. So it is reasonable to choose cubes that take as many bits as possible from the last block. This approach should eliminate the maxterms of the corresponding superpoly that are not mixed well with the last block of the nonce. This approach has been verified experimentally. We refer to this as DA2 with reduced cube space. We find that the 32-bit cube $\{v_{64}, \dots, v_{95}\}$ works up to $r_3 = 437$ rounds of the permutation $P$ and results in a distinguisher. As a result, with this method, we have obtained cube testers that distinguish the cipher with 2613 rounds of $P$ from a truly random cipher.
DA2 with smaller cube sizes and extension to a key recovery attack. The 32-bit cube for the 437-round DA2 yields a distinguisher, i.e., a constant superpoly, which suggests that the cube is larger than necessary. We use two techniques to extend the experiments in order to identify DA2 cubes of smaller sizes and possible extensions to a key recovery attack (KRA2). For these experiments, we select the cube bits from a reduced set of nonce bits (the last block of the nonce). Other nonce blocks are only included in the cube space when the cube size is larger than 32 bits.

Technique 1. We conducted experiments by gradually reducing the size of the 32-bit cube. The degree of a superpoly is expected to increase (roughly by 1) when the size of the corresponding cube is reduced by 1. For each cube size, depending on the cube space, we tested between $cmax = 5000$ and $cmax = 100{,}000$ superpolies generated from random cubes of the given size. Each superpoly is subjected to at least 200 linearity tests. This process enabled us to find additional distinguishers for DA2 with much smaller cube sizes. We also found a small number of non-constant superpolies in these experiments. Overall, with this process, we found cubes of sizes between $\ell_c = 13$ and $\ell_c = 21$. These cubes work for between $r_3 = 416$ and $r_3 = 437$ encryption rounds.
Technique 2. Cubes obtained using Technique 1 above are of relatively smaller sizes ( ≤ 21 ). For any such cube sizes, the search space is relatively small and the search time is fast due to the smaller cube sizes. It is possible to exhaustively test the entire cube space for such cases. We conducted a set of experiments by reducing the cube sizes further and then enumerating through the entire cube spaces of the reduced cube sizes. Algorithm 6 details the steps of this process. Using steps in Algorithm 6, we have obtained additional distinguishers and nonconstant superpolies.
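To give a feeling for why exhaustive enumeration is feasible for small cubes, the sketch below counts and enumerates all cubes of a given size drawn from the 32 indices of the last nonce block. It is only an illustration of the enumeration step and does not reproduce Algorithm 6; the function names are hypothetical, and in the real search each enumerated cube would be passed to a cube summation followed by linearity tests.

```python
from itertools import combinations
from math import comb

# Indices of the last nonce block v64 ... v95 (the reduced cube space).
LAST_NONCE_BLOCK = range(64, 96)

def cube_space_size(space_bits, cube_size):
    """Number of distinct cubes of a given size in a space of space_bits bits."""
    return comb(space_bits, cube_size)

def enumerate_cubes(indices, cube_size):
    """Yield every cube (tuple of indices) of the given size."""
    yield from combinations(indices, cube_size)

if __name__ == "__main__":
    for size in (3, 8, 16):
        print(size, cube_space_size(32, size))
    # Each cube of size s additionally costs 2^s cipher evaluations per sum.
    first = next(enumerate_cubes(LAST_NONCE_BLOCK, 3))
    print("first cube of size 3:", first)
```

The total work is roughly the number of cubes times $2^{\ell_c}$ evaluations per cube sum, times the number of linearity tests, so the exhaustive approach is practical only for small $\ell_c$.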
A sample of the cube testers and non-constant superpolies that are obtained using the above two techniques are listed in Tables 7 and 8, respectively. Recall that the register bits that are used for the keystream, i.e., $\{b^t_{64}, \dots, b^t_{95}\}$, are updated by shifting the contents of the register bits $\{b^{t-1}_{65}, \dots, b^{t-1}_{96}\}$. Therefore, any successful cube for a keystream bit $b^t_i$ will also work for the keystream bit $b^{t+1}_{i-1}$. With Technique 1, surprisingly, we found some DA2 cubes of sizes $\ell_c \le 31$ that are successful for the keystream bit $b^{437}_{65}$ (see Table 7). This means that the same cube will also be successful for the keystream bit $b^{438}_{64}$, i.e., it will work up to $r_3 = 438$ rounds. Experimental results confirm this observation. The best distinguisher for DA2 works up to $r_3 = 438$ rounds with a cube size of 18. As a result, with this method, we have obtained a cube tester that allows us to distinguish the cipher with 2614 rounds of $P$ from a truly random cipher. We think that the smaller cube size that works for $r_3 = 438$ rounds is due to the structure of the corresponding output polynomial. To illustrate how a superpoly may pass the linearity test for a smaller cube but fail for a larger one, consider a hypothetical output function for which the superpoly over the cube $v_0$ passes the linearity test, whereas the superpoly over a larger cube $v_0 v_1$ or $v_0 v_2$ of size 2 fails it. To check for such cases for $r_3 \ge 438$, we further tested cubes with sizes $\ell_c < 32$. However, we did not find any such cubes for $r_3$ beyond 438 rounds. For KRA2, we obtained several non-constant superpolies for $r_3 = 416$ to $r_3 = 428$ rounds. However, some superpolies are repeated for different cubes, i.e., some equations are the same. The cube sizes for these superpolies range from 9 to 16. Notice from Table 8 that most of these superpolies contain only a single variable. Therefore, during the online phase, the cube summation itself directly outputs the values of most of these superpolies. Overall, the best cube for KRA2 works up to $r_3 = 428$ rounds when the target is the recovery of at least a single bit of the key.
Overall comments on the results of DA2 and KRA2. It is worth noticing that the original TinyJAMBU-128 takes 3200 rounds of the permutation $P$. We count the number of rounds executed during initialisation and encryption of the first plaintext block. It appears that the cipher (its first version) leaves a relatively small security margin, which is $3200 - 2614 = 586$ rounds. The computational complexity of the DA2 attack varies from $O(2^8)$ to $O(2^{32})$. Compared to the complexity of DA1, the computation overhead for DA2 is significantly higher. This difference is the result of a bigger number of rounds in the attacked cipher, which includes the initialisation and encryption phases. Our experiments confirm the necessity to separate the processing of two consecutive 32-bit input blocks by a sufficiently big number of rounds of $P$. The increase of the number $r_2$ of $P$ rounds from 384 (for TinyJAMBUv1) to 640 (for TinyJAMBUv2) strengthens the cipher, as it increases both the diffusion of bits and the algebraic degree of the keystream functions. The margin for DA2 with random cubes from the full cube space ($2^{96}$) against TinyJAMBUv2 is expected to be higher than for the first version of the cipher. For TinyJAMBUv2, the security margin against DA2 with the reduced cube space is expected to be the same as for the first version ($3968 - 3382 = 586$ rounds). The security margin against KRA2 (at least a single bit key recovery) for TinyJAMBUv1 is 596 rounds. The same margin for KRA2 is expected against TinyJAMBUv2.
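The round counts and margins quoted above can be re-derived with a few lines of arithmetic. The sketch below uses only figures stated in the text (1024 key setup rounds, three nonce blocks, $r_2$ = 384 or 640, and 1024 encryption rounds before the first keystream block); the constant and function names are illustrative.

```python
KEY_SETUP_ROUNDS = 1024      # r1
NONCE_BLOCKS = 3             # 96-bit nonce, 32 bits per block
FULL_ENC_ROUNDS = 1024       # rounds before the first keystream block

def init_rounds(r2):
    """Rounds of the full initialisation phase for a given r2."""
    return KEY_SETUP_ROUNDS + NONCE_BLOCKS * r2

def total_rounds(r2):
    """Initialisation plus encryption of the first plaintext block."""
    return init_rounds(r2) + FULL_ENC_ROUNDS

def margin(r2, attacked_r3):
    """Rounds left between the attacked variant and the full cipher."""
    return total_rounds(r2) - (init_rounds(r2) + attacked_r3)

if __name__ == "__main__":
    for name, r2 in (("TinyJAMBUv1", 384), ("TinyJAMBUv2", 640)):
        print(name, init_rounds(r2), total_rounds(r2),
              "DA2 margin:", margin(r2, 438),
              "KRA2 margin:", margin(r2, 428))
```

The margin reduces to 1024 minus the attacked $r_3$, which is why it is expected to be identical for both versions when the cubes are taken from the last nonce block.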
Description of attack process in encryption phase. The remaining three attacks DA3-DA5 are applied against a round-reduced cipher. Table 9 specifies the assumptions about round-reduced versions of the cipher. As the key bits are absorbed into the NFSR state during each permutation round, a goal of our attacks is not only to find cube testers but also to recover some bits of the key. As an independent research challenge, we aim to verify the designer's claim asserting that all bits of the keystream depend on all input bits after 598 rounds of the permutation P 4 . For the second version of the cipher (TinyJAMBUv2), the claim has been updated and it says that the full dependence is achieved after 512 rounds 5 .
For the DA3 attack, we assume that the cipher runs through the full initialisation phase and the permutation $P_{1024}$ when processing the first 32-bit plaintext block. Note that the associated data processing phase is skipped. Thus the attack starting state becomes $B_{3200}$. The length mlen of the plaintext is set to 64 bits. A cube is chosen to include the first 32 bits of the plaintext, i.e., $\{m_0, \ldots, m_{31}\}$, and the remaining 32 bits of the plaintext are set to zero.
For the DA4 attack, the cipher executes the initialisation phase, where the NFSR state goes through the full 1024 + 384 × 3 = 2176 permutation rounds. This means that the attack starting state is $B_{2176}$. Note that the associated data processing phase is again skipped. Table 9 shows details of the attack. In particular, cubes are chosen from the last 32 bits $\{v_{64}, \ldots, v_{95}\}$ of the nonce V. In the encryption phase, the FrameBits are XOR-ed into the state and the state is updated by running $P_{r_3}$.
The DA5 attack is similar to DA4. We assume that the cipher executes the initialisation phase (with 2176 permutation rounds) and processes the first 32 bits of associated data (with 384 permutation rounds). Thus, the attack starting state becomes $B_{2560}$. As in DA4, in the encryption phase the FrameBits are XOR-ed into the state and the state is updated by the permutation $P_{r_3}$ with a reduced number of rounds $r_3$. Cubes are selected from the first block of associated data, i.e., $\{d_0, \ldots, d_{31}\}$. Table 9 compares our three attacks. The main difference among them is the selection of cubes.
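The starting states of the three attacks can be cross-checked with a few lines of arithmetic. The sketch below is only a bookkeeping aid for TinyJAMBUv1 round counts. The dictionary layout and names are hypothetical, and the cube sources are given as plain labels rather than real cipher inputs.

```python
# Starting-state index (number of permutation rounds already executed)
# for DA3-DA5 on TinyJAMBUv1; names are illustrative only.
KEY_SETUP = 1024        # key setup rounds
NONCE_BLOCK = 384       # rounds per 32-bit nonce block (three blocks)
AD_BLOCK = 384          # rounds for one 32-bit associated data block
PLAINTEXT_BLOCK = 1024  # rounds for the first 32-bit plaintext block

INIT = KEY_SETUP + 3 * NONCE_BLOCK  # full initialisation phase

ATTACKS = {
    # attack: (rounds before the attacked part, source of the 32-bit cube)
    "DA3": (INIT + PLAINTEXT_BLOCK, "plaintext bits m_0..m_31"),
    "DA4": (INIT, "nonce bits v_64..v_95"),
    "DA5": (INIT + AD_BLOCK, "associated data bits d_0..d_31"),
}

if __name__ == "__main__":
    for name, (start, cube_source) in ATTACKS.items():
        print(f"{name}: starts at B_{start}, cube from {cube_source}")
```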
DA3 and DA4 are similar to DA2, except that the cubes are chosen from the first block of the plaintext and the last block of the nonce, respectively. Hence, like DA2, these two attacks assume that the associated data processing phase is skipped. However, we note that DA3 can be performed even if the associated data processing phase is included. DA3 selects the cube from the plaintext, which is processed after the associated data. Hence, the additional rounds of the associated data processing phase do not affect the plaintext cubes. DA5 is performed to identify the dependency of the output function on the associated data bits; hence, the associated data processing phase is included for this attack.

Given a cube (see Table 9), we check the resulting superpoly for linearity using 50 BLR tests. At the same time, the number of permutation rounds of $P_{r_3}$ is gradually increased. Consider DA3. We have found a few single bits of the keystream output that produce constant superpolies for $r_3 = 416, 417, 437$. Similar results are obtained for DA4. For the DA5 attack, we get linear superpolies for $r_3 = 416, 437$. Table 10 summarises our experiments with the three attacks. Note that we did not test all values of $r_3$ between 416 and 437. However, we are confident that cube testers exist for any $r_3$ in the interval (416, 437). For all attacks, the largest number of rounds in the encryption phase is $r_3 = 437$. We have also tried larger values (i.e. $r_3 \geq 438$). Unfortunately, we could not find any cube whose superpoly passes the BLR test. Note that the 32-bit cube testers only allow the cipher to be distinguished from a random one. Although the attacks do not recover any key bits, they give an insight into the security of the cipher.
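The cube summation and the BLR linearity check used in these experiments can be illustrated on a toy example. The sketch below is not the authors' implementation and does not model TinyJAMBU. The function `toy_output` is a made-up Boolean function over four public bits and four secret bits, and all names are hypothetical. The check uses the affine form of the BLR test, p(0) ⊕ p(x) ⊕ p(y) = p(x ⊕ y), so both constant and linear superpolies pass.

```python
import itertools
import random

N_PUBLIC = 4  # toy public (cube-eligible) bits v0..v3
N_KEY = 4     # toy secret bits k0..k3


def toy_output(v, k):
    """Toy Boolean function over GF(2); NOT a TinyJAMBU keystream bit."""
    return ((v[0] & v[1] & k[0])
            ^ (v[0] & v[1] & v[2] & k[1] & k[2])
            ^ (v[2] & k[1])
            ^ k[3]
            ^ (v[1] & v[3]))


def superpoly(f, cube, key):
    """XOR f over all assignments of the cube bits; other public bits are 0."""
    acc = 0
    for bits in itertools.product((0, 1), repeat=len(cube)):
        v = [0] * N_PUBLIC
        for idx, bit in zip(cube, bits):
            v[idx] = bit
        acc ^= f(v, key)
    return acc


def blr_pass(f, cube, trials=50, seed=0):
    """Return True if p(0) ^ p(x) ^ p(y) == p(x ^ y) in every trial."""
    rng = random.Random(seed)
    p0 = superpoly(f, cube, [0] * N_KEY)
    for _ in range(trials):
        x = [rng.randint(0, 1) for _ in range(N_KEY)]
        y = [rng.randint(0, 1) for _ in range(N_KEY)]
        xy = [a ^ b for a, b in zip(x, y)]
        lhs = p0 ^ superpoly(f, cube, x) ^ superpoly(f, cube, y)
        if lhs != superpoly(f, cube, xy):
            return False  # a single failure proves non-linearity
    return True


if __name__ == "__main__":
    for cube in ([1, 3], [0, 1], [0, 1, 2]):
        print(cube, "passes BLR:", blr_pass(toy_output, cube))
```

For this toy function the cube {v1, v3} has the constant superpoly 1 and the cube {v0, v1} has the linear superpoly k0, so both pass every trial. The cube {v0, v1, v2} has the quadratic superpoly k1·k2, which is rejected with overwhelming probability over 50 trials. Each evaluation of the superpoly costs 2^|cube| calls to the function, which is why a 32-bit cube leads to a complexity of order 2^32.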
As all cube testers in the three attacks require 32-bit cubes, the complexity of the attacks is $O(2^{32})$. Since the attacks follow a similar approach, it is not surprising that their results are also similar. From the results given in Table 10, we see that our cube testers work up to 437 rounds in the encryption phase. This leads us to the conclusion that the cipher has a better security margin than the one claimed by the designers.
The cubes for DA2 and KRA2 can also be applied in DA3 to DA5 by using the same or corresponding indices for the plaintext in DA3, the nonce in DA4 and the associated data in DA5. Experimental results verify this observation. It is therefore also possible to find cubes of size 18 (with the corresponding indices computed from Table 7) for DA3 to DA5 that work for 438 rounds of the encryption phase.

Conclusion
We have investigated the resistance of the TinyJAMBU cipher against cube attacks. The cipher is a finalist of the NIST LWC project. We have applied five variants of the distinguishing attack (DA1–DA5) and two variants of the key recovery attack (KRA1–KRA2). They all target the first version of the cipher, TinyJAMBU-128. The second version of the cipher only increases the number of rounds during the nonce setup, associated data processing, and finalisation; no other changes are made. The first two attacks, DA1 and KRA1, are launched against the initialisation phase of the cipher (which includes 2176 rounds). For DA1, we have found cube testers (distinguishers) with cube sizes ranging from 3 to 20. For KRA1, we have identified non-constant superpolies that can be used to recover eight bits of the secret key. The attack DA2 is an extension of DA1. It is applied against a cipher variant that includes the initialisation phase and 438 encryption rounds. We have found 18-bit cube testers. KRA2 is applied against a cipher variant that includes the initialisation phase and 428 encryption rounds. Note that the results of DA1 and some results of DA2 (for random cubes from the full cube space) are only applicable to TinyJAMBUv1. However, the results of DA2 with the reduced cube space and of KRA2 are applicable to both TinyJAMBUv1 and TinyJAMBUv2. The other three attacks (DA3–DA5) are against cipher variants with the encryption phase. The cube bits are chosen from the plaintext, the nonce or the associated data. We note that for DA3 to DA5, there are some smaller cubes of size less than 32 that work up to 438 rounds; however, the superpoly of the 32-bit cube tester does not pass beyond 437 rounds. As a result, we have identified 437 rounds as the upper bound on the number of rounds for which the attacks work and allow 32-bit cube testers to be found.
Note that the designers of TinyJAMBUv2 claim that after 512 rounds, all output bits in the keystream are affected by all input bits. Based on our results, we expect that the full dependency is achieved after 437 rounds. This conclusion is based on the fact that the deterministic 32-bit cube tester does not pass beyond 437 rounds, which means that all 32 cube variables are present in the output equation obtained after 437 rounds. As the degree of the output function increases significantly after each round, it is expected that the 32 cube variables will be present in all the output equations obtained after 437 rounds. Hence, the cube is unlikely to pass beyond 437 rounds.
We emphasize that the results reported in this paper do not threaten the security of TinyJAMBU. We hope that the cubes identified in the work contribute to a better understanding of security strengths and limitations of the cipher.

Code availability
The source code and detailed experimental results for all our implementations can be accessed from: https://github.com/cst1709690/tinyJambuCubeAttack.