Identity-based controlled delegated outsourcing data integrity auditing scheme

With the continuous development of cloud computing, cloud storage applications have become increasingly popular. To ensure the integrity and availability of cloud data, scholars have proposed numerous cloud data auditing schemes, yet most fail to address delegated outsourced data integrity, controlled outsourcing, and source file auditing. We therefore propose a controlled delegated outsourcing data integrity auditing scheme built on the identity-based encryption model. Our scheme allows users to designate a dedicated proxy to assist in uploading data to the cloud. Authorized proxies use recognizable identities for authentication and authorization, avoiding the cumbersome certificate management of a secure distributed computing system. In addition, our scheme adopts a bucket-based red–black tree structure to realize dynamic data updating efficiently: data updates and structural rebalancing are completed in constant time, making data operations highly efficient. We define the security model of the scheme in detail and prove the scheme's security under standard hardness assumptions. In the performance analysis section, the proposed scheme is compared experimentally with related schemes, and the results show that it is both efficient and secure.


Related work
Cloud data auditing has garnered increasing attention in recent years. Ateniese et al.11 first proposed a public cloud data auditing scheme based on RSA homomorphic tagging technology. This scheme enables remote verification of cloud data integrity by randomly selecting a subset of data blocks for auditing. Yang et al.12 proposed a privacy-preserving auditing protocol that combines homomorphic authentication tags with a random masking approach; the combination ensures that a third-party auditor cannot access the user's private information while verifying integrity. Li et al.13 and Zheng et al.14 proposed approaches that tackle privacy preservation for cloud data. Ping et al.15 proposed a cloud data auditing approach that uses random sampling to verify data integrity; however, it is limited to queries and does not support dynamic updates of cloud data. Yu et al.16 proposed an attribute-based cloud data auditing scheme in which users can define custom attribute sets and designate authorized third-party auditors to inspect the integrity of outsourced data. Jalil et al.17 presented a public auditing scheme for cloud data that utilizes BLS signatures; it effectively achieves public auditing goals while preserving data privacy, but its efficacy depends strongly on the PKI, necessitating intricate certificate management. Ji et al.18 proposed an identity-based auditing scheme that effectively addresses the certificate management challenges of the PKI setting.
For delegated outsourced data integrity auditing, Guo et al.19 introduced a dynamic provable data possession approach in which frequent auditing tasks are shifted to an external auditor, alleviating the validation overhead on the client. The scheme also incorporates a secure auditor responsible for verifying the integrity of outsourced files; notably, the auditor is not granted access to the contents of the user's files. Yang et al.20 proposed a proof-of-storage approach that allows delegation and supports third-party auditors: if the designated auditor becomes unavailable, a new auditor can be appointed at any moment to conduct data integrity verification. Rao et al.21 introduced a dynamic outsourcing auditing approach that mitigates dishonest entities and conflicts while enabling verifiable dynamic updates of outsourced data. Zhang et al.22 introduced an identity-based data outsourcing approach with public integrity verification, wherein the original data owner can delegate a proxy to produce the data's signatures and then outsource the data to a cloud server. However, none of the above schemes supports a controlled delegated data outsourcing mechanism.
In response to the data update problem, Thangavel et al.23 proposed a cloud storage auditing scheme based on a Ternary Hash Tree (THT) that supports dynamic updating of data and improves update efficiency over binary trees. Zou et al.24 presented a public auditing scheme that uses a Rank-based Merkle Hash Tree (RMHT) to enable auditing of secondary file blocks. Hariharasitaraman et al.25 introduced a public authentication approach based on a Position-aware Merkle Tree (PMT); the scheme incorporates a ternary tuple structure and exhibits strong resilience in providing authentication and data integrity services. Li et al.26 constructed an efficient certificateless verifiable data possession mechanism and used a Dynamic Hash Table (DHT) to facilitate data updates and provide data privacy protection. Peng et al.27 proposed a cloud storage auditing scheme based on a Multi-Replica Position-Aware Merkle Tree (MR-PMT), which can effectively verify the integrity of multi-replica data; however, auditing efficiency degrades once replica files exceed a certain size. Rao et al.21 presented the Batch-Leaves-Authenticated Merkle Hash Tree (BLA-MHT), which allows batch authentication of multiple leaf nodes and their corresponding indices while resisting replacement attacks.
In Table 1, we present a comparative analysis of our proposed scheme and several relevant schemes, focusing on controlled data outsourcing, certificate management, source data auditing, and dynamic updates. Source data auditing refers to auditing and verifying the source files; in our research, we propose a reliable source file auditing mechanism to ensure the consistency and integrity of the source files during data outsourcing. Theoretical analysis and experimental results substantiate the resilience and security of the proposed scheme without discernible performance degradation.
Organization The subsequent sections of the paper are structured as follows: "Preliminaries" section reviews the building blocks necessary for this study. "Method" section gives a comprehensive account of the proposed technique. "Security analysis" section provides a security analysis of the proposed scheme. "Performance analysis" section evaluates the proposed scheme's performance. "Conclusion" section concludes the paper.

Preliminaries
Bilinear mapping
A bilinear mapping28 is defined as follows. Given two multiplicative cyclic groups G1 and G2 of order p, where p is a large prime, let g be a randomly chosen generator of G1. The bilinear pairing function e : G1 × G1 → G2 has the following three properties: 1. Bilinearity: for all u, v ∈ G1 and x, y ∈ Z*p, e(u^x, v^y) = e(u, v)^{xy}; 2. Computability: for all u, v ∈ G1, there exists an efficient algorithm to compute e(u, v); 3. Non-degeneracy: for a generator g of G1, e(g, g) ≠ 1.
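The bilinearity property can be sanity-checked in a toy model that represents each group element by its discrete logarithm (which a simulator knows, even though real parties do not). The following sketch is illustrative only; the tiny group order q and all exponents are our assumptions, not parameters of the scheme:

```python
# Toy model of a symmetric bilinear pairing e: G1 x G1 -> G2.
# Elements of G1 are represented by their exponents mod q, so the
# pairing "e(g^u, g^v) = e(g, g)^(u*v)" becomes multiplication of exponents.
q = 7919  # toy prime group order (assumption, for illustration only)

def pair(u, v):
    # e(g^u, g^v) maps to exponent u*v mod q in G2
    return (u * v) % q

g = 1           # the generator g corresponds to exponent 1
u, v = 17, 29   # u = g^17, v = g^29
x, y = 1234, 5678

# Bilinearity: e(u^x, v^y) == e(u, v)^{xy}
assert pair(u * x % q, v * y % q) == pair(u, v) * (x * y) % q
# Non-degeneracy: e(g, g) is not the identity (exponent 0)
assert pair(g, g) != 0
```

A real implementation would use an actual pairing-friendly curve library; this model only checks that the algebraic laws stated above are consistent.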

Difficulty assumptions
Computational Diffie–Hellman (CDH) problem7: Given a multiplicative cyclic group G1 of large prime order p with a randomly chosen generator g, and a tuple (g, g^a, g^b) ∈ G1 with unknown a, b ∈ Z*p, it is hard for any probabilistic polynomial-time algorithm to compute g^{ab}.
DL problem29: Given a multiplicative cyclic group G1 of large prime order p with a randomly chosen generator g, and a pair (g, g^a) ∈ G1, it is hard for any probabilistic polynomial-time algorithm to compute a ∈ Z*p.
Table 1. The comparison of different program functions.

Data structure
The red–black tree (RBT)30 is a self-balancing binary search tree in which each node carries an additional attribute indicating its color, red or black. It maintains the balance of the tree by rotating or recoloring nodes. During dynamic update operations, rebalancing after an insertion or deletion requires only O(1) rotations, while lookup has a worst-case time complexity of O(log n). The properties of the red–black tree ensure that an RBT of n nodes always maintains a height of O(log n).
The bucket-based red–black tree (B-RBT) is a balanced search tree that can reach a rebalanced state in constant worst-case time. Specifically, instead of storing a single key in each leaf of the search tree, the structure keeps a bucket in each leaf that stores an ordered list of multiple keys. The B-RBT proposed in this scheme does not require a global reconstruction technique: unlike a traditional red–black tree, deleting nodes never triggers a global rebuild, so the structure requires less space and time. The rules of the B-RBT structure are as follows: (1) each node has a color, either red or black; (2) the root node is black; (3) each leaf node is a black bucket; (4) if a node and its parent are both red, its children are all black, i.e., every path contains at most two consecutive red nodes; (5) the number of black nodes on the paths from any node to its leaf nodes differs by at most 1. The B-RBT structure is shown in Fig. 1: each node carries an extra bit indicating its color and stores five attributes, namely a bucket of multiple keys, a left-child pointer, a right-child pointer, a parent pointer, and a color bit. The structure inserts elements into buckets rather than internal tree nodes, and each tree node stores a path value that is less than or equal to the keys stored in the bucket of its right subtree and greater than the keys in the bucket of its left subtree. Furthermore, a bucket is split when it grows beyond a given size, and two neighboring buckets are merged when they shrink below a given size. Each bucket contains a header that attaches the bucket to a tree node and stores additional information about it, such as its size. For bucket sizes, our scheme preserves a bucket threshold consistent with that proposed by Elmasry et al.31, i.e., the number of keys in each bucket must lie between H/2 and 2H. A splitting operation may require adding a new node to the tree and may break the equilibrium of the original B-RBT. Our goal is to maintain pointers to the first, middle, and last list nodes in the header of each bucket so that each update operation invokes a repair procedure and restores balance before the next split or merge violates the tree's equilibrium, guaranteeing that these pointers remain correctly set up until then.
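The bucket split/merge rule can be sketched in isolation from the tree itself. In the following Python sketch the capacity parameter H and all names are ours, not the paper's; a bucket overflowing past 2H keys is split, and two under-full siblings are merged back:

```python
import bisect

H = 4  # bucket capacity parameter; keys per bucket kept between H/2 and 2H (assumption)

class Bucket:
    """Leaf bucket of the B-RBT: an ordered list of keys."""
    def __init__(self, keys=None):
        self.keys = sorted(keys or [])

    def insert(self, key):
        bisect.insort(self.keys, key)  # keep the bucket's key list ordered

    def overflow(self):
        return len(self.keys) > 2 * H

    def split(self):
        # Split an overflowing bucket into two halves; the tree would then
        # attach both halves under a new red internal node.
        mid = len(self.keys) // 2
        return Bucket(self.keys[:mid]), Bucket(self.keys[mid:])

def merge(left, right):
    # Merge two adjacent under-full sibling buckets into one.
    return Bucket(left.keys + right.keys)

b = Bucket()
for k in range(9):       # 9 keys > 2*H = 8, so the bucket overflows
    b.insert(k)
assert b.overflow()
b1, b2 = b.split()
assert b1.keys + b2.keys == list(range(9))

# Deletion side: two half-empty siblings merge back into a single bucket.
m = merge(Bucket([1, 2]), Bucket([3]))
assert m.keys == [1, 2, 3]
```

In the full structure, a split also adds a red internal node above the two new buckets, and the repair procedure restores the color invariants in constant time.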

Method
System model
The scheme proposed in this paper contains a total of five entities: DO, KGC, PS, TPA and CSP, and the system model is shown in Fig. 2. Our proposed system model provides fine-grained control and authorization mechanisms to ensure data integrity with efficient dynamic data updating and source file auditing capabilities.
DO is the data owner, who stores data in the cloud. CSP is the cloud service provider, which offers a powerful storage service to the DO. KGC is the key generation center, which issues identity-based private keys to the DO and PS. PS is the proxy service provider that the DO authorizes to upload data to the CSP, and TPA is the third-party auditor that verifies data integrity on the DO's behalf. The DO blinds the data before uploading it to the PS. Blinding is a data processing technique aimed at concealing or protecting sensitive information while maintaining the utility of the data. In our research, the primary purpose of employing blinding is to safeguard the privacy and confidentiality of the data, preventing unauthorized access and leakage.

Threat models
The proposed scheme faces three main types of attacks: 1. A malicious PS may impersonate a DO or another authorized PS, or abuse the DO's delegation by processing the DO's files and uploading them to CSP storage. 2. Due to hardware failure or self-interest, a malicious CSP may delete or modify the DO's files, especially those accessed infrequently. 3. During the data outsourcing phase, the PS or CSP may try to steal the DO's data; moreover, the semi-trusted TPA is unreliable in terms of data privacy and is curious about the content of the data outsourced by DOs during the audit process.

Security goals
Considering that the scheme may be subject to the above attacks, our scheme aims to achieve the following security objectives: 1. Controllable delegation: proofs of authorization generated by the DO can only be used by the specified PS to outsource the specified files. Even an authorized PS cannot misuse authorization proofs to outsource unspecified files, and multiple PSs cannot collude to derive new authorization proofs for outsourcing unspecified files. 2. Audit correctness: only when both the data evidence and the tag evidence generated by the CSP are valid can the proof pass TPA validation. 3. Privacy preservation: to protect the sensitive information in outsourced data, the data must be blinded during the data outsourcing phase, and the TPA must not be able to retrieve the user's data content during the audit process.

Overview of the proposed scheme
The identity-based controlled delegated outsourced data integrity auditing scheme proposed in this paper consists of three phases (setup phase, auditing phase, and dynamic updating phase); the setup and auditing phases comprise the following eight polynomial-time algorithms: SysSetup, KeyGen, DeleGen, DataBlind, DataOutsourcing, ChalGen, ProofGen, and ProofVerify.

Setup phase
1. SysSetup(1^κ) → (SysPara, msk). System setup algorithm. The KGC executes this algorithm; it takes the system security parameter κ as input and outputs the system global parameters SysPara and the KGC's master key msk. 2. KeyGen(SysPara, msk, ID_i) → sk_i. Key generation algorithm. The KGC executes this algorithm, taking as input the global parameters SysPara, the master key msk, and an identity ID_i, and outputs the corresponding key sk_i. 3. DeleGen(SysPara, sk_o, Ω) → (Ω, Ŵ). Delegation generation algorithm. The DO executes this algorithm, where Ω is the proof of authorization and Ŵ is the delegation proof. 4. DataBlind(M, π) → M′. Data blinding algorithm. The plaintext data M and the blinding factor π are used as input, and the blinded data M′ is output. 5. DataOutsourcing(SysPara, Dele_Info, sk_p, M′) → (τ, M*). Data outsourcing algorithm. The PS executes this algorithm, taking as input the global parameters SysPara, the delegated authorization certificate information Dele_Info, the PS's key sk_p, and the blinded source data M′ to be outsourced, and outputs the corresponding file tag τ and the processed data M*.

Audit phase
1. ChalGen(SysPara, τ) → chal. Audit challenge generation algorithm. The TPA executes this algorithm, taking as input the global parameters SysPara and the file tag τ. After the file tag τ passes the TPA's legitimacy verification, it outputs the audit challenge chal. 2. ProofGen(SysPara, chal) → proof. Proof generation algorithm. The CSP executes this algorithm, taking as input the global parameters SysPara and the audit challenge chal, and outputs the audit challenge proof proof = (δ, η), where δ is the tag evidence and η is the data evidence. 3. ProofVerify(SysPara, τ, proof) → (True/False). Proof validation algorithm. The TPA executes this algorithm, taking as input the global parameters SysPara, the audit challenge proof proof, and the file tag τ, and outputs True when the evidence passes validation, otherwise False.

Security definitions
This subsection presents formal security definitions that capture the above security goals. Two probabilistic polynomial-time adversaries, A1 and A2, are used to simulate a malicious PS and a malicious CSP, respectively. A1 can simulate collusion in order to forge or misuse the DO's authorization, and A2 can simulate the CSP modifying the DO's stored files without being detected.

Game 1:
We define a security game against a malicious PS in the probabilistic polynomial-time setting. The adversary A1 plays the following game with the challenger C. Setup: The challenger C runs the system setup algorithm SysSetup(1^κ) → (SysPara, msk) to obtain the global parameters SysPara and the master key msk, and sends SysPara to adversary A1.
Queries: Adversary A1 issues polynomially many queries to the challenger C, and C responds as follows: 1. Extraction query: A1 performs a private key extraction query on any identity ID_i; C computes the private key sk_i of ID_i and sends it to A1. 2. Delegation query: A1 submits a proof of authorization Ω to C. If the DO's private key sk_o has not been queried before, C first generates sk_o; then C responds with the delegation proof Ŵ. 3. File processing query: A1 submits the proof of authorization (Ω, Ŵ) to C. If the PS's private key sk_p and the proof of authorization have not been queried before, C generates them first; then C responds with the processed file M*.
Output: Eventually, adversary A1 outputs a processed file M* under a legal authorization Ω. Adversary A1 wins the game if the following conditions are met: 1. A1 did not obtain the private key of ID_o through an extraction query; 2. A1 did not perform a delegation query on the proof of authorization Ω; 3. A1 did not perform a file processing query involving the proof of authorization Ω; 4. the delegation proof Ŵ issued by ID_o is valid for ID_p.
Definition 1 An identity-based CDODIA scheme is secure against adaptive impersonation and abuse of delegation proofs if any probabilistic polynomial-time adversary A1 wins the above game with only negligible probability.

Game 2:
We define a security game against a malicious CSP with a probabilistic polynomial-time adversary A2, who plays the following game with the challenger C. Setup: Challenger C runs the system setup algorithm SysSetup(1^κ) → (SysPara, msk) to obtain the global parameters SysPara and the master key msk, and sends SysPara to adversary A2.

Queries: Challenger C adaptively interacts with adversary A2 to run the integrity auditing protocol. Here adversary A2 plays the role of a prover, responding to any integrity challenge initiated by the challenger C.

1. Challenge generation query: Challenger C generates an audit challenge for a particular data file and sends it to A2. 2. Challenge response query: Adversary A2 generates evidence as a challenge response based on the processed file. 3. Challenge validation query: Challenger C validates the challenge-response evidence and returns the validation result to adversary A2.

Output: Finally, challenger C and adversary A2 complete the last round of the integrity audit protocol. Challenger C sends an audit challenge chal against a processed file M*, and adversary A2 generates evidence proof as a challenge response. Assume that the file M* stored in the CSP has been corrupted or modified and that the audit challenge chal covers the corrupted or modified data. If the evidence proof contains a tuple (ϑ, {η_j}_{1≤j≤c}) that is valid for the audit challenge chal and the processed file M*, yet differs from the response derived from the correctly maintained file M*, then adversary A2 wins the game.
Definition 2 An identity-based CDODIA scheme is secure against data modification attacks if any probabilistic polynomial-time adversary A 2 wins the above game only with negligible probability.

Detailed description of the proposed scheme
In this section, the three stages of the proposed scheme are described in detail, and the audit process is illustrated in Fig. 3.

Detailed description of setup phase
Based on the security parameter κ, KGC chooses two multiplicative cyclic groups G1 and G2 of large prime order p, a random generator g ∈ G1, and a bilinear pairing function e : G1 × G1 → G2. KGC randomly selects g1 ∈ G1 and x ∈ Z*p, computes y = g^x, and sets the master key msk = g1^x. Three collision-resistant hash functions H1, H2, and H3 are chosen. The final system parameters SysPara = {G1, G2, e, p, g1, y, H1, H2, H3, {μ_i}_{0≤i≤l}, {v_i}_{0≤i≤ℓ}, {u_i}_{0≤i≤n}} are published, and the master key msk is kept secret by KGC itself.
After receiving the identity ID_i from the DO or PS, KGC generates the private key sk_i for the DO and PS from its master key msk as follows: first, KGC calculates u_i according to Eq. (1) and derives sk_i from it. To authorize a legitimate proxy PS, the DO generates an authorization proof Ω containing the DO's identity ID_o and the proxy PS's identity ID_p. The authorization proof may also contain other information related to the source data M, such as the time of delegation and the file type, i.e., Ω = (ID_o||ID_p||TimeStamp||Type). Based on the system parameters SysPara, the DO first calculates u_o and ϕ. Then it randomly selects σ_φ ∈ Z*p and generates the delegation proof Ŵ = (α, β, γ). Finally, the proof of delegation (Ω, Ŵ) is sent to the proxy PS. After receiving the delegation proof from the DO, the PS can verify the legitimacy of the delegation information by calculating u_o and ϕ and checking Eq. (6). If the equation holds, the PS accepts the delegation from the DO; otherwise the delegation request is rejected.
Given an outsourced data source file M ∈ {0, 1}* with legitimate proof-of-authorization information, the DO divides the file into n data blocks, M = {m_1, m_2, . . ., m_n}. A blinding factor π ∈ Z*p is randomly selected, and each blinded data block m′_i = (m_i||i) + π is computed. Finally, the blinded file M′ is sent to the PS.
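The blinding step can be illustrated with a short sketch. Here the concatenation (m_i||i) is modeled by appending a 4-byte index to the block and reading the result as an integer, and the modulus p is a stand-in prime; both encodings are our assumptions for illustration only:

```python
import secrets

p = 2**255 - 19  # stand-in prime modulus for the group order (assumption)

def encode(m_i: bytes, i: int) -> int:
    # (m_i || i): append a 4-byte index to the block and read it as an integer
    return int.from_bytes(m_i + i.to_bytes(4, "big"), "big")

def blind_blocks(blocks, pi):
    # m'_i = (m_i || i) + pi  (mod p), with one shared blinding factor pi
    return [(encode(m, i) + pi) % p for i, m in enumerate(blocks, start=1)]

def unblind(blinded, pi):
    # the DO, who knows pi, can strip the blinding again
    return [(c - pi) % p for c in blinded]

pi = secrets.randbelow(p - 2) + 1   # blinding factor in Z_p^*
blocks = [b"block-one", b"block-two", b"block-three"]
blinded = blind_blocks(blocks, pi)
assert unblind(blinded, pi) == [encode(m, i) for i, m in enumerate(blocks, 1)]
```

Without pi, the blinded values give the PS or CSP no direct access to the block contents, which is the property the scheme relies on in the outsourcing phase.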
After receiving the blinded data from the DO, the PS first divides the blinded data M′ into r blocks of c sectors each. Second, the PS randomly selects the source file identifier F_id ∈ Z*p and a random value σ_F ∈ Z*p, computes v_F = g^{σ_F}, and sets τ_0 = Ω||F_id||v_F||r. The PS selects a signature algorithm PS.Sign = (KeyGen, Sign, Verify)32 to sign τ_0 and generates the source file tag τ = τ_0||PS.Sign.Sign(τ_0, ssk)||spk, where the signing public–private key pair (ssk, spk) is generated by the algorithm PS.Sign.KeyGen(1^κ). Then the PS uses its private key sk_p to sign each data block, where the block identifier is formed as Ω||F_id||i. Finally, the processed source file M* = (M′, F_id, Ŵ, {ϑ_i}_{1≤i≤c}) and (τ, {ϑ_i}_{1≤i≤c}) are sent to the CSP and TPA, respectively, and M* and {ϑ_i}_{1≤i≤c} are deleted locally.
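The construction of the file tag τ = τ_0||Sign(τ_0, ssk) can be sketched as follows, with HMAC-SHA256 standing in for the scheme's signature algorithm PS.Sign (a symmetric stand-in: a real deployment would verify with the public key spk). Field values and separators are illustrative assumptions:

```python
import hmac, hashlib, secrets

def make_file_tag(omega: bytes, f_id: bytes, v_f: bytes, r: int, ssk: bytes) -> bytes:
    # tau_0 = Omega || F_id || v_F || r ; tau = tau_0 || Sign(tau_0, ssk)
    tau0 = b"|".join([omega, f_id, v_f, str(r).encode()])
    sig = hmac.new(ssk, tau0, hashlib.sha256).digest()  # 32-byte stand-in signature
    return tau0 + sig

def verify_file_tag(tau: bytes, ssk: bytes) -> bool:
    # TPA-side legitimacy check of the file tag before the audit begins
    tau0, sig = tau[:-32], tau[-32:]
    return hmac.compare_digest(sig, hmac.new(ssk, tau0, hashlib.sha256).digest())

ssk = secrets.token_bytes(32)
tau = make_file_tag(b"OMEGA", b"file-42", b"vF", 1024, ssk)
assert verify_file_tag(tau, ssk)                                   # intact tag verifies
assert not verify_file_tag(tau[:-1] + bytes([tau[-1] ^ 1]), ssk)   # tampering is caught
```

The point of the sketch is the structure of τ: a self-describing header τ_0 plus a signature over it, so the TPA can reject an illegitimate tag before issuing any challenge.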

Detailed description of audit phase
In this phase, the TPA performs integrity auditing of the cloud data periodically, using the following algorithms. Before the integrity audit begins, the TPA runs the algorithm PS.Sign.Verify(τ_0, spk) to verify the legitimacy of the file tag τ. If the validation fails, the audit request is rejected; if it passes, the TPA randomly selects a non-empty subset S_F of the set [1, r] and randomly selects s_i ∈ Z*p for each i ∈ S_F to generate the audit challenge chal = {(i, s_i)}_{i∈S_F}, and finally sends (chal, F_id) to the CSP.
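The TPA's challenge generation reduces to sampling a random non-empty index subset with fresh coefficients. A minimal sketch (the parameters r and c and the toy prime p are our assumptions):

```python
import secrets

p = 7919  # toy prime standing in for the group order (assumption)
_rng = secrets.SystemRandom()

def gen_challenge(r: int, c: int):
    # chal = {(i, s_i)}: a random c-element subset S_F of [1, r]
    # with a fresh random coefficient s_i in Z_p^* per challenged index
    indices = sorted(_rng.sample(range(1, r + 1), c))
    return [(i, _rng.randrange(1, p)) for i in indices]

chal = gen_challenge(r=1000, c=5)
assert len(chal) == 5
assert all(1 <= i <= 1000 and 1 <= s < p for i, s in chal)
```

Because the subset and coefficients are fresh per audit, the CSP cannot precompute or replay old responses, which is what makes spot-checking a small subset sound.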
Upon receiving the audit challenge from the TPA, the CSP locates the file M* to be audited based on (chal, F_id) and calculates the tag proof ϑ and the data proof {η_j}_{1≤j≤c}. Finally, proof = (ϑ, {η_j}_{1≤j≤c}) is sent to the TPA along with the delegation proof Ŵ.
After receiving the proof, the TPA calculates (u_o, u_p, ϕ) based on the system parameters SysPara and the file tag τ. Then the TPA verifies the legitimacy of the delegation proof Ŵ according to Eq. (6); if the equation holds, the PS possesses a proof of legitimate authorization. Finally, the TPA performs the integrity audit by checking Eq. (10). If the equation holds and outputs True, the file stored in the CSP is complete; otherwise, if it outputs False, the file is incomplete.
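The shape of the audit equation (one aggregate check covering all challenged blocks) can be illustrated with a linearly homomorphic toy MAC instead of pairings. Here the secret k stands in for the signing key; unlike the real scheme, this toy verifier holds k itself, so the sketch demonstrates only the aggregation, not public verifiability:

```python
import secrets

p = 7919  # toy prime modulus (assumption)
k = secrets.randbelow(p - 1) + 1          # stand-in for the signer's secret key

blocks = [secrets.randbelow(p) for _ in range(8)]
tags = [(k * m) % p for m in blocks]      # toy homomorphic tag t_i = k * m_i

# CSP side: aggregate the evidence for a challenge chal = {(i, s_i)}
chal = [(i, secrets.randbelow(p - 1) + 1) for i in (1, 4, 6)]
eta = sum(s * blocks[i] for i, s in chal) % p    # data evidence
sigma = sum(s * tags[i] for i, s in chal) % p    # tag evidence

# TPA side: a single equation checks all challenged blocks at once
assert sigma == (k * eta) % p

# Corrupting any challenged block breaks the check
blocks[4] = (blocks[4] + 1) % p
eta_bad = sum(s * blocks[i] for i, s in chal) % p
assert sigma != (k * eta_bad) % p
```

The pairing-based equation (Eq. (10)) plays the same role: linearity of the tags lets one compact proof attest to an arbitrary challenged subset.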

Detailed description of dynamic update phase
The B-RBT structure proposed in this scheme effectively supports dynamic updating of data (modification, insertion, and deletion) as follows.
1. Data insertion: When the DO needs to insert a new data block after a given block, the DO generates the dynamic update information Update_Info_PS = (Insert, i, Dele_Info′, M′) and sends it to the PS, where Insert is the insert command, i is the insertion position, M′ is the newly inserted data, and Dele_Info′ is the delegated authorization proof information for the new data M′. After receiving the update information, the PS processes the new data M′, generates the new update information Update_Info_TPA = (Insert, i, M′*), and sends it to the TPA. The TPA receives the update information, stores the signature of the data block to be inserted in the B-RBT data structure, inserts a new list node at the specified position of the specified bucket B, and calls the repair procedure of bucket B.
When the bucket exceeds the maximum threshold, if the repair pointer P_K has not yet reached the root node R, the repair procedure of B is called repeatedly until P_K reaches R. Bucket B is then split into two buckets {B_1, B_2}, the new parent node B_12 of {B_1, B_2} is added to the tree as a red node, and the pointers of the two new buckets are made to point to the newly inserted nodes. Finally, the repair procedure is called again for one of the two buckets to complete the signature storage, and these data records are updated in the CSP. Figure 4a shows that after inserting data into bucket S of Fig. 1, the bucket's capacity exceeds the maximum threshold, so bucket S is divided into two buckets, managed by a new node K that is labeled red.
2. Data deletion: When the DO needs to delete a data block, the DO generates the corresponding dynamic update information Update_Info_PS = (Delete, i, Dele_Info′) and sends it to the PS. When two adjacent buckets B and B′ shrink below the minimum threshold, they are merged into a single bucket B′′ by appending the right bucket to the end of the left bucket, and the parent node of B′′ becomes the grandparent node of the original B or B′. If the deleted parent node was black, B′′ is marked double black, completing the deletion of the data block signature, and these data records are updated in the CSP. Figure 4b shows the merging of bucket L and its sibling bucket L′ under node M when, after deleting the data of bucket L in Fig. 1, the size of the sibling bucket L′ would fall below the minimum threshold.
3. Data modification: Modification corrects information already stored in the CSP. It uses the same operations as insertion and deletion: the original data block is deleted, and the new data is then inserted in its place.

Security analysis
Assume the entities in the proposed scheme honestly execute the individual protocols as expected; then key generation, delegation generation, and file processing can be audited correctly.
Theorem 1 (Privacy preservation) PS, TPA, and CSP cannot extract the DO's original data from the data they acquire. In the data outsourcing phase, the PS and CSP cannot extract the DO's real data from the blinded data. In the auditing phase, the TPA cannot obtain the DO's real data from the data signatures.
Proof Data privacy means that an adversary cannot obtain any original data information without the corresponding decryption key. First, in the data outsourcing phase, the blinded data blocks M′ are produced with blinding factors π ∈ Z*p generated from key seeds randomly selected by the DO, and the blinded data blocks are generated independently of each other; hence recovering the real data M from them reduces to the DL problem, whose solving probability is negligible. Second, in the auditing phase, given the values (sk_{p,1} · H_3(Ω))^{s_i} and g^{σ_F}, it is computationally infeasible to derive ∏_{i∈S_F} (sk_{p,1} · H_3(Ω)^{s_i})^{σ_F}, and ∏_{j=1}^{c} (u_j^{η_j})^{σ_F} can be viewed as blinded by ∏_{i∈S_F} (sk_{p,1} · H_3(Ω)^{s_i})^{σ_F}. Thus the TPA trying to extract the real data from the tag proof ϑ faces a CDH problem, whose solving probability is negligible.
Proof From the previous section, the conditions for Eq. (4) to hold are obvious, and its correctness is not proved in detail here. From Eq. (6), u_o = (u_{o,1}, u_{o,2}, . . ., u_{o,l}) and ϕ = (φ_1, φ_2, . . ., φ_ℓ) can be calculated by Eqs. (1) and (5); therefore Eq. (6) holds.
The evidence proof returned by the CSP can be expanded by substituting the definitions of the tag proof ϑ and the data proof {η_j} into Eq. (10); the derivation shows that Eq. (10) holds.
Theorem 3 If the CDH assumption holds and the signature scheme PS.Sign = (KeyGen, Sign, Verify) employed by the PS is computationally secure, then our scheme is secure against adaptive impersonation attacks and abuse-of-delegation attacks, i.e., no cloud client or malicious CSP can collude with a PS to construct a new delegation proof and use it to outsource files.
Proof Suppose that adversary A generates, with probability ε, a processed file containing a forged delegation. Then, given a multiplicative cyclic group G1 with generator g and a CDH instance (g, g^χ, ζ), the challenger C interacts adaptively with the adversary A to compute ζ^χ.
Setup: Challenger C sets l_u = 2(q_e + q_s) and l_w = 2q_s, where q_e and q_s denote the total numbers of extraction queries and delegation queries, respectively. Assume that l_u(l + 1) < p and l_w(ℓ + 1) < p. Next, the challenger C generates the public parameters; these parameters are indistinguishable from the system global parameters SysPara.
Queries: Adversary A can adaptively interact with the challenger C and perform the following queries: 1. Extraction query: Adversary A can adaptively submit an identity ID_i to challenger C to obtain its private key. Since challenger C does not hold the master key msk, it responds as follows: C computes u_i according to Eq. (1) and, setting σ′_i = σ_i − x/W(u_i), computes the private key sk_i = (sk_{i,1}, sk_{i,2}). The private key sk_i for identity ID_i can be shown to be legal and valid by verification; therefore, the private key generated by this method is indistinguishable from a real private key. If W(u_i) = 0, the algorithm terminates. 2. Delegation query: Adversary A may adaptively submit a proof of authorization Ω to the challenger C.
Challenger C can respond as follows.
Challenger C first calculates u_o according to Eq. (1) and then calculates W(u_o). There are two cases. Case 1: W(u_o) ≠ 0. Challenger C generates a private key sk_o for the DO as in the extraction query, generates a delegation proof Ŵ according to the proposed scheme, and returns it to the adversary A. Case 2: W(u_o) = 0. For the given proof of authorization Ω, challenger C computes ϕ according to Eq. (5); the delegation proof generated by this method is indistinguishable from a real delegation proof Ŵ. If F(ϕ) = 0, the algorithm terminates.
3. File-processing query: Challenger C accepts the proof of authorization and Ŵ submitted by adversary A. In the proposed scheme, a file-processing query builds on an extraction query and a delegation-of-authority query. Challenger C computes W(u_o) and F(ϕ); if W(u_o) = 0 and F(ϕ) = 0, challenger C terminates the algorithm. Otherwise, challenger C executes the following algorithm:
Step A: Challenger C generates a proof of delegation Ŵ in the same way as in the delegation-of-authority query.
Step B: First, challenger C computes u_p according to Eq. (1) and checks whether W(u_p) is zero; if W(u_p) = 0, it terminates the algorithm, and otherwise it generates the private key sk_p for PS as in the extraction query. Second, challenger C randomly selects σ_F ∈ Z*_p and computes v_F = g^{σ_F}; for each data block m_i = (m_{i,1}, m_{i,2}, …, m_{i,c}), 1 ≤ i ≤ r, it randomly selects σ_i ∈ Z*_p. Finally, challenger C computes the corresponding metadata tag ϑ_i = sk_{p,1} · (g^{σ_i})^{σ_F} for the data block m_i. It follows that the metadata tags generated by this approach are indistinguishable from the tags generated by the real scheme.
Step C: After completing the above steps, challenger C sends the processed file M* = {{ϑ_i}_{1≤i≤c}, Ŵ, F_id} to adversary A.
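The abort conditions W(·) = 0 and F(ϕ) = 0 in the queries above follow the standard Waters-style partitioning technique implied by the Setup choice l_u = 2(q_e + q_s). A minimal Python sketch of how such a simulator might sample its secret partition (the function names and toy parameters are illustrative, not taken from the scheme):

```python
import random

def waters_partition(l, l_u, p):
    """Sample the simulator's secret partition for a Waters-style hash.

    W(u) = x0 + sum_{j: u_j = 1} x_j - l_u * k  (computed over the integers),
    where u is an l-bit identity hash. W(u) = 0 forces an abort on an
    extraction query, but is exactly the case a useful forgery must hit.
    """
    assert l_u * (l + 1) < p          # the condition assumed in Setup
    k = random.randint(0, l)          # secret shift
    x0 = random.randint(0, l_u - 1)
    xs = [random.randint(0, l_u - 1) for _ in range(l)]
    return k, x0, xs

def W(u_bits, k, x0, xs, l_u):
    return x0 + sum(x for b, x in zip(u_bits, xs) if b) - l_u * k

# Toy run: an extraction query succeeds whenever W(u) != 0.
random.seed(1)
l, q_e, q_s = 8, 3, 2
l_u = 2 * (q_e + q_s)                 # l_u = 2(q_e + q_s) as in Setup
k, x0, xs = waters_partition(l, l_u, p=2**31 - 1)
u = [random.randint(0, 1) for _ in range(l)]
print(W(u, k, x0, xs, l_u))           # nonzero for most identities
```

Because l_u(l + 1) < p, the value W(u) never wraps around mod p, which is what makes the abort analysis go through.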
Output: Finally, if challenger C does not terminate the algorithm, adversary A outputs, with non-negligible probability, a processed file M* that is valid with respect to the proof of authorization; if adversary A wins the game, the forged output satisfies the verification equation. To process a file M with an authorization proof, the simulator calculates u_p = H_2(ID_p) and extracts the secret key sk_p of PS according to the proposed scheme, then processes the file M using the parameters u_j defined from π_j, ω_j ∈ Z*_p.
2. The algorithm interacts with adversary A to execute the integrity-auditing protocol. According to Game 5, the challenger terminates the auditing protocol if the data tags {η'_j}_{1≤j≤c} output by the adversary during auditing differ from the expected data tags {η_j}_{1≤j≤c}.
According to Game 4, we have ϑ' = ϑ, and the remaining terms follow from Eqs. (11) and (12).

Communication overhead
A 160-bit Z*_p parameter setting is chosen in our scheme. In the challenge-generation phase, the audit challenge chal = (i, s_i)_{i∈S_F} is initiated by the TPA to the CSP, and its communication cost is 2|p|; the communication cost incurred by the CSP returning the proof (ϑ, {η_j}_{1≤j≤c}) to the TPA is |p| + |q|. In addition, the data structure of our scheme is stored on the TPA side, which requires a lower communication cost than storing it on the CSP side. In Table 3, we compare the communication overhead incurred by the proposed scheme and by other cloud data auditing schemes when sending audit challenges in the challenge-generation phase and audit proofs in the proof-generation phase.
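The per-round communication cost follows directly from these element counts. A small sketch (taking |q| = |p| = 160 bits purely for illustration; the paper's comparison in Table 3 may use different sizes):

```python
# Communication cost of one audit round, following the counts in the text:
#   challenge chal = (i, s_i)_{i in S_F}  -> 2|p| bits
#   proof (theta, {eta_j})               -> |p| + |q| bits
p_bits = 160   # |p|, the 160-bit Z_p* setting chosen in the scheme
q_bits = 160   # |q|, assumed equal to |p| here for illustration

challenge_bits = 2 * p_bits
proof_bits = p_bits + q_bits

print(challenge_bits, proof_bits)  # 320 320
```

Both costs are constant in the number of challenged blocks, which is the point of the constant-size aggregate proof.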

Experimental analysis
The experimental environment is an AMD Ryzen 7 5800H (3.2 GHz) laptop with 32 GB of RAM, and all simulations are implemented on Ubuntu. The Pairing-Based Cryptography (PBC) library and the GNU Multiple Precision (GMP) arithmetic library are used to implement the corresponding cryptographic operations, and Python is used for data processing and analysis of the experimental results. In our experiments, we chose 2000 data blocks, each of size 8 KB, and the length of p was chosen to be 160 bits.

Time overhead in the key generation and proof of authorization generation phase
The time overhead of generating and verifying the private key for a particular user, and of generating and verifying the proof of delegation, is shown in Fig. 5. The time consumed for key generation, key verification, proof-of-delegation generation, and proof-of-delegation verification is about 9.13 ms, 32.34 ms, 10.75 ms, and 44.07 ms, respectively, which is negligible for deployment in real applications.
The computation overhead at the TPA and CSP sides
Figure 6 shows the time overhead required to audit an outsourced file with a corruption rate of 1%; we simulate the time overhead required to audit the file under different corruption-detection probabilities, i.e., 0.5, 0.6, 0.7, 0.8, 0.9, and 0.99. The simulation results in Fig. 6 show the time overhead our scheme incurs at both the TPA and CSP sides when executing the auditing protocol. To achieve a detection probability of 0.99, the TPA completes the audit in less than 3 s. The computation overhead at the TPA side is higher than at the CSP side, which is caused by the larger number of bilinear pairing operations performed at the TPA side.
Time overhead in the data signature generation phase
Figure 7 illustrates the time overhead of the proposed scheme and of schemes 17, 19, and 21 in the data block signature generation phase. The proposed scheme performs fewer exponentiation operations than schemes 17 and 19, so its computational overhead is lower; scheme 19 in particular performs more multiplication and exponentiation operations, so its computational overhead is the highest.

Time overhead in the data proof generation phase
The CSP generates relevant data proof time performance curves based on challenge audits, as shown in Fig. 8.
From the figure, it can be observed that the proof-generation time of each scheme increases linearly with the number of queried data blocks. Exhaustively checking all data blocks in the cloud would impose a heavy computational burden. Therefore, to improve efficiency, we specify 460 data blocks in the query audit message, which is sufficient to detect data corruption or tampering with 99% probability in a real cloud data auditing system. In this case, the computational overhead of the proposed scheme is only about 1.84 s.
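The figure of 460 challenged blocks comes from the standard sampling argument: if a fraction t of blocks is corrupted, challenging c random blocks detects the corruption with probability 1 − (1 − t)^c. A quick check in Python:

```python
import math

def blocks_needed(corruption_rate, target_prob):
    """Smallest c with 1 - (1 - corruption_rate)**c >= target_prob."""
    return math.ceil(math.log(1 - target_prob) / math.log(1 - corruption_rate))

c = blocks_needed(0.01, 0.99)    # 1% corruption, 99% detection probability
print(c)                         # 459, matching the ~460 blocks used above

detect = 1 - (1 - 0.01) ** 460   # detection probability with 460 blocks
print(round(detect, 4))          # 0.9902
```

Notably, the required number of blocks depends only on the corruption rate and target probability, not on the total file size, which is why a constant-size challenge suffices.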

Time overhead of the data proof verification phase
The time overhead incurred by the TPA during the data-proof verification phase is shown in Fig. 9. As shown in Fig. 9, the verification computation overhead of every scheme grows linearly with the number of challenged data blocks. Compared to schemes 17 and 21, our scheme performs fewer multiplication and exponentiation operations and therefore needs less verification time. Compared with scheme 19, our scheme uses more pairing operations in the verification phase and is therefore slightly less efficient than scheme 19. The time overhead of verifying 1000 data blocks is about 8.29 s.
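The linear growth reflects the homomorphic structure of the tags: the CSP aggregates the challenged blocks and tags, and the TPA checks one combined equation instead of c individual ones. A deliberately simplified sketch, with a linear homomorphic MAC over Z_q standing in for the scheme's bilinear-map check (all names are illustrative, and the verifier here knows the secret x only for illustration; in the real scheme the pairing lets verification use public values):

```python
import random

q = 2**61 - 1                     # toy prime modulus standing in for |q|
random.seed(7)

x = random.randrange(1, q)        # secret tagging key (held by the signer)
blocks = [random.randrange(q) for _ in range(8)]          # file blocks m_i
h = [random.randrange(q) for _ in range(8)]               # per-block hash H(i)
tags = [x * (h[i] + blocks[i]) % q for i in range(8)]     # homomorphic tags

# Challenge: random indices with random coefficients, as in chal = (i, s_i).
chal = [(i, random.randrange(1, q)) for i in random.sample(range(8), 4)]

# CSP side: aggregate challenged data and tags into a constant-size proof.
mu = sum(s * blocks[i] for i, s in chal) % q
tau = sum(s * tags[i] for i, s in chal) % q

# TPA side: one combined check replaces 4 individual tag checks.
ok = tau == x * (sum(s * h[i] for i, s in chal) + mu) % q
print(ok)  # True
```

The aggregation is what keeps the proof size constant, while the verifier's work (recomputing the challenged hash terms) still grows linearly in c, matching the linear curves in Fig. 9.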

Time overhead of the dynamic update phase
We compare the B-RBT data structure of this scheme with the THT scheme 23, the DHT scheme 26, and the MR-PMT scheme 27; the time cost of our scheme is lower. The time complexity of insertion and modification of data blocks is O(1) for both the B-RBT and THT structures, but the time complexity of lookup in the THT structure is O(log n). The time complexities of the DHT and MR-PMT structures are both O(log n), which incurs more time overhead. The time overhead required for the data-update operations of each scheme is shown in Fig. 10.
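The O(1) insert/modify cost of a bucket-based structure can be illustrated with a map of fixed-size buckets keyed by block index. A minimal sketch (the bucket size and class names are illustrative; the real B-RBT additionally organizes the buckets in a red-black tree and rebalances on structural updates):

```python
class BucketIndex:
    """Toy bucket-based index: block id -> version, with O(1) operations.

    A real B-RBT hangs buckets like these off red-black-tree nodes so the
    tree stays shallow while per-block lookup and update stay constant time.
    """

    def __init__(self, bucket_size=64):
        self.bucket_size = bucket_size
        self.buckets = {}                       # bucket id -> {block id: version}

    def _bucket(self, block_id):
        return self.buckets.setdefault(block_id // self.bucket_size, {})

    def insert(self, block_id, version=1):      # O(1) amortized
        self._bucket(block_id)[block_id] = version

    def modify(self, block_id):                 # O(1): bump the version
        b = self._bucket(block_id)
        b[block_id] = b.get(block_id, 0) + 1

    def lookup(self, block_id):                 # O(1): two hash lookups
        return self._bucket(block_id).get(block_id)

idx = BucketIndex()
idx.insert(5)
idx.modify(5)
print(idx.lookup(5))  # 2
```

Tracking a per-block version this way is also what lets an auditing scheme bind each tag to the latest update without rebuilding the whole structure.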

Conclusion
In this paper, we propose a secure identity-based controlled-delegation outsourced data integrity auditing scheme. The data owner generates a proof of authorization delegation and appoints a proxy to upload files to the cloud on his or her behalf; only an authorized proxy server may process and outsource the selected files on behalf of the data owner. The identity-based controlled delegation and public auditing features make our scheme superior to existing cloud data auditing schemes. To achieve efficient data updates, the B-RBT data structure is introduced, which completes each data update in constant time. Security analysis and experimental results show that the scheme is secure and efficient.

Figure 6. The computation overhead at the TPA and CSP sides under different detection probabilities of corruption.
B-RBT data structure. KGC generates private keys for DO and PS. In our system model, DO and KGC are trusted entities. The CSP is an incompletely trusted entity: it is trustworthy with respect to data privacy but untrustworthy with respect to data integrity, and it may damage or tamper with the DO's data. The TPA is a semi-trusted entity: it is trustworthy in terms of integrity verification but untrustworthy in terms of data privacy; it fulfills the DO's auditing tasks as required but is curious about the DO's data content.

Table 2. Comparison of computational overhead.