Proceedings of 2018 Eleventh International Conference on Contemporary Computing (IC3), 2-4 August, 2018, Noida, India
Integrity checking using third party auditor in cloud storage

Sutirtha Chakraborty
Dept. of Computer Science & Engineering
Bhagalpur College of Engineering
Bhagalpur, India
email: sutirtha38@gmail.com

Shubham Singh
Dept. of Computer Science & Engineering
NIT Meghalaya
Shillong, India
email: ShubhamSinghCMR@nitm.ac.in

Surmila Thokchom
Dept. of Computer Science & Engineering
NIT Meghalaya
Shillong, India
email: Surmila.Thokchom@nitm.ac.in
Abstract—Cloud computing, nowadays, is one amongst the rapidly developing technologies around the world, providing a variety of services to its users. One of the salient services offered by the cloud to its users is Storage as a Service (STaaS). This service frees the user from the burden of storing and maintaining data by allowing him to transfer his data to a remote storage server and access it remotely. The user's data is maintained, managed and backed up by the cloud and is made available to him over the internet. But the main concern while using this type of service is the privacy and integrity of the stored data. The data stored at remote locations may be accessed, modified or damaged by an attacker. Users require their data to be safe from all such unauthorized activities, hence techniques for data integrity verification are needed to check whether the integrity of stored data is intact or not. Integrity verification is a challenging task because the user does not possess the data locally. The proposed scheme checks the integrity of data stored at a remote location through a third party auditor using bilinear pairing. Later, aggregation is applied over the proposed scheme for further optimization.

Keywords—Cloud computing, third party auditing, group signature, random masking, integrity check.

I. INTRODUCTION

Cloud computing is defined as a network based computing which provides shared processing data & resources to its users when required [1]. It provides a user various capabilities for storing their data at the cloud server & processing it when required. It is now a highly demanded service because of the offered advantages like low cost services, high computing power & performance, accessibility, scalability as well as availability [2]. It offers various services like platform as a service, software as a service, utility services etc. to its users. Among them, one of the salient services offered is cloud storage. In cloud storage, users' data is stored on multiple third party servers, not on a dedicated server as used in traditional network data storage, and can be accessed as and when required from any part of the world.

Despite the offered benefits, there are some privacy concerns in the services provided by the cloud. The cloud service provider can access the stored data at any time and can modify or delete it [3]. They can share the stored information with third parties, if necessary, which is permitted in their privacy policy. Thus the major concern is the integrity and privacy of stored data.

A solution to protect confidentiality lies in the way the user stores data. The user can encrypt data before storing it in the cloud to prevent unauthorized access [3]. But verifying the authenticity of data stored at an untrusted storage server is hard [4]. Such a server retains a large amount of data, some for long periods of time, during which data loss may occur due to administration errors, data movement etc. It is difficult to detect whether data has been modified while accessing it, as retrieving an entire file stored at a remote server is expensive in I/O cost and limits the scalability of the network stores. Thus the client should be able to verify that the stored data is intact without retrieving it from the storage server and without the server having to access the entire file.

Remote data integrity checking allows a storage server to prove to a verifier that it is actually storing the owner's data. Recently, it has become more and more significant due to the development of online and distributed storage systems. In integrity checking with public verifiability, an external auditor is able to verify the integrity of stored data.

II. RELATED WORK

Cloud computing is a technology which allows a user to subscribe to high quality services from data & software that reside at remote servers. It provides many benefits to the users and minimizes the resources needed at the client's end. But the remote servers at which the data is present are not fully trustworthy and pose many security challenges like the security and integrity of data. A lot of work has been done to address these issues. C. Wang et al. [5] propose a scheme in which a third party auditor verifies the integrity of data without demanding any local copy and introduces no additional cost to the user. The auditing technique uses a public key based homomorphic authenticator with random masking, along with a bilinear aggregate signature to extend the result to a multi-user setting. A dynamic data integrity scheme based on Merkle hash tree is proposed in [6] for the data stored in the cloud. A new cloud storage architecture is proposed in [7]. In this architecture, there are two independent cloud servers: a storage server & an audit server. The audit server, on behalf of the cloud users, pre-processes the data before uploading it to the storage server and later verifies its integrity. This eliminates the user's involvement in the auditing and pre-processing phases. In [8], a novel public verification mechanism for data integrity verification of multi-owner data in an untrusted cloud using
multi-signatures is proposed, where the verification time & storage overhead of signatures are independent of the number of owners. In [9], ways to reduce the damage of a client's key exposure in storage auditing are investigated. The authors propose a protocol which employs a binary tree structure & a preorder traversal technique to update the secret keys of the client. It develops a novel authenticator scheme which supports forward security & blockless verifiability. Another approach to verify data integrity is proposed in [10], where the third party auditor verifies data without asking for any copy and does not bring any vulnerability to it. A number of existing data auditing protocols today face challenges in key management. The complexity is well addressed in [11], where fuzzy identity based auditing is done. Here a user's identity is viewed as a set of descriptive attributes. A paradigm named strong key exposure resilient auditing is put forward in [12] for secure cloud storage, where the security of cloud data auditing both before and after key exposure can be preserved. [13] investigates a protocol developed for cloud storage which is based on the discrete logarithm problem, and extends it to support data dynamics by employing an index vector & third party auditing using a random masking number.
III. PROBLEM STATEMENT

Cloud computing, or computing as a utility, facilitates users to store their data on a remote storage server for later retrieval when needed. By outsourcing their data, users are relieved from the burden of local storage of data and its maintenance. But the remotely stored data is no longer in the control of the owner. This makes data integrity protection a difficult task, and potentially the security of the data is also at risk, especially for those users who have limited computing resources and capabilities.

This paper puts forward an approach to ensure the integrity of the data stored at a remote cloud server with the help of a third party auditor, and also applies aggregation for further optimization of the scheme.
IV. PRELIMINARIES

A. Bilinear Pairing

Let us consider three groups G1, G2 and GT which are cyclic of prime order, say r. Let the generators of groups G1 & G2 be g1 & g2 respectively. Let ψ be an isomorphism from G2 to G1 with ψ(g2) = g1, and let e : G1 × G2 → GT be a bilinear map with the following properties (a short code sketch of the pairing follows the list):

• Computability: An efficient algorithm to compute e exists.

• Bilinearity: ∀u ∈ G1, v ∈ G2 and a, b ∈ Zp, e(u^a, v^b) = e(u, v)^{ab}.

• Non-degeneracy: e(g1, g2) ≠ 1.
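Since the scheme's implementation (Section V) uses the PBC library, the bilinearity property can be exercised directly in code. The following is a minimal sketch; the type A (symmetric) parameter generation and the 160/512-bit sizes are illustrative choices, not prescribed by the text.

#include <pbc/pbc.h>
#include <stdio.h>

int main(void) {
    pbc_param_t param;
    pairing_t pairing;
    element_t u, v, a, b, ua, vb, lhs, rhs;

    /* Generate type A (symmetric) pairing parameters; sizes are illustrative. */
    pbc_param_init_a_gen(param, 160, 512);
    pairing_init_pbc_param(pairing, param);

    element_init_G1(u, pairing);   element_init_G2(v, pairing);
    element_init_Zr(a, pairing);   element_init_Zr(b, pairing);
    element_init_G1(ua, pairing);  element_init_G2(vb, pairing);
    element_init_GT(lhs, pairing); element_init_GT(rhs, pairing);

    element_random(u); element_random(v);
    element_random(a); element_random(b);

    element_pow_zn(ua, u, a);             /* u^a */
    element_pow_zn(vb, v, b);             /* v^b */
    pairing_apply(lhs, ua, vb, pairing);  /* e(u^a, v^b) */

    pairing_apply(rhs, u, v, pairing);    /* e(u, v) */
    element_pow_zn(rhs, rhs, a);          /* e(u, v)^a */
    element_pow_zn(rhs, rhs, b);          /* e(u, v)^{ab} */

    /* Bilinearity: the two values must match. */
    printf("bilinearity holds: %s\n", element_cmp(lhs, rhs) == 0 ? "yes" : "no");

    element_clear(u); element_clear(v); element_clear(a); element_clear(b);
    element_clear(ua); element_clear(vb); element_clear(lhs); element_clear(rhs);
    pairing_clear(pairing); pbc_param_clear(param);
    return 0;
}

The program links against the PBC and GMP libraries (-lpbc -lgmp).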
order p, and a bilinear map e that always takes some element
from G1 and some element from G2, and it outputs exactly
C. Techniques used

• Fragmentation: The aim of this technique is to divide a file into smaller blocks. The main advantage is that a large file becomes manageable in size after fragmentation; the second advantage is that the I/O cost, which dominates while data is being exchanged between Client and TPA or between TPA and CSP, is reduced. The main challenge during fragmentation is deciding on what basis to fragment the data, such that each fragment has its own physical meaning. The data should be divided in such a way that each fragmented part has its own meaning & can be updated without the help of the other parts of the fragmented file. A further challenge is choosing the pattern in which the data is split so that it can be uploaded dynamically.

• Encryption: AES, a symmetric key algorithm, is used to encrypt the data; the same key is used for both encryption and decryption. The proposed scheme encrypts different fragments of the same file with different keys (a sketch of this per-fragment encryption follows the list). Assume we have m keys and a file F. Fragment the file into F1, F2, ..., Fm and encrypt each fragment with its own key. The probability that the whole file can be decrypted or made readable is then very low.

• Hashing: The proposed scheme uses a message digest as a secure one-way hash function which computes the hash value of the files during the integrity check of the data.
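The paper fixes AES as the cipher but not a particular implementation; the sketch below uses OpenSSL's EVP interface with AES-128-CBC, and the helper names encrypt_fragment and encrypt_all are hypothetical. It illustrates the one-key-per-fragment idea: compromising a single key exposes only one fragment, not the whole file.

#include <openssl/evp.h>
#include <openssl/rand.h>

/* Hypothetical helper: encrypt one fragment with its own AES-128 key.
   out must have room for frag_len + 16 bytes of CBC padding.
   Returns the ciphertext length, or -1 on error. */
static int encrypt_fragment(const unsigned char *frag, int frag_len,
                            const unsigned char key[16],
                            const unsigned char iv[16],
                            unsigned char *out) {
    EVP_CIPHER_CTX *ctx = EVP_CIPHER_CTX_new();
    int len = 0, total = 0;
    if (!ctx) return -1;
    if (EVP_EncryptInit_ex(ctx, EVP_aes_128_cbc(), NULL, key, iv) != 1 ||
        EVP_EncryptUpdate(ctx, out, &len, frag, frag_len) != 1) {
        EVP_CIPHER_CTX_free(ctx);
        return -1;
    }
    total = len;
    if (EVP_EncryptFinal_ex(ctx, out + total, &len) != 1) {
        EVP_CIPHER_CTX_free(ctx);
        return -1;
    }
    total += len;
    EVP_CIPHER_CTX_free(ctx);
    return total;
}

/* Each fragment F_1 ... F_m gets a fresh random key and IV. */
static int encrypt_all(unsigned char **frags, int *lens, int m,
                       unsigned char keys[][16], unsigned char ivs[][16],
                       unsigned char **outs) {
    for (int i = 0; i < m; i++) {
        if (RAND_bytes(keys[i], 16) != 1 || RAND_bytes(ivs[i], 16) != 1)
            return -1;
        if (encrypt_fragment(frags[i], lens[i], keys[i], ivs[i], outs[i]) < 0)
            return -1;
    }
    return 0;
}

int main(void) {
    unsigned char f1[] = "fragment one", f2[] = "fragment two";
    unsigned char *frags[2] = { f1, f2 };
    int lens[2] = { sizeof f1 - 1, sizeof f2 - 1 };
    unsigned char keys[2][16], ivs[2][16];
    unsigned char out1[64], out2[64];
    unsigned char *outs[2] = { out1, out2 };
    return encrypt_all(frags, lens, 2, keys, ivs, outs) == 0 ? 0 : 1;
}

The program links against OpenSSL (-lcrypto).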
V. PROPOSED SCHEME

Let us consider three groups G1, G2, GT of some prime order p, and a bilinear map e that takes one element from G1 and one element from G2 and outputs exactly one element of GT. Bilinear pairing is used to carry out the integrity test, for which a system parameter g is chosen as a random element from G2. If P wishes to sign some message, she generates her own public and private keys. Her private key is a random element x of Zr, and her public key is g^x. To sign a message, P hashes the message to an element h of G1, and then calculates the signature σ = h^x. If Q wishes to verify her signature, Q checks whether e(h, g^x) = e(σ, g).

The above algorithm is implemented in C using the PBC library, and the scheme is a relative improvement over BLS. The code used to check integrity in the cloud has the following phases (a condensed sketch is given after the list):

• Initialization phase: The variables to be used are defined in this phase.

• Binding phase: Each defined variable is allocated to a group through the pairing parameter.

• Generator creation: The generators of both groups are generated.

• Import the data: The data whose hash is to be calculated is imported.

• Hash creation: The hash of the message is created using a function provided by the PBC library.

• Private key generation: The private key is generated using the random function provided by the PBC library.

• Public key: The public key is generated using the generator of the group and the private key.

• Tag/Signature: The tag of the message is created using the hash of the message and the secret key.

• Mapping the elements: The elements from group G1 and group G2 are mapped to an element of group GT using the function e.

• Comparison: Finally, the comparison happens to check the integrity.
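The phases listed above map almost one-to-one onto PBC calls. The following condensed sketch walks through them for one message (the BLS-style sign-and-verify described at the start of this section); the parameter generation call, the sizes and the sample message are illustrative assumptions.

#include <pbc/pbc.h>
#include <string.h>
#include <stdio.h>

int main(void) {
    pbc_param_t param;
    pairing_t pairing;
    element_t g, pub, x, h, sig, lhs, rhs;
    const char *msg = "block data to be signed";   /* illustrative message */

    /* Initialization & binding: set up the pairing and bind each variable
       to its group (type A parameters; sizes are illustrative). */
    pbc_param_init_a_gen(param, 160, 512);
    pairing_init_pbc_param(pairing, param);
    element_init_G2(g, pairing);   element_init_G2(pub, pairing);
    element_init_Zr(x, pairing);
    element_init_G1(h, pairing);   element_init_G1(sig, pairing);
    element_init_GT(lhs, pairing); element_init_GT(rhs, pairing);

    /* Generator creation and key generation: g random in G2,
       x random in Zr, public key g^x. */
    element_random(g);
    element_random(x);
    element_pow_zn(pub, g, x);               /* public key = g^x */

    /* Hash creation: map the imported data to an element h of G1. */
    element_from_hash(h, (void *)msg, strlen(msg));

    /* Tag/Signature: sigma = h^x. */
    element_pow_zn(sig, h, x);

    /* Mapping and comparison: accept iff e(h, g^x) == e(sigma, g). */
    pairing_apply(lhs, h, pub, pairing);
    pairing_apply(rhs, sig, g, pairing);
    printf("integrity %s\n", element_cmp(lhs, rhs) == 0 ? "intact" : "violated");

    element_clear(g); element_clear(pub); element_clear(x); element_clear(h);
    element_clear(sig); element_clear(lhs); element_clear(rhs);
    pairing_clear(pairing); pbc_param_clear(param);
    return 0;
}

The final comparison reports "intact" exactly when e(h, g^x) = e(σ, g), i.e., when the tag was produced with the private key matching the public key.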
A file is taken from the user as input and is divided into either logical or physical parts. The keys and tags generated for each file part are stored, and the files are encrypted. The encrypted files are then sent to the cloud server along with their generated public keys. Random masking is commonly used to provide data privacy from the third party auditor during the auditing process. The proposed scheme instead uses a cryptographic method based on bilinear pairing for providing data privacy, in place of random masking.
A. Random masking based checking

Fig. 1: Random masking based checking

KeyGen: This algorithm generates a signing key pair (spk, ssk) and random elements x ← Zp, u ← G1, and computes v ← g^x. Thus, this algorithm returns the secret key sk = (x, ssk) and the public values pk = (v, u, g, e(u, v), spk).

Challenge: This takes the above parameters as input. First it chooses a random value and computes u_j = g1^{x_j} ∈ G1 for all j ∈ [1, s]. For each block it computes a tag t_i as:

t_i = (h(W_i) · ∏_{j=1}^{s} u_j^{m_{i,j}})^x    (1)

where W_i = FID||i denotes the concatenation of the file id and the block number.

Proof: From the challenge, the server receives a few parameters as input, which are used to generate the response. First the signatures of the challenged blocks are aggregated:

σ = ∏_{i∈I} σ_i^{v_i}    (2)

For the data proof, it generates the linear combination of the challenged blocks and a masked value R:

µ' = Σ_{i∈I} v_i · m_i    (3)

R = e(u, v)^r,   γ = h(R)    (4)

Finally the proof is generated:

µ = r + γµ' mod p    (5)

Verify: The verification takes the challenge, the proof, the secret hash key & public tag key and the abstract information of the data, computes the identifier hash values of all the challenged data blocks, and checks the response sent by the server using the equation:

R · e(σ^γ, g) = e((∏_{i∈I} h(W_i)^{v_i})^γ · u^µ, v)    (6)

If the equation holds, the output is 1, signifying that the data is intact. Otherwise the output is 0, denoting that the data has been altered.

1) Proof of algorithm: The correctness of the algorithm of the Random Masking Integrity Checking Protocol is as follows:
Fig. 2: Proof of random masking based checking
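In outline, the correctness argument depicted in Fig. 2 follows from equations (1)-(6), writing the tag t_i as σ_i and taking a single sector per block for brevity (that simplification is ours). Using σ_i = (h(W_i) · u^{m_i})^x and v = g^x:

e(σ^γ, g) = e((∏_{i∈I} σ_i^{v_i})^γ, g)
          = e((∏_{i∈I} (h(W_i) · u^{m_i})^{v_i})^γ, g^x)
          = e((∏_{i∈I} h(W_i)^{v_i})^γ · u^{γ·Σ_{i∈I} v_i·m_i}, v)
          = e((∏_{i∈I} h(W_i)^{v_i})^γ · u^{γµ'}, v)

Multiplying both sides by R = e(u, v)^r and using µ = r + γµ' mod p:

R · e(σ^γ, g) = e((∏_{i∈I} h(W_i)^{v_i})^γ · u^{r+γµ'}, v)
              = e((∏_{i∈I} h(W_i)^{v_i})^γ · u^µ, v)

which is exactly equation (6).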
B. Proposed algorithm

A Third Party Auditing (TPA) protocol is designed which checks the integrity based on a ring signature without random oracles. This algorithm is similar to other auditing algorithms and is divided into four parts:

KeyGen: Choose random α, β ← Zn and calculate g1 ← g^α and g2 ← g^β. The public key pk is (g1, g2) ∈ G2 and the private key sk is (α, β).

Signature: This algorithm takes g1, g2 and g as input and computes the signature of each data block. A random value r ∈ G is generated, along with a set u_1, u_2, ..., u_n ∈ G, one element per data block. Two signatures S1 and S2 are computed; in the form consistent with the correctness proof below, these are

S1 = g^{α·β} · M^r,   S2 = g^r,   where M = ∏_{i=0}^{n} u_i^{m_i}

When a challenge is sent to the server, the signatures of the respective challenged blocks are multiplied and a challenge tag is made, which is used for proof generation.

Proof: This algorithm takes as input the stored data and the received challenge C, calculates a proof, and sends it to the auditor for checking.

Verification: The auditor checks against the metadata present at the auditor's end; in the form implied by the correctness proof, with ran the auditor's random masking value, the check is

e(S1^{ran}, g) · e(S2^{-1}, M)^{ran} = e(g1, g2)^{ran}

If the above equation is satisfied, the data is intact.
1) Proof of algorithm: We have

e(S1^{ran}, g) · e(S2^{-1}, ∏_{i=0}^{n} u_i^{m_i})^{ran}

Let M = ∏_{i=0}^{n} u_i^{m_i}. Then the left-hand side becomes

= e(S1^{ran}, g) · e(S2^{-1}, M)^{ran}
= e(S1^{ran}, g) · e(S2^{-1}, M^{ran})
= e(g^{α·β} · M^r, g)^{ran} · e(g^{-r}, M^{ran})
= e(g^{α·β} · M^r, g^{ran}) · e(g^{ran}, M^{-r})
= e(g^{α·β} · M^r · M^{-r}, g^{ran})
= e(g^α, g^β)^{ran}

which equals e(g1, g2)^{ran}, the right-hand side of the verification equation.
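The masked verification can also be exercised numerically. In the sketch below, the signature forms S1 = g^{α·β} · M^r and S2 = g^r are taken from the correctness proof above rather than from printed equations, so it holds only under that reading; a symmetric (type A) pairing is assumed, so G1 = G2 and all group elements can live in G1, and M is drawn at random as a stand-in for ∏ u_i^{m_i}.

#include <pbc/pbc.h>
#include <stdio.h>

int main(void) {
    pbc_param_t param;
    pairing_t pairing;
    element_t g, g1, g2, M, S1, S2, S2inv, Mr;
    element_t alpha, beta, ab, r, ran;
    element_t t1, t2, lhs, rhs;

    /* Type A pairings are symmetric; parameter sizes are illustrative. */
    pbc_param_init_a_gen(param, 160, 512);
    pairing_init_pbc_param(pairing, param);

    element_init_G1(g, pairing);     element_init_G1(g1, pairing);
    element_init_G1(g2, pairing);    element_init_G1(M, pairing);
    element_init_G1(S1, pairing);    element_init_G1(S2, pairing);
    element_init_G1(S2inv, pairing); element_init_G1(Mr, pairing);
    element_init_Zr(alpha, pairing); element_init_Zr(beta, pairing);
    element_init_Zr(ab, pairing);    element_init_Zr(r, pairing);
    element_init_Zr(ran, pairing);
    element_init_GT(t1, pairing);    element_init_GT(t2, pairing);
    element_init_GT(lhs, pairing);   element_init_GT(rhs, pairing);

    element_random(g); element_random(alpha); element_random(beta);
    element_random(r); element_random(ran);
    element_random(M);                 /* stand-in for prod_i u_i^{m_i} */

    element_pow_zn(g1, g, alpha);      /* public key part g1 = g^alpha */
    element_pow_zn(g2, g, beta);       /* public key part g2 = g^beta  */

    /* Signatures as inferred from the proof: S1 = g^{alpha*beta} * M^r, S2 = g^r. */
    element_mul(ab, alpha, beta);
    element_pow_zn(S1, g, ab);
    element_pow_zn(Mr, M, r);
    element_mul(S1, S1, Mr);
    element_pow_zn(S2, g, r);

    /* Verifier: e(S1, g)^ran * e(S2^{-1}, M)^ran == e(g1, g2)^ran. */
    pairing_apply(t1, S1, g, pairing);
    element_pow_zn(t1, t1, ran);
    element_invert(S2inv, S2);
    pairing_apply(t2, S2inv, M, pairing);
    element_pow_zn(t2, t2, ran);
    element_mul(lhs, t1, t2);

    pairing_apply(rhs, g1, g2, pairing);
    element_pow_zn(rhs, rhs, ran);

    printf("data %s\n", element_cmp(lhs, rhs) == 0 ? "intact" : "altered");
    /* element_clear/pairing_clear calls omitted for brevity. */
    return 0;
}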
C. File splitting

Document Splitter is a tool which requires no installation and can be used to split documents into multiple chunks and to merge many chunks into a single file [14]. It is used to split the client's file according to a client-specified size. It is very hard to transfer one large file from one end to another through media like the web, or small storage like a floppy disk; this software overcomes that problem. The split parts of a document may carry some interim information denoting the number of the split part and the total number of parts. This idea is used to split large files into small pieces for transfer, uploading and so on. On the destination side, these parts of the document can be joined to form the original source file. Mainly there exist two types of file splitting:

• Logical splitting: For text files, logical splitting is the best way of partitioning a document. The file splitter counts the total number of words present in the document and divides the file in such a way that every part has the same number of words, resulting in parts of almost similar size.

• Physical splitting: A demultiplexer for digital media files, often called a file splitter, is software that demultiplexes the individual streams of a media file and sends them to their respective decoders for actual decoding. Media file splitters are not decoders themselves, but rather separate the program streams of a file and supply them to their individual audio, video or subtitle decoders.

Fig. 3: Block diagram of file splitting

Algorithm 1: File splitting
Input: File path, fragment size (SOF)
Output: File partition
1: If the file is to be split, go to step 2; else merge the fragments of the file and stop
2: Read sourcepath, destinationpath and SOF
3: size = size of the source file
4: If size <= SOF, print "File cannot be split" and stop
5: Write one fragment of SOF bytes; size = size - SOF
6: Print size
7: if size > SOF then
8: go to step 5
9: else
10: write the remaining size bytes as the final fragment
11: end if
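Step 5 of Algorithm 1 is the core of the splitter. A minimal C sketch of the splitting loop is given below; the fragment naming convention (part0, part1, ...) and the buffer handling are illustrative choices.

#include <stdio.h>
#include <stdlib.h>

/* Split the file at `path` into fragments of at most `sof` bytes each,
   named part0, part1, ... in the current directory (naming is illustrative).
   Unlike step 4 of Algorithm 1, a file no larger than sof simply yields
   one part here. Returns the number of fragments written, or -1 on error. */
int split_file(const char *path, long sof) {
    FILE *in = fopen(path, "rb");
    if (!in || sof <= 0) return -1;

    unsigned char *buf = malloc((size_t)sof);
    if (!buf) { fclose(in); return -1; }

    int part = 0;
    size_t n;
    while ((n = fread(buf, 1, (size_t)sof, in)) > 0) {
        char name[64];
        snprintf(name, sizeof name, "part%d", part);
        FILE *out = fopen(name, "wb");
        if (!out || fwrite(buf, 1, n, out) != n) {
            if (out) fclose(out);
            free(buf); fclose(in);
            return -1;
        }
        fclose(out);
        part++;            /* the last fragment may be smaller than sof */
    }
    free(buf);
    fclose(in);
    return part;           /* merging is the reverse: concatenate the parts */
}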
VI. PERFORMANCE EVALUATION

From the execution of both algorithms, the time taken to generate the proof and to verify it for data blocks (50, 100, 200, 400, 500), averaged over 20 trials, is calculated. The variation in the computation for a large number of files could not be determined because of the limits of the configuration of the device. An Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz with 8GB of RAM is used for the simulation.

Fig. 4: Time complexity for proof generation

Fig. 5: Time complexity for proof verification

From the graphs, a clear comparison between the random masking technique and the proposed scheme is inferred. The proposed protocol takes significantly less time for computing the proof at the server. Although its verification at the auditor takes slightly more time than the random masking technique, the difference is negligible, and the scheme is therefore considered for further optimization for better performance.

VII. CONCLUSION

The proposed auditing protocol successfully checks the integrity of stored user data without retrieving it from the cloud storage. The scheme also allows checking whether the data stored in the cloud is lost or has been tampered with. Moreover, the computation cost is very low for the users as well as for the cloud service provider in the proposed method. The protocol presented is secure against tamper, curiosity and loss attacks. The analysis of the efficiency has shown that the proposed protocol, along with the aggregation methodology, can fix the security pitfall without high cost. Also, the performance analysis shows that the scheme is more efficient than the existing random masking technique.

VIII. FUTURE WORK

This paper presents a novel data integrity auditing scheme, but more modifications can be made to it. The scheme can be further improved by adding shared encryption keys; to serve that purpose, a group key agreement protocol can be used. Moreover, some more optimizations are possible in the proposed algorithms to reduce the overall computation overhead.
REFERENCES

[1] The NIST Definition of Cloud Computing, http://csrc.nist.gov/publications/nistpubs/800-145/SP800-145.pdf

[2] Q. Hassan, "Demystifying Cloud Computing," The Journal of Defense Software Engineering (CrossTalk), 2011.

[3] "Cloud Computing Privacy Concerns on Our Doorstep," http://cacm.acm.org/magazines/2011/1/103200-cloud-computing-privacy-concerns-on-our-doorstep/fulltext

[4] W. Liu, "Research on Cloud Computing Security Problem and Strategy," 2012 2nd International Conference on Consumer Electronics, Communications and Networks (CECNet), 2012, pp. 1216-1219.

[5] C. Wang, Q. Wang, K. Ren and W. Lou, "Privacy-Preserving Public Auditing for Data Storage Security in Cloud Computing," 2010 Proceedings IEEE INFOCOM, San Diego, CA, 2010, pp. 1-9.

[6] L. Chen and H. Chen, "Ensuring Dynamic Data Integrity with Public Auditability for Cloud Storage," 2012 International Conference on Computer Science and Service System, Nanjing, 2012, pp. 711-714.

[7] J. Li, X. Tan, X. Chen and D. S. Wong, "An Efficient Proof of Retrievability with Public Auditing in Cloud Computing," 2013 5th International Conference on Intelligent Networking and Collaborative Systems, Xi'an, 2013, pp. 93-98.

[8] B. Wang, H. Li, X. Liu, F. Li and X. Li, "Efficient public verification on the integrity of multi-owner data in the cloud," Journal of Communications and Networks, vol. 16, no. 6, pp. 592-599, Dec. 2014.
[9] J. Yu, K. Ren, C. Wang and V. Varadharajan, "Enabling Cloud Storage Auditing With Key-Exposure Resistance," IEEE Transactions on Information Forensics and Security, vol. 10, no. 6, pp. 1167-1179, June 2015.

[10] S. D. Thosar and N. A. Mhetre, "Integrity checking privacy preserving approach to cloud using third party auditor," 2015 International Conference on Pervasive Computing (ICPC), Pune, 2015, pp. 1-4.

[11] Y. Li, Y. Yu, G. Min, W. Susilo, J. Ni and K.-K. R. Choo, "Fuzzy Identity-Based Data Integrity Auditing for Reliable Cloud Storage Systems," IEEE Transactions on Dependable and Secure Computing, vol. PP, no. 99, pp. 1-1.

[12] J. Yu and H. Wang, "Strong Key-Exposure Resilient Auditing for Secure Cloud Storage," IEEE Transactions on Information Forensics and Security, vol. 12, no. 8, pp. 1931-1940, Aug. 2017.

[13] J. Zhang, Y. Yang, Y. Chen and F. Chen, "A secure cloud storage system based on discrete logarithm problem," 2017 IEEE/ACM 25th International Symposium on Quality of Service (IWQoS), Vilanova i la Geltru, 2017, pp. 1-10.

[14] Document Splitter, https://gregmaxey.com/word_tip_pages/document_splitter.html