SlideShare a Scribd company logo
1 of 5
Download to read offline
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395 -0056
Volume: 04 Issue: 02 | Feb -2017 www.irjet.net p-ISSN: 2395-0072
© 2017, IRJET | Impact Factor value: 5.181 | ISO 9001:2008 Certified Journal | Page 1039
Improved deduplication with keysandchunksin HDFSstorage providers
Ramya Ramesh1, K.O Sumitha2, Dr. S Subburam3, R.K Kapila Vani 4
1, 2Student, Department of Computer Science and Engineering, Prince Shri Venkateshwara Padmavathy
Engineering College, Chennai, India
3Professor, Department, of Computer Science and Engineering, Prince Shri Venkateshwara Padmavathy
Engineering College, Chennai, India
4Assistant Professor, Department of Computer Science and Engineering, Prince Dr. K. Vasudevan College of
Engineering and Technology, Chennai, India
-------------------------------------------------------------------------------***---------------------------------------------------------------------------------
Abstract - Cloud computing is one type of computing
platform that is Internet based. One important feature of a
Cloud service is Cloud storage. The cloud users may upload
sensitive data in the Cloud and allow the Cloud server to
maintain these data. It is critical to manage large volumes of
duplicated data hence many deduplication techniques were
developed. Existing solutions of deduplication failed in
providing security and reliabilityefficiently. Weproposeanew
distributed deduplication system in HDFS storage providers
that ensures higher reliability with the help of ownership
verification and providesreliabledistributedkeymanagement
servers to maintain the keys storage. Thus the new scheme
achieves deduplication in HDFS storage easily.
Key Words: deduplication, data chunks, key management,
distributed block servers, tags.
1. INTRODUCTION
Cloud storage is a model of storage of data with a
highly virtualized infrastructure made up of many
distributed resources. In the cloud storage, the data is
remotely maintained and managed. Theuserscanstoretheir
data online and are able to access from any part of the world
via the Internet connection. The biggestadvantageof storing
data in Cloud is that the users are provided with a broader
range of access to distributed resources in a cost-efficient
manner. This is because it eliminates the capital expenses
such as buying the hardware, software and setting it up.
Hence the cloud storage provides benefits of higher
reliability, good performance, better usability, greater
accessibility and secured environment. The rise of Cloud
Computing led to the emergence of Big Data. Big Data refers
to the term of dealing with extremely large amount of data
that maybe structured or semi-structured or unstructured.
Now-a-days, massive amount of informational data are
generated due to digitalization and technological growth,
hence it becomes essential to handle such Big Data in an
efficient and economical way using Cloud.Toprocessthe Big
Data, a Hadoop Java-based framework is used that helps to
support large data processing in a parallel and distributed
environment. Hadoop runs in a master-slave setup where
the master distributes the jobs to its cluster and processes
the map and reduce tasks sequentially.
The most important requirement of Cloud storage is to
manage the large sets of data efficiently. Though the cloud
storage is efficient, in order to make it moreeffectivethereis
a need for eliminating the duplicate copies of data that are
stored in Cloud. The duplicate copies may arise in scenarios
where the data are shared among multiple users. For
example, consider an email server system, where 200
instances of data are sent to all the members of an
organization. If all the members back-up their inboxes each
of their instances are saved creating duplicate copies of the
same file which increases the storage space in Cloud. This
greatly wastes the network bandwidth, complicates
management of data and energy consumption. This
introduces the concept of deduplication in Cloud storage.
Deduplication is a compression technique where it
eliminates the duplicate copies of data by storing only one
physical copy of the data and referring other duplicate data
to that copy. The techniqueachievesminimizednetwork and
storage overhead by improvising the storage capacity.
Deduplication schemeshaveprovedtoreduce90-95percent
storage for backup and up to 68 percent in file systems.
Thereby, the overall storage needs has been reduced by 80
percent for both files and backups. Existing solutions on
deduplication suffered from brute-force attacks [1], [2] and
also it cannot ensure security, reliability, data privacy, data
access control and data revocation.
In this paper, we propose a new scheme for deduplication
with higher reliability in which the data chunks are
distributed across HDFS and a reliable key management in
secure deduplication using slavenodes. The restofthepaper
has the details as follows. Section 2 gives an overview of
related work. Section 3 discusses about the current
methodology.Section4introducesthesystemmodel.Section
5 gives detailed description of our proposed scheme and
finally conclusion is provided in the Section 6.
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395 -0056
Volume: 04 Issue: 02 | Feb -2017 www.irjet.net p-ISSN: 2395-0072
© 2017, IRJET | Impact Factor value: 5.181 | ISO 9001:2008 Certified Journal | Page 1040
2. RELATED WORK
Many traditional encryption techniques for
deduplication were introduced but they are not suitable
since the file suffers from dictionary attacks [3].The data
privacy is an important constraint in dealing with Cloud
storage. Hence to ensure the data security and privacy, a
good way is to outsource encrypted data. DeDu [4] is a
system that was introduced to solve duplication efficiently
but it was not able to handle encrypted data. In order to
resist brute-force attacks, a secure deduplicated storage
called DupLESS [5] was proposed. But the drawback of the
system is that it cannot control over the data access of other
data users in a convenient way.
To improve restore performance, History Aware Rewriting
(HAR) algorithm [6] was proposed to accurateidentifiesand
rewrite fragmented chunks. To protect the data security, an
authorized data deduplication[7]wasproposedbyincluding
differential privileges of users in the duplicate check in
hybrid cloud architecture. But, it cannot flexibly support the
data access control by data holders especially for data
revocation process. ClouDedup [8], a secure and efficient
storage service which assures block-level deduplicationand
data confidentiality was proposed. This scheme failed since
it cannot solve the issue caused by data deletion where the
data holder can access the data as it still knows the data
encryption key if the data is not completely removed from
the Cloud.
A new proposal was developed based on Provable
Ownership of File (POF) [9]. This is a cryptographically
secure and efficient scheme for a clienttoprovetotheserver
based on actual possession of the entire file instead of only
partial information about it. Thus, it helps in reducing the
burden of the client. To have a secure and constant cost
public Cloud storage [10], a new scheme introduced
supported data integrity as well as storage deduplication at
the same time. But, the drawback being that it doesn’t
discuss about the feasibility of supporting deduplication
with Big Data. For reducing the workloads due to duplicate
files, index name server (INS) [11] was proposed but the
work doesn’t concentrate on deduplication of encrypted
data.
In this paper, we perform both file-level and block-level
deduplication and use MD5 algorithmtogeneratesignatures
to perform the ownership verification. We also apply Triple
Data Encryption algorithm (3DES) using convergent key for
the data to be more secure and reliable against hackers.
Thus, we achieve deduplication with higher security and
reliability in an efficient manner in Cloud.
3. EXISTING METHODOLOGY
Most of the previous methodologies have only
considered in a single-server system. Another important
factor to be considered is that deduplication systems must
provide higher reliability especially whilehandlingdata that
are sensitive or critical. At times, it is required that these
data should be preserved over longer time periods.
Therefore the deduplication system has to be such, it
achieves higher reliability as well as higher security at the
same time compared to that of the other high-available
systems. In the existing methodology, eachuserhasa unique
master key for preserving the data privacy.
But, this approach is unreliable, since each user has to
dedicatedly protect his own master key. Accidentally, if the
master key is lost then the user can never be recovered and
hence it paves a way for attackers to involve in data leakage.
Yet another problem is that the number of keys gets
increased with increasing number of users and it becomes
difficult to manage not only for the storage of contentaswell
as for the storage of keys.
Disadvantages
.Deduplication is not scalable as enormous numbers of
keys are required with the increasing number of users.
. Cost increases to the storage of content as well as
storage of keys.
.Lack of proper key management.
. Security becomes crucial if the master key is not
protected.
.Easily prone to data leakage by attackers if the master
key is lost.
.Less reliability due to lack of key management.
4. SYSTEM MODEL
We propose a new scheme to deduplicate the
encrypted data ensuring higher reliability by ownership
verification of the file with the help ofa uniquetaggenerated
using MD5 algorithm.
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395 -0056
Volume: 04 Issue: 02 | Feb -2017 www.irjet.net p-ISSN: 2395-0072
© 2017, IRJET | Impact Factor value: 5.181 | ISO 9001:2008 Certified Journal | Page 1041
In the figure 1, a detailed overview of system architecture of
our proposed scheme is shown.
As described in the figure 1, the system consists of four
different entities namely the User, Cloud server, N-key
management servers and N-distributed block servers.
User:
User is an entity that makes use of the Cloud service for
storage, retrieval and management of data in a cost-efficient
environment. Cloud serviceallowstheuserto remotelystore
the data online and can access the data from any part of the
world via the internet connection. The users are provided
with a priority of data security, enjoy a cost-effective service
and also take the advantage of an unlimited storage space.
Cloud server:
Cloud server is an important entity built with a logical
server over the internet. Cloud server is also known as the
virtual server. It possess the capabilities similar to that of a
normal server but can be remotely accessed from anywhere
in the world via the internet access.
An Example of a Cloud server that stores Electronic health
records for doctors to access the patient’s records instantly
in a Military based health system.
N-Key management servers:
Distributed key management servers are helpful in
managing the keys storage in Cloud. Since the workload of
storing keys in Cloud storage space adds overhead, the
concept of distributedkeymanagementserverisintroduced.
It provides improved scalability of key management with
increasing number of keys and helps to protect sensitive
data against the attackers.
N-Distributed block servers:
These servers are used to store the files as blocks in a
distributed environment and theblocksarerequestedbythe
Cloud server in behalf of the Cloud user after successful
ownership verification.
These are the componentsthataremajorlyinvolvedin
our system. The Cloud user will initially registertotheCloud
server and logins to their account to perform storage,
retrieval or deletion of the data. When the data is uploaded
onto the storage, each file and its block will beprovidedwith
a unique tag generated by MD5 algorithm. The concept of
ownership verification is performed to identify a valid user.
The Cloud server generates convergent keys based on the
hash and stores it in the key management server.Thesekeys
are later used to encrypt and decrypt the data content. The
Cloud server in behalf of the Cloud user requests the blocks
to the distributed block servers and provides it to the user.
Thus providing an efficient deduplication system that
improves scalability and reliability.
5. PROPOSED METHODOLOGY
The proposed methodology aims at providing a new
deduplication system with higher reliability by splitting the
file into several blocks and performing both file-level and
block-level duplication checking.
Our scheme ensures the security by means of an ownership
verification. The ownership verification is done bytheCloud
server to identify the valid user when uploading or
downloading the file from preventing against attackers. A
unique tag is assigned for each file and each block that is
generated using an MD5 algorithm.
Using the hash value, convergent key is created that
performs encryption and decryption of the data content
using Triple Data Encryption algorithm (3DES). Triple DES
ensures high security and consistency to the data stored.
Convergent encryption suffers from offline brute-force
attacks, hence we generate the convergent key alone for our
scheme and use it in anothersecure encryption3DESinstead
of using convergent encryption for the entire data. An
efficient distributed key management server helps in
managing the convergent keys stored.
A csv file that contains the metadata on the file such as hash
value is also maintained to check the duplication. One of the
main key features of using this new deduplicated system is
that the data deletion will only delete the reference to that
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395 -0056
Volume: 04 Issue: 02 | Feb -2017 www.irjet.net p-ISSN: 2395-0072
© 2017, IRJET | Impact Factor value: 5.181 | ISO 9001:2008 Certified Journal | Page 1042
copy but not the entire content and can be accessed by other
users.
File Upload:
Suppose that a user wants to upload its file to the Cloud
storage. The user initially has to make a registration to the
Cloud and login with their account then he/she chooses an
option to upload the file to the Cloud. Once after the file is
uploaded, a unique signature tag is generated using MD5
algorithm. The produced hash tag is unique to only unique
data contents.
Suppose that another user wants to upload thesamefileinto
the Cloud storage. Initially registration is done and then
he/she chooses an option to upload the same file. Once after
the upload process is completed, a unique signature is
generated that matches with the previous signature
indicating it’s a duplicate copy. Now, the Cloud server stores
only one physical copy of the same file after performing the
ownership verification. Thus, file-level deduplication is
achieved.
If the ownership verification becomes pass, the server
provides the reference of the file to the requested user. If it
fails, the upload operation is aborted.
For block-level deduplication, the file uploaded is split into
several blocks and stored in the distributed block servers.
Each block is associated with a unique tag generated by the
algorithm. If the same block is being uploaded, the server
checks the ownership of the user to guarantee the validityof
the owner and provides access rights to the users for the
same block. The server fails to upload if the verificationfails.
Thus, block-level deduplication is also achieved.
File Download: Consider the user who wants to download
the data from the Cloud. The user requests for the download
option to the server. Upon ownership verification, the Cloud
provides the user to download its data contents. If the user
requests for a block, the ownership is verifiedandblocksare
provided from the distributed block servers to the Cloud
servers which in turn serves the users with the requested
block. The user upon receiving the cipher text decrypts the
data with its private key to get the original data.
If the ownership verification fails, the downloadoperationis
aborted.
Data deletion: If the user requests for the data to bedeleted
from the Cloud storage, the server only deletesthereference
of that user to the data but the entire file is not deleted for
providing accesses for other users.
Algorithm 1: checking duplicate files:
Figure 2: MD5 algorithm
--The input file uploaded is processed bytheserver.
--Generates a unique tag for the file using MD5.
--Server maintains a csv metadata file with stored
tags.
--It becomes a Duplicate data, if the tag matches.
-- Not a Duplicate data, otherwise.
Algorithm 2: performing encryption:
--Generating convergent keys using the stored
hashes of file.
--Perform 3DES encryption using convergent keys.
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395 -0056
Volume: 04 Issue: 02 | Feb -2017 www.irjet.net p-ISSN: 2395-0072
© 2017, IRJET | Impact Factor value: 5.181 | ISO 9001:2008 Certified Journal | Page 1043
Figure 3: Triple data encryption algorithm
6. CONCLUSION
We propose a new scheme that efficiently supports
Big Data in HDFS storage providers. An effective approachis
implemented to verify data ownership through verification
process. We integrate the scheme with data access control
and deduplication in a simple way over a secure challenging
environment.
The privacy of the data is also preserved by making use of
strongly effective algorithms. The data chunks are
distributed across the HDFS storage ensuring higher
reliability. A reliable distributed key management is
accomplished where the overhead of storing the keys in the
Cloud storage is lessened.
Thus a new distributed deduplication system is achieved
with higher reliability and security of data in HDFS storage.
6. REFERENCES
[1] Z. O. Wilcox, “Convergent encryption reconsidered,”
2011. [Online]. Available:
http://www.mailarchive.com/cryptography@metzdowd.co
m/msg08949.html
[2] The Freenet Project, Freenet. (2016). [Online].Available:
https:// freenetproject.org/
[3] Atul Adya, William J Bolosky, Miguel Castro, Gerald
Cermak, Ronnie Chaiken, John R Douceur, Jon Howell, Jacob
R Lorch, Marvin Theimer, and Roger P Wattenhofer. Farsite:
Federated, available,andreliablestorageforanincompletely
trusted environment. ACM SIGOPS Operating Systems
Review, 36(SI):1–14, 2002.
[4] Z. Sun, J. Shen, and J. M. Yong, “DeDu: Building a
deduplication storage system over cloud computing,” in
Proc. IEEE Int. Conf. Comput. Supported Cooperative Work
Des.,2011,pp.348–355,doi:10.1109/CSCWD.2011.5960097
[5] M. Bellare, S. Keelveedhi, and T. Ristenpart, “DupLESS:
Server aided encryption for deduplicated storage,” in Proc.
22nd USENIX Conf. Secur., 2013, pp. 179–194.
[6]M.Fuetal,
“Acceleratingrestoreandgarbagecollectionindeduplication-
based backup systems via exploiting historical information,”
inProc. USENIX Annu. Tech. Conf., 2014, pp.181–192.
[7] J. Li, Y. K. Li, X. F. Chen, P. P. C. Lee, and W. J. Lou, “A
hybrid cloud approach for secure authorizeddeduplication,”
IEEE Trans. Parallel Distrib. Syst., vol. 26, no. 5, pp. 1206–
1216, May 2015, doi:10.1109/TPDS.2014.2318320.
[8] P. Puzio, R. Molva, M. Onen, and S. Loureiro, “ClouDedup:
Secure deduplication with encrypteddata forcloudstorage,”
in Proc. IEEE Int. Cof. Cloud Comput. Technol. Sci., 2013, pp.
363–370, doi:10.1109/CloudCom.2013.54.
[9] C. Yang, J. Ren, and J. F. Ma, “Provable ownership of file in
deduplication cloud storage,” in Proc. IEEE Global Commun.
Conf., 2013, pp. 695–700,
doi:10.1109/GLOCOM.2013.6831153.
[10] J.W.Yuanand S.C.Yu, “Secure and constant cost public
cloud storage auditing with deduplication,” in Proc. IEEE
Int.Conf. Communic. Netw.Secur.,2013,pp.145–
153,doi:10.1109/CNS.2013.6682702.
[11] T. Y. Wu, J. S. Pan, and C. F. Lin, “Improving accessing
efficiency of cloud storageusingde-duplicationandfeedback
schemes,” IEEE Syst. J., vol. 8, no. 1, pp. 208–218, Mar. 2014,
doi:10.1109/ JSYST.2013.2256715.

More Related Content

What's hot

A Hybrid Cloud Approach for Secure Authorized De-Duplication
A Hybrid Cloud Approach for Secure Authorized De-DuplicationA Hybrid Cloud Approach for Secure Authorized De-Duplication
A Hybrid Cloud Approach for Secure Authorized De-DuplicationEditor IJMTER
 
Secure distributed deduplication systems with improved reliability
Secure distributed deduplication systems with improved reliabilitySecure distributed deduplication systems with improved reliability
Secure distributed deduplication systems with improved reliabilityPvrtechnologies Nellore
 
Review on Key Based Encryption Scheme for Secure Data Sharing on Cloud
Review on Key Based Encryption Scheme for Secure Data Sharing on CloudReview on Key Based Encryption Scheme for Secure Data Sharing on Cloud
Review on Key Based Encryption Scheme for Secure Data Sharing on CloudIRJET Journal
 
Secure distributed deduplication systems with improved reliability 2
Secure distributed deduplication systems with improved reliability 2Secure distributed deduplication systems with improved reliability 2
Secure distributed deduplication systems with improved reliability 2Rishikesh Pathak
 
A hybrid cloud approach for secure authorized
A hybrid cloud approach for secure authorizedA hybrid cloud approach for secure authorized
A hybrid cloud approach for secure authorizedNinad Samel
 
JAVA 2013 IEEE PARALLELDISTRIBUTION PROJECT A privacy leakage upper bound con...
JAVA 2013 IEEE PARALLELDISTRIBUTION PROJECT A privacy leakage upper bound con...JAVA 2013 IEEE PARALLELDISTRIBUTION PROJECT A privacy leakage upper bound con...
JAVA 2013 IEEE PARALLELDISTRIBUTION PROJECT A privacy leakage upper bound con...IEEEGLOBALSOFTTECHNOLOGIES
 
IRJET- Privacy Preserving Cloud Storage based on a Three Layer Security M...
IRJET-  	  Privacy Preserving Cloud Storage based on a Three Layer Security M...IRJET-  	  Privacy Preserving Cloud Storage based on a Three Layer Security M...
IRJET- Privacy Preserving Cloud Storage based on a Three Layer Security M...IRJET Journal
 
Bio-Cryptography Based Secured Data Replication Management in Cloud Storage
Bio-Cryptography Based Secured Data Replication Management in Cloud StorageBio-Cryptography Based Secured Data Replication Management in Cloud Storage
Bio-Cryptography Based Secured Data Replication Management in Cloud StorageIJERA Editor
 
Survey on Data Security with Time Constraint in Clouds
Survey on Data Security with Time Constraint in CloudsSurvey on Data Security with Time Constraint in Clouds
Survey on Data Security with Time Constraint in CloudsIRJET Journal
 
Ijarcet vol-2-issue-3-951-956
Ijarcet vol-2-issue-3-951-956Ijarcet vol-2-issue-3-951-956
Ijarcet vol-2-issue-3-951-956Editor IJARCET
 
A Privacy Preserving Three-Layer Cloud Storage Scheme Based On Computational ...
A Privacy Preserving Three-Layer Cloud Storage Scheme Based On Computational ...A Privacy Preserving Three-Layer Cloud Storage Scheme Based On Computational ...
A Privacy Preserving Three-Layer Cloud Storage Scheme Based On Computational ...IJSRED
 
Improving availability and reducing redundancy using deduplication of cloud s...
Improving availability and reducing redundancy using deduplication of cloud s...Improving availability and reducing redundancy using deduplication of cloud s...
Improving availability and reducing redundancy using deduplication of cloud s...dhanarajp
 
International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)ijceronline
 
Role Based Access Control Model (RBACM) With Efficient Genetic Algorithm (GA)...
Role Based Access Control Model (RBACM) With Efficient Genetic Algorithm (GA)...Role Based Access Control Model (RBACM) With Efficient Genetic Algorithm (GA)...
Role Based Access Control Model (RBACM) With Efficient Genetic Algorithm (GA)...dbpublications
 
Encryption based multi user manner secured data sharing and storing in cloud
Encryption based multi user manner secured data sharing and storing in cloudEncryption based multi user manner secured data sharing and storing in cloud
Encryption based multi user manner secured data sharing and storing in cloudprjpublications
 

What's hot (18)

A Hybrid Cloud Approach for Secure Authorized De-Duplication
A Hybrid Cloud Approach for Secure Authorized De-DuplicationA Hybrid Cloud Approach for Secure Authorized De-Duplication
A Hybrid Cloud Approach for Secure Authorized De-Duplication
 
Secure distributed deduplication systems with improved reliability
Secure distributed deduplication systems with improved reliabilitySecure distributed deduplication systems with improved reliability
Secure distributed deduplication systems with improved reliability
 
Review on Key Based Encryption Scheme for Secure Data Sharing on Cloud
Review on Key Based Encryption Scheme for Secure Data Sharing on CloudReview on Key Based Encryption Scheme for Secure Data Sharing on Cloud
Review on Key Based Encryption Scheme for Secure Data Sharing on Cloud
 
[IJET-V2I1P12] Authors:Nikesh Pansare, Akash Somkuwar , Adil Shaikh and Satya...
[IJET-V2I1P12] Authors:Nikesh Pansare, Akash Somkuwar , Adil Shaikh and Satya...[IJET-V2I1P12] Authors:Nikesh Pansare, Akash Somkuwar , Adil Shaikh and Satya...
[IJET-V2I1P12] Authors:Nikesh Pansare, Akash Somkuwar , Adil Shaikh and Satya...
 
Secure distributed deduplication systems with improved reliability 2
Secure distributed deduplication systems with improved reliability 2Secure distributed deduplication systems with improved reliability 2
Secure distributed deduplication systems with improved reliability 2
 
A hybrid cloud approach for secure authorized
A hybrid cloud approach for secure authorizedA hybrid cloud approach for secure authorized
A hybrid cloud approach for secure authorized
 
JAVA 2013 IEEE PARALLELDISTRIBUTION PROJECT A privacy leakage upper bound con...
JAVA 2013 IEEE PARALLELDISTRIBUTION PROJECT A privacy leakage upper bound con...JAVA 2013 IEEE PARALLELDISTRIBUTION PROJECT A privacy leakage upper bound con...
JAVA 2013 IEEE PARALLELDISTRIBUTION PROJECT A privacy leakage upper bound con...
 
IRJET- Privacy Preserving Cloud Storage based on a Three Layer Security M...
IRJET-  	  Privacy Preserving Cloud Storage based on a Three Layer Security M...IRJET-  	  Privacy Preserving Cloud Storage based on a Three Layer Security M...
IRJET- Privacy Preserving Cloud Storage based on a Three Layer Security M...
 
Bio-Cryptography Based Secured Data Replication Management in Cloud Storage
Bio-Cryptography Based Secured Data Replication Management in Cloud StorageBio-Cryptography Based Secured Data Replication Management in Cloud Storage
Bio-Cryptography Based Secured Data Replication Management in Cloud Storage
 
Survey on Data Security with Time Constraint in Clouds
Survey on Data Security with Time Constraint in CloudsSurvey on Data Security with Time Constraint in Clouds
Survey on Data Security with Time Constraint in Clouds
 
An4201262267
An4201262267An4201262267
An4201262267
 
Ijarcet vol-2-issue-3-951-956
Ijarcet vol-2-issue-3-951-956Ijarcet vol-2-issue-3-951-956
Ijarcet vol-2-issue-3-951-956
 
A Privacy Preserving Three-Layer Cloud Storage Scheme Based On Computational ...
A Privacy Preserving Three-Layer Cloud Storage Scheme Based On Computational ...A Privacy Preserving Three-Layer Cloud Storage Scheme Based On Computational ...
A Privacy Preserving Three-Layer Cloud Storage Scheme Based On Computational ...
 
Improving availability and reducing redundancy using deduplication of cloud s...
Improving availability and reducing redundancy using deduplication of cloud s...Improving availability and reducing redundancy using deduplication of cloud s...
Improving availability and reducing redundancy using deduplication of cloud s...
 
C017421624
C017421624C017421624
C017421624
 
International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)
 
Role Based Access Control Model (RBACM) With Efficient Genetic Algorithm (GA)...
Role Based Access Control Model (RBACM) With Efficient Genetic Algorithm (GA)...Role Based Access Control Model (RBACM) With Efficient Genetic Algorithm (GA)...
Role Based Access Control Model (RBACM) With Efficient Genetic Algorithm (GA)...
 
Encryption based multi user manner secured data sharing and storing in cloud
Encryption based multi user manner secured data sharing and storing in cloudEncryption based multi user manner secured data sharing and storing in cloud
Encryption based multi user manner secured data sharing and storing in cloud
 

Similar to Improved deduplication with keys and chunks in HDFS storage providers

Dynamic Resource Allocation and Data Security for Cloud
Dynamic Resource Allocation and Data Security for CloudDynamic Resource Allocation and Data Security for Cloud
Dynamic Resource Allocation and Data Security for CloudAM Publications
 
An Approach towards Shuffling of Data to Avoid Tampering in Cloud
An Approach towards Shuffling of Data to Avoid Tampering in CloudAn Approach towards Shuffling of Data to Avoid Tampering in Cloud
An Approach towards Shuffling of Data to Avoid Tampering in CloudIRJET Journal
 
Guaranteed Availability of Cloud Data with Efficient Cost
Guaranteed Availability of Cloud Data with Efficient CostGuaranteed Availability of Cloud Data with Efficient Cost
Guaranteed Availability of Cloud Data with Efficient CostIRJET Journal
 
IRJET - Multi Authority based Integrity Auditing and Proof of Storage wit...
IRJET -  	  Multi Authority based Integrity Auditing and Proof of Storage wit...IRJET -  	  Multi Authority based Integrity Auditing and Proof of Storage wit...
IRJET - Multi Authority based Integrity Auditing and Proof of Storage wit...IRJET Journal
 
Efficient and Empiric Keyword Search Using Cloud
Efficient and Empiric Keyword Search Using CloudEfficient and Empiric Keyword Search Using Cloud
Efficient and Empiric Keyword Search Using CloudIRJET Journal
 
Distributed Scheme to Authenticate Data Storage Security in Cloud Computing
Distributed Scheme to Authenticate Data Storage Security in Cloud ComputingDistributed Scheme to Authenticate Data Storage Security in Cloud Computing
Distributed Scheme to Authenticate Data Storage Security in Cloud ComputingAIRCC Publishing Corporation
 
DISTRIBUTED SCHEME TO AUTHENTICATE DATA STORAGE SECURITY IN CLOUD COMPUTING
DISTRIBUTED SCHEME TO AUTHENTICATE DATA STORAGE SECURITY IN CLOUD COMPUTINGDISTRIBUTED SCHEME TO AUTHENTICATE DATA STORAGE SECURITY IN CLOUD COMPUTING
DISTRIBUTED SCHEME TO AUTHENTICATE DATA STORAGE SECURITY IN CLOUD COMPUTINGAIRCC Publishing Corporation
 
Iaetsd secured and efficient data scheduling of intermediate data sets
Iaetsd secured and efficient data scheduling of intermediate data setsIaetsd secured and efficient data scheduling of intermediate data sets
Iaetsd secured and efficient data scheduling of intermediate data setsIaetsd Iaetsd
 
Secure distributed deduplication systems with improved reliability
Secure distributed deduplication systems with improved reliabilitySecure distributed deduplication systems with improved reliability
Secure distributed deduplication systems with improved reliabilityPvrtechnologies Nellore
 
Revocation based De-duplication Systems for Improving Reliability in Cloud St...
Revocation based De-duplication Systems for Improving Reliability in Cloud St...Revocation based De-duplication Systems for Improving Reliability in Cloud St...
Revocation based De-duplication Systems for Improving Reliability in Cloud St...IRJET Journal
 
A Secure Multi-Owner Data Sharing Scheme for Dynamic Group in Public Cloud.
A Secure Multi-Owner Data Sharing Scheme for Dynamic Group in Public Cloud. A Secure Multi-Owner Data Sharing Scheme for Dynamic Group in Public Cloud.
A Secure Multi-Owner Data Sharing Scheme for Dynamic Group in Public Cloud. IJCERT JOURNAL
 
IRJET - Efficient and Verifiable Queries over Encrypted Data in Cloud
 IRJET - Efficient and Verifiable Queries over Encrypted Data in Cloud IRJET - Efficient and Verifiable Queries over Encrypted Data in Cloud
IRJET - Efficient and Verifiable Queries over Encrypted Data in CloudIRJET Journal
 
IRJET- Secure Data Deduplication and Auditing for Cloud Data Storage
IRJET-  	  Secure Data Deduplication and Auditing for Cloud Data StorageIRJET-  	  Secure Data Deduplication and Auditing for Cloud Data Storage
IRJET- Secure Data Deduplication and Auditing for Cloud Data StorageIRJET Journal
 
Privacy Preserving Data Analytics using Cryptographic Technique for Large Dat...
Privacy Preserving Data Analytics using Cryptographic Technique for Large Dat...Privacy Preserving Data Analytics using Cryptographic Technique for Large Dat...
Privacy Preserving Data Analytics using Cryptographic Technique for Large Dat...IRJET Journal
 
Deduplication on Encrypted Big Data in HDFS
Deduplication on Encrypted Big Data in HDFSDeduplication on Encrypted Big Data in HDFS
Deduplication on Encrypted Big Data in HDFSIRJET Journal
 
IRJET- Secured Hadoop Environment
IRJET- Secured Hadoop EnvironmentIRJET- Secured Hadoop Environment
IRJET- Secured Hadoop EnvironmentIRJET Journal
 
Improving Data Storage Security in Cloud using Hadoop
Improving Data Storage Security in Cloud using HadoopImproving Data Storage Security in Cloud using Hadoop
Improving Data Storage Security in Cloud using HadoopIJERA Editor
 

Similar to Improved deduplication with keys and chunks in HDFS storage providers (20)

Dynamic Resource Allocation and Data Security for Cloud
Dynamic Resource Allocation and Data Security for CloudDynamic Resource Allocation and Data Security for Cloud
Dynamic Resource Allocation and Data Security for Cloud
 
An Approach towards Shuffling of Data to Avoid Tampering in Cloud
An Approach towards Shuffling of Data to Avoid Tampering in CloudAn Approach towards Shuffling of Data to Avoid Tampering in Cloud
An Approach towards Shuffling of Data to Avoid Tampering in Cloud
 
Guaranteed Availability of Cloud Data with Efficient Cost
Guaranteed Availability of Cloud Data with Efficient CostGuaranteed Availability of Cloud Data with Efficient Cost
Guaranteed Availability of Cloud Data with Efficient Cost
 
50120140507005 2
50120140507005 250120140507005 2
50120140507005 2
 
50120140507005
5012014050700550120140507005
50120140507005
 
IRJET - Multi Authority based Integrity Auditing and Proof of Storage wit...
IRJET -  	  Multi Authority based Integrity Auditing and Proof of Storage wit...IRJET -  	  Multi Authority based Integrity Auditing and Proof of Storage wit...
IRJET - Multi Authority based Integrity Auditing and Proof of Storage wit...
 
Efficient and Empiric Keyword Search Using Cloud
Efficient and Empiric Keyword Search Using CloudEfficient and Empiric Keyword Search Using Cloud
Efficient and Empiric Keyword Search Using Cloud
 
Distributed Scheme to Authenticate Data Storage Security in Cloud Computing
Distributed Scheme to Authenticate Data Storage Security in Cloud ComputingDistributed Scheme to Authenticate Data Storage Security in Cloud Computing
Distributed Scheme to Authenticate Data Storage Security in Cloud Computing
 
DISTRIBUTED SCHEME TO AUTHENTICATE DATA STORAGE SECURITY IN CLOUD COMPUTING
DISTRIBUTED SCHEME TO AUTHENTICATE DATA STORAGE SECURITY IN CLOUD COMPUTINGDISTRIBUTED SCHEME TO AUTHENTICATE DATA STORAGE SECURITY IN CLOUD COMPUTING
DISTRIBUTED SCHEME TO AUTHENTICATE DATA STORAGE SECURITY IN CLOUD COMPUTING
 
Iaetsd secured and efficient data scheduling of intermediate data sets
Iaetsd secured and efficient data scheduling of intermediate data setsIaetsd secured and efficient data scheduling of intermediate data sets
Iaetsd secured and efficient data scheduling of intermediate data sets
 
Secure distributed deduplication systems with improved reliability
Secure distributed deduplication systems with improved reliabilitySecure distributed deduplication systems with improved reliability
Secure distributed deduplication systems with improved reliability
 
Revocation based De-duplication Systems for Improving Reliability in Cloud St...
Revocation based De-duplication Systems for Improving Reliability in Cloud St...Revocation based De-duplication Systems for Improving Reliability in Cloud St...
Revocation based De-duplication Systems for Improving Reliability in Cloud St...
 
A Secure Multi-Owner Data Sharing Scheme for Dynamic Group in Public Cloud.
A Secure Multi-Owner Data Sharing Scheme for Dynamic Group in Public Cloud. A Secure Multi-Owner Data Sharing Scheme for Dynamic Group in Public Cloud.
A Secure Multi-Owner Data Sharing Scheme for Dynamic Group in Public Cloud.
 
IRJET - Efficient and Verifiable Queries over Encrypted Data in Cloud
 IRJET - Efficient and Verifiable Queries over Encrypted Data in Cloud IRJET - Efficient and Verifiable Queries over Encrypted Data in Cloud
IRJET - Efficient and Verifiable Queries over Encrypted Data in Cloud
 
IRJET- Secure Data Deduplication and Auditing for Cloud Data Storage
IRJET-  	  Secure Data Deduplication and Auditing for Cloud Data StorageIRJET-  	  Secure Data Deduplication and Auditing for Cloud Data Storage
IRJET- Secure Data Deduplication and Auditing for Cloud Data Storage
 
E045026031
E045026031E045026031
E045026031
 
Privacy Preserving Data Analytics using Cryptographic Technique for Large Dat...
Privacy Preserving Data Analytics using Cryptographic Technique for Large Dat...Privacy Preserving Data Analytics using Cryptographic Technique for Large Dat...
Privacy Preserving Data Analytics using Cryptographic Technique for Large Dat...
 
Deduplication on Encrypted Big Data in HDFS
Deduplication on Encrypted Big Data in HDFSDeduplication on Encrypted Big Data in HDFS
Deduplication on Encrypted Big Data in HDFS
 
IRJET- Secured Hadoop Environment
IRJET- Secured Hadoop EnvironmentIRJET- Secured Hadoop Environment
IRJET- Secured Hadoop Environment
 
Improving Data Storage Security in Cloud using Hadoop
Improving Data Storage Security in Cloud using HadoopImproving Data Storage Security in Cloud using Hadoop
Improving Data Storage Security in Cloud using Hadoop
 

More from IRJET Journal

TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...IRJET Journal
 
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURESTUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTUREIRJET Journal
 
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...IRJET Journal
 
Effect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsEffect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsIRJET Journal
 
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...IRJET Journal
 
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...IRJET Journal
 
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...IRJET Journal
 
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...IRJET Journal
 
A REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASA REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASIRJET Journal
 
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...IRJET Journal
 
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProP.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProIRJET Journal
 
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...IRJET Journal
 
Survey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemSurvey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemIRJET Journal
 
Review on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesReview on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesIRJET Journal
 
React based fullstack edtech web application
React based fullstack edtech web applicationReact based fullstack edtech web application
React based fullstack edtech web applicationIRJET Journal
 
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...IRJET Journal
 
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.IRJET Journal
 
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...IRJET Journal
 
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignMultistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignIRJET Journal
 
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...IRJET Journal
 

More from IRJET Journal (20)

TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
 
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURESTUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
 
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
 
Effect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsEffect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil Characteristics
 
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
 
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
 
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
 
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
 
A REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASA REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADAS
 
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
 
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProP.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
 
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
 
Survey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemSurvey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare System
 
Review on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesReview on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridges
 
React based fullstack edtech web application
React based fullstack edtech web applicationReact based fullstack edtech web application
React based fullstack edtech web application
 
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
 
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
 
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
 
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignMultistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
 
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
 

Recently uploaded

HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVRajaP95
 
Analog to Digital and Digital to Analog Converter
Analog to Digital and Digital to Analog ConverterAnalog to Digital and Digital to Analog Converter
Analog to Digital and Digital to Analog ConverterAbhinavSharma374939
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Dr.Costas Sachpazis
 
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSSIVASHANKAR N
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxAsutosh Ranjan
 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxDeepakSakkari2
 
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...RajaP95
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130Suhani Kapoor
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )Tsuyoshi Horigome
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...ZTE
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024hassan khalil
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024Mark Billinghurst
 
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).pptssuser5c9d4b1
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130Suhani Kapoor
 

Recently uploaded (20)

HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
 
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
 
Analog to Digital and Digital to Analog Converter
Analog to Digital and Digital to Analog ConverterAnalog to Digital and Digital to Analog Converter
Analog to Digital and Digital to Analog Converter
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
 
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptx
 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptx
 
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
 
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINEDJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024
 
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
 

Improved deduplication with keys and chunks in HDFS storage providers

  • 1. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395 -0056 Volume: 04 Issue: 02 | Feb -2017 www.irjet.net p-ISSN: 2395-0072 © 2017, IRJET | Impact Factor value: 5.181 | ISO 9001:2008 Certified Journal | Page 1039 Improved deduplication with keysandchunksin HDFSstorage providers Ramya Ramesh1, K.O Sumitha2, Dr. S Subburam3, R.K Kapila Vani 4 1, 2Student, Department of Computer Science and Engineering, Prince Shri Venkateshwara Padmavathy Engineering College, Chennai, India 3Professor, Department, of Computer Science and Engineering, Prince Shri Venkateshwara Padmavathy Engineering College, Chennai, India 4Assistant Professor, Department of Computer Science and Engineering, Prince Dr. K. Vasudevan College of Engineering and Technology, Chennai, India -------------------------------------------------------------------------------***--------------------------------------------------------------------------------- Abstract - Cloud computing is one type of computing platform that is Internet based. One important feature of a Cloud service is Cloud storage. The cloud users may upload sensitive data in the Cloud and allow the Cloud server to maintain these data. It is critical to manage large volumes of duplicated data hence many deduplication techniques were developed. Existing solutions of deduplication failed in providing security and reliabilityefficiently. Weproposeanew distributed deduplication system in HDFS storage providers that ensures higher reliability with the help of ownership verification and providesreliabledistributedkeymanagement servers to maintain the keys storage. Thus the new scheme achieves deduplication in HDFS storage easily. Key Words: deduplication, data chunks, key management, distributed block servers, tags. 1. INTRODUCTION Cloud storage is a model of storage of data with a highly virtualized infrastructure made up of many distributed resources. In the cloud storage, the data is remotely maintained and managed. Theuserscanstoretheir data online and are able to access from any part of the world via the Internet connection. The biggestadvantageof storing data in Cloud is that the users are provided with a broader range of access to distributed resources in a cost-efficient manner. This is because it eliminates the capital expenses such as buying the hardware, software and setting it up. Hence the cloud storage provides benefits of higher reliability, good performance, better usability, greater accessibility and secured environment. The rise of Cloud Computing led to the emergence of Big Data. Big Data refers to the term of dealing with extremely large amount of data that maybe structured or semi-structured or unstructured. Now-a-days, massive amount of informational data are generated due to digitalization and technological growth, hence it becomes essential to handle such Big Data in an efficient and economical way using Cloud.Toprocessthe Big Data, a Hadoop Java-based framework is used that helps to support large data processing in a parallel and distributed environment. Hadoop runs in a master-slave setup where the master distributes the jobs to its cluster and processes the map and reduce tasks sequentially. The most important requirement of Cloud storage is to manage the large sets of data efficiently. Though the cloud storage is efficient, in order to make it moreeffectivethereis a need for eliminating the duplicate copies of data that are stored in Cloud. The duplicate copies may arise in scenarios where the data are shared among multiple users. For example, consider an email server system, where 200 instances of data are sent to all the members of an organization. If all the members back-up their inboxes each of their instances are saved creating duplicate copies of the same file which increases the storage space in Cloud. This greatly wastes the network bandwidth, complicates management of data and energy consumption. This introduces the concept of deduplication in Cloud storage. Deduplication is a compression technique where it eliminates the duplicate copies of data by storing only one physical copy of the data and referring other duplicate data to that copy. The techniqueachievesminimizednetwork and storage overhead by improvising the storage capacity. Deduplication schemeshaveprovedtoreduce90-95percent storage for backup and up to 68 percent in file systems. Thereby, the overall storage needs has been reduced by 80 percent for both files and backups. Existing solutions on deduplication suffered from brute-force attacks [1], [2] and also it cannot ensure security, reliability, data privacy, data access control and data revocation. In this paper, we propose a new scheme for deduplication with higher reliability in which the data chunks are distributed across HDFS and a reliable key management in secure deduplication using slavenodes. The restofthepaper has the details as follows. Section 2 gives an overview of related work. Section 3 discusses about the current methodology.Section4introducesthesystemmodel.Section 5 gives detailed description of our proposed scheme and finally conclusion is provided in the Section 6.
  • 2. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395 -0056 Volume: 04 Issue: 02 | Feb -2017 www.irjet.net p-ISSN: 2395-0072 © 2017, IRJET | Impact Factor value: 5.181 | ISO 9001:2008 Certified Journal | Page 1040 2. RELATED WORK Many traditional encryption techniques for deduplication were introduced but they are not suitable since the file suffers from dictionary attacks [3].The data privacy is an important constraint in dealing with Cloud storage. Hence to ensure the data security and privacy, a good way is to outsource encrypted data. DeDu [4] is a system that was introduced to solve duplication efficiently but it was not able to handle encrypted data. In order to resist brute-force attacks, a secure deduplicated storage called DupLESS [5] was proposed. But the drawback of the system is that it cannot control over the data access of other data users in a convenient way. To improve restore performance, History Aware Rewriting (HAR) algorithm [6] was proposed to accurateidentifiesand rewrite fragmented chunks. To protect the data security, an authorized data deduplication[7]wasproposedbyincluding differential privileges of users in the duplicate check in hybrid cloud architecture. But, it cannot flexibly support the data access control by data holders especially for data revocation process. ClouDedup [8], a secure and efficient storage service which assures block-level deduplicationand data confidentiality was proposed. This scheme failed since it cannot solve the issue caused by data deletion where the data holder can access the data as it still knows the data encryption key if the data is not completely removed from the Cloud. A new proposal was developed based on Provable Ownership of File (POF) [9]. This is a cryptographically secure and efficient scheme for a clienttoprovetotheserver based on actual possession of the entire file instead of only partial information about it. Thus, it helps in reducing the burden of the client. To have a secure and constant cost public Cloud storage [10], a new scheme introduced supported data integrity as well as storage deduplication at the same time. But, the drawback being that it doesn’t discuss about the feasibility of supporting deduplication with Big Data. For reducing the workloads due to duplicate files, index name server (INS) [11] was proposed but the work doesn’t concentrate on deduplication of encrypted data. In this paper, we perform both file-level and block-level deduplication and use MD5 algorithmtogeneratesignatures to perform the ownership verification. We also apply Triple Data Encryption algorithm (3DES) using convergent key for the data to be more secure and reliable against hackers. Thus, we achieve deduplication with higher security and reliability in an efficient manner in Cloud. 3. EXISTING METHODOLOGY Most of the previous methodologies have only considered in a single-server system. Another important factor to be considered is that deduplication systems must provide higher reliability especially whilehandlingdata that are sensitive or critical. At times, it is required that these data should be preserved over longer time periods. Therefore the deduplication system has to be such, it achieves higher reliability as well as higher security at the same time compared to that of the other high-available systems. In the existing methodology, eachuserhasa unique master key for preserving the data privacy. But, this approach is unreliable, since each user has to dedicatedly protect his own master key. Accidentally, if the master key is lost then the user can never be recovered and hence it paves a way for attackers to involve in data leakage. Yet another problem is that the number of keys gets increased with increasing number of users and it becomes difficult to manage not only for the storage of contentaswell as for the storage of keys. Disadvantages .Deduplication is not scalable as enormous numbers of keys are required with the increasing number of users. . Cost increases to the storage of content as well as storage of keys. .Lack of proper key management. . Security becomes crucial if the master key is not protected. .Easily prone to data leakage by attackers if the master key is lost. .Less reliability due to lack of key management. 4. SYSTEM MODEL We propose a new scheme to deduplicate the encrypted data ensuring higher reliability by ownership verification of the file with the help ofa uniquetaggenerated using MD5 algorithm.
  • 3. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395 -0056 Volume: 04 Issue: 02 | Feb -2017 www.irjet.net p-ISSN: 2395-0072 © 2017, IRJET | Impact Factor value: 5.181 | ISO 9001:2008 Certified Journal | Page 1041 In the figure 1, a detailed overview of system architecture of our proposed scheme is shown. As described in the figure 1, the system consists of four different entities namely the User, Cloud server, N-key management servers and N-distributed block servers. User: User is an entity that makes use of the Cloud service for storage, retrieval and management of data in a cost-efficient environment. Cloud serviceallowstheuserto remotelystore the data online and can access the data from any part of the world via the internet connection. The users are provided with a priority of data security, enjoy a cost-effective service and also take the advantage of an unlimited storage space. Cloud server: Cloud server is an important entity built with a logical server over the internet. Cloud server is also known as the virtual server. It possess the capabilities similar to that of a normal server but can be remotely accessed from anywhere in the world via the internet access. An Example of a Cloud server that stores Electronic health records for doctors to access the patient’s records instantly in a Military based health system. N-Key management servers: Distributed key management servers are helpful in managing the keys storage in Cloud. Since the workload of storing keys in Cloud storage space adds overhead, the concept of distributedkeymanagementserverisintroduced. It provides improved scalability of key management with increasing number of keys and helps to protect sensitive data against the attackers. N-Distributed block servers: These servers are used to store the files as blocks in a distributed environment and theblocksarerequestedbythe Cloud server in behalf of the Cloud user after successful ownership verification. These are the componentsthataremajorlyinvolvedin our system. The Cloud user will initially registertotheCloud server and logins to their account to perform storage, retrieval or deletion of the data. When the data is uploaded onto the storage, each file and its block will beprovidedwith a unique tag generated by MD5 algorithm. The concept of ownership verification is performed to identify a valid user. The Cloud server generates convergent keys based on the hash and stores it in the key management server.Thesekeys are later used to encrypt and decrypt the data content. The Cloud server in behalf of the Cloud user requests the blocks to the distributed block servers and provides it to the user. Thus providing an efficient deduplication system that improves scalability and reliability. 5. PROPOSED METHODOLOGY The proposed methodology aims at providing a new deduplication system with higher reliability by splitting the file into several blocks and performing both file-level and block-level duplication checking. Our scheme ensures the security by means of an ownership verification. The ownership verification is done bytheCloud server to identify the valid user when uploading or downloading the file from preventing against attackers. A unique tag is assigned for each file and each block that is generated using an MD5 algorithm. Using the hash value, convergent key is created that performs encryption and decryption of the data content using Triple Data Encryption algorithm (3DES). Triple DES ensures high security and consistency to the data stored. Convergent encryption suffers from offline brute-force attacks, hence we generate the convergent key alone for our scheme and use it in anothersecure encryption3DESinstead of using convergent encryption for the entire data. An efficient distributed key management server helps in managing the convergent keys stored. A csv file that contains the metadata on the file such as hash value is also maintained to check the duplication. One of the main key features of using this new deduplicated system is that the data deletion will only delete the reference to that
  • 4. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395 -0056 Volume: 04 Issue: 02 | Feb -2017 www.irjet.net p-ISSN: 2395-0072 © 2017, IRJET | Impact Factor value: 5.181 | ISO 9001:2008 Certified Journal | Page 1042 copy but not the entire content and can be accessed by other users. File Upload: Suppose that a user wants to upload its file to the Cloud storage. The user initially has to make a registration to the Cloud and login with their account then he/she chooses an option to upload the file to the Cloud. Once after the file is uploaded, a unique signature tag is generated using MD5 algorithm. The produced hash tag is unique to only unique data contents. Suppose that another user wants to upload thesamefileinto the Cloud storage. Initially registration is done and then he/she chooses an option to upload the same file. Once after the upload process is completed, a unique signature is generated that matches with the previous signature indicating it’s a duplicate copy. Now, the Cloud server stores only one physical copy of the same file after performing the ownership verification. Thus, file-level deduplication is achieved. If the ownership verification becomes pass, the server provides the reference of the file to the requested user. If it fails, the upload operation is aborted. For block-level deduplication, the file uploaded is split into several blocks and stored in the distributed block servers. Each block is associated with a unique tag generated by the algorithm. If the same block is being uploaded, the server checks the ownership of the user to guarantee the validityof the owner and provides access rights to the users for the same block. The server fails to upload if the verificationfails. Thus, block-level deduplication is also achieved. File Download: Consider the user who wants to download the data from the Cloud. The user requests for the download option to the server. Upon ownership verification, the Cloud provides the user to download its data contents. If the user requests for a block, the ownership is verifiedandblocksare provided from the distributed block servers to the Cloud servers which in turn serves the users with the requested block. The user upon receiving the cipher text decrypts the data with its private key to get the original data. If the ownership verification fails, the downloadoperationis aborted. Data deletion: If the user requests for the data to bedeleted from the Cloud storage, the server only deletesthereference of that user to the data but the entire file is not deleted for providing accesses for other users. Algorithm 1: checking duplicate files: Figure 2: MD5 algorithm --The input file uploaded is processed bytheserver. --Generates a unique tag for the file using MD5. --Server maintains a csv metadata file with stored tags. --It becomes a Duplicate data, if the tag matches. -- Not a Duplicate data, otherwise. Algorithm 2: performing encryption: --Generating convergent keys using the stored hashes of file. --Perform 3DES encryption using convergent keys.
  • 5. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395 -0056 Volume: 04 Issue: 02 | Feb -2017 www.irjet.net p-ISSN: 2395-0072 © 2017, IRJET | Impact Factor value: 5.181 | ISO 9001:2008 Certified Journal | Page 1043 Figure 3: Triple data encryption algorithm 6. CONCLUSION We propose a new scheme that efficiently supports Big Data in HDFS storage providers. An effective approachis implemented to verify data ownership through verification process. We integrate the scheme with data access control and deduplication in a simple way over a secure challenging environment. The privacy of the data is also preserved by making use of strongly effective algorithms. The data chunks are distributed across the HDFS storage ensuring higher reliability. A reliable distributed key management is accomplished where the overhead of storing the keys in the Cloud storage is lessened. Thus a new distributed deduplication system is achieved with higher reliability and security of data in HDFS storage. 6. REFERENCES [1] Z. O. Wilcox, “Convergent encryption reconsidered,” 2011. [Online]. Available: http://www.mailarchive.com/cryptography@metzdowd.co m/msg08949.html [2] The Freenet Project, Freenet. (2016). [Online].Available: https:// freenetproject.org/ [3] Atul Adya, William J Bolosky, Miguel Castro, Gerald Cermak, Ronnie Chaiken, John R Douceur, Jon Howell, Jacob R Lorch, Marvin Theimer, and Roger P Wattenhofer. Farsite: Federated, available,andreliablestorageforanincompletely trusted environment. ACM SIGOPS Operating Systems Review, 36(SI):1–14, 2002. [4] Z. Sun, J. Shen, and J. M. Yong, “DeDu: Building a deduplication storage system over cloud computing,” in Proc. IEEE Int. Conf. Comput. Supported Cooperative Work Des.,2011,pp.348–355,doi:10.1109/CSCWD.2011.5960097 [5] M. Bellare, S. Keelveedhi, and T. Ristenpart, “DupLESS: Server aided encryption for deduplicated storage,” in Proc. 22nd USENIX Conf. Secur., 2013, pp. 179–194. [6]M.Fuetal, “Acceleratingrestoreandgarbagecollectionindeduplication- based backup systems via exploiting historical information,” inProc. USENIX Annu. Tech. Conf., 2014, pp.181–192. [7] J. Li, Y. K. Li, X. F. Chen, P. P. C. Lee, and W. J. Lou, “A hybrid cloud approach for secure authorizeddeduplication,” IEEE Trans. Parallel Distrib. Syst., vol. 26, no. 5, pp. 1206– 1216, May 2015, doi:10.1109/TPDS.2014.2318320. [8] P. Puzio, R. Molva, M. Onen, and S. Loureiro, “ClouDedup: Secure deduplication with encrypteddata forcloudstorage,” in Proc. IEEE Int. Cof. Cloud Comput. Technol. Sci., 2013, pp. 363–370, doi:10.1109/CloudCom.2013.54. [9] C. Yang, J. Ren, and J. F. Ma, “Provable ownership of file in deduplication cloud storage,” in Proc. IEEE Global Commun. Conf., 2013, pp. 695–700, doi:10.1109/GLOCOM.2013.6831153. [10] J.W.Yuanand S.C.Yu, “Secure and constant cost public cloud storage auditing with deduplication,” in Proc. IEEE Int.Conf. Communic. Netw.Secur.,2013,pp.145– 153,doi:10.1109/CNS.2013.6682702. [11] T. Y. Wu, J. S. Pan, and C. F. Lin, “Improving accessing efficiency of cloud storageusingde-duplicationandfeedback schemes,” IEEE Syst. J., vol. 8, no. 1, pp. 208–218, Mar. 2014, doi:10.1109/ JSYST.2013.2256715.