A Survey on Application-Aware Big Data
Deduplication in Cloud Environment
Mrs. Vani B, Associate Professor, Dept of CSE, Sambhram Institute of Technology, Bangalore
    Hridesh Chaudhary, Rajan Boudel, Sanjeev Kumar Raut, Santosh Lama, Dept of CSE,
                        Sambhram Institute of Technology, Bangalore
Abstract: Deduplication is now a widely used process in cloud storage to improve the utilization of IT resources and overall efficiency. Many techniques have been designed to solve the deduplication problem, but most of them fail to deliver an efficient deduplication process: they suffer from a low duplicate elimination ratio and poor scalability. To eliminate these problems, we introduce AppDedupe, an application-aware scalable inline distributed deduplication framework for the cloud environment. It meets this challenge by exploiting application awareness, data similarity and locality to optimize distributed deduplication with inter-node two-tiered data routing and intra-node application-aware deduplication. In application-aware deduplication, a file is broken down into small chunks; each chunk is assigned a handprint, and the handprint is stored in a lookup table. The handprint is used to speed up the deduplication process with high efficiency. Compared with traditional deduplication processes, AppDedupe achieves the highest deduplication efficiency together with high deduplication effectiveness.
Key Terms: big data deduplication, application awareness, data routing, handprinting, similarity index
                                       Introduction
The ongoing technological development of the computing environment has produced a flood of data from various domains over the past few years. Data centres are inundated with enormous volumes of data, and the complexity of managing it has led us to big data. According to IDC, about 75% of the data held in computerized storage is duplicate data. A familiar example of this bulge of duplicate data is the "Good morning" message on WhatsApp: the same message is forwarded to the same people many times and is stored on each user's system multiple times, which leads to data duplication. In big data, much of the data is simply a copy of other data, and these copies create problems for storage capacity and data retrieval.

Deduplication technology helps to overcome this problem by eliminating duplicate data, and it decreases administration time, operational complexity, power consumption and human error. In the deduplication process the data is divided into small parts known as chunks. Each chunk is assigned a fingerprint, and the fingerprint is stored in a lookup table. When a duplicate chunk is intercepted by the application, it is eliminated by matching its fingerprint, and only the unique data is stored, which increases communication and storage efficiency. Source inline deduplication is effective because it eliminates duplicate data before the data reaches the storage system, which reduces the required physical storage capacity.

Deduplication across a wide area network is much harder than deduplication within a single system or a small network of systems. At large scale, inter-node deduplication faces high communication overhead, while intra-node deduplication faces the chunk-index lookup disk bottleneck; together these prevent high-rate parallel deduplication. To eliminate these problems, AppDedupe is introduced: a scalable source inline distributed deduplication framework that leverages application awareness. It is a middleware application deployed by big data centres. Duplicate data is eliminated before it reaches the storage system, and only a link to the already stored data is kept. AppDedupe maintains high deduplication efficiency and performs deduplication on each node independently and in parallel.
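To make the process described above concrete, the following is a minimal Python sketch of source-side inline deduplication. The fixed chunk size, the SHA-256 fingerprints and the in-memory lookup table are illustrative assumptions made for the sketch, not the exact design of AppDedupe.

import hashlib

CHUNK_SIZE = 4 * 1024  # assumed chunk size; real systems often use content-defined chunking

def chunk_file(path):
    """Split a file into fixed-size chunks."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(CHUNK_SIZE)
            if not chunk:
                break
            yield chunk

def fingerprint(chunk):
    """Compute the fingerprint of a chunk as a SHA-256 digest."""
    return hashlib.sha256(chunk).hexdigest()

def dedupe_file(path, lookup_table, storage):
    """Store only unique chunks; duplicates are replaced by a reference to the stored copy."""
    recipe = []                              # ordered fingerprints needed to rebuild the file
    for chunk in chunk_file(path):
        fp = fingerprint(chunk)
        if fp not in lookup_table:           # unique chunk: keep the data
            lookup_table[fp] = len(storage)
            storage.append(chunk)
        recipe.append(fp)                    # duplicate chunk: keep only the reference
    return recipe

Here lookup_table maps each fingerprint to the position of the stored chunk, so a duplicate chunk costs only one table entry rather than another copy of the data.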
                                         Related work
Initially, “The View of Cloud Computing” was proposed, which explains the processes and mechanisms used in cloud computing. “Experiencing Data De-Duplication” was then carried out, which explains how efficiency is improved and capacity requirements are reduced. The next work related to data de-duplication is “An Application-Aware Framework for Video De-Duplication”, which prevents videos stored in the cloud environment from recurring in the cloud. “Multi-level Comparison of Data De-duplication in a Backup Scenario” is the next work that was carried out; it compares data at different levels to select the appropriate backup.
“Extreme Binning: Scalable, Parallel De-duplication for Chunk-Based File Backup” is another mechanism proposed for the data de-duplication process. It helps to check the redundancy of the data by dividing files into chunks before storing them in the cloud. The next work, “A Framework for Analysing and Improving Content-Based Chunking Algorithms”, was proposed to analyse and improve, by means of an algorithm, the storing of unique data in the virtual environment.
The next work related to our project is “Avoiding the Disk Bottleneck in the Data Domain De-duplication File System”, which mainly deals with parallel processing while data is being stored in the cloud. Because of the disk bottleneck, data traffic builds up at the final stage of storage, so this work was proposed to avoid such a bottleneck in the file system. The next related work is “Fast and Secure Laptop Backups with Encrypted De-Duplication”. This work deals with efficient data de-duplication from the laptop with proper speed and security: the data to be stored in the cloud is encrypted, which provides secure data de-duplication. Similarly, “Primary Data De-Duplication: Large Scale Study and System Design” provides the primary steps required for the de-duplication process in a large-scale setting. For large-scale study and system design, data de-duplication plays a vital role in providing efficient storage by eradicating redundant data, hence primary data de-duplication was carried out. The next work is “Content-Aware Load Balancing for Distributed Backup”, which analyses the content of the data and distributes it accordingly so that the load stored in the cloud is balanced.
The next task related to our project is “AA-Dedupe: An Application-Aware Source De-duplication Approach for Cloud Backup Services in the Personal Computing Environment”. In the present era, personal computing environments are widely deployed; hence a de-duplication approach must be provided for the data while storing it in the cloud for backup purposes.
                                      Existing System
Researchers have proposed many dynamic PoS schemes for single-user environments. In a single-user environment, the deduplication process takes place only within a user's own data, not across other users' data.

A multi-user cloud storage system needs a secure client-side cross-user deduplication technique, which allows a user to skip the uploading process and obtain ownership of a file immediately when other owners of the same file have already uploaded it to the cloud server.
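The following is a minimal sketch of such a client-side cross-user check, assuming a hypothetical server object with has_file, add_owner and store_file operations; these names are illustrative and do not correspond to any particular scheme.

import hashlib

def file_hash(path):
    """Hash the whole file so the client can query the server before uploading."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(1 << 20), b""):
            h.update(block)
    return h.hexdigest()

def client_side_upload(path, user_id, server):
    """Skip the upload and claim ownership if another user already stored the same file."""
    digest = file_hash(path)
    if server.has_file(digest):               # hypothetical server API
        server.add_owner(digest, user_id)     # cross-user deduplication: reuse the stored copy
        return "ownership granted, upload skipped"
    with open(path, "rb") as f:
        server.store_file(digest, user_id, f.read())
    return "file uploaded"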
                                      Proposed system
In the pre-process phase, users intend to upload their local files. Each file is divided into blocks, a hash code is generated for every block, and the blocks are sent to the deduplication phase.

In the deduplication phase, every block of the file is checked, based on the hash code generated for it, to see whether it has already been uploaded to the cloud. If a block has already been uploaded, the user is granted ownership of that block; new blocks are sent to the upload phase.
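A minimal sketch of these phases follows, assuming a fixed block size and a simple in-memory cloud index keyed by block hash; both are simplifications made only for illustration.

import hashlib

BLOCK_SIZE = 64 * 1024  # illustrative block size

def preprocess(path):
    """Pre-process phase: split the local file into blocks and generate a hash code per block."""
    blocks = []
    with open(path, "rb") as f:
        while True:
            data = f.read(BLOCK_SIZE)
            if not data:
                break
            blocks.append((hashlib.sha256(data).hexdigest(), data))
    return blocks

def deduplicate(blocks, cloud_index, user_id):
    """Deduplication phase: grant ownership for blocks already in the cloud and
    pass only new blocks on to the upload phase."""
    new_blocks = []
    for digest, data in blocks:
        if digest in cloud_index:
            cloud_index[digest]["owners"].add(user_id)   # block already uploaded
        else:
            new_blocks.append((digest, data))            # send to the upload phase
    return new_blocks

def upload(new_blocks, cloud_index, user_id):
    """Upload phase: store the new blocks and record their first owner."""
    for digest, data in new_blocks:
        cloud_index[digest] = {"data": data, "owners": {user_id}}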
AppDedupe Design
AppDedupe is, in essence, a mechanism in which an application is created to prevent data repetition while storing data in the virtual environment. The AppDedupe design is built around three components: capacity, throughput and scalability. Capacity deals with routing similar data through the same deduplication node, which provides an adequate result in preventing duplicate data from being stored multiple times. Throughput defines the efficiency of the deduplication process. Scalability means that the application used for deduplication should scale to handle huge amounts of data as per the requirement. Our main motive is to achieve high deduplication throughput with excellent capacity and scalability. This can be achieved by designing an inline distributed deduplication framework, which means that deduplication is performed on every small chunk of data before it is stored in the cloud.
                                        Methodology
To meet the challenges faced in de-duplication, we implemented a two-tiered data routing scheme to obtain scalable performance with a high deduplication efficiency ratio. The first tier is file-level application-aware routing, a decision made in the director component, which manages file information and keeps track of files. The second tier is super-chunk-level similarity-aware data routing in the client component, which determines whether a chunk is a duplicate before sending it, so that only unique data chunks are transferred over the interconnection network.

Based on this two-tiered data routing, two algorithms were implemented. The first is the Application-Aware Routing Algorithm, implemented in the application-aware routing decision module of the director. The second is Handprinting-Based Stateful Data Routing, which improves load balance across the dedupe storage nodes by avoiding alteration of already stored data when a node is added to or removed from the storage cluster.
Algorithm 1. Application-Aware Routing Algorithm
Input: the full name of a file, fullname, and a list of all dedupe storage nodes {S1, S2, …, SN}
Output: an ID list of application storage nodes, ID_list = {A1, A2, …, Am}
1. Extract the filename extension as the application type from the file full name fullname sent from the client side;
2. Query the application route table in the director and find the dedupe storage nodes Ai that have stored the same type of application data; this yields the corresponding application storage nodes ID_list = {A1, A2, …, Am} ⊆ {S1, S2, …, SN};
3. Check the node list: if ID_list = ∅ or all nodes in ID_list are overloaded, then add the dedupe storage node SL with the lightest workload into the list: ID_list = {SL};
4. Return the result ID_list to the client.
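A possible Python rendering of Algorithm 1 is given below, assuming the director keeps the application route table as a dictionary from filename extension to node IDs and tracks per-node workload; the overload threshold and the data structures are illustrative assumptions.

import os

def application_aware_routing(fullname, app_route_table, node_load, overload_threshold):
    """Algorithm 1 sketch: choose candidate storage nodes for a file by its application type.

    fullname           -- full name of the file sent from the client
    app_route_table    -- dict: filename extension -> IDs of nodes storing that application type
    node_load          -- dict: node ID -> current workload
    overload_threshold -- workload above which a node is treated as overloaded
    """
    # Step 1: extract the filename extension as the application type
    app_type = os.path.splitext(fullname)[1].lower()

    # Step 2: find the nodes that already store this application type
    id_list = list(app_route_table.get(app_type, ()))

    # Step 3: if no node matches, or all matching nodes are overloaded,
    # fall back to the node with the lightest workload
    if not id_list or all(node_load[n] > overload_threshold for n in id_list):
        id_list = [min(node_load, key=node_load.get)]

    # Step 4: return the candidate node IDs to the client
    return id_list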
Algorithm 2. Handprinting-Based Stateful Data Routing
Input: the chunk fingerprint list of a super-chunk S in a file, {fp1, fp2, …, fpc}, and the corresponding application storage node ID list of the file, ID_list = {A1, A2, …, Am}
Output: a target node ID, i
1. Select the k smallest chunk fingerprints {rfp1, rfp2, …, rfpk} as the handprint of the super-chunk S by sorting the chunk fingerprint list {fp1, fp2, …, fpc}, and send the handprint to the k candidate nodes whose IDs are obtained by consistent hashing over the m corresponding application storage nodes;
2. Obtain the counts of the representative fingerprints of super-chunk S already present in the k candidate nodes, denoted {r1, r2, …, rk}, by comparing them with the representative fingerprints of previously stored super-chunks in the application-aware similarity index;
3. Calculate the relative storage usage, i.e. each node's storage usage divided by the average storage usage, to balance the capacity load across the k candidate nodes, denoted {w1, w2, …, wk};
4. Choose the dedupe storage node with ID i satisfying ri/wi = max{r1/w1, r2/w2, …, rk/wk} as the target node.
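A sketch of Algorithm 2 in the same style follows; the consistent-hashing step is simplified away and the application-aware similarity index is modelled as a plain set of representative fingerprints per node, so it illustrates only the selection rule ri/wi = max{r1/w1, …, rk/wk}.

def handprint_routing(chunk_fps, id_list, similarity_index, storage_usage, k=8):
    """Algorithm 2 sketch: route a super-chunk to the most similar, least loaded node.

    chunk_fps        -- chunk fingerprints of the super-chunk
    id_list          -- application storage node IDs returned by Algorithm 1
    similarity_index -- dict: node ID -> set of representative fingerprints stored there
    storage_usage    -- dict: node ID -> used storage capacity
    k                -- handprint size (number of representative fingerprints)
    """
    # Step 1: the k smallest fingerprints form the handprint of the super-chunk;
    # for simplicity the candidates are the application storage nodes themselves.
    handprint = set(sorted(chunk_fps)[:k])
    candidates = id_list

    # Step 2: count the representative fingerprints each candidate already holds
    r = {n: len(handprint & similarity_index.get(n, set())) for n in candidates}

    # Step 3: relative storage usage = node usage / average usage over the candidates
    avg = sum(storage_usage[n] for n in candidates) / len(candidates)
    w = {n: storage_usage[n] / avg if avg > 0 else 1.0 for n in candidates}

    # Step 4: pick the node maximising the similarity-to-load ratio r_i / w_i
    return max(candidates, key=lambda n: r[n] / max(w[n], 1e-9))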
                                              Conclusion
In this paper, we describe AppDedupe, an application-aware scalable inline distributed deduplication framework for big data management, which achieves a trade-off between scalable performance and distributed deduplication effectiveness by exploiting application awareness, data similarity and locality.

It implements a two-tiered data routing scheme that routes data at the super-chunk granularity to reduce cross-node data redundancy while providing good load balance and low communication overhead, and it employs an application-aware similarity index based optimization to improve deduplication efficiency in each node with very low RAM usage.

The main advantage of the AppDedupe framework over the state of the art in distributed deduplication for large clusters is that it outperforms stateful tight-coupling schemes in cluster-wide deduplication effectiveness and improves on stateless loose-coupling schemes with high scalability and low communication overhead.
                                               References
[1] K. Srinivasan, T. Bisson, G. Goodson, and K. Voruganti, “iDedup: Latency-aware, Inline Data Deduplication for Primary Storage,” Proc. of the 10th USENIX Conference on File and Storage Technologies (FAST’12), Feb. 2012.
[2] D. Bhagwat, K. Eshghi, D.D. Long, M. Lillibridge, “Extreme Binning: Scalable, Parallel Deduplication for
Chunk-based File Backup,” Proc. of the 17th IEEE International Symposium on Modeling, Analysis, and
Simulation of Computer and Telecommunication Systems (MASCOTS’09), pp.1-9, Sep. 2009.
[3] W. Dong, F. Douglis, K. Li, H. Patterson, S. Reddy, P. Shilane, “Tradeoffs in Scalable Data Routing for
Deduplication Clusters,” Proc. of the 9th USENIX Conf. on File and Storage Technologies (FAST’11), pp. 15-
29, Feb. 2011.
[4] T. Yang, H. Jiang, D. Feng, Z. Niu, K. Zhou, Y. Wan, “DEBAR: a Scalable High-Performance
Deduplication Storage System for Backup and Archiving,” Proc. of the 24th IEEE International Parallel and
Distributed Processing Symposium (IPDPS’10), pp. 1-12, Apr. 2010.
[5] Y. Fu, H. Jiang, N. Xiao, “A Scalable Inline Cluster Deduplication Framework for Big Data Protection,”
Proc. of the 13th ACM/IFIP/ USENIX Conf. on Middleware (Middleware’12), pp. 354-373, Dec. 2012.
[6] M. Lillibridge, K. Eshghi, D. Bhagwat, “Improving Restore Speed for Backup Systems that Use Inline Chunk-Based Deduplication,” Proc. of the 11th USENIX Conf. on File and Storage Technologies (FAST’13), Feb. 2013.
[7] M. Fu, D. Feng, Y. Hua, X. He, Z. Chen, W. Xia, Y. Zhang, Y. Tan, “Design Tradeoffs for Data Deduplication Performance in Backup Workloads,” Proc. of the 13th USENIX Conf. on File and Storage Technologies (FAST’15), pp. 331-344, Feb. 2015.
[8] W. Xia, H. Jiang, D. Feng, Y. Hua, “Silo: a Similarity-locality based Near-exact Deduplication Scheme with Low RAM Overhead and High Throughput,” Proc. of 2011 USENIX Annual Technical Conference (ATC’11), pp. 285-298, Jun. 2011.
[9] Y. Fu, H. Jiang, N. Xiao, L. Tian, F. Liu, “AA-Dedupe: An Application-Aware Source Deduplication
Approach for Cloud Backup Services in the Personal Computing Environment,” Proc. of the 13th IEEE Conf.
on Cluster Computing (Cluster’11), pp. 112-120, Sep. 2011.
[10] B. Zhu, K. Li, H. Patterson, “Avoiding the Disk Bottleneck in the Data Domain Deduplication File
System,” Proc. of the 6th USENIX Conf. on File and Storage Technologies (FAST‘08), pp. 269-282, Feb. 2008.