Head office: 3nd floor, Krishna Reddy Buildings, OPP: ICICI ATM, Ramalingapuram, Nellore
www.pvrtechnology.com, E-Mail: pvrieeeprojects@gmail.com, Ph: 81432 71457
Reducing Fragmentation for In-line Deduplication
Backup Storage via Exploiting Backup History
And Cache Knowledge
ABSTRACT
In backup systems, the chunks of each backup are physically scattered after
deduplication, which causes a challenging fragmentation problem. We observe that
the fragmentation falls into two categories: sparse containers and out-of-order
containers. Sparse containers decrease restore performance and garbage
collection efficiency, while out-of-order containers decrease restore
performance if the restore cache is small. To reduce the fragmentation, we
propose the History-Aware Rewriting algorithm (HAR) and the Cache-Aware Filter
(CAF). HAR exploits historical information in backup systems to accurately
identify and reduce sparse containers, and CAF exploits restore cache knowledge
to identify the out-of-order containers that hurt restore performance. CAF
efficiently complements HAR in datasets where out-of-order containers are
dominant. To reduce the metadata overhead of garbage collection, we further
propose a Container-Marker Algorithm (CMA) to identify valid containers instead
of valid chunks. Our extensive experimental results from real-world datasets
show that HAR significantly improves restore performance by 2.84-175.36× at a
cost of rewriting only 0.5-2.03% of the data.
Existing System:
In backup systems, the chunks of each backup are physically scattered after
deduplication, which causes a challenging fragmentation problem. We observe that
the fragmentation falls into two categories: sparse containers and out-of-order
containers. Sparse containers decrease restore performance and garbage
collection efficiency, while out-of-order containers decrease restore
performance if the restore cache is small.
Proposed System
We propose the History-Aware Rewriting algorithm (HAR) and the Cache-Aware
Filter (CAF). HAR exploits historical information in backup systems to
accurately identify and reduce sparse containers, and CAF exploits restore cache
knowledge to identify the out-of-order containers that hurt restore performance.
CAF efficiently complements HAR in datasets where out-of-order containers are
dominant. To reduce the metadata overhead of garbage collection, we further
propose a Container-Marker Algorithm (CMA) to identify valid containers instead
of valid chunks. Our extensive experimental results from real-world datasets
show that HAR significantly improves restore performance by 2.84-175.36× at a
cost of rewriting only 0.5-2.03% of the data.
We also propose a hybrid rewriting algorithm as a complement to HAR to reduce
the negative impacts of out-of-order containers. HAR, as well as OPT, improves
restore performance by 2.84-175.36× at an acceptable cost in deduplication
ratio. HAR outperforms the state-of-the-art work in terms of both deduplication
ratio and restore performance. The hybrid scheme further improves restore
performance in datasets where out-of-order containers are dominant. To avoid a
significant decrease of deduplication ratio in the hybrid scheme, we develop a
Cache-Aware Filter (CAF) to exploit cache knowledge. With the help of CAF, the
hybrid scheme significantly improves the deduplication ratio without decreasing
the restore performance. Note that CAF can be used as an optimization of
existing rewriting algorithms.
Architecture:
MODULES
• Storage system
• Data deduplication
• Chunk fragmentation
• Performance evaluation
Storage system
Cloud storage is a model of data storage where the digital data is stored in
logical pools, the physical storage spans multiple servers (and often locations),
and the physical environment is typically owned and managed by a hosting
company. These cloud storage providers are responsible for keeping the data
available and accessible, and the physical environment protected and running.
People and organizations buy or lease storage capacity from the providers to
store user, organization, or application data.
Data Deduplication
Because duplicate chunks are eliminated across multiple backups, the chunks of
a backup unfortunately become physically scattered in different containers,
which is known as fragmentation. The negative impacts of the fragmentation are
two-fold. First, the fragmentation severely decreases restore performance.
Although restores are infrequent, they are important and the main concern of
users. Moreover, data replication, which is important for disaster recovery,
requires reconstructing the original backup streams from deduplication systems
and thus suffers from a performance problem similar to the restore operation.
Second, the fragmentation results in invalid chunks (not referenced by any
backups) becoming physically scattered in different containers when users
delete expired backups. Existing garbage collection solutions first identify
valid chunks and the containers holding only a few valid chunks (i.e.,
reference management). Then, a merging operation is required to copy the valid
chunks in the identified containers to new containers. Finally, the identified
containers are reclaimed. Unfortunately, the metadata space overhead of
reference management is proportional to the number of chunks, and the merging
operation is the most time-consuming phase in garbage collection. We observe
that the fragmentation comes in two categories of containers: sparse containers
and out-of-order containers, which have different negative impacts and require
dedicated solutions.
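The three garbage collection phases described above (reference management, merging, reclaiming) can be sketched as follows. This is a minimal illustration, not the implementation of any particular system: the function name `collect_garbage`, the in-memory dictionaries, the `capacity` parameter (chunks per container), and the `min_valid_fraction` cut-off for "only a few valid chunks" are all assumptions made for the example.

```python
def collect_garbage(containers, recipes, capacity, min_valid_fraction=0.5):
    """Hypothetical sketch of mark, merge, and reclaim.

    containers: dict of container ID -> list of chunk fingerprints
    recipes:    dict of live backup name -> list of chunk fingerprints
    capacity:   number of chunks a container can hold (assumed fixed)
    """
    # Phase 1 (reference management): a chunk is valid if any live backup
    # still references it. The set grows with the number of chunks.
    valid = set()
    for fingerprints in recipes.values():
        valid.update(fingerprints)

    survivors = {}
    to_copy = []
    for cid, chunks in containers.items():
        live = [fp for fp in chunks if fp in valid]
        if len(live) / capacity >= min_valid_fraction:
            survivors[cid] = list(chunks)  # well used: keep as is
        else:
            to_copy.extend(live)           # sparse: schedule for merging

    # Phase 2 (merging): copy the valid chunks into new containers.
    next_id = max(containers, default=-1) + 1
    for start in range(0, len(to_copy), capacity):
        survivors[next_id] = to_copy[start:start + capacity]
        next_id += 1

    # Phase 3 (reclaiming): containers missing from the result are freed.
    return survivors
```

A real system would also have to update the fingerprint index and the recipes after merging, which this sketch leaves out.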
Chunk fragmentation
The fragmentation problem in deduplication systems has received much
attention. iDedup eliminates sequential and duplicate chunks in the context of
primary storage systems. Nam et al. propose a quantitative metric to measure the
fragmentation level of deduplication systems, and a selective deduplication
scheme for backup workloads. SAR stores hot chunks in SSD to accelerate
reads. RevDedup employs a hybrid inline and out-of-line deduplication scheme
to improve the restore performance of the latest backups. The Context-Based
Rewriting algorithm (CBR) [17] and the capping algorithm (Capping) are recently
proposed rewriting algorithms that address the fragmentation problem. Both
buffer a small part of the ongoing backup stream during a backup and identify
fragmented chunks within the buffer (generally 10-20 MB). For example,
Capping divides the backup stream into fixed-sized segments (e.g., 20 MB) and
estimates the fragmentation within each segment. Capping limits the
maximum number (say T) of containers a segment can refer to. If a new
segment refers to N containers and N > T, the chunks in the N − T containers
that hold the fewest chunks in the segment are rewritten.
Reference management for garbage collection is complicated, since each chunk
can be referenced by multiple backups. The offline approaches traverse all
fingerprints (including the fingerprint index and recipes) when the system is
idle. For example, Botelho et al. [14] build a perfect hash vector as a compact
representation of all chunks. Since recipes occupy significant storage space,
the traversal is time-consuming. The inline approaches maintain additional
metadata during backup to facilitate garbage collection. Maintaining a
reference counter for each chunk is expensive and error-prone. Grouped
Mark-and-Sweep (GMS) uses a bitmap to mark which chunks in a container are used
by a backup.
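The Capping rule described earlier in this section (at most T containers per segment, rewriting the chunks of the N − T least-used containers) can be sketched as below. The function name, the representation of a segment as a list of `(fingerprint, container_id)` pairs, and the tie-breaking by container ID are assumptions made for illustration only.

```python
from collections import Counter


def capping_rewrite_set(segment, cap):
    """Return the container IDs whose chunks should be rewritten.

    segment: list of (fingerprint, container_id) pairs; container_id is
             None for chunks that are new and therefore not duplicates.
    cap:     the limit T on containers a segment may refer to.
    """
    # Count how many chunks of this segment each container holds.
    counts = Counter(cid for _, cid in segment if cid is not None)
    if len(counts) <= cap:
        return set()  # N <= T: nothing needs rewriting

    # Rank containers from most to fewest referenced chunks. Ties are
    # broken by container ID, which is an arbitrary choice for this sketch.
    ranked = sorted(counts, key=lambda cid: (-counts[cid], cid))

    # Keep the top T containers; the remaining N - T are rewritten.
    return set(ranked[cap:])
```

For example, with T = 2 and a segment that references three containers holding 3, 1, and 2 of its chunks, only the container holding a single chunk is selected.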
Performance evaluation
Four datasets, Kernel, VMDK, RDB, and Synthetic, are used for evaluation.
Their characteristics are listed in Table 1. Each backup stream is divided
into variable-sized chunks via Content-Defined Chunking. Kernel, downloaded
from the web, is a commonly used public dataset. It consists of 258
consecutive versions of unpacked Linux source code. Each version is 412.78 MB
on average. Two consecutive versions are generally 99% identical except when
there are major revision upgrades. There are only a few self-references and
hence sparse containers are dominant. VMDK is from a virtual machine running
Ubuntu 12.04 LTS, which is a common real-world use case. We compiled source
code, patched the system, and ran an HTTP server on the virtual machine. VMDK
consists of 126 full backups. Each full backup is 15.36 GB in size on average,
and 90-98% identical to its adjacent backups. Each backup contains about 15%
self-referred chunks. Out-of-order containers are dominant and sparse
containers are less severe. RDB consists of snapshots of a Redis database. The
database has 5 million records and occupies 5 GB. We ran YCSB to update the
database in a Zipfian distribution. The update ratio is 1% on average. After
each run, we archived the uncompressed dump.rdb file, which is the on-disk
snapshot of the database. Finally, we got 212 versions of snapshots. There is
no self-reference and hence sparse containers are dominant. Synthetic was
generated according to existing approaches. We simulated common operations of
file systems, such as file creation, deletion, and modification. We finally
obtained a 4.5 TB dataset with 400 versions. There is no self-reference in
Synthetic and sparse containers are dominant.
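The evaluation divides each backup stream into variable-sized chunks via Content-Defined Chunking. The text does not say which chunking function is used, so the following is only a minimal sketch of the general idea, assuming a simple gear-style rolling hash; the table `GEAR`, the function `cdc_chunks`, and the size parameters are hypothetical and chosen for illustration.

```python
import random

# Fixed pseudo-random table mapping each byte value to a 64-bit number.
# Seeded so that chunk boundaries are reproducible across runs.
_rng = random.Random(0)
GEAR = [_rng.getrandbits(64) for _ in range(256)]
MASK64 = (1 << 64) - 1


def cdc_chunks(data, min_size=2048, avg_bits=13, max_size=65536):
    """Split data into variable-sized chunks at content-defined boundaries."""
    boundary_mask = (1 << avg_bits) - 1
    chunks = []
    start = 0
    h = 0
    for i, byte in enumerate(data):
        # Rolling hash: older bytes are shifted out as new bytes arrive.
        h = ((h << 1) + GEAR[byte]) & MASK64
        length = i - start + 1
        at_boundary = length >= min_size and (h & boundary_mask) == 0
        if at_boundary or length >= max_size:
            chunks.append(data[start:i + 1])
            start = i + 1
            h = 0
    if start < len(data):
        chunks.append(data[start:])  # trailing chunk may be shorter
    return chunks
```

Because a boundary depends only on the bytes just before it, an insertion early in a stream tends to shift only nearby boundaries, which is what lets consecutive backup versions share most of their chunks.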
History-Aware Rewriting Algorithm
At the beginning of a backup, HAR loads the IDs of all inherited sparse
containers to construct the in-memory S_inherited structure. During the backup,
HAR rewrites all duplicate chunks whose container IDs exist in S_inherited.
Additionally, HAR maintains an in-memory structure, S_emerging (included in
collected info in Figure 3), to monitor the utilizations of all the containers
referenced by the backup. S_emerging is a set of utilization records, and each
record consists of a container ID and the current utilization of the container.
After the backup concludes, HAR removes the records with utilizations higher
than the utilization threshold from S_emerging. S_emerging then contains the
IDs of all emerging sparse containers. In most cases, S_emerging can be flushed
directly to disk as the S_inherited of the next backup, because the size of
S_emerging is generally small due to our second observation. However, there are
two spikes in the number of emerging sparse containers. A large number of
emerging sparse containers indicates that many fragmented chunks will have to
be rewritten in the next backup. This would shift the performance bottleneck to
data writing and hurt backup performance, which is of top priority. To address
this problem, HAR sets a rewrite limit, such as 5%, to avoid too many rewrites
in the next backup. HAR uses the rewrite limit to determine whether there are
too many sparse containers in S_emerging. (1) HAR calculates an estimated
rewrite ratio (defined as the size of rewritten data divided by the backup
size) for the next backup. Specifically, HAR first
calculates the
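The bookkeeping described in this section, up to the point where emerging sparse containers are identified, can be sketched as follows. This is an illustrative sketch only: the class name `HarTracker`, its method names, and the definition of utilization as referenced bytes divided by container size are assumptions, and the rewrite-limit step is omitted because its description is cut off in the text.

```python
class HarTracker:
    """Hypothetical per-backup bookkeeping for HAR."""

    def __init__(self, s_inherited, utilization_threshold, container_size):
        # IDs of the sparse containers inherited from the previous backup.
        self.s_inherited = set(s_inherited)
        self.utilization_threshold = utilization_threshold
        self.container_size = container_size
        # Bytes of each container referenced by the current backup.
        self._referenced_bytes = {}

    def should_rewrite(self, container_id):
        """A duplicate chunk is rewritten if its container is in S_inherited."""
        return container_id in self.s_inherited

    def record_reference(self, container_id, chunk_size):
        """Record that the backup references chunk_size bytes of a container.

        Assumes the caller counts each chunk once per backup.
        """
        self._referenced_bytes[container_id] = (
            self._referenced_bytes.get(container_id, 0) + chunk_size
        )

    def finish(self):
        """Return S_emerging: container ID -> utilization, sparse ones only."""
        s_emerging = {
            cid: used / self.container_size
            for cid, used in self._referenced_bytes.items()
        }
        # Drop records whose utilization is higher than the threshold.
        return {
            cid: u for cid, u in s_emerging.items()
            if u <= self.utilization_threshold
        }
```

The dictionary returned by `finish()` would then serve as the S_inherited input of the next backup, subject to the rewrite limit.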
System Configuration:
HARDWARE REQUIREMENTS:
Hardware - Pentium
Speed - 1.1 GHz
RAM - 1GB
Hard Disk - 20 GB
Floppy Drive - 1.44 MB
Key Board - Standard Windows Keyboard
Mouse - Two or Three Button Mouse
Monitor - SVGA
SOFTWARE REQUIREMENTS:
Operating System : Windows
Technology : Java and J2EE
Web Technologies : HTML, JavaScript, CSS
IDE : MyEclipse
Web Server : Tomcat
Tool kit : Android Phone
Database : MySQL
Java Version : J2SDK 1.5

More Related Content

What's hot

Integrating dbm ss as a read only execution layer into hadoop
Integrating dbm ss as a read only execution layer into hadoopIntegrating dbm ss as a read only execution layer into hadoop
Integrating dbm ss as a read only execution layer into hadoop
João Gabriel Lima
 
PERFORMANCE EVALUATION OF BIG DATA PROCESSING OF CLOAK-REDUCE
PERFORMANCE EVALUATION OF BIG DATA PROCESSING OF CLOAK-REDUCEPERFORMANCE EVALUATION OF BIG DATA PROCESSING OF CLOAK-REDUCE
PERFORMANCE EVALUATION OF BIG DATA PROCESSING OF CLOAK-REDUCE
ijdpsjournal
 
An experimental evaluation of performance
An experimental evaluation of performanceAn experimental evaluation of performance
An experimental evaluation of performance
ijcsa
 
Change data capture the journey to real time bi
Change data capture the journey to real time biChange data capture the journey to real time bi
Change data capture the journey to real time bi
Asis Mohanty
 
RDB - Repairable Database Systems
RDB - Repairable Database SystemsRDB - Repairable Database Systems
RDB - Repairable Database Systems
Alexey Smirnov
 
STUDY ON EMERGING APPLICATIONS ON DATA PLANE AND OPTIMIZATION POSSIBILITIES
STUDY ON EMERGING APPLICATIONS ON DATA PLANE AND OPTIMIZATION POSSIBILITIESSTUDY ON EMERGING APPLICATIONS ON DATA PLANE AND OPTIMIZATION POSSIBILITIES
STUDY ON EMERGING APPLICATIONS ON DATA PLANE AND OPTIMIZATION POSSIBILITIES
ijdpsjournal
 
Mobile computing unit 5
Mobile computing  unit 5Mobile computing  unit 5
Mobile computing unit 5
Assistant Professor
 
Geo distributed parallelization pacts in map reduce
Geo distributed parallelization pacts in map reduceGeo distributed parallelization pacts in map reduce
Geo distributed parallelization pacts in map reduce
eSAT Publishing House
 
Cjoin
CjoinCjoin
Cjoin
blogboy
 
Erlang Cache
Erlang CacheErlang Cache
Erlang Cache
ice j
 
Sqlmr
SqlmrSqlmr
Sqlmr
Ajay Ohri
 
A BRIEF REVIEW ALONG WITH A NEW PROPOSED APPROACH OF DATA DE DUPLICATION
A BRIEF REVIEW ALONG WITH A NEW PROPOSED APPROACH OF DATA DE DUPLICATIONA BRIEF REVIEW ALONG WITH A NEW PROPOSED APPROACH OF DATA DE DUPLICATION
A BRIEF REVIEW ALONG WITH A NEW PROPOSED APPROACH OF DATA DE DUPLICATION
cscpconf
 
Architecture and implementation issues of multi core processors and caching –...
Architecture and implementation issues of multi core processors and caching –...Architecture and implementation issues of multi core processors and caching –...
Architecture and implementation issues of multi core processors and caching –...
eSAT Publishing House
 
Introduction to yarn N.Nandhitha II M.Sc., computer science Bon secours colle...
Introduction to yarn N.Nandhitha II M.Sc., computer science Bon secours colle...Introduction to yarn N.Nandhitha II M.Sc., computer science Bon secours colle...
Introduction to yarn N.Nandhitha II M.Sc., computer science Bon secours colle...
Nandhitha B
 
A Survey on Rule-Based Systems and the significance of Fault Tolerance for Hi...
A Survey on Rule-Based Systems and the significance of Fault Tolerance for Hi...A Survey on Rule-Based Systems and the significance of Fault Tolerance for Hi...
A Survey on Rule-Based Systems and the significance of Fault Tolerance for Hi...
IJCSIS Research Publications
 
HA Summary
HA SummaryHA Summary
HA Summary
Zhang Fan
 
Load Balancing in Parallel and Distributed Database
Load Balancing in Parallel and Distributed DatabaseLoad Balancing in Parallel and Distributed Database
Load Balancing in Parallel and Distributed Database
Md. Shamsur Rahim
 
Active / Active configurations with Oracle Active Data Guard
Active / Active configurations with Oracle Active Data GuardActive / Active configurations with Oracle Active Data Guard
Active / Active configurations with Oracle Active Data Guard
Aris Prassinos
 
Chapter25
Chapter25Chapter25
Chapter25
gourab87
 

What's hot (19)

Integrating dbm ss as a read only execution layer into hadoop
Integrating dbm ss as a read only execution layer into hadoopIntegrating dbm ss as a read only execution layer into hadoop
Integrating dbm ss as a read only execution layer into hadoop
 
PERFORMANCE EVALUATION OF BIG DATA PROCESSING OF CLOAK-REDUCE
PERFORMANCE EVALUATION OF BIG DATA PROCESSING OF CLOAK-REDUCEPERFORMANCE EVALUATION OF BIG DATA PROCESSING OF CLOAK-REDUCE
PERFORMANCE EVALUATION OF BIG DATA PROCESSING OF CLOAK-REDUCE
 
An experimental evaluation of performance
An experimental evaluation of performanceAn experimental evaluation of performance
An experimental evaluation of performance
 
Change data capture the journey to real time bi
Change data capture the journey to real time biChange data capture the journey to real time bi
Change data capture the journey to real time bi
 
RDB - Repairable Database Systems
RDB - Repairable Database SystemsRDB - Repairable Database Systems
RDB - Repairable Database Systems
 
STUDY ON EMERGING APPLICATIONS ON DATA PLANE AND OPTIMIZATION POSSIBILITIES
STUDY ON EMERGING APPLICATIONS ON DATA PLANE AND OPTIMIZATION POSSIBILITIESSTUDY ON EMERGING APPLICATIONS ON DATA PLANE AND OPTIMIZATION POSSIBILITIES
STUDY ON EMERGING APPLICATIONS ON DATA PLANE AND OPTIMIZATION POSSIBILITIES
 
Mobile computing unit 5
Mobile computing  unit 5Mobile computing  unit 5
Mobile computing unit 5
 
Geo distributed parallelization pacts in map reduce
Geo distributed parallelization pacts in map reduceGeo distributed parallelization pacts in map reduce
Geo distributed parallelization pacts in map reduce
 
Cjoin
CjoinCjoin
Cjoin
 
Erlang Cache
Erlang CacheErlang Cache
Erlang Cache
 
Sqlmr
SqlmrSqlmr
Sqlmr
 
A BRIEF REVIEW ALONG WITH A NEW PROPOSED APPROACH OF DATA DE DUPLICATION
A BRIEF REVIEW ALONG WITH A NEW PROPOSED APPROACH OF DATA DE DUPLICATIONA BRIEF REVIEW ALONG WITH A NEW PROPOSED APPROACH OF DATA DE DUPLICATION
A BRIEF REVIEW ALONG WITH A NEW PROPOSED APPROACH OF DATA DE DUPLICATION
 
Architecture and implementation issues of multi core processors and caching –...
Architecture and implementation issues of multi core processors and caching –...Architecture and implementation issues of multi core processors and caching –...
Architecture and implementation issues of multi core processors and caching –...
 
Introduction to yarn N.Nandhitha II M.Sc., computer science Bon secours colle...
Introduction to yarn N.Nandhitha II M.Sc., computer science Bon secours colle...Introduction to yarn N.Nandhitha II M.Sc., computer science Bon secours colle...
Introduction to yarn N.Nandhitha II M.Sc., computer science Bon secours colle...
 
A Survey on Rule-Based Systems and the significance of Fault Tolerance for Hi...
A Survey on Rule-Based Systems and the significance of Fault Tolerance for Hi...A Survey on Rule-Based Systems and the significance of Fault Tolerance for Hi...
A Survey on Rule-Based Systems and the significance of Fault Tolerance for Hi...
 
HA Summary
HA SummaryHA Summary
HA Summary
 
Load Balancing in Parallel and Distributed Database
Load Balancing in Parallel and Distributed DatabaseLoad Balancing in Parallel and Distributed Database
Load Balancing in Parallel and Distributed Database
 
Active / Active configurations with Oracle Active Data Guard
Active / Active configurations with Oracle Active Data GuardActive / Active configurations with Oracle Active Data Guard
Active / Active configurations with Oracle Active Data Guard
 
Chapter25
Chapter25Chapter25
Chapter25
 

Viewers also liked

lista de chequeo y hoja de vida biblioteca 2
lista de chequeo y hoja de vida biblioteca 2lista de chequeo y hoja de vida biblioteca 2
lista de chequeo y hoja de vida biblioteca 2
paoalejaipiespa
 
Ad walter mondragon m.
Ad walter mondragon m.Ad walter mondragon m.
Ad walter mondragon m.
cahefeva
 
#Cybermedia2015 Ilaria Sangregorio
#Cybermedia2015 Ilaria Sangregorio#Cybermedia2015 Ilaria Sangregorio
#Cybermedia2015 Ilaria Sangregorio
CyberMediAvola
 
Hoja de vida y lista de chequeo paula franpasescuela
Hoja de vida y lista de chequeo paula franpasescuelaHoja de vida y lista de chequeo paula franpasescuela
Hoja de vida y lista de chequeo paula franpasescuela
paoalejaipiespa
 
використання блогів в управлінні персоналом
використання блогів в управлінні персоналомвикористання блогів в управлінні персоналом
використання блогів в управлінні персоналом
pedenko
 
Infortis news
Infortis newsInfortis news
Infortis news
Balthazar Halfbreed
 
#Cybermedia2015 Cenzina Salemi
#Cybermedia2015 Cenzina Salemi#Cybermedia2015 Cenzina Salemi
#Cybermedia2015 Cenzina Salemi
CyberMediAvola
 
Equilibrio y le chatelier
Equilibrio y le chatelierEquilibrio y le chatelier
Equilibrio y le chatelier
Joselyn Bosquez
 
Aumentando a sua produtividade com ZTD – Zen to Done
Aumentando a sua produtividade com ZTD – Zen to DoneAumentando a sua produtividade com ZTD – Zen to Done
Aumentando a sua produtividade com ZTD – Zen to Done
Gerente Bem Informado
 
Health benefits of butter
Health benefits of butterHealth benefits of butter
Health benefits of butter
riksana riki
 
EUPATI 2013 Conference: Vision on Patient involvement in medicines R&D: Here...
EUPATI 2013 Conference: Vision on Patient involvement in medicines R&D: Here...EUPATI 2013 Conference: Vision on Patient involvement in medicines R&D: Here...
EUPATI 2013 Conference: Vision on Patient involvement in medicines R&D: Here...
EUPATI
 
Roadies
RoadiesRoadies
Roadies
Dhruv009
 
Neurociência do consumo: Entendendo o que é Neuromarketing. Aula 3 - Atenção ...
Neurociência do consumo: Entendendo o que é Neuromarketing. Aula 3 - Atenção ...Neurociência do consumo: Entendendo o que é Neuromarketing. Aula 3 - Atenção ...
Neurociência do consumo: Entendendo o que é Neuromarketing. Aula 3 - Atenção ...
Billy Nascimento
 
552124
552124552124
552124
Forum100plus
 
Презентация 100+ Forum Russia 2015
Презентация 100+ Forum Russia 2015Презентация 100+ Forum Russia 2015
Презентация 100+ Forum Russia 2015
Forum100plus
 
Aula 6 Prescricao De Exercicio E Treinamento Fisico
Aula 6   Prescricao De Exercicio E Treinamento FisicoAula 6   Prescricao De Exercicio E Treinamento Fisico
Aula 6 Prescricao De Exercicio E Treinamento Fisico
Felipe P Carpes - Universidade Federal do Pampa
 

Viewers also liked (16)

lista de chequeo y hoja de vida biblioteca 2
lista de chequeo y hoja de vida biblioteca 2lista de chequeo y hoja de vida biblioteca 2
lista de chequeo y hoja de vida biblioteca 2
 
Ad walter mondragon m.
Ad walter mondragon m.Ad walter mondragon m.
Ad walter mondragon m.
 
#Cybermedia2015 Ilaria Sangregorio
#Cybermedia2015 Ilaria Sangregorio#Cybermedia2015 Ilaria Sangregorio
#Cybermedia2015 Ilaria Sangregorio
 
Hoja de vida y lista de chequeo paula franpasescuela
Hoja de vida y lista de chequeo paula franpasescuelaHoja de vida y lista de chequeo paula franpasescuela
Hoja de vida y lista de chequeo paula franpasescuela
 
використання блогів в управлінні персоналом
використання блогів в управлінні персоналомвикористання блогів в управлінні персоналом
використання блогів в управлінні персоналом
 
Infortis news
Infortis newsInfortis news
Infortis news
 
#Cybermedia2015 Cenzina Salemi
#Cybermedia2015 Cenzina Salemi#Cybermedia2015 Cenzina Salemi
#Cybermedia2015 Cenzina Salemi
 
Equilibrio y le chatelier
Equilibrio y le chatelierEquilibrio y le chatelier
Equilibrio y le chatelier
 
Aumentando a sua produtividade com ZTD – Zen to Done
Aumentando a sua produtividade com ZTD – Zen to DoneAumentando a sua produtividade com ZTD – Zen to Done
Aumentando a sua produtividade com ZTD – Zen to Done
 
Health benefits of butter
Health benefits of butterHealth benefits of butter
Health benefits of butter
 
EUPATI 2013 Conference: Vision on Patient involvement in medicines R&D: Here...
EUPATI 2013 Conference: Vision on Patient involvement in medicines R&D: Here...EUPATI 2013 Conference: Vision on Patient involvement in medicines R&D: Here...
EUPATI 2013 Conference: Vision on Patient involvement in medicines R&D: Here...
 
Roadies
RoadiesRoadies
Roadies
 
Neurociência do consumo: Entendendo o que é Neuromarketing. Aula 3 - Atenção ...
Neurociência do consumo: Entendendo o que é Neuromarketing. Aula 3 - Atenção ...Neurociência do consumo: Entendendo o que é Neuromarketing. Aula 3 - Atenção ...
Neurociência do consumo: Entendendo o que é Neuromarketing. Aula 3 - Atenção ...
 
552124
552124552124
552124
 
Презентация 100+ Forum Russia 2015
Презентация 100+ Forum Russia 2015Презентация 100+ Forum Russia 2015
Презентация 100+ Forum Russia 2015
 
Aula 6 Prescricao De Exercicio E Treinamento Fisico
Aula 6   Prescricao De Exercicio E Treinamento FisicoAula 6   Prescricao De Exercicio E Treinamento Fisico
Aula 6 Prescricao De Exercicio E Treinamento Fisico
 

Similar to Reducing fragmentation for in line deduplication

International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)
IJERD Editor
 
Dominant block guided optimal cache size estimation to maximize ipc of embedd...
Dominant block guided optimal cache size estimation to maximize ipc of embedd...Dominant block guided optimal cache size estimation to maximize ipc of embedd...
Dominant block guided optimal cache size estimation to maximize ipc of embedd...
ijesajournal
 
Dominant block guided optimal cache size estimation to maximize ipc of embedd...
Dominant block guided optimal cache size estimation to maximize ipc of embedd...Dominant block guided optimal cache size estimation to maximize ipc of embedd...
Dominant block guided optimal cache size estimation to maximize ipc of embedd...
ijesajournal
 
High performance operating system controlled memory compression
High performance operating system controlled memory compressionHigh performance operating system controlled memory compression
High performance operating system controlled memory compression
Mr. Chanuwan
 
Power minimization of systems using Performance Enhancement Guaranteed Caches
Power minimization of systems using Performance Enhancement Guaranteed CachesPower minimization of systems using Performance Enhancement Guaranteed Caches
Power minimization of systems using Performance Enhancement Guaranteed Caches
IJTET Journal
 
Charm a cost efficient multi cloud data hosting scheme with high availability
Charm a cost efficient multi cloud data hosting scheme with high availabilityCharm a cost efficient multi cloud data hosting scheme with high availability
Charm a cost efficient multi cloud data hosting scheme with high availability
Pvrtechnologies Nellore
 
IRJET- A Futuristic Cache Replacement using Hybrid Regression
IRJET-  	  A Futuristic Cache Replacement using Hybrid RegressionIRJET-  	  A Futuristic Cache Replacement using Hybrid Regression
IRJET- A Futuristic Cache Replacement using Hybrid Regression
IRJET Journal
 
A Host Selection Algorithm for Dynamic Container Consolidation in Cloud Data ...
A Host Selection Algorithm for Dynamic Container Consolidation in Cloud Data ...A Host Selection Algorithm for Dynamic Container Consolidation in Cloud Data ...
A Host Selection Algorithm for Dynamic Container Consolidation in Cloud Data ...
IRJET Journal
 
E24099025李宇洋_專題報告.pdf
E24099025李宇洋_專題報告.pdfE24099025李宇洋_專題報告.pdf
E24099025李宇洋_專題報告.pdf
KerzPAry137
 
Robust Fault Tolerance in Content Addressable Memory Interface
Robust Fault Tolerance in Content Addressable Memory InterfaceRobust Fault Tolerance in Content Addressable Memory Interface
Robust Fault Tolerance in Content Addressable Memory Interface
IOSRJVSP
 
lecture-2-3_Memory.pdf,describing memory
lecture-2-3_Memory.pdf,describing memorylecture-2-3_Memory.pdf,describing memory
lecture-2-3_Memory.pdf,describing memory
floraaluoch3
 
P2P Cache Resolution System for MANET
P2P Cache Resolution System for MANETP2P Cache Resolution System for MANET
P2P Cache Resolution System for MANET
IJCSIS Research Publications
 
Investigations on Implementation of Ternary Content Addressable Memory Archit...
Investigations on Implementation of Ternary Content Addressable Memory Archit...Investigations on Implementation of Ternary Content Addressable Memory Archit...
Investigations on Implementation of Ternary Content Addressable Memory Archit...
IRJET Journal
 
Improving Direct-Mapped Cache Performance by the Addition of a Small Fully-As...
Improving Direct-Mapped Cache Performance by the Addition of a Small Fully-As...Improving Direct-Mapped Cache Performance by the Addition of a Small Fully-As...
Improving Direct-Mapped Cache Performance by the Addition of a Small Fully-As...
Mshari Alabdulkarim
 
Virtual SAN - A Deep Dive into Converged Storage (technical whitepaper)
Virtual SAN - A Deep Dive into Converged Storage (technical whitepaper)Virtual SAN - A Deep Dive into Converged Storage (technical whitepaper)
Virtual SAN - A Deep Dive into Converged Storage (technical whitepaper)
DataCore APAC
 
Assimilating sense into disaster recovery databases and judgement framing pr...
Assimilating sense into disaster recovery databases and  judgement framing pr...Assimilating sense into disaster recovery databases and  judgement framing pr...
Assimilating sense into disaster recovery databases and judgement framing pr...
IJECEIAES
 
IRJET - The 3-Level Database Architectural Design for OLAP and OLTP Ops
IRJET - The 3-Level Database Architectural Design for OLAP and OLTP OpsIRJET - The 3-Level Database Architectural Design for OLAP and OLTP Ops
IRJET - The 3-Level Database Architectural Design for OLAP and OLTP Ops
IRJET Journal
 
IRJET-A Review on Trends in Multicore Processor Based on Cache and Power Diss...
IRJET-A Review on Trends in Multicore Processor Based on Cache and Power Diss...IRJET-A Review on Trends in Multicore Processor Based on Cache and Power Diss...
IRJET-A Review on Trends in Multicore Processor Based on Cache and Power Diss...
IRJET Journal
 
Virtual SAN- Deep Dive Into Converged Storage
Virtual SAN- Deep Dive Into Converged StorageVirtual SAN- Deep Dive Into Converged Storage
Virtual SAN- Deep Dive Into Converged Storage
DataCore Software
 
Different Approaches in Energy Efficient Cache Memory
Different Approaches in Energy Efficient Cache MemoryDifferent Approaches in Energy Efficient Cache Memory
Different Approaches in Energy Efficient Cache Memory
Dhritiman Halder
 

Reducing fragmentation for in line deduplication

  • 1. Head office: 3nd floor, Krishna Reddy Buildings, OPP: ICICI ATM, Ramalingapuram, Nellore www.pvrtechnology.com, E-Mail: pvrieeeprojects@gmail.com, Ph: 81432 71457 Reducing Fragmentation for In-line Deduplication Backup Storage via Exploiting Backup History And Cache Knowledge ABSTRACT In backup systems, the chunks of each backup are physically scattered after deduplication, which causes a challenging fragmentation problem. We observe that the fragmentation falls into two categories: sparse containers and out-of-order containers. The sparse container decreases restore performance and garbage collection efficiency, while the out-of-order container decreases restore performance if the restore cache is small. In order to reduce the fragmentation, we propose the History-Aware Rewriting algorithm (HAR) and the Cache-Aware Filter (CAF). HAR exploits historical information in backup systems to accurately identify and reduce sparse containers, and CAF exploits restore cache knowledge to identify the out-of-order containers that hurt restore performance. CAF efficiently complements HAR in datasets where out-of-order containers are dominant. To reduce the metadata overhead of the garbage collection, we further propose a Container-Marker Algorithm (CMA) to identify valid containers instead of valid chunks. Our extensive experimental results from real-world datasets show HAR significantly improves the restore performance by 2.84-175.36× at a cost of only rewriting 0.5-2.03% of the data. Existing System:
  • 2. In backup systems, the chunks of each backup are physically scattered after deduplication, which causes a challenging fragmentation problem. We observe that the fragmentation falls into two categories: sparse containers and out-of-order containers. The sparse container decreases restore performance and garbage collection efficiency, while the out-of-order container decreases restore performance if the restore cache is small. Proposed System: In order to reduce the fragmentation, we propose the History-Aware Rewriting algorithm (HAR) and the Cache-Aware Filter (CAF). HAR exploits historical information in backup systems to accurately identify and reduce sparse containers, and CAF exploits restore cache knowledge to identify the out-of-order containers that hurt restore performance. CAF efficiently complements HAR in datasets where out-of-order containers are dominant. To reduce the metadata overhead of the garbage collection, we further propose a Container-Marker Algorithm (CMA) to identify valid containers instead of valid chunks. Our extensive experimental results from real-world datasets show HAR significantly improves the restore performance by 2.84-175.36× at a cost of only rewriting 0.5-2.03% of the data. We also propose a hybrid rewriting algorithm as a complement to HAR to reduce the negative impacts of out-of-order containers. HAR, as well as OPT, improves restore performance by 2.84-175.36× at an acceptable cost in deduplication ratio. HAR outperforms the state-of-the-art work in terms of both deduplication ratio and restore performance. The hybrid scheme further improves restore performance in datasets where out-of-order containers are dominant. To avoid a significant decrease of deduplication ratio in the hybrid scheme, we develop a Cache-Aware Filter (CAF) to exploit cache knowledge.
With the help of CAF, the hybrid scheme significantly improves the deduplication ratio without decreasing the restore performance. Note that CAF can be used as an optimization of existing rewriting algorithms.
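The History-Aware Rewriting steps described in this transcript (load the inherited sparse containers, rewrite duplicates that live in them, then record the emerging sparse containers for the next backup) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names `ContainerWriter` and `har_backup`, the `(fingerprint, size)` chunk representation, and the definition of utilization as bytes referenced by the backup divided by container capacity are all assumptions made for the example. The rewrite limit is left out.

```python
class ContainerWriter:
    """Hypothetical helper: appends chunks to fixed-size containers."""

    def __init__(self, container_size, next_id=0):
        self.container_size = container_size
        self.current_id = next_id
        self.used = 0

    def append(self, size):
        # Open a new container when the current one cannot hold the chunk.
        if self.used > 0 and self.used + size > self.container_size:
            self.current_id += 1
            self.used = 0
        self.used += size
        return self.current_id


def har_backup(chunks, index, s_inherited, writer, threshold):
    """Run one backup and return (recipe, rewritten_bytes, s_emerging).

    chunks      -- list of (fingerprint, size) pairs in stream order
    index       -- dict fingerprint -> container ID (updated in place)
    s_inherited -- container IDs found sparse by the previous backup
    threshold   -- utilization threshold (assumed to be a fraction in 0..1)
    """
    referenced = {}  # container ID -> bytes referenced by this backup
    seen = set()     # fingerprints already counted in this backup
    recipe = []
    rewritten = 0
    for fingerprint, size in chunks:
        if fingerprint not in seen:
            seen.add(fingerprint)
            cid = index.get(fingerprint)
            if cid is None:
                # Unique chunk: store it in the open container.
                cid = writer.append(size)
            elif cid in s_inherited:
                # Duplicate chunk in an inherited sparse container: rewrite it.
                cid = writer.append(size)
                rewritten += size
            index[fingerprint] = cid
            referenced[cid] = referenced.get(cid, 0) + size
        recipe.append((fingerprint, index[fingerprint]))
    # Drop records whose utilization exceeds the threshold; the remaining
    # container IDs are the emerging sparse containers.
    s_emerging = {
        cid
        for cid, used in referenced.items()
        if used / writer.container_size <= threshold
    }
    return recipe, rewritten, s_emerging
```

In use, the `s_emerging` set returned by one call would be passed as `s_inherited` to the next call, which mirrors flushing S_emerging to disk as the S_inherited of the next backup.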
  • 3. Architecture: MODULES: Storage system, Data Deduplication, Chunk fragmentation,
  • 4. Performance evaluation. Storage system: Cloud storage is a model of data storage where the digital data is stored in logical pools, the physical storage spans multiple servers (and often locations), and the physical environment is typically owned and managed by a hosting company. These cloud storage providers are responsible for keeping the data available and accessible, and the physical environment protected and running. People and organizations buy or lease storage capacity from the providers to store end user, organization, or application data. Data Deduplication: After deduplication across multiple backups, the chunks of a backup unfortunately become physically scattered in different containers, which is known as fragmentation. The negative impacts of the fragmentation are two-fold. First, the fragmentation severely decreases restore performance. Although restores are infrequent, they are important and the main concern of users. Moreover, data replication, which is important for disaster recovery, requires reconstructions of original backup streams from deduplication systems and thus suffers from a performance problem similar to the restore operation. Second, the fragmentation results in invalid chunks (not referenced by any backups) becoming physically scattered in different
  • 5. containers when users delete expired backups. Existing garbage collection solutions first identify valid chunks and the containers holding only a few valid chunks (i.e., reference management). Then, a merging operation is required to copy the valid chunks in the identified containers to new containers. Finally, the identified containers are reclaimed. Unfortunately, the metadata space overhead of reference management is proportional to the number of chunks, and the merging operation is the most time-consuming phase in garbage collection. We observe that the fragmentation comes in two categories of containers: sparse containers and out-of-order containers, which have different negative impacts and require dedicated solutions. Chunk fragmentation: The fragmentation problem in deduplication systems has received much attention. iDedup eliminates sequential and duplicate chunks in the context of primary storage systems. Nam et al. propose a quantitative metric to measure the fragmentation level of deduplication systems, and a selective deduplication scheme for backup workloads. SAR stores hot chunks in SSD to accelerate reads. RevDedup employs a hybrid inline and out-of-line deduplication scheme to improve restore performance of the latest backups. The Context-Based Rewriting algorithm (CBR) [17] and the capping algorithm (Capping) are recently proposed rewriting algorithms that address the fragmentation problem. Both of them buffer a small part of the ongoing backup stream during a backup, and identify fragmented chunks within the buffer (generally 10-20 MB). For example, Capping divides the backup stream into fixed-sized segments (e.g., 20 MB), and conjectures the fragmentation within each segment. Capping limits the
  • 6. maximum number (say T) of containers a segment can refer to. If a new segment refers to N containers and N > T, the chunks in the N − T containers that hold the fewest chunks of the segment are rewritten. Reference management for the garbage collection is complicated since each chunk can be referenced by multiple backups. The offline approaches traverse all fingerprints (including the fingerprint index and recipes) when the system is idle. For example, Botelho et al. [14] build a perfect hash vector as a compact representation of all chunks. Since recipes occupy significantly large storage space, the traversing operation is time-consuming. The inline approaches maintain additional metadata during backup to facilitate the garbage collection. Maintaining a reference counter for each chunk is expensive and error-prone. Grouped Mark-and-Sweep (GMS) uses a bitmap to mark which chunks in a container are used by a backup. Performance evaluation: Four datasets, including Kernel, VMDK, RDB, and Synthetic, are used for evaluation. Their characteristics are listed in Table 1. Each backup stream is divided into variable-sized chunks via Content-Defined Chunking. Kernel, downloaded from the web, is a commonly used public dataset. It consists of 258 consecutive versions of unpacked Linux source code. Each version is 412.78 MB on average. Two consecutive versions are generally 99% identical except when there are major revision upgrades. There are only a few self-references and hence sparse containers are dominant. VMDK is from a virtual machine running Ubuntu 12.04 LTS, which is a common real-world use case. We compiled source code, patched the system, and ran an HTTP server on the
  • 7. virtual machine. VMDK consists of 126 full backups. Each full backup is 15.36 GB in size on average, and 90-98% identical to its adjacent backups. Each backup contains about 15% self-referred chunks. Out-of-order containers are dominant and sparse containers are less severe. RDB consists of snapshots of a Redis database. The database has 5 million records and occupies 5 GB of space. We ran YCSB to update the database following a Zipfian distribution. The update ratio is 1% on average. After each run, we archived the uncompressed dump.rdb file, which is the on-disk snapshot of the database. In total, we collected 212 versions of snapshots. There is no self-reference, and hence sparse containers are dominant. Synthetic was generated according to existing approaches. We simulated common file system operations, such as file creation, deletion, and modification. We finally obtained a 4.5 TB dataset with 400 versions. There is no self-reference in Synthetic, and sparse containers are dominant.
History-Aware Rewriting Algorithm
  • 8. At the beginning of a backup, HAR loads the IDs of all inherited sparse containers to construct the in-memory Sinherited structure. During the backup, HAR rewrites all duplicate chunks whose container IDs exist in Sinherited. Additionally, HAR maintains an in-memory structure, Semerging (included in the collected info in Figure 3), to monitor the utilizations of all the containers referenced by the backup. Semerging is a set of utilization records, and each
  • 9. record consists of a container ID and the current utilization of the container. After the backup concludes, HAR removes from Semerging the records whose utilizations are higher than the utilization threshold. Semerging then contains the IDs of all emerging sparse containers. In most cases, Semerging can be flushed directly to disk as the Sinherited of the next backup, because the size of Semerging is generally small due to our second observation. However, there are two spikes in . A large number of emerging sparse containers indicates that many fragmented chunks would have to be rewritten in the next backup. This would shift the performance bottleneck to data writing and hurt backup performance, which is of top priority. To address this problem, HAR sets a rewrite limit, such as 5%, to avoid too many rewrites in the next backup. HAR uses the rewrite limit to determine whether there are too many sparse containers in Semerging. (1) HAR calculates an estimated rewrite ratio (defined as the size of rewritten data divided by the backup size) for the next backup. Specifically, HAR first calculates the
System Configuration:
HARDWARE REQUIREMENTS:
Hardware - Pentium
Speed - 1.1 GHz
RAM - 1 GB
  • 10. Hard Disk - 20 GB
Floppy Drive - 1.44 MB
Keyboard - Standard Windows Keyboard
Mouse - Two or Three Button Mouse
Monitor - SVGA
SOFTWARE REQUIREMENTS:
Operating System : Windows
Technology : Java and J2EE
Web Technologies : HTML, JavaScript, CSS
IDE : MyEclipse
Web Server : Tomcat
Tool kit : Android Phone
Database : MySQL
Java Version : J2SDK1.5
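As a rough illustration of the History-Aware Rewriting Algorithm described earlier, the per-backup bookkeeping might be sketched in Python as follows. This is only a minimal sketch under stated assumptions, not the authors' implementation: the class and method names, the 4 MB container size, and the 50% utilization threshold are hypothetical, and only duplicate chunks that stay deduplicated are tracked in Semerging. Because the description of the rewrite-ratio calculation breaks off in the text, the estimate used here (bytes referenced in emerging sparse containers divided by the current backup size) is also an assumption.

```python
CONTAINER_SIZE = 4 * 1024 * 1024  # assumed container capacity in bytes


class HarBackup:
    """Per-backup HAR bookkeeping (hypothetical names, illustrative only)."""

    def __init__(self, s_inherited, utilization_threshold=0.5, rewrite_limit=0.05):
        # IDs of sparse containers recorded by the previous backup (Sinherited).
        self.s_inherited = set(s_inherited)
        self.utilization_threshold = utilization_threshold  # assumed default
        self.rewrite_limit = rewrite_limit  # "such as 5%" in the text
        self.referenced_bytes = {}  # container ID -> bytes this backup references
        self.backup_size = 0
        self.rewritten_size = 0

    def on_unique_chunk(self, chunk_size):
        """Account for a chunk that is not a duplicate."""
        self.backup_size += chunk_size

    def on_duplicate_chunk(self, container_id, chunk_size):
        """Return True if the duplicate chunk should be rewritten."""
        self.backup_size += chunk_size
        if container_id in self.s_inherited:
            # The chunk lives in an inherited sparse container: rewrite it.
            self.rewritten_size += chunk_size
            return True
        # Otherwise keep the reference and update the utilization record.
        self.referenced_bytes[container_id] = (
            self.referenced_bytes.get(container_id, 0) + chunk_size
        )
        return False

    def finish(self):
        """Build Semerging: drop records above the utilization threshold."""
        s_emerging = {}
        for container_id, used in self.referenced_bytes.items():
            utilization = used / CONTAINER_SIZE
            if utilization <= self.utilization_threshold:
                s_emerging[container_id] = utilization
        return s_emerging

    def estimated_rewrite_ratio(self, s_emerging):
        """Assumed estimate: bytes held in sparse containers / backup size."""
        if self.backup_size == 0:
            return 0.0
        to_rewrite = sum(u * CONTAINER_SIZE for u in s_emerging.values())
        return to_rewrite / self.backup_size
```

A caller would feed each chunk of the backup stream to `on_unique_chunk` or `on_duplicate_chunk`, call `finish()` when the backup concludes, and compare `estimated_rewrite_ratio(...)` against `rewrite_limit` before persisting the result as the next backup's Sinherited.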