Data deduplication
• Deduplication
  – Detect and eliminate duplicated data
• Deduplication overhead
  – Disk fragmentation (increases read latency), data comparison cost (increases write latency)
• Mitigating deduplication overhead
  – Decentralize the dedup process
  – Cache data fingerprints (hashes) — see the sketch after this list
  – Use a large dedup unit size
    • A whole file, a sequence of data blocks, or a block size larger than 4 KB
    • (but this lowers the dedup ratio)