Comparing HDFS & ASM
Jason Arneil
Copyright © 2017 Accenture All rights reserved.
“Platform little more than 'skunkworks' outside tech industries”
— John Mertic, Open Data Platform Initiative
Data Growth
Explosive data growth well known
Many Exabytes of data created every day
Disk Size Increases
HDD size has increased
Cost per GB decreased
Disk Read Speed
Sequential read speed has not improved at same rate
Time to read entire 8TB drive is circa 12 hours!
Space Consumption
Lots of Drives = More drive failures
Need to store redundant copies of data
Rebalancing
Failure Rate
1,000 drives means roughly one drive failure per week
Data Protection
Could use hardware RAID
HDFS & ASM give protection via software
HDFS History
HDFS part of Hadoop
HDFS is 10 years old
HDFS History
Hadoop ecosystem builds on HDFS
ASM History
ASM released in 2003
ASM
Commodity
HDFS designed to run on large number of nodes
HDFS is a distributed filesystem written in Java
HDFS Goals
Designed for very large files
Designed for sequential access
Non-HDFS Use Cases
Low latency
Lots of small files
Metadata
File system metadata critical
ASM Instance
ASM Instance manages metadata
ASM Architecture
Node Types
[Diagram: a client JVM running the HDFS client talks to the Name Node and to DataNodes dn-1 through dn-9]
Flex ASM Architecture
NameNode RAM
[Diagram: NameNode RAM holds the hierarchical namespace (/, users, apps, jfa, spark, hive), the block manager mapping blocks to DataNodes (e.g. blk_123 on dn-1, dn-2, dn-3; blk_456 on dn-7, dn-8, dn-9), and the list of live DataNodes, each reporting heartbeat, disk used and disk free]
Namespace Durability
[Diagram: the NameNode persists the namespace as an image file, written via periodic checkpoints, plus an edit log of changes]
Formatting
hdfs namenode -format
Blocks
HDFS block size 128MB by default
Blocks
Client Access
DataNode
NameNode Resilience
ASM Resilience
[Diagram: Flex ASM – a cluster pool of storage with shared disk groups (Disk Group A, Disk Group B) and wide file striping; databases share ASM instances, so for example the database instance on Node1 runs as an ASM client to the ASM instance on Node2 or Node4, Node2 to Node3, and Node5 to Node4]
NameNode Backup
Preventing NameNode SPOF
[Diagram: the active Name Node writes its edit log and image both locally and to an NFS share; the standby Name Node reads the same NFS copy]
Secondary NameNode
NameNode HA
Quorum Journal Manager
NameNode Failover & Fencing
File Permissions
POSIX-like
Replica Placement
[Diagram: block replicas spread across Rack 1, Rack 2 and Rack 3]
Database I/O
Reading Data
[Diagram: HDFS read path – the client JVM's HDFS client opens the file via the DistributedFileSystem, which asks the Name Node (image, checkpoint, edit log) for block locations, then an FSDataInputStream reads the blocks directly from DataNodes dn-1, dn-2 and dn-3, each of which reports heartbeat, disk used and disk free to the Name Node]
Writing Data
[Diagram: HDFS write path – the client JVM's HDFS client creates the file via the DistributedFileSystem, the Name Node allocates blocks, and an FSDataOutputStream writes to the first DataNode; the write is pipelined dn-1 → dn-2 → dn-3 with acks flowing back to the client]
Rebalancing
Rebalancing
ASM Rebalance
alter diskgroup rebalance
HDFS Balancer
Rebalancing
start-balancer.sh
HDFS Balancer
dfs.datanode.balance.bandwidthPerSec
Bit Rot
DataBlockScanner
CheckSum
What’s Coming
Hadoop 3.0
Erasure Coding
Hadoop 3.0
Intra-datanode balancer
Hadoop 3.0
More than 2 NameNodes
Conclusion
Conclusion
Questions?
http://hadoop.apache.org/docs/current/
Hadoop: The Definitive Guide
Tom White

Editor's Notes

  1. BIG DATA is all the rage – almost as popular as Cloud. This is where we are dealing with datasets in the hundreds of TBs to petabytes, and using 100s to 1,000s of CPUs in parallel to process this data, aggregating the power of many servers as a single resource. The idea of this presentation is to show how concepts you are familiar with (in ASM) carry over to the world of HDFS. As DBAs & systems folks you are in prime position to manage this coming wave.
  2. My name is Jason Arneil. I've been in IT for around 18 years, both as an Oracle DBA and a system administrator; the last 4½ years exclusively on the Exadata platform. I'm really just dipping my toes in the Big Data world – it is all the rage though! I've blogged a bit in the past, you can find me on Twitter, and I became an Oracle ACE a couple of years ago. I now work in the Accenture Enkitec Group.
  3. I was quite struck when I saw this quote last month. To me that smells of opportunity.
  4. Exponential data growth is a well-known phenomenon. Many exabytes are stored every day worldwide. This creates a storage problem.
  5. This does help us store more data. 8TB is now a fairly standard enterprise HDD.
  6. Speed is at very best roughly 200MB/s. So to be able to run analysis on 10s or 100s of TBs of data in a reasonable time frame, you are going to need LOTS of drives – 100s or 1,000s of them. The more concurrency you have, the more drives you will need.
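     (A rough sanity check for the "circa 12 hours" figure on the slide: 8 TB ÷ 200 MB/s ≈ 40,000 seconds, i.e. roughly 11 hours for a single sequential pass over one drive.)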
  7. More drives lead to more drive failures, so we need a mechanism to protect our data from drive failure. Storing redundant copies of data actually leads to even more drives being used – and more drive failures.
  8. Cloud Storage company Backblaze have over 50,000 drives in their DataCenters They publish drive reliability stats from this real world situation – in a proper Air conditioned DC While Drive failure varies with age, their average failure rate was going on 5% Source: https://www.backblaze.com/blog/hard-drive-reliability-q3-2015/
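     (Roughly: at a ~5% annual failure rate, 1,000 drives gives ~50 failures a year, i.e. about one failed drive every week – which is where the earlier slide's figure comes from.)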
  9. We have to have some way of protecting our data. Hardware RAID is an expensive solution – particularly at 100s of TB – and doesn't provide data locality for analysing the data: transferring huge quantities of data to servers would be a massive bottleneck. Analysis of huge amounts of data is more efficient if executed near the data it is operating on.
  10. Hadoop Distributed File System (HDFS™). Hadoop is an open source project from the Apache Software Foundation. It has had a reasonable amount of time to develop, evolve and mature – but filesystems generally have a long (multi-decade) lifespan. It has its roots at Google – though the elephant logo is from a toy owned by the son of Yahoo engineer Doug Cutting. Note the Distributed part – a filesystem that manages storage across a range of machines; that is, the storage of those individual machines is presented as an aggregate.
  11. You can think of various layers in the hadoop world With storage as the base layer Followed by a method of allocating resources and scheduling tasks across the cluster – Yet Another resource Negotiator Then various applications used for data analysis that can take advantage of these Hadoop scales computation, storage and I/O bandwidth
  12. ASM has it’s genesis all the way back in 1996 – initial problem that led to it was related to video streaming! Took 7 years from initial idea to released product ASM is a clustered filesystem – not a distributed filesystem Design goal was to be able to stripe data across 1000’s of disks It would also be fault tolerant
  13. HDFS is designed to be portable from one platform to another and to run on commodity hardware. A key goal is linear scalability in both data size and compute resources: doubling the number of nodes should halve processing time on the same volume of data; likewise doubling the data volume and the number of nodes should result in constant processing time. Essentially it uses a divide and conquer approach. You can buy it from Oracle – it runs on the Big Data Appliance.
  14. “Very large” here means files that are hundreds of megabytes, gigabytes, or terabytes in size. Petabyte-sized clusters are not unheard of. HDFS makes it easy to store large files: it optimises sequential reading of data over latency. It's likely on HDFS that analysis will read a large percentage of the entire dataset – very different from typical RDBMS usage – so reading most of the dataset efficiently is more important than the latency of reading the first record. HDFS applications need a write-once-read-many access model for files: a file once created, written, and closed need not be changed except for appends and truncates. You can append to a file, but cannot update at an arbitrary point.
  15. HDFS not designed for low latency access Lots of small files does not scale well on HDFS
  16. metadata is critical to the operation of a filesystem Essentially you can’t access the files stored without the metadata
  17. When using an Oracle database with ASM we have an ASM instance in addition to the database instance. This ASM instance has a small portion of the RDBMS code. ASM manages the metadata for the datafiles. Note the metadata is stored (and protected) with the data in the diskgroups.
  18. ASM architecture up to 12c looks like this: a database on every node, an ASM instance on every node, all accessing the same underlying drives where the data is. There is only one type of node and all nodes are identical.
  19. When it comes to the world of HDFS we have two types of nodes. NameNode: minimum of 1, mostly 2 for redundancy – we'll come on to that. The namenode manages the filesystem namespace, maintains the filesystem tree and metadata for all files and directories, and regulates access to files by clients. The other type of node is the datanode. There are many datanodes in a cluster – these are where the data is stored and where the computations and analysis are executed. These are all just standard servers – likely spread across multiple racks in the datacenter because you have so many of them. Datanodes are responsible for serving read/write operations from HDFS clients; they also perform block creation/deletion and replication upon instruction from the namenode.
  20. But how different is it really from the 12c Flex ASM architecture? Here you no longer have ASM instances running on all nodes, and DB instances can run on nodes that don't have ASM instances. Think of ASM as the “namenodes” – managing the metadata – and the databases as clients of the ASM instances. The analogy is even better if you think about Exadata, where the storage is on standard servers running Linux and where even some computation normally done at the database is offloaded to the storage. Hadoop extends this idea all the way – all computation is done where the storage resides.
  21. The NameNode is critical in HDFS. Metadata is stored persistently on disk on the namenode in 2 files: the namespace image – the namespace is the hierarchy of files and directories – plus the edit log. The metadata is decoupled from the data. The namenode also knows on which datanodes all blocks for a given file reside – remember the same block will exist on multiple datanodes. Block locations are not stored permanently on the namenode – this info can be reconstructed as datanodes provide periodic block reports. This is held in memory, and with many files this can become the limiting factor for scalability. Though we can federate the namespace – so multiple namenodes each manage a portion of the filesystem – this is NOT HA. DataNodes send heartbeats every 3 secs; no heartbeat in 10 mins and the node is presumed dead, and the namenode schedules re-replication of lost replicas.
  22. Durability of namespace maintained by write-ahead journal and checkpoints Journal transactions persisted into edit log before replying to client This records every change that occurs to file system metadata The entire file system namespace, including the mapping of blocks to files and file system properties, is stored in a file called the FsImage. Checkpoints periodically written to image file Block locations discovered from DataNodes via block reports – these are NOT persisted on NameNode This can lead to slow startup times of namenode
  23. Creating a diskgroup in ASM implicitly creates the filesystem. Size is not specified and data is spread evenly across all disks. A new HDFS installation needs to be formatted. The formatting process creates an empty filesystem by creating the storage directories and the initial versions of the namenode's persistent data structures. Datanodes are not involved in the initial formatting process as the namenode manages the filesystem metadata. You don't need to say how large a filesystem to create as it's determined by the number of members of the cluster, so filesystem size can be increased with additional cluster members long after creation.
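     A minimal sketch of that initial step, plus a check that datanodes have joined (output and cluster details are illustrative, not from the slides):
        hdfs namenode -format      # creates the empty namespace: fsimage + edit log on the namenode
        hdfs dfsadmin -report      # once datanodes register, shows live nodes and total/remaining capacity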
  24. A disk has a block size – 512 bytes typically, or 4K (modern) – the minimum amount of data that can be read/written. A filesystem data block is a multiple of the disk block size, typically a few KB in size. In ASM, files are written as a collection of extents. Extents are multiples of Allocation Units, typically going from 1MB up to 64MB but they can be set higher. HDFS as a filesystem has the concept of a block – 128MB by default, but it is configurable. Files in HDFS are broken into block-sized chunks stored as independent units. A file smaller than a full block DOES NOT occupy a full block of space. The reason for such a large block size is to minimise seek costs.
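     If you do want a non-default block size, it is just an hdfs-site.xml property – the sketch below sets the (already default) 128MB; the byte value is purely illustrative:
        <property>
          <name>dfs.blocksize</name>
          <value>134217728</value>  <!-- 128 MB -->
        </property>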
  25. Having a block abstraction enables a file to span multiple disks Nothing to require all blocks from the same file to be on same drive Blocks are fixed size which simplifies metadata management – metadata don’t need to be stored with blocks Easy to calculate how many blocks can fit on a disk Block concept also useful when it comes to replication and fault tolerance
  26. A client accesses the filesystem on behalf of a user by communicating with the namenode and datanodes. The client can present a POSIX-like filesystem to the user – user code does not need to know about namenodes/datanodes to function. HDFS interaction is mediated through a Java API. You can also interact with the filesystem via an HTTP REST API – but it is slower than Java – and there is also a C library.
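     A hedged example of the non-Java route – listing a directory over the WebHDFS REST API (hostname, port and path are assumptions; the namenode HTTP port is commonly 50070 on Hadoop 2):
        curl -i "http://namenode.example.com:50070/webhdfs/v1/user/jfa?op=LISTSTATUS"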
  27. Datanode is workhorse of HDFS Store and retrieve blocks when told by clients or namenode Report back periodically to namenode with lists of blocks they are storing
  28. The filesystem cannot function without the NAMENODE. If the namenode were destroyed, all files on the filesystem would be lost, as there is no way of reconstructing files from the blocks on the datanodes. It is vital to ensure resilience of the namenode. There are a number of different options for ensuring NameNode resilience. ASM is way ahead in terms of resilience.
  29. With non-Flex ASM, if we lose an ASM instance we only lose the DBs on that node; other nodes keep working – a definite advantage of cluster technology. And it's even better with Flex ASM: if we lose an ASM instance all databases can carry on processing. The ASM instance can also relocate if a node fails.
  30. What we need to protect is the edit log and the image file. Hadoop can be configured to ensure the namenode writes persistent metadata to multiple filesystems. These are synchronous and atomic writes. The usual choice is to write to local disk and an NFS mount.
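     As a sketch, that dual write is just a comma-separated list in hdfs-site.xml (the local and NFS paths here are made up):
        <property>
          <name>dfs.namenode.name.dir</name>
          <value>file:///data/hdfs/namenode,file:///mnt/nfs/hdfs/namenode</value>
        </property>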
  31. The active name node writes updates both locally and to the NFS share; the standby name node also has access to the NFS share. Even with a secondary namenode it won't be able to service requests until (1) the namespace image is loaded into memory, (2) the edit log is replayed, and (3) it has received enough block reports from the datanodes. On a decent sized cluster this could be 30 mins! – not really high availability.
  32. One step up the availability ladder is to run a secondary namenode. It does NOT act as a namenode – it does not serve requests. Its job is to merge the namespace image with the edit log. A copy of this merged namenode image can be used if the primary namenode fails. Note this has a data lag, so some data can be lost. This is still not high availability, and it causes problems for routine maintenance and planned downtime.
  33. Previous options does not provide high availability of the filesystem HA can be accomplished with a pair of namenodes in Active-Standby configuration Standby can take over from Active node without significant delay Namenodes MUST have highly available shared storage Datanodes MUST send block reports to both nodes – Remember block mappings stored in MEMORY on namenode Clients must be enabled to handle namenode failover If active namenode fails Standby can take over quickly because it has latest state available in memory Both edit log and block mappings Can use NFS filer or a Quorum Journal Manager (QJM) QJM is recommended choice
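     A minimal sketch of the QJM-based HA wiring in hdfs-site.xml (nameservice, namenode IDs and journal node hosts are all illustrative; a full setup also needs RPC addresses, a failover proxy provider and ZooKeeper details):
        <property><name>dfs.nameservices</name><value>mycluster</value></property>
        <property><name>dfs.ha.namenodes.mycluster</name><value>nn1,nn2</value></property>
        <property>
          <name>dfs.namenode.shared.edits.dir</name>
          <value>qjournal://jn1:8485;jn2:8485;jn3:8485/mycluster</value>
        </property>
        <property><name>dfs.ha.automatic-failover.enabled</name><value>true</value></property>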
  34. QJM is a dedicated HDFS implementation Solely designed for purpose of providing HA for edit log QJM runs a group of journal nodes Each edit must be written to a majority of these
  35. Transition managed by a failover controller Default implementation uses Zookeeper to ensure only 1 namenode active Each namenode runs a heartbeat process Can’t have active-active as we don’t have a cluster filesystem – can’t have multiple nodes writing to same file Previously active namenode can be fenced – can use STONITH
  36. HDFS has a permission model for files and directories that is very POSIX-like. There are 3 types of perms: r, w, x. X is ignored for a file (no concept of executing a file) but is needed for directory access. Each file has an owner, group and mode (mode is the perms for owner, group, and others). Note by default Hadoop runs with security disabled.
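     For example (user, group and path are made up), the usual POSIX-style operations exist on the hdfs CLI:
        hdfs dfs -chown jfa:analysts /user/jfa/reports
        hdfs dfs -chmod 750 /user/jfa/reports
        hdfs dfs -ls /user/jfa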
  37. Placement of replicas is critical to HDFS reliability and performance. The purpose of a rack-aware replica placement policy is to improve data reliability, availability, and network bandwidth utilization. HDFS's placement policy is to put one replica on one node in the local rack, another on a different node in a different rack, and the last on a different node in the same rack as the previous one. This policy cuts the inter-rack write traffic, which generally improves write performance. The chance of rack failure is far less than that of node failure – so it doesn't reduce data availability – and it does reduce the aggregate network bandwidth used when writing data, since a block is placed in only two unique racks rather than three. As long as you have an even chance of starting with a node in each rack, the data will be evenly distributed across all racks.
  38. I/O from the database goes direct to the disks; it does not go via ASM.
  39. To read a block client requests the list of replica locations from NameNode For each block, the namenode returns the addresses of the datanodes that have a copy of that block. Client caches replica locations Datanode Locations sorted by proximity to client Data read from the dataNodes
  40. A client request to create a file does not reach the NameNode immediately HDFS client caches the file data into a temp local file Application writes are transparently redirected to this temp local file Once local file accumulates data worth over one HDFS block size, client contacts NameNode NameNode inserts the file name into the file system hierarchy and allocates a data block for it client flushes the block of data from the local temporary file to the first DataNode in small portions First Datanode sends the portions to the second datanode Second datanode sends to third Data is pipelined from one DataNode to the next. Data nodes tell namenode which blocks they have via block reports
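     A small sketch of that flow from the command line (file name and path are illustrative) – fsck then shows which datanodes ended up holding each block:
        hdfs dfs -put sales.csv /user/jfa/sales.csv
        hdfs fsck /user/jfa/sales.csv -files -blocks -locations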
  41. HDFS & ASM BOTH work best when blocks of a file are spread evenly across all disks This gives best I/O performance
  42. In ASM if new disks are added (or dropped) A rebalance can ensure the data is evenly spread across all the disks
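     For example (diskgroup name and power level are illustrative, not from the slides):
        SQL> ALTER DISKGROUP data REBALANCE POWER 8;
        SQL> SELECT operation, state, est_minutes FROM v$asm_operation;   -- watch progress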
  43. The balancer program is a Hadoop daemon that redistributes blocks, moving blocks from overutilized datanodes to underutilized ones while still adhering to block replica placement policies. “A cluster is deemed to be balanced, which means that the utilization of every datanode (ratio of used space on the node to total capacity of the node) differs from the utilization of the cluster (ratio of used space on the cluster to total capacity of the cluster) by no more than a given threshold percentage.”
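     Typically started as sketched below; the threshold (percentage points of allowed deviation from the cluster average) is an illustrative value:
        start-balancer.sh -threshold 10      # or equivalently: hdfs balancer -threshold 10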
  44. Only one balancer operation may run on cluster at one time Balancer designed to run in background Limits bandwidth used to move blocks around
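     The bandwidth cap is the hdfs-site.xml property from the slide; it can also be adjusted at runtime – the 100MB/s value below is just an example:
        <property>
          <name>dfs.datanode.balance.bandwidthPerSec</name>
          <value>104857600</value>  <!-- bytes/sec, i.e. 100 MB/s -->
        </property>
        hdfs dfsadmin -setBalancerBandwidth 104857600   # runtime override, no datanode restart needed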
  45. Explain Bit rot As organisations store more data the possibility of silent disk corruptions grows Can set the CONTENT.CHECK attribute on a diskgroup to ensure a rebalance will perform this logical content checking “each datanode runs a DataBlockScanner in a background thread that periodically verifies all the blocks stored on the datanode. This is to guard against corruption due to “bit rot” in the physical storage media” “Because HDFS stores replicas of blocks, it can “heal” corrupted blocks by copying one of the good replicas to produce a new, uncorrupt replica”
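     On the ASM side, switching on that logical content checking is a diskgroup attribute (diskgroup name is illustrative; available from 12.1):
        SQL> ALTER DISKGROUP data SET ATTRIBUTE 'content.check' = 'TRUE';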
  46. HDFS checksums all data written to it, and by default when reading data A separate checksum is created for every 512 bytes (by default) CRC is 4 bytes long less than 1% storage overhead When clients read data from datanodes checksum verified
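     You can also ask for a file's checksum explicitly (path is illustrative); the checksum interval is governed by dfs.bytes-per-checksum, 512 bytes by default:
        hdfs dfs -checksum /user/jfa/sales.csv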
  47. One thing about Hadoop, in addition to all the whacky names for things, is that the pace of change is phenomenal in comparison to the old school RDBMS world. I wanted to show a couple of snazzy things that are coming with the next HDFS release.
  48. It’s pretty inefficient space wise having to store 3 copies of the same data Just to guarantee protection for your data Erasure Coding is a way of encoding data so that the original data can be recovered with just a subset of the original It sounds awfully similar to RAID 5/6 but with parity stored with the data not a separate device Should consume way less space than triple mirroring with similar failure rates However this will trade CPU cycles for space gains
  49. A single DataNode manages multiple disks. During normal write operation, disks will be filled up evenly. However, adding or replacing disks can lead to significant skew within a DataNode. This situation is not handled by the existing HDFS balancer, which concerns itself with inter-DataNode skew – that is, BETWEEN different data nodes – not intra-DataNode skew, i.e. between the disks within a data node!
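     The new intra-datanode tool works in two steps, roughly as below (hostname and plan file are placeholders):
        hdfs diskbalancer -plan dn-1.example.com          # writes a move plan as JSON
        hdfs diskbalancer -execute <plan-file>.plan.json  # applies the plan on that datanode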
  50. With Hadoop 3 you can increase the availability of your cluster by having an increased number of namenodes.
  51. It might even be the case that the Big Data world evolves so fast that HDFS ends up being superseded. A new kid on the storage block is Kudu, which takes the best of HDFS sequential performance along with low-latency random access.
  52. You may have heard enough by now As DBAs and Systems Folks HDFS is likely to feature in your organisations And we are likely to be the folks managing that infrastructure So best to be prepared!
  53. Put a link and a book recommendation Questions?
  54. “DEFLATE is a compression algorithm whose standard implementation is zlib.” Gzip is normally used to produce DEFLATE-format files. The concept of splittable is very important – a splittable format allows you to seek to any point in the stream. A non-splittable file format will have to have all its blocks processed by the same process – rather than by distributed processes.