HDFS HA using Journal Nodes

Evans Ye

Introducing Journal Nodes
Manual Failover
7/26/2013 Copyright 2013 Trend Micro Inc.

Architecture
[Diagram: JournalNodes JN 1, JN 2, JN 3; Active NameNode (NN) and Standby NameNode (NN); four DataNodes (DN); block locations map]

JournalNodes’ job
• When the Active NameNode performs any namespace modification, it durably logs a record of the modification to a majority of the JNs.
• The Standby reads the edits from the JNs and applies them to its own namespace.
[Diagram: the Active NN sends edits to JN 1, JN 2, and JN 3; the Standby NN reads the edits from the JNs; label "Safe Mode"]
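The write and read paths described on this slide can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the HDFS implementation: `Journal`, `log_to_quorum`, `tail_edits`, and `apply_edit` are hypothetical names, the namespace is just a set of paths, and recovery of partially written edits is not modelled.

```python
class Journal:
    """One JournalNode's edit log, reduced to an in-memory list (hypothetical)."""

    def __init__(self):
        self.edits = []   # list of (txid, edit) pairs
        self.up = True

    def append(self, txid, edit):
        if not self.up:
            return False
        self.edits.append((txid, edit))
        return True


def log_to_quorum(journals, txid, edit):
    """Active side: the edit counts as durable once a majority of JNs ack it."""
    acks = sum(j.append(txid, edit) for j in journals)
    return acks > len(journals) // 2


def tail_edits(journals, last_applied_txid):
    """Standby side: collect edits newer than last_applied_txid, in txid order."""
    found = {}
    for j in journals:
        for txid, edit in j.edits:
            if txid > last_applied_txid:
                found[txid] = edit
    return sorted(found.items())


def apply_edit(namespace, edit):
    """Apply one (op, path) edit to a namespace modelled as a set of paths."""
    op, path = edit
    if op == "mkdir":
        namespace.add(path)
    elif op == "delete":
        namespace.discard(path)


journals = [Journal() for _ in range(3)]
active_ns, standby_ns = set(), set()

edits = [("mkdir", "/a"), ("mkdir", "/b"), ("delete", "/a")]
for txid, edit in enumerate(edits, start=1):
    if txid == 2:
        journals[2].up = False   # one JN fails; two of three still form a majority
    if log_to_quorum(journals, txid, edit):
        apply_edit(active_ns, edit)

# The Standby catches up by replaying everything it has not applied yet.
for txid, edit in tail_edits(journals, last_applied_txid=0):
    apply_edit(standby_ns, edit)

print(sorted(active_ns), sorted(standby_ns))
```

Even with one JournalNode down, the Standby ends up with the same namespace as the Active because every acknowledged edit lives on a majority of the JNs.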

JournalNodes’ storage
• Each JournalNode stores its edits in a directory on local disk; specify the path with dfs.journalnode.edits.dir.
• With N JournalNodes, the system tolerates at most (N - 1) / 2 failures (rounded down).
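The (N - 1) / 2 rule is easier to see with concrete numbers. The sketch below is plain arithmetic; the function names are made up for illustration.

```python
def majority(n: int) -> int:
    """Smallest number of JournalNodes that forms a quorum out of n."""
    return n // 2 + 1


def tolerated_failures(n: int) -> int:
    """Largest number of JournalNodes that can fail while a quorum survives."""
    return (n - 1) // 2


for n in (3, 4, 5):
    print(f"N={n}: quorum={majority(n)}, tolerated failures={tolerated_failures(n)}")
```

Going from three to four JournalNodes does not buy any extra tolerance, which is why an odd number of JNs is the usual choice.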

JournalNodes’ fencing
• The JournalNodes allow only a single NameNode to be a writer at a time.
• As a result, a split-brain scenario cannot corrupt the file system metadata.
[Diagram: the Active NN writes to JN 1, JN 2, and JN 3; the Standby NN reads from them]

JournalNodes’ fencing
• Whenever a NameNode becomes active, it first generates an epoch number.
• The first active NameNode after the namespace is initialized starts with epoch number 1.
• Any failover or restart increments the epoch number.

JournalNodes’ fencing
• When a new NameNode becomes active, it holds an epoch number higher than that of any previous NameNode.
• It calls the JournalNodes to raise their promised epochs to that number.
• Fencing:
– A JN receives a newer epoch → it updates its promised epoch; once a majority of JNs have done so, the new writer is accepted
– A JN receives an older epoch → it rejects the request
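The epoch check can be sketched as a small simulation. This is an illustrative model only, assuming a hypothetical `JournalNode` class with `new_epoch` and `journal` methods; the real protocol has more steps (segment recovery, transaction IDs) that are left out here.

```python
class JournalNode:
    """Toy JournalNode that remembers the highest epoch it has promised."""

    def __init__(self):
        self.promised_epoch = 0
        self.edits = []

    def new_epoch(self, epoch):
        """Accept a strictly newer epoch; refuse anything else."""
        if epoch <= self.promised_epoch:
            return False
        self.promised_epoch = epoch
        return True

    def journal(self, epoch, record):
        """Reject writes from a writer whose epoch is older than the promise."""
        if epoch < self.promised_epoch:
            return False
        self.edits.append(record)
        return True


def become_active(jns, epoch):
    """A NameNode may write only after a majority of JNs promise its epoch."""
    acks = sum(jn.new_epoch(epoch) for jn in jns)
    return acks > len(jns) // 2


def write_edit(jns, epoch, record):
    """An edit succeeds only if a majority of JNs accept it."""
    acks = sum(jn.journal(epoch, record) for jn in jns)
    return acks > len(jns) // 2


jns = [JournalNode() for _ in range(3)]
print(become_active(jns, 1))           # first active NameNode, epoch 1
print(write_edit(jns, 1, "mkdir /a"))
print(become_active(jns, 2))           # failover: new NameNode, epoch 2
print(write_edit(jns, 1, "mkdir /b"))  # old NameNode is fenced out
print(write_edit(jns, 2, "mkdir /c"))
```

The write from the old NameNode fails on every JN because its epoch is lower than the promised one, so its edit never reaches the log.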

But…
• Until it attempts a write, the previous Active NameNode can keep serving read requests to clients, and those reads may be out of date.
• You can configure a fencing method to prevent this.

Fencing Method

Fencing Method
• sshfence: SSH to the Active NameNode and kill the process

Fencing Method
• shell: run a shell command to fence the Active NameNode
• The script can read the current Hadoop configuration properties as environment variables, with the '_' character replacing any '.' in the property name
Example: dfs_namenode_rpc-address
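The renaming rule can be illustrated with a short helper. `fencing_env` is a hypothetical name and the host and port are placeholder values; the sketch only shows how a property name maps to the variable name a fencing script would see.

```python
def fencing_env(conf: dict) -> dict:
    """Map configuration keys to environment-variable names by replacing '.' with '_'."""
    return {key.replace(".", "_"): value for key, value in conf.items()}


# Placeholder configuration value, for illustration only.
conf = {"dfs.namenode.rpc-address": "nn1.example.com:8020"}
print(fencing_env(conf))
```

Only the dots change; the hyphen in rpc-address is kept as it is.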

Fencing Method
• Additional environment variables

Automatic Failover

[Diagram: JournalNodes JN 1, JN 2, JN 3]

ZKFailoverController
• Health monitoring
– The ZKFC pings its local NameNode periodically with a health-check command and treats it as healthy or unhealthy depending on the response.
• ZooKeeper session management
– While the local NameNode is healthy, the ZKFC holds a session open in ZooKeeper.
– If the local NameNode is active, the ZKFC also holds a special "lock" znode.
– If the session expires, the lock znode is automatically deleted.

ZKFailoverController
• ZooKeeper-based election
– If the local NameNode is healthy and no other node currently holds the lock znode, the ZKFC tries to acquire the lock itself.
– If it succeeds, it has "won the election".
• Failover
– The previous active is fenced.
– The local NameNode transitions to the active state.
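A minimal sketch of the election logic follows, assuming an in-memory stand-in for ZooKeeper. `FakeZooKeeper` and `ZKFC` are hypothetical classes, not the real client or controller, and the fencing step is omitted.

```python
class FakeZooKeeper:
    """In-memory stand-in for the single lock znode (not a real ZooKeeper client)."""

    def __init__(self):
        self.lock_holder = None

    def try_acquire(self, session):
        """Grant the lock only if nobody holds it."""
        if self.lock_holder is None:
            self.lock_holder = session
            return True
        return False

    def expire(self, session):
        """Session expiry deletes the lock znode held by that session."""
        if self.lock_holder == session:
            self.lock_holder = None


class ZKFC:
    """Toy failover controller for one NameNode."""

    def __init__(self, name, zk):
        self.name = name
        self.zk = zk
        self.healthy = True   # result of the last health check
        self.active = False

    def tick(self):
        """One monitoring round: drop the session if unhealthy, else try to win."""
        if not self.healthy:
            self.zk.expire(self.name)
            self.active = False
        elif not self.active and self.zk.try_acquire(self.name):
            self.active = True
        return self.active


zk = FakeZooKeeper()
nn1, nn2 = ZKFC("nn1", zk), ZKFC("nn2", zk)
print(nn1.tick(), nn2.tick())   # nn1 wins the election, nn2 stays standby
nn1.healthy = False             # nn1's health check starts failing
print(nn1.tick(), nn2.tick())   # lock is released, nn2 takes over
```

In a real deployment the lock is released when the session expires on the ZooKeeper side, so the takeover also works when the whole machine hosting the active NameNode disappears.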

[Diagram: JN 1, JN 2, JN 3 and the Active NN, with steps numbered 1 to 7]

Client Side

Client Failover
• Clients connect to the Active NameNode through a proxy.
• When the Active NameNode goes down, the client receives an exception → it retries and sends the RPC to the other NameNode (implemented by ConfiguredFailoverProxyProvider).
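The retry behaviour can be sketched as a small proxy object. This is not the ConfiguredFailoverProxyProvider code; `FailoverProxy` and `FakeNameNode` are hypothetical classes that only illustrate the idea of moving on to the next configured NameNode after a failed call.

```python
class FakeNameNode:
    """Stand-in for a NameNode RPC endpoint (hypothetical)."""

    def __init__(self, name, up=True):
        self.name = name
        self.up = up

    def list_status(self, path):
        if not self.up:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name}: listing of {path}"


class FailoverProxy:
    """Send each call to the current NameNode; on failure, try the next one."""

    def __init__(self, namenodes):
        self.namenodes = list(namenodes)
        self.current = 0

    def call(self, method, *args):
        last_error = None
        for _ in range(len(self.namenodes)):
            target = self.namenodes[self.current]
            try:
                return getattr(target, method)(*args)
            except ConnectionError as err:
                last_error = err
                # Fail over: remember the next NameNode for this and later calls.
                self.current = (self.current + 1) % len(self.namenodes)
        raise last_error


nn1, nn2 = FakeNameNode("nn1"), FakeNameNode("nn2")
proxy = FailoverProxy([nn1, nn2])
print(proxy.call("list_status", "/"))   # served by nn1
nn1.up = False
print(proxy.call("list_status", "/"))   # exception from nn1, retried on nn2
```

The proxy keeps pointing at the NameNode that last answered, so later calls do not pay the failed attempt again.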

Steps to Apply HDFS HA

Converting a non-HA-enabled cluster to be HA-enabled
• If setting up a fresh HDFS cluster, format the first NameNode:
hdfs namenode -format
• Copy the contents of your NameNode metadata directories to the other NameNode:
hdfs namenode -bootstrapStandby
./format-failover-namenode.sh
• Run hdfs namenode -initializeSharedEdits to initialize the edit log on the JournalNodes
• Start both NameNodes

