Page1 © Hortonworks Inc. 2015
Enterprise-Grade Rolling Upgrade for a Live Hadoop Cluster
Sanjay Radia, Vinod Kumar Vavilapalli
Hortonworks Inc
June 9, 2015
Agenda
•Introduction
•What is Rolling Upgrade?
•Problem – Several key issues to be addressed
–Wire compatibility and side-by-side installs are not sufficient!!
–Must Address: Data safety, Service degradation and disruption
•Enhancements to various components
–Packaging – side-by-side install
–HDFS, YARN, Hive, Oozie, …
Sanjay Radia
•Chief Architect, Founder, Hortonworks
•Part of the Hadoop team at Yahoo! since 2007
–Chief Architect of Hadoop Core at Yahoo!
–Apache Hadoop PMC and Committer
• Prior
–Data center automation, schedulers, virtualization, Java, HA, OSs, File Systems
– (Startup, Sun Microsystems, Inria …)
–Ph.D., University of Waterloo
Vinod Kumar Vavilapalli
– Long time Hadooper since 2007
– Apache Hadoop Committer / PMC
– Apache Member
– Yahoo! -> Hortonworks
– MapReduce -> YARN from day one
HDP Upgrade: Two Upgrade Modes
Stop the Cluster Upgrade
Shut down services and the cluster, then upgrade.
Traditionally this was the only way
Rolling Upgrade
Upgrade the cluster and its services while the cluster is actively running applications
Note: Upgrade time is proportional to # nodes, not data size
Enterprises run critical services and data on a Hadoop cluster.
Need live cluster upgrade that maintains SLAs without degradation
But you can also “Revert to Prior State”
Rollback
Revert bits and state of the cluster and its services back to a checkpointed state.
Why? This is an emergency procedure.
Downgrade
Downgrade the service and component to the prior version, but keep any new data and metadata that has been generated
Why? You are not happy with performance, app compatibility, …
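The difference between the two revert modes can be sketched with a toy model. This is only an illustration of the definitions above, not how any Hadoop component stores state; the `ClusterState` class and the function names are invented for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ClusterState:
    """Toy stand-in for 'bits + state': a version string plus stored data."""
    version: str
    data: List[str] = field(default_factory=list)


def take_checkpoint(state: ClusterState) -> ClusterState:
    # Copy both the version and the data as they are right now.
    return ClusterState(state.version, list(state.data))


def rollback(checkpoint: ClusterState) -> ClusterState:
    # Rollback restores bits AND state from the checkpoint;
    # anything written after the checkpoint is lost.
    return ClusterState(checkpoint.version, list(checkpoint.data))


def downgrade(state: ClusterState, prior_version: str) -> ClusterState:
    # Downgrade restores only the bits; new data and metadata are kept.
    return ClusterState(prior_version, list(state.data))


before = ClusterState("2.2.0.0", ["file-a"])
cp = take_checkpoint(before)
upgraded = ClusterState("2.3.0.0", before.data + ["file-b"])

rolled_back = rollback(cp)
downgraded = downgrade(upgraded, "2.2.0.0")
print(rolled_back)
print(downgraded)
```

Both results run the old version, but only the downgraded cluster still holds the data written after the upgrade started.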
But aren’t wire compatibility and side-by-side installs sufficient for rolling upgrades?
Unfortunately No!! Not if you want
• Data safety
• Keep running jobs/apps during upgrades; continue to run correctly
• Maintain SLAs
• Allow downgrade/rollbacks in case of problems
Issues that need to be addressed (1)
• Data safety
• HDFS’s upgrade checkpoint does not work for rolling upgrade
• Service degradation – note every daemon is restarted in rolling fashion
• HDFS write pipeline
• Application Masters on YARN restart
• NodeManagers restart
• Hive server is processing client queries – it cannot restart to new version without loss
• Client must not see failures – many components do not have retry
BUT Hadoop deals with failures, it will fix pipelines, restart tasks – what is the big deal?
Service degradation will be high because every daemon is restarted
Issues that need to be addressed (2)
• Maintaining the application submitter’s context (correctness)
• MR tasks get their context from the local node
– In the past the submitter’s and the node’s contexts were identical
– But with RU, a node’s binaries are being upgraded and hence may be inconsistent with the submitter’s
- Half of the job could execute with the old binaries and the other half with the new ones!!
• Persistent state
• Backward compatibility for upgrade (or convert)
• Forward compatibility for downgrade (or convert)
• Wire compatibility
• With clients (forward and backward)
• Internally (Between Masters and Slaves or Peers)
– Note: the upgrade is in a rolling fashion
Component Enhancements
• Packaging – Side-by-side installs
• HDFS Enhancements
• YARN Enhancements
• Retaining Job/App Context
• Hive Enhancements
Packaging: Side-by-side Installs (1)
• Need side-by-side installs of multiple versions on same node
• Some components are version N, while others are N+1
• For same component, some daemons version N, others N+1 on the same node (e.g. NN and DN)
• HDP’s solution: Use OS-distro standard packaging solution
• Rejected proprietary packaging as a solution (no lock-in)
• Want to support RU via Ambari and manually
• Standard packaging solutions like RPMs have useful tools and mechanisms
– Tools to install, uninstall, query, etc
– Manage dependencies automatically
– Admins do not need to learn new tools and formats
• Side benefits for “stop-the-world” upgrade:
• Can install the new binaries before the shutdown
Packaging: Side-by-side installs (2)
• Layout: side-by-side
• /usr/hdp/2.2.0.0/hadoop
• /usr/hdp/2.2.0.0/hive
• /usr/hdp/2.3.0.0/hadoop
• /usr/hdp/2.3.0.0/hive
• Define what is current for each component’s daemon and clients
• /usr/hdp/current/hdfs-nn -> /usr/hdp/2.3.0.0/hadoop
• /usr/hdp/current/hadoop-client -> /usr/hdp/2.2.0.0/hadoop
• /usr/hdp/current/hdfs-dn -> /usr/hdp/2.2.0.0/hadoop
• Distro-select helps you manage the version switch
• Our solution: the package name contains the version number:
• E.g. hadoop_2_2_0_0 is the RPM package name itself
– hadoop_2_3_0_0 is a different peer package
• Bin commands point to current:
/usr/bin/hadoop -> /usr/hdp/current/hadoop-client/bin/hadoop
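The layout above can be tried out in a scratch directory. The sketch below is an assumption-laden illustration, not the distro-select tool itself: `install_version`, `select_version` and `current_version` are hypothetical helpers, and the tree is built under a temporary directory rather than /usr/hdp. It shows per-daemon `current` pointers being switched independently while both versions stay installed.

```python
import os
import tempfile


def install_version(root, version, component):
    """Create <root>/<version>/<component>, mimicking a versioned package install."""
    path = os.path.join(root, version, component)
    os.makedirs(path, exist_ok=True)
    return path


def select_version(root, pointer, version, component):
    """Repoint <root>/current/<pointer> at <root>/<version>/<component>."""
    current_dir = os.path.join(root, "current")
    os.makedirs(current_dir, exist_ok=True)
    link = os.path.join(current_dir, pointer)
    tmp_link = link + ".tmp"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    # Build the new symlink beside the old one, then rename over it,
    # so the pointer never disappears during the switch (POSIX rename).
    os.symlink(os.path.join(root, version, component), tmp_link)
    os.replace(tmp_link, link)
    return link


def current_version(root, pointer):
    """Return the version directory a current/<pointer> link resolves to."""
    target = os.readlink(os.path.join(root, "current", pointer))
    return os.path.basename(os.path.dirname(target))


root = tempfile.mkdtemp(prefix="hdp-layout-")
for version in ("2.2.0.0", "2.3.0.0"):
    install_version(root, version, "hadoop")

pointers = ("hdfs-nn", "hdfs-dn", "hadoop-client")
for pointer in pointers:
    select_version(root, pointer, "2.2.0.0", "hadoop")

# Upgrade only the NameNode pointer; DataNode and client stay on the old bits.
select_version(root, "hdfs-nn", "2.3.0.0", "hadoop")

state = {pointer: current_version(root, pointer) for pointer in pointers}
print(state)
```

Switching a pointer back to the old version is the same call with the old version string, which is what makes a per-daemon downgrade cheap in this layout.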
Packaging: Side-by-side installs (3)
• distro-select tool to select current binary
• Per-component, Per-daemon
• Maintain stack consistency – that is what QE tested
• Each component refers to its siblings of same stack version
• Each component knows the “hadoop home” of the same stack
– Wrapper bin-scripts set this up
• Config updates can be optionally synchronized with binary upgrade
• Configs can sit in their old location
• But what if the new binary version requires slightly different config?
• Each binary version has its own config pointer
– /usr/hdp/2.2.0.0/hadoop/conf -> /etc/hadoop/conf
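One way a wrapper script could keep a component tied to siblings of its own stack version is to derive the version from the component's install path. The sketch below assumes the /usr/hdp/<version>/<component> layout from the previous slide; `stack_version` and `sibling_home` are hypothetical names, and the real wrapper scripts may work differently.

```python
import posixpath

STACK_ROOT = "/usr/hdp"  # layout root taken from the slides


def stack_version(component_home):
    """Extract the stack version from /usr/hdp/<version>/<component>."""
    relative = posixpath.relpath(component_home, STACK_ROOT)
    return relative.split("/")[0]


def sibling_home(component_home, sibling):
    """Home directory of a sibling component in the same stack version."""
    return posixpath.join(STACK_ROOT, stack_version(component_home), sibling)


hive_home = "/usr/hdp/2.3.0.0/hive"
hadoop_home = sibling_home(hive_home, "hadoop")
print(hadoop_home)
```

Because the lookup starts from the component's own path, a Hive binary from 2.3.0.0 never picks up the 2.2.0.0 Hadoop libraries, even while `current` pointers for other daemons still reference the older stack.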
HDFS Enhancements (1)
Data safety
• Since 2007, HDFS has supported an upgrade-checkpoint
• Backups of HDFS not practical – too large
• Protects against HDFS bugs in new version deleting files
– Standard practice to use for ALL upgrades, even patch releases
• But this only works for “stop-the-world” full upgrade and does not support downgrade
• Irresponsible to do rolling upgrade without such a mechanism
HDP 2.2 has enhanced upgrade-checkpoint (HDFS-5535)
• Markers for rollback
• “Hardlinks” to protect against deletes due to bugs in the new version of HDFS code
– Old scheme had hardlinks but we now delay the deletes
• Added downgrade capability
• Protobuf based fsImage for compatible extensibility
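Why hardlinks protect against deletes can be shown with ordinary files. This is a minimal sketch of the idea only, not the DataNode implementation: the `current`/`previous` directory names, the `blk_1` file and both helper functions are assumptions made for the example.

```python
import os
import tempfile


def checkpoint(current_dir, previous_dir):
    """Hardlink every file into previous_dir; no data is copied."""
    os.makedirs(previous_dir)
    for name in os.listdir(current_dir):
        os.link(os.path.join(current_dir, name), os.path.join(previous_dir, name))


def rollback(current_dir, previous_dir):
    """Discard current_dir contents and relink everything from the checkpoint."""
    for name in os.listdir(current_dir):
        os.remove(os.path.join(current_dir, name))
    for name in os.listdir(previous_dir):
        os.link(os.path.join(previous_dir, name), os.path.join(current_dir, name))


base = tempfile.mkdtemp(prefix="dn-storage-")
current = os.path.join(base, "current")
previous = os.path.join(base, "previous")
os.makedirs(current)
with open(os.path.join(current, "blk_1"), "w") as handle:
    handle.write("block data")

checkpoint(current, previous)

# Simulate a bug in the new version that deletes a block file.
os.remove(os.path.join(current, "blk_1"))
lost_in_current = not os.path.exists(os.path.join(current, "blk_1"))

rollback(current, previous)
with open(os.path.join(current, "blk_1")) as handle:
    restored = handle.read()
print(lost_in_current, restored)
```

The delete only removes one name for the file; the data survives as long as the checkpoint's link exists, which is why the checkpoint costs almost no extra disk space.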
HDFS Enhancements (2)
Minimize service degradation and retain data safety
• Fast datanode restart (HDFS-5498)
• Write pipeline – every DN will be upgraded and hence many write pipelines will break and be repaired
• Umbrella Jira HDFS-5535
– Repair it to the same DN during RU (avoid replica data copy)
– Retain same number of replicas in pipeline
• Upgrade HA standby and failover (NN HA available for a long time)
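For a manual run, the HDFS side is typically driven with `hdfs dfsadmin`. The sketch below only assembles the command lines and does not execute anything. It assumes the `-rollingUpgrade`, `-shutdownDatanode` and `-getDatanodeInfo` subcommands as described in the Apache Hadoop HDFS rolling upgrade documentation; the host name and IPC port are placeholders, and the function names are invented for the example.

```python
def prepare_commands():
    """Create the rollback image, then poll until it is ready."""
    return [
        ["hdfs", "dfsadmin", "-rollingUpgrade", "prepare"],
        ["hdfs", "dfsadmin", "-rollingUpgrade", "query"],
    ]


def datanode_commands(host, ipc_port):
    """Ask one DataNode to shut down for upgrade, then check that it has stopped."""
    address = f"{host}:{ipc_port}"
    return [
        ["hdfs", "dfsadmin", "-shutdownDatanode", address, "upgrade"],
        ["hdfs", "dfsadmin", "-getDatanodeInfo", address],
    ]


def finalize_commands():
    """Finalize once the new version has been accepted."""
    return [["hdfs", "dfsadmin", "-rollingUpgrade", "finalize"]]


# Placeholder host and port; substitute real DataNode addresses.
plan = prepare_commands() + datanode_commands("dn1.example.com", 8010) + finalize_commands()
for command in plan:
    print(" ".join(command))
```

The per-DataNode step is repeated for each node in turn, which is why total upgrade time scales with the number of nodes.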
YARN Enhancements: Minimize Service Degradation
• YARN RM retains application queue (2013)
• YARN RM fail-over (2014)
– Note this retains the queues but ALL jobs are rekicked
• YARN RM can restart while retaining applications (2015)
• A restarted YARN NodeManager retains existing containers (2015)
• Recall: restarting containers will cause serious SLA degradation
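Both recovery features are switched on through yarn-site.xml. The sketch below generates a property fragment; the property names are the ones I understand Apache Hadoop YARN to use for ResourceManager and NodeManager recovery, while the recovery directory is a placeholder. Check the names against the documentation for your release before relying on them.

```python
import xml.etree.ElementTree as ET

# Property names assumed from the Apache Hadoop YARN restart documentation;
# the recovery directory below is a placeholder path.
RECOVERY_PROPERTIES = {
    "yarn.resourcemanager.recovery.enabled": "true",
    "yarn.resourcemanager.work-preserving-recovery.enabled": "true",
    "yarn.nodemanager.recovery.enabled": "true",
    "yarn.nodemanager.recovery.dir": "/var/lib/hadoop-yarn/nm-recovery",
}


def to_yarn_site(properties):
    """Render a dict as a Hadoop-style <configuration> XML fragment."""
    configuration = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(configuration, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(configuration, encoding="unicode")


xml_text = to_yarn_site(RECOVERY_PROPERTIES)
# Parse the fragment back to confirm it round-trips.
parsed = {p.findtext("name"): p.findtext("value") for p in ET.fromstring(xml_text)}
print(xml_text)
```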
Page20 © Hortonworks Inc. 2015
YARN Enhancements: Compatibility
• Versioning of state-stores of RM and NMs
• Compatible evolution of tokens over time
• Wire compatibility between mixed versions of RM
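A common way to version a state store is a major/minor pair where only a major change is incompatible. The sketch below illustrates that rule under the assumption that YARN's state stores follow it; `StateVersion` and `check_state_store` are hypothetical names, not the actual classes.

```python
from typing import NamedTuple


class StateVersion(NamedTuple):
    major: int
    minor: int


def is_compatible(stored: StateVersion, running: StateVersion) -> bool:
    # Minor bumps are assumed to be compatible; major bumps are not.
    return stored.major == running.major


def check_state_store(stored: StateVersion, running: StateVersion) -> str:
    """Decide what a restarting daemon does with previously persisted state."""
    if stored == running:
        return "load"
    if is_compatible(stored, running):
        return "load-and-store-new-version"
    raise RuntimeError(
        f"state store version {stored} is incompatible with {running}; "
        "an explicit conversion step is needed"
    )


decision = check_state_store(StateVersion(1, 0), StateVersion(1, 1))
print(decision)
```

During a rolling upgrade the restarted daemon hits the middle branch: it reads state written by the old version and keeps running applications alive instead of starting from an empty store.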
Retaining Job/App context
• Previously Jobs/Apps used libraries from the local node
• Worked because client-node & compute-nodes had same version
• But during RU, the NodeManager has multiple versions
• Must use the same version as used by the client when submitting a job
• Solution:
• Framework libraries are now installed in HDFS
• Client-context sent as “distro-version” variable in job config
• Has side benefits
– Frameworks now installed on a single node and then uploaded to HDFS
• Note Oozie also enhanced to maintain consistent context
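The effect of carrying the client's version in the job config can be sketched as a variable substitution. Everything named here is a placeholder for illustration: the `distro.version` key, the path template and `resolve_framework_path` are assumptions, not the actual property names used by HDP.

```python
import re

# Placeholder template: the framework tarball lives in HDFS under a
# version-specific directory.
FRAMEWORK_PATH_TEMPLATE = "/apps/${distro.version}/mapreduce/mapreduce.tar.gz"


def resolve_framework_path(template, job_conf):
    """Expand ${name} variables using values recorded in the job config."""
    def substitute(match):
        return job_conf[match.group(1)]
    return re.sub(r"\$\{([^}]+)\}", substitute, template)


# The client records its own version at submission time.
job_conf = {"distro.version": "2.2.0.0"}

# A task resolves the path from the job config, so it gets the 2.2.0.0
# framework even if the node it runs on has already moved to 2.3.0.0.
node_local_version = "2.3.0.0"
framework_path = resolve_framework_path(FRAMEWORK_PATH_TEMPLATE, job_conf)
print(framework_path)
```

Since the node's own version never enters the lookup, every task of a job sees the same libraries no matter how far the rolling upgrade has progressed.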
YARN Rolling Upgrades: A Cluster Snapshot
Hive Enhancements
• Fast restarts + client-side reconnection
• Hive metastore and Hive client
• Hive-server2: stateful server that submits the client’s query
• Need to keep it running till the old queries complete
• Solution:
• Allow multiple Hive-servers to run, each registered in Zookeeper
• New client requests go to new servers
• Old server completes old queries but does not receive any new ones
– Old-server is removed from Zookeeper
• Side benefits
• HA + Load balancing solution for Hiveserver2
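The drain behaviour described in the solution can be modelled with an in-memory registry standing in for ZooKeeper. This is a simplified sketch: the classes are invented for the example, and `pick` returns the most recently registered server only to keep the run deterministic, which is not necessarily how real clients choose.

```python
class HiveServer:
    def __init__(self, name):
        self.name = name
        self.active_queries = set()

    def submit(self, query_id):
        self.active_queries.add(query_id)

    def complete(self, query_id):
        self.active_queries.discard(query_id)


class Registry:
    """In-memory stand-in for the ZooKeeper namespace that lists live servers."""

    def __init__(self):
        self._servers = []

    def register(self, server):
        self._servers.append(server)

    def deregister(self, server):
        self._servers.remove(server)

    def is_registered(self, server):
        return server in self._servers

    def pick(self):
        # Simplification: new clients get the most recently registered server.
        return self._servers[-1]


def can_shut_down(server, registry):
    """A server may stop once it is hidden from clients and has drained."""
    return not registry.is_registered(server) and not server.active_queries


registry = Registry()
old = HiveServer("hs2-old")
registry.register(old)
registry.pick().submit("q1")          # a query is running on the old server

new = HiveServer("hs2-new")
registry.register(new)                # start the upgraded server
registry.deregister(old)              # the old server stops receiving clients

registry.pick().submit("q2")          # new requests land on the new server
safe_before_drain = can_shut_down(old, registry)
old.complete("q1")                    # the old query finishes
safe_after_drain = can_shut_down(old, registry)
print(safe_before_drain, safe_after_drain)
```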
Automated Rolling Upgrade
Via Ambari
Via Your own cluster management scripts
HDP Rolling Upgrades Runbook
Pre-requisites
• HA
• Configs
Prepare
• Install bits
• DB backups
• HDFS checkpoint
Rolling Upgrade
Finalize
Rolling Downgrade
Rollback
• NOT rolling. Shut down all services.
Note: Upgrade time is proportional to # nodes, not data size
Both Manual and Automated Rolling Upgrade
• Ambari supports fully automated upgrades
• Verifies prerequisites
• Performs HDFS upgrade-checkpoint, prompts for DB backups
• Performs rolling upgrade
• All the components, in the right order
• Smoke tests at each critical stage
• Opportunities for Admin verification at critical stages
• Downgrade if you change your mind
• Have published the runbook for those who do not use Ambari
• You can do it manually or automate your own process
Runbook: Rolling Upgrade
• Ambari has an automated process for Rolling Upgrades
• Services are switched over to the new version in rolling fashion
• Any components not installed on the cluster are skipped
Zookeeper
Ranger
Core Masters
Core Slaves
Hive
Oozie
Falcon
Clients
Kafka
Knox
Storm
Slider
Flume
Hue
Finalize
HDFS, YARN, MR, Tez, HBase, Pig, Hive, Phoenix, Mahout
HDFS, YARN, HBase
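For those automating their own process, the ordering on this slide can be expressed as a small driver loop. This is a sketch of the control flow only: `upgrade_group` and `smoke_test` are hypothetical callbacks you would supply, and the group names are copied from the slide.

```python
UPGRADE_ORDER = [
    "Zookeeper", "Ranger", "Core Masters", "Core Slaves", "Hive", "Oozie",
    "Falcon", "Clients", "Kafka", "Knox", "Storm", "Slider", "Flume", "Hue",
]


def rolling_upgrade(installed, upgrade_group, smoke_test):
    """Upgrade installed groups in order; stop at the first failed smoke test.

    Returns (groups_completed, failed_group_or_None).
    """
    completed = []
    for group in UPGRADE_ORDER:
        if group not in installed:
            continue  # components not installed on the cluster are skipped
        upgrade_group(group)
        if not smoke_test(group):
            return completed, group  # hand control back to the admin
        completed.append(group)
    return completed, None


installed = {"Zookeeper", "Core Masters", "Core Slaves", "Hive", "Clients"}
log = []

ok_run = rolling_upgrade(installed, log.append, lambda group: True)
failed_run = rolling_upgrade(installed, log.append, lambda group: group != "Hive")
print(ok_run)
print(failed_run)
```

Stopping at the first failed smoke test leaves the admin with a known list of upgraded groups, which is the information needed to choose between retrying and downgrading.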
Runbook: Rolling Downgrade
Zookeeper
Ranger
Core Masters
Core Slaves
Hive
Oozie
Falcon
Clients
Kafka
Knox
Storm
Slider
Flume
Hue
Downgrade
Finalize
Summary
• Enterprises run critical services and data on a Hadoop cluster.
• Need a live cluster upgrade that maintains SLAs without degradation
• We enhanced Hadoop components for enterprise-grade rolling upgrade
• Non-proprietary packaging solution using OS-standard solution (RPMs, Debs, …)
• Data safety
– HDFS checkpoints and write-pipelines
• Maintain SLAs – solve a number of service degradation problems
– HDFS write pipelines, Yarn RM, NM state recovery, Hive, …
• Jobs/apps continue to run correctly with the right context
• Allow downgrade/rollbacks in case of problems
• All enhancements truly open source and pushed back to Apache?
• Yes of course – that is how Hortonworks does business …
Backup slides
Why didn’t you use alternatives?
• Alternatives generally keep one version active, not two
• We need to move some services as a pack (clients)
• We need to support managing confs and binaries together and separately
• Maybe we could have done it, but it was getting complex …

More Related Content

What's hot

High Availability with MariaDB Enterprise
High Availability with MariaDB EnterpriseHigh Availability with MariaDB Enterprise
High Availability with MariaDB Enterprise
MariaDB Corporation
 
SPSMEL 2012 - SQL 2012 AlwaysOn Availability Groups for SharePoint 2010 / 2013
SPSMEL 2012 - SQL 2012 AlwaysOn Availability Groups for SharePoint 2010 / 2013SPSMEL 2012 - SQL 2012 AlwaysOn Availability Groups for SharePoint 2010 / 2013
SPSMEL 2012 - SQL 2012 AlwaysOn Availability Groups for SharePoint 2010 / 2013Michael Noel
 
SQL 2014 AlwaysOn Availability Groups for SharePoint Farms - SPS Sydney 2014
SQL 2014 AlwaysOn Availability Groups for SharePoint Farms - SPS Sydney 2014SQL 2014 AlwaysOn Availability Groups for SharePoint Farms - SPS Sydney 2014
SQL 2014 AlwaysOn Availability Groups for SharePoint Farms - SPS Sydney 2014Michael Noel
 
Install Oracle FMW - 'Mostly Scripted'
Install Oracle FMW - 'Mostly Scripted'Install Oracle FMW - 'Mostly Scripted'
Install Oracle FMW - 'Mostly Scripted'
makker_nl
 
Status Quo on the automation support in SOA Suite OGhTech17
Status Quo on the automation support in SOA Suite OGhTech17Status Quo on the automation support in SOA Suite OGhTech17
Status Quo on the automation support in SOA Suite OGhTech17
Jon Petter Hjulstad
 
Oracle Enterprise Linux
Oracle Enterprise LinuxOracle Enterprise Linux
Oracle Enterprise Linuxvkv_vkv
 
Apache Hadoop YARN: Past, Present and Future
Apache Hadoop YARN: Past, Present and FutureApache Hadoop YARN: Past, Present and Future
Apache Hadoop YARN: Past, Present and Future
DataWorks Summit
 
SPSSac2014 - SharePoint Infrastructure Tips and Tricks for On-Premises and Hy...
SPSSac2014 - SharePoint Infrastructure Tips and Tricks for On-Premises and Hy...SPSSac2014 - SharePoint Infrastructure Tips and Tricks for On-Premises and Hy...
SPSSac2014 - SharePoint Infrastructure Tips and Tricks for On-Premises and Hy...Michael Noel
 
33616611930205162156 upgrade internals_19c
33616611930205162156 upgrade internals_19c33616611930205162156 upgrade internals_19c
33616611930205162156 upgrade internals_19c
Locuto Riorama
 
Hortonworks Technical Workshop: Interactive Query with Apache Hive
Hortonworks Technical Workshop: Interactive Query with Apache Hive Hortonworks Technical Workshop: Interactive Query with Apache Hive
Hortonworks Technical Workshop: Interactive Query with Apache Hive
Hortonworks
 
YARN and the Docker container runtime
YARN and the Docker container runtimeYARN and the Docker container runtime
YARN and the Docker container runtime
DataWorks Summit/Hadoop Summit
 
Best Practices for Virtualizing Hadoop
Best Practices for Virtualizing HadoopBest Practices for Virtualizing Hadoop
Best Practices for Virtualizing Hadoop
DataWorks Summit
 
Overview about OracleVM and Oracle Linux
Overview about OracleVM and Oracle LinuxOverview about OracleVM and Oracle Linux
Overview about OracleVM and Oracle Linux
andreas kuncoro
 
MySQL in the Cloud, is Amazon RDS for you?
MySQL in the Cloud, is Amazon RDS for you?MySQL in the Cloud, is Amazon RDS for you?
MySQL in the Cloud, is Amazon RDS for you?
Continuent
 
DevOps Culture & Enablement with Postgres Plus Cloud Database
DevOps Culture & Enablement with Postgres Plus Cloud DatabaseDevOps Culture & Enablement with Postgres Plus Cloud Database
DevOps Culture & Enablement with Postgres Plus Cloud Database
EDB
 
Database as a Service on the Oracle Database Appliance Platform
Database as a Service on the Oracle Database Appliance PlatformDatabase as a Service on the Oracle Database Appliance Platform
Database as a Service on the Oracle Database Appliance Platform
Maris Elsins
 
20618782218718364253 emea12 vldb
20618782218718364253 emea12 vldb20618782218718364253 emea12 vldb
20618782218718364253 emea12 vldb
Locuto Riorama
 
Oracle E-Business Suite R12.2.6 on Database 12c: Install, Patch and Administer
Oracle E-Business Suite R12.2.6 on Database 12c: Install, Patch and AdministerOracle E-Business Suite R12.2.6 on Database 12c: Install, Patch and Administer
Oracle E-Business Suite R12.2.6 on Database 12c: Install, Patch and Administer
Andrejs Karpovs
 
Managing 2000 Node Cluster with Ambari
Managing 2000 Node Cluster with AmbariManaging 2000 Node Cluster with Ambari
Managing 2000 Node Cluster with AmbariDataWorks Summit
 
Apache Ambari - What's New in 1.7.0
Apache Ambari - What's New in 1.7.0Apache Ambari - What's New in 1.7.0
Apache Ambari - What's New in 1.7.0
Hortonworks
 

What's hot (20)

High Availability with MariaDB Enterprise
High Availability with MariaDB EnterpriseHigh Availability with MariaDB Enterprise
High Availability with MariaDB Enterprise
 
SPSMEL 2012 - SQL 2012 AlwaysOn Availability Groups for SharePoint 2010 / 2013
SPSMEL 2012 - SQL 2012 AlwaysOn Availability Groups for SharePoint 2010 / 2013SPSMEL 2012 - SQL 2012 AlwaysOn Availability Groups for SharePoint 2010 / 2013
SPSMEL 2012 - SQL 2012 AlwaysOn Availability Groups for SharePoint 2010 / 2013
 
SQL 2014 AlwaysOn Availability Groups for SharePoint Farms - SPS Sydney 2014
SQL 2014 AlwaysOn Availability Groups for SharePoint Farms - SPS Sydney 2014SQL 2014 AlwaysOn Availability Groups for SharePoint Farms - SPS Sydney 2014
SQL 2014 AlwaysOn Availability Groups for SharePoint Farms - SPS Sydney 2014
 
Install Oracle FMW - 'Mostly Scripted'
Install Oracle FMW - 'Mostly Scripted'Install Oracle FMW - 'Mostly Scripted'
Install Oracle FMW - 'Mostly Scripted'
 
Status Quo on the automation support in SOA Suite OGhTech17
Status Quo on the automation support in SOA Suite OGhTech17Status Quo on the automation support in SOA Suite OGhTech17
Status Quo on the automation support in SOA Suite OGhTech17
 
Oracle Enterprise Linux
Oracle Enterprise LinuxOracle Enterprise Linux
Oracle Enterprise Linux
 
Apache Hadoop YARN: Past, Present and Future
Apache Hadoop YARN: Past, Present and FutureApache Hadoop YARN: Past, Present and Future
Apache Hadoop YARN: Past, Present and Future
 
SPSSac2014 - SharePoint Infrastructure Tips and Tricks for On-Premises and Hy...
SPSSac2014 - SharePoint Infrastructure Tips and Tricks for On-Premises and Hy...SPSSac2014 - SharePoint Infrastructure Tips and Tricks for On-Premises and Hy...
SPSSac2014 - SharePoint Infrastructure Tips and Tricks for On-Premises and Hy...
 
33616611930205162156 upgrade internals_19c
33616611930205162156 upgrade internals_19c33616611930205162156 upgrade internals_19c
33616611930205162156 upgrade internals_19c
 
Hortonworks Technical Workshop: Interactive Query with Apache Hive
Hortonworks Technical Workshop: Interactive Query with Apache Hive Hortonworks Technical Workshop: Interactive Query with Apache Hive
Hortonworks Technical Workshop: Interactive Query with Apache Hive
 
YARN and the Docker container runtime
YARN and the Docker container runtimeYARN and the Docker container runtime
YARN and the Docker container runtime
 
Best Practices for Virtualizing Hadoop
Best Practices for Virtualizing HadoopBest Practices for Virtualizing Hadoop
Best Practices for Virtualizing Hadoop
 
Overview about OracleVM and Oracle Linux
Overview about OracleVM and Oracle LinuxOverview about OracleVM and Oracle Linux
Overview about OracleVM and Oracle Linux
 
MySQL in the Cloud, is Amazon RDS for you?
MySQL in the Cloud, is Amazon RDS for you?MySQL in the Cloud, is Amazon RDS for you?
MySQL in the Cloud, is Amazon RDS for you?
 
DevOps Culture & Enablement with Postgres Plus Cloud Database
DevOps Culture & Enablement with Postgres Plus Cloud DatabaseDevOps Culture & Enablement with Postgres Plus Cloud Database
DevOps Culture & Enablement with Postgres Plus Cloud Database
 
Database as a Service on the Oracle Database Appliance Platform
Database as a Service on the Oracle Database Appliance PlatformDatabase as a Service on the Oracle Database Appliance Platform
Database as a Service on the Oracle Database Appliance Platform
 
20618782218718364253 emea12 vldb
20618782218718364253 emea12 vldb20618782218718364253 emea12 vldb
20618782218718364253 emea12 vldb
 
Oracle E-Business Suite R12.2.6 on Database 12c: Install, Patch and Administer
Oracle E-Business Suite R12.2.6 on Database 12c: Install, Patch and AdministerOracle E-Business Suite R12.2.6 on Database 12c: Install, Patch and Administer
Oracle E-Business Suite R12.2.6 on Database 12c: Install, Patch and Administer
 
Managing 2000 Node Cluster with Ambari
Managing 2000 Node Cluster with AmbariManaging 2000 Node Cluster with Ambari
Managing 2000 Node Cluster with Ambari
 
Apache Ambari - What's New in 1.7.0
Apache Ambari - What's New in 1.7.0Apache Ambari - What's New in 1.7.0
Apache Ambari - What's New in 1.7.0
 

Similar to Enterprise-Grade Rolling Upgrade for a Live Hadoop Cluster

Enterprise-Grade Rolling Upgrade for a Live Hadoop Cluster
Enterprise-Grade Rolling Upgrade for a Live Hadoop ClusterEnterprise-Grade Rolling Upgrade for a Live Hadoop Cluster
Enterprise-Grade Rolling Upgrade for a Live Hadoop Cluster
DataWorks Summit
 
Migrating your clusters and workloads from Hadoop 2 to Hadoop 3
Migrating your clusters and workloads from Hadoop 2 to Hadoop 3Migrating your clusters and workloads from Hadoop 2 to Hadoop 3
Migrating your clusters and workloads from Hadoop 2 to Hadoop 3
DataWorks Summit
 
Intro to Hadoop Presentation at Carnegie Mellon - Silicon Valley
Intro to Hadoop Presentation at Carnegie Mellon - Silicon ValleyIntro to Hadoop Presentation at Carnegie Mellon - Silicon Valley
Intro to Hadoop Presentation at Carnegie Mellon - Silicon Valley
markgrover
 
YARN Ready - Integrating to YARN using Slider Webinar
YARN Ready - Integrating to YARN using Slider WebinarYARN Ready - Integrating to YARN using Slider Webinar
YARN Ready - Integrating to YARN using Slider Webinar
Hortonworks
 
How to Upgrade Your Hadoop Stack in 1 Step -- with Zero Downtime
How to Upgrade Your Hadoop Stack in 1 Step -- with Zero DowntimeHow to Upgrade Your Hadoop Stack in 1 Step -- with Zero Downtime
How to Upgrade Your Hadoop Stack in 1 Step -- with Zero Downtime
Ian Lumb
 
Applications on Hadoop
Applications on HadoopApplications on Hadoop
Applications on Hadoop
markgrover
 
Hadoop operations-2014-strata-new-york-v5
Hadoop operations-2014-strata-new-york-v5Hadoop operations-2014-strata-new-york-v5
Hadoop operations-2014-strata-new-york-v5
Chris Nauroth
 
Habitat at SRECon
Habitat at SREConHabitat at SRECon
Habitat at SRECon
Mandi Walls
 
Apache Ambari BOF - OpenStack - Hadoop Summit 2013
Apache Ambari BOF - OpenStack - Hadoop Summit 2013Apache Ambari BOF - OpenStack - Hadoop Summit 2013
Apache Ambari BOF - OpenStack - Hadoop Summit 2013
Hortonworks
 
Stinger.Next by Alan Gates of Hortonworks
Stinger.Next by Alan Gates of HortonworksStinger.Next by Alan Gates of Hortonworks
Stinger.Next by Alan Gates of Hortonworks
Data Con LA
 
Improvements in Hadoop Security
Improvements in Hadoop SecurityImprovements in Hadoop Security
Improvements in Hadoop SecurityDataWorks Summit
 
Hadoop Summit Europe 2015 - YARN Present and Future
Hadoop Summit Europe 2015 - YARN Present and FutureHadoop Summit Europe 2015 - YARN Present and Future
Hadoop Summit Europe 2015 - YARN Present and Future
Vinod Kumar Vavilapalli
 
Apache Hadoop YARN 2015: Present and Future
Apache Hadoop YARN 2015: Present and FutureApache Hadoop YARN 2015: Present and Future
Apache Hadoop YARN 2015: Present and Future
DataWorks Summit
 
Hadoop Infrastructure (Oct. 3rd, 2012)
Hadoop Infrastructure (Oct. 3rd, 2012)Hadoop Infrastructure (Oct. 3rd, 2012)
Hadoop Infrastructure (Oct. 3rd, 2012)John Dougherty
 
Cloud Foundry at Rakuten
Cloud Foundry at RakutenCloud Foundry at Rakuten
Cloud Foundry at RakutenPlatform CF
 
Hadoop In Action
Hadoop In ActionHadoop In Action
Hadoop In Action
Bigdata Meetup Kochi
 
Hadoop: today and tomorrow
Hadoop: today and tomorrowHadoop: today and tomorrow
Hadoop: today and tomorrow
Steve Loughran
 
Inside hadoop-dev
Inside hadoop-devInside hadoop-dev
Inside hadoop-dev
Steve Loughran
 
Pivotal HAWQ and Hortonworks Data Platform: Modern Data Architecture for IT T...
Pivotal HAWQ and Hortonworks Data Platform: Modern Data Architecture for IT T...Pivotal HAWQ and Hortonworks Data Platform: Modern Data Architecture for IT T...
Pivotal HAWQ and Hortonworks Data Platform: Modern Data Architecture for IT T...
VMware Tanzu
 
Hadoop Present - Open Enterprise Hadoop
Hadoop Present - Open Enterprise HadoopHadoop Present - Open Enterprise Hadoop
Hadoop Present - Open Enterprise Hadoop
Yifeng Jiang
 

Similar to Enterprise-Grade Rolling Upgrade for a Live Hadoop Cluster (20)

Enterprise-Grade Rolling Upgrade for a Live Hadoop Cluster
Enterprise-Grade Rolling Upgrade for a Live Hadoop ClusterEnterprise-Grade Rolling Upgrade for a Live Hadoop Cluster
Enterprise-Grade Rolling Upgrade for a Live Hadoop Cluster
 
Migrating your clusters and workloads from Hadoop 2 to Hadoop 3
Migrating your clusters and workloads from Hadoop 2 to Hadoop 3Migrating your clusters and workloads from Hadoop 2 to Hadoop 3
Migrating your clusters and workloads from Hadoop 2 to Hadoop 3
 
Intro to Hadoop Presentation at Carnegie Mellon - Silicon Valley
Intro to Hadoop Presentation at Carnegie Mellon - Silicon ValleyIntro to Hadoop Presentation at Carnegie Mellon - Silicon Valley
Intro to Hadoop Presentation at Carnegie Mellon - Silicon Valley
 
YARN Ready - Integrating to YARN using Slider Webinar
YARN Ready - Integrating to YARN using Slider WebinarYARN Ready - Integrating to YARN using Slider Webinar
YARN Ready - Integrating to YARN using Slider Webinar
 
How to Upgrade Your Hadoop Stack in 1 Step -- with Zero Downtime
How to Upgrade Your Hadoop Stack in 1 Step -- with Zero DowntimeHow to Upgrade Your Hadoop Stack in 1 Step -- with Zero Downtime
How to Upgrade Your Hadoop Stack in 1 Step -- with Zero Downtime
 
Applications on Hadoop
Applications on HadoopApplications on Hadoop
Applications on Hadoop
 
Hadoop operations-2014-strata-new-york-v5
Hadoop operations-2014-strata-new-york-v5Hadoop operations-2014-strata-new-york-v5
Hadoop operations-2014-strata-new-york-v5
 
Habitat at SRECon
Habitat at SREConHabitat at SRECon
Habitat at SRECon
 
Apache Ambari BOF - OpenStack - Hadoop Summit 2013
Apache Ambari BOF - OpenStack - Hadoop Summit 2013Apache Ambari BOF - OpenStack - Hadoop Summit 2013
Apache Ambari BOF - OpenStack - Hadoop Summit 2013
 
Stinger.Next by Alan Gates of Hortonworks
Stinger.Next by Alan Gates of HortonworksStinger.Next by Alan Gates of Hortonworks
Stinger.Next by Alan Gates of Hortonworks
 
Improvements in Hadoop Security
Improvements in Hadoop SecurityImprovements in Hadoop Security
Improvements in Hadoop Security
 
Hadoop Summit Europe 2015 - YARN Present and Future
Hadoop Summit Europe 2015 - YARN Present and FutureHadoop Summit Europe 2015 - YARN Present and Future
Hadoop Summit Europe 2015 - YARN Present and Future
 
Apache Hadoop YARN 2015: Present and Future
Apache Hadoop YARN 2015: Present and FutureApache Hadoop YARN 2015: Present and Future
Apache Hadoop YARN 2015: Present and Future
 
Hadoop Infrastructure (Oct. 3rd, 2012)
Hadoop Infrastructure (Oct. 3rd, 2012)Hadoop Infrastructure (Oct. 3rd, 2012)
Hadoop Infrastructure (Oct. 3rd, 2012)
 
Cloud Foundry at Rakuten
Cloud Foundry at RakutenCloud Foundry at Rakuten
Cloud Foundry at Rakuten
 
Hadoop In Action
Hadoop In ActionHadoop In Action
Hadoop In Action
 
Hadoop: today and tomorrow
Hadoop: today and tomorrowHadoop: today and tomorrow
Hadoop: today and tomorrow
 
Inside hadoop-dev
Inside hadoop-devInside hadoop-dev
Inside hadoop-dev
 
Pivotal HAWQ and Hortonworks Data Platform: Modern Data Architecture for IT T...
Pivotal HAWQ and Hortonworks Data Platform: Modern Data Architecture for IT T...Pivotal HAWQ and Hortonworks Data Platform: Modern Data Architecture for IT T...
Pivotal HAWQ and Hortonworks Data Platform: Modern Data Architecture for IT T...
 
Hadoop Present - Open Enterprise Hadoop
Hadoop Present - Open Enterprise HadoopHadoop Present - Open Enterprise Hadoop
Hadoop Present - Open Enterprise Hadoop
 

More from DataWorks Summit

Data Science Crash Course
Data Science Crash CourseData Science Crash Course
Data Science Crash Course
DataWorks Summit
 
Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache Ratis
DataWorks Summit
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiTracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
DataWorks Summit
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...
DataWorks Summit
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
DataWorks Summit
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal System
DataWorks Summit
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist Example
DataWorks Summit
 
HBase Global Indexing to support large-scale data ingestion at Uber
Enterprise-Grade Rolling Upgrade for a Live Hadoop Cluster

  • 1. Page1 © Hortonworks Inc. 2015 Enterprise-Grade Rolling Upgrade for a Live Hadoop Cluster Sanjay Radia, Vinod Kumar Vavilapalli Hortonworks Inc June 9, 2015
  • 2. Agenda
      • Introduction
      • What is Rolling Upgrade?
      • Problem: several key issues to be addressed
          – Wire compatibility and side-by-side installs are not sufficient!
          – Must address: data safety, service degradation and disruption
      • Enhancements to various components
          – Packaging: side-by-side install
          – HDFS, YARN, Hive, Oozie, …
  • 3. Sanjay Radia
      • Chief Architect, Founder, Hortonworks
      • Part of the Hadoop team at Yahoo! since 2007
          – Chief Architect of Hadoop Core at Yahoo!
          – Apache Hadoop PMC and Committer
      • Prior
          – Data center automation, schedulers, virtualization, Java, HA, OSs, File Systems
          – (Startup, Sun Microsystems, Inria …)
          – Ph.D., University of Waterloo
  • 4. Vinod Kumar Vavilapalli
      – Long-time Hadooper since 2007
      – Apache Hadoop Committer / PMC
      – Apache Member
      – Yahoo! -> Hortonworks
      – MapReduce -> YARN from day one
  • 5. HDP Upgrade: Two Upgrade Modes
      • Stop-the-Cluster Upgrade: shut down the services and the cluster, then upgrade. Traditionally this was the only way.
      • Rolling Upgrade: upgrade the cluster and its services while the cluster is actively running applications.
      • Note: upgrade time is proportional to the number of nodes, not to data size.
      • Enterprises run critical services and data on a Hadoop cluster. They need a live cluster upgrade that maintains SLAs without degradation.
  • 6. But you can also “Revert to Prior State”
      • Rollback: revert the bits and state of the cluster and its services to a checkpointed state.
          – Why? This is an emergency procedure.
      • Downgrade: downgrade the service and component to the prior version, but keep any new data and metadata that has been generated.
          – Why? You are not happy with performance, app compatibility, …
  • 7. But aren’t wire compatibility and side-by-side installs sufficient for rolling upgrades?
      Unfortunately no! Not if you want to:
      • Keep data safe
      • Keep running jobs/apps during upgrades, and have them continue to run correctly
      • Maintain SLAs
      • Allow downgrade/rollback in case of problems
  • 8. Issues that need to be addressed (1)
      • Data safety
          – HDFS’s upgrade checkpoint does not work for rolling upgrade
      • Service degradation (note that every daemon is restarted in rolling fashion)
          – HDFS write pipeline
          – Application Masters on YARN restart
          – NodeManagers restart
          – Hive server is processing client queries; it cannot restart to the new version without loss
          – Clients must not see failures; many components do not have retry
      • But Hadoop deals with failures: it will fix pipelines and restart tasks. What is the big deal?
          – Service degradation will be high because every daemon is restarted
  • 9. Issues that need to be addressed (2)
      • Maintaining the application submitter’s context (correctness)
          – MR tasks get their context from the local node
          – In the past the submitter’s and the node’s contexts were identical
          – But with RU, a node’s binaries are being upgraded and hence may be inconsistent with the submitter’s
          – Half of the job could execute with the old binaries and the other half with the new ones!
      • Persistent state
          – Backward compatibility for upgrade (or convert)
          – Forward compatibility for downgrade (or convert)
      • Wire compatibility
          – With clients (forward and backward)
          – Internally (between masters and slaves, or peers); note that the upgrade is in a rolling fashion
  • 10. Component Enhancements
      • Packaging: side-by-side installs
      • HDFS Enhancements
      • YARN Enhancements
      • Retaining Job/App Context
      • Hive Enhancements
  • 11. Packaging: Side-by-side Installs (1)
      • Need side-by-side installs of multiple versions on the same node
          – Some components are version N, while others are N+1
          – For the same component, some daemons are version N and others N+1 on the same node (e.g. NN and DN)
      • HDP’s solution: use the OS distro’s standard packaging solution
          – Rejected proprietary packaging as a solution (no lock-in)
          – Want to support RU both via Ambari and manually
          – Standard packaging solutions like RPMs have useful tools and mechanisms: tools to install, uninstall, query, etc.; automatic dependency management; admins do not need to learn new tools and formats
      • Side benefit for “stop-the-world” upgrade:
          – Can install the new binaries before the shutdown
  • 12. Packaging: Side-by-side installs (2)
      • Layout: side-by-side
          – /usr/hdp/2.2.0.0/hadoop
          – /usr/hdp/2.2.0.0/hive
          – /usr/hdp/2.3.0.0/hadoop
          – /usr/hdp/2.3.0.0/hive
      • Define what is current for each component’s daemons and clients
          – /usr/hdp/current/hdfs-nn -> /usr/hdp/2.3.0.0/hadoop
          – /usr/hdp/current/hadoop-client -> /usr/hdp/2.2.0.0/hadoop
          – /usr/hdp/current/hdfs-dn -> /usr/hdp/2.2.0.0/hadoop
      • Distro-select helps you manage the version switch
      • Our solution: the package name contains the version number
          – E.g. hadoop_2_2_0_0 is the RPM package name itself; hadoop_2_3_0_0 is a different peer package
      • Bin commands point to current: /usr/bin/hadoop -> /usr/hdp/current/hadoop-client/bin/hadoop
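The layout on this slide can be modelled in a few lines. The following is a minimal sketch, assuming a scratch directory in place of /usr/hdp; the helper names `build_layout` and `active_version` are illustrative and not part of HDP.

```python
import os
import tempfile
from pathlib import Path


def build_layout(root: Path) -> None:
    """Create two side-by-side versions and per-role 'current' links."""
    for version in ("2.2.0.0", "2.3.0.0"):
        (root / version / "hadoop" / "bin").mkdir(parents=True)
    current = root / "current"
    current.mkdir()
    # As on the slide: the NameNode already runs 2.3.0.0, while the
    # DataNode and the clients still run 2.2.0.0.
    links = {
        "hdfs-nn": "2.3.0.0",
        "hdfs-dn": "2.2.0.0",
        "hadoop-client": "2.2.0.0",
    }
    for role, version in links.items():
        os.symlink(root / version / "hadoop", current / role)


def active_version(root: Path, role: str) -> str:
    """Return the stack version that a role's 'current' link points to."""
    target = Path(os.readlink(root / "current" / role))
    return target.parent.name


with tempfile.TemporaryDirectory() as tmp:
    hdp = Path(tmp) / "hdp"  # stands in for /usr/hdp
    build_layout(hdp)
    versions = {
        role: active_version(hdp, role)
        for role in ("hdfs-nn", "hdfs-dn", "hadoop-client")
    }
    print(versions)
```

Because every role resolves its binaries through its own link, two daemons of the same component can run different versions on one node.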
  • 13. Packaging: Side-by-side installs (3)
      • distro-select tool to select the current binary
          – Per component, per daemon
      • Maintain stack consistency, since that is what QE tested
          – Each component refers to its siblings of the same stack version
          – Each component knows the “hadoop home” of the same stack; wrapper bin-scripts set this up
      • Config updates can optionally be synchronized with the binary upgrade
          – Configs can sit in their old location
          – But what if the new binary version requires a slightly different config?
          – Each binary version has its own config pointer: /usr/hdp/2.2.0.0/hadoop/conf -> /etc/hadoop/conf
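The slides do not show how distro-select works internally. One plausible way to switch a single daemon's "current" pointer is an atomic symlink swap, sketched below under that assumption; `select_version` is a hypothetical helper, not the real tool's interface.

```python
import os
import tempfile
from pathlib import Path


def select_version(current_dir: Path, role: str, target: Path) -> None:
    """Atomically repoint current/<role> at target (hypothetical helper)."""
    if not target.is_dir():
        raise FileNotFoundError(f"version not installed: {target}")
    link = current_dir / role
    staged = current_dir / f".{role}.tmp"
    if staged.is_symlink():
        staged.unlink()  # clear a leftover from an interrupted switch
    os.symlink(target, staged)
    # On POSIX, rename replaces the old link in one step, so a daemon that
    # starts at any moment sees either the old or the new version.
    os.replace(staged, link)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    old = root / "2.2.0.0" / "hadoop"
    new = root / "2.3.0.0" / "hadoop"
    old.mkdir(parents=True)
    new.mkdir(parents=True)
    current = root / "current"
    current.mkdir()

    select_version(current, "hdfs-dn", old)
    before = Path(os.readlink(current / "hdfs-dn")).parent.name
    select_version(current, "hdfs-dn", new)
    after = Path(os.readlink(current / "hdfs-dn")).parent.name
    print(before, "->", after)
```

Switching per role rather than per node is what lets the DataNode move to N+1 while clients on the same host stay on N.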
  • 14. Component Enhancements
      • Packaging: side-by-side installs
      • HDFS Enhancements
      • YARN Enhancements
      • Retaining Job/App Context
      • Hive Enhancements
  • 15. HDFS Enhancements (1)
      Data safety
      • Since 2007, HDFS has supported an upgrade checkpoint
      • Backups of HDFS are not practical: too large
      • Protects against bugs in the new HDFS version deleting files
          – Standard practice to use it for all upgrades, even patch releases
      • But this only works for a “stop-the-world” full upgrade and does not support downgrade
      • Irresponsible to do a rolling upgrade without such a mechanism
      HDP 2.2 has an enhanced upgrade checkpoint (HDFS-5535)
      • Markers for rollback
      • “Hardlinks” to protect against deletes due to bugs in the new version of the HDFS code
          – The old scheme had hardlinks, but we now delay the deletes
      • Added downgrade capability
      • Protobuf-based fsImage for compatible extensibility
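The "delay the deletes" idea can be illustrated with a toy block directory. This is a simplified model under stated assumptions, not HDFS code: the `BlockStore` class, its method names and the `trash` directory are invented for the illustration.

```python
import shutil
import tempfile
from pathlib import Path


class BlockStore:
    """Toy block directory that delays deletes while an upgrade is open."""

    def __init__(self, root: Path) -> None:
        self.blocks = root / "blocks"
        self.trash = root / "trash"
        self.blocks.mkdir(parents=True)
        self.upgrading = False

    def start_upgrade(self) -> None:
        self.trash.mkdir(exist_ok=True)
        self.upgrading = True

    def delete(self, name: str) -> None:
        block = self.blocks / name
        if self.upgrading:
            block.rename(self.trash / name)  # delayed delete: keep the bytes
        else:
            block.unlink()

    def rollback(self) -> None:
        # Restore every block that was "deleted" since the upgrade began.
        for block in list(self.trash.iterdir()):
            block.rename(self.blocks / block.name)
        self._close()

    def finalize(self) -> None:
        # The new version is accepted, so the delayed deletes become real.
        self._close()

    def _close(self) -> None:
        shutil.rmtree(self.trash, ignore_errors=True)
        self.upgrading = False


with tempfile.TemporaryDirectory() as tmp:
    store = BlockStore(Path(tmp))
    (store.blocks / "blk_1").write_text("data")
    store.start_upgrade()
    store.delete("blk_1")  # e.g. a bug in the new version deletes a block
    gone_during_upgrade = not (store.blocks / "blk_1").exists()
    store.rollback()
    restored = (store.blocks / "blk_1").read_text()
    print(gone_during_upgrade, restored)
```

Nothing is copied at any point; the cost of the protection is only the disk space held until finalize.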
  • 16. HDFS Enhancements (2)
      Minimize service degradation and retain data safety
      • Fast DataNode restart (HDFS-5498)
      • Write pipeline: every DN will be upgraded, and hence many write pipelines will break and be repaired
      • Umbrella Jira HDFS-5535
          – Repair the pipeline to the same DN during RU (avoids a replica data copy)
          – Retain the same number of replicas in the pipeline
      • Upgrade the HA standby and fail over (NN HA has been available for a long time)
  • 17. Component Enhancements
      • Packaging: side-by-side installs
      • HDFS Enhancements
      • YARN Enhancements
      • Retaining Job/App Context
      • Hive Enhancements
  • 18. YARN Enhancements: Minimize Service Degradation
      • YARN RM retains the application queue (2013)
      • YARN RM fail-over (2014)
          – Note that this retains the queues, but all jobs are re-kicked
      • YARN RM can restart while retaining applications (2015)
  • 19. YARN Enhancements: Minimize Service Degradation
      • A restarted YARN NodeManager retains existing containers (2015)
      • Recall: restarting containers will cause serious SLA degradation
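A restarted daemon can only retain its containers if it wrote down which ones it was running. The sketch below models that with a JSON state file; the `NodeManager` class here is a toy, and the file format is an assumption made for the example. In Apache YARN the feature is switched on with properties such as `yarn.nodemanager.recovery.enabled`, which this sketch does not use.

```python
import json
import tempfile
from pathlib import Path


class NodeManager:
    """Toy NodeManager that records running containers in a state file."""

    def __init__(self, state_file: Path, recover: bool = True) -> None:
        self.state_file = state_file
        self.containers = {}  # container id -> process id
        if recover and state_file.exists():
            # Reattach to containers that kept running while the daemon
            # itself was down for the upgrade.
            self.containers = json.loads(state_file.read_text())

    def launch(self, container_id: str, pid: int) -> None:
        self.containers[container_id] = pid
        self._persist()

    def _persist(self) -> None:
        self.state_file.write_text(json.dumps(self.containers))


with tempfile.TemporaryDirectory() as tmp:
    state = Path(tmp) / "nm-state.json"

    old_nm = NodeManager(state)
    old_nm.launch("container_01", 4242)
    del old_nm  # the daemon stops; the container process keeps running

    recovered = NodeManager(state).containers  # upgraded daemon starts
    cold = NodeManager(state, recover=False).containers
    print(recovered, cold)
```

Without recovery the new daemon knows nothing about `container_01`, which is the case where tasks have to be restarted and SLAs degrade.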
  • 20. YARN Enhancements: Compatibility
      • Versioning of state-stores of RM and NMs
      • Compatible evolution of tokens over time
      • Wire compatibility between mixed versions of RM
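State-store versioning lets a daemon decide whether it can read what another version wrote. The sketch below assumes a major.minor scheme in which only a major-version change breaks compatibility; that rule, the `StoreVersion` type and `check_state_store` are illustrative assumptions rather than YARN's actual implementation.

```python
from typing import NamedTuple


class StoreVersion(NamedTuple):
    major: int
    minor: int


def check_state_store(stored: StoreVersion, running: StoreVersion) -> str:
    """Decide how a daemon treats state written by another version."""
    if stored == running:
        return "load"
    if stored.major == running.major:
        # Minor differences are assumed compatible in both directions,
        # which covers upgrade as well as downgrade.
        return "load-and-restamp"
    raise RuntimeError(
        f"state store {stored.major}.{stored.minor} is incompatible with "
        f"{running.major}.{running.minor}; convert it or start clean"
    )


print(check_state_store(StoreVersion(1, 0), StoreVersion(1, 2)))
```

Failing loudly on a major mismatch is the safer default: silently loading state in an unknown format risks losing the applications the store was meant to preserve.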
  • 21. Component Enhancements
      • Packaging: side-by-side installs
      • HDFS Enhancements
      • YARN Enhancements
      • Retaining Job/App Context
      • Hive Enhancements
  • 22. Retaining Job/App context
      • Previously, jobs/apps used libraries from the local node
          – Worked because client node and compute nodes had the same version
      • But during RU, the NodeManager has multiple versions
          – Must use the same version as the one used by the client when submitting the job
      • Solution:
          – Framework libraries are now installed in HDFS
          – Client context is sent as a “distro-version” variable in the job config
          – Side benefit: frameworks are now installed on a single node and then uploaded to HDFS
      • Note: Oozie was also enhanced to maintain a consistent context
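The effect of carrying the client's version in the job config can be sketched as follows. The HDFS path template, the config keys and both function names are assumptions made for the illustration; only the idea of a "distro-version" variable comes from the slide.

```python
# Hypothetical HDFS location of the framework archive for one stack version.
FRAMEWORK_PATH = "/hdp/apps/{version}/mapreduce/mapreduce.tar.gz"


def submit_job(name: str, client_version: str) -> dict:
    """Client side: pin the submitter's stack version in the job config."""
    return {"job.name": name, "distro-version": client_version}


def framework_for_task(job_config: dict, node_local_version: str) -> str:
    """Node side: resolve libraries from the job config, not the node."""
    # node_local_version is intentionally unused: during a rolling upgrade
    # it differs from node to node and must not influence the task.
    return FRAMEWORK_PATH.format(version=job_config["distro-version"])


job = submit_job("wordcount", client_version="2.2.0.0")
# One task lands on a node still on 2.2.0.0, another on an upgraded node.
paths = {framework_for_task(job, node) for node in ("2.2.0.0", "2.3.0.0")}
print(paths)
```

Both tasks resolve the same archive, so the job never runs half on old libraries and half on new ones.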
  • 23. YARN Rolling Upgrades: A Cluster Snapshot
  • 24. Component Enhancements
      • Packaging: side-by-side installs
      • HDFS Enhancements
      • YARN Enhancements
      • Retaining Job/App Context
      • Hive Enhancements
  • 25. Hive Enhancements
      • Fast restarts + client-side reconnection
          – Hive metastore and Hive client
      • HiveServer2: stateful server that submits the client’s query
          – Need to keep it running until the old queries complete
      • Solution:
          – Allow multiple Hive servers to run, each registered in ZooKeeper
          – New client requests go to the new servers
          – The old server completes old queries but does not receive any new ones; the old server is removed from ZooKeeper
      • Side benefits
          – HA + load-balancing solution for HiveServer2
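The drain sequence above can be traced with an in-memory stand-in for ZooKeeper. This is a minimal sketch: `Registry` and `HiveServer` are toy classes written for the illustration and make no ZooKeeper or Hive calls.

```python
class HiveServer:
    """Toy stateful server that tracks its in-flight queries."""

    def __init__(self, version: str) -> None:
        self.version = version
        self.running = set()

    def submit(self, query: str) -> None:
        self.running.add(query)

    def complete(self, query: str) -> None:
        self.running.discard(query)

    @property
    def drained(self) -> bool:
        return not self.running


class Registry:
    """Stand-in for the ZooKeeper namespace that clients consult."""

    def __init__(self) -> None:
        self._servers = []

    def register(self, server: HiveServer) -> None:
        self._servers.append(server)

    def deregister(self, server: HiveServer) -> None:
        self._servers.remove(server)

    def pick(self) -> HiveServer:
        if not self._servers:
            raise LookupError("no Hive server registered")
        return self._servers[0]


registry = Registry()
old = HiveServer("N")
registry.register(old)
registry.pick().submit("q1")  # lands on the old server

new = HiveServer("N+1")
registry.register(new)
registry.deregister(old)  # old server stops receiving new queries
registry.pick().submit("q2")  # lands on the new server

old_still_busy = not old.drained
old.complete("q1")
safe_to_stop_old = old.drained
print(old_still_busy, safe_to_stop_old, new.running)
```

Deregistering before stopping is the key ordering: the old server disappears from discovery first and is shut down only once it reports itself drained.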
  • 26. Automated Rolling Upgrade
      • Via Ambari
      • Via your own cluster management scripts
  • 27. HDP Rolling Upgrades Runbook
      • Pre-requisites
          – HA
          – Configs
      • Prepare
          – Install bits
          – DB backups
          – HDFS checkpoint
      • Rolling Upgrade
      • Finalize
      • Rolling Downgrade
      • Rollback: NOT rolling. Shut down all services.
      • Note: upgrade time is proportional to the number of nodes, not to data size
  • 28. Both Manual and Automated Rolling Upgrade
      • Ambari supports fully automated upgrades
          – Verifies prerequisites
          – Performs the HDFS upgrade checkpoint, prompts for DB backups
          – Performs the rolling upgrade: all the components, in the right order
          – Smoke tests at each critical stage
          – Opportunities for admin verification at critical stages
          – Downgrade if you change your mind
      • We have published the runbook for those who do not use Ambari
          – You can do it manually or automate your own process
  • 29. Runbook: Rolling Upgrade
      • Ambari has an automated process for rolling upgrades
      • Services are switched over to the new version in rolling fashion
      • Any components not installed on the cluster are skipped
      • Order: ZooKeeper, Ranger, Core Masters, Core Slaves, Hive, Oozie, Falcon, Clients, Kafka, Knox, Storm, Slider, Flume, Hue, Finalize
      • Diagram labels: HDFS, YARN, MR, Tez, HBase, Pig, Hive, Phoenix, Mahout; HDFS, YARN, HBase
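For those automating their own process, the ordering on this slide can drive a simple loop. The sketch below is illustrative only: the `rolling_upgrade` function and its two callbacks are hypothetical, and the stage names are copied from the slide.

```python
from typing import Callable, List, Set

# Stage order as listed on the slide; "Finalize" is appended at the end.
UPGRADE_ORDER = [
    "ZooKeeper", "Ranger", "Core Masters", "Core Slaves", "Hive", "Oozie",
    "Falcon", "Clients", "Kafka", "Knox", "Storm", "Slider", "Flume", "Hue",
]


def rolling_upgrade(
    installed: Set[str],
    upgrade_stage: Callable[[str], None],
    smoke_test: Callable[[str], bool],
) -> List[str]:
    """Upgrade installed stages in order, smoke-testing after each one."""
    done = []
    for stage in UPGRADE_ORDER:
        if stage not in installed:
            continue  # components not installed on the cluster are skipped
        upgrade_stage(stage)
        if not smoke_test(stage):
            raise RuntimeError(
                f"smoke test failed after {stage}; consider a downgrade"
            )
        done.append(stage)
    done.append("Finalize")
    return done


log = []
plan = rolling_upgrade(
    installed={"ZooKeeper", "Core Masters", "Core Slaves", "Hive", "Clients"},
    upgrade_stage=log.append,       # a real script would restart daemons
    smoke_test=lambda stage: True,  # a real script would run service checks
)
print(plan)
```

Stopping at the first failed smoke test leaves the cluster in a known mixed state, from which the admin can choose between retrying and downgrading.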
  • 30. Runbook: Rolling Downgrade
      • Order: ZooKeeper, Ranger, Core Masters, Core Slaves, Hive, Oozie, Falcon, Clients, Kafka, Knox, Storm, Slider, Flume, Hue, Downgrade Finalize
  • 31. Summary
      • Enterprises run critical services and data on a Hadoop cluster
          – Need a live cluster upgrade that maintains SLAs without degradation
      • We enhanced Hadoop components for enterprise-grade rolling upgrade
          – Non-proprietary packaging using the OS-standard solution (RPMs, Debs)
          – Data safety: HDFS checkpoints and write pipelines
          – Maintain SLAs: solve a number of service degradation problems (HDFS write pipelines, YARN RM and NM state recovery, Hive, …)
          – Jobs/apps continue to run correctly with the right context
          – Allow downgrade/rollback in case of problems
      • Are all enhancements truly open source and pushed back to Apache?
          – Yes, of course: that is how Hortonworks does business …
  • 32. Backup slides
  • 33. Why didn’t you use alternatives?
      • Alternatives generally keeps one version active, not two
      • We need to move some services as a pack (clients)
      • We need to support managing confs and binaries both together and separately
      • Maybe we could have done it, but it was getting complex …

Editor's Notes

  1. • HDFS write pipeline: slows down writes, risks data
     • YARN App Masters restart: app failure if the App Master does not have persistent state
     • NodeManager restart: tasks fail and restart, SLA degrades
     • Hive server is processing client queries: it cannot restart for the new version
     • Clients must not see failures: many components do not have retry
  2. Yahoo! upgrades approx. 1K nodes (out of 40K) a day. A 4K cluster takes 2 days.