Cluster Management and Automation with Cloudera Manager

Darren Lo's talk from #lspe http://www.meetup.com/SF-Bay-Area-Large-Scale-Production-Engineering/events/129859402/

Published in: Technology

  1. Cluster Management and Automation with Cloudera Manager. Darren Lo – Software Engineer at Cloudera
  2. Agenda ● Hadoop Installation and Setup ● Diagnosing Problems ● Automating Management Tasks ● Links
  3. Hadoop is... ● Fast-changing – New features all the time ● Different from other IT projects – One application on many hosts; not vice-versa ● Complex – Things you might run: HDFS, MapReduce, YARN, ZooKeeper, Oozie, Hive, Pig, HBase, Sqoop, Solr, Cloudera Impala... ● Useful
  4. Many Common Setup Issues ● Operating system issues – Transparent Huge Pages – Ulimits – Clock skew ● Networking issues – Reverse lookup must report the FQDN – NICs can negotiate less than full speed. These are just examples. There are many more!
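
A few of the setup issues on this slide can be checked with a short script. The following is a minimal sketch, not part of the talk: the function names, the `min_open_files` threshold of 32768, and the choice to flag a Transparent Huge Pages mode of `always` are all assumptions for illustration. The sysfs path shown is the mainline Linux location; some distributions use a different one.

```python
import re
import resource
import socket


def parse_thp_setting(text):
    """Return the active THP mode, e.g. 'never' from 'always madvise [never]'."""
    match = re.search(r"\[(\w+)\]", text)
    return match.group(1) if match else None


def looks_like_fqdn(name):
    """Heuristic only: at least two non-empty dot-separated labels."""
    labels = name.rstrip(".").split(".")
    return len(labels) >= 2 and all(labels)


def check_host(ip_address,
               thp_path="/sys/kernel/mm/transparent_hugepage/enabled",
               min_open_files=32768):
    """Collect human-readable warnings for one host (thresholds are assumed)."""
    problems = []

    # Transparent Huge Pages: the active mode is the bracketed word.
    try:
        with open(thp_path) as handle:
            if parse_thp_setting(handle.read()) == "always":
                problems.append("Transparent Huge Pages set to 'always'")
    except OSError:
        problems.append("could not read THP setting at " + thp_path)

    # Ulimits: compare the soft open-file limit with the assumed minimum.
    soft_limit, _hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft_limit != resource.RLIM_INFINITY and soft_limit < min_open_files:
        problems.append("open-file limit %d is below %d" % (soft_limit, min_open_files))

    # Networking: reverse lookup of the address should yield an FQDN.
    try:
        hostname = socket.gethostbyaddr(ip_address)[0]
        if not looks_like_fqdn(hostname):
            problems.append("reverse lookup returned non-FQDN name " + hostname)
    except OSError:
        problems.append("reverse lookup failed for " + ip_address)

    return problems
```

Calling `check_host("10.0.0.12")` on a cluster node (the address is a placeholder) returns a list of warnings, empty if none of these three checks fires. Clock skew and NIC speed are not covered here.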
  5. Let others do the work for you ● Cloudera's Distribution including Apache Hadoop (CDH) – Enterprise-ready: tested and deployed in production on tens of thousands of nodes – Enterprise-grade features and innovation ● Fine-grained authorization (Sentry) ● Impala, Search – 100% open source and Apache licensed
  6. Cloudera Manager ● Available for free – Any number of nodes – Manage all services available in CDH – Set up, configure, monitor, diagnose, and upgrade – Complex workflows – Kerberos – API ● 5 years of expertise baked into the product
  7. Installing with Cloudera Manager
  8. Installing with Cloudera Manager
  9. Installing with Cloudera Manager
  10. Installing with Cloudera Manager
  11. Installing with Cloudera Manager
  12. Installing with Cloudera Manager
  13. Installing with Cloudera Manager
  14. Installing with Cloudera Manager
  15. Installation Complete ● Everything is up and running – Great! ● Add users and start running jobs, and get a whole new set of challenges – Great...
  16. Next Challenges ● Find, diagnose, and fix problems – Why are my HBase queries slow? ● View cluster activity – Who ran the MapReduce job that made my HBase queries slow? ● Get alerts for any problems that come up – Outage at 2 AM, you want that wake-up call... right?
  17. Health Tests ● Common problems that are easy to check – Are any processes down? – Are HDFS reads and writes working? – Are HDFS checkpoints too slow? – Has a host been swapping? – Is there too much clock skew?
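
As an illustration of how simple such a check can be, here is a minimal sketch of a clock skew test. It is not Cloudera Manager's implementation: the function name, the 3-second and 10-second thresholds, and the input shape (a mapping from host name to a clock reading in seconds) are assumptions. The GOOD / CONCERNING / BAD labels are used here only as status names.

```python
def clock_skew_health(host_clocks, reference_time, warn_s=3.0, crit_s=10.0):
    """Classify each host by how far its clock is from a reference time.

    host_clocks: dict of host name -> clock reading in seconds (hypothetical input).
    warn_s / crit_s: assumed thresholds in seconds, not values from the talk.
    """
    results = {}
    for host, clock in host_clocks.items():
        skew = abs(clock - reference_time)
        if skew >= crit_s:
            results[host] = "BAD"
        elif skew >= warn_s:
            results[host] = "CONCERNING"
        else:
            results[host] = "GOOD"
    return results
```

With a reference time of 100.0, readings of 100.0, 104.0, and 120.0 would be classified as GOOD, CONCERNING, and BAD respectively under these assumed thresholds.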
  18. Health Tests
  19. Log Search ● Grep works great on 1 machine, not on 100s ● Useful to answer – What errors/warnings occurred when my service was slow? – Has this error occurred before? – When did a problem start happening?
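
The questions above amount to filtering log lines from many hosts by severity, time window, and text. The sketch below shows that idea over in-memory data; it is not how Cloudera Manager implements log search. The function name, the `logs_by_host` input shape, and the assumed line layout (`YYYY-MM-DD HH:MM:SS,mmm LEVEL message`) are illustrative assumptions.

```python
import re
from datetime import datetime

# Assumed line layout: "2013-07-18 02:00:01,123 ERROR some message"
LOG_LINE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} (\w+) (.*)$")


def search_logs(logs_by_host, levels, start, end, pattern=""):
    """Return (timestamp, host, level, message) tuples sorted by time.

    logs_by_host: dict of host name -> list of raw log lines (hypothetical input).
    levels: set of level names to keep, e.g. {"ERROR", "WARN"}.
    start / end: inclusive datetime bounds.
    pattern: optional regular expression applied to the message.
    """
    hits = []
    for host, lines in logs_by_host.items():
        for line in lines:
            match = LOG_LINE.match(line)
            if not match:
                continue  # skip lines that do not fit the assumed layout
            timestamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            level, message = match.group(2), match.group(3)
            if level in levels and start <= timestamp <= end and re.search(pattern, message):
                hits.append((timestamp, host, level, message))
    return sorted(hits)
```

Sorting the merged hits by timestamp makes the earliest matching entry, and the host it came from, the first element of the result, which speaks to "when did a problem start happening?".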
  20. Log Search
  21. Events and Alerts ● CM publishes a stream of events – Critical events are alerts ● Event search ● Integrate with external tools like Nagios
  22. Activity Monitor ● Who was running stuff when the cluster had problems? ● See who is running MR jobs – Identifies Hive jobs too
  23. Activity Monitor
  24. Metrics and Charts ● Like log search, a must-have for any distributed system ● Hadoop services expose many metrics ● Collect and visualize these with – Cloudera Manager – Ganglia
  25. Charting with Cloudera Manager
  26. Charting with Cloudera Manager
  27. Charting with Cloudera Manager
  28. Next Challenges ● We know how to set up a cluster manually ● We know how to identify, diagnose, and fix issues ● Also need to handle regular tasks – Grow cluster – Replace hardware
  29. Cloudera Manager API ● Setup – Create / configure cluster and services – Configure new host to run on cluster ● Workflows – Enable HDFS High Availability – Enable MapReduce JobTracker High Availability – Decommission / Recommission host ● Monitoring – Metrics used for charting available via API – Health checks, including export to Nagios – Events
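
To give a feel for the monitoring side of the API, here is a minimal sketch that reads service health over HTTP using only the Python standard library. The helper names are hypothetical, and the details in the comments (port 7180, the `/api/v1/clusters/<cluster>/services` path, HTTP basic authentication, and an `items` list whose entries carry `name` and `healthSummary`) are assumptions that should be checked against the API documentation for the Cloudera Manager version in use.

```python
import base64
import json
import urllib.request


def fetch_json(base_url, path, username, password):
    """GET a JSON document using HTTP basic authentication."""
    request = urllib.request.Request(base_url.rstrip("/") + path)
    token = base64.b64encode(("%s:%s" % (username, password)).encode()).decode()
    request.add_header("Authorization", "Basic " + token)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def unhealthy_services(services_payload):
    """Return (name, health) pairs for every service not reporting GOOD.

    Assumes the payload looks like {"items": [{"name": ..., "healthSummary": ...}]}.
    """
    return [
        (service["name"], service.get("healthSummary", "UNKNOWN"))
        for service in services_payload.get("items", [])
        if service.get("healthSummary") != "GOOD"
    ]


# Example usage (host, cluster name, and credentials are placeholders):
# payload = fetch_json("http://cm-host.example.com:7180",
#                      "/api/v1/clusters/Cluster1/services", "admin", "admin")
# print(unhealthy_services(payload))
```

The parsing step is kept separate from the HTTP call so it can be exercised against a saved response without a live Cloudera Manager server.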
  30. Cloudera Manager API ● http://cloudera.github.com/cm_api/ ● Java and Python client bindings ● Shell ● Export health information into Nagios
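
Exporting health into Nagios comes down to translating a health status into the exit code and status line that a Nagios plugin reports. The sketch below is an illustration, not the integration tooling mentioned in the talk: the function name and the mapping from GOOD / CONCERNING / BAD are assumptions, while the exit codes 0 to 3 follow the standard Nagios plugin convention.

```python
# Standard Nagios plugin exit codes.
NAGIOS_OK, NAGIOS_WARNING, NAGIOS_CRITICAL, NAGIOS_UNKNOWN = 0, 1, 2, 3
NAGIOS_LABELS = ["OK", "WARNING", "CRITICAL", "UNKNOWN"]

# Assumed mapping from a health summary string to a Nagios state.
HEALTH_TO_NAGIOS = {
    "GOOD": NAGIOS_OK,
    "CONCERNING": NAGIOS_WARNING,
    "BAD": NAGIOS_CRITICAL,
}


def nagios_result(service_name, health_summary):
    """Return (exit_code, status_line) for one service's health summary."""
    code = HEALTH_TO_NAGIOS.get(health_summary, NAGIOS_UNKNOWN)
    line = "%s - %s health is %s" % (NAGIOS_LABELS[code], service_name, health_summary)
    return code, line
```

A plugin script would print the status line and call `sys.exit` with the code. Any status not in the mapping falls through to UNKNOWN rather than being reported as healthy.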
  31. Common Integration Questions ● Nagios – yes ● We even have tools to help integrate ● Chef – not yet ● Puppet – yes ● Customers use CM and Puppet together to press a button and stamp out a new cluster ● SNMP – yes ● Events are published and can be integrated
  32. Links ● Hadoop Operations - A Guide for Developers and Administrators – Book by Eric Sammer ● CM Architecture blog – http://blog.cloudera.com/blog/2013/07/how-does-cloudera-manager-work/ ● API Examples and Tutorials – http://cloudera.github.io/cm_api/ – http://blog.cloudera.com/blog/2013/05/how-to-automate-your-hadoop-cluster-from-java/ – http://blog.cloudera.com/blog/2012/09/automating-your-cluster-with-cloudera-manager-api/ ● Cloudera Manager installer link and docs – http://www.cloudera.com/content/support/en/downloads.html – http://www.cloudera.com/content/cloudera-content/cloudera-docs/CM4Ent/latest/Cloudera-Manager-Installation-Guide/Cloudera-Manager-Installation-Guide.html
  33. Enterprise Features ● Easily upload support bundle – Enables proactive support – Fix problems more quickly ● Rolling Upgrades and Restarts ● Backup and Disaster Recovery ● Auditing ● Operational Reports ● Configuration History and Rollback ● LDAP
