Hadoop World 2011: Proven Tools to Manage Hadoop Environments - Joey Jablonski, Dell & Vin Sharma, Intel

This session will answer frequently asked questions about Hadoop, and share proven ways you can overcome challenges in deploying, managing, and tuning Hadoop environments. The discussion topics will include Hadoop operations, configuration management, upgrades and lifecycle management, monitoring and managing power and heat, and Hadoop performance tuning, testing, and optimization. The presenters will also discuss how rapid Hadoop deployment makes life easier for administrators, and talk about Crowbar, an open source Operations Framework.

Published in: Technology
  • Hadoop Operations (10 min)
    – Struggles and Challenges (Dell)
  • Operations Framework (25 min)
    – DevOps-inspired operations framework (Dell)
    – Crowbar (Dell)
    – Monitoring and Management (Intel)
    – Power & Cooling (Dell)
  • Hadoop Lifecycle Management (10 min)
    – Performance Testing - HiTune (Intel)
    – Hadoop Tuning (Intel)
  • For NoSQL data warehouses using Hadoop, you can see the benefit of modern servers over legacy ones. On these two tests, the Xeon 5600-based server cluster significantly outperformed the legacy server cluster, and offered many more features and greater energy efficiency than the older model. It pays to optimize around the right hardware. Legacy servers forgo a lot of performance and energy efficiency, potentially limiting the SLA, the number of users, and the amount of data that can be processed for analysis.
    Hardware:
    – Intel® Xeon® 5600 improves Hadoop workload performance
    – Choosing an optimized server board can reduce power consumption
    – Use Intel® X25-E SATA SSDs to improve performance
    Software & configurations:
    – Use the latest Linux kernel
    – Turn on Intel® Hyper-Threading
    – Optimize the Hadoop configuration
    – Tuning may differ between workload types

    1. Proven Tools to Simplify Hadoop Environments
       Joey Jablonski & Vin Sharma
    2. Intros and Bios
       • Joey Jablonski
         – Dell Principal Solution Architect
         – http://www.linkedin.com/in/joeyjablonski
         – http://mergingbusinessandit.com
       • Vin Sharma
         – Intel Open Source Enterprise Strategist
         – http://www.linkedin.com/in/c1f3rt3xt
         – http://www.intel.com/opensource
    3. Agenda
       • Why Hadoop is difficult for IT to operate
       • How the right tools can make this easier
         – Deployment & Configuration with Dell Crowbar
         – Monitoring and Management with Dell Barclamps
         – Performance Tuning with Intel HiTune
    4. Hadoop Operational Models
       Traditional Datacenter
       • Assigned Servers
       • Rigid Policies
       • Tiered Software
       Cloud Infrastructure
       • Elastic Resources
       • Services (APIs)
       • Distributed Software
    5. Operational Challenges
       • Deployment
         – Complex because of scale (60 nodes to 1000 nodes)
         – Cumbersome because of high-touch processes
       • Configuration & Tuning
         – Error-prone configuration management
         – State management
       • Monitoring and Management
         – Complex troubleshooting and diagnostics
         – No proactive notification of problems
       • Performance Optimization
         – Limitations of traditional tools
    6. CloudOps Framework
    7. Three aspects of revolutionary clouds
       Two Sides of Cloud
       [Diagram labels: Ecosystem + API; Cloud = Operations; Black Box; HW, SW, OPS (CloudOps); APIs; Cloud Ops; O/S; Physical]
    8. Images vs. Layers: Overview
       • Images: Single Unit
         – Configuration + Integrations + Applications + Utilities + Operating System
       • Layers: Stacked Pieces
         – Configuration
         – Integrations
         – Application Foo, Application Bar
         – Utilities
         – Operating System
    9. Images vs. Layers: Lifecycle
       • Images: Replacement
         – [Diagram: each image is Config on top of a single I+A+U+O/S unit, and a new image replaces the old one whole]
       • Layers: Upgrade
         – [Diagram: the stack is Config, I, Foo, Bar, U, OS, and only Bar moves from v1 to v2]
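The difference between replacing an image and upgrading a layer can be sketched in a few lines. This is a minimal illustration, not anything taken from Crowbar: the component names follow the slide, while `build_image`, `upgrade_layer`, `changed`, and the version strings are hypothetical.

```python
# Component names follow the slide; the functions and version strings are
# hypothetical and exist only to illustrate the two lifecycles.
COMPONENTS = ["Config", "Integrations", "Foo", "Bar", "Utilities", "OS"]


def build_image(build_id):
    """An image is one unit: every component carries the same build id."""
    return {name: build_id for name in COMPONENTS}


def upgrade_layer(stack, layer, version):
    """Return a copy of the stack with a single layer moved to a new version."""
    if layer not in stack:
        raise KeyError(layer)
    upgraded = dict(stack)
    upgraded[layer] = version
    return upgraded


def changed(before, after):
    """List the components whose version differs between two deployments."""
    return sorted(name for name in after if before.get(name) != after[name])


v1 = build_image("v1")
image_v2 = build_image("v2")                 # image lifecycle: replace everything
layered_v2 = upgrade_layer(v1, "Bar", "v2")  # layer lifecycle: touch one piece

print(changed(v1, image_v2))
print(changed(v1, layered_v2))
```

The first call reports all six components as changed, while the second reports only `Bar`, which is the point of the layered approach: an upgrade touches one piece and leaves the rest of the stack alone.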
    10. Modular Design: Barclamps
        Dell “Crowbar” Ops Management, from the top layer down:
        • APIs, User Access, & Ecosystem Partners: Nagios, Ganglia, Dashboard
        • Cloud Infrastructure & Dell IP Extensions: Hadoop
        • Core Components & Operating Systems: Crowbar, DNS, Logging, Deployer, NTP, Provisioner
        • Physical Resources: BIOS, IPMI, Network, RAID
    11. Crowbar = Install State Machine
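To make the "install state machine" idea concrete, here is a minimal sketch of a node walking through a fixed sequence of install states. The slide does not list Crowbar's states, so the state names and the `advance` and `install` helpers below are hypothetical and stand in for whatever states the real tool uses.

```python
# Hypothetical install states; the slide does not list Crowbar's actual states.
TRANSITIONS = {
    "discovered": "hardware_configured",
    "hardware_configured": "os_installed",
    "os_installed": "ready",
}


def advance(state):
    """Return the next install state, or raise if none exists."""
    if state == "ready":
        raise ValueError("node is already ready")
    if state not in TRANSITIONS:
        raise ValueError(f"unknown state: {state}")
    return TRANSITIONS[state]


def install(start="discovered"):
    """Walk a node from its starting state to 'ready' and return the path."""
    path = [start]
    while path[-1] != "ready":
        path.append(advance(path[-1]))
    return path


print(install())
```

Because every step is an explicit transition, a node that fails partway through can be resumed from its recorded state, for example `install("os_installed")`, rather than being rebuilt by hand.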
    12. Cloud = Ops
        We have capable hardware & software; the real question is how are we going to operate it as a service?
        • This is CloudOps
        • Software mindset to infrastructure
        • Software is constantly changing
        • Fluid resources instead of servers
        • Manual touch is unacceptable
        Ultimately, all the rules for operating the data center become encoded as automation software.
    13. Second Act
    14. Platform Selection: Dell PowerEdge C2100 for Hadoop based on Intel® Xeon®
        Dell PowerEdge C2100
        • Designed with big data in mind
        • Compact 2U form factor
        • 2-socket 6-core
        • Intel® Xeon® 5620 processor
        • High performance memory system
        • Expansive disk storage
        Recommended Configuration
        • Intel Xeon Processor 5600 series
        • 4-6 1TB or 2TB 7200 RPM SATA SSD
        • 12-24GB DDR3 R-ECC RAM
        • 1-2 dual-port 1GigE
        • Linux kernel 2.6.30 or later
        • Sun Java 6u14 or later
        • Hadoop version 0.20.x or later
        Intel Whitepaper: “Optimizing Hadoop Deployments” (http://software.intel.com/file/31124)
    15. So what seems to be the problem?
        • Dataflow and high-level abstraction make it difficult to understand runtime behaviors
        • Large distributed system makes it difficult to correlate concurrent performance-related activities
    16. HiTune: Hadoop Performance Analyzer
        • Collects metrics from each node
        • Aggregates data using Chukwa
        • Analyzes results using Hadoop
        • Generates reports for visualization
          – System metrics (CPU, Disk I/O, Network I/O, Memory)
          – Hadoop metrics (NameNode, DataNode, JobTracker, TaskTracker, JVM metrics)
          – Dataflow-based statistics (Job, Map Tasks, Reduce Tasks, thread dump for M/R)
          – Summary view of a single job
          – Summary view by comparing multiple jobs
        Apache 2.0 License
    17. HiTune Architecture
        • Tracker
          – Lightweight agent running on each node in the Hadoop cluster
          – Sysstat, Hadoop logs and metrics, Java instrumentation
        • Aggregation engine
          – Merges the results of all the trackers in a distributed fashion
        • Analysis engine
          – Generates reports based on data flow model
        [Diagram labels: Task, Sampler, Tracker, Aggregation engine, Analysis engine, Specification file, Dataflow diagram]
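The tracker, aggregation, and analysis stages above can be pictured as a small pipeline. The sketch below is an in-process stand-in only: HiTune itself aggregates with Chukwa and analyzes with Hadoop, and the `aggregate` and `analyze` functions, the node names, and the sample values here are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def aggregate(tracker_outputs):
    """Merge each node's (metric, value) samples into one list per metric."""
    merged = defaultdict(list)
    for node, samples in tracker_outputs.items():
        for metric, value in samples:
            merged[metric].append((node, value))
    return dict(merged)


def analyze(merged):
    """Report the cluster-wide mean and the busiest node for each metric."""
    report = {}
    for metric, readings in merged.items():
        busiest_node, _ = max(readings, key=lambda reading: reading[1])
        report[metric] = {
            "mean": mean(value for _, value in readings),
            "busiest": busiest_node,
        }
    return report


# Made-up samples standing in for what a tracker on each node might emit.
samples = {
    "node1": [("cpu_percent", 40.0), ("disk_mb_s", 120.0)],
    "node2": [("cpu_percent", 80.0), ("disk_mb_s", 60.0)],
}
print(analyze(aggregate(samples)))
```

Keeping collection, merging, and reporting as separate steps mirrors the slide's split: the trackers stay lightweight, and the heavier correlation work happens after the data has been brought together.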
    18. Case Study
        [Dataflow diagram: Partitioned Input → Map Tasks (map, spill) → shuffle → Reduce Tasks (copier, merge, sort, reduce) → Aggregated Output; streaming dataflow vs. sequential dataflow]
        [Charts: Terasort with zlib; Terasort with LZO]
        • Large gap between end of map and end of shuffle
        • No CPU, I/O, or network bandwidth bottlenecks
        • Adding copiers does not change “shuffle fetchers busy percent” = 100
        • Copier threads idle 80% waiting for memory merge threads
        • Memory merge threads busy mostly due to compression
        • Changing compression codec to LZO closes the gap
        • Improves job running time by 2.3x
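The codec change in this case study is a configuration change rather than a code change. The slide does not show the settings, so the sketch below assumes the codec applies to intermediate map output, uses the Hadoop 0.20-era property names `mapred.compress.map.output` and `mapred.map.output.compression.codec`, and assumes the LZO codec class from the separately distributed hadoop-lzo package. The `map_output_compression` and `to_site_xml` helpers are hypothetical.

```python
import xml.etree.ElementTree as ET

# zlib is Hadoop's DefaultCodec; the LZO class is assumed to come from the
# separately installed hadoop-lzo package, not from Apache Hadoop itself.
CODECS = {
    "zlib": "org.apache.hadoop.io.compress.DefaultCodec",
    "lzo": "com.hadoop.compression.lzo.LzoCodec",
}


def map_output_compression(codec):
    """Return Hadoop 0.20-style properties for compressing map output."""
    return {
        "mapred.compress.map.output": "true",
        "mapred.map.output.compression.codec": CODECS[codec],
    }


def to_site_xml(props):
    """Render properties as a <configuration> fragment for mapred-site.xml."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


print(to_site_xml(map_output_compression("lzo")))
```

The LZO libraries have to be present on every node before the codec class can load, so a switch like this is usually rolled out together with the package install.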
    19. Have at it
        • Pull Crowbar
          – https://github.com/dellcloudedge/crowbar
        • Pull HiTune
          – https://github.com/HiTune/HiTune
    20. Q&A