Welcome to Hadoop 2 Land!

This talk takes you on a rollercoaster ride through Hadoop 2 and explains the most significant changes and components.

The talk was held at the JavaLand conference in Brühl, Germany, on 25.03.2014.

Agenda:

- Welcome Office
- YARN Land
- HDFS 2 Land
- YARN App Land
- Enterprise Land


Transcript

  • 1. Welcome to Hadoop 2 Land (uweseiler)
  • 2. Your Travel Guide: Big Data Nerd, Travelpirate, Photography Enthusiast, Hadoop Trainer, NoSQL Fan Boy
  • 3. Your Travel Agency specializes in... Big Data Nerds, Agile Ninjas, Continuous Delivery Gurus, Enterprise Java Specialists, Performance Geeks. Join us!
  • 4. Your Travel Destination: Welcome Office, YARN Land, HDFS 2 Land, YARN App Land, Enterprise Land, Goodbye Photo
  • 5. Stop #1: Welcome Office
  • 6. In the beginning of Hadoop… there was MapReduce • It could handle data sizes way beyond those of its competitors • It was resilient in the face of failure • It made it easy for users to bring their code and algorithms to the data
  • 7. Hadoop 1 (2007): …but it was too low level
  • 8. Hadoop 1 (2007): …but it was too rigid
  • 9. Hadoop 1 (2007): …but it was batch-only (diagram: HDFS serving several single-purpose batch apps)
  • 10. Hadoop 1 (2007): …but it had limitations • Scalability – Maximum cluster size ~ 4,500 nodes – Maximum concurrent tasks ~ 40,000 – Coarse synchronization in JobTracker • Availability – Failure kills all queued and running jobs • Hard partition of resources into map & reduce slots – Low resource utilization • Lacks support for alternate paradigms and services
  • 11. Hadoop 2 (2013): YARN to the rescue!
  • 12. Stop #2: YARN Land
  • 13. A brief history of Hadoop 2 • Originally conceived & architected by the team at Yahoo! – Arun Murthy created the original JIRA in 2008 and is now the Hadoop 2 release manager • The community has been working on Hadoop 2 for over 4 years • Hadoop 2 based architecture running at scale at Yahoo! – Deployed on 35,000+ nodes for 6+ months
  • 14. Hadoop 2: Next-gen platform • Hadoop 1: HDFS (redundant, reliable storage) + MapReduce (cluster resource mgmt. + data processing) = single-use system for batch apps • Hadoop 2: HDFS 2 (redundant, reliable storage) + YARN (cluster resource management) + MapReduce and others (data processing) = multi-purpose platform for batch, interactive, streaming, …
  • 15. Taking Hadoop beyond batch • Applications run natively in Hadoop • HDFS 2 (redundant, reliable storage): store all data in one place • YARN (cluster resource management): interact with data in multiple ways • Batch (MapReduce), Interactive (Tez), Online (HOYA), Streaming (Storm, …), Graph (Giraph), In-Memory (Spark), Other (Search, …)
  • 16. YARN: Design Goals • Build a new abstraction layer by splitting up the two major functions of the JobTracker • Cluster resource management • Application life-cycle management • Allow other processing paradigms • Flexible API for implementing YARN apps • MapReduce becomes a YARN app • Lots of different YARN apps
  • 17. Hadoop: BigDataOS™ • Traditional operating system: storage (file system) and execution/scheduling (processes/kernel) • Hadoop: storage (HDFS) and execution/scheduling (YARN)
  • 18. YARN: Architectural Overview • Split up the two major functions of the JobTracker: cluster resource management & application life-cycle management • (diagram: a ResourceManager with Scheduler; NodeManagers host AM 1 with containers 1.1/1.2 and AM 2 with containers 2.1/2.2/2.3)
  • 19. YARN: Multi-tenancy I • Different types of applications on the same cluster • (diagram: one ResourceManager/Scheduler; NodeManagers running containers of MapReduce 1 and 2, Tez, HOYA/HBase and Storm side by side)
  • 20. YARN: Multi-tenancy II • Different users and organizations on the same cluster • (diagram: same cluster, plus hierarchical scheduler queues, e.g. root split 30%/60%/10% into DWH, User and Ad-Hoc, with further Dev/Prod and Dev1/Dev2 splits)
  • 21. YARN: Your own app • (diagram: MyApp uses a YARNClient to talk to the ResourceManager via the Application Client Protocol; the app-specific ApplicationMaster uses AMRMClient (Application Master Protocol) and NMClient (Container Management Protocol) to request and launch containers) • DistributedShell is the new WordCount!
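
    To make that client/AM split concrete, here is a minimal, hedged sketch of the client side in Java, roughly what the DistributedShell example does: create a YarnClient, ask the ResourceManager for an application, describe the ApplicationMaster container, and submit it. The AM class name and the queue are placeholders, not from the talk.

        import java.util.Collections;

        import org.apache.hadoop.yarn.api.records.ApplicationId;
        import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;
        import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
        import org.apache.hadoop.yarn.api.records.Resource;
        import org.apache.hadoop.yarn.client.api.YarnClient;
        import org.apache.hadoop.yarn.client.api.YarnClientApplication;
        import org.apache.hadoop.yarn.conf.YarnConfiguration;
        import org.apache.hadoop.yarn.util.Records;

        public class MyAppClient {
          public static void main(String[] args) throws Exception {
            YarnConfiguration conf = new YarnConfiguration();

            // Client side: talk to the ResourceManager (Application Client Protocol)
            YarnClient yarnClient = YarnClient.createYarnClient();
            yarnClient.init(conf);
            yarnClient.start();

            // Ask the RM for a new application and fill in its submission context
            YarnClientApplication app = yarnClient.createApplication();
            ApplicationSubmissionContext appContext = app.getApplicationSubmissionContext();
            appContext.setApplicationName("my-yarn-app");
            appContext.setQueue("default");   // placeholder queue name

            // Describe how to launch the ApplicationMaster container
            ContainerLaunchContext amContainer = Records.newRecord(ContainerLaunchContext.class);
            amContainer.setCommands(Collections.singletonList(
                "$JAVA_HOME/bin/java my.pkg.MyApplicationMaster"));   // hypothetical AM class
            appContext.setAMContainerSpec(amContainer);

            // Resources for the AM container
            Resource amResource = Records.newRecord(Resource.class);
            amResource.setMemory(512);
            amResource.setVirtualCores(1);
            appContext.setResource(amResource);

            ApplicationId appId = yarnClient.submitApplication(appContext);
            System.out.println("Submitted application " + appId);
          }
        }

    Inside the launched ApplicationMaster, AMRMClient and NMClient (or their async variants) are used in the same spirit to request containers from the ResourceManager and to start processes on the NodeManagers.
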
  • 22. Stop #3: HDFS 2 Land
  • 23. HDFS 2: In a nutshell • Removes tight coupling of Block Storage and Namespace • Adds (built-in) High Availability • Better Scalability & Isolation • Increased performance • Details: https://issues.apache.org/jira/browse/HDFS-1052
  • 24. HDFS 2: Federation • NameNodes do not talk to each other • Each NameNode manages only a slice of the namespace (its own namespace state and block map) • DataNodes can store blocks managed by any NameNode • Block storage acts as a generic storage service, with one block pool per namespace over JBOD disks • Horizontally scales IO and storage
  • 25. HDFS 2: Architecture • The Active NameNode maintains the block map and the edits file; the Standby NameNode simultaneously reads and applies the edits • DataNodes report to both NameNodes but take orders only from the Active • Shared state lives either on NFS or in quorum-based storage (JournalNodes)
  • 26. HDFS 2: High Availability • DataNodes send heartbeats & block reports to both NameNodes • One ZKFailoverController per NameNode monitors the health of NN, OS and HW, sends heartbeats to the ZooKeeper ensemble and holds a special lock znode for the Active • Shared state via the JournalNodes
  • 27. HDFS 2: Write-Pipeline • Earlier versions of HDFS: files were immutable, write-once-read-many model • New features in HDFS 2: files can be reopened for append • New primitives: hflush and hsync • Replace DataNode on failure • Read consistency: a reader can read from any DataNode in the write pipeline and then fail over to any other node
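
    A hedged, minimal sketch of what those primitives look like to a client through the standard FileSystem API; the path is invented and the configuration is assumed to point at an HDFS 2 cluster with append enabled.

        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.fs.FSDataOutputStream;
        import org.apache.hadoop.fs.FileSystem;
        import org.apache.hadoop.fs.Path;

        public class WritePipelineDemo {
          public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();   // picks up core-site.xml / hdfs-site.xml
            FileSystem fs = FileSystem.get(conf);
            Path path = new Path("/tmp/write-pipeline-demo.log");   // placeholder path

            FSDataOutputStream out = fs.create(path);
            out.writeBytes("first record\n");
            out.hflush();   // flush through the DataNode pipeline: new readers can now see the data
            out.writeBytes("second record\n");
            out.hsync();    // additionally ask the DataNodes to sync to disk (stronger durability)
            out.close();

            // HDFS 2 also allows reopening an existing file for append
            FSDataOutputStream appendOut = fs.append(path);
            appendOut.writeBytes("appended record\n");
            appendOut.close();
          }
        }
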
  • 28. HDFS 2: Snapshots • Admins can create point-in-time snapshots of HDFS • Of the entire file system • Of a specific data set (a sub-tree directory of the file system) • Restore the state of the entire file system or of a data set to a snapshot (like Apple Time Machine) • Protect against user errors • Snapshot diffs identify changes made to a data set • Keep track of how raw or derived/analytical data changes over time
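
    For illustration, a hedged sketch of the snapshot API as exposed on DistributedFileSystem; the directory and snapshot names are invented, and operators would typically use the equivalent hdfs dfsadmin -allowSnapshot / hdfs dfs -createSnapshot commands instead.

        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.fs.FileSystem;
        import org.apache.hadoop.fs.Path;
        import org.apache.hadoop.hdfs.DistributedFileSystem;

        public class SnapshotDemo {
          public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();   // assumes fs.defaultFS points at an HDFS cluster
            DistributedFileSystem dfs = (DistributedFileSystem) FileSystem.get(conf);

            Path dataSet = new Path("/data/weblogs");           // placeholder data set

            dfs.allowSnapshot(dataSet);                         // admin: make the directory snapshottable
            dfs.createSnapshot(dataSet, "before-cleanup");      // point-in-time snapshot of the sub-tree

            // ... jobs modify /data/weblogs ...

            // Snapshots appear as a read-only path under .snapshot
            Path snapshotted = new Path("/data/weblogs/.snapshot/before-cleanup");
            System.out.println("Snapshot exists: " + dfs.exists(snapshotted));

            dfs.deleteSnapshot(dataSet, "before-cleanup");      // clean up when no longer needed
          }
        }
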
  • 29. HDFS 2: NFS Gateway • Supports NFS v3 (NFS v4 is work in progress) • Supports all HDFS commands • List files • Copy, move files • Create and delete directories • Ingest for large-scale analytical workloads • Load immutable files as source for analytical processing • No random writes • Stream files into HDFS • Log ingest by applications writing directly to the HDFS client mount
  • 30. HDFS 2: Performance • Many improvements • New appendable write-pipeline • Read-path improvements for fewer memory copies • Short-circuit local reads for 2-3x faster random reads • I/O improvements using posix_fadvise() • libhdfs improvements for zero-copy reads • Significant improvements: I/O 2.5 - 5x faster
  • 31. Stop #4: YARN App Land
  • 32. YARN Apps: Overview • MapReduce 2 (Batch) • Tez (DAG Processing) • Storm (Stream Processing) • Samza (Stream Processing) • Apache S4 (Stream Processing) • Spark (In-Memory) • HOYA (HBase on YARN) • Apache Giraph (Graph Processing) • Apache Hama (Bulk Synchronous Parallel) • Elastic Search (Scalable Search)
  • 33. YARN Apps: Overview (repeats the list from slide 32)
  • 34. MapReduce 2: In a nutshell • MapReduce is now a YARN app • No more map and reduce slots, it's containers now • No more JobTracker, it's a YarnAppmaster library now • Multiple versions of MapReduce • The older mapred APIs work without modification or recompilation • The newer mapreduce APIs may need to be recompiled • Still has one master server component: the Job History Server • The Job History Server stores the execution history of jobs • Used to audit prior execution of jobs • Will also be used by the YARN framework to store charge-backs at that level • Better cluster utilization • Increased scalability & availability
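
    As a reminder of how little the job code itself changes, here is the classic WordCount written against the newer mapreduce API; on a Hadoop 2 cluster this driver is submitted to YARN and runs in containers, with no JobTracker involved. Input and output paths are taken from the command line.

        import java.io.IOException;
        import java.util.StringTokenizer;

        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.fs.Path;
        import org.apache.hadoop.io.IntWritable;
        import org.apache.hadoop.io.Text;
        import org.apache.hadoop.mapreduce.Job;
        import org.apache.hadoop.mapreduce.Mapper;
        import org.apache.hadoop.mapreduce.Reducer;
        import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
        import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

        public class WordCount2 {

          public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);
            private final Text word = new Text();

            @Override
            protected void map(Object key, Text value, Context ctx)
                throws IOException, InterruptedException {
              StringTokenizer tokens = new StringTokenizer(value.toString());
              while (tokens.hasMoreTokens()) {
                word.set(tokens.nextToken());
                ctx.write(word, ONE);
              }
            }
          }

          public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
            @Override
            protected void reduce(Text key, Iterable<IntWritable> values, Context ctx)
                throws IOException, InterruptedException {
              int sum = 0;
              for (IntWritable v : values) {
                sum += v.get();
              }
              ctx.write(key, new IntWritable(sum));
            }
          }

          public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            Job job = Job.getInstance(conf, "word count");   // becomes a YARN application on Hadoop 2
            job.setJarByClass(WordCount2.class);
            job.setMapperClass(TokenizerMapper.class);
            job.setCombinerClass(IntSumReducer.class);
            job.setReducerClass(IntSumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
          }
        }
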
  • 35. MapReduce 2: Shuffle • Faster Shuffle • Better embedded server: Netty • Encrypted Shuffle • Secures the shuffle phase as data moves across the cluster • Requires 2-way HTTPS, certificates on both sides • Causes significant CPU overhead, reserve 1 core for this work • Certificates stored on each node (provisioned with the cluster), refreshed every 10 secs • Pluggable Shuffle/Sort • Shuffle is the first phase in MapReduce that is guaranteed to not be data-local • Pluggable Shuffle/Sort allows application developers or hardware developers to intercept the network-heavy workload and optimize it • Typical implementations have hardware components like fast networks and software components like sorting algorithms • API will change with future versions of Hadoop
  • 36. MapReduce 2: Performance • Key optimizations • No hard segmentation of resources into map and reduce slots • The YARN scheduler is more efficient • The MR2 framework has become more efficient than MR1: the shuffle phase in MRv2 is more performant thanks to Netty • 40,000+ nodes running YARN across over 365 PB of data • About 400,000 jobs per day for about 10 million hours of compute time • Estimated 60% – 150% improvement in node usage per day • Got rid of a whole 10,000-node datacenter because of the increased utilization
  • 37. YARN Apps: Overview (repeats the list from slide 32)
  • 38. Apache Tez: In a nutshell • Distributed execution framework that works on computations represented as dataflow graphs • Tez is Hindi for "speed" • Naturally maps to execution plans produced by query optimizers • Highly customizable to meet a broad spectrum of use cases and to enable dynamic performance optimizations at runtime • Built on top of YARN
  • 39. Apache Tez: Architecture • Task with pluggable Input, Processor & Output • "Classical" map task: HDFS Input, Map Processor, Sorted Output • "Classical" reduce task: Shuffle Input, Reduce Processor, HDFS Output • A YARN ApplicationMaster runs the DAG of Tez tasks
  • 40. Apache Tez: Tez Service • MapReduce query startup is expensive: – Job-launch & task-launch latencies are fatal for short queries (in the order of 5s to 30s) • Solution: – Tez Service (= preallocated ApplicationMaster) • Removes job-launch overhead (ApplicationMaster) • Removes task-launch overhead (pre-warmed containers) – Hive (or Pig) submits its query plan to the Tez Service – Native Hadoop service, not ad-hoc
  • 41. Apache Tez: The new primitive • Hadoop 1: MapReduce as the base (cluster resource mgmt. + data processing) on top of HDFS, with Pig, Hive and others layered on MapReduce • Hadoop 2: Apache Tez as the base execution engine on top of YARN (cluster resource management) and HDFS, with MR, Pig and Hive running on Tez and real-time engines such as Storm alongside
  • 42. Apache Tez: Performance • Example query: SELECT a.state, COUNT(*), AVERAGE(c.price) FROM a JOIN b ON (a.id = b.id) JOIN c ON (a.itemId = c.itemId) GROUP BY a.state • Existing Hive: parse query 0.5s, create plan 0.5s, launch MapReduce 20s, process MapReduce 10s, total 31s • Hive/Tez: parse query 0.5s, create plan 0.5s, launch MapReduce 20s, process MapReduce 2s, total 23s • Tez & Hive Service: parse query 0.5s, create plan 0.5s, submit to Tez Service 0.5s, process 2s, total 3.5s • (* No exact numbers, for illustration only)
  • 43. Stinger Initiative: In a nutshell
  • 44. Stinger: Overall Performance (* Real numbers, but handle with care!)
  • 45. YARN Apps: Overview (repeats the list from slide 32)
  • 46. Storm: In a nutshell • Stream processing • Real-time processing • Developed as a standalone application • https://github.com/nathanmarz/storm • Ported to YARN • https://github.com/yahoo/storm-yarn
  • 47. Storm: Conceptual view • Spout: source of streams • Bolt: consumer of streams, processes tuples, possibly emits new tuples • Tuple: list of name-value pairs • Stream: unbounded sequence of tuples • Topology: network of spouts & bolts as the nodes and streams as the edges
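
    A hedged sketch of how those concepts are wired together in code; the spout and bolt classes are placeholders for your own implementations, and the backtype.storm package names reflect the pre-Apache Storm releases current at the time of the talk.

        import backtype.storm.Config;
        import backtype.storm.StormSubmitter;
        import backtype.storm.topology.TopologyBuilder;
        import backtype.storm.tuple.Fields;

        public class WordCountTopology {
          public static void main(String[] args) throws Exception {
            TopologyBuilder builder = new TopologyBuilder();

            // Spout: source of the stream (placeholder class reading sentences from somewhere)
            builder.setSpout("sentences", new SentenceSpout(), 2);

            // Bolts: consume tuples and possibly emit new tuples downstream
            builder.setBolt("split", new SplitSentenceBolt(), 4)
                   .shuffleGrouping("sentences");                 // distribute tuples randomly
            builder.setBolt("count", new WordCountBolt(), 4)
                   .fieldsGrouping("split", new Fields("word"));  // same word goes to the same bolt task

            Config conf = new Config();
            conf.setNumWorkers(2);

            // On YARN, the same topology would be submitted to a Storm cluster started via storm-yarn
            StormSubmitter.submitTopology("word-count", conf, builder.createTopology());
          }
        }
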
  • 48. YARN Apps: Overview (repeats the list from slide 32)
  • 49. Spark: In a nutshell • High-speed in-memory analytics over Hadoop and Hive • Separate MapReduce-like engine – Speedup of up to 100x – On-disk queries 5-10x faster • Spark is now a top-level Apache project – http://spark.apache.org • Compatible with Hadoop's storage API • Spark can be run on top of YARN – http://spark.apache.org/docs/0.9.0/running-on-yarn.html
  • 50. Spark: RDD • Key idea: Resilient Distributed Datasets (RDDs) • Read-only partitioned collection of records • Optionally cached in memory across the cluster • Manipulated through parallel operators • Support only coarse-grained operations • Map • Reduce • Group-by transformations • Automatically recomputed on failure
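
    A hedged sketch of the RDD model using the Java API of the Spark version referenced above (0.9.x, hence anonymous Function classes rather than Java 8 lambdas); the application name and input path are placeholders.

        import org.apache.spark.SparkConf;
        import org.apache.spark.api.java.JavaRDD;
        import org.apache.spark.api.java.JavaSparkContext;
        import org.apache.spark.api.java.function.Function;
        import org.apache.spark.api.java.function.Function2;

        public class RddDemo {
          public static void main(String[] args) {
            SparkConf conf = new SparkConf().setAppName("rdd-demo");   // master is set by spark-submit / YARN
            JavaSparkContext sc = new JavaSparkContext(conf);

            // A read-only, partitioned collection of records
            JavaRDD<String> lines = sc.textFile("hdfs:///data/weblogs");   // placeholder path

            // Coarse-grained transformation (map), optionally cached in memory across the cluster
            JavaRDD<Integer> lengths = lines.map(new Function<String, Integer>() {
              public Integer call(String line) { return line.length(); }
            }).cache();

            // Parallel action (reduce); lost partitions are recomputed from lineage on failure
            int totalChars = lengths.reduce(new Function2<Integer, Integer, Integer>() {
              public Integer call(Integer a, Integer b) { return a + b; }
            });

            System.out.println("Total characters: " + totalChars);
            sc.stop();
          }
        }
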
  • 51. Spark: Data Sharing
  • 52. YARN Apps: Overview (repeats the list from slide 32)
  • 53. HOYA: In a nutshell • Create on-demand HBase clusters • Small HBase clusters in a large YARN cluster • Dynamic HBase clusters • Self-healing HBase clusters • Elastic HBase clusters • Transient/intermittent clusters for workflows • Configure custom configurations & versions • Better isolation • More efficient utilization/sharing of the cluster
  • 54. HOYA: Creation of the AppMaster • (diagram: the HOYA client uses a YARNClient plus a HOYA-specific API to ask the ResourceManager to launch the HOYA ApplicationMaster in a container)
  • 55. HOYA: Deployment of HBase • (diagram: the HOYA ApplicationMaster requests further containers and starts the HBase Master and Region Servers inside them)
  • 56. HOYA: Bind via ZooKeeper • (diagram: the HBase client locates the dynamically placed HBase Master and Region Servers via ZooKeeper)
  • 57. YARN Apps: Overview (repeats the list from slide 32)
  • 58. Giraph: In a nutshell • Giraph is a framework for processing semi-structured graph data on a massive scale • Giraph is loosely based upon Google's Pregel • Both systems are inspired by the Bulk Synchronous Parallel model • Giraph performs iterative calculations on top of an existing Hadoop cluster • Uses a single map-only job • Apache top-level project since 2012 – http://giraph.apache.org
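
    For flavour, a hedged sketch of a vertex-centric BSP computation that propagates the maximum vertex value through a graph. It is written against Giraph's BasicComputation API; exact class names and signatures differ between Giraph versions, so treat it as an illustration of the model rather than copy-paste code.

        import java.io.IOException;

        import org.apache.giraph.graph.BasicComputation;
        import org.apache.giraph.graph.Vertex;
        import org.apache.hadoop.io.DoubleWritable;
        import org.apache.hadoop.io.FloatWritable;
        import org.apache.hadoop.io.LongWritable;

        public class MaxValueComputation
            extends BasicComputation<LongWritable, DoubleWritable, FloatWritable, DoubleWritable> {

          @Override
          public void compute(Vertex<LongWritable, DoubleWritable, FloatWritable> vertex,
                              Iterable<DoubleWritable> messages) throws IOException {
            boolean changed = (getSuperstep() == 0);   // everyone announces its value in superstep 0

            // Adopt the largest value received from neighbours in the previous superstep
            for (DoubleWritable msg : messages) {
              if (msg.get() > vertex.getValue().get()) {
                vertex.setValue(new DoubleWritable(msg.get()));
                changed = true;
              }
            }

            // Keep spreading only while something changed; otherwise let the computation converge
            if (changed) {
              sendMessageToAllEdges(vertex, vertex.getValue());
            }
            vertex.voteToHalt();
          }
        }
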
  • 59. Stop #5: Enterprise Land
  • 60. Falcon: In a nutshell • A framework for managing data processing in Hadoop clusters • Falcon runs as a standalone server as part of the Hadoop cluster • Key features: • Data replication handling • Data lifecycle management • Process coordination & scheduling • Declarative data process programming • Apache incubation status • http://falcon.incubator.apache.org
  • 61. Falcon: One-stop shop for data management needs • Orchestrates tools such as Oozie, Sqoop, DistCp, Flume, MapReduce, Pig & Hive • Covers data processing, replication, retention, scheduling, reprocessing and multi-cluster management
  • 62. Falcon: Weblog Use Case • Weblogs saved hourly to the primary cluster • HDFS location is /weblogs/{date} • Desired data policy: • Replicate weblogs to the secondary cluster • Evict weblogs from the primary cluster after 2 days • Evict weblogs from the secondary cluster after 1 week
  • 63. Falcon: Weblog Use Case (feed definition)

        <feed description="" name="feed-weblogs" xmlns="uri:falcon:feed:0.1">
          <frequency>hours(1)</frequency>
          <clusters>
            <cluster name="cluster-primary" type="source">
              <validity start="2012-01-01T00:00Z" end="2099-12-31T00:00Z"/>
              <retention limit="days(2)" action="delete"/>
            </cluster>
            <cluster name="cluster-secondary" type="target">
              <validity start="2012-01-01T00:00Z" end="2099-12-31T00:00Z"/>
              <retention limit="days(7)" action="delete"/>
            </cluster>
          </clusters>
          <locations>
            <location type="data" path="/weblogs/${YEAR}-${MONTH}-${DAY}-${HOUR}"/>
          </locations>
          <ACL owner="hdfs" group="users" permission="0755"/>
          <schema location="/none" provider="none"/>
        </feed>
  • 64. Knox: In a nutshell • Provides a single point of authentication and access for Apache Hadoop services in a cluster • The gateway runs as a server (or a cluster of servers) that provides centralized access to one or more Hadoop clusters • The goal is to simplify Hadoop security for both users and operators • Apache incubation status • http://knox.incubator.apache.org
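
    As a hedged illustration of what that centralized access looks like from a client, here is a plain HTTPS call to WebHDFS routed through a Knox gateway. The host, port, topology name ("sandbox"), credentials and path are placeholders, the exact URL layout depends on the deployed gateway topology, and the JVM is assumed to trust the gateway's TLS certificate.

        import java.io.BufferedReader;
        import java.io.InputStreamReader;
        import java.net.HttpURLConnection;
        import java.net.URL;
        import java.util.Base64;

        public class KnoxWebHdfsClient {
          public static void main(String[] args) throws Exception {
            // The client only ever talks to the gateway, never to the NameNode directly
            URL url = new URL("https://knox.example.com:8443/gateway/sandbox/webhdfs/v1/tmp?op=LISTSTATUS");

            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            String credentials = Base64.getEncoder()
                .encodeToString("guest:guest-password".getBytes("UTF-8"));
            conn.setRequestProperty("Authorization", "Basic " + credentials);  // Knox authenticates the user

            try (BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream()))) {
              String line;
              while ((line = in.readLine()) != null) {
                System.out.println(line);   // JSON directory listing from WebHDFS behind the gateway
              }
            }
          }
        }
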
  • 65. Knox: Architecture • (diagram: clients in the DMZ (browser, REST client, JDBC client, Ambari/Hue) reach a Knox gateway cluster behind the firewall; the gateway authenticates users against identity providers / a Kerberos SSO provider and forwards requests to one or more secure Hadoop clusters)
  • 66. Final Stop: Goodbye Photo
  • 67. Hadoop 2: Summary 1. Scale 2. New programming models & services 3. Beyond Java 4. Improved cluster utilization 5. Enterprise Readiness
  • 68. One more thing… Let's get started with Hadoop 2!
  • 69. Hortonworks Sandbox 2.0 http://hortonworks.com/products/hortonworks-sandbox
  • 70. Trainings • Developer Training: 19.-22.05.2014 Düsseldorf, 30.06.-03.07.2014 München, 04.-07.08.2014 Frankfurt • Admin Training: 26.-28.05.2014 Düsseldorf, 07.-09.07.2014 München, 11.-13.08.2014 Frankfurt • Details: https://www.codecentric.de/schulungen-und-workshops/
  • 71. Thanks for traveling with me