Improving Hadoop Resiliency and Operational Efficiency with EMC Isilon

Published in: Technology


  1. IMPROVING HADOOP RESILIENCY & OPERATIONAL EFFICIENCY WITH EMC ISILON. MODERNIZE. © Copyright 2015 EMC Corporation. All rights reserved.
  2. A LITTLE BIT ABOUT ME AND WHAT I DO FOR EMC. Boni Bruno, CISSP, CISM, CGEIT. Principal Solutions Architect, Analytics; Emerging Technologies Division | EMC
  3. Agenda
     • Analyze Hadoop’s behavior under different failure scenarios.
     • Review how EMC Isilon improves Hadoop resiliency and operations.
  4. Hadoop Deployment Considerations
  6. DataNode Failures. DataNode failures affect the availability of job input and output data, and they delay the read and write operations that are central to Hadoop’s performance.
  7. DataNode Shutdown
     Log message:
       WARN org.apache.hadoop.hdfs.server.datanode.DataNode: DataNode is shutting down: DataNode failed volumes:/data2/dfs/current;
       2016-04-22 13:01:00,112 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:svc-platfora (auth:SIMPLE) cause:java.io.IOException: Block blk_2910942244825575033_338680521 is not valid.
       2016-04-22 13:01:00,112 INFO org.apache.hadoop.ipc.Server: IPC Server handler 50 on 50020, call org.apache.hadoop.hdfs.protocol.ClientDatanodeProtocol.getBlockLocalPathInfo from 172.28.10.40:55874: error: java.io.IOException: Block blk_2910942244825575033_338680521 is not valid.
       java.io.IOException: Block blk_2910942244825575033_338680521 is not valid.
     Note: HDFS does not currently support decommissioning a single disk. An HDFS DataNode can only be decommissioned as a whole.
  8. hdfs-site.xml
       <property>
         <name>dfs.datanode.failed.volumes.tolerated</name>
         <value>0</value>
       </property>
       <property>
         <name>dfs.datanode.data.dir</name>
         <value>/data1/dfs,/data2/dfs,/data3/dfs</value>
       </property>
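To make the effect of these two settings concrete, here is a minimal Python sketch that parses the properties from the slide and checks whether a DataNode would keep serving after a volume failure. It assumes the usual rule that a DataNode stops offering service once more volumes have failed than `dfs.datanode.failed.volumes.tolerated` allows. The `<configuration>` wrapper and the helper names `load_properties` and `datanode_survives` are illustrative assumptions, not part of the deck or of Hadoop.

```python
import xml.etree.ElementTree as ET

# The two properties from the slide, wrapped in the <configuration> root
# element that an hdfs-site.xml file normally uses (wrapper is an assumption).
HDFS_SITE = """
<configuration>
  <property>
    <name>dfs.datanode.failed.volumes.tolerated</name>
    <value>0</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/data1/dfs,/data2/dfs,/data3/dfs</value>
  </property>
</configuration>
"""


def load_properties(xml_text):
    """Return the <property> entries as a {name: value} dictionary."""
    root = ET.fromstring(xml_text.strip())
    return {p.findtext("name"): p.findtext("value") for p in root.iter("property")}


def datanode_survives(props, failed_volumes):
    """Hypothetical helper: True if the DataNode would keep serving.

    Assumes the DataNode shuts down once the number of failed volumes
    exceeds dfs.datanode.failed.volumes.tolerated.
    """
    tolerated = int(props["dfs.datanode.failed.volumes.tolerated"])
    return len(failed_volumes) <= tolerated


props = load_properties(HDFS_SITE)
data_dirs = props["dfs.datanode.data.dir"].split(",")
survives = datanode_survives(props, ["/data2/dfs"])

print("Configured volumes:", data_dirs)
print("Survives loss of /data2/dfs:", survives)
```

With the value of 0 shown on the slide, losing the single `/data2/dfs` volume is enough to take the whole DataNode out of service, which matches the shutdown message in the log on the previous slide.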
  9. Recovering DataNodes
     The fix for the error logged above is to replace any failed disks behind the /data2 volume and to recreate the data directory structure defined by “dfs.datanode.data.dir”.
     Recovery steps:
       1. Replace the failed hardware.
       2. Restore the data volume, using OS utilities to recreate the file system and mount it.
       3. mkdir /data2/dfs
       4. chown hdfs:hadoop /data2/dfs
       5. service hadoop-hdfs-datanode start
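Steps 3 to 5 of the recovery procedure are plain commands, so they are easy to script once the hardware and file system work in steps 1 and 2 has been done by hand. The sketch below only builds and prints the commands as a dry run; it executes nothing. The function name `recovery_commands` and its parameters are hypothetical, and the directory, owner, and service name are simply the values shown on the slide.

```python
import shlex


def recovery_commands(data_dir, owner="hdfs:hadoop",
                      service="hadoop-hdfs-datanode"):
    """Return steps 3-5 from the slide as argument lists (nothing is executed)."""
    return [
        ["mkdir", data_dir],             # step 3: recreate the data directory
        ["chown", owner, data_dir],      # step 4: hand it back to the HDFS user
        ["service", service, "start"],   # step 5: restart the DataNode
    ]


commands = recovery_commands("/data2/dfs")

# Dry run: print each command so an operator can review it before running it.
for argv in commands:
    print(" ".join(shlex.quote(arg) for arg in argv))
```

Keeping the commands as argument lists rather than one shell string means they could later be handed to `subprocess.run` without extra quoting, if an operator chooses to automate the step.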
  10. TaskTracker Failures. TaskTracker failures are equally important because they affect running tasks as well as the availability of intermediate data, i.e. map outputs.
  11. What’s the impact? Surprisingly, a single failure can lead to large and unpredictable variations in job completion time. For example, a job that runs in 220s without failures can take anywhere from 220s to as much as 1000s under TaskTracker failures, and up to 700s under DataNode failures. Ref: Florin Dinu & Eugene Ng, Rice University
  12. Why?
     • Hadoop’s speculative execution (SE) algorithm can be negatively influenced by the presence of fast-advancing tasks. DataNode failures are one cause of such fast tasks.
     • Hadoop tasks are not good at sharing failure information. The unfortunate effect is that multiple tasks can waste time discovering a failure that another task has already identified.
     • Temporary overload conditions such as network congestion or excessive end-host load can lead to TCP connection failures.
  13. ISILON SCALE-OUT NAS ARCHITECTURE
     [Diagram layers: Client/Application Layer; Ethernet Layer (GigE, 10 GigE); Network Protocols (CIFS, NFS, FTP, HTTP, HDFS for Hadoop, REST for Object); OneFS Operating Environment (single file system/volume); Intra-cluster Communication Layer]
  14. HDFS: Standard Hadoop Cluster
     [Diagram: source systems (Decision Support Databases, Web Click data, OLAP, EDW) feed Landing Zone Servers over HTTP, CIFS, FTP, and NFS; the Hadoop cluster has a NameNode and combined data and compute nodes running MapReduce, and each file is stored 3X (file, file copy2, file copy3)]
     Step 1: Data is copied into the Landing Zone
     Step 2: Data is copied into the Cluster (3 times)
     Step 3: Hadoop Jobs are run
  15. HADOOP WITH ISILON SCALE-OUT NAS STORAGE
     1. Multi-Protocol Scale-Out Storage Platform: NFS, CIFS, FTP, HTTP, HDFS
     2. Highly Resilient, Predictable Scalability: Distributed NameNode & DataNode
     3. Enterprise Data Protection & Governance: SnapshotIQ, SyncIQ, SmartLock, ACLs, ...
     4. Industry-Leading Storage Efficiency: >80% Storage Utilization
     5. Independent Scalability with Optimized QoS: Optimally Scale Storage & Compute
     6. Consolidate Data Silos: Industry Standard Protocols, Bring Applications to Shared Data
  16. Better Hadoop: What If You Could...?
     • Have implicit high availability, automatically
     • Elastically & independently scale compute & storage
     • Efficiently protect data with “erasure coding”
     • Use your HDFS system for non-Hadoop processing
     • Automatically have differentiated QoS
     • Run multiple Hadoop distros at the same time
  17. ISILON ONEFS: BUILT FOR BIG DATA
     Massive Scalability
     • Automates activities “unfit for humans”
     • Symmetric scale-out architecture
     • Fully distributed, fine-grained services
     • Unified IP storage (NFS, SMB, Object, HDFS)
  18. HADOOP ARCHITECTURE – DAS VS ISILON
     [Diagram: with DAS, a NameNode and combined Data Node + Compute Node servers share an Ethernet network; with Isilon, dedicated Compute Nodes connect over Ethernet to an Isilon cluster that provides the name node and datanode services]
  19. HDFS: Integrated Isilon and Hadoop
     [Diagram: source systems (Decision Support Databases, Web Click data, OLAP, EDW) reach Isilon over SMB, NFS, HTTP, FTP, and HDFS; Isilon provides the name node and datanode services to the MapReduce nodes of the Hadoop cluster]
     Step 1: Much or all of the data lives on the Isilon/Hadoop cluster
     Step 2: Jobs are run
  20. DAS Hadoop = at least 5 copies
     [Diagram: an Existing Virtualized Data Center with Existing Primary Storage holding Unstructured Data, next to a DAS Hadoop Infrastructure. Copy 1: Primary Data. Copy 2: Copy of Data. Copies 3 to 5: HDFS Rep Count = 3]
     It takes >24 hours to transfer 100TB into DAS Hadoop over a 10Gb Ethernet network.
  21. TIME-TO-RESULTS
     • Data Copy Analysis (“Hadoop on a Stick”): data moves from Existing Primary Storage across the Data Center Network into Hadoop. Have you ever copied 100TB from Primary Storage to a Hadoop system? How long does it take to copy 100TB from one place to another over a 10Gb link? >24 Hours
     • In-Place Analysis: Hadoop Compute Nodes read only the data relevant to the analysis from Existing Primary Storage across the Data Center Network.
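The ">24 hours" figure can be sanity-checked with simple arithmetic. The sketch below assumes decimal units (1 TB = 10^12 bytes, 1 Gb/s = 10^9 bits per second). The 90% link efficiency is an assumed illustrative value for protocol overhead, not a number from the deck.

```python
def transfer_hours(terabytes, link_gbps, efficiency=1.0):
    """Hours needed to move `terabytes` over a `link_gbps` link.

    `efficiency` is the fraction of line rate actually achieved
    (1.0 means a perfectly utilized link).
    """
    bits = terabytes * 1e12 * 8
    seconds = bits / (link_gbps * 1e9 * efficiency)
    return seconds / 3600


ideal = transfer_hours(100, 10)                       # full line rate
realistic = transfer_hours(100, 10, efficiency=0.9)   # assumed 90% efficiency

print(f"100 TB over 10 Gb/s at line rate: {ideal:.1f} hours")
print(f"100 TB over 10 Gb/s at 90% efficiency: {realistic:.1f} hours")
```

Even at full line rate the copy takes roughly 22 hours, so any realistic overhead pushes it past a day, and that is before the data is replicated three times inside the cluster.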
  22. ISILON ENTERPRISE HADOOP
     [Diagram: New Hadoop Compute Nodes in the Existing Virtualized Data Center use the native HDFS protocol to read the single copy of Primary Data (Unstructured Data) on Existing Primary Storage]
     • No replication required (use your existing data)
     • Store 1 copy instead of 5
     • Industry-leading time to results: no need to wait to transfer data into Hadoop
     • Start analyzing data immediately: no need to wait >24 hours to start
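A rough capacity comparison shows what "1 copy instead of 5" means in raw storage. The sketch takes the five copies from the DAS slide (primary data, a staged copy, and three HDFS replicas) and the ">80% storage utilization" figure quoted earlier for Isilon. The 100 TB data set size and the function name `raw_capacity_tb` are illustrative assumptions.

```python
def raw_capacity_tb(usable_tb, copies=1, utilization=1.0):
    """Raw capacity needed to hold `usable_tb` of data.

    `copies` counts full copies of the data set; `utilization` is the
    fraction of raw capacity left for data after protection overhead.
    """
    return usable_tb * copies / utilization


data_set_tb = 100  # assumed size, for illustration only

das_raw = raw_capacity_tb(data_set_tb, copies=5)             # 5 full copies
isilon_raw = raw_capacity_tb(data_set_tb, utilization=0.8)   # 1 protected copy

print(f"DAS Hadoop workflow: {das_raw:.0f} TB raw")
print(f"Isilon at 80% utilization: {isilon_raw:.0f} TB raw")
print(f"Ratio: {das_raw / isilon_raw:.1f}x")
```

Under these assumptions the DAS workflow needs about four times the raw capacity for the same data set.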
  23. Isilon HDFS Interface
     • Isilon supports the HDFS interfaces for the DataNode and NameNode to host data and metadata
     • The underlying file system is OneFS
     • As simple as pointing the HDFS clients to the DNS name of the Isilon cluster!
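On the Hadoop client side, "pointing the HDFS clients to the DNS name" usually comes down to the `fs.defaultFS` property in core-site.xml. The sketch below generates such a fragment. The host name `isilon.example.com` is a placeholder, port 8020 is assumed as the common HDFS NameNode RPC port, and the helper `core_site_xml` is hypothetical; check the actual values for a given cluster before using them.

```python
import xml.etree.ElementTree as ET


def core_site_xml(cluster_dns_name, port=8020):
    """Build a minimal core-site.xml that points HDFS clients at one DNS name.

    fs.defaultFS is the standard Hadoop property for the default file system;
    the host name and port passed in here are assumptions for illustration.
    """
    config = ET.Element("configuration")
    prop = ET.SubElement(config, "property")
    ET.SubElement(prop, "name").text = "fs.defaultFS"
    ET.SubElement(prop, "value").text = f"hdfs://{cluster_dns_name}:{port}"
    return ET.tostring(config, encoding="unicode")


xml_text = core_site_xml("isilon.example.com")
print(xml_text)
```

Because every compute node resolves the same name, the storage cluster can spread client connections across its nodes without any per-client configuration.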
  24. SCALE-OUT ISILON FOR SCALE-OUT HADOOP
     [Diagram: Compute Nodes connected to Isilon Nodes]
     • Isilon is a scale-out system, like Hadoop
     • HDFS on Isilon functions as a parallel file system
     • Each compute node performs I/O on every Isilon node in the rack
     • I/O bandwidth and storage capacity can be increased linearly simply by adding Isilon nodes
     • Compute can be increased or decreased on the fly and can easily be virtualized
     • With a mesh network that is faster than the disks, data locality is irrelevant
  25. PROTOCOL SUPPORT
     Before:
     • HDFS is not visible to Windows, Unix, Linux, Apple, or any other file system natively
     • Big Data is only used for Big Data
     After:
     • Inherent multi-protocol support in Isilon allows ubiquitous access to all file systems, including Hadoop
     • Big Data is actual data!
  26. ACCESS FILES USING SMB AND HDFS!
     • With Isilon, you can use SMB, NFS, and HDFS to access your files!
     • Simply drag-and-drop input files to your HDFS root directory, analyze them using Hadoop, and drag-and-drop the results back to your desktop.
  27. Support for Multiple Hadoop Distributions
     [Diagram: several groups of MapReduce compute nodes (IBM is among the captured logos) access the same data on Isilon over HDFS, with Isilon answering as both NameNode and DataNode; the same data is also reachable over SMB, NFS, HTTP, and FTP]
  28. ONEFS HDFS PROTOCOL ADVANTAGES
     • HDFS protocol stack written in C++
       – Increased parallel processing
       – Greater scalability
       – Support for CloudPools and file filtering
       – Audit support on cluster
     • Easy web administration interface
       – Full configuration options
     • Extensive CLI options for scripting
       – isi hdfs controls HDFS settings
  29. CONFIGURE VIA WEB ADMIN INTERFACE
     • New HDFS configuration page in the web administration interface
     • Authentication type and root directory: any configuration previously done via the CLI can now be done in the web administration interface
     • Can enable HDFS and change the block size
  30. PIVOTAL HDB (POWERED BY APACHE HAWQ)
  31. RECENT BETA TEST ENVIRONMENT
  32. BETA TEST DETAILS…
  33. BETA TEST DETAILS… Test runs go through the TPC-DS benchmark on regular and Kerberos clusters.
  34. LOAD & ANALYZE RESULTS (UNOFFICIAL)…
  35. HDB 2.0 – ONEFS V8.0 VS V7.2.1.1 (UNOFFICIAL)
  36. HDB 2.0 – DAS VS ONEFS V8 (UNOFFICIAL)
  37. 5 USER CONCURRENCY RESULTS (UNOFFICIAL)…
  38. TPC-DS SCORES (UNOFFICIAL)…
  39. ROLLING UPGRADE -> NON-DISRUPTIVE UPGRADE
     [Diagram: cluster nodes at versions 7.2.1, 8.0, and 8.x, with the labels Non-Disruptive Upgrade and Release Rollback]
  40. WHAT IS CLOUDPOOLS
     FEATURES
     • Seamless tiering of “frozen” data to Cloud
     • Provides OneFS with Cloud scale capacity
     • Choice of public and private Cloud options
     • Optional encryption and compression
     • Seamless policy-based data placement
     • Uses the same SmartPools policy engine
     • Integrated with Backups and Replication
     • Transparent to users and applications
     • Optimized recall of portions of a file
     • OPEX options with Cloud provider while reducing CAPEX
     [Diagram: tiers plotted by Capacity against $/TB, from High to Low: S-Series (Performance), X-Series (Throughput), NL-Series (Archive), HD-Series (Deep archive), CloudPools (Cold archive)]
  41. EXTENDING ISILON TO THE CLOUD
     [Diagram: S-Series, X-Series, NL-Series, and HD-Series tiers extended to a Cloud tier for cold archive]
  42. ISILON AND CLOUDPOOLS COMPARISON
                          Isilon                          Cloud vendors enabled by CloudPools
     Capacity             Up to 68 PB                     Virtually limitless
     Storage platforms    S-, X-, NL-, HD-Series          Public and private cloud providers
     Tiering              Cluster-wide using SmartPools   Within data center and/or cloud
     Management           Same                            Same
     Reporting            Same                            Same
  43. Failure Scenario: Overload condition such as network congestion or excessive end-host load.
     Result: System Performance Degradation
     Support Process: Network Team, Server Team, Greater BI Team/Leads
     HADOOP RESPONSE WITH COTS INFRASTRUCTURE
     • TCP connection failure (failed request)
     • Multiple tasks waste time attempting to discover the failure (failure information is not shared across tasks)
     • Task failure on a node can induce task failures in other healthy nodes
     • Significant performance impact
     • System outage
     KEY BENEFITS WITH ISILON
     • Network congestion on Isilon can be easily avoided via Isilon’s SmartConnect IP load balancing software
     • Each node has four network interfaces, which allows for improved throughput and load balancing
     • Data Node traffic can be isolated from compute traffic due to the tiered architecture
     • Isilon provides monitoring tools for connectivity reporting across the cluster
  44. Failure Scenario: Non-responsiveness from Data Nodes / TaskTracker
     Result: System Performance Degradation (5x delay)
     Support Process: Network Team, Server Team, Greater BI Team/Leads
     HADOOP RESPONSE WITH COTS INFRASTRUCTURE
     • System waits for a non-responsive node for up to 10 minutes
     • Temporary overload conditions such as network congestion or excessive end-host load can lead to TCP connection failures
     • Completed map tasks whose output data is inaccessible are re-executed very conservatively
     • Significant performance impact
     KEY BENEFITS WITH ISILON
     • DataNode non-responsiveness due to network contention is avoided via Isilon’s SmartConnect IP load balancing software
     • Each node has four network interfaces, which allows for improved throughput and load balancing
     • Data Node traffic can be isolated from compute traffic due to the tiered architecture
  45. Failure Scenario: Data Node Complete Failure
     Result: Task Failure, CTO Errors, Cluster Performance Impact
     Support Process: Network Team, Server Team, Greater BI Team/Leads
     HADOOP RESPONSE WITH COTS INFRASTRUCTURE
     • TCP connection failure (failed request)
     • Multiple tasks waste time analyzing and discovering the failure (failure information is not shared)
     • Since tasks do not share failure information, a task involving multiple HDFS requests may encounter multiple connection timeout (CTO) errors
     • The DataNode is considered underprotected, and reprotection is initiated after 10 min.
     • Significant performance impact
     KEY BENEFITS WITH ISILON
     • Isilon is a combination of multiple nodes that all actively participate in reads and writes, and it is fully redundant
     • Failures within Isilon are immediately discovered by the OneFS OS and communicated over the InfiniBand network, with resolution in milliseconds
     • DataNode failures do not occur on Isilon due to Isilon’s high availability and resiliency
  46. Failure Scenario: Slow reads and writes
     Result: Storage Inefficiency, Unused Resources, Network Contention
     Support Process: Network Team, Server Team, Greater BI Team/Leads
     HADOOP RESPONSE WITH COTS INFRASTRUCTURE
     • Replicating data (3X mirroring by default) is required to increase availability
     • Mirroring data across nodes can add massive amounts of IP traffic over existing interfaces, which can cause network congestion
     • Network congestion caused by mirroring can cause failed tasks and delayed or failed processing
     KEY BENEFITS WITH ISILON
     • Isilon uses erasure coding for efficient storage utilization
     • All nodes in an Isilon cluster participate in reads and writes for improved performance
     • All nodes in an Isilon cluster use in-memory and flash-based caching strategies, resulting in improved reads and writes
     • Isilon uses a dedicated InfiniBand network (backplane), alleviating the network contention that can arise between compute and storage nodes in a traditional Hadoop environment
  47. Scalability/Growth
     HADOOP RESPONSE WITH COTS INFRASTRUCTURE
     • Adding both compute and storage when only compute or storage is actually required (cost effectiveness?)
     • Network infrastructure requirements grow exponentially over time
     • 3x mirroring creates massive infrastructure growth as the environment matures and grows
     • Lack of enterprise features for “plug and play” infrastructure, DR, multi-protocol, multi-tenancy, hardware abstraction, SEC 17a-4 (WORM)
     KEY BENEFITS WITH ISILON
     • An Isilon node can be added to a production cluster in under 60 seconds
     • Scale compute and storage independently
     • Minimize network requirements
     • Minimize data center footprint
     • Staging not required
     • Future proof, no downtime during refresh cycles
