• Share
  • Email
  • Embed
  • Like
  • Save
  • Private Content
BigData Clusters Redefined
 

BigData Clusters Redefined

on

  • 162 views

 

Statistics

Views

Total Views
162
Views on SlideShare
162
Embed Views
0

Actions

Likes
1
Downloads
0
Comments
0

0 Embeds 0

No embeds

Accessibility

Categories

Upload Details

Uploaded via as Microsoft PowerPoint

Usage Rights

© All Rights Reserved

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Processing…
Post Comment
Edit your comment

    BigData Clusters Redefined BigData Clusters Redefined Presentation Transcript

    • © 2010 Cisco and/or its affiliates. All rights reserved. Cisco Confidential 1 Samuel Kommu Technical Marketing Engineer sakommu@cisco.com Hadoop Summit 2014 V3
    • © 2010 Cisco and/or its affiliates. All rights reserved. Cisco Confidential 2 • Hadoop Overview • Scheduling and Prioritizing Compute Network • Visibility Plugins • Hadoop Optimization and Tuning • Recommendations • Q & A
    • © 2010 Cisco and/or its affiliates. All rights reserved. Cisco Confidential 3 Big Data @ Cisco - www.cisco.com/go/bigdata Multi-year network and compute analysis testing (In conjunction with partners) Hadoop World 2011, 2012 & 2013 Hadoop Summit 2012 & 2013 Certifications and Solutions with UCS C-Series and Nexus Series Switches Cloudera Hadoop Certified Technology Oracle NoSQL Validated Solution Visibility & Monitoring
    • Cisco Confidential© 2010 Cisco and/or its affiliates. All rights reserved. 4
    • © 2010 Cisco and/or its affiliates. All rights reserved. Cisco Confidential 5 Could take several days just to read Per a test it took 11 days 100 Terabytes of data
    • © 2010 Cisco and/or its affiliates. All rights reserved. Cisco Confidential 6 NAS/SAN Since you are still limited by the Throughput of the storage systems 100 Terabytes of data
    • © 2010 Cisco and/or its affiliates. All rights reserved. Cisco Confidential 7 100 Terabytes of direct attached storage Hadoop Distributed File System Same job took 15 minutes once they went parallel spreading the load across 1000 nodes
    • © 2010 Cisco and/or its affiliates. All rights reserved. Cisco Confidential 9 Map Map Map Map Map Map Map Map Map Map • The Data Ingest & Replication External Connectivity East West Traffic (Replication of data blocks) • Map Phase – Raw data Analyzed and converted to name/value pair. Workload translate to multiple batches of Map task Reducer can start the reduce phase ONLY after the entire Map set is complete • Mostly a IO/compute function Hadoop Distributed File System Unstructured Data Map Map Map Map Map Key 1 Key 1 Key 1 Key 1 Key 1 Key 1 Key 1 Key 2 Key 1 Key 1 Key 1 Key 3 Key 1 Key 1 Key 1 Key 4 Reduce Shuffle Phase ReduceReduce Result/Output Reduce Map Map Map Map Map
    • © 2010 Cisco and/or its affiliates. All rights reserved. Cisco Confidential 10 Key 1 Key 1 Key 1 Key 4 Key 1 Key 1 Key 1 Key 3 Key 1 Key 1 Key 1 Key 2 Map Map Map Map Map Map Map Map Map Map • Shuffle Phase - All name/value pair are sorted and grouped by their keys. • Reducer is PULLING the data from the Mapper Nodes • High Network Activity • Reduce Phase – All values associates with a key are process for results, three phases Copy - get intermediate result from each data node local disk Merge - to reduce the number of files Reduce method • Output Replication Phase - Reducer replicating result to multiple nodes Highest Network Activity • Network Activities Dependent on Workload Behavior Hadoop Distributed File System Unstructured Data Map Map Map Map Map Key 1 Key 1 Key 1 Key 1 Reduce Shuffle Phase ReduceReduce Result/Output Reduce Map Map Map Map Map
    • © 2010 Cisco and/or its affiliates. All rights reserved. Cisco Confidential 11 Analyze Search - Count Extract Transform Load (ETL - TeraSort) Explode Normalizing Data Reduce Ingress vs. Egress Data Set 1:0.3 Ingress vs. Egress Data Set 1:1 Ingress vs. Egress Data Set 1:2 Reduce Reduce
    • © 2010 Cisco and/or its affiliates. All rights reserved. Cisco Confidential 12 Analyze Simulated with Shakespeare Wordcount Extract Transform Load (ETL) Simulated with Yahoo TeraSort Extract Transform Load (ETL) Simulated with Yahoo TeraSort with output replication Job Patterns have varying impact on network utilization
    • © 2010 Cisco and/or its affiliates. All rights reserved. Cisco Confidential 13 Small Flows/Messaging (Admin Related, Heart-beats, Keep-alive, delay sensitive application messaging) Small – Medium Incast (Hadoop Shuffle) Large Flows (HDFS Ingest) Large Incast (Hadoop Replication)
    • Cisco Confidential© 2010 Cisco and/or its affiliates. All rights reserved. 14
    • © 2013 Cisco and/or its affiliates. All rights reserved. Cisco Confidential 15 Scheduling Core 1 Core 2 Core 3 Job 1 Job 2 Core 4 Job 3 Rescheduling Core 1 Core 2 Core 3 Job 1 Job 2 Core 4 Job 3 Prioritize/De-Prioritize Core 1 Core 2 Core 3 Job 1 Job 2 Core 4 Job 3
    • © 2013 Cisco and/or its affiliates. All rights reserved. Cisco Confidential 16 Client 1 Client 2 Resource Manager Node Manager Container App Master Node Manager Container App Master Node Manager ContainerContainer Job Submission Resource Request Node Status MapReduce Status
    • © 2013 Cisco and/or its affiliates. All rights reserved. Cisco Confidential 17 Switch Buffer Usage With Network QoS Policy to prioritize Hbase Update/Read Operations 0 5000 10000 15000 20000 25000 30000 35000 40000 Latency(us) Time UPDATE - Average Latency (us) READ - Average Latency (us) QoS - UPDATE - Average Latency (us) QoS - READ - Average Latency (us) 1 70 139 208 277 346 415 484 553 622 691 760 829 898 967 1036 1105 1174 1243 1312 1381 1450 1519 1588 1657 1726 1795 1864 1933 2002 2071 2140 2209 2278 2347 2416 2485 2554 2623 2692 2761 2830 2899 2968 3037 3106 3175 3244 3313 3382 3451 3520 3589 3658 3727 3796 3865 3934 4003 4072 4141 4210 4279 4348 4417 4486 4555 4624 4693 4762 4831 4900 4969 5038 5107 5176 5245 5314 5383 5452 5521 5590 5659 5728 5797 5866 5935 BufferUsed Timeline Hadoop TeraSort Hbase Read/Update Latency Comparison of Non- QoS vs. QoS Policy ~60% for Read Improvement
    • Cisco Confidential© 2010 Cisco and/or its affiliates. All rights reserved. 18
    • © 2010 Cisco and/or its affiliates. All rights reserved. Cisco Confidential 19 Traditional Networking – Switching/Routing Decisions are per flow What if we could divide a flow into multiple parts – that could take independent network paths
    • © 2010 Cisco and/or its affiliates. All rights reserved. Cisco Confidential 20 Normal Congested Leaf 1 Leaf 2 Spine 2Spine 1 In traditional networks available today: Leaf 1 is not aware of the congestion between Spine 2 and Leaf 2 Congestion awareness across the network is essential for optimal traffic distribution
    • © 2010 Cisco and/or its affiliates. All rights reserved. Cisco Confidential 21 Small FlowsLarge Flows Impact of large flows on small flows Without Dynamic Packet Prioritization Large Flows tend to use up resources: • Bandwidth • Buffer Unless smaller flows are prioritized – large flows could potentially have an adverse impact on the smaller flows
    • © 2010 Cisco and/or its affiliates. All rights reserved. Cisco Confidential 22 Decisions are per leaf Does not consider end-to-end connectivity/congestion Load balancing decisions are made based on end-to-end connectivity/congestion 50% Usage Congested Link 66% Usage 33% Usage ECMP DLB Leaf 1 Leaf 2 Spine 2Spine 1 Leaf 1 Leaf 2 Spine 2Spine 1
    • © 2010 Cisco and/or its affiliates. All rights reserved. Cisco Confidential 23 Buffer usage is mostly on Leafs Not on Spines Map Progress Reduce Progress
    • Cisco Confidential© 2010 Cisco and/or its affiliates. All rights reserved. 25
    • © 2010 Cisco and/or its affiliates. All rights reserved. Cisco Confidential 26 OS : Red Hat 6.2 Hadoop : Cloudera 3 U5 Memcached Symmetric - No link failures Asymmetric – One failed link
    • © 2010 Cisco and/or its affiliates. All rights reserved. Cisco Confidential 27 Tests with Memcached Dynamic Packet Prioritization helps improve memcached performance Note: These tests were disk bound – Will be testing with servers containing higher number of drives
    • © 2010 Cisco and/or its affiliates. All rights reserved. Cisco Confidential 28 ECMP DLB
    • Cisco Confidential© 2010 Cisco and/or its affiliates. All rights reserved. 29
    • © 2013 Cisco and/or its affiliates. All rights reserved. Cisco Confidential 30 • Hardware 1 x Nexus 9508 (Spine) 2 x Nexus 9396 (Leaf) 8 x UCS C240 M3 2 x Xeon(R) CPU E5-2690 0 @ 2.90GHz 100G RAM 4 x 1TB Drives UCS VIC 1225 • Software Red Hat 6.2 Cloudera 5 NXOS 6.1(2)I2(1)
    • © 2013 Cisco and/or its affiliates. All rights reserved. Cisco Confidential 31 • Open Framework Modules will be on Github • Based on Apache Hadoop Rest APIs NXOS Python APIs
    • © 2013 Cisco and/or its affiliates. All rights reserved. Cisco Confidential 32 leaf-001# python bootflash:hadoopModule/vpmHadoop.py localNodes --------------------------------------------------------------- Node Name Port Speed InRate OutRate Buffer --------------------------------------------------------------- c240-m3-017 Eth1/1 10 1385580232 1128094464 0 c240-m3-018 Eth1/2 10 1882651664 1273181224 50 c240-m3-020 Eth1/3 10 1236743432 1238902664 0 c240-m3-021 Eth1/4 10 675482600 1292790704 0 --------------------------------------------------------------- In/Out Rate is the avg of last 30 seconds in Gbits/sec NXAPI© • Open Framework • Based on Apache Hadoop Rest APIs NXOS Python APIs Scripts on GitHub: https://github.com/datacenter/VisibilityPlugins
    • © 2013 Cisco and/or its affiliates. All rights reserved. Cisco Confidential 33 leaf-001# python bootflash:hadoopModule/vpmHadoop.py allNodesJobs Node Name Port Neighbor Speed -------------------------------------- c240-m3-001 Eth2/2 spine-001 40 c240-m3-002 Eth2/2 spine-001 40 c240-m3-004 Eth2/2 spine-001 40 c240-m3-005 Eth2/2 spine-001 40 c240-m3-017 Eth1/1 local 10 c240-m3-018 Eth1/2 local 10 c240-m3-020 Eth1/3 local 10 c240-m3-021 Eth1/4 local 10 Job ID Job Name Host Name Application Progress % --------------------------------------------------------------------- 0887 TeraGen c240-m3-001 MAPREDUCE 37.7623 --------------------------------------------------------------------- Scripts on GitHub: https://github.com/datacenter/VisibilityPlugins NXAPI© • Another example With Job info
    • © 2013 Cisco and/or its affiliates. All rights reserved. Cisco Confidential 34 Instantaneous Results Historic Data using GraphiteTopology
    • Cisco Confidential© 2010 Cisco and/or its affiliates. All rights reserved. 35
    • © 2010 Cisco and/or its affiliates. All rights reserved. Cisco Confidential 36 0 500 1000 1500 2000 2500 SSD Non-SSD TimeTakeninSeconds Time taken for 1TB Sort SSD vs. Non-SSD Drives
    • © 2010 Cisco and/or its affiliates. All rights reserved. Cisco Confidential 37 0 500 1000 1500 2000 2500 3000 Never Always TimeTakeninSeconds Time taken for 1TB Sort THP Always vs THP Never To disable: echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled
    • Cisco Confidential© 2010 Cisco and/or its affiliates. All rights reserved. 38
    • 39  Network Attributes  Architecture  Availability  Capacity, Scale & Oversubscription  Flexibility  Management & Visibility Integration Considerations
    • © 2010 Cisco and/or its affiliates. All rights reserved. Cisco Confidential 40 Availability • Single NIC failure doubles the job completion time. • Dual NIC has no impact on job completion time • Effective load-sharing of traffic flow on two NICs. NIC bonding configured at Linux – with LACP mode of bonding • Recommended to change the hashing to src-dst-ip-port (both network and NIC bonding in Linux) for optimal load- sharing 40 1161 min 286 min
    • © 2010 Cisco and/or its affiliates. All rights reserved. Cisco Confidential 41 Single 1GE 100% Utilized Dual 1GE 75% Utilized 10GE 40% Utilized
    • © 2010 Cisco and/or its affiliates. All rights reserved. Cisco Confidential 42 Big Data @ Cisco - www.cisco.com/go/bigdata Q & A For further questions and a chance to win Beats headset meet us in Booth P12