What it takes to run Hadoop at Scale: Yahoo! Perspectives

Published on: Hadoop Summit 2015
Published in: Technology
1. What it Takes to Run Hadoop at Scale: Yahoo Perspectives
   Presented by Sumeet Singh and Rajiv Chittajallu, June 11, 2015, Hadoop Summit 2015, San Jose
2. Introduction

   Rajiv Chittajallu, Sr. Principal Engineer, Hadoop Operations (@rajivec)
   701 First Avenue, Sunnyvale, CA 94089 USA
   • Senior Engineer with the Hadoop Operations team at Yahoo
   • Involved with Hadoop since 2006, from the early 400-node setup to today's 42,000+ node production environment
   • Started with the Center for Development of Advanced Computing in 2002 before joining Yahoo
   • BS in Computer Science from Osmania University, India

   Sumeet Singh, Sr. Director, Product Management, Cloud Storage and Big Data Platforms (@sumeetksingh)
   701 First Avenue, Sunnyvale, CA 94089 USA
   • Manages the Cloud Storage and Big Data products team at Yahoo
   • Responsible for product management, strategy, and customer engagements
   • Managed Cloud Engineering product teams and headed strategy functions for the Cloud Platform Group at Yahoo
   • MBA from UCLA and MS from RPI
3. Hadoop, a Secure Shared Hosted Multi-tenant Platform

   Data flows from devices (TV, PC, phone, tablet) and sources (pushed data, pulled data, web crawl, social, email, 3rd-party content) through the Data Highway into the Hadoop grid, and out to BI, reporting, and ad hoc analytics, with data, content, and ads delivered through No-SQL serving stores.
4. Platform Evolution (2006 – 2015)

   Chart: raw HDFS storage (0 to 700 PB) and server count (0 to 50,000) per year, 2006 to 2015. Milestones along the way:
   • Research workloads in Search and Advertising; Yahoo! commits to scaling Hadoop for production use
   • Production (modeling) with machine learning and WebMap
   • Revenue systems with security, multi-tenancy, and SLAs
   • Open sourced with Apache; Hortonworks spinoff for enterprise hardening
   • Next-gen Hadoop (Hadoop 0.23, YARN); new services (HBase, Storm, Spark, Hive)
   • Increased user base with partitioned namespaces
   • Apache Hadoop 2.6 (scalable ML, latency, utilization, productivity)

   Scale today: Hadoop, 43,000 servers and 300 use cases; HBase, 3,000 servers and 70 use cases; Storm, 2,000 servers and 50 use cases.
5. Top 10 Considerations for Scaling a Hadoop-based Platform

   1. On-Premise or Public Cloud
   2. Total Cost Of Ownership (TCO)
   3. Hardware Configuration
   4. Network
   5. Software Stack
   6. Security and Account Management
   7. Data Lifecycle Management and BCP
   8. Metering, Audit, and Governance
   9. Integration with External Systems
   10. Debunking Myths
6. On-Premise or Public Cloud – Deployment Models

   On-Premise:
   • Private (dedicated) clusters: large, demanding use cases; new technology not yet platformized; data movement and regulation issues
   • Hosted multi-tenant (private cloud) clusters: source of truth for all of the org's data; app delivery agility; operational efficiency and cost savings through economies of scale
   Public Cloud:
   • Hosted compute clusters: when more cost-effective than on-premise; time to market and results matter; data already in the public cloud
   • Purpose-built big data clusters: for performance and tighter integration with the tech stack; value-added services such as monitoring, alerts, tuning, and common tools
7. On-Premise or Public Cloud – Selection Criteria

   • Cost: on-premise is fixed and does not vary with utilization, favoring scale and 24x7 centralized ops; public cloud is variable with usage, favoring run-and-done, decentralized ops
   • Data: on-premise data is aggregated from disparate or distributed sources; public cloud data is typically generated and stored in the cloud
   • SLA: on-premise offers job queues, capacity scheduling, BCP, and catch-up with controlled latency and throughput; public cloud offers no guarantees (beyond uptime) without provisioning additional resources
   • Tech stack: on-premise gives control over deployed technology but requires platform team or vendor support; public cloud gives little to no control over the stack but needs no platform R&D headcount
   • Security: on-premise is a shared environment with control over data and its movement, PII, ACLs, and pluggable security; in the public cloud, data is typically not shared among users
   • Multi-tenancy: matters on-premise and is complex to develop and operate; does not matter in the public cloud, where clusters are dynamic, virtual, and dedicated
8. On-Premise or Public Cloud – Evaluation

   Score on-premise against public cloud across the six criteria: cost, data, SLA, tech stack, security, and multi-tenancy.
9. On-Premise or Public Cloud – Utilization Matters

   Chart: cost ($) against utilization/consumption (compute and storage) for on-premise Hadoop as a Service, an on-demand public cloud service, and a terms-based public cloud service. On-premise carries a high starting cost but stays roughly flat; cloud cost scales up with usage. Low utilization favors the public cloud service; past the crossover point, high utilization favors on-premise Hadoop as a Service. Current and expected (or target) utilization can provide further insight into your operations and cost competitiveness.
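The crossover argument above can be sketched numerically. All dollar figures and rates below are hypothetical placeholders, not Yahoo's actual costs; only the shape of the two curves matters.

```python
# Hypothetical cost curves for the utilization crossover on this slide.
FIXED_MONTHLY = 100_000.0   # assumed fixed monthly on-premise spend
RATE_PER_GB_HOUR = 0.005    # assumed on-demand cloud rate, $ per GB-Hour

def on_premise_cost(gb_hours: float) -> float:
    """On-premise Hadoop as a Service: high fixed cost, flat with usage."""
    return FIXED_MONTHLY

def public_cloud_cost(gb_hours: float) -> float:
    """On-demand public cloud: no fixed cost, grows linearly with usage."""
    return RATE_PER_GB_HOUR * gb_hours

def crossover_gb_hours() -> float:
    """Utilization at which the two cost curves meet."""
    return FIXED_MONTHLY / RATE_PER_GB_HOUR

# Below the crossover, cloud is cheaper; above it, on-premise wins.
for usage in (5_000_000, 20_000_000, 40_000_000):
    cheaper = "public cloud" if public_cloud_cost(usage) < on_premise_cost(usage) else "on-premise"
    print(f"{usage:>12,} GB-Hours per month: {cheaper}")
```

With these placeholder numbers the crossover sits at 20M GB-Hours per month; the exercise is re-running it with your own fixed costs and cloud rates.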
10. Total Cost Of Ownership (TCO) – Components

    Illustrative monthly TCO of $2.1M, split 60% / 12% / 10% / 7% / 6% / 3% / 2% across seven components:
    1. Cluster hardware: data nodes, name nodes, job trackers, gateways, load proxies, monitoring, aggregator, and web servers
    2. R&D headcount: platform software development, quality, and release engineering
    3. Active use and operations (recurring): datacenter ops cost (power, space, labor support, and facility maintenance)
    4. Network hardware: aggregated network component costs, including switches, wiring, terminal servers, power strips, etc.
    5. Acquisition/install (one-time): labor, POs, transportation, space, support, upgrades, decommissions, shipping/receiving, etc.
    6. Operations engineering: headcount for service engineering and data operations teams responsible for day-to-day ops and support
    7. Network bandwidth: data transferred into and out of clusters for all colos, including cross-colo transfers
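As a worked example, the component split can be turned into dollar figures. The $2.1M total and the seven percentages are from the slide (marked illustrative there); the pairing of each percentage with a specific component is my assumption, since the pie-chart labels do not survive in the transcript.

```python
# Slide's illustrative $2.1M monthly TCO spread over seven components.
MONTHLY_TCO = 2_100_000

shares = {  # hypothetical pairing of components to the slide's percentages
    "Cluster Hardware": 0.60,
    "R&D Headcount": 0.12,
    "Active Use and Operations": 0.10,
    "Network Hardware": 0.07,
    "Acquisition/Install": 0.06,
    "Operations Engineering": 0.03,
    "Network Bandwidth": 0.02,
}

# The shares must cover the whole pie.
assert abs(sum(shares.values()) - 1.0) < 1e-9

for component, share in shares.items():
    print(f"{component:<28} ${MONTHLY_TCO * share:>12,.0f}")
```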
11. Total Cost Of Ownership (TCO) – Unit Costs (Hadoop)

    • Memory, $ / GB-Hour (Hadoop 2.0+): container memory where apps perform computation and access HDFS if needed. Capacity unit: GBs of memory available for an hour. Unit cost = monthly memory cost / available memory capacity.
    • CPU, $ / vCore-Hour (Hadoop 2.6+): container CPU cores used by apps for computation and data processing. Capacity unit: vCores of CPU available for an hour. Unit cost = monthly CPU cost / available CPU vCores.
    • Storage, $ / GB stored: HDFS (usable) space needed by an app with the default replication factor of three; files and directories used by the apps to understand and limit the load on the NameNode. Capacity unit: usable storage space (less replication and overheads). Unit cost = monthly storage cost / available usable storage.
    • Bandwidth, $ / GB for inter-region data transfers: network bandwidth needed to move data into and out of the clusters by the app. Capacity unit: inter-region (peak) link capacity. Unit cost = monthly bandwidth cost / monthly GB in + out.
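The four unit-cost formulas can be written out directly. Interpreting "available capacity" for the hourly units as capacity times hours in a month is my assumption, and the sample figures in the final print are made up.

```python
HOURS_PER_MONTH = 730  # average hours in a month (assumption for the *-Hour units)

def dollars_per_gb_hour(monthly_memory_cost: float, avail_memory_gb: float) -> float:
    # $/GB-Hour (Hadoop 2.0+): monthly memory cost over memory GB-Hours available
    return monthly_memory_cost / (avail_memory_gb * HOURS_PER_MONTH)

def dollars_per_vcore_hour(monthly_cpu_cost: float, avail_vcores: float) -> float:
    # $/vCore-Hour (Hadoop 2.6+): monthly CPU cost over vCore-Hours available
    return monthly_cpu_cost / (avail_vcores * HOURS_PER_MONTH)

def dollars_per_gb_stored(monthly_storage_cost: float, usable_storage_gb: float) -> float:
    # $/GB stored, against usable space (after 3x replication and overheads)
    return monthly_storage_cost / usable_storage_gb

def dollars_per_gb_transferred(monthly_bw_cost: float, monthly_gb_in_plus_out: float) -> float:
    # $/GB for inter-region transfers: monthly bandwidth cost over GB in + out
    return monthly_bw_cost / monthly_gb_in_plus_out

# Hypothetical example: a $73,000/month memory bill over 1 PB of container memory
print(f"${dollars_per_gb_hour(73_000, 1_000_000):.6f} per GB-Hour")
```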
12. Total Cost Of Ownership (TCO) – Consumption Costs

    Compute (memory):
      Map GB-Hours = GB(M1) x T(M1) + GB(M2) x T(M2) + …
      Reduce GB-Hours = GB(R1) x T(R1) + GB(R2) x T(R2) + …
      Cost = (M + R) GB-Hours x $0.002 / GB-Hour / month = $ for the job per month
    Compute (CPU):
      Map vCore-Hours = vCores(M1) x T(M1) + vCores(M2) x T(M2) + …
      Reduce vCore-Hours = vCores(R1) x T(R1) + vCores(R2) x T(R2) + …
      Cost = (M + R) vCore-Hours x $0.002 / vCore-Hour / month = $ for the job per month
    Monthly roll-ups: (M + R) GB-Hours or vCore-Hours for all jobs can be summed up for the month for a user, app, BU, or the entire platform, through the relationship among user, file ownership, app, and BU.
    Storage: project (app) quota in GB (peak monthly used) and user quota in GB (peak monthly used), with each user accountable for their portion of use, e.g. GB Read(U1) / (GB Read(U1) + GB Read(U2) + …).
    Bandwidth: measured at the cluster level and divided among select apps and users of data based on average volume in/out, with roll-ups through the relationship among user, app, and BU.
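A minimal sketch of the GB-Hour roll-up for a single job, using the slide's $0.002 per GB-Hour rate; the task sizes and runtimes are invented for illustration.

```python
RATE_PER_GB_HOUR = 0.002  # $ per GB-Hour per month, as on the slide

def gb_hours(tasks):
    """Sum GB(task) x T(task) over all tasks; T is runtime in hours."""
    return sum(gb * hours for gb, hours in tasks)

# Hypothetical job: three map tasks and two reduce tasks,
# each as (container memory in GB, runtime in hours).
maps = [(2.0, 0.5), (2.0, 0.75), (4.0, 0.25)]
reduces = [(4.0, 1.0), (4.0, 0.5)]

map_gb_hours = gb_hours(maps)        # 2*0.5 + 2*0.75 + 4*0.25 = 3.5
reduce_gb_hours = gb_hours(reduces)  # 4*1.0 + 4*0.5 = 6.0
job_cost = (map_gb_hours + reduce_gb_hours) * RATE_PER_GB_HOUR
print(f"Job cost for the month: ${job_cost:.4f}")  # (3.5 + 6.0) * 0.002 = $0.0190
```

Summing `job_cost` over all jobs owned by a user, app, or BU gives the monthly roll-ups the slide describes.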
13. Hardware Configuration – Physical Resources

    Diagram: clusters in datacenters (Datacenter 1, Rack 1 through Rack N) down to per-server resources (C-nn / 64, 128, 256 G / 4000, 6000 etc.).
14. Hardware Configuration – Eventual Heterogeneity

    Data node configurations accumulated over the years range from 24 G / 8 cores / 0.5 TB SATA and 48 G / 12 cores / 1.0 TB SATA through 64 G Harpertown, 128 G Sandy Bridge, 192 G Ivy Bridge, 256 G Haswell, and 384 G machines with 2.0 to 6.0 TB SATA.
    • Heterogeneous configurations: tens of data node configs without dictating scheduling decisions; let the framework balance out the configs
    • Heterogeneous storage: HDFS supports heterogeneous storage (HDD, SSD, RAM, RAID, etc.); see HDFS-2832, HDFS-5682
    • Heterogeneous scheduling: operate multiple-purpose hardware in the same cluster (e.g. GPUs); see YARN-796
15. Network – Common Backplane

    Everything shares a common network backplane: Hadoop (NameNode/RM, DataNode/NodeManager), HBase (HBase Master, DataNodes/RegionServers), Storm (Nimbus, Supervisor), administration, management and monitoring, ZooKeeper pools, HTTP/HDFS/GDM load proxies, and applications and data (data feeds, data stores, Oozie Server, HS2/HCat).
16. Network – Bottleneck Awareness

    Cross-cluster traffic among Hadoop clusters (Data Set 1, Data Set 2), an HBase cluster (low-latency data store), and a Storm cluster (real-time/stream processing) creates bottlenecks:
    1. Large dataset joins or data sharing over the network
    2. Large extractions may saturate the network
    3. Fast bulk updates may saturate the network
    4. Large data copies may not be possible
17. Network – 1/10G BAS (Rack Locality Not A Major Issue)

    Topology: rack switches (RSW) with 1 Gbps host links and 2:1 oversubscription uplink into BAS pairs (BAS1-1/BAS1-2 through BAS8-1/BAS8-2) over 10 Gbps L3 backplanes, with 8 x 10 Gbps into the fabric layer (FAB 1 through FAB 8). Scales to 48 racks and 15,360 hosts. The diagram flags a SPOF.
18. Network – 10G CLOS (Server Placement Not an Issue)

    Topology: rack switches (RSW) with 10 Gbps links and 5:1 oversubscription into CLOS virtual chassis (16 spines and 32 leafs each: Spine 0 to 15, Leaf 0 to 31) connected by 2 x 40 Gbps links. Scales to 512 racks and 20,480 hosts. The diagram flags a SPOF.
19. Network – Gen Next

    Source: http://www.opencompute.org
20. Software Stack – Where We Are Today

    • Compute services: Hive (0.13), Pig (0.11, 0.14), Oozie (4.4), HCatalog (0.13), Storm (0.9.2), Spark (1.3), Tez (0.6), YARN (2.6), MapReduce (2.6)
    • Storage: HDFS (2.6), HBase (0.98)
    • Infrastructure services: HDFS Proxy (3.2), GDM (6.4), ZooKeeper, Grid UI (SS/Doppler, Discovery, Hue 3.7), Monitoring, Starling, Messaging Service
21. Software Stack – Obsess With Use Cases, Not Tech

    The stack layers HDFS (file system), YARN (scheduling, resource management), and common components on RHEL6 64-bit with JDK8, distinguishing platformized tech with production support from work that is in progress, unmet needs, or Apache alignment.
22. Security and Account Management – Overview

    Grid identity, authentication, and authorization build on user IDs, SSO, and groups, netgroups, and roles, with RPC secured via GSSAPI and web UIs via SPNEGO.
23. Security and Account Management – Flexibly Secure

    Two Kerberos realms: Realm 1 for projects and services (PROD) and Realm 2 for users (CORP), with IdP/SP-based SSO for clients. Authenticated users and netgroups enter through Hadoop RPC, and the grid uses delegation tokens, block tokens, and job tokens internally.
24. Data Lifecycle Management and BCP

    Data lifecycle: Acquisition (source) → Replication (feeds) → Retention (policy-based expiration) → Archival (tape backup) → Data out.
    • Datastore: defines a data source/target (e.g. HDFS)
    • Dataset: defines the data flow of a feed
    • Workflow: defines a unit of work carried out by acquisition, replication, and retention servers for moving an instance of a feed
25. Data Lifecycle Management and BCP

    Grid Data Management (GDM) coordinates feed acquisition, replication, retention, and archival/data-out across clusters (Cluster 1 in Colo 1, Cluster 2 in Colo 2), each with HDFS and a MetaStore:
    • Feed datasets are registered as partitioned external tables; Growl extracts the schema for backfill
    • On acquisition and replication, partitions are added with HCatClient.addPartitions(…) and marked LOAD_DONE, firing add_partition event notifications
    • After retention expiration, partitions are dropped with HCatClient.dropPartitions(…), firing a drop_partition notification
26. Metering, Audit, and Governance

    Starling is the log warehouse, fed by these log sources:
    • Hadoop clusters (Cluster 1 through Cluster n): FS, job, and task logs
    • HBase clusters: column family, region, action, and query stats
    • MetaStores (MS 1 through MS n): DB, table, partition, and column access stats
    • GDM (F 1 through F n): data definition, flow, feed, and source
27. Metering, Audit, and Governance

    Governance classification for data discovery and access:
    • Public: no additional requirements
    • Non-sensitive: LMS integration
    • Financial ($): Stock Admin integration
    • Restricted: approval flow
28. Integration with External Systems

    Hadoop integrates with BI, reporting, and transactional DBs; the Data Highway (DH); cloud messaging; serving systems; and monitoring, tools, and portals for Hadoop customers, with some infrastructure in transition.
29. Debunking Myths

    All of these are myths (✗):
    • Hadoop isn't enterprise ready
    • Hadoop isn't stable, clusters go down
    • You lose data on HDFS
    • Data cannot be shared across the org
    • NameNodes do not scale
    • Software upgrades are rare
    • Hadoop use cases are limited
    • I need expensive servers to get more
    • Hadoop is so dead
    • I need Apache this vs. that
30. Thank You

    @sumeetksingh | @rajivec | Yahoo Kiosk #D5 | We are Hiring!