# Costing Your Big Data Operations

• (3 min)
You can also benchmark your costs. There are two approaches. The practical one is to run your workloads on a public cloud and see what the bill is. That wasn’t possible for me, so I took a theoretical approach. It is not exact, but it works well for the purpose of understanding your costs. Take the used portion of your monthly capacity and calculate the equivalent quantity of compute and storage you would need in the cloud. For that equivalent quantity, calculate your cost from the cloud provider’s unit pricing. You can then compare your fixed cost with the pay-for-use pricing of a cloud.
### Costing Your Big Data Operations

1. Costing Your Big Data Operations. Presented by Sumeet Singh and Amrit Lal, June 5, 2014. 2014 Hadoop Summit, San Jose, California
2. Introduction
    - Sumeet Singh (@sumeetksingh), Senior Director, Product Management, Hadoop and Big Data Platforms, Cloud Engineering Group, 701 First Avenue, Sunnyvale, CA 94089 USA
        - Manages Hadoop products team at Yahoo!
        - Responsible for Product Management, Strategy and Customer Engagements
        - Managed Cloud Services products team and headed Strategy functions for the Cloud Platform Group at Yahoo
        - MBA from UCLA and MS from Rensselaer Polytechnic Institute (RPI)
    - Amrit Lal (@amritasshwar), Product Manager, Hadoop and Big Data Platforms, Cloud Engineering Group, 701 First Avenue, Sunnyvale, CA 94089 USA
        - Product Manager at Yahoo engaged in building high class and robust Hadoop infrastructure services
        - Eight years of experience across HSBC, Oracle and Google in developing products and platforms for high growth enterprises
        - MBA from Carnegie Mellon University
3. Agenda
    1. Total Cost of Ownership (TCO) Models
    2. Deeper Understanding of (Resource) Usage
    3. P&L, Metering and Billing Provisions
    4. Benchmark Costs
    5. Improve Utilization and ROI
4. Why do Costing?
    - Profitability: understanding the data services costs (an element of your total project cost) to determine how profitable the project is
    - ROI: investment decisions both at the platform and app / project level
    - Operational Efficiency: benchmark and improve ops by focusing on avg. utilization, increasing the # hosted apps, storage efficiencies, job performance etc.
    - Planning: capital planning and budgeting, product improvements
    - Cost Transparency: metering / usage metrics, billing, chargeback / showback, P&L
5. Costing is Relevant Irrespective of the Service Model
    - Private Cloud
        - Fixed costs that favor scale and 24x7 operations
        - Centralized operations
        - Multi-tenant clusters with security and data sharing
        - Cost a function of desired SLA
        - Utilization and # hosted apps a primary lever
        - Tenants often tend to ignore costs
    - Public Cloud
        - Variable with usage and favors a run and done model
        - Decentralized operations, ops / headcount costs still relevant
        - Dedicated virtual clusters
        - Monthly bills!
        - Releasing cluster instances, when not needed, a wise idea
        - Users often overlook the peripheral costs
6. Important with Multi-tenancy and Scale
    - Figure: number of servers (axis 0 to 45,000) and raw HDFS storage in PB (axis 0 to 500) by year, 2006 to 2014.
    - Milestones annotated on the chart:
        - Yahoo! Commits to Scaling Hadoop for Production Use
        - Research Workloads in Search and Advertising
        - Production (Modeling) with machine learning & WebMap
        - Revenue Systems with Security, Multi-tenancy, and SLAs
        - Open Sourced with Apache
        - Hortonworks Spinoff for Enterprise hardening
        - Nextgen Hadoop (H 0.23 YARN)
        - New Services (HBase, Storm, Hive etc.)
        - Increased User-base with partitioned namespaces
        - Apache H 2.x (Low latency, Util, HA etc.)
7. Hosted Apps Growth on Apache Hadoop
    - Figure: number of new projects by quarter, Q1-11 to Q1-14 (axis 260 to 560). Data labels on the chart: 272, 330, 382, 495, 525.
    - New Customer Apps On-boarded: 58 projects in 2011, 52 projects in 2012, 113 projects in 2013.
8. Multi-tenant Apache HBase Growth
    - Figure: number of Region Servers (axis 0 to 1200) and data stored in PB (axis 0 to 40) by quarter, Q1-13 to Q1-14. Data labels on the chart: 1140 and 33.6 PB.
    - Zero to “20” Use Cases (60,000 Regions) in a Year
9. Multi-tenant Apache Storm Growth
    - Figure: number of Supervisors (axis 0 to 800) and number of Topologies (axis 0 to 200) by quarter, Q1-13 to Q1-14. Data labels on the chart: 760 and 175. The Multi-tenancy Release is marked on the timeline.
    - Zero to “175” Production Topologies in a Year
10. Capital Deployment for Big Data Infrastructure
    - Figure: deployment diagram. Labeled components: DataNode / NodeManager, NameNode, RM, DataNodes / RegionServers, NameNode, HBase Master, Nimbus, Supervisor, Administration, Management and Monitoring, ZooKeeper Pools, HTTP/HDFS/GDM Load Proxies, Applications and Data, Data Feeds, Data Stores, Oozie Server, HS2/HCat, Network Backplane.
11. Big Data Platforms Technology Stack at Yahoo
    - Figure: layered stack diagram. Layer labels: Compute, Services, Storage, Infrastructure Services.
    - Components shown: Hive, Pig, Oozie, HDFS Proxy, GDM, YARN, MapReduce, HDFS, HBase, Zookeeper, Support Shop, Monitoring, Starling, Messaging Service, HCatalog, Storm, Spark, Tez.
12. Resources Consumed in Big Data Operations
    - Figure: clusters in datacenters (Colo 1, Rack 1 through Rack N) broken down to server resources.
13. Elements of a TCO Model (illustrative)
    - Figure: pie chart of a monthly TCO of \$2.1 M. Segment shares as labeled on the chart: 60%, 12%, 7%, 6%, 3%, 2%, 10%.
    - TCO components:

    | # | Component | What it covers |
    |---|-----------|----------------|
    | 1 | Cluster Hardware | Data nodes, name nodes, job trackers, gateways, load proxies, monitoring, aggregator, and web servers |
    | 2 | R&D HC | Headcount for platform software development, quality, and release engineering |
    | 3 | Active Use and Operations (Recurring) | Recurring datacenter ops cost (power, space, labor support, and facility maintenance) |
    | 4 | Network Hardware | Aggregated network component costs, including switches, wiring, terminal servers, power strips etc. |
    | 5 | Acquisition/Install (One-time) | Labor, POs, transportation, space, support, upgrades, decommissions, shipping/receiving etc. |
    | 6 | Operations Engineering | Headcount for service engineering and data operations teams responsible for day-to-day ops and support |
    | 7 | Network Bandwidth | Data transferred into and out of clusters for all colos, including cross-colo transfers |
14. Understanding Apache Hadoop Resources
    - Figure: Hadoop resource diagram. Labels: NameNode, Resource Manager, DataNode (DFS Blocks), Node Manager (MR Containers), Task 1, Task 2, Task 3. Captions: “Storage and Compute” and “MapReduce and Memory”.
15. Unit Costs for Hadoop Operations

    | Resource | Unit | Total Capacity | Unit Cost |
    |----------|------|----------------|-----------|
    | Containers where apps can perform computation and access HDFS if needed | \$ / GB-Hour (H 0.23/2.0) | GBs of memory available for an hour | Monthly Compute Cost / Avail. Compute Capacity |
    | HDFS (usable) space needed by an app with default replication factor of three | \$ / GB Stored | Usable storage space (less replication and overheads) | Monthly Storage Cost / Avail. Usable Storage |
    | Network bandwidth needed to move data into/out of the clusters by the app | \$ / GB for inter-region data transfers | Inter-region (peak) link capacity | [Monthly GB In + Out] x \$ / GB |
    | Files and directories used by the apps to understand/limit the load on NN | N/A | N/A | N/A |
16. Working Through A Hadoop Example (illustrative)

    | Resource | Monthly Cost | Monthly Capacity | Unit Cost |
    |----------|--------------|------------------|-----------|
    | Compute | Monthly TCO (less bw.) = \$2 M; Compute @ 50% = \$1 M | 315 TB memory = 315 TB x 24 x 30 = 227 M GB-Hours | \$1 M / 227 M GB-Hours = \$0.004 / GB-Hour / Month |
    | Storage | Monthly TCO (less bw.) = \$2 M; Storage @ 50% = \$1 M | Raw HDFS = 200 PB; Usable HDFS = [200 x 0.8 (20% overhead)] / 3 = 53.3 PB | \$1 M / 53.3 PB = \$0.019 / GB / Month |
    | Bandwidth | Monthly Charges = \$0.1 M | Total Data In + Out = 5 PB | \$0.1 M / 5 PB = \$0.02 / GB transferred |
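The arithmetic on this slide can be reproduced in a few lines. The sketch below is only an illustration: the function names are hypothetical, and it assumes decimal units (1 TB = 1,000 GB, 1 PB = 1,000,000 GB) and a 30-day month, which is what the slide's own figures imply.

```python
# Sketch of the slide's unit-cost arithmetic. Decimal units and a 30-day
# month are assumptions of this sketch, chosen to match the slide's numbers.
HOURS_PER_MONTH = 24 * 30
GB_PER_TB = 1_000
GB_PER_PB = 1_000_000


def compute_rate(monthly_cost_usd: float, memory_tb: float) -> float:
    """Dollars per GB-hour of container memory."""
    gb_hours = memory_tb * GB_PER_TB * HOURS_PER_MONTH
    return monthly_cost_usd / gb_hours


def storage_rate(monthly_cost_usd: float, raw_pb: float,
                 overhead: float = 0.20, replication: int = 3) -> float:
    """Dollars per usable GB per month, after overhead and replication."""
    usable_gb = raw_pb * (1 - overhead) / replication * GB_PER_PB
    return monthly_cost_usd / usable_gb


def bandwidth_rate(monthly_charges_usd: float, transferred_pb: float) -> float:
    """Dollars per GB moved into or out of the clusters."""
    return monthly_charges_usd / (transferred_pb * GB_PER_PB)


# Inputs taken from the slide: $2 M TCO split 50/50, 315 TB memory,
# 200 PB raw HDFS, $0.1 M bandwidth charges for 5 PB transferred.
hadoop_compute = compute_rate(1_000_000, 315)
hadoop_storage = storage_rate(1_000_000, 200)
hadoop_bandwidth = bandwidth_rate(100_000, 5)

print(f"compute:   ${hadoop_compute:.5f} per GB-hour")
print(f"storage:   ${hadoop_storage:.5f} per GB per month")
print(f"bandwidth: ${hadoop_bandwidth:.5f} per GB transferred")
```

The unrounded results sit close to the slide's rounded rates of \$0.004, \$0.019 and \$0.02.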
17. Measuring Hadoop Resource Consumption
    - Compute
        - Monthly job and task cost:
            - Map GB-Hours = GB(M1) x T(M1) + GB(M2) x T(M2) + …
            - Reduce GB-Hours = GB(R1) x T(R1) + GB(R2) x T(R2) + …
            - Cost = (M + R) GB-Hours x \$0.004 / GB-Hour / Month = \$ for the Job / Month
        - Monthly roll-ups: (M + R) GB-Hours for all jobs can be summed up for the month for a user, app, BU, or the entire platform
    - Storage
        - /project: (app) directory quota in GB (peak monthly storage used)
        - /user: directory quota in GB (peak monthly storage used)
        - /data: accounted for with each user accountable for their portion of use, e.g. GB Read (U1) / [GB Read (U1) + GB Read (U2) + …]
        - Monthly roll-ups: through relationship among user, file ownership, app, and their BU
    - Network bandwidth
        - Bandwidth measured at the cluster level and divided among select apps and users of data based on average volume In/Out
        - Monthly roll-ups: through relationship among user, app, and their BU
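A minimal sketch of the job-level metering formula follows. The `Task` record and the sample task sizes are hypothetical; only the GB x hours summation and the \$0.004 per GB-hour rate come from the slides.

```python
from dataclasses import dataclass

RATE_PER_GB_HOUR = 0.004  # illustrative rate from the slides


@dataclass
class Task:
    """One map or reduce task: container memory and wall-clock runtime."""
    memory_gb: float
    runtime_hours: float


def gb_hours(tasks: list[Task]) -> float:
    """Sum of GB(task) x T(task) over all tasks."""
    return sum(t.memory_gb * t.runtime_hours for t in tasks)


def job_cost(map_tasks: list[Task], reduce_tasks: list[Task],
             rate: float = RATE_PER_GB_HOUR) -> float:
    """Cost = (Map GB-Hours + Reduce GB-Hours) x rate."""
    return (gb_hours(map_tasks) + gb_hours(reduce_tasks)) * rate


# Hypothetical job: two 1.5 GB mappers and one 4 GB reducer.
maps = [Task(1.5, 0.5), Task(1.5, 0.25)]
reduces = [Task(4.0, 1.0)]
cost = job_cost(maps, reduces)
print(f"job cost: ${cost:.4f}")
```

Monthly roll-ups for a user, app, or BU would then be sums of `job_cost` over the jobs attributed to that owner.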
18. Measuring Hadoop Resource Consumption
    - Figure: chart broken down by queue (labels: queue 1 through queue 11).
19. Measuring Hadoop Resource Consumption
    - Figure: SLA Dashboard on Hadoop Analytics Warehouse.
20. Putting it Together for Hadoop Services (illustrative)

    Hadoop Services Billing Rate Card [Monthly Rates]

    | Resource | Unit | Aggregated / Measured | Cost |
    |----------|------|-----------------------|------|
    | HDFS (Storage) | GB | Monthly, peak storage used | \$0.019/GB |
    | Compute | Map-Reduce GB-Hours | Number of GBs used by mappers and reducers and hours they ran for | \$0.004/GB-Hour |
    | Network Bandwidth | GB | Monthly, total in/out | \$0.02/GB |

    Monthly Bill for May 2014

    | BU | HDFS Used | HDFS Effective Used | HDFS Cost (\$ M) | Compute Used (GB-Hours) | Compute Cost (\$ M) | Network Transferred | Network Cost (\$ M) | Total Cost (\$ M) |
    |----|-----------|---------------------|------------------|-------------------------|---------------------|---------------------|---------------------|-------------------|
    | BU1 | 15 PB | 3.45 PB | \$0.065 | 12.5 M | \$0.05 | 1.25 PB | \$0.025 | \$0.15 M |
    | BU2 | 10 PB | 2.65 PB | \$0.05 | 6.25 M | \$0.025 | 0.5 PB | \$0.01 | \$0.085 M |
    | … | … | … | … | … | … | … | … | … |
    | BU N | … | … | … | … | … | … | … | … |
    | Total | 148 PB | 39.5 PB | \$0.75 M | 125 M | \$0.5 M | 5 PB | \$0.1 M | \$1.35 M |
21. Multi-Tenant Deployment For Apache HBase
    - Figure: projects X, Y and Z sharing Region Servers 1 through N, coordinated by HMaster and Zookeeper. Each Region Server JVM hosts regions from every project (X:Table:Region i, Y:Table:Region i, …, Z:Table:Region i) and performs the HDFS reads/writes.
22. Understanding Apache HBase Resources
    - Figure: region-level reads/writes for X:Table:Region 1, Y:Table:Region M, …, Z:Table:Region N pass through the RegionServer JVM (heap) to HFiles in HDFS storage (disk).
    - Reads
        - Total Reads @ RS = Reads (Table X: Reg 1 + Table X: Reg 2 + … + Table Z: Reg N)
        - Read Share (X) = Total Table X / Total Table (X, Y, Z)
    - Writes
        - Total Writes @ RS = Writes (Table X: Reg 1 + Table X: Reg 2 + … + Table Z: Reg N)
        - Write Share (X) = Total Table X / Total Table (X, Y, Z)
    - Data Stored
        - Total Table Data @ RS = Table X: Reg 1 + Table X: Reg 2 + … + Table Z: Reg N
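The share formulas can be sketched as a small aggregation over per-region counters. The function name, the `(table, region)` key layout and the sample counts below are all hypothetical; the slides do not say how the counters are actually collected.

```python
from collections import defaultdict


def table_shares(ops_by_region: dict[tuple[str, str], int]) -> dict[str, float]:
    """Share (X) = total ops on table X / total ops on all tables."""
    per_table: dict[str, int] = defaultdict(int)
    for (table, _region), ops in ops_by_region.items():
        per_table[table] += ops
    total = sum(per_table.values())
    if total == 0:
        # No traffic at all: every table gets a zero share.
        return {table: 0.0 for table in per_table}
    return {table: ops / total for table, ops in per_table.items()}


# Hypothetical read counters observed on the region servers.
reads = {
    ("X", "Reg 1"): 300,
    ("X", "Reg 2"): 100,
    ("Y", "Reg 1"): 400,
    ("Z", "Reg 1"): 200,
}
read_share = table_shares(reads)
print(read_share)
```

The same function applies unchanged to write counters, since the write share is defined the same way.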
23. Unit Costs for HBase Operations

    | Resource | Unit | Total Capacity | Unit Cost |
    |----------|------|----------------|-----------|
    | Write operations performed on Region Server while writing to individual table regions | \$ / 1000 Writes | Total write operations across Region Servers | Monthly Write TCO / Total Write Ops (K) |
    | Read operations performed on Region Server while reading from individual table regions | \$ / 1000 Reads | Total read operations across Region Servers | Monthly Read TCO / Total Read Ops (K) |
    | HDFS (usable) space needed by table region’s HFiles with default replication factor | \$ / GB Stored | Usable storage space (less replication and overheads) | Monthly Storage Cost / Avail. Usable Storage |
    | Network bandwidth needed to move data into/out of the clusters by clients | \$ / GB for inter-region data transfers | Inter-region (peak) link capacity | Monthly GB [In + Out] x \$ / GB |
24. Working Through An HBase Example (illustrative)

    | Resource | Monthly Cost | Monthly Capacity | Unit Cost |
    |----------|--------------|------------------|-----------|
    | Writes | Monthly TCO (less bw.) = \$60 K; Write Serving @ 25% = \$15 K | Total write operations across Region Servers = 100 M | \$15 K / 100 M = \$0.15 per 1000 writes per month |
    | Reads | Monthly TCO (less bw.) = \$60 K; Read Serving @ 25% = \$15 K | Total read operations across Region Servers = 200 M | \$15 K / 200 M = \$0.075 per 1000 reads per month |
    | Storage | Monthly TCO (less bw.) = \$60 K; Storage @ 50% = \$30 K | Raw HDFS = 10 PB; Usable HDFS = [10 x 0.8 (20% overhead)] / 3 = 2.67 PB | \$30 K / 2.67 PB = \$0.011 / GB / Month |
    | Bandwidth | Monthly Charges = \$5 K | Total Data In + Out = 0.25 PB | \$5 K / 0.25 PB = \$0.02 / GB transferred |
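As a quick check on these figures, the per-operation and storage rates can be recomputed as below. This is a sketch with hypothetical names, assuming 1 PB = 1,000,000 GB and the 25% / 25% / 50% split of the \$60 K monthly TCO stated on the slide.

```python
GB_PER_PB = 1_000_000  # decimal units, an assumption of this sketch


def rate_per_thousand_ops(monthly_cost_usd: float, total_ops: int) -> float:
    """Dollars per 1,000 read or write operations."""
    return monthly_cost_usd / (total_ops / 1_000)


monthly_tco = 60_000  # less bandwidth, per the slide

# 25% of TCO attributed to serving writes, 25% to serving reads.
write_rate = rate_per_thousand_ops(0.25 * monthly_tco, 100_000_000)
read_rate = rate_per_thousand_ops(0.25 * monthly_tco, 200_000_000)

# 50% of TCO attributed to storage: 10 PB raw, 20% overhead, 3x replication.
usable_gb = 10 * 0.8 / 3 * GB_PER_PB
hbase_storage_rate = (0.50 * monthly_tco) / usable_gb

print(f"writes:  ${write_rate:.3f} per 1000")
print(f"reads:   ${read_rate:.3f} per 1000")
print(f"storage: ${hbase_storage_rate:.5f} per GB per month")
```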
25. Measuring HBase Resource Consumption
    - Writes
        - Monthly HBase project cost: Write Ops per Region Server per Table Region = #W(R1:RS1) + #W(R2:RS1) + …; Cost = Total Writes x \$0.15 / 1000 writes / month = \$ for the Table / RS / Month
        - Monthly roll-ups: write ops cost for all tables across all region servers for a user, app, BU or the platform
    - Reads
        - Monthly HBase project cost: Read Ops per Region Server per Table Region = #R(R1:RS1) + #R(R2:RS1) + …; Cost = Total Reads x \$0.075 / 1000 reads / month = \$ for the Table / RS / Month
        - Monthly roll-ups: read ops cost for all tables across all region servers for a user, app, BU or the platform
    - Storage
        - Monthly HBase project cost: HDFS size of regions under `hbase/table/<regions>` in GBs; Cost = Total HDFS size x \$0.011 / GB / Month = \$ for the Table / Month
        - Monthly roll-ups: total HDFS size for all tables across all region servers for a user, app, BU or the platform
    - Network bandwidth
        - Bandwidth measured at the cluster level and divided among select apps and users of data based on average volume In/Out
        - Monthly roll-ups: through relationship among user, app, and their BU
26. Putting it Together for HBase Services (illustrative)

    HBase Services Billing Rate Card [Monthly Rates]

    | Resource | Unit | Aggregated / Measured | Cost |
    |----------|------|-----------------------|------|
    | Write Operations | Count of operations | Monthly, total write operations across regions of table | \$0.15 / 1000 Writes |
    | Read Operations | Count of operations | Monthly, total read operations across regions of table | \$0.075 / 1000 Reads |
    | HDFS (Storage) | GB | Monthly, peak storage used | \$0.011 / GB |
    | Network Bandwidth | GB | Monthly, total in/out | \$0.02 / GB |

    Monthly Bill for May 2014

    | BU | Write Ops Count | Write Ops Cost (\$ K) | Read Ops Count | Read Ops Cost (\$ K) | HDFS Used | HDFS Effective Used | HDFS Cost (\$ K) | Network Transferred | Network Cost (\$ K) | Total Cost (\$ K) |
    |----|-----------------|-----------------------|----------------|----------------------|-----------|---------------------|------------------|---------------------|---------------------|-------------------|
    | BU 1 | 30 M | \$4.5 | 20 M | \$1.5 | 3 PB | 0.8 PB | \$8.80 | 1.25 PB | \$0.025 | \$14.82 |
    | BU 2 | 10 M | \$1.5 | 60 M | \$4.5 | 1 PB | 0.27 PB | \$2.93 | 0.5 PB | \$0.01 | \$8.94 |
    | … | … | … | … | … | … | … | … | … | … | … |
    | BU N | … | … | … | … | … | … | … | … | … | … |
    | Total | 100 M | \$15 | 200 M | \$15 | 10 PB | 2.67 PB | \$29.4 | 0.25 PB | \$5 | \$64.4 |
27. Multi-Tenant Deployment For Apache Storm
    - Figure: topologies X, Y and Z sharing Supervisors 1 through N, coordinated by Nimbus and Zookeeper. Each Supervisor hosts workers from every topology (X: Worker i, Y: Worker i, …, Z: Worker i).
28. Understanding Apache Storm Resources
    - Figure: a Supervisor with fixed worker slots, hosting a worker for Topology A and a worker for Topology B, each running several tasks.
    - A Supervisor runs one or more worker processes for one or more topologies
    - Each Supervisor has a fixed number of worker slots
    - A worker process belongs to a specific topology
    - The workers from topologies are distributed randomly across supervisors
    - Tasks perform the actual data processing
29. Unit Costs for Storm Operations

    | Resource | Unit | Total Capacity | Unit Cost |
    |----------|------|----------------|-----------|
    | Worker slots where topology workers execute the actual logic / tasks of spouts and bolts in parallel | \$ / Slot-Hour | Total number of slots | Monthly Slots Used / Avail. Slots |
    | Network bandwidth needed to move data into/out of the clusters by topologies | \$ / GB for inter-region data transfers | Inter-region (peak) link capacity | [Monthly GB In + Out] x \$ / GB |
30. Working Through a Storm Example (illustrative)

    | Resource | Monthly Cost | Monthly Capacity | Unit Cost |
    |----------|--------------|------------------|-----------|
    | Compute | Monthly TCO (less bw.) = \$30 K; 24 slots per Supervisor @ 100% = \$30 K | 19.2 K slots = 19.2 K x 24 x 30 = 13.8 M Slot-Hours | \$30 K / 13.8 M Slot-Hours = \$0.002 / Slot-Hour / Month |
    | Bandwidth | Monthly Charges = \$2.5 K | Total Data In + Out = 0.12 PB | \$2.5 K / 0.12 PB = \$0.02 / GB transferred |
31. Measuring Storm Resource Consumption (illustrative)
    - Compute
        - Monthly cost: Worker Slot-Hours for Topologies = #W(TP1) x T(TP1) + #W(TP2) x T(TP2) + …; Cost = Worker Slot-Hours x \$0.002 / Slot-Hour / Month = \$ for the Topology / Month
        - Monthly roll-ups: Worker Slot-Hours for all topologies can be summed up for the month for a user, app, BU, or the entire platform
    - Network bandwidth
        - Bandwidth measured at the cluster level and divided among select apps and users of data based on average volume In/Out
        - Monthly roll-ups: through relationship among user, app, and their BU
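The Storm figures work the same way as the Hadoop ones, with worker slots in place of memory. The sketch below recomputes the slot-hour rate from the illustrative \$30 K and 19.2 K slots and then meters a made-up pair of topologies; the function names and the topology sizes are hypothetical.

```python
HOURS_PER_MONTH = 24 * 30  # 30-day month, as in the slides


def slot_hour_rate(monthly_cost_usd: float, total_slots: int) -> float:
    """Dollars per worker slot-hour at full availability."""
    return monthly_cost_usd / (total_slots * HOURS_PER_MONTH)


def worker_slot_hours(topologies: list[tuple[int, float]]) -> float:
    """Sum of #W(TP) x T(TP) over (workers, hours) pairs."""
    return sum(workers * hours for workers, hours in topologies)


# Rate derived from the slide's inputs: $30 K for 19.2 K slots.
derived_rate = slot_hour_rate(30_000, 19_200)

# Hypothetical usage: 40 workers for the whole month, 10 workers for 100 hours.
used = worker_slot_hours([(40, HOURS_PER_MONTH), (10, 100)])
monthly_cost = used * 0.002  # billed at the rounded rate-card value

print(f"derived rate: ${derived_rate:.5f} per slot-hour")
print(f"usage: {used:.0f} slot-hours, cost ${monthly_cost:.2f}")
```

Billing at the rounded \$0.002 rather than the unrounded derived rate slightly under-recovers the \$30 K; whether that matters depends on how the rate card is meant to be used.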
32. Putting it Together for Storm Services (illustrative)

    Storm Services Billing Rate Card [Monthly Rates]

    | Resource | Unit | Aggregated / Measured | Cost |
    |----------|------|-----------------------|------|
    | Compute | Worker Slot-Hours | Number of slots used by topology workers and hours they ran for | \$0.002/Slot-Hour |
    | Network Bandwidth | GB | Monthly, total in/out | \$0.02/GB |

    Monthly Bill for May 2014

    | BU | Compute Used (Slot-Hours) | Compute Cost (\$ K) | Network Transferred | Network Cost (\$ K) | Total Cost (\$ K) |
    |----|---------------------------|---------------------|---------------------|---------------------|-------------------|
    | BU1 | 2.5 M | \$5 | 0.02 PB | \$0.4 | \$5.4 |
    | BU2 | 1.25 M | \$2.5 | 0.04 PB | \$0.8 | \$3.3 |
    | … | … | … | … | … | … |
    | BU N | … | … | … | … | … |
    | Total | 10 M | \$20 | 0.12 PB | \$2.4 | \$22.4 |
33. Project Based Costing for Grid Services (illustrative)

    Project Summary

    | Item | Period | Cost |
    |------|--------|------|
    | Grid Services Cost | May 2014 | \$165.5 K |

    Project Usage Details (Data Center DC1)

    | Service / Line item | Usage | Cost |
    |---------------------|-------|------|
    | Apache Hadoop Services | | \$126 K |
    | Compute (Map & Reduce GB-Hours consumed @ \$0.004/GB-Hour) | 12.5 M | \$50 K |
    | Storage (GBs of peak storage used @ \$0.019/GB) | 3.45 PB | \$66 K |
    | Network (GBs In/Out @ \$0.02/GB) | 0.5 PB | \$10 K |
    | Apache HBase Services | | \$34.1 K |
    | Reads (Number of Read Operations @ \$0.075/1000 Reads) | 30 M | \$2.2 K |
    | Writes (Number of Write Operations @ \$0.15/1000 Writes) | 20 M | \$3.0 K |
    | Storage (GBs of peak storage used @ \$0.011/GB) | 2.45 PB | \$26.9 K |
    | Network (GBs In/Out @ \$0.02/GB) | 0.1 PB | \$2 K |
    | Apache Storm Services | | \$5.4 K |
    | Compute (Slot-Hours consumed @ \$0.002/Slot-Hour) | 2.5 M | \$5 K |
    | Network (GBs In/Out @ \$0.02/GB) | 0.02 PB | \$0.4 K |
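One way to picture how such a statement is produced is to multiply metered usage by the rate card. The sketch below does this for the Hadoop section only; the dictionary keys and function name are hypothetical, and 1 PB is taken as 1,000,000 GB.

```python
GB_PER_PB = 1_000_000  # decimal units, an assumption of this sketch

# Illustrative Hadoop rates from the slides (dollars).
HADOOP_RATES = {
    "compute_per_gb_hour": 0.004,
    "storage_per_gb": 0.019,
    "network_per_gb": 0.02,
}


def hadoop_bill(gb_hours: float, peak_storage_pb: float, network_pb: float,
                rates: dict[str, float] = HADOOP_RATES) -> dict[str, float]:
    """Line items and total, in dollars, for one project and month."""
    lines = {
        "compute": gb_hours * rates["compute_per_gb_hour"],
        "storage": peak_storage_pb * GB_PER_PB * rates["storage_per_gb"],
        "network": network_pb * GB_PER_PB * rates["network_per_gb"],
    }
    lines["total"] = sum(lines.values())
    return lines


# Usage figures from the Hadoop section of the project statement.
bill = hadoop_bill(gb_hours=12_500_000, peak_storage_pb=3.45, network_pb=0.5)
for item, dollars in bill.items():
    print(f"{item:8s} ${dollars / 1_000:,.1f} K")
```

The computed line items round to the \$50 K, \$66 K and \$10 K shown for Hadoop, for a section total near \$126 K.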
34. Platform P&L (illustrative; values left blank on purpose)
    - Columns: Q4’12, Q1’13, Q2’13, Q3’13, Total, Total %
    - Line items:
        - Y! Gross Revenues
        - Cost of revenues (less Grid CapEx)
        - Gross Profit
        - Grid OpEx: R&D Headcount, SE&O Headcount, Acquisition/Install, Active Use/Ops, Network Bandwidth
        - Total Grid OpEx
        - Grid CapEx: Grid Services
        - Total Grid CapEx
        - Contribution Margin
        - Indirect Costs: G&A, Sales and Marketing
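35. Hadoop Cost Benchmarking – An Approach (illustrative)

    Quantity equivalent

    | Resource | Monthly Used | Monthly Unused | Monthly Total | Public pricing or terms-based basis | Used on-premise equivalent |
    |----------|--------------|----------------|---------------|-------------------------------------|----------------------------|
    | M/R | 71.4 M | 61.6 M | 133 M | Compute instances (normalized time, RAM, 32/64 ops, I/O etc.) | 1,000 instances/hr. |
    | HDFS | 148 PB | 52 PB | 200 PB | Storage (account for 3x repl., job/app space) | 30 PB/month |
    | Avg. Data Processed | - | - | 75 PB | Instance storage | 2.5 PB daily |

    Cost equivalent

    | Resource | Monthly Used | Monthly Unused | Monthly Total | Public pricing or terms-based basis | Used on-premise equivalent |
    |----------|--------------|----------------|---------------|-------------------------------------|----------------------------|
    | M/R | \$0.50 M | \$0.50 M | \$1 M | 1,000 x \$0.70/instance/hr. x 24 x 30 | \$0.5 M |
    | HDFS | \$0.75 M | \$0.25 M | \$1 M | 30 PB x \$0.04/GB/month | \$1.2 M |
    | Other | | | | Other costs (if any) such as reads, writes, data services/hour etc. | \$0.25 M |
    | Total * | \$1.25 M | \$0.75 M | \$2 M | Total | \$1.95 M |

    \* Ignored bandwidth, assumed equivalent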
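The cost-equivalent half of this benchmark is simple enough to sketch in code. The function and parameter names below are hypothetical, and the prices are the slide's illustrative figures rather than any provider's actual rates.

```python
HOURS_PER_MONTH = 24 * 30
GB_PER_PB = 1_000_000  # decimal units, an assumption of this sketch


def cloud_equivalent_cost(instances: int, price_per_instance_hour: float,
                          storage_pb: float, price_per_gb_month: float,
                          other_costs: float = 0.0) -> float:
    """Monthly pay-for-use cost for the used on-premise equivalent."""
    compute = instances * price_per_instance_hour * HOURS_PER_MONTH
    storage = storage_pb * GB_PER_PB * price_per_gb_month
    return compute + storage + other_costs


# Inputs from the slide: 1,000 instances at $0.70/hr, 30 PB at $0.04/GB/month,
# plus $0.25 M of other costs.
cloud = cloud_equivalent_cost(1_000, 0.70, 30, 0.04, other_costs=250_000)

on_prem_total = 2_000_000  # fixed monthly cost, used plus unused capacity
on_prem_used = 1_250_000   # portion attributed to used capacity

print(f"cloud equivalent: ${cloud:,.0f}")
print(f"on-premise total: ${on_prem_total:,.0f} (used: ${on_prem_used:,.0f})")
```

With these inputs the cloud-equivalent figure comes out a little under the \$2 M fixed on-premise cost, in line with the \$1.95 M on the slide.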
36. HBase and Storm Cost Benchmarking (illustrative)

    Quantity equivalent

    | Resource | On-premise total | Public pricing or terms-based basis | Used on-premise equivalent |
    |----------|------------------|-------------------------------------|----------------------------|
    | Reads | Peak concurrent reads for a given record size: 300 MB/s | Reads on chosen instances (benchmarks 45 MB/s) | 300/45 = 7 instances |
    | Writes | Peak concurrent writes for a given record size: 160 MB/s | Writes on chosen instances (benchmarks 10 MB/s) | 160/10 = 16 instances |
    | Storage | Data storage in tables (incl. replication): 1.6 TB | Data served per instance (benchmarks 0.5 TB incl. repl.) | 1.6/0.5 = 3 |
    | HBase instances | | Instances required based on throughput and storage needs | 16 instances/hour |
    | Slot-Hours | Slot hours per month: 2.5 M | Instance hours based on memory and CPU requirements (12 slots / instance) | 0.21 M instance hours |

    Cost calculations for both HBase and Storm stay the same as Hadoop.

    \* Ignored bandwidth, assumed equivalent
37. Improving Utilization favors on-premise setup
    - Figure: cost (\$) against utilization / consumption (compute and storage), with curves for on-premise Hadoop as a Service, on-demand public cloud service, and terms-based public cloud service. Annotations: “High starting cost”, “Scaling up”, two points marked x, and regions labeled “Favors public cloud service” and “Favors on-premise Hadoop as a Service”.
    - Sensitivity analysis on costs based on current and expected utilization or target utilization can provide further insights into your operations and cost competitiveness
38. Improving Utilization improves ROI
    - Figure: cost amortized over apps (\$) against time, across Phase I, 2012 – 2013 (H 0.23), and 2014 & Future, with Cost (t) = C at time t and Cost (t’) = C’ at time t’.
    - The number of apps continues to grow on the platform
    - At time t, BU profits are R (t) - C (t) = π (t)
    - Platform’s goal is to continue to increase the ROI while supporting new technology and services: R (t’) - C (t’) = π (t’), where C (t’) < C (t) and π (t’) > π (t) for same or bigger revenues
39. Going Forward
    - Hadoop
        - CPU as a resource
        - Pre-emption and priority
        - Long-running jobs
        - Other potential resources such as disk, network, GPUs etc.
        - Tez as the execution engine / Container reuse
    - HBase
        - Multiple Region Servers per node
        - Larger JVMs / GC improvements
        - HBase-on-YARN
        - cgroup profiles
    - Storm
        - Storm-on-YARN
        - Resource aware scheduling (memory, CPU, network)
        - cgroup profiles
        - More experience with multi-tenancy
40. Thank You. @sumeetksingh, @amritasshwar. We are hiring! Stop by Kiosk P9 or reach out to us at bigdata@yahoo-inc.com.