Hadoop Summit - 2014
Cost of Ownership for Hadoop
Implementation
Santosh Jha,
Steve Ackley
  • Welcome to Hadoop Summit 2014.
  • Examples:

    IT engineer working to create reports
  • Thank you for your time today. Hope this has been helpful.
  • Transcript of "Cost of Ownership for Hadoop Implementation"

    1. Hadoop Summit - 2014: Cost of Ownership for Hadoop Implementation. Santosh Jha, Steve Ackley
    2. Part 1 – Estimating TCO
    3. Iceberg: Estimating TCO is hard. Like an iceberg, many costs are hidden. Example: integration of Big Data within the existing ecosystem.
    4. Hadoop Implementations
        Hadoop deployment methods: On Premise, Hadoop Appliance, Hadoop Hosting, Hadoop as a service
        Infrastructure labels: Bare Metal, Cloud
        Sample vendors: Hortonworks, IBM, EMC, AWS EMR, Cloudera, Oracle, Teradata, Rackspace, Altiscale, MapR, VMware, GoGrid, Qubole
    5. On-Premise Cost Categories (Cost Group: Items)
        • Hardware/Infrastructure Costs: Servers, Peripherals, Network, Storage
        • Communication Costs: Local Area Network, Wide Area Network, Remote Access
        • Software Costs: License/Subscription Fees
        • Implementation Costs: Development/customization/integration, Training, Consulting, Non-Functional Testing (Performance, Capacity, Security, etc.)
        • Management Costs: Hardware & software upgrades, Hardware & software administration, Legal Cost
        • Support Costs: Support staff, Staff training, Travel, Support contracts, Overhead labor, High Availability Cost, Disaster Recovery Cost, Ticketing & Troubleshooting Cost, Monitoring Cost, Internal Audit Cost
    6. Managing Risk (Cost Group: Items)
        • Vendor: Vendor Viability, Control on Technical Architecture, Data Protection, Loss of Intellectual Property, Loss of Privacy
        • Internal IT: Vendor Viability, Control on Technical Architecture, Data Protection, Loss of Intellectual Property, Loss of Privacy
    7. Sample calculation: Inputs
        Average Monthly HDFS (TB)             1500
        Peak HDFS over Monthly (TB)           100
        Monthly HDFS Growth (TB)              20
        Average Monthly Compute ('000 SH)     20
        Peak Compute (SH)                     1400
        Planning Cycle (Months)               36
        Purchased Distribution                No
        Hadoop Admin Costs                    Included
        Data from S3                          Yes
    8. Results without considering risk
        [Bar chart: Cost over 36 Months for Hadoop as a service, On Premise, Amazon EMR, and Hadoop Distribution on EC2; axis from 0 to 8,000,000. Bar values are not recoverable from the transcript.]
    9. Managing Risk (Vendor) – Sample data
        Managing Risk                        Risk Factor   Weight (%)   Calculated Risk
        Vendor Viability                     2             40           0.8
        Control on Technical Architecture    1             20           0.2
        Data Protection                      2             15           0.3
        Loss of Intellectual Property        1             10           0.1
        Loss of Privacy                      2             15           0.3
        Total                                                           1.7
        Risk Factor scale:
        • Vendor Viability: 1 - No risk, 5 - Very high risk with vendor viability
        • Control on Technical Architecture: 1 - No need to control, 5 - Compelling need to control technical architecture
        • Data Protection: 1 - High data protection provided by architecture and process, 5 - No data protection
        • Loss of Intellectual Property: 1 - No IP, 5 - High business impact with the loss of IP
        • Loss of Privacy: 1 - No privacy issue for the solution, 5 - High business impact with loss of data
    10. Managing Risk (Internal IT) – Sample data
        Managing Risk                        Risk Factor   Weight (%)   Calculated Risk
        Vendor Viability                     1             40           0.4
        Control on Technical Architecture    1             20           0.2
        Data Protection                      2             15           0.3
        Loss of Intellectual Property        1             10           0.1
        Loss of Privacy                      2             15           0.3
        Total                                                           1.3
        Risk Factor scale:
        • Vendor Viability: 1 - No risk, 5 - Very high risk with vendor viability
        • Control on Technical Architecture: 1 - No need to control, 5 - Compelling need to control technical architecture
        • Data Protection: 1 - High data protection provided by architecture and process, 5 - No data protection
        • Loss of Intellectual Property: 1 - No IP, 5 - High business impact with the loss of IP
        • Loss of Privacy: 1 - No privacy issue for the solution, 5 - High business impact with loss of data
    11. Results after considering risk
        [Bar chart: Cost over 36 Months for Hadoop as a service, On Premise, Amazon EMR, and Hadoop Distribution on EC2; axis from 0 to 14,000,000. Bar values are not recoverable from the transcript.]
    12. Part 2 - Deployment Considerations
    13. On-Premise Implementation – When?
        • Well-defined use cases with a demonstrated ROI
        • Developed and tuned Hadoop applications
        • IT team with the experience and bandwidth to manage and maintain Hadoop and the integrated hardware/software stack, as well as troubleshoot job problems
        • Sufficient number of nodes to support:
            o Growth in data sets
            o “Bursty” nature of jobs
    14. On-Premise Implementation – Company Profile
        • Large enterprise with a strategic need for Big Data Analytics
        • Moved from an exploratory stage to enterprise adoption
        • Committed IT resources to support the Hadoop hardware/software stack
    15. Hadoop as a Service – The Continuum
        • Vendors manage the hardware
        • Vendors install Hadoop
        • Vendors manage Hadoop
    16. Vendors Manage The Hardware
        For organizations that:
        • Want to create a small cluster for a relatively short period of time, for training and software development purposes.
        • Have a short-term processing need and no internal capacity to support it.
        • Do not have an IT organization that can install, manage, maintain and operate the Hadoop hardware/software stack, and can fix “broken” jobs.
    17. Vendors Install Hadoop
        For organizations that:
        • Have a short-term need or small-scale Hadoop requirement.
        • Have Hadoop applications that are “bursty.”
        • Have an IT organization that can operate the Hadoop hardware/software stack, can manage scaling the cluster, and can fix “broken” jobs.
        • Do not need to tailor the hardware to their specific requirements.
    18. Vendors Manage Hadoop
        For organizations that:
        • Do not have an IT organization that can install, manage, maintain and operate the Hadoop hardware/software stack, and fix “broken” jobs.
        • Do not have the IT hardware infrastructure that’s required.
        • May need an “always on” Hadoop environment.
        • Need service providers that:
            o Can handle all aspects of the IT support for Hadoop.
            o Can provide comprehensive SLAs.
            o May offer hardware optimized for Hadoop.
    19. Thank You. Contact: steve@altiscale.com, Santosh.jha@aziksa.com
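
The risk tables on slides 9 and 10 appear to compute each "Calculated Risk" as the risk factor (rated 1 to 5) multiplied by its weight as a fraction, then sum the results. The following is a minimal Python sketch of that arithmetic using the sample data from the slides. The names `RiskItem` and `weighted_risk` are illustrative and do not come from the deck. The slides do not state how the total score is applied to the 36-month cost figures, so the sketch stops at the score.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RiskItem:
    """One row of a risk table (hypothetical structure, not from the deck)."""
    name: str
    rating: int      # risk factor, 1 (lowest) to 5 (highest) per the slide legend
    weight_pct: int  # weight in percent


def weighted_risk(items):
    """Return the sum of rating * weight, with weights given in percent."""
    total_weight = sum(item.weight_pct for item in items)
    if total_weight != 100:
        raise ValueError(f"weights sum to {total_weight}, expected 100")
    # Sum in integer percent first, then divide once, to limit float error.
    return sum(item.rating * item.weight_pct for item in items) / 100


# Sample data from slide 9 (Vendor).
VENDOR = [
    RiskItem("Vendor Viability", 2, 40),
    RiskItem("Control on Technical Architecture", 1, 20),
    RiskItem("Data Protection", 2, 15),
    RiskItem("Loss of Intellectual Property", 1, 10),
    RiskItem("Loss of Privacy", 2, 15),
]

# Sample data from slide 10 (Internal IT): only Vendor Viability differs.
INTERNAL_IT = [RiskItem("Vendor Viability", 1, 40)] + VENDOR[1:]

if __name__ == "__main__":
    print("Vendor total risk:", weighted_risk(VENDOR))
    print("Internal IT total risk:", weighted_risk(INTERNAL_IT))
```

With the sample data, the two totals match the 1.7 and 1.3 shown on the slides. Checking that the weights sum to 100 keeps scores comparable between deployment options.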
