Hadoop Summit - 2014
Cost of Ownership for Hadoop Implementation
Santosh Jha, Steve Ackley
Part 1 – Estimating TCO
Iceberg
Estimating TCO is hard. Like an iceberg, many costs are hidden.
Example: integration of Big Data within the existing ecosystem.
Hadoop Implementations
Hadoop deployment methods
On Premise | Hadoop Appliance | Hadoop Hosting | Hadoop as a Service
Bare Metal | Cloud
Sample Vendors
Hortonworks IBM, EMC AWS EMR
Cloudera Oracle, Teradata Rackspace Altiscale
MapR VMware GoGrid Qubole
On-Premise Cost Categories
Cost Group | Item
Hardware/Infrastructure Costs | Servers, Peripherals, Network, Storage
Communication Costs | Local Area Network, Wide Area Network, Remote Access
Software Costs | License/Subscription Fees
Implementation Costs | Development/customization/integration, Training, Consulting, Non-Functional Testing (Performance, Capacity, Security, etc.)
Management Costs | Hardware & software upgrades, Hardware & software administration, Legal Cost
Support Costs | Support staff, Staff training, Travel, Support contracts, Overhead labor, High Availability Cost, Disaster Recovery Cost, Ticketing & Troubleshooting Cost, Monitoring Cost, Internal Audit Cost
Managing Risk
Cost Group | Item
Vendor | Vendor Viability, Control on Technical Architecture, Data Protection, Loss of Intellectual Property, Loss of Privacy
Internal IT | Vendor Viability, Control on Technical Architecture, Data Protection, Loss of Intellectual Property, Loss of Privacy
Sample calculation
Inputs
Average Monthly HDFS (TB): 1500
Peak HDFS over Monthly (TB): 100
Monthly HDFS Growth (TB): 20
Average Monthly Compute ('000 SH): 20
Peak Compute (SH): 1400
Planning Cycle (Months): 36
Purchased Distribution: No
Hadoop Admin Costs: Included
Data from S3: Yes
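The storage inputs above can be turned into a rough capacity projection over the planning cycle. The slides do not say how the inputs are combined, so the following is only a sketch under two assumptions: that growth is linear at the stated monthly rate, and that "Peak HDFS over Monthly" is headroom added on top of the average. The function name `projected_hdfs_tb` is hypothetical.

```python
# Inputs copied from the sample calculation.
AVG_HDFS_TB = 1500        # Average Monthly HDFS (TB)
PEAK_OVER_AVG_TB = 100    # Peak HDFS over Monthly (TB)
MONTHLY_GROWTH_TB = 20    # Monthly HDFS Growth (TB)
PLANNING_MONTHS = 36      # Planning Cycle (Months)


def projected_hdfs_tb(month, include_peak=False):
    """Estimate HDFS capacity (TB) needed `month` months into the plan.

    Assumption (not from the slides): linear growth, with the peak
    figure treated as additive headroom over the average.
    """
    base = AVG_HDFS_TB + MONTHLY_GROWTH_TB * month
    return base + (PEAK_OVER_AVG_TB if include_peak else 0)


if __name__ == "__main__":
    for m in (0, 12, 24, PLANNING_MONTHS):
        print(m, projected_hdfs_tb(m), projected_hdfs_tb(m, include_peak=True))
```

Under these assumptions, an on-premise cluster would have to be sized for the end-of-cycle peak, while a pay-as-you-go option could follow the month-by-month figure.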
Results without considering risk
[Bar chart: Cost over 36 Months for Hadoop as a service, On Premise, Amazon EMR, and Hadoop Distribution on EC2; vertical axis from 0 to 8,000,000]
Managing Risk (Vendor) – Sample data
Managing Risk | Risk Factor | Weight (%) | Calculated Risk
Vendor Viability | 2 | 40 | 0.8
Control on Technical Architecture | 1 | 20 | 0.2
Data Protection | 2 | 15 | 0.3
Loss of Intellectual Property | 1 | 10 | 0.1
Loss of Privacy | 2 | 15 | 0.3
Total | | | 1.7
Risk factor scale:
Vendor Viability: 1 - No risk, 5 - Very high risk with vendor viability
Control on Technical Architecture: 1 - No need to control, 5 - Compelling need to control technical architecture
Data Protection: 1 - High data protection provided by architecture and process, 5 - No data protection
Loss of Intellectual Property: 1 - No IP, 5 - High business impact with the loss of IP
Loss of Privacy: 1 - No privacy issue for the solution, 5 - High business impact with loss of data
Managing Risk (Internal IT) – Sample data
Managing Risk | Risk Factor | Weight (%) | Calculated Risk
Vendor Viability | 1 | 40 | 0.4
Control on Technical Architecture | 1 | 20 | 0.2
Data Protection | 2 | 15 | 0.3
Loss of Intellectual Property | 1 | 10 | 0.1
Loss of Privacy | 2 | 15 | 0.3
Total | | | 1.3
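The "Calculated Risk" column in the two sample tables is consistent with multiplying each risk factor (rated 1 to 5) by its weight in percent and summing the results. A minimal sketch of that arithmetic, using the sample data from both tables; the names `weighted_risk`, `VENDOR`, and `INTERNAL_IT` are illustrative, not from the slides.

```python
# (risk factor on the 1-5 scale, weight in percent), from the sample tables.
VENDOR = {
    "Vendor Viability": (2, 40),
    "Control on Technical Architecture": (1, 20),
    "Data Protection": (2, 15),
    "Loss of Intellectual Property": (1, 10),
    "Loss of Privacy": (2, 15),
}

INTERNAL_IT = {
    "Vendor Viability": (1, 40),
    "Control on Technical Architecture": (1, 20),
    "Data Protection": (2, 15),
    "Loss of Intellectual Property": (1, 10),
    "Loss of Privacy": (2, 15),
}


def weighted_risk(ratings):
    """Sum of factor * weight, with weights given in percent."""
    weights = [weight for _, weight in ratings.values()]
    if sum(weights) != 100:
        raise ValueError("weights must add up to 100%")
    # Sum in integer units first, then divide once, to avoid
    # accumulating floating-point rounding error.
    return sum(factor * weight for factor, weight in ratings.values()) / 100


if __name__ == "__main__":
    print("Vendor:", weighted_risk(VENDOR))
    print("Internal IT:", weighted_risk(INTERNAL_IT))
```

The two tables differ only in the Vendor Viability rating (2 versus 1), which accounts for the whole gap between the two totals because that item carries the largest weight.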
Results after considering risk
[Bar chart: Cost over 36 Months for Hadoop as a service, On Premise, Amazon EMR, and Hadoop Distribution on EC2; vertical axis from 0 to 14,000,000]
Part 2 – Deployment Considerations
On-Premise Implementation – When?
• Well-defined use cases with a demonstrated ROI
• Developed and tuned Hadoop applications
• IT team with experience and bandwidth to
manage/maintain Hadoop and integrated
hardware/software stack - as well as troubleshoot job
problems
• Sufficient # of Nodes to Support:
o Growth in Data Sets
o “Bursty” Nature of Jobs
On-Premise Implementation – Company Profile
• Large enterprise with a strategic need for Big Data
Analytics
• Moved from an exploratory stage to enterprise
adoption
• Committed IT resources to support Hadoop
hardware/software stack
Hadoop as a Service – The Continuum
• Vendors manage the hardware
• Vendors install Hadoop
• Vendors manage Hadoop
Vendors Manage The Hardware
For Organizations that:
• Want to create a small cluster for a relatively
short period of time, for training and software
development purposes.
• Have a short-term processing need and no
internal capacity to support it.
• Do not have an IT organization that can install,
manage, maintain and operate the Hadoop
hardware/software stack, and can fix “broken”
jobs.
Vendors Install Hadoop
For Organizations that:
• Have a short-term need or small-scale Hadoop
requirement.
• Have Hadoop applications that are “bursty.”
• Have an IT organization that can operate the
Hadoop hardware/software stack, can manage
scaling the cluster, and can fix “broken” jobs.
• Do not need to tailor the hardware to their
specific requirements.
Vendors Manage Hadoop
For Organizations that:
• Do not have the IT organization that can install,
manage, maintain and operate the Hadoop
hardware/software stack, and fix “broken” jobs.
• Do not have the IT hardware infrastructure that’s
required.
• May need an “always on” Hadoop environment.
• Need service providers that:
o Can handle all aspects of the IT support for Hadoop.
o Can provide comprehensive SLAs.
o May offer hardware optimized for Hadoop.
Thank You
Contact:
steve@altiscale.com
Santosh.jha@aziksa.com

A Glance At The Java Performance Toolbox
 
React JS; all concepts. Contains React Features, JSX, functional & Class comp...
React JS; all concepts. Contains React Features, JSX, functional & Class comp...React JS; all concepts. Contains React Features, JSX, functional & Class comp...
React JS; all concepts. Contains React Features, JSX, functional & Class comp...
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a reality
 
Landscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdfLandscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdf
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024
 
Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)
 
QCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architecturesQCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architectures
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
 

Cost of Ownership for Hadoop Implementation - Hadoop Summit 2014

  • 1. Hadoop Summit - 2014 Cost of Ownership for Hadoop Implementation Santosh Jha, Steve Ackley
  • 2. Part 1 – Estimating TCO
  • 3. Iceberg: Estimating TCO is hard. Like an iceberg, many costs are hidden. Example: integration of Big Data within the existing ecosystem.
  • 4. Hadoop Implementations
      Hadoop deployment methods: On Premise, Hadoop Appliance, Hadoop Hosting, Hadoop as a Service
      Axis labels: Bare Metal, Cloud
      Sample vendors: Hortonworks; IBM, EMC; AWS EMR; Cloudera; Oracle, Teradata; Rackspace; Altiscale; MapR; VMware; GoGrid; Qubole
  • 5. On-Premise Cost Categories
      Cost Group | Item
      Hardware/Infrastructure Costs | Servers, Peripherals, Network, Storage
      Communication Costs | Local Area Network, Wide Area Network, Remote Access
      Software Costs | License/Subscription Fees
      Implementation Costs | Development/customization/integration, Training, Consulting, Non-Functional Testing (Performance, Capacity, Security, etc.)
      Management Costs | Hardware & software upgrades, Hardware & software administration, Legal Cost
      Support Costs | Support staff, Staff training, Travel, Support contracts, Overhead labor, High Availability Cost, Disaster Recovery Cost, Ticketing & Troubleshooting Cost, Monitoring Cost, Internal Audit Cost
  • 6. Managing Risk
      Cost Group | Item
      Vendor | Vendor Viability, Control on Technical Architecture, Data Protection, Loss of Intellectual Property, Loss of Privacy
      Internal IT | Vendor Viability, Control on Technical Architecture, Data Protection, Loss of Intellectual Property, Loss of Privacy
  • 7. Sample calculation
      Inputs
      Average Monthly HDFS (TB) | 1500
      Peak HDFS over Monthly (TB) | 100
      Monthly HDFS Growth (TB) | 20
      Average Monthly Compute ('000 SH) | 20
      Peak Compute (SH) | 1400
      Planning Cycle (Months) | 36
      Purchased Distribution | No
      Hadoop Admin Costs | Included
      Data from S3 | Yes
  • 8. Results without considering risk
      [Bar chart: Cost over 36 Months for Hadoop as a service, On Premise, Amazon EMR, and Hadoop Distribution on EC2; vertical axis from 0 to 8,000,000]
  • 9. Managing Risk (Vendor) – Sample data
      Managing Risk | Risk Factor | Weight (%) | Calculated Risk
      Vendor Viability | 2 | 40 | 0.8
      Control on Technical Architecture | 1 | 20 | 0.2
      Data Protection | 2 | 15 | 0.3
      Loss of Intellectual Property | 1 | 10 | 0.1
      Loss of Privacy | 2 | 15 | 0.3
      Total | | | 1.7
      Risk factor scale:
      Vendor Viability: 1 - No Risk, 5 - Very High Risk with vendor viability
      Control on Technical Architecture: 1 - No Need to Control, 5 - Compelling Need to control technical architecture
      Data Protection: 1 - High data protection provided by architecture and process, 5 - No data protection
      Loss of Intellectual Property: 1 - No IP, 5 - High business impact with the loss of IP
      Loss of Privacy: 1 - No privacy issue for the solution, 5 - High business impact with loss of Data
  • 10. Managing Risk (Internal IT) – Sample data
      Managing Risk | Risk Factor | Weight (%) | Calculated Risk
      Vendor Viability | 1 | 40 | 0.4
      Control on Technical Architecture | 1 | 20 | 0.2
      Data Protection | 2 | 15 | 0.3
      Loss of Intellectual Property | 1 | 10 | 0.1
      Loss of Privacy | 2 | 15 | 0.3
      Total | | | 1.3
      Risk factor scale:
      Vendor Viability: 1 - No Risk, 5 - Very High Risk with vendor viability
      Control on Technical Architecture: 1 - No Need to Control, 5 - Compelling Need to control technical architecture
      Data Protection: 1 - High data protection provided by architecture and process, 5 - No data protection
      Loss of Intellectual Property: 1 - No IP, 5 - High business impact with the loss of IP
      Loss of Privacy: 1 - No privacy issue for the solution, 5 - High business impact with loss of Data
  • 11. Results after considering risk
      [Bar chart: Cost over 36 Months for Hadoop as a service, On Premise, Amazon EMR, and Hadoop Distribution on EC2; vertical axis from 0 to 14,000,000]
  • 12. Part 2 - Deployment Considerations
  • 13. On-Premise Implementation – When?
      • Well-defined use cases with a demonstrated ROI
      • Developed and tuned Hadoop applications
      • IT team with experience and bandwidth to manage/maintain Hadoop and integrated hardware/software stack, as well as troubleshoot job problems
      • Sufficient # of Nodes to Support:
          o Growth in Data Sets
          o “Bursty” Nature of Jobs
  • 14. On-Premise Implementation – Company Profile
      • Large enterprise with a strategic need for Big Data Analytics
      • Moved from an exploratory stage to enterprise adoption
      • Committed IT resources to support Hadoop hardware/software stack
  • 15. Hadoop as a Service – The Continuum
      • Vendors manage the hardware
      • Vendors install Hadoop
      • Vendors manage Hadoop
  • 16. Vendors Manage The Hardware
      For Organizations that:
      • Want to create a small cluster for a relatively short period of time, for training and software development purposes.
      • Have a short-term processing need and no internal capacity to support it.
      • Do not have an IT organization that can install, manage, maintain and operate the Hadoop hardware/software stack, and can fix “broken” jobs.
  • 17. Vendors Install Hadoop
      For Organizations that:
      • Have a short-term need or small-scale Hadoop requirement.
      • Have Hadoop applications that are “bursty.”
      • Have an IT organization that can operate the Hadoop hardware/software stack, can manage scaling the cluster, and can fix “broken” jobs.
      • Do not need to tailor the hardware to their specific requirements.
  • 18. Vendors Manage Hadoop
      For Organizations that:
      • Do not have the IT organization that can install, manage, maintain and operate the Hadoop hardware/software stack, and fix “broken” jobs.
      • Do not have the IT hardware infrastructure that’s required.
      • May need an “always on” Hadoop environment.
      • Need service providers that:
          o Can handle all aspects of the IT support for Hadoop.
          o Can provide comprehensive SLAs.
          o May offer hardware optimized for Hadoop.
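
The "Calculated Risk" column on slides 9 and 10 appears to be each risk factor (scored 1 to 5) multiplied by its weight, with the total being the sum across the five risks. The sketch below reproduces that arithmetic from the sample data. The function name `weighted_risk` and the dictionary layout are illustrative assumptions, not something the slides define, and the slides do not state how the total is then applied to the cost figures on slide 11, so that step is not modeled here.

```python
# Weights (%) and risk factors (1-5) copied from the sample data on slides 9 and 10.
WEIGHTS = {
    "Vendor Viability": 40,
    "Control on Technical Architecture": 20,
    "Data Protection": 15,
    "Loss of Intellectual Property": 10,
    "Loss of Privacy": 15,
}

VENDOR_FACTORS = {
    "Vendor Viability": 2,
    "Control on Technical Architecture": 1,
    "Data Protection": 2,
    "Loss of Intellectual Property": 1,
    "Loss of Privacy": 2,
}

INTERNAL_IT_FACTORS = {
    "Vendor Viability": 1,
    "Control on Technical Architecture": 1,
    "Data Protection": 2,
    "Loss of Intellectual Property": 1,
    "Loss of Privacy": 2,
}


def weighted_risk(factors, weights):
    """Sum of (risk factor * weight%) over all risks (hypothetical helper)."""
    if sum(weights.values()) != 100:
        raise ValueError("weights are expected to add up to 100%")
    return sum(factors[name] * weights[name] / 100 for name in weights)


# Round to two decimals to hide floating-point noise in the sums.
vendor_total = round(weighted_risk(VENDOR_FACTORS, WEIGHTS), 2)
internal_it_total = round(weighted_risk(INTERNAL_IT_FACTORS, WEIGHTS), 2)

print("Vendor total risk:", vendor_total)
print("Internal IT total risk:", internal_it_total)
```

With the sample data, the two totals match the 1.7 and 1.3 shown on the slides; the only difference between the two cases is the Vendor Viability factor (2 versus 1), which carries the largest weight.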

Editor's Notes

  1. Welcome to Hadoop Summit 2014.
  2. Welcome to Hadoop Summit 2014.
  3. Examples: IT engineer working to create reports
  4. Thank you for your time today. Hope this has been helpful.