Conflict in the Cloud 
Big data and cloud computing 
Keith Peterson, CEO 
Halo BI 
©2014 Halo Business Intelligence | All Rights Reserved
Starting points 
• Information management and analytics issues hurting business objectives 
• Taking days and weeks to get to the data 
• Multiple copies of data around the organization 
• No shared view of the truth 
• ETL and data warehouse unable to handle loads 
• BI and reporting eating up capacity 
• Data volumes growing but budgets static 
• Desire to leverage new machine data sources 
The “Big Data” challenge (Executive View) 
… 
The “Big Data” challenge (Business View) 
… 
Big Data Is… 
• Ever escalating volumes 
• Expanding sources, such as Internet Of Things 
• Increasingly high velocities 
• With a widening variety of unstructured formats and semantic contexts 
Big Data Is Not Really Useful 
Unless insight can be gleaned through analytics...with a reasonable effort!
Three Strategies to Deal with Big Data 
1. Ignore It
• Don’t jump on the bandwagon
• You have better things to focus on
2. Archive It
• Just collect and store it
• You can always analyze it when resources free up
3. Analyze It
• With a clear business problem and ROI
• Invest in infrastructure to derive the insights needed
Big Data 
Google 1 Trillion Web Pages per Year 
Facebook 1 Million GB of Disk Storage 
Yelp! 100 GB of log data per day 
Youtube.com 20 Petabytes new video per year 
. 
. 
Regional medical center – patient sensors 25 TB 
Mid-market retailer – POS 10 TB 
Mid-market manufacturer – machine sensors 6 TB 
http://www.google.com/trends/explore#q=%2Fm%2F04y7lrx%2C%20Amazon%20Aws%2C%20Rackspace&cmpt=q 
Five Big Data Questions 
1. Left Behind?
• Everyone is doing it…you need to. Really?
• Or will cost exceed benefit?
2. Cloud?
• Big Data requires Big Compute
• Outsourcing risks: loss of control, platform reliability, privacy, security
3. Data?
• Which data and sources?
• Too much to handle
• Some or all?
4. Tools?
• All vendors have a Big Data suite
• Which one?
5. Usefulness?
• How to query the data
• Skills needed
• Machine Learning
Big Data and Cloud Computing 
• Commodity computing to execute distributed queries across multiple data sets
• Rent commodity server instances to execute computation remotely
• Cloud hosting for $10/TB/month
Traditional BI Architecture 
On Premise Or Cloud 
Operational Data 
• Data Volumes = 100 GB – 5 TB 
• Manageable on-premise or in the cloud 
Data Warehouse
Add Big Data 
Data volumes = 6 TB + 
Big Data Logs 
Cluster
Big Data in the Cloud 
The “Traditional” Approach 
[Diagram labels: Data Platform; Commodity Storage; Traditional RDBMS; MPP; SQL; SSAS; SharePoint BI; Stream; Machine Learning; Sqoop; Hive ODBC; Client; Familiar BI Tools; Browser]
• Use Amazon Redshift, Azure HDInsight or similar
• Use Blob storage to persist big data
• Spin up compute clusters as needed
• Keep data in the cloud perpetually
Big Data Storage Costs 
As of Nov 2014

Provider Type                  Cost per TB per year   Cost per PB per year
Amazon EBS SSD storage         $1,229                 $1,258,291
Amazon EBS Magnetic Storage    $614                   $629,145
Amazon S3 Storage              $411                   $420,372
Azure Tables & Queues          $792                   $811,302
Azure Blob Storage             $288                   $294,912

Sources:
http://calculator.s3.amazonaws.com/index.html
http://azure.microsoft.com/en-us/pricing/calculator/
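To make the table easier to apply to a specific data volume, here is a minimal Python sketch that multiplies a volume in TB by the per-TB yearly prices listed above. The function name `annual_storage_cost` and the 25 TB example volume are illustrative assumptions, and the prices are the Nov 2014 figures from this slide, not current list prices.

```python
# Per-TB yearly prices copied from the slide's table (as of Nov 2014).
COST_PER_TB_YEAR = {
    "Amazon EBS SSD": 1229,
    "Amazon EBS Magnetic": 614,
    "Amazon S3": 411,
    "Azure Tables & Queues": 792,
    "Azure Blob": 288,
}


def annual_storage_cost(volume_tb, provider):
    """Return the yearly storage cost in dollars for volume_tb terabytes.

    Hypothetical helper: a flat price per TB is assumed, with no tiering,
    request charges, or data transfer fees.
    """
    return volume_tb * COST_PER_TB_YEAR[provider]


if __name__ == "__main__":
    # Illustrative volume only; substitute your own.
    for name in COST_PER_TB_YEAR:
        cost = annual_storage_cost(25, name)
        print(f"{name:<24} 25 TB for one year: ${cost:,.0f}")
```

The sketch covers storage only; compute, requests, and outbound transfer would be added on top.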
Hosting Considerations 
• What if you host big data on-premise?
  • Cost of managing hundreds of servers and expensive processing power
  • Costs can be hidden in the data center budget until too late
• What is your Big Data output?
  • Beyond about 25 TB of data, cloud hosting costs become significant
• Data transfer costs must be considered as well
  • Inbound is usually free
  • Outbound can be thousands of dollars per month
  • Direct connect or physically ship
• For audit purposes, data may need to be kept for up to 7 years
  • Factor this into your storage costs
• Location
  • Will regulations impact your ability to store or process data on machines in different countries?
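The retention and transfer points above can be turned into rough arithmetic. The following Python sketch is only an illustration under stated assumptions: data arrives at a constant rate, nothing is deleted during the retention period, and the per-GB outbound price is a placeholder you must replace with your provider's actual rate. Both function names are hypothetical.

```python
def retention_storage_cost(tb_added_per_year, cost_per_tb_year, years=7):
    """Total storage spend when every year's data is kept for the full period.

    Assumes that in year k (1-based) the store holds k * tb_added_per_year
    terabytes, billed at a flat cost_per_tb_year.
    """
    return sum(k * tb_added_per_year * cost_per_tb_year
               for k in range(1, years + 1))


def monthly_egress_cost(tb_out_per_month, price_per_gb):
    """Outbound transfer cost per month; price_per_gb is a placeholder rate."""
    return tb_out_per_month * 1024 * price_per_gb


if __name__ == "__main__":
    # 25 TB added per year, stored at the slide's Azure Blob price of $288/TB/year.
    print(f"7-year storage: ${retention_storage_cost(25, 288):,.0f}")
    # 10 TB out per month at an assumed (not quoted) $0.10 per GB.
    print(f"Monthly egress: ${monthly_egress_cost(10, 0.10):,.2f}")
```

Because stored volume accumulates, the seven-year total is 28 times the first year's bill (1 + 2 + … + 7), not 7 times.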
Big Data Considerations 
Databases 
• High speed analysis of transactional data 
• Multi-step computations 
• Interactive querying 
• Lots of updates (adds/deletes/mods) 
MapReduce / HDFS
• Low cost storage and compute
• High performance queries on large data
• Complex data, simple queries
• Simple scaling 
Note: Ideas in this slide are borrowed and adapted from “Running, Managing, and Adapting Hadoop at Sears,” by Andy McNalis, Senior Manager, Hadoop Infrastructure, Sears Holdings.
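To illustrate the "complex data, simple query" pattern on the MapReduce side, here is a single-process Python sketch of the map, shuffle, and reduce phases. It is not Hadoop code: the log format (a path followed by a status code) and all function names are assumptions made for the example, and a real cluster would distribute each phase across many machines.

```python
from collections import defaultdict


def map_phase(log_lines):
    """Emit (status_code, 1) per log line; malformed lines are skipped."""
    for line in log_lines:
        parts = line.split()
        if len(parts) >= 2:
            yield parts[1], 1


def shuffle(pairs):
    """Group emitted values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the values collected for each key."""
    return {key: sum(values) for key, values in groups.items()}


def count_by_status(log_lines):
    return reduce_phase(shuffle(map_phase(log_lines)))


if __name__ == "__main__":
    sample = ["/home 200", "/missing 404", "/cart 200", "garbage"]
    print(count_by_status(sample))
```

The query itself is a simple count; the scaling comes from running the map and reduce steps in parallel over partitions of the data.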
Cloud Considerations 
• Big Data needs Big Compute 
• Which cloud services will you choose? 
• Time, effort and skills will vary considerably 
• Microsoft Azure 
• Amazon EC2 
• Google Cloud Platform 
• Verizon Cloud 
• Rackspace 
http://online.wsj.com/articles/little-space-remains-for-rackspace-ahead-of-the-tape-1415557510
Big Data in the Cloud 
The “Traditional” Approach 
[Diagram labels: Data Platform; Commodity Storage; Traditional RDBMS; MPP; SQL; SSAS; SharePoint BI; Stream; Machine Learning; Sqoop; Hive ODBC; Client; Familiar BI Tools; Browser]
Big Data in the Cloud 
Premise-Cloud Hybrid Approach 
[Diagram labels: Data Platform; Commodity Storage; Traditional RDBMS; MPP; SQL; SSAS; SharePoint BI; Stream; Machine Learning; Sqoop; Hive ODBC; Client; Familiar BI Tools; Browser]
• ETL and pre-aggregate on-premise
• Analyze and visualize in the cloud
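As a rough illustration of the on-premise pre-aggregation step, the Python sketch below collapses raw sensor readings into one summary row per machine per day, so that only the summaries need to be sent to the cloud. The record layout (machine id, day, value) and the function name `pre_aggregate` are assumptions for this example, not part of the original deck.

```python
from collections import defaultdict


def pre_aggregate(readings):
    """Collapse raw (machine_id, day, value) readings into daily summaries.

    Returns one dict per (machine_id, day) with the reading count and mean,
    sorted by machine and day.
    """
    totals = defaultdict(lambda: [0, 0.0])  # key -> [count, running sum]
    for machine_id, day, value in readings:
        entry = totals[(machine_id, day)]
        entry[0] += 1
        entry[1] += value
    return [
        {"machine_id": m, "day": d, "count": n, "mean": s / n}
        for (m, d), (n, s) in sorted(totals.items())
    ]


if __name__ == "__main__":
    raw = [
        ("m1", "2014-11-01", 10.0),
        ("m1", "2014-11-01", 20.0),
        ("m2", "2014-11-01", 5.0),
    ]
    for row in pre_aggregate(raw):
        print(row)
```

The trade-off is that detail discarded on-premise is not available for later analysis in the cloud, so the summary level should follow from the business question.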
Enterprise Data Hub 
On-premise Hadoop Clusters
Data Warehouse Accelerator
Cloud Hosting
Cloud BI Reporting and Analytics
ROI Strategies 
Finding critical applications 
Cost of Labor
• Use lower skill, lower cost resources
• Avoid extra headcount
• Share experiences among plants
• Move experienced talent to higher value activity
Cost of Capital
• Use under-resourced equipment / assets more efficiently
• Make equipment last longer, run more efficiently
• Avoid more equipment purchases
Cost of Materials
• Use fewer raw materials
• Improve quality of raw materials sourced
• Improve delivery and inventory
Cost of Overheads
• Reduce transportation costs
• Reduce or optimize energy and resource costs
• Reduce management layers
Cost of Lost Opportunities
• Reduce time to market
• Improve product end-of-life
• Reduce downtime
• Reduce order to cash
Cost of Reputation
• Reduce product defects
• Anticipate customer reactions
• Tailor service and response profiles
More available: info@halobi.com
Warehouse Operations 
Machine sensor data for inventory and labor optimization 
$300K 
Cases per man hour 
Picking accuracy 
Drought Management for Growers 
Smarter water use 
$475K potential 
Water per output
Retail promotions 
Demand forecasting, sentiment analysis, and pricing 
$6.2M 
Sales per Square Foot 
Returns Rate
Summary 
• The value of investing in Big Data in the Cloud depends on your use case
• Cost is an issue: beyond about 25 TB, cloud hosting costs become significant
• Skills are an issue: steep learning curves
• Process is an issue: requires change in the way people think and operate
• Partners are an issue: do you want a large or niche provider?
• Database design is important

Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
 
Globus Compute Introduction - GlobusWorld 2024
Globus Compute Introduction - GlobusWorld 2024Globus Compute Introduction - GlobusWorld 2024
Globus Compute Introduction - GlobusWorld 2024
 
Vitthal Shirke Java Microservices Resume.pdf
Vitthal Shirke Java Microservices Resume.pdfVitthal Shirke Java Microservices Resume.pdf
Vitthal Shirke Java Microservices Resume.pdf
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
APIs for Browser Automation (MoT Meetup 2024)
APIs for Browser Automation (MoT Meetup 2024)APIs for Browser Automation (MoT Meetup 2024)
APIs for Browser Automation (MoT Meetup 2024)
 
BoxLang: Review our Visionary Licenses of 2024
BoxLang: Review our Visionary Licenses of 2024BoxLang: Review our Visionary Licenses of 2024
BoxLang: Review our Visionary Licenses of 2024
 
Navigating the Metaverse: A Journey into Virtual Evolution"
Navigating the Metaverse: A Journey into Virtual Evolution"Navigating the Metaverse: A Journey into Virtual Evolution"
Navigating the Metaverse: A Journey into Virtual Evolution"
 
Pro Unity Game Development with C-sharp Book
Pro Unity Game Development with C-sharp BookPro Unity Game Development with C-sharp Book
Pro Unity Game Development with C-sharp Book
 
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
 
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.ILBeyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
Beyond Event Sourcing - Embracing CRUD for Wix Platform - Java.IL
 

Conflict in the Cloud – Issues & Solutions for Big Data

  • 1. Conflict in the Cloud Big data and cloud computing Keith Peterson, CEO Halo BI ©2014 Halo Business Intelligence | All Rights Reserved
  • 2. Starting points • Information management and analytics issues hurting business objectives • Taking days and weeks to get to the data • Multiple copies of data around the organization • No shared view of the truth • ETL and data warehouse unable to handle loads • BI and reporting eating up capacity • Data volumes growing but budgets static • Desire to leverage new machine data sources
  • 3. The “Big Data” challenge (Executive View) …
  • 4. The “Big Data” challenge (Business View) … Big Data Is… • Ever escalating volumes • Expanding sources, such as Internet Of Things • Increasingly high velocities • With a widening variety of unstructured formats and semantic contexts. Big Data Is Not Really Useful unless insight can be gleaned through analytics...with a reasonable effort!
  • 5. Three Strategies to Deal with Big Data
      • 1. Ignore It: Don’t jump on the bandwagon. You have better things to focus on.
      • 2. Archive It: Just collect and store it. You can always analyze it when resources free up.
      • 3. Analyze It: With a clear business problem and ROI, invest in infrastructure to derive the insights needed.
  • 6. Big Data
      • Google: 1 Trillion Web Pages per Year
      • Facebook: 1 Million GB of Disk Storage
      • Yelp!: 100 GB of log data per day
      • Youtube.com: 20 Petabytes new video per year
      • Regional medical center (patient sensors): 25 TB
      • Mid-market retailer (POS): 10 TB
      • Mid-market manufacturer (machine sensors): 6 TB
      http://www.google.com/trends/explore#q=%2Fm%2F04y7lrx%2C%20Amazon%20Aws%2C%20Rackspace&cmpt=q
  • 7. Five Big Data Questions
      • 1. Left Behind? Everyone is doing it…you need to. Really? Or will cost exceed benefit?
      • 2. Cloud? Big Data requires Big Compute. Outsourcing risks: Loss of Control, Platform Reliability, Privacy, Security.
      • 3. Data? Which data and sources? Too much to handle. Some or all?
      • 4. Tools? All vendors have a Big Data suite. Which one?
      • 5. Usefulness? How to query the data. Skills needed. Machine Learning.
  • 8. Big Data and Cloud Computing • Commodity computing to execute distributed queries across multiple data sets • Rent commodity server instances to execute computation remotely • Cloud hosting for $10/TB/Mo
  • 9. Traditional BI Architecture: On Premise Or Cloud • Operational Data • Data Warehouse • Data Volumes = 100 GB – 5 TB • Manageable on-premise or in the cloud
  • 10. Add Big Data • Data volumes = 6 TB + image Big Data Logs Cluster
  • 11. Big Data in the Cloud: The “Traditional” Approach • Data Platform • Commodity Storage • Traditional RDBMS • Client • Familiar BI Tools MPP SQL SSAS SharePoint BI Stream Machine Learning Browser SQOOP HIVE ODBC • Use Amazon Redshift, Azure HDInsight or similar • Use Blob storage to persist big data • Spin up Compute clusters as needed • Keep Data in Cloud perpetually
  • 12. Big Data Storage Costs (as of Nov 2014)
      Provider / Type                Cost per TB per year   Cost per PB per year
      Amazon EBS SSD storage         $1,229                 $1,258,291
      Amazon EBS Magnetic Storage    $614                   $629,145
      Amazon S3 Storage              $411                   $420,372
      Azure Tables & Queues          $792                   $811,302
      Azure Blob Storage             $288                   $294,912
      Sources: http://calculator.s3.amazonaws.com/index.html http://azure.microsoft.com/en-us/pricing/calculator/
  • 13. Hosting Considerations
      • What if you host big data on-premise?
          • Cost of managing hundreds of servers, expensive processing power
          • Costs can be hidden in data center budget until too late
      • What is your Big Data output?
          • Beyond about 25 TB of data, cloud hosting costs become significant
      • Data transfer costs must be considered as well
          • Inbound is usually free
          • Outbound can be thousands of dollars per month
          • Direct connect or physically ship
      • For audit purposes, data may need to be kept for up to 7 years
          • Factor this into your storage costs
      • Location
          • Will regulations impact the ability to store or process on machines in different countries?
  • 14. Big Data Considerations
      • Databases: High speed analysis of transactional data; Multi-step computations; Interactive querying; Lots of updates (adds/deletes/mods)
      • MapReduce / HDFS: Low cost storage and compute; High performance queries on large data; Complex data, simple query; Simple scaling
      Note: Ideas in this slide are borrowed and adapted from “Running, Managing, and Adapting Hadoop at Sears,” by Andy McNalis, Senior Manager, Hadoop Infrastructure, Sears Holdings.
  • 15. Cloud Considerations • Big Data needs Big Compute • Which cloud services will you choose? • Time, effort and skills will vary considerably • Microsoft Azure • Amazon EC2 • Google Cloud Platform • Verizon Cloud • Rackspace http://online.wsj.com/articles/little-space-remains-for-rackspace-ahead-of-the-tape-1415557510
  • 16. Big Data in the Cloud: The “Traditional” Approach • Data Platform • Traditional RDBMS • Commodity Storage • Client • Familiar BI Tools MPP SQL SSAS SharePoint BI Stream Machine Learning Browser SQOOP HIVE ODBC
  • 17. Big Data in the Cloud: Premise-Cloud Hybrid Approach • Data Platform • Traditional RDBMS • Commodity Storage • Client • Familiar BI Tools MPP SQL SSAS SharePoint BI Stream Machine Learning Browser SQOOP HIVE ODBC • ETL and Pre-aggregate on-premise • Analyze and Visualize in Cloud
  • 18. Enterprise Data Hub • On-premise Hadoop Clusters • Data Warehouse Accelerator • Cloud Hosting • Cloud BI Reporting and Analytics
  • 19. ROI Strategies: Finding critical applications
      • Cost of Labor: Use lower skill-lower cost resources; Avoid extra headcount; Share experiences among plants; Move experienced talent to higher value activity
      • Cost of Capital: Use under-resourced equipment / assets more efficiently; Make equipment last longer, run more efficiently; Avoid more equipment purchases
      • Cost of Materials: Use fewer raw materials; Improve quality of raw materials sourced; Improve delivery and inventory
      • Cost of Overheads: Reduce transportation costs; Reduce or optimize energy and resource costs; Reduce management layers
      • Cost of Lost Opportunities: Reduce time to market; Improve product end-of-life; Reduce downtime; Reduce order to cash
      • Cost of Reputation: Reduce product defects; Anticipate customer reactions; Tailor service and response profiles
      More available: info@halobi.com
  • 20. Warehouse Operations: Machine sensor data for inventory and labor optimization • $300K • Cases per man hour • Picking accuracy
  • 21. Drought Management for Growers: Smarter water use • $475K potential • Water per output
  • 22. Retail promotions: Demand forecasting, sentiment analysis, and pricing • $6.2M • Sales per Square Foot • Returns Rate
  • 23. Summary • The value of investing in Big Data in the Cloud depends on your use case • Cost is an issue – 25 TB • Skills are an issue – steep learning curves • Process is an issue – requires change in the way people think and operate • Partners are an issue – do you want a large or niche provider? • Database design is important
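
The storage prices on slide 12 and the retention point on slide 13 can be combined into a rough cost estimate. The following is a minimal Python sketch, not part of the original deck: the function names are hypothetical, prices are the per-TB figures quoted on slide 12 (Nov 2014), and it assumes flat prices, no data growth, and no transfer or compute charges.

```python
# Cost per TB per year in USD, copied from slide 12 (as of Nov 2014).
COST_PER_TB_YEAR = {
    "Amazon EBS SSD": 1229,
    "Amazon EBS Magnetic": 614,
    "Amazon S3": 411,
    "Azure Tables & Queues": 792,
    "Azure Blob": 288,
}


def annual_storage_cost(volume_tb, provider):
    """Yearly storage cost in USD for a fixed data volume (hypothetical helper)."""
    return volume_tb * COST_PER_TB_YEAR[provider]


def retention_cost(volume_tb, provider, years=7):
    """Total cost of keeping a fixed volume for `years` years.

    Assumes flat prices and no data growth, which is a simplification.
    """
    return annual_storage_cost(volume_tb, provider) * years


if __name__ == "__main__":
    # 25 TB is the regional medical center example from slide 6.
    for provider in COST_PER_TB_YEAR:
        yearly = annual_storage_cost(25, provider)
        total = retention_cost(25, provider)
        print(f"{provider:<24} ${yearly:>8,} per year  ${total:>9,} over 7 years")
```

Under these assumptions, 25 TB on the cheapest listed option comes to $7,200 per year, so the 7-year audit retention alone multiplies the storage bill sevenfold before any outbound transfer costs are added.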

Editor's Notes

  1. Static Slide. Big Data is being collected in ever escalating volumes, from more and more sources such as the Internet of Things, at increasingly high velocities, with a widening variety of unstructured formats and semantic contexts. Big Data is not really useful unless insight can be gleaned through analytics!
  2. Static Slide. Big Data is being collected in ever escalating volumes, from more and more sources such as the Internet of Things, at increasingly high velocities, with a widening variety of unstructured formats and semantic contexts. Big Data is not really useful unless insight can be gleaned through analytics!
  3. Google: 1 trillion web pages per year. Facebook: 1 M GB of disk storage. YouTube: 20 petabytes of new video per year. That a user configures and controls, rather than on a local desktop. Cloud providers charge un $.10 per CPU hour for renting MIPS memory space.
  4. Google: 1 trillion web pages per year. Facebook: 1 M GB of disk storage. YouTube: 20 petabytes of new video per year. That a user configures and controls, rather than on a local desktop. Cloud providers charge un $.10 per CPU hour for renting MIPS memory space.
  5. Static Slide
  6. Static Slide
  7. The advantage of using non-relational DBs to handle both types of data. But unstructured data could be much harder to use long term. Hard choices about converting unstructured to structured. Initial DB designs won't support. Have to load, maintain and power hundreds of servers if not in the cloud. Processing power will be expensive. Because the cost is rolled into the data center budget, it comes as a surprise. Differences in technology mean the amount of time, effort and expertise will vary considerably.