© 2016 Datameer, Inc. All rights reserved.
John Morrell, Senior Director of Product Marketing
Sean Anderson, Senior Product Marketing Manager
Datameer 6 - The Modern BI Platform for Your Big
Data Journey
About the Speakers
John Morrell, Senior Director, Product Marketing
Sean Anderson, Senior Product Marketing Manager
© Cloudera, Inc. All rights reserved.
Spark will replace MapReduce
To become the standard execution engine for Hadoop
Data Processing
Leverage the right processing for your job
Data may require unique processing characteristics
▪ Batch
▪ Streaming
▪ Real-time
Hadoop arose to address one, and now the ecosystem is answering the rest.
▪ “We’re doubling down on Spark. We invested earliest, and we’ve invested most, in making Hadoop enterprise-grade.” (Doug Cutting)
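The difference between the batch and streaming styles named on this slide can be sketched in plain Python. This is only an illustration under stated assumptions: the `Event` record shape and both function names are hypothetical, and no Spark or Hadoop API is used. Batch processing reads the whole data set and returns one final answer; streaming updates its state per event and can report a result at any point.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple

Event = Tuple[str, int]  # (key, value): a hypothetical record shape


def batch_totals(events: Iterable[Event]) -> Dict[str, int]:
    """Batch: consume the complete data set, then return one final result."""
    totals: Dict[str, int] = defaultdict(int)
    for key, value in events:
        totals[key] += value
    return dict(totals)


def streaming_totals(events: Iterable[Event]) -> Iterator[Dict[str, int]]:
    """Streaming: update running state per event and emit a snapshot each time."""
    totals: Dict[str, int] = defaultdict(int)
    for key, value in events:
        totals[key] += value
        yield dict(totals)


events = [("web", 3), ("mobile", 1), ("web", 2)]
print(batch_totals(events))
for snapshot in streaming_totals(events):
    print(snapshot)
```

Both functions arrive at the same final totals; the streaming version simply makes intermediate results available before all the data has arrived.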
Powerful Data Processing - The Most Apache Spark
Experience
INTEGRATE: Structured (Sqoop), Unstructured (Kafka, Flume)
PROCESS, ANALYZE, SERVE: Batch (Spark, Hive, Pig, MapReduce), Stream (Spark), SQL (Impala), Search (Solr), SDK (Kite)
UNIFIED SERVICES: Resource Management (YARN), Security (Sentry, RecordService)
STORE: Filesystem (HDFS), Relational (Kudu), NoSQL (HBase)
Spark: In-memory data processing for developers
and data scientists
• Easy development
• Flexible, extensible API
• Fast batch and stream processing
Cloudera: Most experience with Spark on Hadoop
for instant success
• First to ship and support
• Most Spark users trained
• Most customers running Spark
• Most engineering resources (committers, contributors, support)
• Only vendor focused on enterprise Spark
Apache Spark
Cloudera was the first Hadoop vendor to ship and support Spark
• Spark is a fully integrated part of Cloudera’s platform
• Shared data, metadata, resource management, administration, security,
and governance
• Complements specialized analytic tools for comprehensive big data platform
• Cloudera is the first Hadoop vendor to offer Spark training
• Trained more customers than any other vendor
• Most popular training course
• Cloudera has 5x the engineering resources of the next competitor
• Most committers on staff and most changes contributed
• Well-trained staff across the globe with expertise implementing a broad range
of Spark use cases
Cloudera’s Engineering Commitment to Spark
The Spark Ecosystem & Hadoop
INTEGRATE: Structured (Sqoop), Unstructured (Kafka, Flume)
BATCH & STREAM: Spark (Spark Streaming, Spark SQL, DataFrames, MLlib, …)
SQL: Impala
SEARCH: Solr
SDK: Kite
UNIFIED SERVICES: Resource Management (YARN), Security (Sentry, RecordService)
STORE: Filesystem (HDFS), Relational (Kudu), NoSQL (HBase)
Uniting Spark and Hadoop - The One Platform Initiative Investment
Areas
• Management: Leverage Hadoop-native resource management.
• Security: Full support for Hadoop security and beyond.
• Scale: Enable 10k-node clusters.
• Streaming: Support for 80% of common stream processing workloads.
Community Initiative: Spark Supersedes
MapReduce
Key Cloudera Contributions
Integration with Hadoop Ecosystem
• Spark-on-YARN integration
• Dynamic Resource Allocation
• Kafka Integration
• HBase Integration
• Fixed operational issues at scale
Production-Ready Features / Ongoing Initiatives
• Security
• Kerberos Integration
• HDFS Sync (Sentry)
• Governance
• Cloudera Navigator integration (audit & lineage)
• Monitoring/Troubleshooting
• Improved debugging
• Zero Data Loss
• Spark Streaming Resilience
• Standard Execution Engine
• Hive on Spark
• Pig on Spark
• Crunch on Spark
• Solr indexing on Spark
Cloudera Customers
• More customers running Spark than all other vendors combined
• Over 170 customers
• Spark clusters as large as 800 nodes
• Diverse range of use cases across multiple industries
• Search personalization
• Genomics research
• Insurance modeling
• Advertising optimization
• Predictive modeling of disease conditions
Cloudera Enterprise, A New Way Forward
Key Cloudera Contributions
Getting Started is Easy
① Download or Deploy in the Cloud
② Sign up for Training
③ Contact us or a Partner to Start a POC
Datameer 6:
The Modern BI Platform for Your Big Data
Journey
Big Data Journey Challenges
• Meeting Demand (Productivity)
• Putting Your Insights to Work
• Answering New Questions
• Using More of Your Data
• Skill Gaps
Deep New-Age Questions
• What journey do customers take to purchase products?
• What actions do customers take before they churn?
• What attributes do customers with similar buying behavior have in common?
• Why do certain assets have a large impact on our overall risk?
• What series of events occurs before equipment fails?
• Where are my network bottlenecks and how do they impact service?
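A question such as "What actions do customers take before they churn?" is a path-analysis problem, and a small sketch shows its shape. The event tuples, action names, and the `actions_before_churn` function below are hypothetical and written in plain Python for illustration; they do not represent any Datameer function.

```python
from collections import Counter
from typing import Dict, List, Tuple

# (customer_id, timestamp, action): a hypothetical event log
Event = Tuple[str, int, str]


def actions_before_churn(events: List[Event], window: int = 2) -> Counter:
    """Count the sequences of up to `window` actions that directly precede a 'churn' event."""
    by_customer: Dict[str, List[Tuple[int, str]]] = {}
    for customer, timestamp, action in events:
        by_customer.setdefault(customer, []).append((timestamp, action))

    paths: Counter = Counter()
    for history in by_customer.values():
        history.sort()  # order each customer's events by timestamp
        actions = [action for _, action in history]
        if "churn" in actions:
            idx = actions.index("churn")
            paths[tuple(actions[max(0, idx - window):idx])] += 1
    return paths


events = [
    ("a", 1, "login"), ("a", 2, "support_call"), ("a", 3, "downgrade"), ("a", 4, "churn"),
    ("b", 1, "support_call"), ("b", 2, "downgrade"), ("b", 3, "churn"),
    ("c", 1, "login"), ("c", 2, "purchase"),
]
print(actions_before_churn(events).most_common(1))
```

In this made-up log, both churned customers made a support call and then downgraded, while the customer who did not churn contributes nothing to the count.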
More Data Sources
• Business Data: CRM, Sales, Financial, …
• Digital Interactions: Marketing, Call Center, Web, Social Media, …
• Machine Data: Devices, IT, Sensors, Security, Network, …
Datameer: Fastest Time to Insight
[Chart: axes labeled Time and Complexity; timeline marks at 4, 6, 8, 10, and 12 weeks and 18 months; cost levels $, $$, $$$; Use Cases #1 through #9 plotted for three approaches]
• Enterprise Data Warehouse: ETL, Data Warehouse, BI
• Custom Big Data: Flume, Hive, Pig, Sqoop, Raw Data, Hadoop
• Datameer: Integrate, Analyze, Visualize (no coding)
End-to-end Modern BI Platform
Integrate
• 70+ Connectors
• Wizard-led
• Unstructured data
• High Performance
Prepare/Analyze
• 270+ Functions
• Instant Profiling
• Familiar Spreadsheet UI
• Advanced analytics
• Smart Data Discovery
Visualize
• 30+ Widgets
• Infographics
• HTML 5
Operationalize
• Security
• Governance
• Process integration
Agile Self-Service Analytics without Chaos
Spreadsheet, Collaborative, Governance & Control, Drag-n-Drop
Advanced Analytics & Smart Data Discovery
Clustering, Decision Trees, Dependencies, Recommendations
Time Series Analytics, Graph & Path Analytics, Text Analytics
Augment Existing Analytics
Visualization & Exploration
Traditional BI
Enterprise Ready
• Integrate, Don’t Replace
• Intelligent Execution Framework
• Flexible Deployment Options
What’s New in Datameer 6?
Make Big Data Simple for Everyone
Speed-to-Insight
Ease-of-Use
Re-imagined User Experience & Workflow
Spark
Enable Citizen Data Scientists
Abstract Complexity
Smart Execution with Spark
Differences in Spark Implementations
BI on Spark
• Limited to structured view of data
• Like using Hive or Impala in Hadoop
Spark Cluster
• Fully programmatic approach
• Need specialized skills $$
Embedded Spark
• May force you to code at points
• No future-proof story
• Limited execution frameworks
Smart Execution Spark
• Eliminates technical complexities
• Don’t need specialized skills
• Optimizes execution/performance
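The idea of a layer that "optimizes execution/performance" so users need no specialized skills can be pictured as a dispatcher that picks an engine per job. The sketch below is purely illustrative: the `Job` fields, the size thresholds, and the engine names are assumptions made up for this example, and it does not describe how Datameer's Smart Execution actually decides.

```python
from dataclasses import dataclass

GB = 1024 ** 3

# Hypothetical threshold, chosen only for illustration.
SMALL_JOB_LIMIT = 1 * GB


@dataclass(frozen=True)
class Job:
    input_bytes: int            # size of the data the job will read
    cluster_memory_bytes: int   # memory available across the cluster


def choose_engine(job: Job) -> str:
    """Return the name of the engine a toy dispatcher would use for this job."""
    if job.input_bytes < SMALL_JOB_LIMIT:
        return "local"        # small data: skip cluster start-up overhead
    if job.input_bytes <= job.cluster_memory_bytes:
        return "spark"        # fits in cluster memory: process in memory
    return "mapreduce"        # larger than memory: fall back to disk-based batch


for job in (Job(200 * 1024 ** 2, 64 * GB), Job(20 * GB, 64 * GB), Job(500 * GB, 64 * GB)):
    print(job.input_bytes, choose_engine(job))
```

The point of the sketch is the design choice rather than the rule itself: the analyst submits a job, and the selection among execution frameworks happens behind the interface.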
New Backend Benefits
• Future-proof
• Fastest processing, every time
• Concentrate on analytics, not backend
• Abstract Complexity
Why Do UI and Workflow Matter?
“That is the way to learn the most,
that when you are doing something
with such enjoyment that you don’t
notice the time passes.”
The Evolution of Analytic Workflow
Integrate, Transform, Analyze, Visualize
• 1st Generation: Multiple Steps
• 2nd Generation: Self-Service
• 3rd Generation: Iterative/Fluid
Datameer 6 User Experience
• Parallel Workflow
• Immediate insight into downstream effects
• Fresh, Modern UI/UX
• Significant Time Savings
• Increased Productivity
Demonstration
Learn More
http://www.datameer.com/product/datameer-6/
Thank You!

Datameer6 for prospects - june 2016_v2

  • 1. © 2016 Datameer, Inc. All rights reserved. John Morrell, Senior Director of Product Marketing Sean Anderson, Senior Product Marketing Manager Datameer 6 - The Modern BI Platform for Your Big Data Journey © 2016 Datameer, Inc. All rights reserved.
  • 2. © 2016 Datameer, Inc. All rights reserved. About the Speakers John Morrell Senior Director Product Marketing Sean Anderson Senior Product Marketing Manager
  • 3. © Cloudera, Inc. All rights reserved. 3 Spark will replace MapReduce To become the standard execution engine for Hadoop
  • 4. © 2016 Datameer, Inc. All rights reserved. Data Processing Data may require unique processing characteristics ▪ Batch ▪ Streaming ▪ Real-time Hadoop arose to address one and now the ecosystem is answering the rest. ▪ “We’re doubling down on Spark. We invested earliest, and we’ve invested most, in making Hadoop enterprise-grade” Doug Cutting Data Processing Leverage the right processing for your job
  • 5. © 2016 Datameer, Inc. All rights reserved. Powerful Data Processing - The Most Apache Spark Experience 5 STRUCTURED Sqoop UNSTRUCTURED Kafka, Flume PROCESS, ANALYZE, SERVE UNIFIED SERVICES RESOURCE MANAGEMENT YARN SECURITY Sentry, RecordService FILESYSTEM HDFS RELATIONAL Kudu NoSQL HBase STORE INTEGRATE BATCH Spark, Hive, Pig MapReduce STREAM Spark SQL Impala SEARCH Solr SDK Kite Spark: In-memory data processing for developers and data scientists • Easy development • Flexible, extensible API • Fast batch and stream processing Cloudera: Most experience with Spark on Hadoop for instant success • First to ship and support • Most Spark users trained • Most customers running Spark • Most engineering resources (committers, contributors, support) • Only vendor focused on enterprise Spark
  • 6. © 2016 Datameer, Inc. All rights reserved. Apache Spark Cloudera was the first Hadoop vendor to ship and support Spark • Spark is a fully integrated part of Cloudera’s platform • Shared data, metadata, resource management, administration, security, and governance • Complements specialized analytic tools for comprehensive big data platform • Cloudera is the first Hadoop vendor to offer Spark training • Trained more customers than any other vendor • Most popular training course • Cloudera has 5x the engineering resources of the next competitor • Most committers on staff and most changes contributed • Well-trained staff across the globe with expertise implementing a broad range of Spark use cases
  • 7. © 2016 Datameer, Inc. All rights reserved. Cloudera’s Engineering Commitment to Spark 7
  • 8. © 2016 Datameer, Inc. All rights reserved. The Spark Ecosystem & Hadoop 8 STRUCTURED Sqoop UNSTRUCTURED Kafka, Flume UNIFIED SERVICES RESOURCE MANAGEMENT YARN SECURITY Sentry, RecordService FILESYSTEM HDFS RELATIONAL Kudu NoSQL HBase STORE INTEGRATE SQL Impala SEARCH Solr SDK Kite BATCH & STREAM Spark Spark Streaming Spark SQL DataFrames MLlib …
  • 9. © 2016 Datameer, Inc. All rights reserved. Uniting Spark and Hadoop - The One Platform Initiative Investment Areas 9 Management Leverage Hadoop-native resource management. Security Full support for Hadoop security and beyond. Scale Enable 10k-node clusters. Streaming Support for 80% of common stream processing workloads.
  • 10. © 2016 Datameer, Inc. All rights reserved. Community Initiative: Spark Supersedes MapReduce 10
  • 11. © 2016 Datameer, Inc. All rights reserved. Key Cloudera Contributions 11 • Spark-on-YARN integration • Dynamic Resource Allocation • Kafka Integration • HBase Integration • Fixed operational issues at scale Integration with Hadoop Ecosystem Production-Ready Features Ongoing Initiatives • Security • Kerberos Integration • HDFS Sync (Sentry) • Governance • Cloudera Navigator integration (audit & lineage) • Monitoring/Troubleshootin g • Improved debugging • Zero Data Loss • Spark Streaming Resilience • Standard Execution Engine • Hive on Spark • Pig on Spark • Crunch on Spark • Solr indexing on Spark
  • 12. Cloudera Customers • More customers running Spark than all other vendors combined • Over 170 customers • Spark clusters as large as 800 nodes • Diverse range of use cases across multiple industries • Search personalization • Genomics research • Insurance modeling • Advertising optimization • Predictive modeling of disease conditions
  • 13. Cloudera Enterprise, A New Way Forward
  • 14. Key Cloudera Contributions Getting Started is Easy ① Download or Deploy in the Cloud ② Sign up for Training ③ Contact us or a Partner to Start a POC
  • 15. Datameer 6: The Modern BI Platform for Your Big Data Journey
  • 16. Big Data Journey Challenges • Meeting Demand (Productivity) • Putting Your Insights to Work • Answering New Questions • Using More of Your Data • Skill Gaps
  • 17. Deep New-Age Questions • What journey do customers take to purchase products? • What actions do customers take before they churn? • What attributes do customers with similar buying behavior have in common? • Why do certain assets have a large impact on our overall risk? • What series of events occurs before equipment fails? • Where are my network bottlenecks and how do they impact service?
  • 18. More Data Sources Business Data Digital Interactions Machine Data Marketing Call Center Web Social Media Devices IT Sensors Security Network CRM Sales Financial … … …
  • 19. Datameer: Fastest Time to Insight [chart: time vs. complexity across use cases #1 to #9, comparing Enterprise Data Warehouse (ETL, Data Warehouse, BI), Custom Big Data (Flume, Hive, Pig, Sqoop), and Datameer (Integrate, Analyze, Visualize; no coding); other labels: Raw Data, Hadoop; time marks at 4, 6, 8, 10, and 12 weeks and 18 months; cost marks $, $$, $$$]
  • 20. End-to-end Modern BI Platform Integrate • 70+ Connectors • Wizard-led • Unstructured data • High Performance Prepare/Analyze • 270+ Functions • Instant Profiling • Familiar Spreadsheet UI • Advanced analytics • Smart Data Discovery Visualize • 30+ Widgets • Infographics • HTML 5 Operationalize • Security • Governance • Process integration
  • 21. Agile Self-Service Analytics without Chaos • Spreadsheet • Collaborative • Governance & Control • Drag-n-Drop
  • 22. Advanced Analytics & Smart Data Discovery • Clustering • Decision Trees • Dependencies • Recommendations • Time Series Analytics • Graph & Path Analytics • Text Analytics
  • 23. Augment Existing Analytics • Visualization & Exploration • Traditional BI
  • 24. Enterprise Ready • Integrate, Don’t Replace • Intelligent Execution Framework • Flexible Deployment Options
  • 25. What’s New in Datameer 6? Make Big Data Simple for Everyone • Speed-to-Insight • Ease-of-Use • Re-imagined User Experience & Workflow • Spark • Enable Citizen Data Scientists • Abstract Complexity
  • 26. Smart Execution with Spark
  • 27. Differences in Spark Implementations BI on Spark Spark Cluster Embedded Spark Smart Execution Spark • Limited to structured view of data • Like using Hive or Impala in Hadoop • Fully programmatic approach • Need specialized skills $$ • May force you to code at points • No future-proof story • Limited execution frameworks • Eliminates technical complexities • Don’t need specialized skills • Optimizes execution/performance
  • 28. New Backend Benefits • Future-proof • Fastest processing, every time • Concentrate on analytics, not backend • Abstract Complexity
  • 29. Why Do UI and Workflow Matter? “That is the way to learn the most, that when you are doing something with such enjoyment that you don’t notice the time passes.”
  • 30. The Evolution of Analytic Workflow Integrate • Transform • Analyze • Visualize • 1st Generation: Multiple Steps • 2nd Generation: Self-Service • 3rd Generation: Iterative/Fluid
  • 31. Datameer 6 User Experience • Parallel Workflow • Immediate insight into downstream effects • Fresh, Modern UI/UX • Significant Time Savings • Increased Productivity
  • 32. Demonstration
  • 33. Learn More http://www.datameer.com/product/datameer-6/
  • 34. Thank You!

Editor's Notes

  1. Spark is well suited to iterative workloads such as ML model training and is fast becoming good at general-purpose computational workloads, with more integrations with frameworks such as HBase and Solr on the way. MapReduce is suited to I/O-intensive workloads that require a high level of fault tolerance and scale. Spark is gradually taking over MapReduce workloads as it matures.
  2. Ease of Use
  3. Uninterrupted analytics “flow”
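
Note 1 contrasts iterative, in-memory workloads with I/O-heavy passes over the data. The sketch below is a rough illustration of that difference in plain Python; it does not use Spark or MapReduce, and every name in it (`make_loader`, `fit_mean`, the sample data) is hypothetical. It counts how often an iterative job has to load its input when it re-reads on every pass versus when it loads once and keeps the data in memory.

```python
from typing import Callable, Dict, List, Tuple


def make_loader(data: List[float]) -> Tuple[Callable[[], List[float]], Dict[str, int]]:
    """Return a loader plus a counter of how many times it was called.

    The loader stands in for an expensive read from disk or HDFS.
    """
    calls = {"count": 0}

    def load() -> List[float]:
        calls["count"] += 1
        return list(data)

    return load, calls


def fit_mean(get_data: Callable[[], List[float]],
             iterations: int = 10, lr: float = 0.5) -> float:
    """Gradient descent on squared error; the estimate moves toward the mean."""
    estimate = 0.0
    for _ in range(iterations):
        points = get_data()  # each pass asks for the full dataset
        gradient = sum(estimate - x for x in points) / len(points)
        estimate -= lr * gradient
    return estimate


data = [1.0, 2.0, 3.0, 4.0]

# Re-read style: every iteration loads its input again.
load, reread_calls = make_loader(data)
reread_result = fit_mean(load)

# In-memory style: load once, then iterate over the cached copy.
load, cached_calls = make_loader(data)
cached = load()
cached_result = fit_mean(lambda: cached)

print(reread_calls["count"], cached_calls["count"])
```

Both runs reach the same estimate; only the number of loads differs, which is the cost an in-memory engine avoids for iterative jobs.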