1© Cloudera, Inc. All rights reserved.
The Journey to
Pervasive Analytics
2© Cloudera, Inc. All rights reserved.
We thought we knew the
value of data.
3© Cloudera, Inc. All rights reserved.
Limited Reports Bad Recommendations
4© Cloudera, Inc. All rights reserved.
Pervasive analytics has changed
the way we live.
5© Cloudera, Inc. All rights reserved.
Why pervasive analytics
now?
6© Cloudera, Inc. All rights reserved.
Data is fueling
this opportunity
Clickstream Social Sensor Audio &
Video
7© Cloudera, Inc. All rights reserved.
Access to
diverse analysis
techniques
SQL Time Series Text Graph
8© Cloudera, Inc. All rights reserved.
People require
analytics
“80% of CEOs cite data mining
and analytics as strategically
important.”
-2015 PWC CEO Survey
9© Cloudera, Inc. All rights reserved.
This WILL effect major
economic change
10© Cloudera, Inc. All rights reserved.
f(Money, People * Technology)
11© Cloudera, Inc. All rights reserved.
Pervasive Analytics will Spark a Revolution
Industrial Revolution Data Revolution
12© Cloudera, Inc. All rights reserved.
Data Drives Business
Sales Operations
Product
Marketing
Customer Satisfaction
Increase conversions by 2% Convert 5% more leads Reduce fraud by 3%
Reduce churn by 1% Increase user adoption by 10%
13© Cloudera, Inc. All rights reserved.
Marketing Drives Revenue Growth
Increase Conversions by 2%
                  Traditional   Data Driven
Inquiries         1000          1000
Conversion Rate   10%           12%
Revenue           $100          $120 (+$20)
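The arithmetic behind this slide can be checked with a short sketch. It assumes each conversion is worth $1, which is what the slide's figures imply ($100 from 1000 inquiries at a 10% conversion rate); the function name `revenue` and the `value_per_conversion` parameter are illustrative, not from the deck.

```python
def revenue(inquiries: int, conversion_rate: float,
            value_per_conversion: float = 1.0) -> float:
    """Revenue = inquiries * conversion rate * value of one conversion.

    value_per_conversion defaults to $1, an assumption inferred from
    the slide's numbers rather than stated on it.
    """
    return inquiries * conversion_rate * value_per_conversion


# Figures taken from the slide: same inquiry volume, different rates.
traditional = revenue(1000, 0.10)
data_driven = revenue(1000, 0.12)
uplift = data_driven - traditional

print(f"traditional: ${traditional:.0f}")
print(f"data driven: ${data_driven:.0f}")
print(f"uplift:      ${uplift:.0f}")
```

Note that the "2%" on the slide is two percentage points of conversion rate (10% to 12%), which is a 20% relative increase in revenue at constant inquiry volume.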
14© Cloudera, Inc. All rights reserved.
Data Drives Industries
Financial Services Public Sector
Healthcare
Telecommunications
Retail
Optimize network performance Money laundering detection Cyber security detection
Product recommendations Personalized medicine
15© 2014 Cloudera, Inc. All rights reserved.
Healthcare Drives Personalized Medicine
                  Traditional   Data Driven
Patients          500           500
Survival Rate     80%           85%
Survivors         400           425 (+25)
Increase Survival Rate by 5%
16© Cloudera, Inc. All rights reserved.
Our customers already know.
17© Cloudera, Inc. All rights reserved.
The journey
is not easy
18© Cloudera, Inc. All rights reserved.
The right
team
is needed.
Product
Technical Data
19© Cloudera, Inc. All rights reserved.
Are you prepared?
Extend Innovate Empower Technical Data Product
1) Diverse Data Ingest
2) Scalable Processing & Storage
3) Secure Environment
20© Cloudera, Inc. All rights reserved.
Are you prepared?
Extend Innovate Empower Technical Data Product
1) Diverse Analytic Techniques
2) Historic and Diverse Data Access
3) Short Time to Value
21© Cloudera, Inc. All rights reserved.
Are you prepared?
Extend Innovate Empower Technical Data Product
1) Little Application Latency
2) Analytic Consistency
3) Flexible Analytic Deployment
22© Cloudera, Inc. All rights reserved.
The right
platform
is needed.
Product
Technical Data
EDH
23© Cloudera, Inc. All rights reserved.
An Enterprise Data Hub, powered by Apache™ Hadoop
Hadoop delivers:
• One place for unlimited data
• Unified, multi-framework data access
Cloudera delivers:
• Enterprise Security
• Data Governance
• Complete Management
• And more…
Security and Administration
Unlimited Storage
Process Discover Model Serve
Deployment
Flexibility
On-Premises
Appliances
Engineered Systems
Public Cloud
Private Cloud
Hybrid Cloud
24© Cloudera, Inc. All rights reserved.
Community
The right
community
is needed.
Product
Technical Data
EDH
25© Cloudera, Inc. All rights reserved.
The Apache™ Community
Timeline of projects added (each year's stack includes all earlier projects):
2006: Core Hadoop (HDFS, MR)
2008: HBase, ZooKeeper
2009: Hive, Pig, Mahout
2010: Sqoop, Whirr, Avro
2011: Flume, Bigtop, Oozie, MRUnit, HCatalog; Core Hadoop + YARN
2012: Spark, Impala, Solr, Kafka
Present: Parquet, Sentry
Hadoop isn’t just
Hadoop anymore.
26© Cloudera, Inc. All rights reserved.
The Cloudera Community
Source: Apache JIRA
January 2012 – October 2014
56%
Other
Training Expertise Enterprise Expertise Hadoop Expertise
Security,
Compliance,
Cloud 50,000
27© Cloudera, Inc. All rights reserved.
The Big Data Community
Data
Systems
Enterprise Data Hub
Security and Administration
Unlimited Storage
Process Discover Model Serve
Applications
System Integration
Infrastructure
More than 1,450
partners
Operational
Tools
28© Cloudera, Inc. All rights reserved.
…and you.
CMO
Enterprise Architect
BI Manager
Product
ETL Developer
Sales
29© Cloudera, Inc. All rights reserved.
It’s time to kickstart your
journey to pervasive analytics.
Data Discovery & Analytics:
Better analytics
Operational Analytics:
Smarter decisions
Operational Data Store:
More data in less time
Thank you.

Keynote: The Journey to Pervasive Analytics

Editor's Notes

  • #3 Key Takeaway: Analytics are accelerating the pace of learning. But as they accelerate the pace of learning and continue to be applied to new use cases, we need to make sure we get the right analytics to the right people, and that is not always the case. People don’t always have access to the right information, if any at all.
  • #4 Key Takeaway: What does analytics look like today? Whether you are looking at the business or consumer space, analytics are supplying us value today. But this value is still limited and not always what we want. What happens when that report is not quite right? Or what happens to your product recommendations when your daughter makes a purchase?
  • #5 Key Takeaway: Analytics are becoming ingrained in everything we do. They inform the companies and products that we use every day. Sometimes customers are the users of the analytics; other times the company uses them to offer a better service to that customer. Tell a story piecing all of the use cases on the slides together (don’t name customers): a box with hardware in it (NetApp: predictive support), electricity from light (Opower: energy usage analytics), the right ad served to you on the computer (OpenX ad exchange), detecting malware for your business (CounterTack security platform). A new technology is emerging that combines data science expertise with deep understanding of business problems. These solutions use algorithmic data mining on your own data and often on external third-party data accessible by cloud ecosystems and APIs. Data-driven solutions make predictions about business functions, prescribe what to do next, and in many cases take action autonomously. Trained analysts are not required to query databases. Instead, business users get answers directly from the software. These answers typically feed seamlessly into the flow of business activity, often invisibly. Source: http://www.forbes.com/sites/ciocentral/2014/04/18/8-ways-to-build-and-use-the-new-breed-of-data-driven-applications/
  • #7 Key takeaway: We can measure and act on everything now. 16B connected devices. Only those that can harness this data can take advantage of it. “If you can’t measure it, you can’t fix it.” –DJ Patil Source: http://www.forbes.com/sites/gilpress/2014/08/22/internet-of-things-by-the-numbers-market-estimates-and-forecasts/
  • #8 Key Takeaway: We can analyze anything now. Numerical, text, audio, video. We are now able to discover insights in complex data. Leveraging text analytics, rich media analytics, graph analytics, time series, etc. All of these analytics allow us to get a complete understanding of any data problem we are trying to solve. And they are no longer limited by data. This allows us to enter new use cases and expand our understanding of the problems at hand. Analytics continues to drive more value. Early analytic returns 13X per $1 spent
  • #9 Key Takeaway: Source: 18th Annual Global CEO survey
  • #10 Key Takeaway: Through implementing countless analytics solutions across a variety of industries, we have learned some secrets and have acquired some skills. Let’s examine what is needed in order to reach a pervasive analytics end state.
  • #11 Key Takeaways: The model for growth is simple. It is a function of money, people, and technology. The only purpose of technology is to make people more efficient. In the context of pervasive analytics there are two ways to think of this: can people build, deploy, and manage analytics more efficiently, and will a given analytic make the person using it more efficient? If we can do both, then serious economic change is upon us. In order to create a revolution we must direct our attention to how we can make our labor more efficient. How can analytics help make our employees more efficient? We aren’t talking about data. Solow growth model for macroeconomic growth: Y = f(K, L*E), where K = capital, L = labor, and E = labor efficiency (technology).
  • #12 Key Takeaways: In order to create a revolution we must direct our attention to how we can make people more efficient. And the final answer isn’t having them sift through more data. We need to provide them the right information. Industrial revolution = manufacturing technology improved allowing for massive economic growth. Green revolution = Produce more per acre with less labor allowing people to leave the fields and add to the GDP. Data revolution = repetitive decision making will be automated. This time savings will allow humans to tackle unforeseen problems. When this happens, humans won’t sit idle, they are too curious.
  • #13 Key Takeaways: Employees are already asking the right questions; we just need to help them achieve their goals through the use of data.
  • #15 Key Takeaways: Industries are already beginning to transform. How can data help transform the way employees and customers interact with these industries?
  • #17 Key Takeaways: Our customers are already thinking this way: how do we offer a better product or service that differentiates us through the use of data? Tell a story piecing all of the customers on the slides together: a box with hardware in it (NetApp: predictive support), electricity from light (Opower: energy usage analytics), the right ad served to you on the computer (OpenX ad exchange), detecting malware for your business (CounterTack security platform).
  • #18 Key Takeaway: Through implementing countless analytics solutions across a variety of industries, we have learned some secrets and have acquired some skills. Let’s examine what is needed in order to reach a pervasive analytics end state.
  • #19 Key Takeaway: No one person can roll out an analytic solution to end consumers. Three groups are needed: IT, analysts, and users. What does each group care about? Technical (IT or engineering) cares about flexibility, scalability and robustness, and that it just works. Data (analysts or data scientists) cares about rapid experimentation, model development, and building the right metrics. Product (employees or customers) cares about solution vision, impact, and moving success metrics.
  • #20 Key takeaway: But there are obvious challenges that the team must address in order to effectively build analytic applications for the average business user: ingesting diverse data; securely storing more data; and processing data efficiently for operational and analytical use so there is no latency. Can your system handle today’s and tomorrow’s use cases?
  • #21 Key takeaway: What data challenges is the data team thinking about? Having access to the different analytic techniques they need; they need more than just SQL access. Historic and diverse data access for smarter analytics (large sample sets to test against, better predictions, and a fuller view through complementary data). Can they iterate fast? What is their model development time?
  • #22 Key takeaway: What challenges are users facing? Latency in analytics being fed to users can cause users to bounce off of the applications. Analytics need to be consistent. Bring the analytics to the user.
  • #23 Key Takeaway: Keeping in mind the needs and the challenges of the team, what is needed to help the team as they continue to bring analytics to the masses? An enterprise data hub,
  • #24 In response, many organizations have turned to a new architecture – an enterprise data hub – to complement and extend existing investments. An enterprise data hub can store unlimited data, cost-effectively and reliably, for as long as you need, and lets users access that data in a variety of ways. Data can be collected, stored, processed, explored, modeled, and served in one unified platform. It’s connected to the systems you already rely on.
    Cloudera’s enterprise data hub, powered by Apache Hadoop, the popular open source distributed data platform, is differentiated in several crucial areas. We provide:
      • Leading query performance.
      • The enterprise management and governance that you require of all of your mission-critical infrastructure.
      • Comprehensive, transparent, compliance-ready security at the core.
      • An open source platform that is also built of open standards – projects that are supported by multiple vendors to ensure sustainability, portability, and compatibility.
    Our platform runs in your choice of environment, whether on-premises or in the cloud.
    Cheat Sheet version: Our enterprise data hub is:
      • One place for unlimited data
      • Accessible to anyone
      • Connected to the systems you already depend on
      • Secure, governed, managed & compliant
      • Built on open source and open standards
      • Deployed however you want
      • Coupled with the support and enablement you need to succeed.
    Important Note: Our EDH emphasizes “unified analytics” over “unified data”: It’s not practical or probable that customers will actually unify all their data. Much of it lives in the cloud or on storage (e.g. Isilon), in remote datacenters, is of uncertain value vs. cost of moving it to a hub, or security mandates preclude collocation. We enable customers to gather unlimited data, while bringing diverse processing and analytics to that data.
  • #25 Key Takeaway: So we talked about the right team and the right platform, but no one organization can do it on their own, they need a supporting community to help them along the way.
  • #26 Key Takeaway: First and foremost we have the Apache Hadoop community. This ever growing community continues to add projects in order to more efficiently protect
  • #27 Key Takeaway: Expertise,
  • #28 Cloudera partners more broadly and deeply across the Hadoop ecosystem than any other vendor. With over 1200 partners and counting, our partnerships offer:
      • Compatibility with your existing tools and skills: 160+ certified on Cloudera 5, including all 12 of the 12 Gartner Business Intelligence Magic Quadrant leaders
      • Flexible deployment options: on-premises; public, private, or hybrid cloud; appliances and engineered systems
      • Partnerships you can trust: deep engineering relationships; comprehensive certification program
  • #29 Key Takeaway: Everyone is a data innovator. It is not just IT, it is everyone in the business. Identify the solutions you want to build and understand the challenges that you need to overcome. You need to start asking the right questions… you are in control.
  • #30 Key Takeaway: Depending on your organization’s current challenges, you will start the journey at a different place. Having trouble ingesting, storing, and processing data? Or having trouble creating a flexible analytic environment? Or do you want to serve analytics out to more end users?
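
The note for slide 11 cites the Solow growth model, Y = f(K, L*E), without giving a form for f. A minimal sketch follows, assuming the common Cobb-Douglas form Y = K^alpha * (L*E)^(1 - alpha); that functional form, the value alpha = 0.3, and the input numbers are illustrative assumptions, not taken from the deck.

```python
def output(capital: float, labor: float, efficiency: float,
           alpha: float = 0.3) -> float:
    """Y = f(K, L*E), using an assumed Cobb-Douglas form.

    capital    -- K, the stock of capital ("money" on the slide)
    labor      -- L, the amount of labor ("people")
    efficiency -- E, labor efficiency ("technology")
    alpha      -- capital's share of output; 0.3 is a hypothetical value
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return capital ** alpha * (labor * efficiency) ** (1.0 - alpha)


# Hold capital and labor fixed; raise labor efficiency by 10%.
baseline = output(capital=100.0, labor=100.0, efficiency=1.0)
improved = output(capital=100.0, labor=100.0, efficiency=1.1)
relative_gain = improved / baseline - 1.0

print(f"relative output gain: {relative_gain:.2%}")
```

Under this assumed form, output rises with labor efficiency even when money and headcount stay constant, which is the point the note makes about technology existing to make people more efficient.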