1© Cloudera, Inc. All rights reserved.
Building a Modern Analytic Database with Cloudera 5.8
Justin Erickson | Sr Director of Product | Cloudera
Andy Frey | CIO | Marketing Associates
Agenda
• Building a Modern Analytic Database with Hadoop
• Key Use Cases Enabled
• What’s New with Cloudera 5.8
• Marketing Associates Customer Case Study
• What’s Next?
Common Application Patterns
Spectrum: Operational Efficiency to New Business Value
Platform layers: Operations; Data Management; Unified Services; Process, Analyze, Serve; Store; Integrate
• Data Engineering & Science: Process data, develop & serve predictive models
• Analytic Database: ELT, reporting, exploratory business intelligence
• Operational Database: Build data-driven applications to deliver real-time insights
Analytic Database
• Self-Service BI & Data: More data of all types is being tapped for analytics, across environments
• Real-Time Analysis: Open up new possibilities for real-time insights as data changes
• Converged Workloads: BI & analytics are critical but only tell part of the story. Get more value by sharing data across workloads
Key Use Cases
• EDW Optimization: Use your EDW more efficiently by offloading workloads to Hadoop
• Data Preparation: Fast, flexible ETL over large data volumes, so data is always ready for your business
• Self-Service BI & Exploration: Fastest time-to-insights with a modern analytic database designed with Hadoop’s flexibility and agility
Cloudera’s Analytic Database Solution
Platform layers: Operations; Data Management; Unified Services; Process, Analyze, Serve; Store; Integrate
• Navigator Optimizer: Identify, offload, & optimize workloads to Hadoop
• Hue: Intelligent SQL editor
• Navigator: Audit, lineage, encryption, key management, & policy lifecycles
• BI Partners: Integration with the leading BI tools
• Impala: Interactive query engine for BI & SQL analytics
• Hive-on-Spark: Large-scale ETL & batch processing engine
ETL & Data Preparation
• Flexible & Scalable
  • Process larger data volumes, of any type
• Fastest Data Processing
  • Distributed processing and best-of-breed technologies for the fastest performance
• Minimize Data Movement
  • Prepared data immediately available for analytics with shared storage and metadata
Self-Service BI & Exploratory Analytics
• Self-Service Data Agility
  • No rigid data modeling encumbrances for agile acquisition
  • Iteratively analyze and flexibly model
• Self-Service Exploratory Analytics
  • Interactive responses for iterative exploration
  • Confidently handle all BI and SQL users
• Cost-Effective Scalability with Users/Data
  • Easily add nodes to handle more data and users
  • Leverage the full potential of available data
• Productively Use Existing Tools and Skills
  • Integration with all leading BI tools & compatible analytic SQL language
  • Metadata and lineage for easy data discovery
  • Intelligent SQL editor for greater developer productivity
Optimize the Enterprise Data Warehouse
• Decrease Storage Costs
  • Focus on high-value reporting data in the EDW
• Keep More/All Data Online
  • Unlimited scale keeps data accessible and out of archive
• Improve Performance
  • Eliminate contention and meet SLAs for routine reporting
• Get New Insights
  • Enable ad hoc and exploratory analytics
Chart: Siemens’ TCO Assessment (cost/TB)
What’s New in Cloudera 5.8
Advancements with Cloudera 5.8
Impala
• Cloud-Native:
  • Read/write directly from Amazon S3
• Performance:
  • >10x faster performance on secure clusters
Hue
• Data Discovery:
  • Preview, tag, search, pin tables in browser
• Query Design Assistance:
  • Autocomplete of tables, columns, syntax
  • Efficient troubleshooting
• Collaboration & Sharing:
  • Save & share queries with peers
  • Set permissions directly on results
Navigator Optimizer
• Now GA!
• Ease offloading path to Hadoop
• Active Data Optimization to enable peak performance for Hive and Impala
Self-Service Data Discovery & BI
at Marketing Associates
Andy Frey
About Me – Andy Frey
From Assembler to Ajax, Modem to Mobile, and Mainframe to Cloud, Andy Frey developed his deep knowledge as a technologist and CIO at leading national corporations such as GAB Robins, Compuware, J. Walter Thompson, Coolfire, and now Marketing Associates, providing Fortune 100 corporations with technologically advanced enterprise solutions.
Introducing Magnify and Marketing Associates
• Magnify Analytic Solutions, a wholly owned division of Detroit, Michigan-based Marketing Associates serving primarily Fortune 100 clients, uses technology-driven data analysis to offer clients a range of informed business services that increase profitability through its four lines of service: business intelligence, digital intelligence, credit risk management, and marketing analytics.
• Established in 1967, Marketing Associates is a full-service, technology-enabled marketing services company headquartered in Detroit, Michigan, with offices in Wilmington, Delaware, and Charlotte, North Carolina. MA offers private and public cloud hosting, custom web development, and data transformation among its IT-based services.
Offering Cloudera Hadoop IaaS and experienced Data Scientists
Different Challenges for Different Clients
The B2C Challenge
• Previously used expensive RDBMS systems to deliver B2C marketing contests and product giveaways, up to 150 in a year.
• Huge spikes in web event data posed challenges: 200,000 hits in the first minute for popular brands’ campaigns.
• The cost of licensing for the biggest spike made projects unprofitable.
• Also needed to monitor and manipulate massive amounts of data in real time. The RDBMS could not respond adequately during the massive data intake of a campaign run. “When has a campaign reached its limit? Has the total supply of product been allocated?”
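The spike and the two allocation questions on this slide can be sketched in a few lines of Python. This is an illustration only, not Marketing Associates' implementation: the `Campaign` class, its fields, and the supply value are hypothetical. Only the figure of 200,000 hits in the first minute comes from the slide.

```python
from dataclasses import dataclass

FIRST_MINUTE_HITS = 200_000  # figure quoted on the slide


def average_hits_per_second(hits: int, seconds: int = 60) -> float:
    """Average request rate implied by `hits` arriving over `seconds`."""
    return hits / seconds


@dataclass
class Campaign:
    """Hypothetical giveaway campaign with a fixed product supply."""

    total_supply: int
    allocated: int = 0

    def try_allocate(self) -> bool:
        """Allocate one unit if any remain; report whether it succeeded."""
        if self.allocated >= self.total_supply:
            return False
        self.allocated += 1
        return True

    @property
    def limit_reached(self) -> bool:
        """True once the total supply of product has been allocated."""
        return self.allocated >= self.total_supply


if __name__ == "__main__":
    rate = average_hits_per_second(FIRST_MINUTE_HITS)
    print(f"Average load in the first minute: {rate:.0f} hits/s")

    campaign = Campaign(total_supply=3)  # made-up supply for the demo
    outcomes = [campaign.try_allocate() for _ in range(5)]
    print(outcomes, campaign.limit_reached)
```

In a real deployment the counter would live in the data platform rather than in process memory; the sketch only shows the check the slide's questions imply.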
Different Challenges for Different Clients
The CRM Challenge
• Another project for a large client involved managing a repository of customer data from multiple sources. The magnitude was vast, the data was multi-structured, and new sources were being added on a regular basis.
• Initially executed using 4 relational databases; query times slowed and costs soared.
• Difficulty merging unstructured data from multiple sources using a traditional RDBMS.
• Deployment of a prominent SQL RDBMS was estimated at a cost of $5 million (approx. 150 terabytes).
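As a rough worked example, the estimate above can be turned into a cost per terabyte. The function name is hypothetical, and the calculation assumes the $5 million figure covers the full 150 terabytes, which the slide implies but does not state outright.

```python
RDBMS_ESTIMATE_USD = 5_000_000  # estimate quoted on the slide
DATA_VOLUME_TB = 150            # approximate volume quoted on the slide


def cost_per_terabyte(total_cost_usd: float, terabytes: float) -> float:
    """Return the average cost per terabyte for a given total cost."""
    if terabytes <= 0:
        raise ValueError("terabytes must be positive")
    return total_cost_usd / terabytes


if __name__ == "__main__":
    per_tb = cost_per_terabyte(RDBMS_ESTIMATE_USD, DATA_VOLUME_TB)
    print(f"Estimated RDBMS cost: ${per_tb:,.0f} per TB")
```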
Evaluation & Decision
Key criteria for modern analytic database:
• Handle huge spikes in web event data.
• Manage and manipulate massive data volumes in real time.
• Scalability and performance.
• Ability to transfer skills from the current SQL-based programming team.
• Reduce costs.
Considered various offerings:
• Considered SQL Server (discarded due to cost).
• Knew Hadoop could be the solution and started looking at commercial implementations.
• Considered non-commercial & short-listed two Hadoop vendors: Cloudera & Hortonworks.
• Determined non-commercial too risky, too burdensome – left it to the experts.
Decision
• Launched June 2014
• Why Hadoop: Cost, Tech Requirements, Data Size
• Why Cloudera: Most mature solution with better overall enterprise toolset; Cloudera Team
Solution
• Hadoop Platform:
  • Cloudera Enterprise
• Hadoop Components:
  • Apache Flume, Apache Sqoop, Apache Hive, MapReduce, Apache Impala (incubating), Hue, Cloudera Manager
• Third-Party BI & Analytic Tools:
  • D3.js, SAS, Tableau, R, Angoss
• Security Tools:
  • Kerberos, Apache Sentry, Cloudera Navigator
Solution: Self-Service Data Discovery & BI
• Self-service data discovery capabilities eliminate the need to distribute multiple Excel reports, instead allowing our clients to interact directly with Hadoop.
• Security is enhanced because Excel reports no longer need to be distributed via email.
• Using Tableau to run Impala queries produces real-time reporting, a significant value add and convenience for our clients.
• Offers scalability and flexibility to accommodate diverse and growing client demands.
• Allows us to scale our web event product giveaways.
• Accommodates the addition of new data sources.
• Easily add nodes to avoid potential performance bottlenecks.
Why We Chose Cloudera
• Cloudera Manager became a major differentiator.
  • Made cluster management easy
  • User friendly = reduced learning curve
• Chose Impala for its real-time query performance.
• Proven Cloudera innovation and zeal to maintain an enterprise-class solution by offering new tools and functions while maintaining/supporting the Apache project.
• Cloudera appeared to be the prominent choice for large Hadoop installs in the Fortune 500. Best IT analyst rating.
• Impressed with Cloudera team before purchase.
Benefits & Impact
Why we are glad we chose Cloudera
• All-inclusive Cloudera Enterprise costs less than the required relational database licenses alone: over 90% cost reduction.
  • Other benefits: cheaper hardware and easier to manage.
• Cloudera Navigator provides a single interface to locate and classify data, audit who is accessing what data, and protect the data with centralized key management.
  • Critical tool when handling PII and other sensitive data. Comprehensive audit trail allows for easy monitoring of PII data access.
  • Allows us to satisfy strict security compliance regulations with ease.
• Cloudera Professional Services are knowledgeable, responsive, and help establish best practices for our internal development team. They helped us get it right the second time. Any time we had a crisis, they were there to help.
Lessons Learned
• First used non-Cloudera consulting: big mistake; the design was incorrect for the data collected. Work with Cloudera Professional Services to design it right the first time.
• Start small, if you can, and grow the solution.
  • Don’t need a big capital investment upfront
  • Get value out of a small cluster (e.g., 3 nodes) and expand as needed.
• Install services to meet your current needs. Install additional services as your data needs change.
• Look at all Cloudera solutions, learn them, and use them.
• Training: Be generous; conduct it in phases to keep new skills relevant as you build and deploy.
• What’s next?
  • Prebuilt analytical models as a platform.
  • Evaluate Navigator Optimizer to improve query performance and identify best candidates for legacy application migration
What’s Next for Cloudera’s Analytic Database?
Analytic Database Roadmap
Faster, richer, more expressive SQL
• Hive-on-Spark GA
• Insert, update, delete via Kudu
• Performance improvements
• Nested JSON
Improved multitenancy
• Fewer OOM errors
• Graceful node decommission
• Admission control enhancements
• Improved YARN integration
Better SQL workbench
• Higher Hue concurrency
• SQL editor usability improvements
• Intelligent recommendations of tables, joins & more for Hue users
• Exposing tags & lineage through the Hue query experience
Deeper integration with BI tools
• Joint workload optimizations
• Support for nested types and s
• Data discovery functionality injected into the BI experience
Workload optimization
• Multi-platform workload profiling
• Recommendation of in-line materialized views
Confidential – Do not Redistribute
Next Steps
• Download Cloudera 5.8
  • cloudera.com/downloads
• Release Notes
  • cloudera.com/documentation/enterprise/release-notes/topics/rg_release_notes.html
• Learn more about Navigator Optimizer and BI in the Cloud
  • Register for Parts 2 & 3 of the Webinar Series!
  • cloudera.com/about-cloudera/events/webinars/5-8-webinar-series.html
Questions?

More Related Content

What's hot

Driving Better Products with Customer Intelligence

Driving Better Products with Customer Intelligence
Driving Better Products with Customer Intelligence

Driving Better Products with Customer Intelligence

Cloudera, Inc.
 
Turning Petabytes of Data into Profit with Hadoop for the World’s Biggest Ret...
Turning Petabytes of Data into Profit with Hadoop for the World’s Biggest Ret...Turning Petabytes of Data into Profit with Hadoop for the World’s Biggest Ret...
Turning Petabytes of Data into Profit with Hadoop for the World’s Biggest Ret...
Cloudera, Inc.
 
The Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data HubThe Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data HubCloudera, Inc.
 
Intuitive Real-Time Analytics with Search
Intuitive Real-Time Analytics with SearchIntuitive Real-Time Analytics with Search
Intuitive Real-Time Analytics with Search
Cloudera, Inc.
 
Moving Beyond Lambda Architectures with Apache Kudu
Moving Beyond Lambda Architectures with Apache KuduMoving Beyond Lambda Architectures with Apache Kudu
Moving Beyond Lambda Architectures with Apache Kudu
Cloudera, Inc.
 
Multidisziplinäre Analyseanwendungen auf einer gemeinsamen Datenplattform ers...
Multidisziplinäre Analyseanwendungen auf einer gemeinsamen Datenplattform ers...Multidisziplinäre Analyseanwendungen auf einer gemeinsamen Datenplattform ers...
Multidisziplinäre Analyseanwendungen auf einer gemeinsamen Datenplattform ers...
Cloudera, Inc.
 
RecordService for Unified Access Control
RecordService for Unified Access ControlRecordService for Unified Access Control
RecordService for Unified Access Control
Cloudera, Inc.
 
Relying on Data for Strategic Decision-Making--Financial Services Experience
Relying on Data for Strategic Decision-Making--Financial Services ExperienceRelying on Data for Strategic Decision-Making--Financial Services Experience
Relying on Data for Strategic Decision-Making--Financial Services Experience
Cloudera, Inc.
 
Turning Data into Business Value with a Modern Data Platform
Turning Data into Business Value with a Modern Data PlatformTurning Data into Business Value with a Modern Data Platform
Turning Data into Business Value with a Modern Data Platform
Cloudera, Inc.
 
End to End Streaming Architectures
End to End Streaming ArchitecturesEnd to End Streaming Architectures
End to End Streaming Architectures
Cloudera, Inc.
 
Enterprise Data Hub: The Next Big Thing in Big Data
Enterprise Data Hub: The Next Big Thing in Big DataEnterprise Data Hub: The Next Big Thing in Big Data
Enterprise Data Hub: The Next Big Thing in Big Data
Cloudera, Inc.
 
Put Alternative Data to Use in Capital Markets

Put Alternative Data to Use in Capital Markets
Put Alternative Data to Use in Capital Markets

Put Alternative Data to Use in Capital Markets

Cloudera, Inc.
 
New Performance Benchmarks: Apache Impala (incubating) Leads Traditional Anal...
New Performance Benchmarks: Apache Impala (incubating) Leads Traditional Anal...New Performance Benchmarks: Apache Impala (incubating) Leads Traditional Anal...
New Performance Benchmarks: Apache Impala (incubating) Leads Traditional Anal...
Cloudera, Inc.
 
Advanced Analytics for Investment Firms and Machine Learning
Advanced Analytics for Investment Firms and Machine LearningAdvanced Analytics for Investment Firms and Machine Learning
Advanced Analytics for Investment Firms and Machine Learning
Cloudera, Inc.
 
Secure Data - Why Encryption and Access Control are Game Changers
Secure Data - Why Encryption and Access Control are Game ChangersSecure Data - Why Encryption and Access Control are Game Changers
Secure Data - Why Encryption and Access Control are Game Changers
Cloudera, Inc.
 
Consolidate your data marts for fast, flexible analytics 5.24.18
Consolidate your data marts for fast, flexible analytics 5.24.18Consolidate your data marts for fast, flexible analytics 5.24.18
Consolidate your data marts for fast, flexible analytics 5.24.18
Cloudera, Inc.
 
The Big Picture: Learned Behaviors in Churn
The Big Picture: Learned Behaviors in ChurnThe Big Picture: Learned Behaviors in Churn
The Big Picture: Learned Behaviors in Churn
Cloudera, Inc.
 
Standing Up an Effective Enterprise Data Hub -- Technology and Beyond
Standing Up an Effective Enterprise Data Hub -- Technology and BeyondStanding Up an Effective Enterprise Data Hub -- Technology and Beyond
Standing Up an Effective Enterprise Data Hub -- Technology and Beyond
Cloudera, Inc.
 
Evolution from Apache Hadoop to the Enterprise Data Hub by Cloudera - ArabNet...
Evolution from Apache Hadoop to the Enterprise Data Hub by Cloudera - ArabNet...Evolution from Apache Hadoop to the Enterprise Data Hub by Cloudera - ArabNet...
Evolution from Apache Hadoop to the Enterprise Data Hub by Cloudera - ArabNet...
ArabNet ME
 
Engaging with Cloudera & Morning Wrap Up
Engaging with Cloudera & Morning Wrap UpEngaging with Cloudera & Morning Wrap Up
Engaging with Cloudera & Morning Wrap Up
Cloudera, Inc.
 

What's hot (20)

Driving Better Products with Customer Intelligence

Driving Better Products with Customer Intelligence
Driving Better Products with Customer Intelligence

Driving Better Products with Customer Intelligence

 
Turning Petabytes of Data into Profit with Hadoop for the World’s Biggest Ret...
Turning Petabytes of Data into Profit with Hadoop for the World’s Biggest Ret...Turning Petabytes of Data into Profit with Hadoop for the World’s Biggest Ret...
Turning Petabytes of Data into Profit with Hadoop for the World’s Biggest Ret...
 
The Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data HubThe Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data Hub
 
Intuitive Real-Time Analytics with Search
Intuitive Real-Time Analytics with SearchIntuitive Real-Time Analytics with Search
Intuitive Real-Time Analytics with Search
 
Moving Beyond Lambda Architectures with Apache Kudu
Moving Beyond Lambda Architectures with Apache KuduMoving Beyond Lambda Architectures with Apache Kudu
Moving Beyond Lambda Architectures with Apache Kudu
 
Multidisziplinäre Analyseanwendungen auf einer gemeinsamen Datenplattform ers...
Multidisziplinäre Analyseanwendungen auf einer gemeinsamen Datenplattform ers...Multidisziplinäre Analyseanwendungen auf einer gemeinsamen Datenplattform ers...
Multidisziplinäre Analyseanwendungen auf einer gemeinsamen Datenplattform ers...
 
RecordService for Unified Access Control
RecordService for Unified Access ControlRecordService for Unified Access Control
RecordService for Unified Access Control
 
Relying on Data for Strategic Decision-Making--Financial Services Experience
Relying on Data for Strategic Decision-Making--Financial Services ExperienceRelying on Data for Strategic Decision-Making--Financial Services Experience
Relying on Data for Strategic Decision-Making--Financial Services Experience
 
Turning Data into Business Value with a Modern Data Platform
Turning Data into Business Value with a Modern Data PlatformTurning Data into Business Value with a Modern Data Platform
Turning Data into Business Value with a Modern Data Platform
 
End to End Streaming Architectures
End to End Streaming ArchitecturesEnd to End Streaming Architectures
End to End Streaming Architectures
 
Enterprise Data Hub: The Next Big Thing in Big Data
Enterprise Data Hub: The Next Big Thing in Big DataEnterprise Data Hub: The Next Big Thing in Big Data
Enterprise Data Hub: The Next Big Thing in Big Data
 
Put Alternative Data to Use in Capital Markets

Put Alternative Data to Use in Capital Markets
Put Alternative Data to Use in Capital Markets

Put Alternative Data to Use in Capital Markets

 
New Performance Benchmarks: Apache Impala (incubating) Leads Traditional Anal...
New Performance Benchmarks: Apache Impala (incubating) Leads Traditional Anal...New Performance Benchmarks: Apache Impala (incubating) Leads Traditional Anal...
New Performance Benchmarks: Apache Impala (incubating) Leads Traditional Anal...
 
Advanced Analytics for Investment Firms and Machine Learning
Advanced Analytics for Investment Firms and Machine LearningAdvanced Analytics for Investment Firms and Machine Learning
Advanced Analytics for Investment Firms and Machine Learning
 
Secure Data - Why Encryption and Access Control are Game Changers
Secure Data - Why Encryption and Access Control are Game ChangersSecure Data - Why Encryption and Access Control are Game Changers
Secure Data - Why Encryption and Access Control are Game Changers
 
Consolidate your data marts for fast, flexible analytics 5.24.18
Consolidate your data marts for fast, flexible analytics 5.24.18Consolidate your data marts for fast, flexible analytics 5.24.18
Consolidate your data marts for fast, flexible analytics 5.24.18
 
The Big Picture: Learned Behaviors in Churn
The Big Picture: Learned Behaviors in ChurnThe Big Picture: Learned Behaviors in Churn
The Big Picture: Learned Behaviors in Churn
 
Standing Up an Effective Enterprise Data Hub -- Technology and Beyond
Standing Up an Effective Enterprise Data Hub -- Technology and BeyondStanding Up an Effective Enterprise Data Hub -- Technology and Beyond
Standing Up an Effective Enterprise Data Hub -- Technology and Beyond
 
Evolution from Apache Hadoop to the Enterprise Data Hub by Cloudera - ArabNet...
Evolution from Apache Hadoop to the Enterprise Data Hub by Cloudera - ArabNet...Evolution from Apache Hadoop to the Enterprise Data Hub by Cloudera - ArabNet...
Evolution from Apache Hadoop to the Enterprise Data Hub by Cloudera - ArabNet...
 
Engaging with Cloudera & Morning Wrap Up
Engaging with Cloudera & Morning Wrap UpEngaging with Cloudera & Morning Wrap Up
Engaging with Cloudera & Morning Wrap Up
 

Similar to Building a Modern Analytic Database with Cloudera 5.8

Gab Genai Cloudera - Going Beyond Traditional Analytic
Gab Genai Cloudera - Going Beyond Traditional Analytic Gab Genai Cloudera - Going Beyond Traditional Analytic
Gab Genai Cloudera - Going Beyond Traditional Analytic
IntelAPAC
 
MongoDB IoT City Tour STUTTGART: Hadoop and future data management. By, Cloudera
MongoDB IoT City Tour STUTTGART: Hadoop and future data management. By, ClouderaMongoDB IoT City Tour STUTTGART: Hadoop and future data management. By, Cloudera
MongoDB IoT City Tour STUTTGART: Hadoop and future data management. By, Cloudera
MongoDB
 
CSC - Presentation at Hortonworks Booth - Strata 2014
CSC - Presentation at Hortonworks Booth - Strata 2014CSC - Presentation at Hortonworks Booth - Strata 2014
CSC - Presentation at Hortonworks Booth - Strata 2014
Hortonworks
 
Hadoop and Your Enterprise Data Warehouse
Hadoop and Your Enterprise Data WarehouseHadoop and Your Enterprise Data Warehouse
Hadoop and Your Enterprise Data Warehouse
Edgar Alejandro Villegas
 
Making Self-Service BI a Reality in the Enterprise
Making Self-Service BI a Reality in the EnterpriseMaking Self-Service BI a Reality in the Enterprise
Making Self-Service BI a Reality in the Enterprise
Cloudera, Inc.
 
Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
 Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ... Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
Cloudera, Inc.
 
Increase your ROI with Hadoop in Six Months - Presented by Dell, Cloudera and...
Increase your ROI with Hadoop in Six Months - Presented by Dell, Cloudera and...Increase your ROI with Hadoop in Six Months - Presented by Dell, Cloudera and...
Increase your ROI with Hadoop in Six Months - Presented by Dell, Cloudera and...
Cloudera, Inc.
 
Assessing New Database Capabilities – Multi-Model
Assessing New Database Capabilities – Multi-ModelAssessing New Database Capabilities – Multi-Model
Assessing New Database Capabilities – Multi-Model
DATAVERSITY
 
Using Big Data to Transform Your Customer’s Experience - Part 1

Using Big Data to Transform Your Customer’s Experience - Part 1
Using Big Data to Transform Your Customer’s Experience - Part 1

Using Big Data to Transform Your Customer’s Experience - Part 1

Cloudera, Inc.
 
Accelerating Data Warehouse Modernization
Accelerating Data Warehouse ModernizationAccelerating Data Warehouse Modernization
Accelerating Data Warehouse Modernization
DataWorks Summit/Hadoop Summit
 
Hadoop in 2015: Keys to Achieving Operational Excellence for the Real-Time En...
Hadoop in 2015: Keys to Achieving Operational Excellence for the Real-Time En...Hadoop in 2015: Keys to Achieving Operational Excellence for the Real-Time En...
Hadoop in 2015: Keys to Achieving Operational Excellence for the Real-Time En...
MapR Technologies
 
Big Data Made Easy: A Simple, Scalable Solution for Getting Started with Hadoop
Big Data Made Easy:  A Simple, Scalable Solution for Getting Started with HadoopBig Data Made Easy:  A Simple, Scalable Solution for Getting Started with Hadoop
Big Data Made Easy: A Simple, Scalable Solution for Getting Started with Hadoop
Precisely
 
Big Data
Big DataBig Data
Big Data
Charter Global
 
IBM Relay 2015: Open for Data
IBM Relay 2015: Open for Data IBM Relay 2015: Open for Data
IBM Relay 2015: Open for Data
IBM
 
Data Warehouse Optimization
Data Warehouse OptimizationData Warehouse Optimization
Data Warehouse OptimizationCloudera, Inc.
 
Complement Your Existing Data Warehouse with Big Data & Hadoop
Complement Your Existing Data Warehouse with Big Data & HadoopComplement Your Existing Data Warehouse with Big Data & Hadoop
Complement Your Existing Data Warehouse with Big Data & Hadoop
Datameer
 
151116 Sedania Cloudera BDA Profile
151116 Sedania Cloudera BDA Profile151116 Sedania Cloudera BDA Profile
151116 Sedania Cloudera BDA ProfileZarul Zaabah
 
Introducing Cloudera Navigator Optimizer: Offload Assessments and Active Data...
Introducing Cloudera Navigator Optimizer: Offload Assessments and Active Data...Introducing Cloudera Navigator Optimizer: Offload Assessments and Active Data...
Introducing Cloudera Navigator Optimizer: Offload Assessments and Active Data...
Cloudera, Inc.
 
The Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data HubThe Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data Hub
Cloudera, Inc.
 
Come fare business con i big data in concreto
Come fare business con i big data in concretoCome fare business con i big data in concreto
Come fare business con i big data in concreto
HP Enterprise Italia
 

Similar to Building a Modern Analytic Database with Cloudera 5.8 (20)

Gab Genai Cloudera - Going Beyond Traditional Analytic
Gab Genai Cloudera - Going Beyond Traditional Analytic Gab Genai Cloudera - Going Beyond Traditional Analytic
Gab Genai Cloudera - Going Beyond Traditional Analytic
 
MongoDB IoT City Tour STUTTGART: Hadoop and future data management. By, Cloudera
MongoDB IoT City Tour STUTTGART: Hadoop and future data management. By, ClouderaMongoDB IoT City Tour STUTTGART: Hadoop and future data management. By, Cloudera
MongoDB IoT City Tour STUTTGART: Hadoop and future data management. By, Cloudera
 
CSC - Presentation at Hortonworks Booth - Strata 2014
CSC - Presentation at Hortonworks Booth - Strata 2014CSC - Presentation at Hortonworks Booth - Strata 2014
CSC - Presentation at Hortonworks Booth - Strata 2014
 
Hadoop and Your Enterprise Data Warehouse
Hadoop and Your Enterprise Data WarehouseHadoop and Your Enterprise Data Warehouse
Hadoop and Your Enterprise Data Warehouse
 
Making Self-Service BI a Reality in the Enterprise
Making Self-Service BI a Reality in the EnterpriseMaking Self-Service BI a Reality in the Enterprise
Making Self-Service BI a Reality in the Enterprise
 
Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
 Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ... Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
 
Increase your ROI with Hadoop in Six Months - Presented by Dell, Cloudera and...
Increase your ROI with Hadoop in Six Months - Presented by Dell, Cloudera and...Increase your ROI with Hadoop in Six Months - Presented by Dell, Cloudera and...
Increase your ROI with Hadoop in Six Months - Presented by Dell, Cloudera and...
 
Assessing New Database Capabilities – Multi-Model
Assessing New Database Capabilities – Multi-ModelAssessing New Database Capabilities – Multi-Model
Building a Modern Analytic Database with Cloudera 5.8

  • 1. 1© Cloudera, Inc. All rights reserved. Building a Modern Analytic Database with Cloudera 5.8 Justin Erickson | Sr Director of Product | Cloudera Andy Frey | CIO | Marketing Associates
  • 2. Agenda • Building a Modern Analytic Database with Hadoop • Key Use Cases Enabled • What’s New with Cloudera 5.8 • Marketing Associates Customer Case Study • What’s Next?
  • 3. Common Application Patterns • Operational Efficiency • New Business Value • Diagram labels: Operations; Data Management; Unified Services; Process, Analyze, Serve; Store; Integrate • Data Engineering & Science: Process data, develop & serve predictive models • Analytic Database: ELT, reporting, exploratory business intelligence • Operational Database: Build data-driven applications to deliver real-time insights.
  • 4. Analytic Database • Self-Service BI & Data: More data of all types is being tapped for analytics, across environments • Real-Time Analysis: Open up new possibilities for real-time insights as data changes • Converged Workloads: BI & analytics are critical but only tell part of the story. Get more value by sharing data across workloads
  • 5. Key Use Cases • EDW Optimization: Use your EDW more efficiently by offloading workloads to Hadoop • Data Preparation: Fast, flexible ETL over large data volumes, so data is always ready for your business • Self-Service BI & Exploration: Fastest time-to-insights with a modern analytic database designed with Hadoop’s flexibility and agility
  • 6. Cloudera’s Analytic Database Solution • Diagram labels: Operations; Data Management; Unified Services; Process, Analyze, Serve; Store; Integrate • Navigator Optimizer: Identify, offload, & optimize workloads to Hadoop • Hue: Intelligent SQL editor • Navigator: Audit, lineage, encryption, key management, & policy lifecycles • BI Partners: Integration with the leading BI tools • Impala: Interactive query engine for BI & SQL analytics • Hive-on-Spark: Large-scale ETL & batch processing engine
  • 7. ETL & Data Preparation • Flexible & Scalable • Process larger data volumes, of any type • Fastest Data Processing • Distributed processing and best-of-breed technologies for the fastest performance • Minimize Data Movement • Prepared data immediately available for analytics with shared storage and metadata
  • 8. Self-Service BI & Exploratory Analytics • Self-Service Data Agility • No rigid data modeling encumbrances for agile acquisition • Iteratively analyze and flexibly model • Self-Service Exploratory Analytics • Interactive responses for iterative exploration • Confidently handle all BI and SQL users • Cost-Effective Scalability with Users/Data • Easily add nodes to handle more data and users • Leverage the full potential of available data • Productively Use Existing Tools and Skills • Integration with all leading BI tools & compatible analytic SQL language • Metadata and lineage for easy data discovery • Intelligent SQL editor for greater developer productivity
  • 9. Optimize the Enterprise Data Warehouse • Decrease Storage Costs • Focus on high-value reporting data in the EDW • Keep More/All Data Online • Unlimited scale keeps data accessible and out of archive • Improve Performance • Eliminate contention and meet SLAs for routine reporting • Get New Insights • Enable ad hoc and exploratory analytics • [Figure: Siemens’ TCO Assessment (cost/TB)]
  • 10. What’s New in Cloudera 5.8
  • 11. Advancements with Cloudera 5.8 • Impala: Cloud-Native (read/write directly from Amazon S3); Performance (>10x faster performance on secure clusters) • Hue: Data Discovery (preview, tag, search, pin tables in browser); Query Design Assistance (autocomplete of tables, columns, syntax; efficient troubleshooting); Collaboration & Sharing (save & share queries with peers; set permissions directly on results) • Navigator Optimizer: Now GA!; Ease offloading path to Hadoop; Active Data Optimization to enable peak performance for Hive and Impala
  • 12. Self-Service Data Discovery & BI at Marketing Associates • Andy Frey
  • 13. About Me – Andy Frey • From Assembler to Ajax, Modem to Mobile, and Mainframe to Cloud, Andy Frey developed his deep knowledge as a technologist and CIO at leading national corporations such as GAB Robins, Compuware, J. Walter Thompson, Coolfire, and now Marketing Associates, providing Fortune 100 corporations with technologically advanced enterprise solutions.
  • 14. Introducing Magnify and Marketing Associates • Magnify Analytic Solutions (a wholly owned division of Detroit, Michigan-based Marketing Associates, serving primarily Fortune 100 clients) uses technology-driven data analysis to offer clients a range of informed business services that increase profitability through its four lines of service: business intelligence, digital intelligence, credit risk management, and marketing analytics. • Established in 1967, Marketing Associates is a full-service, technology-enabled marketing services company headquartered in Detroit, Michigan, with offices in Wilmington, Delaware, and Charlotte, North Carolina. MA offers private and public cloud hosting, custom web development, and data transformation among its IT-based services. • Offering Cloudera Hadoop IaaS and experienced Data Scientists
  • 15. Different Challenges for Different Clients: The B2C Challenge • Previously used expensive RDBMS systems to deliver B2C marketing contests and product giveaways, up to 150 in a year. • Huge spikes in web event data posed challenges: 200,000 hits in the first minute for popular brands’ campaigns. • The cost to license for the biggest spike made projects unprofitable. • Also needed to monitor and manipulate massive amounts of data in real time. The RDBMS could not respond adequately during massive data intake while a campaign was running. “When has a campaign reached its limit? Has total supply of product been allocated?”
  • 16. Different Challenges for Different Clients: The CRM Challenge • Another project for a large client involved managing a repository of customer data from multiple sources. The magnitude was vast, the data was multi-structured, and new sources were being added on a regular basis. • Initially executed using 4 relational databases; query times slowed and costs soared. • Difficulty merging unstructured data from multiple sources using a traditional RDBMS. • Deployment of a prominent SQL RDBMS was estimated at $5 million (approx. 150 terabytes).
  • 19. Evaluation & Decision • Key criteria for modern analytic database: • Handle huge spikes in web event data. • Manage and manipulate massive data volumes in real time. • Scalability and performance. • Ability to transfer skills from the current SQL-based programming team. • Reduce costs. • Considered various offerings: • Considered SQL Server (discarded due to cost). • Knew Hadoop could be the solution and started looking at commercial implementations. • Considered non-commercial & short-listed two Hadoop vendors: Cloudera & Hortonworks. • Determined non-commercial too risky, too burdensome – left it to the experts. • Launched June 2014 • Why Hadoop: Cost, Tech Requirements, Data Size • Why Cloudera: Most mature solution with better overall enterprise toolset; Cloudera Team • Decision
  • 20. Solution • Hadoop Platform: • Cloudera Enterprise • Hadoop Components: • Apache Flume, Apache Sqoop, Apache Hive, MapReduce, Apache Impala (incubating), Hue, Cloudera Manager • Third-Party BI & Analytic Tools: • D3.js, SAS, Tableau, R, Angoss • Security Tools: • Kerberos, Apache Sentry, Cloudera Navigator
  • 21. Solution: Self-Service Data Discovery & BI • Self-service data discovery capabilities eliminate the need to distribute multiple Excel reports, instead allowing our clients to interact directly with Hadoop. • Security is enhanced because Excel reports no longer need to be distributed via email. • Using Tableau to run Impala queries produces real-time reporting, a significant value add and convenience for our clients. • Offers scalability and flexibility to accommodate diverse and growing client demands. • Allows us to scale our web event product giveaways. • Accommodates the addition of new data sources. • Easily add nodes to avoid potential performance bottlenecks.
  • 22. Why We Chose Cloudera • Cloudera Manager became a major differentiator. • Made cluster management easy • User friendly = reduced learning curve • Chose Impala for its real-time query performance. • Proven Cloudera innovation and zeal to maintain an enterprise-class solution by offering new tools and functions while maintaining/supporting the Apache project. • Cloudera appeared to be the prominent choice for large Hadoop installs in the Fortune 500. Best IT analyst rating. • Impressed with the Cloudera team before purchase.
  • 23. Benefits & Impact: Why we are glad we chose Cloudera • All-inclusive Cloudera Enterprise costs less than the required relational database licenses alone: over 90% cost reduction. • Other benefits: cheaper hardware and easier to manage. • Cloudera Navigator provides a single interface to locate and classify data, audit who is accessing what data, and protect the data with centralized key management. • Critical tool when handling PII and other sensitive data. Comprehensive audit trail allows for easy monitoring of PII data access. • Allows us to satisfy strict security compliance regulations with ease. • Cloudera Professional Services are knowledgeable, responsive, and help establish best practices for our internal development team. They helped us get it right the second time. Any time we had a crisis, they were there to help.
  • 24. Lessons Learned • First used non-Cloudera consulting: big mistake, the design was incorrect for the data collected. Work with Cloudera Professional Services to design it right the first time. • Start small, if you can, and grow the solution. • Don’t need a big capital investment upfront • Get value out of a small cluster (e.g., 3 nodes) and expand as needed. • Install services to meet your current needs. Install additional services as your data needs change. • Look at all Cloudera solutions, learn them, and use them. • Training: Be generous, conduct in phases to keep new skills relevant as you build and deploy. • What’s next? • Prebuilt analytical models as a platform. • Evaluate Navigator Optimizer to improve query performance and identify best candidates for legacy application migration
  • 25. What’s Next for Cloudera’s Analytic Database?
  • 26. Analytic Database Roadmap • Faster, richer, more expressive SQL • Hive-on-Spark GA • Insert, update, delete via Kudu • Performance improvements • Nested JSON • Improved multitenancy • Fewer OOM errors • Graceful node decommission • Admission control enhancements • Improved YARN integration • Better SQL workbench • Higher Hue concurrency • SQL editor usability improvements • Intelligent recommendations of tables, joins & more for Hue users • Exposing tags & lineage through the Hue query experience • Deeper integration with BI tools • Joint workload optimizations • Support for nested types and s • Data discovery functionality injected into the BI experience • Workload optimization • Multi-platform workload profiling • Recommendation of in-line materialized views • Confidential – Do not Redistribute
  • 27. Next Steps • Download Cloudera 5.8 • cloudera.com/downloads • Release Notes • cloudera.com/documentation/enterprise/release-notes/topics/rg_release_notes.html • Learn more about Navigator Optimizer and BI in the Cloud • Register for Parts 2 & 3 of the Webinar Series! • cloudera.com/about-cloudera/events/webinars/5-8-webinar-series.html
  • 28. Questions?
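
The B2C slide asks two monitoring questions: "When has a campaign reached its limit? Has total supply of product been allocated?" As a rough illustration of that check only, here is a minimal in-memory sketch in Python. The class `GiveawayCampaign` and its fields are hypothetical names invented for this example; the deck does not describe how Marketing Associates implemented the check on Hadoop, and a production version would need to handle concurrent writers and persistent storage.

```python
from dataclasses import dataclass


@dataclass
class GiveawayCampaign:
    """Hypothetical model of one product-giveaway campaign (illustrative only)."""
    total_supply: int   # units of product available to give away
    allocated: int = 0  # units already promised to entrants

    def allocate(self, units: int = 1) -> bool:
        """Reserve units for an entrant; refuse if supply would be exceeded."""
        if units <= 0:
            raise ValueError("units must be positive")
        if self.allocated + units > self.total_supply:
            return False
        self.allocated += units
        return True

    @property
    def remaining(self) -> int:
        """Units still available to allocate."""
        return self.total_supply - self.allocated

    @property
    def limit_reached(self) -> bool:
        """True once the total supply has been allocated."""
        return self.remaining == 0


# Five entrants compete for three units: the last two requests are refused.
campaign = GiveawayCampaign(total_supply=3)
results = [campaign.allocate() for _ in range(5)]
print(results, campaign.remaining, campaign.limit_reached)
```

The sketch keeps the two questions separate: `remaining` answers how much supply is left, and `limit_reached` answers whether the campaign should stop accepting entries.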