SlideShare a Scribd company logo
1© Cloudera, Inc. All rights reserved.
Speedpitch @ TDWI
Big Data Integration
Stefan Lipp
ACM, Cloudera
@snlipp
2© Cloudera, Inc. All rights reserved.
Cloudera - company snapshot
Founded 2008, by former employees of
Funding More than $1B invested, $740M primary investment from
NOW Publicly Traded on the NYSE: CLDR
Employees Today 1,500+ worldwide
World Class Support Pro-active & predictive support programs using our EDH
Mission Critical Production deployments in run-the-business applications worldwide
– Financial Services, Pharma, Retail, Telecom, Media, Health Care,
Energy, Government
Largest Ecosystem More than 2,600 Partners
Cloudera University Over 40,000 trained
Open Source Leaders Cloudera employees are leading developers & contributors to the
complete Apache Hadoop ecosystem of projects
3© Cloudera, Inc. All rights reserved.
4© Cloudera, Inc. All rights reserved.
LEGACY = Data to Compute MODERN = Compute to Data
Data
Information-centric
businesses use all data:
multi-structured,
internal & external data
of all types
CRM
Finance
Risk
Process-centric
businesses use:
Structured data mainly
Internal data only
“Important” data only
DWH
Risk
Mart
ELT
ETL
ETL
ETL
Siloed data sources
The “paradigm shift” to Hadoop / data centric platforms
5© Cloudera, Inc. All rights reserved.
Big Data Technology = Multi-In + Scale + Multi-Out
1. Multi-In: Process different types of data together
Structured: From relational and transactional systems (RDBMS).
Semi-structured: e.g. Server Logs, Sensor Logs, Clickstreams, …
Unstructured: e.g. Emails, Tweets, Images, Audio, Video, …
2. Scale technically & economically (reduce
cost/byte).
3. Multi-Out: Run different types of data processing
workloads as part of a unified data pipeline.
©2014 Cloudera, Inc. All rights reserved.
6© Cloudera, Inc. All rights reserved.
The Cloudera data management platform
Data Sources Data Ingest Data Storage & Processing
Serving, Analytics &
Machine Learning
Apache Kafka
Stream or batch ingestion of IoT data
Apache Sqoop
Ingestion of data from relational sources
Apache Hadoop
Storage (HDFS) & Batch (HIVE)
Apache Kudu
Storage & serving for fast changing data
Apache HBase
NoSQL data store for real-time apps
Apache Impala
MPP SQL for fast analytics
Cloudera Search
Real time searchConnected Things/
Data Sources
Structured Data
Sources
Security, Scalability & Easy Management
Deployment
Flexibility: Datacenter Cloud
Apache Spark
Stream & iterative processing, ML
7© Cloudera, Inc. All rights reserved.
Apache Flume
Log & EventAggregation for Hadoop
• Efficiently move large amounts
of streaming/log data
• Easily collect data from multiple
systems (sources)
• Built-in sources, sinks, and
channels
• Customize data flow to transform
data on-the-fly
• Reliable, scalable, and
extensible for production
• Manage and monitor with
Cloudera Manager
Log Files
Sensor Data
UNIX syslog
Hadoop Cluster
Program Output
Network Sockets
Status Updates
Social Media Posts
8© Cloudera, Inc. All rights reserved.
Apache Kafka
Pub-Sub Messaging for Hadoop • Backbone for real-time architectures
• Fast, flexible messaging for a wide
range of use cases
• Scale to support more data sources and
growing data volumes
• Zero data loss durability and always-on
fault-tolerance
• Built-in security and data protection
• Seamless integration across the
platform
• Connect to Flume, Spark Streaming,
HBase, and more
• Manage and monitor with Cloudera
Manager
Kafka decouples Data Pipelines
Source
System
Source
System
Source
System
Source
System
Hadoop
Security
Systems
Real-time
monitoring
Data
Warehouse
Kafka
9© Cloudera, Inc. All rights reserved.
Apache Sqoop
SQL to Hadoop
• Efficiently exchange data between database and Hadoop
• Bidirectional
• Import all or partial/new data
• Export for shared data access across systems
• Easily get started with high performance connectors
• Free to use
• Optimized connectors for popular RDBMS, EDW, and NoSQL options
Database Hadoop Cluster
10© Cloudera, Inc. All rights reserved.
Go beyond SQL with Python & Spark:
Cloudera Data Science Workbench
Accelerates data engineering from
development to production with:
• Secure self-service environments
for data scientists to work against
Cloudera clusters
• Support for Python, R, and Scala,
plus project dependency isolation
for multiple library versions
• Workflow automation, version
control, collaboration and sharing
11© Cloudera, Inc. All rights reserved.
Cloudera Altus PaaS for Data Engineering
Platform as a service for ETL
(machine learning, and data
processing)
● Pay as you Go
● Support for MR2, Hive, Spark,
Hive-on-Spark, Talend
● Job-first orientation
● Quick and easy workload
troubleshooting & analytics
12© Cloudera, Inc. All rights reserved.
DI/DQ/Profiling/Wrangling solutions from partners
13© Cloudera, Inc. All rights reserved.
Data stewardship and governance solutions
Centralized Stewardship End User Discovery
PlatformApplication
Unified technical metadata catalog
Extensible business metadata and glossary
Metadata rules engine
Comprehensive lineage
Unified audit/access logs
Dashboards and analytics
APIs for augmentation and consumption
Data wrangling
Data visualization
Query recommendations
Security profiling
Compliance: BCBS239,
GDPR
End user collaboration
Crowdsourced metadata
Data quality
Uniqueness
Data valuation
Data profiling
Content enrichment
Enterprise aggregation: metadata, lineage, SIEM,
auditing
Project management
Policy management
RACI
Stewardship workflows
ETL
Centralized curation
Centralized glossaries
14© Cloudera, Inc. All rights reserved.
Modern data warehouse landscape
Data
Sources
EDW
Analytic
Database
Operational
Database
Data Science
& Engineering
Shared Data
Layer
Modern Data Platform
Fixed
Reports
Dashboards/
Analytic
Applications
Non-SQL
Workloads
Self-
Service
BI/Ad Hoc
Flexible
Reporting
15© Cloudera, Inc. All rights reserved.
Powered by the best-of-breed technologies
Fastest ETL/ELT at Scale
for Data Engineers
• Flexible and scalable to handle any and all
data
• Fast data processing with distributed, in-
memory processing
• Processed data immediately available with
shared storage and metadata
• Cloud-native for contention-free resourcing
Self-Service BI & Reporting
for Analysts & SQL Developers
• Query data directly without rigid data
modeling
• Interactive multi-user performance for
iterative exploration
• Elastic scalability for more users/data on-
premises and cloud environments
• Cloud-native for insights over shared data
Impala
16© Cloudera, Inc. All rights reserved.
Cloudera’s goal: customer success with open source
By innovating in open source
Some vendors consume the open source community’s activity; others help drive it.
Cloudera leads in influencing the Hadoop platform's evolution by creating, contributing,
and supporting new capabilities that meet customer requirements for security, scale, and
usability.
By curating open standards
Cloudera has a long and proven track record of identifying, curating, and supporting the
open standards (including Apache HBase, Apache Spark, and Apache Kafka) that
provide the mainstream, long-term architecture upon which new customer use cases
are built.
By meeting the highest enterprise requirements
To ensure the best customer experience, Cloudera invests significant resources in multi-
dimensional testing on real workloads before releases, as well as in supportability of the
entire platform via extensive involvement in the open source community.
17© Cloudera, Inc. All rights reserved.
Thank you
Live Demo CDSW – Spark Data Pipelines
heute 10:20-10:30 / Cloudera Stand @ TDWI
Live Demo Altus “Job First” Big Data Integration
heute 13:10-13:20 / Cloudera Stand @ TDWI

More Related Content

What's hot

Big Data Day LA 2016/ Big Data Track - How To Use Impala and Kudu To Optimize...
Big Data Day LA 2016/ Big Data Track - How To Use Impala and Kudu To Optimize...Big Data Day LA 2016/ Big Data Track - How To Use Impala and Kudu To Optimize...
Big Data Day LA 2016/ Big Data Track - How To Use Impala and Kudu To Optimize...
Data Con LA
 
Fraud Detection using Hadoop
Fraud Detection using HadoopFraud Detection using Hadoop
Fraud Detection using Hadoop
hadooparchbook
 
Hadoop Operations for Production Systems (Strata NYC)
Hadoop Operations for Production Systems (Strata NYC)Hadoop Operations for Production Systems (Strata NYC)
Hadoop Operations for Production Systems (Strata NYC)
Kathleen Ting
 
Applications on Hadoop
Applications on HadoopApplications on Hadoop
Applications on Hadoop
markgrover
 
SQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for ImpalaSQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for Impala
markgrover
 
Data Pipelines in Hadoop - SAP Meetup in Tel Aviv
Data Pipelines in Hadoop - SAP Meetup in Tel Aviv Data Pipelines in Hadoop - SAP Meetup in Tel Aviv
Data Pipelines in Hadoop - SAP Meetup in Tel Aviv
larsgeorge
 
Architecting a Fraud Detection Application with Hadoop
Architecting a Fraud Detection Application with HadoopArchitecting a Fraud Detection Application with Hadoop
Architecting a Fraud Detection Application with Hadoop
DataWorks Summit
 
Introduction to Apache Kudu
Introduction to Apache KuduIntroduction to Apache Kudu
Introduction to Apache Kudu
Shravan (Sean) Pabba
 
Big data Hadoop
Big data  Hadoop   Big data  Hadoop
Big data Hadoop
Ayyappan Paramesh
 
Farming hadoop in_the_cloud
Farming hadoop in_the_cloudFarming hadoop in_the_cloud
Farming hadoop in_the_cloud
Steve Loughran
 
cloudera Apache Kudu Updatable Analytical Storage for Modern Data Platform
cloudera Apache Kudu Updatable Analytical Storage for Modern Data Platformcloudera Apache Kudu Updatable Analytical Storage for Modern Data Platform
cloudera Apache Kudu Updatable Analytical Storage for Modern Data Platform
Rakuten Group, Inc.
 
Hadoop Operations - Best practices from the field
Hadoop Operations - Best practices from the fieldHadoop Operations - Best practices from the field
Hadoop Operations - Best practices from the field
Uwe Printz
 
A Closer Look at Apache Kudu
A Closer Look at Apache KuduA Closer Look at Apache Kudu
A Closer Look at Apache Kudu
Andriy Zabavskyy
 
NYC HUG - Application Architectures with Apache Hadoop
NYC HUG - Application Architectures with Apache HadoopNYC HUG - Application Architectures with Apache Hadoop
NYC HUG - Application Architectures with Apache Hadoop
markgrover
 
Introducing Kudu, Big Data Warehousing Meetup
Introducing Kudu, Big Data Warehousing MeetupIntroducing Kudu, Big Data Warehousing Meetup
Introducing Kudu, Big Data Warehousing Meetup
Caserta
 
Introduction to Hadoop - The Essentials
Introduction to Hadoop - The EssentialsIntroduction to Hadoop - The Essentials
Introduction to Hadoop - The Essentials
Fadi Yousuf
 
Kudu Cloudera Meetup Paris
Kudu Cloudera Meetup ParisKudu Cloudera Meetup Paris
Kudu Cloudera Meetup Paris
نهاد مبارك
 
Apache Kudu Fast Analytics on Fast Data (Hadoop / Spark Conference Japan 2016...
Apache Kudu Fast Analytics on Fast Data (Hadoop / Spark Conference Japan 2016...Apache Kudu Fast Analytics on Fast Data (Hadoop / Spark Conference Japan 2016...
Apache Kudu Fast Analytics on Fast Data (Hadoop / Spark Conference Japan 2016...
Hadoop / Spark Conference Japan
 
Strata + Hadoop World 2012: Data Science on Hadoop: How Cloudera Impala Unloc...
Strata + Hadoop World 2012: Data Science on Hadoop: How Cloudera Impala Unloc...Strata + Hadoop World 2012: Data Science on Hadoop: How Cloudera Impala Unloc...
Strata + Hadoop World 2012: Data Science on Hadoop: How Cloudera Impala Unloc...
Cloudera, Inc.
 
Building Effective Near-Real-Time Analytics with Spark Streaming and Kudu
Building Effective Near-Real-Time Analytics with Spark Streaming and KuduBuilding Effective Near-Real-Time Analytics with Spark Streaming and Kudu
Building Effective Near-Real-Time Analytics with Spark Streaming and Kudu
Jeremy Beard
 

What's hot (20)

Big Data Day LA 2016/ Big Data Track - How To Use Impala and Kudu To Optimize...
Big Data Day LA 2016/ Big Data Track - How To Use Impala and Kudu To Optimize...Big Data Day LA 2016/ Big Data Track - How To Use Impala and Kudu To Optimize...
Big Data Day LA 2016/ Big Data Track - How To Use Impala and Kudu To Optimize...
 
Fraud Detection using Hadoop
Fraud Detection using HadoopFraud Detection using Hadoop
Fraud Detection using Hadoop
 
Hadoop Operations for Production Systems (Strata NYC)
Hadoop Operations for Production Systems (Strata NYC)Hadoop Operations for Production Systems (Strata NYC)
Hadoop Operations for Production Systems (Strata NYC)
 
Applications on Hadoop
Applications on HadoopApplications on Hadoop
Applications on Hadoop
 
SQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for ImpalaSQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for Impala
 
Data Pipelines in Hadoop - SAP Meetup in Tel Aviv
Data Pipelines in Hadoop - SAP Meetup in Tel Aviv Data Pipelines in Hadoop - SAP Meetup in Tel Aviv
Data Pipelines in Hadoop - SAP Meetup in Tel Aviv
 
Architecting a Fraud Detection Application with Hadoop
Architecting a Fraud Detection Application with HadoopArchitecting a Fraud Detection Application with Hadoop
Architecting a Fraud Detection Application with Hadoop
 
Introduction to Apache Kudu
Introduction to Apache KuduIntroduction to Apache Kudu
Introduction to Apache Kudu
 
Big data Hadoop
Big data  Hadoop   Big data  Hadoop
Big data Hadoop
 
Farming hadoop in_the_cloud
Farming hadoop in_the_cloudFarming hadoop in_the_cloud
Farming hadoop in_the_cloud
 
cloudera Apache Kudu Updatable Analytical Storage for Modern Data Platform
cloudera Apache Kudu Updatable Analytical Storage for Modern Data Platformcloudera Apache Kudu Updatable Analytical Storage for Modern Data Platform
cloudera Apache Kudu Updatable Analytical Storage for Modern Data Platform
 
Hadoop Operations - Best practices from the field
Hadoop Operations - Best practices from the fieldHadoop Operations - Best practices from the field
Hadoop Operations - Best practices from the field
 
A Closer Look at Apache Kudu
A Closer Look at Apache KuduA Closer Look at Apache Kudu
A Closer Look at Apache Kudu
 
NYC HUG - Application Architectures with Apache Hadoop
NYC HUG - Application Architectures with Apache HadoopNYC HUG - Application Architectures with Apache Hadoop
NYC HUG - Application Architectures with Apache Hadoop
 
Introducing Kudu, Big Data Warehousing Meetup
Introducing Kudu, Big Data Warehousing MeetupIntroducing Kudu, Big Data Warehousing Meetup
Introducing Kudu, Big Data Warehousing Meetup
 
Introduction to Hadoop - The Essentials
Introduction to Hadoop - The EssentialsIntroduction to Hadoop - The Essentials
Introduction to Hadoop - The Essentials
 
Kudu Cloudera Meetup Paris
Kudu Cloudera Meetup ParisKudu Cloudera Meetup Paris
Kudu Cloudera Meetup Paris
 
Apache Kudu Fast Analytics on Fast Data (Hadoop / Spark Conference Japan 2016...
Apache Kudu Fast Analytics on Fast Data (Hadoop / Spark Conference Japan 2016...Apache Kudu Fast Analytics on Fast Data (Hadoop / Spark Conference Japan 2016...
Apache Kudu Fast Analytics on Fast Data (Hadoop / Spark Conference Japan 2016...
 
Strata + Hadoop World 2012: Data Science on Hadoop: How Cloudera Impala Unloc...
Strata + Hadoop World 2012: Data Science on Hadoop: How Cloudera Impala Unloc...Strata + Hadoop World 2012: Data Science on Hadoop: How Cloudera Impala Unloc...
Strata + Hadoop World 2012: Data Science on Hadoop: How Cloudera Impala Unloc...
 
Building Effective Near-Real-Time Analytics with Spark Streaming and Kudu
Building Effective Near-Real-Time Analytics with Spark Streaming and KuduBuilding Effective Near-Real-Time Analytics with Spark Streaming and Kudu
Building Effective Near-Real-Time Analytics with Spark Streaming and Kudu
 

Viewers also liked

Big Data Standards - Workshop, ExpBio, Boston, 2015
Big Data Standards - Workshop, ExpBio, Boston, 2015Big Data Standards - Workshop, ExpBio, Boston, 2015
Big Data Standards - Workshop, ExpBio, Boston, 2015
Susanna-Assunta Sansone
 
mini MAXI art exhibition
mini MAXI art exhibitionmini MAXI art exhibition
mini MAXI art exhibition
Anna Casey
 
A Beginners Guide to noSQL
A Beginners Guide to noSQLA Beginners Guide to noSQL
A Beginners Guide to noSQL
Mike Crabb
 
Scalability, Availability & Stability Patterns
Scalability, Availability & Stability PatternsScalability, Availability & Stability Patterns
Scalability, Availability & Stability Patterns
Jonas Bonér
 
Introduction to NoSQL Databases
Introduction to NoSQL DatabasesIntroduction to NoSQL Databases
Introduction to NoSQL Databases
Derek Stainer
 
Enabling the Industry 4.0 vision: Hype? Real Opportunity!
Enabling the Industry 4.0 vision: Hype? Real Opportunity!Enabling the Industry 4.0 vision: Hype? Real Opportunity!
Enabling the Industry 4.0 vision: Hype? Real Opportunity!
Boris Otto
 

Viewers also liked (6)

Big Data Standards - Workshop, ExpBio, Boston, 2015
Big Data Standards - Workshop, ExpBio, Boston, 2015Big Data Standards - Workshop, ExpBio, Boston, 2015
Big Data Standards - Workshop, ExpBio, Boston, 2015
 
mini MAXI art exhibition
mini MAXI art exhibitionmini MAXI art exhibition
mini MAXI art exhibition
 
A Beginners Guide to noSQL
A Beginners Guide to noSQLA Beginners Guide to noSQL
A Beginners Guide to noSQL
 
Scalability, Availability & Stability Patterns
Scalability, Availability & Stability PatternsScalability, Availability & Stability Patterns
Scalability, Availability & Stability Patterns
 
Introduction to NoSQL Databases
Introduction to NoSQL DatabasesIntroduction to NoSQL Databases
Introduction to NoSQL Databases
 
Enabling the Industry 4.0 vision: Hype? Real Opportunity!
Enabling the Industry 4.0 vision: Hype? Real Opportunity!Enabling the Industry 4.0 vision: Hype? Real Opportunity!
Enabling the Industry 4.0 vision: Hype? Real Opportunity!
 

Similar to Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017

Cloudera Analytics and Machine Learning Platform - Optimized for Cloud
Cloudera Analytics and Machine Learning Platform - Optimized for Cloud Cloudera Analytics and Machine Learning Platform - Optimized for Cloud
Cloudera Analytics and Machine Learning Platform - Optimized for Cloud
Stefan Lipp
 
Cw13 big data and apache hadoop by amr awadallah-cloudera
Cw13 big data and apache hadoop by amr awadallah-clouderaCw13 big data and apache hadoop by amr awadallah-cloudera
Cw13 big data and apache hadoop by amr awadallah-cloudera
inevitablecloud
 
Intro to Big Data and Apache Hadoop by Dr. Amr Awadallah at CLOUD WEEKEND '13...
Intro to Big Data and Apache Hadoop by Dr. Amr Awadallah at CLOUD WEEKEND '13...Intro to Big Data and Apache Hadoop by Dr. Amr Awadallah at CLOUD WEEKEND '13...
Intro to Big Data and Apache Hadoop by Dr. Amr Awadallah at CLOUD WEEKEND '13...
TheInevitableCloud
 
Hive, Impala, and Spark, Oh My: SQL-on-Hadoop in Cloudera 5.5
Hive, Impala, and Spark, Oh My: SQL-on-Hadoop in Cloudera 5.5Hive, Impala, and Spark, Oh My: SQL-on-Hadoop in Cloudera 5.5
Hive, Impala, and Spark, Oh My: SQL-on-Hadoop in Cloudera 5.5
Cloudera, Inc.
 
Analyzing Hadoop Data Using Sparklyr

Analyzing Hadoop Data Using Sparklyr
Analyzing Hadoop Data Using Sparklyr

Analyzing Hadoop Data Using Sparklyr

Cloudera, Inc.
 
Hadoop Essentials -- The What, Why and How to Meet Agency Objectives
Hadoop Essentials -- The What, Why and How to Meet Agency ObjectivesHadoop Essentials -- The What, Why and How to Meet Agency Objectives
Hadoop Essentials -- The What, Why and How to Meet Agency Objectives
Cloudera, Inc.
 
Hadoop: Extending your Data Warehouse
Hadoop: Extending your Data WarehouseHadoop: Extending your Data Warehouse
Hadoop: Extending your Data Warehouse
Cloudera, Inc.
 
Introducing Cloudera Navigator Optimizer: Offload Assessments and Active Data...
Introducing Cloudera Navigator Optimizer: Offload Assessments and Active Data...Introducing Cloudera Navigator Optimizer: Offload Assessments and Active Data...
Introducing Cloudera Navigator Optimizer: Offload Assessments and Active Data...
Cloudera, Inc.
 
Actian Analytics Platform - Hadoop SQL Edition
Actian Analytics Platform - Hadoop SQL EditionActian Analytics Platform - Hadoop SQL Edition
Actian Analytics Platform - Hadoop SQL Edition
Alessandro Salvatico
 
Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
 Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ... Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
Cloudera, Inc.
 
Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...
Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...
Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...
Innovative Management Services
 
Oracle Data Integration - Overview
Oracle Data Integration - OverviewOracle Data Integration - Overview
Oracle Data Integration - Overview
Jeffrey T. Pollock
 
Part 2: A Visual Dive into Machine Learning and Deep Learning 

Part 2: A Visual Dive into Machine Learning and Deep Learning 
Part 2: A Visual Dive into Machine Learning and Deep Learning 

Part 2: A Visual Dive into Machine Learning and Deep Learning 

Cloudera, Inc.
 
Consolidate your data marts for fast, flexible analytics 5.24.18
Consolidate your data marts for fast, flexible analytics 5.24.18Consolidate your data marts for fast, flexible analytics 5.24.18
Consolidate your data marts for fast, flexible analytics 5.24.18
Cloudera, Inc.
 
Hortonworks.bdb
Hortonworks.bdbHortonworks.bdb
Hortonworks.bdb
Emil Andreas Siemes
 
Data Science in the Enterprise
Data Science in the EnterpriseData Science in the Enterprise
Data Science in the Enterprise
The Hive
 
The Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data HubThe Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data Hub
Cloudera, Inc.
 
Building a Data Hub that Empowers Customer Insight (Technical Workshop)
Building a Data Hub that Empowers Customer Insight (Technical Workshop)Building a Data Hub that Empowers Customer Insight (Technical Workshop)
Building a Data Hub that Empowers Customer Insight (Technical Workshop)
Cloudera, Inc.
 
Simplifying Real-Time Architectures for IoT with Apache Kudu
Simplifying Real-Time Architectures for IoT with Apache KuduSimplifying Real-Time Architectures for IoT with Apache Kudu
Simplifying Real-Time Architectures for IoT with Apache Kudu
Cloudera, Inc.
 
Hortonworks Oracle Big Data Integration
Hortonworks Oracle Big Data Integration Hortonworks Oracle Big Data Integration
Hortonworks Oracle Big Data Integration
Hortonworks
 

Similar to Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017 (20)

Cloudera Analytics and Machine Learning Platform - Optimized for Cloud
Cloudera Analytics and Machine Learning Platform - Optimized for Cloud Cloudera Analytics and Machine Learning Platform - Optimized for Cloud
Cloudera Analytics and Machine Learning Platform - Optimized for Cloud
 
Cw13 big data and apache hadoop by amr awadallah-cloudera
Cw13 big data and apache hadoop by amr awadallah-clouderaCw13 big data and apache hadoop by amr awadallah-cloudera
Cw13 big data and apache hadoop by amr awadallah-cloudera
 
Intro to Big Data and Apache Hadoop by Dr. Amr Awadallah at CLOUD WEEKEND '13...
Intro to Big Data and Apache Hadoop by Dr. Amr Awadallah at CLOUD WEEKEND '13...Intro to Big Data and Apache Hadoop by Dr. Amr Awadallah at CLOUD WEEKEND '13...
Intro to Big Data and Apache Hadoop by Dr. Amr Awadallah at CLOUD WEEKEND '13...
 
Hive, Impala, and Spark, Oh My: SQL-on-Hadoop in Cloudera 5.5
Hive, Impala, and Spark, Oh My: SQL-on-Hadoop in Cloudera 5.5Hive, Impala, and Spark, Oh My: SQL-on-Hadoop in Cloudera 5.5
Hive, Impala, and Spark, Oh My: SQL-on-Hadoop in Cloudera 5.5
 
Analyzing Hadoop Data Using Sparklyr

Analyzing Hadoop Data Using Sparklyr
Analyzing Hadoop Data Using Sparklyr

Analyzing Hadoop Data Using Sparklyr

 
Hadoop Essentials -- The What, Why and How to Meet Agency Objectives
Hadoop Essentials -- The What, Why and How to Meet Agency ObjectivesHadoop Essentials -- The What, Why and How to Meet Agency Objectives
Hadoop Essentials -- The What, Why and How to Meet Agency Objectives
 
Hadoop: Extending your Data Warehouse
Hadoop: Extending your Data WarehouseHadoop: Extending your Data Warehouse
Hadoop: Extending your Data Warehouse
 
Introducing Cloudera Navigator Optimizer: Offload Assessments and Active Data...
Introducing Cloudera Navigator Optimizer: Offload Assessments and Active Data...Introducing Cloudera Navigator Optimizer: Offload Assessments and Active Data...
Introducing Cloudera Navigator Optimizer: Offload Assessments and Active Data...
 
Actian Analytics Platform - Hadoop SQL Edition
Actian Analytics Platform - Hadoop SQL EditionActian Analytics Platform - Hadoop SQL Edition
Actian Analytics Platform - Hadoop SQL Edition
 
Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
 Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ... Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
 
Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...
Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...
Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...
 
Oracle Data Integration - Overview
Oracle Data Integration - OverviewOracle Data Integration - Overview
Oracle Data Integration - Overview
 
Part 2: A Visual Dive into Machine Learning and Deep Learning 

Part 2: A Visual Dive into Machine Learning and Deep Learning 
Part 2: A Visual Dive into Machine Learning and Deep Learning 

Part 2: A Visual Dive into Machine Learning and Deep Learning 

 
Consolidate your data marts for fast, flexible analytics 5.24.18
Consolidate your data marts for fast, flexible analytics 5.24.18Consolidate your data marts for fast, flexible analytics 5.24.18
Consolidate your data marts for fast, flexible analytics 5.24.18
 
Hortonworks.bdb
Hortonworks.bdbHortonworks.bdb
Hortonworks.bdb
 
Data Science in the Enterprise
Data Science in the EnterpriseData Science in the Enterprise
Data Science in the Enterprise
 
The Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data HubThe Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data Hub
 
Building a Data Hub that Empowers Customer Insight (Technical Workshop)
Building a Data Hub that Empowers Customer Insight (Technical Workshop)Building a Data Hub that Empowers Customer Insight (Technical Workshop)
Building a Data Hub that Empowers Customer Insight (Technical Workshop)
 
Simplifying Real-Time Architectures for IoT with Apache Kudu
Simplifying Real-Time Architectures for IoT with Apache KuduSimplifying Real-Time Architectures for IoT with Apache Kudu
Simplifying Real-Time Architectures for IoT with Apache Kudu
 
Hortonworks Oracle Big Data Integration
Hortonworks Oracle Big Data Integration Hortonworks Oracle Big Data Integration
Hortonworks Oracle Big Data Integration
 

Recently uploaded

bangalore Girls call 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery
bangalore Girls call  👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Deliverybangalore Girls call  👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery
bangalore Girls call 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery
sunilverma7884
 
Private Girls Call Navi Mumbai 🛵🚡9820252231 💃 Choose Best And Top Girl Servic...
Private Girls Call Navi Mumbai 🛵🚡9820252231 💃 Choose Best And Top Girl Servic...Private Girls Call Navi Mumbai 🛵🚡9820252231 💃 Choose Best And Top Girl Servic...
Private Girls Call Navi Mumbai 🛵🚡9820252231 💃 Choose Best And Top Girl Servic...
902basic
 
React Native vs Flutter - SSTech System
React Native vs Flutter  - SSTech SystemReact Native vs Flutter  - SSTech System
React Native vs Flutter - SSTech System
SSTech System
 
Independent Girls call Service Pune 000XX00000 Provide Best And Top Girl Serv...
Independent Girls call Service Pune 000XX00000 Provide Best And Top Girl Serv...Independent Girls call Service Pune 000XX00000 Provide Best And Top Girl Serv...
Independent Girls call Service Pune 000XX00000 Provide Best And Top Girl Serv...
bhumivarma35300
 
Prada Group Reports Strong Growth in First Quarter …
Prada Group Reports Strong Growth in First Quarter …Prada Group Reports Strong Growth in First Quarter …
Prada Group Reports Strong Growth in First Quarter …
908dutch
 
Amadeus Travel API, Amadeus Booking API, Amadeus GDS
Amadeus Travel API, Amadeus Booking API, Amadeus GDSAmadeus Travel API, Amadeus Booking API, Amadeus GDS
Amadeus Travel API, Amadeus Booking API, Amadeus GDS
aadhiyaeliza
 
Comprehensive Vulnerability Assessments Process _ Aardwolf Security.docx
Comprehensive Vulnerability Assessments Process _ Aardwolf Security.docxComprehensive Vulnerability Assessments Process _ Aardwolf Security.docx
Comprehensive Vulnerability Assessments Process _ Aardwolf Security.docx
Aardwolf Security
 
To Avoid Mistakes When Using Online Attendance Sheets
To Avoid Mistakes When Using Online Attendance SheetsTo Avoid Mistakes When Using Online Attendance Sheets
To Avoid Mistakes When Using Online Attendance Sheets
Task Tracker
 
TEQnation 2024: Sustainable Software: May the Green Code Be with You
TEQnation 2024: Sustainable Software: May the Green Code Be with YouTEQnation 2024: Sustainable Software: May the Green Code Be with You
TEQnation 2024: Sustainable Software: May the Green Code Be with You
marcofolio
 
Celebrity Girls Call Mumbai 9930687706 Unlimited Short Providing Girls Servic...
Celebrity Girls Call Mumbai 9930687706 Unlimited Short Providing Girls Servic...Celebrity Girls Call Mumbai 9930687706 Unlimited Short Providing Girls Servic...
Celebrity Girls Call Mumbai 9930687706 Unlimited Short Providing Girls Servic...
kiara pandey
 
Software development... for all? (keynote at ICSOFT'2024)
Software development... for all? (keynote at ICSOFT'2024)Software development... for all? (keynote at ICSOFT'2024)
Software development... for all? (keynote at ICSOFT'2024)
miso_uam
 
AI - Your Startup Sidekick (Leveraging AI to Bootstrap a Lean Startup).pdf
AI - Your Startup Sidekick (Leveraging AI to Bootstrap a Lean Startup).pdfAI - Your Startup Sidekick (Leveraging AI to Bootstrap a Lean Startup).pdf
AI - Your Startup Sidekick (Leveraging AI to Bootstrap a Lean Startup).pdf
Daniel Zivkovic
 
How To Fill Timesheet in TaskSprint: Quick Guide 2024
How To Fill Timesheet in TaskSprint: Quick Guide 2024How To Fill Timesheet in TaskSprint: Quick Guide 2024
How To Fill Timesheet in TaskSprint: Quick Guide 2024
TaskSprint | Employee Efficiency Software
 
VVIP Girls Call Mumbai 9910780858 Provide Best And Top Girl Service And No1 i...
VVIP Girls Call Mumbai 9910780858 Provide Best And Top Girl Service And No1 i...VVIP Girls Call Mumbai 9910780858 Provide Best And Top Girl Service And No1 i...
VVIP Girls Call Mumbai 9910780858 Provide Best And Top Girl Service And No1 i...
jealousviolet
 
Girls Call Mysore 000XX00000 Provide Best And Top Girl Service And No1 in City
Girls Call Mysore 000XX00000 Provide Best And Top Girl Service And No1 in CityGirls Call Mysore 000XX00000 Provide Best And Top Girl Service And No1 in City
Girls Call Mysore 000XX00000 Provide Best And Top Girl Service And No1 in City
neshakor5152
 
Maximizing Efficiency and Profitability: Optimizing Data Systems, Enhancing C...
Maximizing Efficiency and Profitability: Optimizing Data Systems, Enhancing C...Maximizing Efficiency and Profitability: Optimizing Data Systems, Enhancing C...
Maximizing Efficiency and Profitability: Optimizing Data Systems, Enhancing C...
OnePlan Solutions
 
The Ultimate Guide to Phone Spy Apps: Everything You Need to Know
The Ultimate Guide to Phone Spy Apps: Everything You Need to KnowThe Ultimate Guide to Phone Spy Apps: Everything You Need to Know
The Ultimate Guide to Phone Spy Apps: Everything You Need to Know
onemonitarsoftware
 
Artificial intelligence in customer services or chatbots
Artificial intelligence  in customer services or chatbotsArtificial intelligence  in customer services or chatbots
Artificial intelligence in customer services or chatbots
kayash1656
 
Busty Girls Call Mumbai 9930245274 Unlimited Short Providing Girls Service Av...
Busty Girls Call Mumbai 9930245274 Unlimited Short Providing Girls Service Av...Busty Girls Call Mumbai 9930245274 Unlimited Short Providing Girls Service Av...
Busty Girls Call Mumbai 9930245274 Unlimited Short Providing Girls Service Av...
revolutionary575
 
AWS DevOps-Tutorial CHANAKYA SRIYAN DUKKA.
AWS DevOps-Tutorial CHANAKYA SRIYAN DUKKA.AWS DevOps-Tutorial CHANAKYA SRIYAN DUKKA.
AWS DevOps-Tutorial CHANAKYA SRIYAN DUKKA.
Srinivas Dukka
 

Recently uploaded (20)

bangalore Girls call 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery
bangalore Girls call  👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Deliverybangalore Girls call  👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery
bangalore Girls call 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery
 
Private Girls Call Navi Mumbai 🛵🚡9820252231 💃 Choose Best And Top Girl Servic...
Private Girls Call Navi Mumbai 🛵🚡9820252231 💃 Choose Best And Top Girl Servic...Private Girls Call Navi Mumbai 🛵🚡9820252231 💃 Choose Best And Top Girl Servic...
Private Girls Call Navi Mumbai 🛵🚡9820252231 💃 Choose Best And Top Girl Servic...
 
React Native vs Flutter - SSTech System
React Native vs Flutter  - SSTech SystemReact Native vs Flutter  - SSTech System
React Native vs Flutter - SSTech System
 
Independent Girls call Service Pune 000XX00000 Provide Best And Top Girl Serv...
Independent Girls call Service Pune 000XX00000 Provide Best And Top Girl Serv...Independent Girls call Service Pune 000XX00000 Provide Best And Top Girl Serv...
Independent Girls call Service Pune 000XX00000 Provide Best And Top Girl Serv...
 
Prada Group Reports Strong Growth in First Quarter …
Prada Group Reports Strong Growth in First Quarter …Prada Group Reports Strong Growth in First Quarter …
Prada Group Reports Strong Growth in First Quarter …
 
Amadeus Travel API, Amadeus Booking API, Amadeus GDS
Amadeus Travel API, Amadeus Booking API, Amadeus GDSAmadeus Travel API, Amadeus Booking API, Amadeus GDS
Amadeus Travel API, Amadeus Booking API, Amadeus GDS
 
Comprehensive Vulnerability Assessments Process _ Aardwolf Security.docx
Comprehensive Vulnerability Assessments Process _ Aardwolf Security.docxComprehensive Vulnerability Assessments Process _ Aardwolf Security.docx
Comprehensive Vulnerability Assessments Process _ Aardwolf Security.docx
 
To Avoid Mistakes When Using Online Attendance Sheets
To Avoid Mistakes When Using Online Attendance SheetsTo Avoid Mistakes When Using Online Attendance Sheets
To Avoid Mistakes When Using Online Attendance Sheets
 
TEQnation 2024: Sustainable Software: May the Green Code Be with You
TEQnation 2024: Sustainable Software: May the Green Code Be with YouTEQnation 2024: Sustainable Software: May the Green Code Be with You
TEQnation 2024: Sustainable Software: May the Green Code Be with You
 
Celebrity Girls Call Mumbai 9930687706 Unlimited Short Providing Girls Servic...
Celebrity Girls Call Mumbai 9930687706 Unlimited Short Providing Girls Servic...Celebrity Girls Call Mumbai 9930687706 Unlimited Short Providing Girls Servic...
Celebrity Girls Call Mumbai 9930687706 Unlimited Short Providing Girls Servic...
 
Software development... for all? (keynote at ICSOFT'2024)
Software development... for all? (keynote at ICSOFT'2024)Software development... for all? (keynote at ICSOFT'2024)
Software development... for all? (keynote at ICSOFT'2024)
 
AI - Your Startup Sidekick (Leveraging AI to Bootstrap a Lean Startup).pdf
AI - Your Startup Sidekick (Leveraging AI to Bootstrap a Lean Startup).pdfAI - Your Startup Sidekick (Leveraging AI to Bootstrap a Lean Startup).pdf
AI - Your Startup Sidekick (Leveraging AI to Bootstrap a Lean Startup).pdf
 
How To Fill Timesheet in TaskSprint: Quick Guide 2024
How To Fill Timesheet in TaskSprint: Quick Guide 2024How To Fill Timesheet in TaskSprint: Quick Guide 2024
How To Fill Timesheet in TaskSprint: Quick Guide 2024
 
VVIP Girls Call Mumbai 9910780858 Provide Best And Top Girl Service And No1 i...
VVIP Girls Call Mumbai 9910780858 Provide Best And Top Girl Service And No1 i...VVIP Girls Call Mumbai 9910780858 Provide Best And Top Girl Service And No1 i...
VVIP Girls Call Mumbai 9910780858 Provide Best And Top Girl Service And No1 i...
 
Girls Call Mysore 000XX00000 Provide Best And Top Girl Service And No1 in City
Girls Call Mysore 000XX00000 Provide Best And Top Girl Service And No1 in CityGirls Call Mysore 000XX00000 Provide Best And Top Girl Service And No1 in City
Girls Call Mysore 000XX00000 Provide Best And Top Girl Service And No1 in City
 
Maximizing Efficiency and Profitability: Optimizing Data Systems, Enhancing C...
Maximizing Efficiency and Profitability: Optimizing Data Systems, Enhancing C...Maximizing Efficiency and Profitability: Optimizing Data Systems, Enhancing C...
Maximizing Efficiency and Profitability: Optimizing Data Systems, Enhancing C...
 
The Ultimate Guide to Phone Spy Apps: Everything You Need to Know
The Ultimate Guide to Phone Spy Apps: Everything You Need to KnowThe Ultimate Guide to Phone Spy Apps: Everything You Need to Know
The Ultimate Guide to Phone Spy Apps: Everything You Need to Know
 
Artificial intelligence in customer services or chatbots
Artificial intelligence  in customer services or chatbotsArtificial intelligence  in customer services or chatbots
Artificial intelligence in customer services or chatbots
 
Busty Girls Call Mumbai 9930245274 Unlimited Short Providing Girls Service Av...
Busty Girls Call Mumbai 9930245274 Unlimited Short Providing Girls Service Av...Busty Girls Call Mumbai 9930245274 Unlimited Short Providing Girls Service Av...
Busty Girls Call Mumbai 9930245274 Unlimited Short Providing Girls Service Av...
 
AWS DevOps-Tutorial CHANAKYA SRIYAN DUKKA.
AWS DevOps-Tutorial CHANAKYA SRIYAN DUKKA.AWS DevOps-Tutorial CHANAKYA SRIYAN DUKKA.
AWS DevOps-Tutorial CHANAKYA SRIYAN DUKKA.
 

Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017

  • 1. 1© Cloudera, Inc. All rights reserved. Speedpitch @ TDWI Big Data Integration Stefan Lipp ACM, Cloudera @snlipp
  • 2. 2© Cloudera, Inc. All rights reserved. Cloudera - company snapshot Founded 2008, by former employees of Funding More than $1B invested, $740M primary investment from NOW Publicly Traded on the NYSE: CLDR Employees Today 1,500+ worldwide World Class Support Pro-active & predictive support programs using our EDH Mission Critical Production deployments in run-the-business applications worldwide – Financial Services, Pharma, Retail, Telecom, Media, Health Care, Energy, Government Largest Ecosystem More than 2,600 Partners Cloudera University Over 40,000 trained Open Source Leaders Cloudera employees are leading developers & contributors to the complete Apache Hadoop ecosystem of projects
  • 3. 3© Cloudera, Inc. All rights reserved.
  • 4. 4© Cloudera, Inc. All rights reserved. LEGACY = Data to Compute MODERN = Compute to Data Data Information-centric businesses use all data: multi-structured, internal & external data of all types CRM Finance Risk Process-centric businesses use: Structured data mainly Internal data only “Important” data only DWH Risk Mart ELT ETL ETL ETL Siloed data sources The “paradigm shift” to Hadoop / data centric platforms
  • 5. 5© Cloudera, Inc. All rights reserved. Big Data Technology = Multi-In + Scale + Multi-Out 1. Multi-In: Process different types of data together Structured: From relational and transactional systems (RDBMS). Semi-structured: e.g. Server Logs, Sensor Logs, Clickstreams, … Unstructured: e.g. Emails, Tweets, Images, Audio, Video, … 2. Scale technically & economically (reduce cost/byte). 3. Multi-Out: Run different types of data processing workloads as part of a unified data pipeline. ©2014 Cloudera, Inc. All rights reserved.
  • 6. 6© Cloudera, Inc. All rights reserved. The Cloudera data management platform Data Sources Data Ingest Data Storage & Processing Serving, Analytics & Machine Learning Apache Kafka Stream or batch ingestion of IoT data Apache Sqoop Ingestion of data from relational sources Apache Hadoop Storage (HDFS) & Batch (HIVE) Apache Kudu Storage & serving for fast changing data Apache HBase NoSQL data store for real-time apps Apache Impala MPP SQL for fast analytics Cloudera Search Real time searchConnected Things/ Data Sources Structured Data Sources Security, Scalability & Easy Management Deployment Flexibility: Datacenter Cloud Apache Spark Stream & iterative processing, ML
  • 7. 7© Cloudera, Inc. All rights reserved. Apache Flume Log & EventAggregation for Hadoop • Efficiently move large amounts of streaming/log data • Easily collect data from multiple systems (sources) • Built-in sources, sinks, and channels • Customize data flow to transform data on-the-fly • Reliable, scalable, and extensible for production • Manage and monitor with Cloudera Manager Log Files Sensor Data UNIX syslog Hadoop Cluster Program Output Network Sockets Status Updates Social Media Posts
  • 8. 8© Cloudera, Inc. All rights reserved. Apache Kafka Pub-Sub Messaging for Hadoop • Backbone for real-time architectures • Fast, flexible messaging for a wide range of use cases • Scale to support more data sources and growing data volumes • Zero data loss durability and always-on fault-tolerance • Built-in security and data protection • Seamless integration across the platform • Connect to Flume, Spark Streaming, HBase, and more • Manage and monitor with Cloudera Manager Kafka decouples Data Pipelines Source System Source System Source System Source System Hadoop Security Systems Real-time monitoring Data Warehouse Kafka
  • 9. 9© Cloudera, Inc. All rights reserved. Apache Sqoop SQL to Hadoop • Efficiently exchange data between database and Hadoop • Bidirectional • Import all or partial/new data • Export for shared data access across systems • Easily get started with high performance connectors • Free to use • Optimized connectors for popular RDBMS, EDW, and NoSQL options Database Hadoop Cluster
  • 10. 10© Cloudera, Inc. All rights reserved. Go beyond SQL with Python & Spark: Cloudera Data Science Workbench Accelerates data engineering from development to production with: • Secure self-service environments for data scientists to work against Cloudera clusters • Support for Python, R, and Scala, plus project dependency isolation for multiple library versions • Workflow automation, version control, collaboration and sharing
  • 11. 11© Cloudera, Inc. All rights reserved. Cloudera Altus PaaS for Data Engineering Platform as a service for ETL (machine learning, and data processing) ● Pay as you Go ● Support for MR2, Hive, Spark, Hive-on-Spark, Talend ● Job-first orientation ● Quick and easy workload troubleshooting & analytics
  • 12. 12© Cloudera, Inc. All rights reserved. DI/DQ/Profiling/Wrangling solutions from partners
  • 13. 13© Cloudera, Inc. All rights reserved. Data stewardship and governance solutions Centralized Stewardship End User Discovery PlatformApplication Unified technical metadata catalog Extensible business metadata and glossary Metadata rules engine Comprehensive lineage Unified audit/access logs Dashboards and analytics APIs for augmentation and consumption Data wrangling Data visualization Query recommendations Security profiling Compliance: BCBS239, GDPR End user collaboration Crowdsourced metadata Data quality Uniqueness Data valuation Data profiling Content enrichment Enterprise aggregation: metadata, lineage, SIEM, auditing Project management Policy management RACI Stewardship workflows ETL Centralized curation Centralized glossaries
  • 14. 14© Cloudera, Inc. All rights reserved. Modern data warehouse landscape Data Sources EDW Analytic Database Operational Database Data Science & Engineering Shared Data Layer Modern Data Platform Fixed Reports Dashboards/ Analytic Applications Non-SQL Workloads Self- Service BI/Ad Hoc Flexible Reporting
  • 15. 15© Cloudera, Inc. All rights reserved. Powered by the best-of-breed technologies Fastest ETL/ELT at Scale for Data Engineers • Flexible and scalable to handle any and all data • Fast data processing with distributed, in- memory processing • Processed data immediately available with shared storage and metadata • Cloud-native for contention-free resourcing Self-Service BI & Reporting for Analysts & SQL Developers • Query data directly without rigid data modeling • Interactive multi-user performance for iterative exploration • Elastic scalability for more users/data on- premises and cloud environments • Cloud-native for insights over shared data Impala
  • 16. 16© Cloudera, Inc. All rights reserved. Cloudera’s goal: customer success with open source By innovating in open source Some vendors consume the open source community’s activity; others help drive it. Cloudera leads in influencing the Hadoop platform's evolution by creating, contributing, and supporting new capabilities that meet customer requirements for security, scale, and usability. By curating open standards Cloudera has a long and proven track record of identifying, curating, and supporting the open standards (including Apache HBase, Apache Spark, and Apache Kafka) that provide the mainstream, long-term architecture upon which new customer use cases are built. By meeting the highest enterprise requirements To ensure the best customer experience, Cloudera invests significant resources in multi- dimensional testing on real workloads before releases, as well as in supportability of the entire platform via extensive involvement in the open source community.
  • 17. 17© Cloudera, Inc. All rights reserved. Thank you Live Demo CDSW – Spark Data Pipelines heute 10:20-10:30 / Cloudera Stand @ TDWI Live Demo Altus “Job First” Big Data Integration heute 13:10-13:20 / Cloudera Stand @ TDWI