SlideShare a Scribd company logo
1 of 22
Building Smart Data Lake
Slalom Approach
May, 2018
"Better to bet on cloud providers
for infrastructure, Cloudera for data,
analytics and security fabric, and
leave the rest to the ecosystem"
The Ecosystem
Value Identification Value Demonstration Value Realization
Exploring big data and
working to identify a
business case
Past the business case
and need to demonstrate
value for broader adoption
Implemented early use
cases with limited value
and lacking traction
It takes a long time
We don’t have big data
I’m not sure how to
start
All our data is
structured data
We already have a
data warehouse
We don’t have
business case for it
It is hard to find the
required skillsets
No one using our
Hadoop data
Hadoop ecosystem is
overwhelming
Business users are not
happy
Big data has come long way and the enterprises are at different phase of their journey. However,
broader adoption of the computation ecosystem is still in its early stages.
It is too expensive
We yet to realize the
benefits it promises
Value Identification Value Demonstration Value Realization
Exploring big data and
working to identify a
business case
Past the business case
and need to demonstrate
value for broader adoption
Implemented early use
cases with limited value
and lacking traction
Value Identification Value Demonstration Value Realization
Traditional architectures use rigid data models, costly platforms, resource-intensive ETL and lack
support for new use cases.
Rigid Data Architecture
Early binding to the pre-defined schema makes it inflexible
and costly
Flexible Architecture
Data is ingested and transformed without prior knowledge of target
schema
Costly Infrastructure and Solution
Data duplicated across costly platforms
50-70% spend on acquisition and integration
Simplified Infrastructure and Solution
Flexible on-premise and cloud infrastructure
API-based pipelines automate data ingestion
Lacks Support for “New” Use Cases
Data silo’s impede real-time processing required to support
modern use cases
Best Suited for “New” Use Cases
Centralized hub for heterogeneous data and variety of tools enable
real-time analytics
Declining Talent Pool
The new talent lacks excitement for the traditional
technologies and tools
Growing Talent Pool
Elevated interest in data engineering and data science work
Traditional Modern
Requires army of costly professionals to support longer delivery cycles and brittle data processes.
Slower Speed-to-Market
Longer delivery lifecycle involving too many project phases
Accelerated Speed-to-Market
Separation of data management from discovery and analytics
accelerates solution delivery
Heavy Reliance on Costly IT Resources
Point-to-point ETL and early binding data model requires IT
resources for any data changes
Enabled Business Self-Service
Centralized data enables “data wrangling” and analytics by
business users and data scientists
Army of Data Professionals Streamlined Data Roles
Traditional Modern
Value Identification Value Demonstration Value Realization
Select outcome-based high impact use case(s) and deliver minimal viable product (MVP) to demonstrate
immediate success.
.
Story contact:
C L I E N T S O L U T I O N S I N D U S T R Y
The client had a vision to drive improved customer experience and engagement through
personalized marketing campaigns and needed an on-premise solution that enables the initial use
cases and provides foundation for enterprise-wide analytics. Slalom architected a multi-zone data
lake to harness and analyze internal and external customer and product data, enabling real-time
analytics and a personalized customer experience.
Financial ServicesA financial services company serving over 16 million
customers nationwide. They pride themselves on being
able to provide a personal touch for their customers,
and the size of their customer base meant they needed
a solution that would be able to integrate large amounts
of traditionally siloed customer and product data.
Enterprise data hub provides
foundation for data-driven culture
A L L I A N C E S
Data architecture and
solution design
Data governance
deployment
Multi-zone data lake
design and buildout
Ingestion and
integration using
metadata-based big
data integration tool
Data discovery
enabled and Tableau
dashboard deployed
Financial Services
INDUSTRY
BIG DATA SERVICES
Big Data Startup Planning
Big Data Governance
Big Data Implementation
Enablement and Adoption
STORC
Value Identification Value Demonstration Value Realization
We think, most of the organizations lack engineering skills required to fully leverage Hadoop ecosystem
and realize the potential of new technologies.
Approach Culture
Organizations are using
traditional source-to-target
approach of acquiring and
integrating data for known
use cases
Mindset has to change from
hoarding and protecting
information to making it
easy to access and use
data as an enterprise asset
Architecture Skills
Usually considered an IT
infrastructure project,
Hadoop is used as a large
file system to dump data
files with limited use and
marginal business value
Majority of the data
professionals (ETL
developers, data analysts)
lack engineering skills
required to fully leverage
Hadoop technologies
Smart data lake should be....
Enterpise Scale Auditable
Governed
SupportedSecured
Multi-Use Support
Extensible
Open SourceStandardized
Designed
Right
Governed for
Adoption
Economical
to Use
Data Lake
DATA SOURCES DATA MANAGEMENT
Data Lake
RAW
Persistence of
source data
Streaming
Files
Databases
EgressAPIs
Standardized,
reconciled, and
quality checked
ENRICHED
Discovery/
Sandbox
DISCOVERY
DATA STORAGE OPTIONS
HDFS, S3
MODELED
Data Governance & Master Data Management
DATA DELIVERY &
CONSUMPTION
BI & REPORTING
MOBILE &
WEB APPS
EDW On Premise
Relational NoSQL
EDW in Cloud
Relational NoSQL
EXTERNAL BUSINESS
PARTNERS
Multi-zone, self-governed data lake to provide secure and flexible data architecture to harness enterprise
data for accelerated speed to insight.
Data Lake
DATA SOURCES HADOOP SOLUTION COMPONENTS
Streaming
Files
Databases
Batch
Streaming
Acquisition and
Ingestion
Transformation
Discovery and
Modeling
RAW ENRICHED DISCOVERY
The architecture implements data pipelines using our purpose-built open source integration APIs
accelerating implementation by 9-12 weeks.
The accelerator enables self-service by allowing data analysts and data SMEs to ingest new data
sources and promote data through the lake with limited to no IT dependencies.
DATA SOURCES
Streaming
Files
Databases
RAW
Business/
Data SME
METADATA MANAGEMENT & CONTROL API
Files
TARGETS
Foundation Migration Optimization
Assess and Prioritize
Applications
Application Analysis
System
Backlog
Optimize SystemsImplement
Security,
Networking, and
Operating Models
Security & Operations
Assessment
Develop Security and
Operating Model
Application Migration Factory
Sprint 1 - n
Workload 1
Workload 2
Workload n
Workload 1
Workload 2
Workload n
Strategy Definition
Outline Desired
Outcomes
Build & Transition Organization
Process Service Model
Org Structure Capabilities
Governance Metrics
Communications, Training, Change Mgmt
Improvement
System Prioritization
& Roadmap
Design, Migrate, Integrate and Validate
Applications
Value
Realization
Continuous Feedback
Transition to
Operations
On-Premise Cloud
Cloud presents an opportunity to transform on premise workloads into purpose driven scalable solutions
Story contact:
P R O J E C T
PEM delivery
methodology was
used to deliver a cost
effective and scalable
solution
Client is exploring
opportunities to
monetize the solution
as an analytics
workbench
The data science
team can leverage
both SAS and R
integration with the
platform for advanced
analytics
Sunset existing
platforms, reducing
licensing and support
maintenance costs
R E S U L T S
Slalom partnered with a Fortune 500 healthcare company to deliver a next generation data platform.
The client’s existing platform could not support increasing data volumes and a growing need for
advanced analytics workloads. The new platform not only addressed these scalability concerns but
also allowed the client to host both structured and unstructured data in near-real time. Most
importantly, this data platform opened doors for new monetization opportunities
Slalom built a next generation Hadoop data platform to
meet the client’s needs. Leveraging the cloud enabled a
quick turnaround time as well as security features ideal for
storing PII and PHI data. Slalom team migrated and
optimized existing data to leverage Hadoop high-
performance features. Slalom also built a near-real time
platform that can ingest HL7 messages from several
hospitals and provide event-driven alerting. Maz Chaudhri
Next Generation Data Platform for
Healthcare Analytics
T E C H N O L O G Y
B A C K G R O U N D
Healthcare
INDUSTRY
BIG DATA SERVICES
Agile Delivery Approach
Big Data Implementation
Story contact:
P R O J E C T
Agile Delivery
Methodology
Real-time data
platform
Self-service
enablement
Up-to-the-minute view
into the operations of
over 6,000 restaurant
locations nation-wide.
Ability to monitor KPIs
and react with
targeted efforts to
boost sales exactly
where it is needed.
R E S U L T S
Our client in the fast-casual food industry was having widespread challenges accurately capturing and measuring key business
metrics. Due to inconsistent data integrity in the nightly batch process, executives and leaders were growing skeptical of the
reliability of reporting and analytics built from the data. Leaders were clamoring for timely visibility to better, cleaner data.
The Slalom team served as Scrum Master, Product
Owner and Analyst during the architecture and delivery
of the AWS-based Cloudera platform. Using a Kafka-
based publish-subscribe architecture, each restaurant
location in addition to the online ordering platform was
set up to stream data feeds to the unified Cloud
platform.
Maz Chaudhri
B A C K G R O U N D
T E C H N O L O G Y
Food Service
INDUSTRY
BIG DATA SERVICES
Agile Delivery Approach
Big Data Startup Planning
Platform Evaluation & Selection
Big Data Implementation
Story contact:
P R O J E C T
A scalable and
flexible Big Data
Platform
A universal XML
ingestion framework
HDFS Data lake that
ingests and persists
all data from source
system
Allowed the client to
sunset a reporting
product that saved
over $1MM annually
in support
maintenance cost
Qlik BI & Operational
reports utilizing
Hadoop as the
backend
R E S U L T S
A top 10 Pharmaceutical company, and top 150 Fortune 500, sought to implement a next generation
modern data platform. The platform needed to not only provide end to end supply chain visibility, but
also be flexible and scalable to handle a heavy volume of serialized data. The client also wanted to
establish a data lake so as to be able to predict and prescribe their inventory and shipments to better
serve their customers.
Slalom utilized AWS and Cloudera Hadoop to build this
next generation data platform. The data platform gave
visibility to inventory levels to help drive the
development of inventory optimization strategies and
integrated multiple disparate sources to give end to end
shipment visibility of the client’s supply chain.
Pharmaceuticals
INDUSTRY
Next Generation Data Platform &
Supply Chain visibility
A L L I A N C E S
B A C K G R O U N D
BIG DATA SERVICES
Agile Delivery Approach
Big Data Implementation
Time to Value Proven Approach and
Experience
Pre-Built Accelerators
AGILE ENGINEERING
APPROACH
Start small, deliver value and
evolve your Big Data program
BIG DATA STARTUP
PLANNING
Pre-defined epics and stories
for big data startup
DATA GOVERNANCE in
a BOX
Multi-faceted data governance
deployment and tools
READINESS AND ADOPTION
Org readiness and change
strategy and enablement
PLATFORM SELECTION
Best practices-based
evaluation toolset
BIG DATA INTEGRATION
TOOL
Open-source meta-data driven
integration API
1 32

More Related Content

What's hot

Self-service Big Data Analytics on Microsoft Azure
Self-service Big Data Analytics on Microsoft AzureSelf-service Big Data Analytics on Microsoft Azure
Self-service Big Data Analytics on Microsoft AzureCloudera, Inc.
 
The Vision & Challenge of Applied Machine Learning
The Vision & Challenge of Applied Machine LearningThe Vision & Challenge of Applied Machine Learning
The Vision & Challenge of Applied Machine LearningCloudera, Inc.
 
Customer Best Practices: Optimizing Cloudera on AWS
Customer Best Practices: Optimizing Cloudera on AWSCustomer Best Practices: Optimizing Cloudera on AWS
Customer Best Practices: Optimizing Cloudera on AWSCloudera, Inc.
 
Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Cloudera, Inc.
 
How to Build Multi-disciplinary Analytics Applications on a Shared Data Platform
How to Build Multi-disciplinary Analytics Applications on a Shared Data PlatformHow to Build Multi-disciplinary Analytics Applications on a Shared Data Platform
How to Build Multi-disciplinary Analytics Applications on a Shared Data PlatformCloudera, Inc.
 
Kudu Forrester Webinar
Kudu Forrester WebinarKudu Forrester Webinar
Kudu Forrester WebinarCloudera, Inc.
 
How to Build Continuous Ingestion for the Internet of Things
How to Build Continuous Ingestion for the Internet of ThingsHow to Build Continuous Ingestion for the Internet of Things
How to Build Continuous Ingestion for the Internet of ThingsCloudera, Inc.
 
Big data journey to the cloud 5.30.18 asher bartch
Big data journey to the cloud 5.30.18   asher bartchBig data journey to the cloud 5.30.18   asher bartch
Big data journey to the cloud 5.30.18 asher bartchCloudera, Inc.
 
Cloudera Altus: Big Data in the Cloud Made Easy
Cloudera Altus: Big Data in the Cloud Made EasyCloudera Altus: Big Data in the Cloud Made Easy
Cloudera Altus: Big Data in the Cloud Made EasyCloudera, Inc.
 
Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Cloudera, Inc.
 
Cloudera Fast Forward Labs: The Vision and the Challenge of Applied Machine L...
Cloudera Fast Forward Labs: The Vision and the Challenge of Applied Machine L...Cloudera Fast Forward Labs: The Vision and the Challenge of Applied Machine L...
Cloudera Fast Forward Labs: The Vision and the Challenge of Applied Machine L...Cloudera, Inc.
 
Part 1: Cloudera’s Analytic Database: BI & SQL Analytics in a Hybrid Cloud World
Part 1: Cloudera’s Analytic Database: BI & SQL Analytics in a Hybrid Cloud WorldPart 1: Cloudera’s Analytic Database: BI & SQL Analytics in a Hybrid Cloud World
Part 1: Cloudera’s Analytic Database: BI & SQL Analytics in a Hybrid Cloud WorldCloudera, Inc.
 
How Data Drives Business at Choice Hotels
How Data Drives Business at Choice HotelsHow Data Drives Business at Choice Hotels
How Data Drives Business at Choice HotelsCloudera, Inc.
 
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Cloudera, Inc.
 
A Community Approach to Fighting Cyber Threats
A Community Approach to Fighting Cyber ThreatsA Community Approach to Fighting Cyber Threats
A Community Approach to Fighting Cyber ThreatsCloudera, Inc.
 
Driving Better Products with Customer Intelligence

Driving Better Products with Customer Intelligence
Driving Better Products with Customer Intelligence

Driving Better Products with Customer Intelligence
Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Cloudera, Inc.
 
Standing Up an Effective Enterprise Data Hub -- Technology and Beyond
Standing Up an Effective Enterprise Data Hub -- Technology and BeyondStanding Up an Effective Enterprise Data Hub -- Technology and Beyond
Standing Up an Effective Enterprise Data Hub -- Technology and BeyondCloudera, Inc.
 
Introducing Workload XM 8.7.18
Introducing Workload XM 8.7.18Introducing Workload XM 8.7.18
Introducing Workload XM 8.7.18Cloudera, Inc.
 

What's hot (20)

Self-service Big Data Analytics on Microsoft Azure
Self-service Big Data Analytics on Microsoft AzureSelf-service Big Data Analytics on Microsoft Azure
Self-service Big Data Analytics on Microsoft Azure
 
The Vision & Challenge of Applied Machine Learning
The Vision & Challenge of Applied Machine LearningThe Vision & Challenge of Applied Machine Learning
The Vision & Challenge of Applied Machine Learning
 
Customer Best Practices: Optimizing Cloudera on AWS
Customer Best Practices: Optimizing Cloudera on AWSCustomer Best Practices: Optimizing Cloudera on AWS
Customer Best Practices: Optimizing Cloudera on AWS
 
Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19
 
How to Build Multi-disciplinary Analytics Applications on a Shared Data Platform
How to Build Multi-disciplinary Analytics Applications on a Shared Data PlatformHow to Build Multi-disciplinary Analytics Applications on a Shared Data Platform
How to Build Multi-disciplinary Analytics Applications on a Shared Data Platform
 
Kudu Forrester Webinar
Kudu Forrester WebinarKudu Forrester Webinar
Kudu Forrester Webinar
 
How to Build Continuous Ingestion for the Internet of Things
How to Build Continuous Ingestion for the Internet of ThingsHow to Build Continuous Ingestion for the Internet of Things
How to Build Continuous Ingestion for the Internet of Things
 
Big data journey to the cloud 5.30.18 asher bartch
Big data journey to the cloud 5.30.18   asher bartchBig data journey to the cloud 5.30.18   asher bartch
Big data journey to the cloud 5.30.18 asher bartch
 
Cloudera Altus: Big Data in the Cloud Made Easy
Cloudera Altus: Big Data in the Cloud Made EasyCloudera Altus: Big Data in the Cloud Made Easy
Cloudera Altus: Big Data in the Cloud Made Easy
 
Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18
 
Cloudera Fast Forward Labs: The Vision and the Challenge of Applied Machine L...
Cloudera Fast Forward Labs: The Vision and the Challenge of Applied Machine L...Cloudera Fast Forward Labs: The Vision and the Challenge of Applied Machine L...
Cloudera Fast Forward Labs: The Vision and the Challenge of Applied Machine L...
 
Cloudera SDX
Cloudera SDXCloudera SDX
Cloudera SDX
 
Part 1: Cloudera’s Analytic Database: BI & SQL Analytics in a Hybrid Cloud World
Part 1: Cloudera’s Analytic Database: BI & SQL Analytics in a Hybrid Cloud WorldPart 1: Cloudera’s Analytic Database: BI & SQL Analytics in a Hybrid Cloud World
Part 1: Cloudera’s Analytic Database: BI & SQL Analytics in a Hybrid Cloud World
 
How Data Drives Business at Choice Hotels
How Data Drives Business at Choice HotelsHow Data Drives Business at Choice Hotels
How Data Drives Business at Choice Hotels
 
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
 
A Community Approach to Fighting Cyber Threats
A Community Approach to Fighting Cyber ThreatsA Community Approach to Fighting Cyber Threats
A Community Approach to Fighting Cyber Threats
 
Driving Better Products with Customer Intelligence

Driving Better Products with Customer Intelligence
Driving Better Products with Customer Intelligence

Driving Better Products with Customer Intelligence

 
Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1
 
Standing Up an Effective Enterprise Data Hub -- Technology and Beyond
Standing Up an Effective Enterprise Data Hub -- Technology and BeyondStanding Up an Effective Enterprise Data Hub -- Technology and Beyond
Standing Up an Effective Enterprise Data Hub -- Technology and Beyond
 
Introducing Workload XM 8.7.18
Introducing Workload XM 8.7.18Introducing Workload XM 8.7.18
Introducing Workload XM 8.7.18
 

Similar to Big data journey to the cloud maz chaudhri 5.30.18

Is your big data journey stalling? Take the Leap with Capgemini and Cloudera
Is your big data journey stalling? Take the Leap with Capgemini and ClouderaIs your big data journey stalling? Take the Leap with Capgemini and Cloudera
Is your big data journey stalling? Take the Leap with Capgemini and ClouderaCloudera, Inc.
 
What's New in Pentaho 7.0?
What's New in Pentaho 7.0?What's New in Pentaho 7.0?
What's New in Pentaho 7.0?Xpand IT
 
Keyrus US Information
Keyrus US InformationKeyrus US Information
Keyrus US InformationJulian Tong
 
Data lake benefits
Data lake benefitsData lake benefits
Data lake benefitsRicky Barron
 
8.17.11 big data and hadoop with informatica slideshare
8.17.11 big data and hadoop with informatica slideshare8.17.11 big data and hadoop with informatica slideshare
8.17.11 big data and hadoop with informatica slideshareJulianna DeLua
 
WP_Impetus_2016_Guide_to_Modernize_Your_Enterprise_Data_Warehouse_JRoberts
WP_Impetus_2016_Guide_to_Modernize_Your_Enterprise_Data_Warehouse_JRobertsWP_Impetus_2016_Guide_to_Modernize_Your_Enterprise_Data_Warehouse_JRoberts
WP_Impetus_2016_Guide_to_Modernize_Your_Enterprise_Data_Warehouse_JRobertsJane Roberts
 
Enabling Next Gen Analytics with Azure Data Lake and StreamSets
Enabling Next Gen Analytics with Azure Data Lake and StreamSetsEnabling Next Gen Analytics with Azure Data Lake and StreamSets
Enabling Next Gen Analytics with Azure Data Lake and StreamSetsStreamsets Inc.
 
Derfor skal du bruge en DataLake
Derfor skal du bruge en DataLakeDerfor skal du bruge en DataLake
Derfor skal du bruge en DataLakeMicrosoft
 
The Future of Apache Hadoop an Enterprise Architecture View
The Future of Apache Hadoop an Enterprise Architecture ViewThe Future of Apache Hadoop an Enterprise Architecture View
The Future of Apache Hadoop an Enterprise Architecture ViewDataWorks Summit/Hadoop Summit
 
Apache Hadoop Summit 2016: The Future of Apache Hadoop an Enterprise Architec...
Apache Hadoop Summit 2016: The Future of Apache Hadoop an Enterprise Architec...Apache Hadoop Summit 2016: The Future of Apache Hadoop an Enterprise Architec...
Apache Hadoop Summit 2016: The Future of Apache Hadoop an Enterprise Architec...PwC
 
BAR360 open data platform presentation at DAMA, Sydney
BAR360 open data platform presentation at DAMA, SydneyBAR360 open data platform presentation at DAMA, Sydney
BAR360 open data platform presentation at DAMA, SydneySai Paravastu
 
Big Data's Impact on the Enterprise
Big Data's Impact on the EnterpriseBig Data's Impact on the Enterprise
Big Data's Impact on the EnterpriseCaserta
 
Democratized Data & Analytics for the Cloud​
Democratized Data & Analytics for the Cloud​Democratized Data & Analytics for the Cloud​
Democratized Data & Analytics for the Cloud​Precisely
 
Data and Application Modernization in the Age of the Cloud
Data and Application Modernization in the Age of the CloudData and Application Modernization in the Age of the Cloud
Data and Application Modernization in the Age of the Cloudredmondpulver
 
Getting down to business on Big Data analytics
Getting down to business on Big Data analyticsGetting down to business on Big Data analytics
Getting down to business on Big Data analyticsThe Marketing Distillery
 
Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...
Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...
Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...Hortonworks
 
Big Data Solutions on Cloud – The Way Forward by Kiththi Perera SLT
Big Data Solutions on Cloud – The Way Forward by Kiththi Perera SLTBig Data Solutions on Cloud – The Way Forward by Kiththi Perera SLT
Big Data Solutions on Cloud – The Way Forward by Kiththi Perera SLTKiththi Perera
 
Big data solutions on cloud – the way forward
Big data solutions on cloud – the way forwardBig data solutions on cloud – the way forward
Big data solutions on cloud – the way forwardKiththi Perera
 
BDW Chicago 2016 - Ramu Kalvakuntla, Sr. Principal - Technical - Big Data Pra...
BDW Chicago 2016 - Ramu Kalvakuntla, Sr. Principal - Technical - Big Data Pra...BDW Chicago 2016 - Ramu Kalvakuntla, Sr. Principal - Technical - Big Data Pra...
BDW Chicago 2016 - Ramu Kalvakuntla, Sr. Principal - Technical - Big Data Pra...Big Data Week
 

Similar to Big data journey to the cloud maz chaudhri 5.30.18 (20)

Is your big data journey stalling? Take the Leap with Capgemini and Cloudera
Is your big data journey stalling? Take the Leap with Capgemini and ClouderaIs your big data journey stalling? Take the Leap with Capgemini and Cloudera
Is your big data journey stalling? Take the Leap with Capgemini and Cloudera
 
What's New in Pentaho 7.0?
What's New in Pentaho 7.0?What's New in Pentaho 7.0?
What's New in Pentaho 7.0?
 
Keyrus US Information
Keyrus US InformationKeyrus US Information
Keyrus US Information
 
Keyrus US Information
Keyrus US InformationKeyrus US Information
Keyrus US Information
 
Data lake benefits
Data lake benefitsData lake benefits
Data lake benefits
 
8.17.11 big data and hadoop with informatica slideshare
8.17.11 big data and hadoop with informatica slideshare8.17.11 big data and hadoop with informatica slideshare
8.17.11 big data and hadoop with informatica slideshare
 
WP_Impetus_2016_Guide_to_Modernize_Your_Enterprise_Data_Warehouse_JRoberts
WP_Impetus_2016_Guide_to_Modernize_Your_Enterprise_Data_Warehouse_JRobertsWP_Impetus_2016_Guide_to_Modernize_Your_Enterprise_Data_Warehouse_JRoberts
WP_Impetus_2016_Guide_to_Modernize_Your_Enterprise_Data_Warehouse_JRoberts
 
Enabling Next Gen Analytics with Azure Data Lake and StreamSets
Enabling Next Gen Analytics with Azure Data Lake and StreamSetsEnabling Next Gen Analytics with Azure Data Lake and StreamSets
Enabling Next Gen Analytics with Azure Data Lake and StreamSets
 
Derfor skal du bruge en DataLake
Derfor skal du bruge en DataLakeDerfor skal du bruge en DataLake
Derfor skal du bruge en DataLake
 
The Future of Apache Hadoop an Enterprise Architecture View
The Future of Apache Hadoop an Enterprise Architecture ViewThe Future of Apache Hadoop an Enterprise Architecture View
The Future of Apache Hadoop an Enterprise Architecture View
 
Apache Hadoop Summit 2016: The Future of Apache Hadoop an Enterprise Architec...
Apache Hadoop Summit 2016: The Future of Apache Hadoop an Enterprise Architec...Apache Hadoop Summit 2016: The Future of Apache Hadoop an Enterprise Architec...
Apache Hadoop Summit 2016: The Future of Apache Hadoop an Enterprise Architec...
 
BAR360 open data platform presentation at DAMA, Sydney
BAR360 open data platform presentation at DAMA, SydneyBAR360 open data platform presentation at DAMA, Sydney
BAR360 open data platform presentation at DAMA, Sydney
 
Big Data's Impact on the Enterprise
Big Data's Impact on the EnterpriseBig Data's Impact on the Enterprise
Big Data's Impact on the Enterprise
 
Democratized Data & Analytics for the Cloud​
Democratized Data & Analytics for the Cloud​Democratized Data & Analytics for the Cloud​
Democratized Data & Analytics for the Cloud​
 
Data and Application Modernization in the Age of the Cloud
Data and Application Modernization in the Age of the CloudData and Application Modernization in the Age of the Cloud
Data and Application Modernization in the Age of the Cloud
 
Getting down to business on Big Data analytics
Getting down to business on Big Data analyticsGetting down to business on Big Data analytics
Getting down to business on Big Data analytics
 
Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...
Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...
Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...
 
Big Data Solutions on Cloud – The Way Forward by Kiththi Perera SLT
Big Data Solutions on Cloud – The Way Forward by Kiththi Perera SLTBig Data Solutions on Cloud – The Way Forward by Kiththi Perera SLT
Big Data Solutions on Cloud – The Way Forward by Kiththi Perera SLT
 
Big data solutions on cloud – the way forward
Big data solutions on cloud – the way forwardBig data solutions on cloud – the way forward
Big data solutions on cloud – the way forward
 
BDW Chicago 2016 - Ramu Kalvakuntla, Sr. Principal - Technical - Big Data Pra...
BDW Chicago 2016 - Ramu Kalvakuntla, Sr. Principal - Technical - Big Data Pra...BDW Chicago 2016 - Ramu Kalvakuntla, Sr. Principal - Technical - Big Data Pra...
BDW Chicago 2016 - Ramu Kalvakuntla, Sr. Principal - Technical - Big Data Pra...
 

More from Cloudera, Inc.

Partner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxPartner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxCloudera, Inc.
 
Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera, Inc.
 
2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards FinalistsCloudera, Inc.
 
Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Cloudera, Inc.
 
Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Cloudera, Inc.
 
Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Cloudera, Inc.
 
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Cloudera, Inc.
 
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Cloudera, Inc.
 
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Cloudera, Inc.
 
Extending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformExtending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformCloudera, Inc.
 
Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Cloudera, Inc.
 
Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Cloudera, Inc.
 
Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Cloudera, Inc.
 
Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Cloudera, Inc.
 
Spark and Deep Learning Frameworks at Scale 7.19.18
Spark and Deep Learning Frameworks at Scale 7.19.18Spark and Deep Learning Frameworks at Scale 7.19.18
Spark and Deep Learning Frameworks at Scale 7.19.18Cloudera, Inc.
 
Cloud Data Warehousing with Cloudera Altus 7.24.18
Cloud Data Warehousing with Cloudera Altus 7.24.18Cloud Data Warehousing with Cloudera Altus 7.24.18
Cloud Data Warehousing with Cloudera Altus 7.24.18Cloudera, Inc.
 
How Cloudera SDX can aid GDPR compliance
How Cloudera SDX can aid GDPR complianceHow Cloudera SDX can aid GDPR compliance
How Cloudera SDX can aid GDPR complianceCloudera, Inc.
 
When SAP alone is not enough
When SAP alone is not enoughWhen SAP alone is not enough
When SAP alone is not enoughCloudera, Inc.
 

More from Cloudera, Inc. (20)

Partner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxPartner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptx
 
Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists
 
2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists
 
Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019
 
Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19
 
Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19
 
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19
 
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
 
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
 
Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3
 
Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2
 
Extending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformExtending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the Platform
 
Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18
 
Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360
 
Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18
 
Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18
 
Spark and Deep Learning Frameworks at Scale 7.19.18
Spark and Deep Learning Frameworks at Scale 7.19.18Spark and Deep Learning Frameworks at Scale 7.19.18
Spark and Deep Learning Frameworks at Scale 7.19.18
 
Cloud Data Warehousing with Cloudera Altus 7.24.18
Cloud Data Warehousing with Cloudera Altus 7.24.18Cloud Data Warehousing with Cloudera Altus 7.24.18
Cloud Data Warehousing with Cloudera Altus 7.24.18
 
How Cloudera SDX can aid GDPR compliance
How Cloudera SDX can aid GDPR complianceHow Cloudera SDX can aid GDPR compliance
How Cloudera SDX can aid GDPR compliance
 
When SAP alone is not enough
When SAP alone is not enoughWhen SAP alone is not enough
When SAP alone is not enough
 

Recently uploaded

NIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 WorkshopNIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 WorkshopBachir Benyammi
 
GenAI and AI GCC State of AI_Object Automation Inc
GenAI and AI GCC State of AI_Object Automation IncGenAI and AI GCC State of AI_Object Automation Inc
GenAI and AI GCC State of AI_Object Automation IncObject Automation
 
Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Commit University
 
Digital magic. A small project for controlling smart light bulbs.
Digital magic. A small project for controlling smart light bulbs.Digital magic. A small project for controlling smart light bulbs.
Digital magic. A small project for controlling smart light bulbs.francesco barbera
 
Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxIntroduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxMatsuo Lab
 
Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024D Cloud Solutions
 
UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7DianaGray10
 
Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.YounusS2
 
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdfIaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdfDaniel Santiago Silva Capera
 
Cloud Revolution: Exploring the New Wave of Serverless Spatial Data
Cloud Revolution: Exploring the New Wave of Serverless Spatial DataCloud Revolution: Exploring the New Wave of Serverless Spatial Data
Cloud Revolution: Exploring the New Wave of Serverless Spatial DataSafe Software
 
Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™Adtran
 
UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6DianaGray10
 
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online CollaborationCOMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online Collaborationbruanjhuli
 
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019IES VE
 
Empowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintEmpowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintMahmoud Rabie
 
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve DecarbonizationUsing IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve DecarbonizationIES VE
 
Comparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioComparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioChristian Posta
 
Computer 10: Lesson 10 - Online Crimes and Hazards
Computer 10: Lesson 10 - Online Crimes and HazardsComputer 10: Lesson 10 - Online Crimes and Hazards
Computer 10: Lesson 10 - Online Crimes and HazardsSeth Reyes
 
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...Aggregage
 
Things you didn't know you can use in your Salesforce
Things you didn't know you can use in your SalesforceThings you didn't know you can use in your Salesforce
Things you didn't know you can use in your SalesforceMartin Humpolec
 

Recently uploaded (20)

NIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 WorkshopNIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 Workshop
 
GenAI and AI GCC State of AI_Object Automation Inc
GenAI and AI GCC State of AI_Object Automation IncGenAI and AI GCC State of AI_Object Automation Inc
GenAI and AI GCC State of AI_Object Automation Inc
 
Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)
 
Digital magic. A small project for controlling smart light bulbs.
Digital magic. A small project for controlling smart light bulbs.Digital magic. A small project for controlling smart light bulbs.
Digital magic. A small project for controlling smart light bulbs.
 
Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxIntroduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptx
 
Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024
 
UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7
 
Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.Basic Building Blocks of Internet of Things.
Basic Building Blocks of Internet of Things.
 
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdfIaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
 
Cloud Revolution: Exploring the New Wave of Serverless Spatial Data
Cloud Revolution: Exploring the New Wave of Serverless Spatial DataCloud Revolution: Exploring the New Wave of Serverless Spatial Data
Cloud Revolution: Exploring the New Wave of Serverless Spatial Data
 
Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™
 
UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6
 
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online CollaborationCOMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
 
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
 
Empowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintEmpowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership Blueprint
 
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve DecarbonizationUsing IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
Using IESVE for Loads, Sizing and Heat Pump Modeling to Achieve Decarbonization
 
Comparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioComparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and Istio
 
Computer 10: Lesson 10 - Online Crimes and Hazards
Computer 10: Lesson 10 - Online Crimes and HazardsComputer 10: Lesson 10 - Online Crimes and Hazards
Computer 10: Lesson 10 - Online Crimes and Hazards
 
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
 
Things you didn't know you can use in your Salesforce
Things you didn't know you can use in your SalesforceThings you didn't know you can use in your Salesforce
Things you didn't know you can use in your Salesforce
 

Big data journey to the cloud maz chaudhri 5.30.18

  • 1. Building Smart Data Lake Slalom Approach May, 2018
  • 2. "Better to bet on cloud providers for infrastructure, Cloudera for data, analytics and security fabric, and leave the rest to the ecosystem"
  • 4. Value Identification Value Demonstration Value Realization Exploring big data and working to identify a business case Past the business case and need to demonstrate value for broader adoption Implemented early use cases with limited value and lacking traction
  • 5. It takes a long time We don’t have big data I’m not sure how to start All our data is structured data We already have a data warehouse We don’t have business case for it It is hard to find the required skillsets No one using our Hadoop data Hadoop ecosystem is overwhelming Business users are not happy Big data has come long way and the enterprises are at different phase of their journey. However, broader adoption of the computation ecosystem is still in its early stages. It is too expensive We yet to realize the benefits it promises Value Identification Value Demonstration Value Realization Exploring big data and working to identify a business case Past the business case and need to demonstrate value for broader adoption Implemented early use cases with limited value and lacking traction
  • 6. Value Identification Value Demonstration Value Realization
  • 7. Traditional architectures use rigid data models, costly platforms, resource-intensive ETL and lack support for new use cases. Rigid Data Architecture Early binding to the pre-defined schema makes it inflexible and costly Flexible Architecture Data is ingested and transformed without prior knowledge of target schema Costly Infrastructure and Solution Data duplicated across costly platforms 50-70% spend on acquisition and integration Simplified Infrastructure and Solution Flexible on-premise and cloud infrastructure API-based pipelines automate data ingestion Lacks Support for “New” Use Cases Data silo’s impede real-time processing required to support modern use cases Best Suited for “New” Use Cases Centralized hub for heterogeneous data and variety of tools enable real-time analytics Declining Talent Pool The new talent lacks excitement for the traditional technologies and tools Growing Talent Pool Elevated interest in data engineering and data science work Traditional Modern
  • 8. Requires army of costly professionals to support longer delivery cycles and brittle data processes. Slower Speed-to-Market Longer delivery lifecycle involving too many project phases Accelerated Speed-to-Market Separation of data management from discovery and analytics accelerates solution delivery Heavy Reliance on Costly IT Resources Point-to-point ETL and early binding data model requires IT resources for any data changes Enabled Business Self-Service Centralized data enables “data wrangling” and analytics by business users and data scientists Army of Data Professionals Streamlined Data Roles Traditional Modern
  • 9. Value Identification Value Demonstration Value Realization
  • 10. Select outcome-based high impact use case(s) and deliver minimal viable product (MVP) to demonstrate immediate success. .
  • 11. Story contact: C L I E N T S O L U T I O N S I N D U S T R Y The client had a vision to drive improved customer experience and engagement through personalized marketing campaigns and needed an on-premise solution that enables the initial use cases and provides foundation for enterprise-wide analytics. Slalom architected a multi-zone data lake to harness and analyze internal and external customer and product data, enabling real-time analytics and a personalized customer experience. Financial ServicesA financial services company serving over 16 million customers nationwide. They pride themselves on being able to provide a personal touch for their customers, and the size of their customer base meant they needed a solution that would be able to integrate large amounts of traditionally siloed customer and product data. Enterprise data hub provides foundation for data-driven culture A L L I A N C E S Data architecture and solution design Data governance deployment Multi-zone data lake design and buildout Ingestion and integration using metadata-based big data integration tool Data discovery enabled and Tableau dashboard deployed Financial Services INDUSTRY BIG DATA SERVICES Big Data Startup Planning Big Data Governance Big Data Implementation Enablement and Adoption STORC
  • 12. Value Identification Value Demonstration Value Realization
  • 13. We think, most of the organizations lack engineering skills required to fully leverage Hadoop ecosystem and realize the potential of new technologies. Approach Culture Organizations are using traditional source-to-target approach of acquiring and integrating data for known use cases Mindset has to change from hoarding and protecting information to making it easy to access and use data as an enterprise asset Architecture Skills Usually considered an IT infrastructure project, Hadoop is used as a large file system to dump data files with limited use and marginal business value Majority of the data professionals (ETL developers, data analysts) lack engineering skills required to fully leverage Hadoop technologies
  • 14. Smart data lake should be.... Enterpise Scale Auditable Governed SupportedSecured Multi-Use Support Extensible Open SourceStandardized Designed Right Governed for Adoption Economical to Use
  • 15. Data Lake DATA SOURCES DATA MANAGEMENT Data Lake RAW Persistence of source data Streaming Files Databases EgressAPIs Standardized, reconciled, and quality checked ENRICHED Discovery/ Sandbox DISCOVERY DATA STORAGE OPTIONS HDFS, S3 MODELED Data Governance & Master Data Management DATA DELIVERY & CONSUMPTION BI & REPORTING MOBILE & WEB APPS EDW On Premise Relational NoSQL EDW in Cloud Relational NoSQL EXTERNAL BUSINESS PARTNERS Multi-zone, self-governed data lake to provide secure and flexible data architecture to harness enterprise data for accelerated speed to insight.
  • 16. Data Lake DATA SOURCES HADOOP SOLUTION COMPONENTS Streaming Files Databases Batch Streaming Acquisition and Ingestion Transformation Discovery and Modeling RAW ENRICHED DISCOVERY The architecture implements data pipelines using our purpose-built open source integration APIs accelerating implementation by 9-12 weeks.
  • 17. The accelerator enables self-service by allowing data analysts and data SMEs to ingest new data sources and promote data through the lake with limited to no IT dependencies. DATA SOURCES Streaming Files Databases RAW Business/ Data SME METADATA MANAGEMENT & CONTROL API Files TARGETS
  • 18. Foundation Migration Optimization Assess and Prioritize Applications Application Analysis System Backlog Optimize SystemsImplement Security, Networking, and Operating Models Security & Operations Assessment Develop Security and Operating Model Application Migration Factory Sprint 1 - n Workload 1 Workload 2 Workload n Workload 1 Workload 2 Workload n Strategy Definition Outline Desired Outcomes Build & Transition Organization Process Service Model Org Structure Capabilities Governance Metrics Communications, Training, Change Mgmt Improvement System Prioritization & Roadmap Design, Migrate, Integrate and Validate Applications Value Realization Continuous Feedback Transition to Operations On-Premise Cloud Cloud presents an opportunity to transform on premise workloads into purpose driven scalable solutions
  • 19. Story contact: P R O J E C T PEM delivery methodology was used to deliver a cost effective and scalable solution Client is exploring opportunities to monetize the solution as an analytics workbench The data science team can leverage both SAS and R integration with the platform for advanced analytics Sunset existing platforms, reducing licensing and support maintenance costs R E S U L T S Slalom partnered with a Fortune 500 healthcare company to deliver a next generation data platform. The client’s existing platform could not support increasing data volumes and a growing need for advanced analytics workloads. The new platform not only addressed these scalability concerns but also allowed the client to host both structured and unstructured data in near-real time. Most importantly, this data platform opened doors for new monetization opportunities Slalom built a next generation Hadoop data platform to meet the client’s needs. Leveraging the cloud enabled a quick turnaround time as well as security features ideal for storing PII and PHI data. Slalom team migrated and optimized existing data to leverage Hadoop high- performance features. Slalom also built a near-real time platform that can ingest HL7 messages from several hospitals and provide event-driven alerting. Maz Chaudhri Next Generation Data Platform for Healthcare Analytics T E C H N O L O G Y B A C K G R O U N D Healthcare INDUSTRY BIG DATA SERVICES Agile Delivery Approach Big Data Implementation
  • 20. Story contact: P R O J E C T Agile Delivery Methodology Real-time data platform Self-service enablement Up-to-the-minute view into the operations of over 6,000 restaurant locations nation-wide. Ability to monitor KPIs and react with targeted efforts to boost sales exactly where it is needed. R E S U L T S Our client in the fast-casual food industry was having widespread challenges accurately capturing and measuring key business metrics. Due to inconsistent data integrity in the nightly batch process, executives and leaders were growing skeptical of the reliability of reporting and analytics built from the data. Leaders were clamoring for timely visibility to better, cleaner data. The Slalom team served as Scrum Master, Product Owner and Analyst during the architecture and delivery of the AWS-based Cloudera platform. Using a Kafka- based publish-subscribe architecture, each restaurant location in addition to the online ordering platform was set up to stream data feeds to the unified Cloud platform. Maz Chaudhri B A C K G R O U N D T E C H N O L O G Y Food Service INDUSTRY BIG DATA SERVICES Agile Delivery Approach Big Data Startup Planning Platform Evaluation & Selection Big Data Implementation
  • 21. Story contact: P R O J E C T A scalable and flexible Big Data Platform A universal XML ingestion framework HDFS Data lake that ingests and persists all data from source system Allowed the client to sunset a reporting product that saved over $1MM annually in support maintenance cost Qlik BI & Operational reports utilizing Hadoop as the backend R E S U L T S A top 10 Pharmaceutical company, and top 150 Fortune 500, sought to implement a next generation modern data platform. The platform needed to not only provide end to end supply chain visibility, but also be flexible and scalable to handle a heavy volume of serialized data. The client also wanted to establish a data lake so as to be able to predict and prescribe their inventory and shipments to better serve their customers. Slalom utilized AWS and Cloudera Hadoop to build this next generation data platform. The data platform gave visibility to inventory levels to help drive the development of inventory optimization strategies and integrated multiple disparate sources to give end to end shipment visibility of the client’s supply chain. Pharmaceuticals INDUSTRY Next Generation Data Platform & Supply Chain visibility A L L I A N C E S B A C K G R O U N D BIG DATA SERVICES Agile Delivery Approach Big Data Implementation
  • 22. Time to Value Proven Approach and Experience Pre-Built Accelerators AGILE ENGINEERING APPROACH Start small, deliver value and evolve your Big Data program BIG DATA STARTUP PLANNING Pre-defined epics and stories for big data startup DATA GOVERNANCE in a BOX Multi-faceted data governance deployment and tools READINESS AND ADOPTION Org readiness and change strategy and enablement PLATFORM SELECTION Best practices-based evaluation toolset BIG DATA INTEGRATION TOOL Open-source meta-data driven integration API 1 32

Editor's Notes

  1. Thanks Rohit. Asher talked about Cloudera as a data platform and Rohit walked us through how you could do more with the data platform in the cloud. In next 30 min, Navendu and I will talk about how to design and build a smart data lake.
  2. In earlier discussion, you heard….however…
  3. …the ecosystem is complex and continue to grow. You need an experienced and knowledgeable implementation partner to do it right.
  4. You could be at different stages of your big data journey…
  5. …based on what we hear from our clients, we define the journey in three stages: Value Identification Value Demonstration Value Realization
  6. At this stage you are educating key stakeholders using various concepts with the goals of identifying impactful use case(s)
  7. Multi-Use Support Support multiple use cases or services: data analytics, data delivery, reporting, People: Platform Ownership Skill and Role Gap Analysis Adoption Plan & Roadmap Learning Plan Sustainability Plan Process – Operating Model Onboarding Processes Support Model Team Ownership Monitoring & Measurement Maintenance & Promotion Processes Implementation Roadmap Technology: Platform Utilization Use case coverage Tool support Information: Governance & Controls Governance Process Data Sharing Certification
  8. Accelerate Ingestion: separation of data ingestion from discovery and use of metadata-based API accelerate the most time-consuming and resource-intensive part of any data management projects Standardize Data: Data is standardized with common transformations ensuring consistent use of data for discovery and modeling Automate Transformation: As new transformations are identified as “common”, they are applied in Transformation zone