Building Smart Data Lake
Slalom Approach
May, 2018
"Better to bet on cloud providers
for infrastructure, Cloudera for data,
analytics and security fabric, and
leave the rest to the ecosystem"
The Ecosystem
Value Identification: Exploring big data and working to identify a business case
Value Demonstration: Past the business case and need to demonstrate value for broader adoption
Value Realization: Implemented early use cases with limited value and lacking traction
Big data has come a long way, and enterprises are at different phases of their journey. However, broader adoption of the computation ecosystem is still in its early stages.
It takes a long time
We don’t have big data
I’m not sure how to start
All our data is structured data
We already have a data warehouse
We don’t have a business case for it
It is hard to find the required skillsets
No one is using our Hadoop data
The Hadoop ecosystem is overwhelming
Business users are not happy
It is too expensive
We have yet to realize the benefits it promises
Traditional architectures use rigid data models, costly platforms, and resource-intensive ETL, and they lack support for new use cases.
Traditional: Rigid Data Architecture
Early binding to the pre-defined schema makes it inflexible and costly
Modern: Flexible Architecture
Data is ingested and transformed without prior knowledge of target schema
Traditional: Costly Infrastructure and Solution
Data duplicated across costly platforms; 50-70% of spend goes to acquisition and integration
Modern: Simplified Infrastructure and Solution
Flexible on-premise and cloud infrastructure; API-based pipelines automate data ingestion
Traditional: Lacks Support for “New” Use Cases
Data silos impede real-time processing required to support modern use cases
Modern: Best Suited for “New” Use Cases
Centralized hub for heterogeneous data and variety of tools enable real-time analytics
Traditional: Declining Talent Pool
The new talent lacks excitement for the traditional technologies and tools
Modern: Growing Talent Pool
Elevated interest in data engineering and data science work
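The difference between early binding and ingesting data without prior knowledge of the target schema can be sketched in a few lines. This is a minimal illustration under stated assumptions, not something taken from the deck: the record fields, the in-memory `raw_zone` list, and the function names are all hypothetical.

```python
import json

# Raw zone: records are persisted exactly as received, with no target schema.
# (An in-memory list stands in for real storage; this is an assumption.)
raw_zone = []

def ingest(line: str) -> None:
    """Late binding: store the raw payload without validating its shape."""
    raw_zone.append(line)

def read_with_schema(fields):
    """Schema-on-read: project each raw record onto the fields a consumer asks for."""
    for line in raw_zone:
        record = json.loads(line)
        yield {name: record.get(name) for name in fields}

# Two records with different shapes land side by side (hypothetical data).
ingest('{"customer_id": 1, "name": "Ada"}')
ingest('{"customer_id": 2, "name": "Lin", "segment": "retail"}')

# A later use case can ask for "segment" without re-ingesting anything.
rows = list(read_with_schema(["customer_id", "segment"]))
print(rows)
```

With early binding, the first record would have been rejected or the table altered before load; here the missing field simply reads as `None` for the consumer that asks for it.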
Requires an army of costly professionals to support longer delivery cycles and brittle data processes.
Traditional: Army of Data Professionals
Modern: Streamlined Data Roles
Traditional: Slower Speed-to-Market
Longer delivery lifecycle involving too many project phases
Modern: Accelerated Speed-to-Market
Separation of data management from discovery and analytics accelerates solution delivery
Traditional: Heavy Reliance on Costly IT Resources
Point-to-point ETL and early binding data model require IT resources for any data changes
Modern: Enabled Business Self-Service
Centralized data enables “data wrangling” and analytics by business users and data scientists
Select outcome-based, high-impact use case(s) and deliver a minimum viable product (MVP) to demonstrate immediate success.
Story contact:
C L I E N T   S O L U T I O N S   I N D U S T R Y
The client had a vision to drive improved customer experience and engagement through personalized marketing campaigns and needed an on-premise solution that enables the initial use cases and provides a foundation for enterprise-wide analytics. Slalom architected a multi-zone data lake to harness and analyze internal and external customer and product data, enabling real-time analytics and a personalized customer experience.
A financial services company serving over 16 million customers nationwide. They pride themselves on being able to provide a personal touch for their customers, and the size of their customer base meant they needed a solution that would be able to integrate large amounts of traditionally siloed customer and product data.
Enterprise data hub provides
foundation for data-driven culture
A L L I A N C E S
Data architecture and solution design
Data governance deployment
Multi-zone data lake design and buildout
Ingestion and integration using metadata-based big data integration tool
Data discovery enabled and Tableau dashboard deployed
Financial Services
INDUSTRY
BIG DATA SERVICES
Big Data Startup Planning
Big Data Governance
Big Data Implementation
Enablement and Adoption
STORC
We think most organizations lack the engineering skills required to fully leverage the Hadoop ecosystem and realize the potential of new technologies.
Approach: Organizations are using the traditional source-to-target approach of acquiring and integrating data for known use cases
Culture: Mindset has to change from hoarding and protecting information to making it easy to access and use data as an enterprise asset
Architecture: Usually considered an IT infrastructure project, Hadoop is used as a large file system to dump data files with limited use and marginal business value
Skills: The majority of data professionals (ETL developers, data analysts) lack the engineering skills required to fully leverage Hadoop technologies
A smart data lake should be...
Enterprise Scale
Auditable
Governed
Supported
Secured
Multi-Use Support
Extensible
Open Source
Standardized
Designed Right
Governed for Adoption
Economical to Use
Data Lake
DATA SOURCES: Streaming, Files, Databases
DATA MANAGEMENT (Data Lake zones):
RAW: Persistence of source data
ENRICHED: Standardized, reconciled, and quality checked
DISCOVERY: Discovery/Sandbox
MODELED
Egress APIs
DATA STORAGE OPTIONS: HDFS, S3
Data Governance & Master Data Management
DATA DELIVERY & CONSUMPTION:
BI & REPORTING
MOBILE & WEB APPS
EDW On Premise: Relational, NoSQL
EDW in Cloud: Relational, NoSQL
EXTERNAL BUSINESS PARTNERS
Multi-zone, self-governed data lake to provide secure and flexible data architecture to harness enterprise
data for accelerated speed to insight.
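One way to picture a multi-zone lake is as a set of named zones with a governed promotion step between them. The sketch below is only an illustration under assumptions: the zone names come from the slide, but the linear promotion order, the quality gate, the `Dataset` class, and the `s3://example-lake` prefix are hypothetical.

```python
from dataclasses import dataclass, field

# Zone names come from the slide; treating them as a linear order is an assumption.
ZONES = ["RAW", "ENRICHED", "DISCOVERY", "MODELED"]

@dataclass
class Dataset:
    name: str
    zone: str = "RAW"
    quality_checked: bool = False
    history: list = field(default_factory=lambda: ["RAW"])

def promote(ds: Dataset) -> Dataset:
    """Move a dataset to the next zone, enforcing a simple governance gate."""
    idx = ZONES.index(ds.zone)
    if idx == len(ZONES) - 1:
        raise ValueError(f"{ds.name} is already in the final zone")
    if ds.zone == "RAW" and not ds.quality_checked:
        raise ValueError(f"{ds.name} must pass quality checks before leaving RAW")
    ds.zone = ZONES[idx + 1]
    ds.history.append(ds.zone)
    return ds

def zone_path(ds: Dataset, root: str = "s3://example-lake") -> str:
    """Hypothetical storage layout: one prefix per zone (an HDFS root would work the same way)."""
    return f"{root}/{ds.zone.lower()}/{ds.name}"

orders = Dataset("orders")
orders.quality_checked = True  # stands in for the standardize/reconcile/quality-check step
promote(orders)
print(orders.zone, zone_path(orders))
```

Keeping the gate inside `promote` means a dataset cannot reach a downstream zone without passing the check, which is one possible reading of "self-governed".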
Data Lake
DATA SOURCES: Streaming, Files, Databases
HADOOP SOLUTION COMPONENTS: Batch and Streaming Acquisition and Ingestion; Transformation; Discovery and Modeling
Zones: RAW, ENRICHED, DISCOVERY
The architecture implements data pipelines using our purpose-built open source integration APIs, accelerating implementation by 9-12 weeks.
The accelerator enables self-service by allowing data analysts and data SMEs to ingest new data
sources and promote data through the lake with limited to no IT dependencies.
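The self-service idea can be sketched as metadata-driven ingestion: onboarding a source means adding a catalog entry rather than writing a new pipeline. This is a hedged illustration only and not the accelerator's actual API; `CATALOG`, `PARSERS`, `ingest`, and the sample payloads are hypothetical names and data.

```python
import csv
import io
import json

# Hypothetical metadata catalog: an analyst registers a source by adding an entry,
# not by writing a new pipeline.
CATALOG = {
    "customers_csv": {"format": "csv", "target_zone": "raw"},
    "events_json": {"format": "jsonl", "target_zone": "raw"},
}

# One generic parser per format, shared by every registered source.
PARSERS = {
    "csv": lambda text: list(csv.DictReader(io.StringIO(text))),
    "jsonl": lambda text: [json.loads(line) for line in text.splitlines() if line.strip()],
}

def ingest(source_name: str, payload: str, lake: dict) -> int:
    """Look up the source's metadata and run the generic pipeline it describes."""
    meta = CATALOG[source_name]
    records = PARSERS[meta["format"]](payload)
    lake.setdefault(meta["target_zone"], {}).setdefault(source_name, []).extend(records)
    return len(records)

lake = {}
n_csv = ingest("customers_csv", "id,name\n1,Ada\n2,Lin\n", lake)
n_json = ingest("events_json", '{"id": 1, "type": "login"}\n', lake)
print(n_csv, n_json)
```

Because the pipeline code never changes when a source is added, the only artifact a data SME has to maintain is the catalog entry.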
Diagram labels: DATA SOURCES (Streaming, Files, Databases); RAW; Business/Data SME; METADATA MANAGEMENT & CONTROL API; Files; TARGETS
Foundation, Migration, Optimization
Strategy Definition: Outline Desired Outcomes
Application Analysis: Assess and Prioritize Applications; System Backlog; System Prioritization & Roadmap
Security & Operations Assessment: Develop Security and Operating Model; Implement Security, Networking, and Operating Models
Application Migration Factory, Sprint 1 - n: Design, Migrate, Integrate and Validate Applications (Workload 1, Workload 2, Workload n)
Build & Transition Organization: Process, Service Model, Org Structure, Capabilities, Governance, Metrics; Communications, Training, Change Mgmt; Transition to Operations
Optimize Systems: Continuous Feedback; Improvement; Value Realization
On-Premise, Cloud
The cloud presents an opportunity to transform on-premise workloads into purpose-driven, scalable solutions.
Story contact:
P R O J E C T
PEM delivery methodology was used to deliver a cost effective and scalable solution
Client is exploring opportunities to monetize the solution as an analytics workbench
The data science team can leverage both SAS and R integration with the platform for advanced analytics
Sunset existing platforms, reducing licensing and support maintenance costs
R E S U L T S
Slalom partnered with a Fortune 500 healthcare company to deliver a next-generation data platform. The client’s existing platform could not support increasing data volumes and a growing need for advanced analytics workloads. The new platform not only addressed these scalability concerns but also allowed the client to host both structured and unstructured data in near real time. Most importantly, this data platform opened doors for new monetization opportunities.
Slalom built a next-generation Hadoop data platform to meet the client’s needs. Leveraging the cloud enabled a quick turnaround time as well as security features ideal for storing PII and PHI data. The Slalom team migrated and optimized existing data to leverage Hadoop’s high-performance features. Slalom also built a near-real-time platform that can ingest HL7 messages from several hospitals and provide event-driven alerting.
Maz Chaudhri
Next Generation Data Platform for
Healthcare Analytics
T E C H N O L O G Y
B A C K G R O U N D
Healthcare
INDUSTRY
BIG DATA SERVICES
Agile Delivery Approach
Big Data Implementation
Story contact:
P R O J E C T
Agile Delivery Methodology
Real-time data platform
Self-service enablement
Up-to-the-minute view into the operations of over 6,000 restaurant locations nation-wide.
Ability to monitor KPIs and react with targeted efforts to boost sales exactly where it is needed.
R E S U L T S
Our client in the fast-casual food industry was having widespread challenges accurately capturing and measuring key business
metrics. Due to inconsistent data integrity in the nightly batch process, executives and leaders were growing skeptical of the
reliability of reporting and analytics built from the data. Leaders were clamoring for timely visibility to better, cleaner data.
The Slalom team served as Scrum Master, Product Owner and Analyst during the architecture and delivery of the AWS-based Cloudera platform. Using a Kafka-based publish-subscribe architecture, each restaurant location, in addition to the online ordering platform, was set up to stream data feeds to the unified cloud platform.
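The publish-subscribe pattern described above can be illustrated with a small in-memory stand-in. This is deliberately not a Kafka client and does not reflect the client's implementation; the `Broker` class, the `orders` topic name, and the sample messages are hypothetical and only show how many producers can feed one shared stream that consumers react to as events arrive.

```python
from collections import defaultdict

class Broker:
    """In-memory stand-in for a publish-subscribe broker (not a Kafka client)."""

    def __init__(self):
        self.topics = defaultdict(list)       # topic -> append-only message log
        self.subscribers = defaultdict(list)  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self.subscribers[topic].append(callback)

    def publish(self, topic, message):
        self.topics[topic].append(message)
        for callback in self.subscribers[topic]:
            callback(message)

broker = Broker()
sales_by_location = defaultdict(float)

def update_sales(msg):
    """Consumer: keep a running sales total per location as events arrive."""
    sales_by_location[msg["location"]] += msg["amount"]

broker.subscribe("orders", update_sales)

# Each restaurant and the online platform publish to the same topic (hypothetical data).
broker.publish("orders", {"location": "store-001", "amount": 12.5})
broker.publish("orders", {"location": "online", "amount": 30.0})
broker.publish("orders", {"location": "store-001", "amount": 7.5})
print(dict(sales_by_location))
```

Producers only know the topic, not who consumes it, so a new dashboard or alert can subscribe later without any change at the restaurant locations.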
Maz Chaudhri
B A C K G R O U N D
T E C H N O L O G Y
Food Service
INDUSTRY
BIG DATA SERVICES
Agile Delivery Approach
Big Data Startup Planning
Platform Evaluation & Selection
Big Data Implementation
Story contact:
P R O J E C T
A scalable and flexible Big Data Platform
A universal XML ingestion framework
HDFS data lake that ingests and persists all data from source systems
Allowed the client to sunset a reporting product, saving over $1MM annually in support maintenance cost
Qlik BI & Operational reports utilizing Hadoop as the backend
R E S U L T S
A top-10 pharmaceutical company, ranked in the top 150 of the Fortune 500, sought to implement a next-generation modern data platform. The platform needed not only to provide end-to-end supply chain visibility, but also to be flexible and scalable enough to handle a heavy volume of serialized data. The client also wanted to establish a data lake so as to be able to predict and prescribe their inventory and shipments to better serve their customers.
Slalom utilized AWS and Cloudera Hadoop to build this
next generation data platform. The data platform gave
visibility to inventory levels to help drive the
development of inventory optimization strategies and
integrated multiple disparate sources to give end to end
shipment visibility of the client’s supply chain.
Pharmaceuticals
INDUSTRY
Next Generation Data Platform &
Supply Chain visibility
A L L I A N C E S
B A C K G R O U N D
BIG DATA SERVICES
Agile Delivery Approach
Big Data Implementation
Time to Value, Proven Approach and Experience, Pre-Built Accelerators
AGILE ENGINEERING APPROACH: Start small, deliver value and evolve your Big Data program
BIG DATA STARTUP PLANNING: Pre-defined epics and stories for big data startup
DATA GOVERNANCE in a BOX: Multi-faceted data governance deployment and tools
READINESS AND ADOPTION: Org readiness and change strategy and enablement
PLATFORM SELECTION: Best practices-based evaluation toolset
BIG DATA INTEGRATION TOOL: Open-source metadata-driven integration API
Cloudera, Inc.
 


Big data journey to the cloud maz chaudhri 5.30.18

  • 1. Building Smart Data Lake Slalom Approach May, 2018
  • 2. "Better to bet on cloud providers for infrastructure, Cloudera for data, analytics and security fabric, and leave the rest to the ecosystem"
  • 4. Value Identification: exploring big data and working to identify a business case. Value Demonstration: past the business case and need to demonstrate value for broader adoption. Value Realization: implemented early use cases with limited value and lacking traction.
  • 5. Big data has come a long way, and enterprises are at different phases of their journey. However, broader adoption of the computation ecosystem is still in its early stages.
      "It takes a long time"; "We don’t have big data"; "I’m not sure how to start"; "All our data is structured data"; "We already have a data warehouse"; "We don’t have a business case for it"; "It is hard to find the required skillsets"; "No one is using our Hadoop data"; "Hadoop ecosystem is overwhelming"; "Business users are not happy"; "It is too expensive"; "We have yet to realize the benefits it promises".
      Value Identification: exploring big data and working to identify a business case. Value Demonstration: past the business case and need to demonstrate value for broader adoption. Value Realization: implemented early use cases with limited value and lacking traction.
  • 6. Value Identification Value Demonstration Value Realization
  • 7. Traditional architectures use rigid data models, costly platforms, and resource-intensive ETL, and lack support for new use cases.
      Traditional: Rigid Data Architecture. Early binding to the pre-defined schema makes it inflexible and costly. Modern: Flexible Architecture. Data is ingested and transformed without prior knowledge of target schema.
      Traditional: Costly Infrastructure and Solution. Data duplicated across costly platforms; 50-70% spend on acquisition and integration. Modern: Simplified Infrastructure and Solution. Flexible on-premise and cloud infrastructure; API-based pipelines automate data ingestion.
      Traditional: Lacks Support for “New” Use Cases. Data silos impede real-time processing required to support modern use cases. Modern: Best Suited for “New” Use Cases. A centralized hub for heterogeneous data and a variety of tools enable real-time analytics.
      Traditional: Declining Talent Pool. The new talent lacks excitement for the traditional technologies and tools. Modern: Growing Talent Pool. Elevated interest in data engineering and data science work.
  • 8. Requires an army of costly professionals to support longer delivery cycles and brittle data processes.
      Traditional: Slower Speed-to-Market. Longer delivery lifecycle involving too many project phases. Modern: Accelerated Speed-to-Market. Separation of data management from discovery and analytics accelerates solution delivery.
      Traditional: Heavy Reliance on Costly IT Resources. Point-to-point ETL and early binding data model require IT resources for any data changes. Modern: Enabled Business Self-Service. Centralized data enables “data wrangling” and analytics by business users and data scientists.
      Traditional: Army of Data Professionals. Modern: Streamlined Data Roles.
  • 9. Value Identification Value Demonstration Value Realization
  • 10. Select outcome-based, high-impact use case(s) and deliver a minimum viable product (MVP) to demonstrate immediate success.
  • 11. Enterprise data hub provides foundation for data-driven culture. Industry: Financial Services.
      Client: A financial services company serving over 16 million customers nationwide. They pride themselves on being able to provide a personal touch for their customers, and the size of their customer base meant they needed a solution that would be able to integrate large amounts of traditionally siloed customer and product data.
      The client had a vision to drive improved customer experience and engagement through personalized marketing campaigns and needed an on-premise solution that enables the initial use cases and provides foundation for enterprise-wide analytics. Slalom architected a multi-zone data lake to harness and analyze internal and external customer and product data, enabling real-time analytics and a personalized customer experience.
      Solutions: Data architecture and solution design; Data governance deployment; Multi-zone data lake design and buildout; Ingestion and integration using metadata-based big data integration tool; Data discovery enabled and Tableau dashboard deployed.
      Big data services: Big Data Startup Planning; Big Data Governance; Big Data Implementation; Enablement and Adoption.
  • 12. Value Identification Value Demonstration Value Realization
  • 13. We think most organizations lack the engineering skills required to fully leverage the Hadoop ecosystem and realize the potential of new technologies.
      Approach: Organizations are using the traditional source-to-target approach of acquiring and integrating data for known use cases.
      Culture: Mindset has to change from hoarding and protecting information to making it easy to access and use data as an enterprise asset.
      Architecture: Usually considered an IT infrastructure project, Hadoop is used as a large file system to dump data files with limited use and marginal business value.
      Skills: The majority of data professionals (ETL developers, data analysts) lack the engineering skills required to fully leverage Hadoop technologies.
  • 14. A smart data lake should be: Enterprise Scale, Auditable, Governed, Supported, Secured, Multi-Use Support, Extensible, Open Source, Standardized. Designed Right; Governed for Adoption; Economical to Use.
  • 15. Multi-zone, self-governed data lake to provide secure and flexible data architecture to harness enterprise data for accelerated speed to insight.
      [Diagram labels] Data sources: Streaming, Files, Databases, APIs. Data management (Data Lake zones): RAW (persistence of source data); ENRICHED (standardized, reconciled, and quality checked); DISCOVERY (discovery/sandbox); MODELED. Data storage options: HDFS, S3. Data Governance & Master Data Management. Egress. Data delivery & consumption: BI & Reporting; Mobile & Web Apps; EDW On Premise (Relational, NoSQL); EDW in Cloud (Relational, NoSQL); External Business Partners.
  • 16. The architecture implements data pipelines using our purpose-built open source integration APIs, accelerating implementation by 9-12 weeks.
      [Diagram labels] Data sources: Streaming, Files, Databases. Hadoop solution components: Batch, Streaming; Acquisition and Ingestion, Transformation, Discovery and Modeling. Data Lake zones: RAW, ENRICHED, DISCOVERY.
  • 17. The accelerator enables self-service by allowing data analysts and data SMEs to ingest new data sources and promote data through the lake with limited to no IT dependencies.
      [Diagram labels] Data sources: Streaming, Files, Databases. RAW. Business/Data SME. Metadata Management & Control API. Files. Targets.
  • 18. Cloud presents an opportunity to transform on-premise workloads into purpose-driven scalable solutions.
      [Diagram labels] Phases: Foundation, Migration, Optimization. On-Premise to Cloud. Strategy Definition: Outline Desired Outcomes. Assess and Prioritize Applications: Application Analysis; System Prioritization & Roadmap; System Backlog. Implement Security, Networking, and Operating Models: Security & Operations Assessment; Develop Security and Operating Model. Build & Transition Organization: Process, Service Model, Org Structure, Capabilities, Governance, Metrics; Communications, Training, Change Mgmt. Design, Migrate, Integrate and Validate Applications: Application Migration Factory, Sprint 1 - n (Workload 1, Workload 2, Workload n). Optimize Systems: Improvement; Value Realization; Continuous Feedback; Transition to Operations.
  • 19. Next Generation Data Platform for Healthcare Analytics. Industry: Healthcare. Story contact: Maz Chaudhri.
      Background: Slalom partnered with a Fortune 500 healthcare company to deliver a next generation data platform. The client’s existing platform could not support increasing data volumes and a growing need for advanced analytics workloads. The new platform not only addressed these scalability concerns but also allowed the client to host both structured and unstructured data in near-real time. Most importantly, this data platform opened doors for new monetization opportunities.
      Technology: Slalom built a next generation Hadoop data platform to meet the client’s needs. Leveraging the cloud enabled a quick turnaround time as well as security features ideal for storing PII and PHI data. The Slalom team migrated and optimized existing data to leverage Hadoop high-performance features. Slalom also built a near-real time platform that can ingest HL7 messages from several hospitals and provide event-driven alerting.
      Project and results: PEM delivery methodology was used to deliver a cost-effective and scalable solution; Client is exploring opportunities to monetize the solution as an analytics workbench; The data science team can leverage both SAS and R integration with the platform for advanced analytics; Sunset existing platforms, reducing licensing and support maintenance costs.
      Big data services: Agile Delivery Approach; Big Data Implementation.
  • 20. Industry: Food Service. Story contact: Maz Chaudhri.
      Background: Our client in the fast-casual food industry was having widespread challenges accurately capturing and measuring key business metrics. Due to inconsistent data integrity in the nightly batch process, executives and leaders were growing skeptical of the reliability of reporting and analytics built from the data. Leaders were clamoring for timely visibility to better, cleaner data.
      Technology: The Slalom team served as Scrum Master, Product Owner and Analyst during the architecture and delivery of the AWS-based Cloudera platform. Using a Kafka-based publish-subscribe architecture, each restaurant location in addition to the online ordering platform was set up to stream data feeds to the unified Cloud platform.
      Project and results: Agile Delivery Methodology; Real-time data platform; Self-service enablement; Up-to-the-minute view into the operations of over 6,000 restaurant locations nation-wide; Ability to monitor KPIs and react with targeted efforts to boost sales exactly where it is needed.
      Big data services: Agile Delivery Approach; Big Data Startup Planning; Platform Evaluation & Selection; Big Data Implementation.
  • 21. Next Generation Data Platform & Supply Chain visibility. Industry: Pharmaceuticals.
      Background: A top 10 Pharmaceutical company, and top 150 Fortune 500, sought to implement a next generation modern data platform. The platform needed to not only provide end to end supply chain visibility, but also be flexible and scalable to handle a heavy volume of serialized data. The client also wanted to establish a data lake so as to be able to predict and prescribe their inventory and shipments to better serve their customers.
      Slalom utilized AWS and Cloudera Hadoop to build this next generation data platform. The data platform gave visibility to inventory levels to help drive the development of inventory optimization strategies and integrated multiple disparate sources to give end to end shipment visibility of the client’s supply chain.
      Project and results: A scalable and flexible Big Data Platform; A universal XML ingestion framework; HDFS Data lake that ingests and persists all data from source system; Allowed the client to sunset a reporting product that saved over $1MM annually in support maintenance cost; Qlik BI & Operational reports utilizing Hadoop as the backend.
      Big data services: Agile Delivery Approach; Big Data Implementation.
  • 22. Time to Value; Proven Approach and Experience; Pre-Built Accelerators.
      AGILE ENGINEERING APPROACH: Start small, deliver value and evolve your Big Data program. BIG DATA STARTUP PLANNING: Pre-defined epics and stories for big data startup. DATA GOVERNANCE in a BOX: Multi-faceted data governance deployment and tools. READINESS AND ADOPTION: Org readiness and change strategy and enablement. PLATFORM SELECTION: Best practices-based evaluation toolset. BIG DATA INTEGRATION TOOL: Open-source meta-data driven integration API.
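
Slides 15 to 17 describe a metadata-driven flow in which a data SME registers a source and data is then landed in the RAW zone and promoted to ENRICHED after quality checks. The deck does not show the actual integration API, so the following is only a minimal sketch of that idea under stated assumptions: `SourceConfig`, `Lake`, `register`, `ingest`, and `promote` are hypothetical names, the zones are plain in-memory dictionaries rather than HDFS or S3, and the only quality check is that required fields are non-empty.

```python
import csv
import io
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class SourceConfig:
    """Hypothetical metadata a data SME registers for one source."""
    name: str
    delimiter: str
    required_fields: Tuple[str, ...]


class Lake:
    """Toy in-memory lake with RAW and ENRICHED zones (illustrative only)."""

    def __init__(self) -> None:
        self.catalog: Dict[str, SourceConfig] = {}
        self.raw: Dict[str, List[dict]] = {}
        self.enriched: Dict[str, List[dict]] = {}

    def register(self, cfg: SourceConfig) -> None:
        # Registering metadata is the only step needed to onboard a source.
        self.catalog[cfg.name] = cfg

    def ingest(self, name: str, text: str) -> int:
        # RAW keeps the source rows as delivered, with no schema enforcement.
        cfg = self.catalog[name]  # KeyError if the source was never registered
        reader = csv.DictReader(io.StringIO(text), delimiter=cfg.delimiter)
        rows = [dict(r) for r in reader]
        self.raw.setdefault(name, []).extend(rows)
        return len(rows)

    def promote(self, name: str) -> Tuple[int, int]:
        # Promote rows that pass the quality check; RAW is left untouched.
        cfg = self.catalog[name]
        rows = self.raw.get(name, [])
        good = [
            r for r in rows
            if all((r.get(f) or "").strip() for f in cfg.required_fields)
        ]
        self.enriched[name] = good
        return len(good), len(rows) - len(good)
```

In this sketch, onboarding a new source is a single `register` call with a metadata record, which is the property the slides attribute to the accelerator; a real implementation would also need lineage, security, and persistent storage, none of which is modeled here.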

Editor's Notes

  1. Thanks Rohit. Asher talked about Cloudera as a data platform, and Rohit walked us through how you could do more with the data platform in the cloud. In the next 30 minutes, Navendu and I will talk about how to design and build a smart data lake.
  2. In the earlier discussion, you heard… however…
  3. …the ecosystem is complex and continues to grow. You need an experienced and knowledgeable implementation partner to do it right.
  4. You could be at different stages of your big data journey…
  5. …based on what we hear from our clients, we define the journey in three stages: Value Identification, Value Demonstration, and Value Realization.
  6. At this stage you are educating key stakeholders using various concepts, with the goal of identifying impactful use case(s).
  7. Multi-Use Support: support multiple use cases or services (data analytics, data delivery, reporting). People: Platform Ownership (Skill and Role Gap Analysis; Adoption Plan & Roadmap; Learning Plan; Sustainability Plan). Process – Operating Model (Onboarding Processes; Support Model; Team Ownership; Monitoring & Measurement; Maintenance & Promotion Processes; Implementation Roadmap). Technology: Platform Utilization (Use case coverage; Tool support). Information: Governance & Controls (Governance Process; Data Sharing; Certification).
  8. Accelerate Ingestion: separating data ingestion from discovery and using a metadata-based API accelerate the most time-consuming and resource-intensive part of any data management project. Standardize Data: data is standardized with common transformations, ensuring consistent use of data for discovery and modeling. Automate Transformation: as new transformations are identified as “common”, they are applied in the Transformation zone.
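
Note 8 says that transformations identified as "common" are applied to all data in the Transformation zone. One way to picture this is an ordered registry of transformations that every record passes through. The sketch below assumes that design; `TransformRegistry`, `promote_to_common`, `trim_strings`, and `lowercase_email` are hypothetical names chosen for illustration and do not come from the deck or from any Cloudera or Slalom API.

```python
from typing import Callable, List, Tuple

Transform = Callable[[dict], dict]


class TransformRegistry:
    """Ordered list of 'common' transformations (hypothetical design)."""

    def __init__(self) -> None:
        self._common: List[Tuple[str, Transform]] = []

    def promote_to_common(self, name: str, fn: Transform) -> None:
        # Once promoted, a transformation applies to every record.
        self._common.append((name, fn))

    def names(self) -> List[str]:
        return [name for name, _ in self._common]

    def apply(self, record: dict) -> dict:
        # Work on a copy so the source record is never mutated.
        out = dict(record)
        for _, fn in self._common:
            out = fn(out)
        return out


def trim_strings(rec: dict) -> dict:
    """Strip surrounding whitespace from every string value."""
    return {k: v.strip() if isinstance(v, str) else v for k, v in rec.items()}


def lowercase_email(rec: dict) -> dict:
    """Lower-case the 'email' field when it is present and a string."""
    out = dict(rec)
    if isinstance(out.get("email"), str):
        out["email"] = out["email"].lower()
    return out
```

Registering `trim_strings` before `lowercase_email` means every record is trimmed and then normalized in that order, so consumers in discovery and modeling see one consistent form of the data.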