SlideShare a Scribd company logo
1 of 61
Download to read offline
Everything Has
Changed Except Us
Modernizing the Data
Warehouse Architecture
June, 2016
Mark Madsen
www.ThirdNature.net
@markmadsen
Copyright Third Nature, Inc.
The big data market is:
… a leap forward
A leap in evolution to a more
flexible way of gathering data
and generating useful
information.
▪ Decide if data is good enough
at the time of use
▪ Files are flexible
▪ Save time collecting data
▪ “self service”
A new approach to managing and
using data.
Copyright Third Nature, Inc.
The big data market is:
… a step backward
A step backward to methods not
capable of providing the quality,
manageability, accessibility and
reuse we need.
▪ Ensure data quality up front
rather than after the fact
▪ Tables are stable
▪ Save time using data
▪ “self service”
A reinvention of the wheel, going
back to the manual era.
Copyright Third Nature, Inc.
The architecture we are (still) using today
The general concept of a
separate architecture for BI
has been around longer, but
this paper by Devlin and
Murphy is the first formal
data warehouse architecture
and definition published.
4
“An architecture for a business and
information system”, B. A. Devlin,
P. T. Murphy, IBM Systems Journal,
Vol.27, No. 1, (1988)
Slide 4Copyright Third Nature, Inc.
Copyright Third Nature, Inc.
Environment in 1988: no big data, only big hair.
▪ No real commercial email, public internet barely started
▪ Storage state of the art: 100MB, cost $10,000/GB
▪ Oracle Applications v1 GL released; SAP goes public,
enters US market
▪ Unix is mostly run by long-haired freaks
▪ Mobile was this
The environment: scarcity of data, of reports, of computer systems
(outside of core financials), of resources, of money to pay for storage.
Copyright Third Nature, Inc.
Data warehouse: centralize, that solves all problems!
Creates bottlenecks
Causes scale problems
Enforces a single model
Copyright Third Nature, Inc.
The data lake solution: no central authority
wtf, it was fully
operational!
Copyright Third Nature, Inc.
The data lake solution?
There’s a problem: as
the lake is envisioned,
it is still a centralized
data architecture, but
this time there is no
single global model.
Instead it’s files and
not modeled.
It’s still a death star.
Copyright Third Nature, Inc.
Eventually we run into the same problems
Seriously, wtf?
It was agile
and operational
Copyright Third Nature, Inc.
Growing complexity has changed our environment
You can barely keep up with source changes
You can’t keep up with new data requests
You are scale, performance and latency limited
But:
The organizations want more, different, current data
The reality is continuous change – lots of maintenance
Copyright Third Nature, Inc.
Meanwhile, we’ve become
the department of “No”
• Environments are more complex
• Data volumes are 10,000X larger
• Data is more complex
• More use of BI
• We have entirely new data uses
We’ve been struggling with
shrinking load windows,
performance problems, and an
inability to quickly meet new data
and analytics requests for years,
yet we keep using the same design.
Copyright Third Nature, Inc.
I never said the
“E” in EDW meant
“everything”…
What do
you mean,
“3 months?”
Copyright Third Nature, Inc.
It’s going to get a lot worse
Not E
E
Conclusion: any methodology built on the premise that you
must know and model all data before using it is untenable.
Copyright Third Nature, Inc.
Streams of events and observations
This data doesn’t fit well with current methods of collection and
storage, or with the technology to process and analyze it.
Copyright Third Nature, Inc.
Copyright Third Nature, Inc.
Declarations
The problems of events apply to text. Which can be an event payload
Copyright Third Nature, Inc.
Categorizing the measurement data we collect
The convenient data is the
transactional data.
▪ Goes in the DW and is used, even
if it isn’t the right measurement.
The inconvenient data is the
observational data.
▪ It’s not neat, clean, or designed
into most systems of operation.
The difficult data is the
declarative data.
▪ What people say and what they
do require ground truth.
We need an architecture that
supports all three categories.
Copyright Third Nature, Inc.
Copyright Third Nature, Inc.
New uses: analytics embiggens the data volume problem
Many of the algorithms we need them to make sense of the
observations and declarations are are O(n2) or worse.
Copyright Third Nature, Inc.
Publishing is the primary view of BI, self service, big data
Copyright Third Nature, Inc.
What people do with data today: not just read it
Explore and
Understand
Inform and
Explain
Convince
and Decice
Deliver
Process
Collect
Copyright Third Nature, Inc.
Data architecture requires understanding data use
so we can build the right infrastructure
Collect
new data
Monitor
Analyze
Exceptions
Analyze
Causes
Decide Act
Act on the process
Act within the process
We need to focus on what people do with information as
the primary task, not on the data or the technology.
Copyright Third Nature, Inc.
The usage models for conventional BI
Collect
new data
Monitor
Analyze
Exceptions
Analyze
Causes
Decide Act
No problem No idea Do nothing
Act on the process
Usually days/longer timeframe
Act within the process
Usually real-time to daily
This is what we’ve been
doing with BI so far: static
reporting, dashboards,
ad-hoc query, OLAP
Copyright Third Nature, Inc.
The usage models for analytics and “big data”
Collect
new data
Monitor
Analyze
Exceptions
Analyze
Causes
Decide Act
No problem No idea Do nothing
Act on the process
Usually days/longer timeframe
Act within the process
Usually real-time to daily
Analytics and exploration
is focused on new use
cases: deeper analysis,
causes, prediction,
optimizing decisions
This isn’t ad-hoc query,
reporting, or OLAP.
Copyright Third Nature, Inc.
BI is part of a dynamic system. There are feedback loops
that can change both the data and the model.
Collect
new data
Monitor
Analyze
Exceptions
Analyze
Causes
Decide Act
Act on the process
Act within the process
Known: stable, can be modeled in advance
Unknown: new, you can’t model in advance
Copyright Third Nature, Inc.
The root cause of our problems is change
Our rate of change is slower than
the business. This can’t be fixed
via buying more technology.
Copyright Third Nature, Inc.
We have a design for stability. We need one for adaptability
Copyright Third Nature, Inc.
Agile methods without agile architectures fail
Copyright Third Nature, Inc.
A core problem with one big schema is change
Copyright Third Nature, Inc.
The big data market has an answer…
Copyright Third Nature, Inc.
Big data answer?
Schema-on-read!
Rather than make data
fit the model, make the
model fit the data.
There’s a price to pay
with using “schema-on-
read” for everything.
You won’t see the
problems with this until
you add a second
application, and a third.
Copyright Third Nature, Inc.
The data lake: just dump the data in!
Copyright Third Nature, Inc.
Combine
with self-
service:
we’ll figure
it all out
later!
Aren’t we
back where
we started?
Copyright Third Nature, Inc.
Data hoarding is not a data management strategy
Copyright Third Nature, Inc.
What does all of this imply?
The data is not always known in advance, so it can’t
be modeled in advance.
The data architecture must be read-write from both
the back and front, not a one-way data flow.
The data written back may be repeatedly used,
persistent data, or it may be temporary.
The data may arrive with any frequency, and the rate
may not be under your control.
Copyright Third Nature, Inc. Slide 34
The solution to our problems isn’t
technology, it’s architecture.
Copyright Third Nature, Inc.
Big data is a
leap forward
flexibility
speed
save dev time
Big data is a
fall backward
repeatability
quality
save user time
Choose one
or the other.
Copyright Third Nature, Inc.
What if we could reconcile two opposing ideas
using ideas borrowed from other domains?
Big data
flexibility
speed
save dev time
BI / DW
repeatability
quality
save user time
Choose both
© Third Nature Inc.© Third Nature Inc.
Separating fast from slow:
pace layering and change in buildings
“How Buildings Learn”, Stewart Brand
Complex systems can
be decomposed into
layers that change at
different rates.
Copyright Third Nature, Inc.
fast layers:
absorb change
propose solutions
learn
get all the attention: fixtures
like
Data discovery tools
Copyright Third Nature, Inc.
integrate change
constrain options
remember
slow layers:
do all the work: plumbing
like
Data warehouses
Copyright Third Nature, Inc.
Architecture: components and layering
Components above: flexibility,
repurposing, quicker change above
Application
Layers below: stability, reuse, slow
predictable change below
Infrastructure
We thought this was the schema…
Infrastructure is just a layer carefully chosen after a lot of experience
Copyright Third Nature, Inc.
Decoupling: design for isolation of unrelated change
Copyright Third Nature, Inc.
In reality, you are building three systems, not one
There are three
things happening
inside a DW:
▪ Data acquisition
▪ Data management
▪ Data delivery
Isolate them and
their uses of data
from one another.
Data
Warehouse
Separate the component systems to isolate unrelated change
Copyright Third Nature, Inc.
The goal is to decouple: solve the application and
infrastructure problems separately
Platform Services
Data Access
Deliver & Use
Data storage
This separates BI from other uses of data , allowing each type of
use to structure the data specific to its own requirements.
Data access is already
somewhat separate
today. Make the
separation of different
access methods a formal
part of the architecture.
Don’t force one model.
Copyright Third Nature, Inc.
The goal is to decouple: solve the application and
infrastructure problems separately
Platform Services
Data Management
Process & Integrate
Data storage
Data management should not be subject to the constraints of a single use
Data management
was blended with both
data acquisition and
structuring data for
client tools. It should
be an independent
function.
Copyright Third Nature, Inc.
The goal is to decouple: solve the application and
infrastructure problems separately
Data Acquisition
Collect & Store
Incremental
Batch
One-time copy
Real time
Platform Services
Data storage
Data arrives in many latencies, from real-time to one-time. Acquisition
can’t be limited by the management or consumption layers.
Data acquisition should not be
directly tied to the needs of
consumption. It must operate
independently of data use.
Copyright Third Nature, Inc.
The full analytic environment subsumes all the functions
of the data warehouse, and extends them
Data Acquisition
Collect & Store
Incremental
Batch
One-time copy
Real time
Platform Services
Data Management
Process & Integrate
Data Access
Deliver & Use
Data storage
The platform has to do more than serve queries; it has to be read-write.
Copyright Third Nature, Inc.
DATA ARCHITECTURE
We’re so focused on the light switch that we’re not
talking about the light
Copyright Third Nature, Inc.
As with the code, decouple the data architecture
The core of the data warehouse isn’t the
database, it’s the data architecture that the
database and tools implement.
We need a data architecture that is not limiting:
▪ Deals with data and schema change easily
▪ Does not always require up front modeling
▪ Does not limit the format or structure of data
▪ Assumes a full range of data latencies, from
streaming to one-time bulk loads, both in and out
▪ Support different uses of the same data
Copyright Third Nature, Inc.
The data architecture must align with system components
because each of them addresses different data needs
Incremental
Collect
Batch
One-time copy
Real time
Manage & Integrate
Data Acquisition
Collect & Store
Data Management
Process & Integrate
Data Access
Deliver & Use
Data architecture is part of the mechanism for change isolation
Copyright Third Nature, Inc.
The data is in zones of management, not isolating layers
Raw data in an immutable
storage area
Standardized or
enhanced data
Common or
usage-
specific
data
Transient data
Relax control to enable self-service while avoiding a mess.
Do not constrain access to one zone or to a single tool.
Focus on visibility of data use, not control of data.
Copyright Third Nature, Inc.
Food supply chain: an analogy for data
Multiple contexts of use, differing quality levels
You need to keep the original because just like baking,
you can’t unmake dough once it’s mixed.
Copyright Third Nature, Inc.
This is not a single global data model
Data can live in more than one place,
in more than one zone,
in more than one form
Copyright Third Nature, Inc.
This data architecture resolves rate of change problems
Raw data in an immutable
storage area
Standardized or
enhanced data
Common or
usage-
specific data
Transient data
New data of
unknown value,
simple requests for
new data can land
here first, with little
work by IT.
More effort applied to
management, slower.
Optimized for
specific uses /
workloads. Generally
the slowest change.
Copyright Third Nature, Inc.
With abundant data our approach has to be inverted
The process we still use today:
1. Model
2. Collect
3. Analyze
The new process is:
1. Collect
2. Analyze
3. Model
4. Promote
This is a shift from
planned design to
adaptive design for
data management
Copyright Third Nature, Inc.
New environment enables adaptive use of data
Data needs change, so you need a system for evolution as well as
data infrastructure that provides stability.
Data Acquisition
Collect & Store
Incremental
Batch
One-time copy
Real time
Data Lake / LDW AE Platform Services
Data Management
Process & Integrate
Data Access
Deliver & Use
Data storage
You can’t build this all at once. You need to grow it over time.
Copyright Third Nature, Inc.
What this means is…
In this new environment the data warehouse still
has enormous value.
Data standards, rules and models provide the
most directly usable data.
Self-service can operate independently, yet still be
part of the larger framework that includes the DW
Data management still functions, and the useful
discoveries filter through from less-managed to
more-managed areas.
Copyright Third Nature, Inc.
However, our role shifts.
We no longer control so much as we guide
We aren’t designers of
perfect governance and control,
we are designers of an environment that is
resilient in the face of change
Copyright Third Nature, Inc.
Be the department of Yes
Instead of trying to
model everything in
advance, collect it.
Instead of trying to
control change,
accept it.
Instead of trying to
control what people
do with data, focus
on visibility.
Copyright Third Nature, Inc.
CC Image Attributions
Thanks to the people who supplied the creative commons licensed images used in this presentation:
pyramid_camel_rider.jpg - http://www.flickr.com/photos/khalid-almasoud/1528054134/
House on fire - http://flickr.com/photos/oldonliner/1485881035/
glass_buildings.jpg - http://www.flickr.com/photos/erikvanhannen/547701721
Building demolition - https://www.flickr.com/photos/gregpc/4429888820
peek_fence_dog.jpg - http://www.flickr.com/photos/webwalker/114998078/
donuts_4_views.jpg - http://www.flickr.com/photos/le_hibou/76718773/
chinese garden gate.jpg - http://www.flickr.com/photos/musaeum/420467301/
Copyright Third Nature, Inc.
About the Presenter
Mark Madsen is president of Third
Nature, a technology research and
consulting firm focused on business
intelligence, data integration and data
management. Mark is an award-winning
author, architect and CTO whose work
has been featured in numerous industry
publications. Over the past ten years
Mark received awards for his work from
the American Productivity & Quality
Center, TDWI, and the Smithsonian
Institute. He is an international speaker,
a contributor to Forbes Online and on
the O’Reilly Strata program committee.
For more information or to contact
Mark, follow @markmadsen on Twitter
or visit http://ThirdNature.net
About Third Nature
Third Nature is a research and consulting firm focused on new and
emerging technology and practices in analytics, business intelligence,
information strategy and data management. If your question is related to
data, analytics, information strategy and technology infrastructure then
you‘re at the right place.
Our goal is to help organizations solve problems using data. We offer
education, consulting and research services to support business and IT
organizations as well as technology vendors.
We fill the gap between what the industry analyst firms cover and what IT
needs. We specialize in product and technology analysis, so we look at
emerging technologies and markets, evaluating technology and hw it is
applied rather than vendor market positions.

More Related Content

What's hot

How to understand trends in the data & software market
How to understand trends in the data & software marketHow to understand trends in the data & software market
How to understand trends in the data & software marketmark madsen
 
5 Factors Impacting Your Big Data Project's Performance
5 Factors Impacting Your Big Data Project's Performance 5 Factors Impacting Your Big Data Project's Performance
5 Factors Impacting Your Big Data Project's Performance Qubole
 
Architecting a Platform for Enterprise Use - Strata London 2018
Architecting a Platform for Enterprise Use - Strata London 2018Architecting a Platform for Enterprise Use - Strata London 2018
Architecting a Platform for Enterprise Use - Strata London 2018mark madsen
 
Architecting a Data Platform For Enterprise Use (Strata NY 2018)
Architecting a Data Platform For Enterprise Use (Strata NY 2018)Architecting a Data Platform For Enterprise Use (Strata NY 2018)
Architecting a Data Platform For Enterprise Use (Strata NY 2018)mark madsen
 
The Evolution of Big Data Frameworks
The Evolution of Big Data FrameworksThe Evolution of Big Data Frameworks
The Evolution of Big Data FrameworkseXascale Infolab
 
Innovation med big data – chr. hansens erfaringer
Innovation med big data – chr. hansens erfaringerInnovation med big data – chr. hansens erfaringer
Innovation med big data – chr. hansens erfaringerMicrosoft
 
Making Big Data Easy for Everyone
Making Big Data Easy for EveryoneMaking Big Data Easy for Everyone
Making Big Data Easy for EveryoneCaserta
 
Building a Data Platform Strata SF 2019
Building a Data Platform Strata SF 2019Building a Data Platform Strata SF 2019
Building a Data Platform Strata SF 2019mark madsen
 
Addressing Big Data Challenges - The Hadoop Way
Addressing Big Data Challenges - The Hadoop WayAddressing Big Data Challenges - The Hadoop Way
Addressing Big Data Challenges - The Hadoop WayXoriant Corporation
 
Kave Salamatian, Universite de Savoie and Eiko Yoneki, University of Cambridg...
Kave Salamatian, Universite de Savoie and Eiko Yoneki, University of Cambridg...Kave Salamatian, Universite de Savoie and Eiko Yoneki, University of Cambridg...
Kave Salamatian, Universite de Savoie and Eiko Yoneki, University of Cambridg...i_scienceEU
 
The Black Box: Interpretability, Reproducibility, and Data Management
The Black Box: Interpretability, Reproducibility, and Data ManagementThe Black Box: Interpretability, Reproducibility, and Data Management
The Black Box: Interpretability, Reproducibility, and Data Managementmark madsen
 
Setting Up the Data Lake
Setting Up the Data LakeSetting Up the Data Lake
Setting Up the Data LakeCaserta
 
2013: Trends from the Trenches
2013: Trends from the Trenches2013: Trends from the Trenches
2013: Trends from the TrenchesChris Dagdigian
 
2015 CDC Workshop on ScienceDMZ
2015 CDC Workshop on ScienceDMZ2015 CDC Workshop on ScienceDMZ
2015 CDC Workshop on ScienceDMZChris Dagdigian
 
Expert Big Data Tips
Expert Big Data TipsExpert Big Data Tips
Expert Big Data TipsQubole
 
Integrating Big Data Technologies
Integrating Big Data TechnologiesIntegrating Big Data Technologies
Integrating Big Data TechnologiesDATAVERSITY
 
Big Data - Insights & Challenges
Big Data - Insights & ChallengesBig Data - Insights & Challenges
Big Data - Insights & ChallengesRupen Momaya
 
Leveraging open source for big data stack
Leveraging open source for big data stackLeveraging open source for big data stack
Leveraging open source for big data stackFlytxt
 

What's hot (19)

How to understand trends in the data & software market
How to understand trends in the data & software marketHow to understand trends in the data & software market
How to understand trends in the data & software market
 
5 Factors Impacting Your Big Data Project's Performance
5 Factors Impacting Your Big Data Project's Performance 5 Factors Impacting Your Big Data Project's Performance
5 Factors Impacting Your Big Data Project's Performance
 
Architecting a Platform for Enterprise Use - Strata London 2018
Architecting a Platform for Enterprise Use - Strata London 2018Architecting a Platform for Enterprise Use - Strata London 2018
Architecting a Platform for Enterprise Use - Strata London 2018
 
Architecting a Data Platform For Enterprise Use (Strata NY 2018)
Architecting a Data Platform For Enterprise Use (Strata NY 2018)Architecting a Data Platform For Enterprise Use (Strata NY 2018)
Architecting a Data Platform For Enterprise Use (Strata NY 2018)
 
The Evolution of Big Data Frameworks
The Evolution of Big Data FrameworksThe Evolution of Big Data Frameworks
The Evolution of Big Data Frameworks
 
Innovation med big data – chr. hansens erfaringer
Innovation med big data – chr. hansens erfaringerInnovation med big data – chr. hansens erfaringer
Innovation med big data – chr. hansens erfaringer
 
Making Big Data Easy for Everyone
Making Big Data Easy for EveryoneMaking Big Data Easy for Everyone
Making Big Data Easy for Everyone
 
Building a Data Platform Strata SF 2019
Building a Data Platform Strata SF 2019Building a Data Platform Strata SF 2019
Building a Data Platform Strata SF 2019
 
Addressing Big Data Challenges - The Hadoop Way
Addressing Big Data Challenges - The Hadoop WayAddressing Big Data Challenges - The Hadoop Way
Addressing Big Data Challenges - The Hadoop Way
 
Kave Salamatian, Universite de Savoie and Eiko Yoneki, University of Cambridg...
Kave Salamatian, Universite de Savoie and Eiko Yoneki, University of Cambridg...Kave Salamatian, Universite de Savoie and Eiko Yoneki, University of Cambridg...
Kave Salamatian, Universite de Savoie and Eiko Yoneki, University of Cambridg...
 
Big Data: Issues and Challenges
Big Data: Issues and ChallengesBig Data: Issues and Challenges
Big Data: Issues and Challenges
 
The Black Box: Interpretability, Reproducibility, and Data Management
The Black Box: Interpretability, Reproducibility, and Data ManagementThe Black Box: Interpretability, Reproducibility, and Data Management
The Black Box: Interpretability, Reproducibility, and Data Management
 
Setting Up the Data Lake
Setting Up the Data LakeSetting Up the Data Lake
Setting Up the Data Lake
 
2013: Trends from the Trenches
2013: Trends from the Trenches2013: Trends from the Trenches
2013: Trends from the Trenches
 
2015 CDC Workshop on ScienceDMZ
2015 CDC Workshop on ScienceDMZ2015 CDC Workshop on ScienceDMZ
2015 CDC Workshop on ScienceDMZ
 
Expert Big Data Tips
Expert Big Data TipsExpert Big Data Tips
Expert Big Data Tips
 
Integrating Big Data Technologies
Integrating Big Data TechnologiesIntegrating Big Data Technologies
Integrating Big Data Technologies
 
Big Data - Insights & Challenges
Big Data - Insights & ChallengesBig Data - Insights & Challenges
Big Data - Insights & Challenges
 
Leveraging open source for big data stack
Leveraging open source for big data stackLeveraging open source for big data stack
Leveraging open source for big data stack
 

Viewers also liked

Estado del arte del BI | Jornada Madrid 2014 | UOC
Estado del arte del BI | Jornada Madrid 2014 | UOCEstado del arte del BI | Jornada Madrid 2014 | UOC
Estado del arte del BI | Jornada Madrid 2014 | UOCJosep Curto
 
Data as Seductive Material, Spring Summit, Umeå March09
Data as Seductive Material, Spring Summit, Umeå March09Data as Seductive Material, Spring Summit, Umeå March09
Data as Seductive Material, Spring Summit, Umeå March09Matt Jones
 
Database Shootout: What's best for BI?
Database Shootout: What's best for BI?Database Shootout: What's best for BI?
Database Shootout: What's best for BI?Jos van Dongen
 
5 Signs You Need to Re-Think Your Data Integration Strategy
5 Signs You Need to Re-Think Your Data Integration Strategy5 Signs You Need to Re-Think Your Data Integration Strategy
5 Signs You Need to Re-Think Your Data Integration StrategyDarren Cunningham
 
Using Data Virtualization to Integrate With Big Data
Using Data Virtualization to Integrate With Big DataUsing Data Virtualization to Integrate With Big Data
Using Data Virtualization to Integrate With Big Datamark madsen
 
IOUG93 - Technical Architecture for the Data Warehouse - Presentation
IOUG93 - Technical Architecture for the Data Warehouse - PresentationIOUG93 - Technical Architecture for the Data Warehouse - Presentation
IOUG93 - Technical Architecture for the Data Warehouse - PresentationDavid Walker
 
Benefits of a data warehouse presentation by Being topper
Benefits of a data warehouse presentation by Being topperBenefits of a data warehouse presentation by Being topper
Benefits of a data warehouse presentation by Being topperBeing Topper
 
Bimodal IT and EDW Modernization
Bimodal IT and EDW ModernizationBimodal IT and EDW Modernization
Bimodal IT and EDW ModernizationRobert Gleave
 
Data Warehouse Concepts and Architecture
Data Warehouse Concepts and ArchitectureData Warehouse Concepts and Architecture
Data Warehouse Concepts and ArchitectureMohd Tousif
 
Parametric report slide support
Parametric report slide supportParametric report slide support
Parametric report slide supportSpagoWorld
 
Data warehouse architecture
Data warehouse architectureData warehouse architecture
Data warehouse architectureuncleRhyme
 
Extended Data Warehouse - A New Data Architecture for Modern BI with Claudia ...
Extended Data Warehouse - A New Data Architecture for Modern BI with Claudia ...Extended Data Warehouse - A New Data Architecture for Modern BI with Claudia ...
Extended Data Warehouse - A New Data Architecture for Modern BI with Claudia ...Denodo
 
Data warehouse inmon versus kimball 2
Data warehouse inmon versus kimball 2Data warehouse inmon versus kimball 2
Data warehouse inmon versus kimball 2Mike Frampton
 
Building the Enterprise Data Lake: A look at architecture
Building the Enterprise Data Lake: A look at architectureBuilding the Enterprise Data Lake: A look at architecture
Building the Enterprise Data Lake: A look at architecturemark madsen
 
3 tier data warehouse
3 tier data warehouse3 tier data warehouse
3 tier data warehouseJ M
 
Data warehouse concepts
Data warehouse conceptsData warehouse concepts
Data warehouse conceptsobieefans
 

Viewers also liked (20)

Estado del arte del BI | Jornada Madrid 2014 | UOC
Estado del arte del BI | Jornada Madrid 2014 | UOCEstado del arte del BI | Jornada Madrid 2014 | UOC
Estado del arte del BI | Jornada Madrid 2014 | UOC
 
Data as Seductive Material, Spring Summit, Umeå March09
Data as Seductive Material, Spring Summit, Umeå March09Data as Seductive Material, Spring Summit, Umeå March09
Data as Seductive Material, Spring Summit, Umeå March09
 
Database Shootout: What's best for BI?
Database Shootout: What's best for BI?Database Shootout: What's best for BI?
Database Shootout: What's best for BI?
 
Business Intelligence In The Cloud
Business Intelligence In The CloudBusiness Intelligence In The Cloud
Business Intelligence In The Cloud
 
5 Signs You Need to Re-Think Your Data Integration Strategy
5 Signs You Need to Re-Think Your Data Integration Strategy5 Signs You Need to Re-Think Your Data Integration Strategy
5 Signs You Need to Re-Think Your Data Integration Strategy
 
Using Data Virtualization to Integrate With Big Data
Using Data Virtualization to Integrate With Big DataUsing Data Virtualization to Integrate With Big Data
Using Data Virtualization to Integrate With Big Data
 
IOUG93 - Technical Architecture for the Data Warehouse - Presentation
IOUG93 - Technical Architecture for the Data Warehouse - PresentationIOUG93 - Technical Architecture for the Data Warehouse - Presentation
IOUG93 - Technical Architecture for the Data Warehouse - Presentation
 
Benefits of a data warehouse presentation by Being topper
Benefits of a data warehouse presentation by Being topperBenefits of a data warehouse presentation by Being topper
Benefits of a data warehouse presentation by Being topper
 
Introduction to Data Warehousing
Introduction to Data WarehousingIntroduction to Data Warehousing
Introduction to Data Warehousing
 
Bimodal IT and EDW Modernization
Bimodal IT and EDW ModernizationBimodal IT and EDW Modernization
Bimodal IT and EDW Modernization
 
Data Warehouse Concepts and Architecture
Data Warehouse Concepts and ArchitectureData Warehouse Concepts and Architecture
Data Warehouse Concepts and Architecture
 
Parametric report slide support
Parametric report slide supportParametric report slide support
Parametric report slide support
 
Data warehouse architecture
Data warehouse architectureData warehouse architecture
Data warehouse architecture
 
Inmon & kimball method
Inmon & kimball methodInmon & kimball method
Inmon & kimball method
 
Extended Data Warehouse - A New Data Architecture for Modern BI with Claudia ...
Extended Data Warehouse - A New Data Architecture for Modern BI with Claudia ...Extended Data Warehouse - A New Data Architecture for Modern BI with Claudia ...
Extended Data Warehouse - A New Data Architecture for Modern BI with Claudia ...
 
Data warehouse inmon versus kimball 2
Data warehouse inmon versus kimball 2Data warehouse inmon versus kimball 2
Data warehouse inmon versus kimball 2
 
Building the Enterprise Data Lake: A look at architecture
Building the Enterprise Data Lake: A look at architectureBuilding the Enterprise Data Lake: A look at architecture
Building the Enterprise Data Lake: A look at architecture
 
3 tier data warehouse
3 tier data warehouse3 tier data warehouse
3 tier data warehouse
 
Data warehouse concepts
Data warehouse conceptsData warehouse concepts
Data warehouse concepts
 
OLAP
OLAPOLAP
OLAP
 

Similar to Modernizing the Data Warehouse Architecture for Flexibility and Change

What is big data
What is big dataWhat is big data
What is big dataShubShubi
 
big data and machine learning ppt.pptx
big data and machine learning ppt.pptxbig data and machine learning ppt.pptx
big data and machine learning ppt.pptxNATASHABANO
 
Wake up and smell the data
Wake up and smell the dataWake up and smell the data
Wake up and smell the datamark madsen
 
Building the Enterprise Data Lake - Important Considerations Before You Jump In
Building the Enterprise Data Lake - Important Considerations Before You Jump InBuilding the Enterprise Data Lake - Important Considerations Before You Jump In
Building the Enterprise Data Lake - Important Considerations Before You Jump InSnapLogic
 
Ab cs of big data
Ab cs of big dataAb cs of big data
Ab cs of big dataDigimark
 
Overview of mit sloan case study on ge data and analytics initiative titled g...
Overview of mit sloan case study on ge data and analytics initiative titled g...Overview of mit sloan case study on ge data and analytics initiative titled g...
Overview of mit sloan case study on ge data and analytics initiative titled g...Gregg Barrett
 
Big Data: Are you ready for it? Can you handle it?
Big Data: Are you ready for it? Can you handle it? Big Data: Are you ready for it? Can you handle it?
Big Data: Are you ready for it? Can you handle it? ScaleFocus
 
The Evolving Role of the Data Engineer - Whitepaper | Qubole
The Evolving Role of the Data Engineer - Whitepaper | QuboleThe Evolving Role of the Data Engineer - Whitepaper | Qubole
The Evolving Role of the Data Engineer - Whitepaper | QuboleVasu S
 
Python's Role in the Future of Data Analysis
Python's Role in the Future of Data AnalysisPython's Role in the Future of Data Analysis
Python's Role in the Future of Data AnalysisPeter Wang
 
Bda assignment can also be used for BDA notes and concept understanding.
Bda assignment can also be used for BDA notes and concept understanding.Bda assignment can also be used for BDA notes and concept understanding.
Bda assignment can also be used for BDA notes and concept understanding.Aditya205306
 
big data Big Things
big data Big Thingsbig data Big Things
big data Big Thingspateelhs
 
CollaborativeDatasetBuilding
CollaborativeDatasetBuildingCollaborativeDatasetBuilding
CollaborativeDatasetBuildingArmaan Bindra
 

Similar to Modernizing the Data Warehouse Architecture for Flexibility and Change (20)

What is big data
What is big dataWhat is big data
What is big data
 
Big data rmoug
Big data rmougBig data rmoug
Big data rmoug
 
big data and machine learning ppt.pptx
big data and machine learning ppt.pptxbig data and machine learning ppt.pptx
big data and machine learning ppt.pptx
 
The ABCs of Big Data
The ABCs of Big DataThe ABCs of Big Data
The ABCs of Big Data
 
Wake up and smell the data
Wake up and smell the dataWake up and smell the data
Wake up and smell the data
 
Building the Enterprise Data Lake - Important Considerations Before You Jump In
Building the Enterprise Data Lake - Important Considerations Before You Jump InBuilding the Enterprise Data Lake - Important Considerations Before You Jump In
Building the Enterprise Data Lake - Important Considerations Before You Jump In
 
Ab cs of big data
Ab cs of big dataAb cs of big data
Ab cs of big data
 
Fundamentals of Big Data
Fundamentals of Big DataFundamentals of Big Data
Fundamentals of Big Data
 
Overview of mit sloan case study on ge data and analytics initiative titled g...
Overview of mit sloan case study on ge data and analytics initiative titled g...Overview of mit sloan case study on ge data and analytics initiative titled g...
Overview of mit sloan case study on ge data and analytics initiative titled g...
 
1
11
1
 
Big Data: Are you ready for it? Can you handle it?
Big Data: Are you ready for it? Can you handle it? Big Data: Are you ready for it? Can you handle it?
Big Data: Are you ready for it? Can you handle it?
 
Big data
Big dataBig data
Big data
 
Data lake ppt
Data lake pptData lake ppt
Data lake ppt
 
The Evolving Role of the Data Engineer - Whitepaper | Qubole
The Evolving Role of the Data Engineer - Whitepaper | QuboleThe Evolving Role of the Data Engineer - Whitepaper | Qubole
The Evolving Role of the Data Engineer - Whitepaper | Qubole
 
Python's Role in the Future of Data Analysis
Python's Role in the Future of Data AnalysisPython's Role in the Future of Data Analysis
Python's Role in the Future of Data Analysis
 
Video 1 big data
Video 1 big dataVideo 1 big data
Video 1 big data
 
Bda assignment can also be used for BDA notes and concept understanding.
Bda assignment can also be used for BDA notes and concept understanding.Bda assignment can also be used for BDA notes and concept understanding.
Bda assignment can also be used for BDA notes and concept understanding.
 
Data mining with big data
Data mining with big dataData mining with big data
Data mining with big data
 
big data Big Things
big data Big Thingsbig data Big Things
big data Big Things
 
CollaborativeDatasetBuilding
CollaborativeDatasetBuildingCollaborativeDatasetBuilding
CollaborativeDatasetBuilding
 

More from mark madsen

Operationalizing Machine Learning in the Enterprise
Operationalizing Machine Learning in the EnterpriseOperationalizing Machine Learning in the Enterprise
Operationalizing Machine Learning in the Enterprisemark madsen
 
A Brief Tour through the Geology & Endemic Botany of the Klamath-Siskiyou Range
A Brief Tour through the Geology & Endemic Botany of the Klamath-Siskiyou RangeA Brief Tour through the Geology & Endemic Botany of the Klamath-Siskiyou Range
A Brief Tour through the Geology & Endemic Botany of the Klamath-Siskiyou Rangemark madsen
 
Pay no attention to the man behind the curtain - the unseen work behind data ...
Pay no attention to the man behind the curtain - the unseen work behind data ...Pay no attention to the man behind the curtain - the unseen work behind data ...
Pay no attention to the man behind the curtain - the unseen work behind data ...mark madsen
 
A Pragmatic Approach to Analyzing Customers
A Pragmatic Approach to Analyzing CustomersA Pragmatic Approach to Analyzing Customers
A Pragmatic Approach to Analyzing Customersmark madsen
 
Briefing Room analyst comments - streaming analytics
Briefing Room analyst comments - streaming analyticsBriefing Room analyst comments - streaming analytics
Briefing Room analyst comments - streaming analyticsmark madsen
 
On the edge: analytics for the modern enterprise (analyst comments)
On the edge: analytics for the modern enterprise (analyst comments)On the edge: analytics for the modern enterprise (analyst comments)
On the edge: analytics for the modern enterprise (analyst comments)mark madsen
 
Crossing the chasm with a high performance dynamically scalable open source p...
Crossing the chasm with a high performance dynamically scalable open source p...Crossing the chasm with a high performance dynamically scalable open source p...
Crossing the chasm with a high performance dynamically scalable open source p...mark madsen
 
Don't let data get in the way of a good story
Don't let data get in the way of a good storyDon't let data get in the way of a good story
Don't let data get in the way of a good storymark madsen
 
Don't follow the followers
Don't follow the followersDon't follow the followers
Don't follow the followersmark madsen
 
Exploring cloud for data warehousing
Exploring cloud for data warehousingExploring cloud for data warehousing
Exploring cloud for data warehousingmark madsen
 
Open Data: Free Data Isn't the Same as Freeing Data
Open Data: Free Data Isn't the Same as Freeing DataOpen Data: Free Data Isn't the Same as Freeing Data
Open Data: Free Data Isn't the Same as Freeing Datamark madsen
 
Exploring cloud for data warehousing
Exploring cloud for data warehousingExploring cloud for data warehousing
Exploring cloud for data warehousingmark madsen
 
Big Data Wonderland: Two Views on the Big Data Revolution
Big Data Wonderland: Two Views on the Big Data RevolutionBig Data Wonderland: Two Views on the Big Data Revolution
Big Data Wonderland: Two Views on the Big Data Revolutionmark madsen
 
One Size Doesn't Fit All: The New Database Revolution
One Size Doesn't Fit All: The New Database RevolutionOne Size Doesn't Fit All: The New Database Revolution
One Size Doesn't Fit All: The New Database Revolutionmark madsen
 

More from mark madsen (14)

Operationalizing Machine Learning in the Enterprise
Operationalizing Machine Learning in the EnterpriseOperationalizing Machine Learning in the Enterprise
Operationalizing Machine Learning in the Enterprise
 
A Brief Tour through the Geology & Endemic Botany of the Klamath-Siskiyou Range
A Brief Tour through the Geology & Endemic Botany of the Klamath-Siskiyou RangeA Brief Tour through the Geology & Endemic Botany of the Klamath-Siskiyou Range
A Brief Tour through the Geology & Endemic Botany of the Klamath-Siskiyou Range
 
Pay no attention to the man behind the curtain - the unseen work behind data ...
Pay no attention to the man behind the curtain - the unseen work behind data ...Pay no attention to the man behind the curtain - the unseen work behind data ...
Pay no attention to the man behind the curtain - the unseen work behind data ...
 
A Pragmatic Approach to Analyzing Customers
A Pragmatic Approach to Analyzing CustomersA Pragmatic Approach to Analyzing Customers
A Pragmatic Approach to Analyzing Customers
 
Briefing Room analyst comments - streaming analytics
Briefing Room analyst comments - streaming analyticsBriefing Room analyst comments - streaming analytics
Briefing Room analyst comments - streaming analytics
 
On the edge: analytics for the modern enterprise (analyst comments)
On the edge: analytics for the modern enterprise (analyst comments)On the edge: analytics for the modern enterprise (analyst comments)
On the edge: analytics for the modern enterprise (analyst comments)
 
Crossing the chasm with a high performance dynamically scalable open source p...
Crossing the chasm with a high performance dynamically scalable open source p...Crossing the chasm with a high performance dynamically scalable open source p...
Crossing the chasm with a high performance dynamically scalable open source p...
 
Don't let data get in the way of a good story
Don't let data get in the way of a good storyDon't let data get in the way of a good story
Don't let data get in the way of a good story
 
Don't follow the followers
Don't follow the followersDon't follow the followers
Don't follow the followers
 
Exploring cloud for data warehousing
Exploring cloud for data warehousingExploring cloud for data warehousing
Exploring cloud for data warehousing
 
Open Data: Free Data Isn't the Same as Freeing Data
Open Data: Free Data Isn't the Same as Freeing DataOpen Data: Free Data Isn't the Same as Freeing Data
Open Data: Free Data Isn't the Same as Freeing Data
 
Exploring cloud for data warehousing
Exploring cloud for data warehousingExploring cloud for data warehousing
Exploring cloud for data warehousing
 
Big Data Wonderland: Two Views on the Big Data Revolution
Big Data Wonderland: Two Views on the Big Data RevolutionBig Data Wonderland: Two Views on the Big Data Revolution
Big Data Wonderland: Two Views on the Big Data Revolution
 
One Size Doesn't Fit All: The New Database Revolution
One Size Doesn't Fit All: The New Database RevolutionOne Size Doesn't Fit All: The New Database Revolution
One Size Doesn't Fit All: The New Database Revolution
 

Recently uploaded

Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfSocial Samosa
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Sapana Sha
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)jennyeacort
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改yuu sss
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectBoston Institute of Analytics
 
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degreeyuu sss
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...limedy534
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhijennyeacort
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...soniya singh
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptxthyngster
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDRafezzaman
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxEmmanuel Dauda
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFAAndrei Kaleshka
 
Multiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfMultiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfchwongval
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Colleen Farrelly
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.natarajan8993
 
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryJeremy Anderson
 
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档208367051
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort servicejennyeacort
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Jack DiGiovanna
 

Recently uploaded (20)

Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis Project
 
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptx
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFA
 
Multiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfMultiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdf
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.
 
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data Story
 
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
 

Modernizing the Data Warehouse Architecture for Flexibility and Change

  • 1. Everything Has Changed Except Us Modernizing the Data Warehouse Architecture June, 2016 Mark Madsen www.ThirdNature.net @markmadsen
  • 2. Copyright Third Nature, Inc. The big data market is: … a leap forward A leap in evolution to a more flexible way of gathering data and generating useful information. ▪ Decide if data is good enough at the time of use ▪ Files are flexible ▪ Save time collecting data ▪ “self service” A new approach to managing and using data.
  • 3. Copyright Third Nature, Inc. The big data market is: … a step backward A step backward to methods not capable of providing the quality, manageability, accessibility and reuse we need. ▪ Ensure data quality up front rather than after the fact ▪ Tables are stable ▪ Save time using data ▪ “self service” A reinvention of the wheel, going back to the manual era.
  • 4. Copyright Third Nature, Inc. The architecture we are (still) using today The general concept of a separate architecture for BI has been around longer, but this paper by Devlin and Murphy is the first formal data warehouse architecture and definition published. 4 “An architecture for a business and information system”, B. A. Devlin, P. T. Murphy, IBM Systems Journal, Vol.27, No. 1, (1988) Slide 4Copyright Third Nature, Inc.
  • 5. Copyright Third Nature, Inc. Environment in 1988: no big data, only big hair. ▪ No real commercial email, public internet barely started ▪ Storage state of the art: 100MB, cost $10,000/GB ▪ Oracle Applications v1 GL released; SAP goes public, enters US market ▪ Unix is mostly run by long-haired freaks ▪ Mobile was this The environment: scarcity of data, of reports, of computer systems (outside of core financials), of resources, of money to pay for storage.
  • 6. Copyright Third Nature, Inc. Data warehouse: centralize, that solves all problems! Creates bottlenecks Causes scale problems Enforces a single model
  • 7. Copyright Third Nature, Inc. The data lake solution: no central authority wtf, it was fully operational!
  • 8. Copyright Third Nature, Inc. The data lake solution? There’s a problem: as the lake is envisioned, it is still a centralized data architecture, but this time there is no single global model. Instead it’s files and not modeled. It’s still a death star.
  • 9. Copyright Third Nature, Inc. Eventually we run into the same problems Seriously, wtf? It was agile and operational
  • 10. Copyright Third Nature, Inc. Growing complexity has changed our environment You can barely keep up with source changes You can’t keep up with new data requests You are scale, performance and latency limited But: The organizations want more, different, current data The reality is continuous change – lots of maintenance
  • 11. Copyright Third Nature, Inc. Meanwhile, we’ve become the department of “No” • Environments are more complex • Data volumes are 10,000X larger • Data is more complex • More use of BI • We have entirely new data uses We’ve been struggling with shrinking load windows, performance problems, and an inability to quickly meet new data and analytics requests for years, yet we keep using the same design.
  • 12. Copyright Third Nature, Inc. I never said the “E” in EDW meant “everything”… What do you mean, “3 months?”
  • 13. Copyright Third Nature, Inc. It’s going to get a lot worse Not E E Conclusion: any methodology built on the premise that you must know and model all data before using it is untenable.
  • 14. Copyright Third Nature, Inc. Streams of events and observations This data doesn’t fit well with current methods of collection and storage, or with the technology to process and analyze it. Copyright Third Nature, Inc.
  • 15. Copyright Third Nature, Inc. Declarations The problems of events apply to text. Which can be an event payload
  • 16. Copyright Third Nature, Inc. Categorizing the measurement data we collect The convenient data is the transactional data. ▪ Goes in the DW and is used, even if it isn’t the right measurement. The inconvenient data is the observational data. ▪ It’s not neat, clean, or designed into most systems of operation. The difficult data is the declarative data. ▪ What people say and what they do require ground truth. We need an architecture that supports all three categories. Copyright Third Nature, Inc.
  • 17. Copyright Third Nature, Inc. New uses: analytics embiggens the data volume problem Many of the algorithms we need them to make sense of the observations and declarations are are O(n2) or worse.
  • 18. Copyright Third Nature, Inc. Publishing is the primary view of BI, self service, big data
  • 19. Copyright Third Nature, Inc. What people do with data today: not just read it Explore and Understand Inform and Explain Convince and Decice Deliver Process Collect
  • 20. Copyright Third Nature, Inc. Data architecture requires understanding data use so we can build the right infrastructure Collect new data Monitor Analyze Exceptions Analyze Causes Decide Act Act on the process Act within the process We need to focus on what people do with information as the primary task, not on the data or the technology.
  • 21. Copyright Third Nature, Inc. The usage models for conventional BI Collect new data Monitor Analyze Exceptions Analyze Causes Decide Act No problem No idea Do nothing Act on the process Usually days/longer timeframe Act within the process Usually real-time to daily This is what we’ve been doing with BI so far: static reporting, dashboards, ad-hoc query, OLAP
  • 22. Copyright Third Nature, Inc. The usage models for analytics and “big data” Collect new data Monitor Analyze Exceptions Analyze Causes Decide Act No problem No idea Do nothing Act on the process Usually days/longer timeframe Act within the process Usually real-time to daily Analytics and exploration is focused on new use cases: deeper analysis, causes, prediction, optimizing decisions This isn’t ad-hoc query, reporting, or OLAP.
  • 23. Copyright Third Nature, Inc. BI is part of a dynamic system. There are feedback loops that can change both the data and the model. Collect new data Monitor Analyze Exceptions Analyze Causes Decide Act Act on the process Act within the process Known: stable, can be modeled in advance Unknown: new, you can’t model in advance
  • 24. Copyright Third Nature, Inc. The root cause of our problems is change Our rate of change is slower than the business. This can’t be fixed via buying more technology.
  • 25. Copyright Third Nature, Inc. We have a design for stability. We need one for adaptability
  • 26. Copyright Third Nature, Inc. Agile methods without agile architectures fail
  • 27. Copyright Third Nature, Inc. A core problem with one big schema is change
  • 28. Copyright Third Nature, Inc. The big data market has an answer…
  • 29. Copyright Third Nature, Inc. Big data answer? Schema-on-read! Rather than make data fit the model, make the model fit the data. There’s a price to pay with using “schema-on- read” for everything. You won’t see the problems with this until you add a second application, and a third.
  • 30. Copyright Third Nature, Inc. The data lake: just dump the data in!
  • 31. Copyright Third Nature, Inc. Combine with self- service: we’ll figure it all out later! Aren’t we back where we started?
  • 32. Copyright Third Nature, Inc. Data hoarding is not a data management strategy
  • 33. Copyright Third Nature, Inc. What does all of this imply? The data is not always known in advance, so it can’t be modeled in advance. The data architecture must be read-write from both the back and front, not a one-way data flow. The data written back may be repeatedly used, persistent data, or it may be temporary. The data may arrive with any frequency, and the rate may not be under your control.
  • 34. Copyright Third Nature, Inc. Slide 34 The solution to our problems isn’t technology, it’s architecture.
  • 35. Copyright Third Nature, Inc. Big data is a leap forward flexibility speed save dev time Big data is a fall backward repeatability quality save user time Choose one or the other.
  • 36. Copyright Third Nature, Inc. What if we could reconcile two opposing ideas using ideas borrowed from other domains? Big data flexibility speed save dev time BI / DW repeatability quality save user time Choose both
  • 37. © Third Nature Inc.© Third Nature Inc. Separating fast from slow: pace layering and change in buildings “How Buildings Learn”, Stewart Brand Complex systems can be decomposed into layers that change at different rates.
  • 38. Copyright Third Nature, Inc. fast layers: absorb change propose solutions learn get all the attention: fixtures like Data discovery tools
  • 39. Copyright Third Nature, Inc. integrate change constrain options remember slow layers: do all the work: plumbing like Data warehouses
  • 40. Copyright Third Nature, Inc. Architecture: components and layering Components above: flexibility, repurposing, quicker change above Application Layers below: stability, reuse, slow predictable change below Infrastructure We thought this was the schema… Infrastructure is just a layer carefully chosen after a lot of experience
  • 41. Copyright Third Nature, Inc. Decoupling: design for isolation of unrelated change
  • 42. Copyright Third Nature, Inc. In reality, you are building three systems, not one There are three things happening inside a DW: ▪ Data acquisition ▪ Data management ▪ Data delivery Isolate them and their uses of data from one another. Data Warehouse Separate the component systems to isolate unrelated change
  • 43. Copyright Third Nature, Inc. The goal is to decouple: solve the application and infrastructure problems separately Platform Services Data Access Deliver & Use Data storage This separates BI from other uses of data , allowing each type of use to structure the data specific to its own requirements. Data access is already somewhat separate today. Make the separation of different access methods a formal part of the architecture. Don’t force one model.
  • 44. Copyright Third Nature, Inc. The goal is to decouple: solve the application and infrastructure problems separately Platform Services Data Management Process & Integrate Data storage Data management should not be subject to the constraints of a single use Data management was blended with both data acquisition and structuring data for client tools. It should be an independent function.
  • 45. Copyright Third Nature, Inc. The goal is to decouple: solve the application and infrastructure problems separately Data Acquisition Collect & Store Incremental Batch One-time copy Real time Platform Services Data storage Data arrives in many latencies, from real-time to one-time. Acquisition can’t be limited by the management or consumption layers. Data acquisition should not be directly tied to the needs of consumption. It must operate independently of data use.
  • 46. Copyright Third Nature, Inc. The full analytic environment subsumes all the functions of the data warehouse, and extends them Data Acquisition Collect & Store Incremental Batch One-time copy Real time Platform Services Data Management Process & Integrate Data Access Deliver & Use Data storage The platform has to do more than serve queries; it has to be read-write.
  • 47. Copyright Third Nature, Inc. DATA ARCHITECTURE We’re so focused on the light switch that we’re not talking about the light
  • 48. Copyright Third Nature, Inc. As with the code, decouple the data architecture The core of the data warehouse isn’t the database, it’s the data architecture that the database and tools implement. We need a data architecture that is not limiting: ▪ Deals with data and schema change easily ▪ Does not always require up front modeling ▪ Does not limit the format or structure of data ▪ Assumes a full range of data latencies, from streaming to one-time bulk loads, both in and out ▪ Support different uses of the same data
  • 49. Copyright Third Nature, Inc. The data architecture must align with system components because each of them addresses different data needs Incremental Collect Batch One-time copy Real time Manage & Integrate Data Acquisition Collect & Store Data Management Process & Integrate Data Access Deliver & Use Data architecture is part of the mechanism for change isolation
  • 50. Copyright Third Nature, Inc. The data is in zones of management, not isolating layers Raw data in an immutable storage area Standardized or enhanced data Common or usage- specific data Transient data Relax control to enable self-service while avoiding a mess. Do not constrain access to one zone or to a single tool. Focus on visibility of data use, not control of data.
  • 51. Copyright Third Nature, Inc. Food supply chain: an analogy for data Multiple contexts of use, differing quality levels You need to keep the original because just like baking, you can’t unmake dough once it’s mixed.
  • 52. Copyright Third Nature, Inc. This is not a single global data model Data can live in more than one place, in more than one zone, in more than one form
  • 53. Copyright Third Nature, Inc. This data architecture resolves rate of change problems Raw data in an immutable storage area Standardized or enhanced data Common or usage- specific data Transient data New data of unknown value, simple requests for new data can land here first, with little work by IT. More effort applied to management, slower. Optimized for specific uses / workloads. Generally the slowest change.
  • 54. Copyright Third Nature, Inc. With abundant data our approach has to be inverted The process we still use today: 1. Model 2. Collect 3. Analyze The new process is: 1. Collect 2. Analyze 3. Model 4. Promote This is a shift from planned design to adaptive design for data management
  • 55. Copyright Third Nature, Inc. New environment enables adaptive use of data Data needs change, so you need a system for evolution as well as data infrastructure that provides stability. Data Acquisition Collect & Store Incremental Batch One-time copy Real time Data Lake / LDW AE Platform Services Data Management Process & Integrate Data Access Deliver & Use Data storage You can’t build this all at once. You need to grow it over time.
  • 56. Copyright Third Nature, Inc. What this means is… In this new environment the data warehouse still has enormous value. Data standards, rules and models provide the most directly usable data. Self-service can operate independently, yet still be part of the larger framework that includes the DW Data management still functions, and the useful discoveries filter through from less-managed to more-managed areas.
  • 57. Copyright Third Nature, Inc. However, our role shifts. We no longer control so much as we guide We aren’t designers of perfect governance and control, we are designers of an environment that is resilient in the face of change
  • 58. Copyright Third Nature, Inc. Be the department of Yes Instead of trying to model everything in advance, collect it. Instead of trying to control change, accept it. Instead of trying to control what people do with data, focus on visibility.
  • 59. Copyright Third Nature, Inc. CC Image Attributions Thanks to the people who supplied the creative commons licensed images used in this presentation: pyramid_camel_rider.jpg - http://www.flickr.com/photos/khalid-almasoud/1528054134/ House on fire - http://flickr.com/photos/oldonliner/1485881035/ glass_buildings.jpg - http://www.flickr.com/photos/erikvanhannen/547701721 Building demolition - https://www.flickr.com/photos/gregpc/4429888820 peek_fence_dog.jpg - http://www.flickr.com/photos/webwalker/114998078/ donuts_4_views.jpg - http://www.flickr.com/photos/le_hibou/76718773/ chinese garden gate.jpg - http://www.flickr.com/photos/musaeum/420467301/
  • 60. Copyright Third Nature, Inc. About the Presenter Mark Madsen is president of Third Nature, a technology research and consulting firm focused on business intelligence, data integration and data management. Mark is an award-winning author, architect and CTO whose work has been featured in numerous industry publications. Over the past ten years Mark received awards for his work from the American Productivity & Quality Center, TDWI, and the Smithsonian Institute. He is an international speaker, a contributor to Forbes Online and on the O’Reilly Strata program committee. For more information or to contact Mark, follow @markmadsen on Twitter or visit http://ThirdNature.net
  • 61. About Third Nature Third Nature is a research and consulting firm focused on new and emerging technology and practices in analytics, business intelligence, information strategy and data management. If your question is related to data, analytics, information strategy and technology infrastructure then you‘re at the right place. Our goal is to help organizations solve problems using data. We offer education, consulting and research services to support business and IT organizations as well as technology vendors. We fill the gap between what the industry analyst firms cover and what IT needs. We specialize in product and technology analysis, so we look at emerging technologies and markets, evaluating technology and hw it is applied rather than vendor market positions.