@rwerschkull
nl.linkedin.com/in/rogierwerschkull
Data Product Thinking
Will ‘the Data Mesh’ save us from analytics misery?
Rogier Werschkull, RogerData
version 1.0, date 1-6-2022
Betteridge's law of headlines…
Any headline that ends in a question mark can
be answered by the word NO
No, it won’t ‘save us’…
There is NO quick fix to become ‘Data Driven’
• or ‘information supported’
• or whatever you want to call what we are doing here…
But there are Data Mesh aspects that make sense!
• …if you ignore most of the (tech) vendor washing…
As in…
About me...
• Rogier Werschkull
• 21 years in the field of data, data warehousing and Business Intelligence
• Data architecture advice, data modeling, data engineering, data-analytics product owner
• Blogger, trainer, conference speaker
• Contact details:
  • www.linkedin.com/in/rogierwerschkull/
  • rogier@rogerdata.nl
  • @rwerschkull
80-90%?
Failure…
Photo credit: https://highfiveexports.wordpress.com/2010/06/25/3000-pieces-lego-mix-specialty-pieces-rare-pieces-bricks-blocks-parts-more-ultimate-lot-of-lego-parts-pieces-lego-for-sale-lego-batman-lego-starwars-lego-technic-lego-minifigur/
“in an average organization the car park or art
collection is better managed than data.”
Gartner analyst Frank Buytendijk
“Every single company I've worked at and talked to has the
same problem without a single exception so far:
poor data quality...
Either there's incomplete data, missing data,
duplicative data.”
Ruslan Belkin, former VP of Engineering @ Twitter and Salesforce
What is Data Mesh? (1)
Data mesh tries (again?) to combat the ‘Analytics Misery’ we tend to create…
HOW?
• By decentralizing most ‘data warehousing concerns’ to individual business domains, either:
  1. in the operational source system itself,
  2. or in a decentralized DWH team that sits ‘closer’ to the source systems
• By ‘calling out’ the organizational / cultural change required to accomplish this…
Data mesh is described in the official Data Mesh book by Zhamak Dehghani (Thoughtworks) as follows:
‘Data mesh is a decentralized sociotechnical approach to share,
access, and manage analytical data in complex and large-scale
environments—within or across organizations’
What is Data Mesh? (2)
For whom and when?
Only for large organizations, as in:
• Having a lot of source systems
• Having a lot of employees
• Having clear / separated business domains
Only for large organizations that…
• are not afraid to experiment
  • and can live with the current absence of viable implementation patterns
• have a mature (centralized?) data / analytics department
• preferably can influence the design / development of the operational applications they use
Data mesh is foremost about people and processes
• About changing the data-analytics ‘culture’
About the process of collaborating on…
• creating decentralized, valuable data products within a ‘business domain’
• and sharing data between these domains
NOT about technology!
[Diagram labels: People, Process, Data, Technology]
Tech bullshit example: starburst.io
…‘sort of’ resolved for now!
In January 2022:
In May 2022:
Quite often the book’s content does not help in this aspect…
• It’s not that I disagree with the core data mesh principles
• It’s the way they are explained: ‘quite academic’
• Implementation guidelines are still missing
  • Zhamak clearly acknowledges this herself: it is still an emerging concept!
• But even then, some vital context is truly missing
• A major issue I have with the book (that we need to counter):
  • I really doubt how much data integration experience the contributors have, given that
    • it states that folks building DWHs are still striving for the ‘single version of the truth’ (if you lived 10 years ago, yes…)
    • there is no mention at all of modern ELM-based data modeling patterns (that are there to help data integration)
Why we might need to take this seriously…
Is it a new DWH pattern?
“The primary purpose of a data warehouse is to transform data from an application state into an integrated corporate state”
Bill Inmon, the father of data warehousing
After all, ‘Data Warehousing’ is a tech-agnostic activity…
[Diagram labels: Subject Oriented, Integrated, Time Variant, Non-Volatile, DWH]
It’s about doing this analytical work with data somehow, somewhere
• Structure information so it can be consumed easily, shaped for diverse types of users, use cases and tools
• Reliable, durable integration / unification of data
• Register the history and the history of changes to data
• Store data you receive once, protected from ungoverned deletion
[Diagram label: DWH]
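To make the last two concerns (registering history, and storing what you receive once without ungoverned deletion) a bit more tangible, here is a minimal Python sketch of an append-only historical staging area. The names `StagedRecord` and `HistoricalStagingArea`, the method names and the record layout are assumptions made for this illustration only; they do not come from the slides or from any specific product.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class StagedRecord:
    """One received version of a source row (hypothetical layout)."""
    business_key: str
    payload: Dict[str, Any]
    loaded_at: datetime


@dataclass
class HistoricalStagingArea:
    """Append-only store: versions are added, never updated or deleted."""
    _records: List[StagedRecord] = field(default_factory=list)

    def load(self, business_key: str, payload: Dict[str, Any],
             loaded_at: Optional[datetime] = None) -> bool:
        """Append a new version only if the payload changed; True if stored."""
        current = self.latest(business_key)
        if current is not None and current.payload == payload:
            return False  # unchanged: identical data is stored only once
        stamp = loaded_at or datetime.now(timezone.utc)
        self._records.append(StagedRecord(business_key, dict(payload), stamp))
        return True

    def history(self, business_key: str) -> List[StagedRecord]:
        """All versions of a key in load order (the time-variant view)."""
        return [r for r in self._records if r.business_key == business_key]

    def latest(self, business_key: str) -> Optional[StagedRecord]:
        """Most recently loaded version of a key, or None if never loaded."""
        versions = self.history(business_key)
        return versions[-1] if versions else None

    def as_of(self, business_key: str, moment: datetime) -> Optional[StagedRecord]:
        """Last version loaded at or before `moment` (assumes loads arrive in time order)."""
        valid = [r for r in self.history(business_key) if r.loaded_at <= moment]
        return valid[-1] if valid else None
```

Because nothing is ever overwritten, a consumer can ask what a record looked like at any earlier moment, which is what ‘time variant’ and ‘non-volatile’ amount to in practice.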
Common DWH patterns overview

Data lake
• What is it? A repository of raw data of any type for analytics purposes
• Architecture solution features: Decentralized tech. On premise or cloud
• Most common development style: Decentralized ‘puddles of lake’
• Data warehousing ‘concerns’:
  • Subject Oriented: No
  • Integrated: No
  • Time-Variant: Yes, in general implemented as a file store
  • Non-Volatile: Yes, in general implemented as a file store

DWH 3.0 / ‘Lakehouse’
• What is it? The re-merger of data lake and ‘classical’ DWH concerns, also known as the ‘Modern DWH’
• Architecture solution features: Centralized tech. Cloud based
• Most common development style: Centralized or decentralized, depending on business complexity
• Data warehousing ‘concerns’:
  • Subject Oriented: Yes, via database transformation rules
  • Integrated: Yes, via database integration rules
  • Time-Variant: Yes, via a database Historical Staging Area
  • Non-Volatile: Yes, via a database Historical Staging Area

Data Mesh
• What is it? Distributed data architecture that pushes down ‘DWH concerns’ to the source / ‘business domain’
• Architecture solution features: Highly decentralized tech. On premise or cloud
• Most common development style: Highly decentralized by definition. Focus on ‘data as a product’
• Data warehousing ‘concerns’:
  • Subject Oriented: Yes, but mainly pushed to the ‘business domains’
  • Integrated: Yes, local within the business domain and centralized via a ‘knowledge graph’-like ‘mapping’
  • Time-Variant: Yes, but pushed to the ‘business domains’
  • Non-Volatile: Yes, but pushed to the ‘business domains’

Data Fabric
• What is it? Distributed data architecture where ‘time variant / non volatile’ concerns are pushed down to the source systems
• Architecture solution features: Centralized tech. On premise or cloud. Sources decentralized
• Most common development style: Centralized or decentralized, depending on business complexity
• Data warehousing ‘concerns’:
  • Subject Oriented: Yes, via centralised virtual transformation rules
  • Integrated: Yes, via centralised virtual integration logic
  • Time-Variant: What the operational system provides, or by creating a Historical Staging Area in an analytical database
  • Non-Volatile: What the operational system provides, or by creating a Historical Staging Area in an analytical database
Data Mesh proposes to separate these ‘DWH concerns’ and push them down, much as operational applications do in a microservices-based architecture
Based on these foundational principles…
1. Domain ownership
   • Analytical data should be owned by either the source system or its main consumers
2. Data as a product
   • Build data artifacts with a true product (management) mindset
3. Self-service data platform
   • Use (shared) infrastructure as a platform (in the cloud?) to build this
4. Federated computational governance
   • Data governance operating model based on federated decision-making and accountability
The proposed changes…
The ‘Data as a Product’ principle
DWH-created artifacts should become ‘data as a product’
For a consumer, a data product should be…
• Feasible
• Usable
• Valuable
Some examples of work you’ll need to do here!
• Feasible
  • Can the product be made in an acceptable time & for acceptable costs?
• Valuable
  • What are the desires of my customers?
  • What is my market?
    • How to do marketing?
  • What is the USP?
  • What price is ‘justified’?
  • Are my customers happy?
• Usable
  • Is the product being used?
  • Is the product easy (enough) to use?
  • Are my customers happy?
Data Mesh Data Product principles
For ‘data / information’ to be seen as a product, it practically needs to be:
• Discoverable
  • An easy, Google-like way to find data sets
• Addressable
  • The product needs to have a permanent unique identifier that stays stable over time
• Understandable
  • The product needs to be accompanied by metadata that describes WHAT something is
• Trustworthy and ‘truthful’
  • The product needs to have a lot of data quality metrics and lineage metadata attached
• Natively accessible
  • Accessible via any interface that suits the consumer, i.e. as an API / via ODBC-SQL / as a stream ‘topic’
• Interoperable and composable
  • The product needs to be accompanied by metadata on HOW it can be combined with other products
• Valuable on its own
  • Usable without the need to first combine it with other data products
• Secure
  • Data security / privacy needs to work on the product without needing ‘something else’
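One way to make these principles checkable is to capture them as metadata attached to every data product. The sketch below is only an illustration of that idea: `DataProductDescriptor`, its field names and `unmet_principles` are hypothetical and not taken from the book or from any catalog tool. ‘Valuable on its own’ is left out on purpose, because it cannot be judged from metadata alone.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DataProductDescriptor:
    """Hypothetical metadata record describing one data product."""
    product_id: str = ""    # Addressable: permanent unique identifier
    description: str = ""   # Understandable: WHAT the product is
    catalog_tags: List[str] = field(default_factory=list)            # Discoverable
    quality_metrics: Dict[str, float] = field(default_factory=dict)  # Trustworthy
    lineage: List[str] = field(default_factory=list)                 # Trustworthy: upstream sources
    interfaces: List[str] = field(default_factory=list)              # Natively accessible
    join_keys: Dict[str, str] = field(default_factory=dict)          # Interoperable: HOW to combine
    access_policies: List[str] = field(default_factory=list)         # Secure


def unmet_principles(product: DataProductDescriptor) -> List[str]:
    """Return the principles for which the descriptor carries no metadata."""
    checks = {
        "Discoverable": bool(product.catalog_tags),
        "Addressable": bool(product.product_id),
        "Understandable": bool(product.description.strip()),
        "Trustworthy and 'truthful'": bool(product.quality_metrics) and bool(product.lineage),
        "Natively accessible": bool(product.interfaces),
        "Interoperable and composable": bool(product.join_keys),
        "Secure": bool(product.access_policies),
    }
    return [name for name, satisfied in checks.items() if not satisfied]
```

A check like this only proves that the metadata exists, not that it is any good; it is meant as a minimum bar, for example before a product is listed in a catalog.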
This is not completely new…
…an extension / repackaging of the existing FAIR data principles
• https://www.go-fair.org/fair-principles/

FAIR principle: corresponding DATA MESH principles
• Findable: Discoverable, Understandable
• Accessible: Addressable, Natively accessible, Secure
• Interoperable: Trustworthy and ‘truthful’, Interoperable and composable, Valuable on its own
• Reusable: Natively accessible, Trustworthy and ‘truthful’, Understandable
A majority agrees this makes sense…
Missing from the book:
When implementing each data product, these concerns need to be addressed…
[Diagram labels: Subject Oriented, Integrated, Time Variant, Non-Volatile, DWH]
‘Pushing down’ DWH concerns to the operational systems will likely be a long journey
In addition, a lot of the tech mentioned in the book to cover some vital aspects of the data mesh does not exist yet
The alternative I see…
• Use the DWH 3.0 / Lakehouse pattern
• Make sure to cover the mentioned Data Mesh principles there
  • I think the key there is to use Data Vault or another ELM-based data modeling style as an enabler
Overall, this would be my starting point when MVP-ing a ‘meshy architecture’
Implementing the ‘DWH concerns’ using Data Vault
• Subject Oriented
  • Create domain-specific and centralized hubs
  • Create domain-specific satellites
• Integrated
  • Domain-specific hubs should (obviously) be integrated within a domain
  • Centralized hubs should be fed from all domains and / or a central MDM source
  • Centralized same-as links should be created to enable integrating across domains
    • Domain-specific satellites can then be shared too
• Time variant & non volatile
  • Before the Subject Oriented / Integrated step, data should be loaded RAW in a Historical Staging Area first
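The relation between domain-specific hubs, a centralized hub and same-as links can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not a Data Vault implementation: the names `HubRow`, `SameAsLink`, `make_hub_row` and `build_same_as_links` are invented here, the hub key is simply a SHA-256 hash of the normalised business key, and the `mapping` argument stands in for whatever a central MDM source would provide.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List


def hub_key(business_key: str) -> str:
    """Stable key derived from the normalised business key."""
    normalised = business_key.strip().upper()
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class HubRow:
    hub: str           # e.g. "sales.customer" (domain hub) or "central.customer"
    business_key: str
    key: str


@dataclass(frozen=True)
class SameAsLink:
    """States that a domain hub row and a central hub row are the same thing."""
    central_key: str
    domain_key: str


def make_hub_row(hub: str, business_key: str) -> HubRow:
    return HubRow(hub, business_key, hub_key(business_key))


def build_same_as_links(central: List[HubRow], domain: List[HubRow],
                        mapping: Dict[str, str]) -> List[SameAsLink]:
    """Link domain rows to central rows; `mapping` is domain key -> central business key."""
    central_by_business_key = {row.business_key: row for row in central}
    links = []
    for row in domain:
        target = mapping.get(row.business_key)
        if target in central_by_business_key:
            links.append(SameAsLink(central_by_business_key[target].key, row.key))
    return links
```

Domain rows without a known central counterpart simply get no link, so a domain can keep working locally while cross-domain integration catches up.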
Hubs & centralization:
It will remain complicated!
Implementing ‘data as a product’ in the DWH 3.0 / Lakehouse pattern
Product principle, followed by my first implementation ‘idea’ / suggestion:

Discoverable
• Data product metadata should be pushed or pulled from all domains towards a data catalog with a good search interface where NO MANUAL CURATION is needed!
Addressable
• If source entity names change, the data product’s entity names should remain stable. That’s also the purpose of a GOOD Data Vault hub
Trustworthy and ‘truthful’
• Data quality tests should be part of each data product; not having them should block its release
• The data catalog mentioned under Discoverable should handle lineage
Natively accessible
• Next to storing data products ‘in a database’ and using ODBC / JDBC for access, create a data API on top of each data product, or make sure it can be used via an API too
• Database systems with ‘low friction’ data sharing capabilities could help here
Interoperable and composable
• Embed metadata from parent / child data products in the data product itself
• Again, a data catalog plays a central role here
Valuable on its own
• ELM-based data modeling patterns (like Data Vault) combined with a datamart modeling style like Kimball / Dimensional are still the way to go
Secure
• Use the native row- and column-level security features of modern cloud-based analytical databases
• Register these policies as metadata
• This requires data product consumers to consume using named accounts only
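The suggestion that missing data quality tests should block a release can be expressed as a small release gate. The following Python sketch is one possible reading of that idea, not an existing tool: `release_data_product`, `ReleaseBlocked` and the two example tests are hypothetical names, and the rows are plain dictionaries to keep the example self-contained.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]
QualityTest = Callable[[List[Row]], bool]


class ReleaseBlocked(Exception):
    """Raised when a data product may not be released."""


def release_data_product(name: str, rows: List[Row],
                         tests: Dict[str, QualityTest]) -> List[str]:
    """Run every quality test; return the names of the passed tests as metadata."""
    if not tests:
        # Having no tests at all blocks the release, just like a failing test.
        raise ReleaseBlocked(f"{name}: no data quality tests defined")
    failed = [test_name for test_name, test in tests.items() if not test(rows)]
    if failed:
        raise ReleaseBlocked(f"{name}: failed tests: {', '.join(failed)}")
    return list(tests)


def unique_order_ids(rows: List[Row]) -> bool:
    """Example test: every order_id occurs only once."""
    ids = [row["order_id"] for row in rows]
    return len(ids) == len(set(ids))


def no_negative_amounts(rows: List[Row]) -> bool:
    """Example test: amounts are zero or positive."""
    return all(float(row["amount"]) >= 0 for row in rows)
```

The returned test names could be registered in the data catalog alongside the product, so consumers can see which checks a release has passed.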
5 tips on what to do NOW
Still to be added
LAST: Where can I find more info?
• All relevant data mesh info is collected at https://datameshlearning.com/user-stories/
• Scott Hirleman drives this initiative
• He also hosts the accompanying Data Mesh Radio podcast
  • Listen to the episode with Shane Gibson (Knowledge Gap presenter) here: https://daappod.com/data-mesh-radio/repeatable-patterns-and-data-mesh-shane-gibson/
Check out the user journeys here:
• https://datameshlearning.com/user-stories/
Thank you!
Contact details:
www.linkedin.com/in/rogierwerschkull/
rogier@rogerdata.nl
@rwerschkull

Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data Storytelling
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
 
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiLow Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
 
Ukraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSUkraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICS
 
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
 
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptx
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
 
Predicting Employee Churn: A Data-Driven Approach Project Presentation
Predicting Employee Churn: A Data-Driven Approach Project PresentationPredicting Employee Churn: A Data-Driven Approach Project Presentation
Predicting Employee Churn: A Data-Driven Approach Project Presentation
 
Call Girls In Mahipalpur O9654467111 Escorts Service
Call Girls In Mahipalpur O9654467111  Escorts ServiceCall Girls In Mahipalpur O9654467111  Escorts Service
Call Girls In Mahipalpur O9654467111 Escorts Service
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdf
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
 

Data product thinking-Will the Data Mesh save us from analytics history

  • 1. Data Product Thinking: Will ‘the Data Mesh’ save us from analytics misery? Rogier Werschkull, RogerData. Version 1.0, date 1-6-2022.
  • 2. Betteridge's law of headlines: any headline that ends in a question mark can be answered by the word NO.
  • 3. No, it won’t ‘save us’… There is NO quick fix to become ‘Data Driven’, or ‘information supported’, or whatever you want to call what we are doing here. But there are Data Mesh aspects that make sense, if you ignore most of the (tech) vendor washing. As in…
  • 4. About me: Rogier Werschkull. 21 years ‘in the field’ of data, data warehousing and business intelligence. Data architecture advice, data modeling, data engineering, data-analytics product owner. Blogger, trainer, conference speaker. Contact details: www.linkedin.com/in/rogierwerschkull/, rogier@rogerdata.nl, @rwerschkull
  • 7. “In an average organization the car park or art collection is better managed than data.” Gartner analyst Frank Buytendijk
  • 8. “Every single company I've worked at and talked to has the same problem without a single exception so far: poor data quality... Either there's incomplete data, missing data, duplicative data.” Ruslan Belkin, former VP of Engineering @ Twitter and Salesforce
  • 9. What is data mesh? (1) Data mesh tries (again?) to combat the ‘analytics misery’ we tend to create. HOW?
      - By decentralizing most ‘data warehousing concerns’ to individual business domains, as in: (1) either in the operational source system, (2) or in a decentralized DWH team that sits ‘closer’ to the source systems.
      - By ‘calling out’ the organizational / cultural change required to accomplish this.
  • 10. What is data mesh? (2) Data mesh is described in the official Data Mesh book by Zhamak Dehghani (Thoughtworks) as follows: ‘Data mesh is a decentralized sociotechnical approach to share, access, and manage analytical data in complex and large-scale environments—within or across organizations’
  • 11. For WHO and WHEN?
      - Only for large organizations, as in: having a lot of source systems, a lot of employees, and clear / separated business domains.
      - Only for large organizations that are not afraid to experiment (that can live with the current absence of viable implementation patterns), that have a mature (centralized?) data / analytics department, and that preferably can influence the design / development of the operational applications they use.
  • 12. Data mesh is foremost about people and processes, NOT about technology!
      - About changing the data-analytics ‘culture’.
      - About the process of collaborating on creating decentralized, valuable data products within a ‘business domain’ and sharing data between these domains.
      - (Diagram labels: People, Process, Data, Technology)
  • 13. Tech bullshit example: starburst.io …‘sort of’ resolved for now! In January 2022: In May 2022:
  • 14. Quite often the book’s content does not help in this aspect…
      - It’s not that I don’t agree with the core data mesh principles.
      - It’s the way they are explained: ‘quite academic’.
      - Implementation guidelines are still missing, which Zhamak also clearly mentions, as in that it is still an emerging concept!
      - But then still, some vital context is truly missing.
      - A major issue I have with the book (that we need to counter): I really have doubts about the amount of data integration experience the contributors have, based on the fact that:
          - it states that folks building DWHs are still striving for the ‘single version of the truth’ (if you lived 10 years ago, yes…)
          - there is no mention at all of modern ELM-based data modeling patterns (that are there to help data integration)
  • 16. Is it a new DWH pattern? “The primary purpose of a data warehouse is to transform data from an application state into an integrated corporate state” Bill Inmon, the father of datawarehousing
  • 17. After all, ‘Data Warehousing’ is a tech agnostic activity… DWH: Subject Oriented, Integrated, Time Variant, Non-Volatile
  • 18. It’s about doing this analytical work with data somehow, somewhere (DWH):
      - Structure information so it can be consumed easily. Shaped for a diverse type of users, use cases and tools.
      - Reliable, durable integration / unification of data.
      - Register the history and history of changes to data.
      - Store data you receive once, protected from ungoverned deletion.
  • 19. Common DWH patterns overview (per data-analytics architecture: what it is, architecture solution features, most common development style, and how each data warehousing ‘concern’ is covered)
      Data lake
      - What is it? A repository of raw data of any type for analytics purposes
      - Architecture solution features: Decentralized tech. On premise or cloud
      - Most common development style: Decentralized ‘puddles of lake’
      - Subject Oriented: No
      - Integrated: No
      - Time-Variant: Yes, in general implemented as a file store
      - Non-Volatile: Yes, in general implemented as a file store
      DWH 3.0 / ‘Lakehouse’
      - What is it? The re-merger of data lake and ‘classical’ DWH concerns, also known as the ‘Modern DWH’
      - Architecture solution features: Centralized tech. Cloud based
      - Most common development style: Centralized or decentralized depending on business complexity
      - Subject Oriented: Yes, via database transformation rules
      - Integrated: Yes, via database integration rules
      - Time-Variant: Yes, via a database Historical Staging Area
      - Non-Volatile: Yes, via a database Historical Staging Area
      Data Mesh
      - What is it? Distributed data architecture that pushes down ‘DWH concerns’ to the source / ‘business domain’
      - Architecture solution features: Highly decentralized tech. On premise or cloud
      - Most common development style: Highly decentralized by definition. Focus on ‘data as a product’
      - Subject Oriented: Yes, but mainly pushed to the ‘business domains’
      - Integrated: Yes, local within the business domain and centralized via a ‘knowledge graph’ like ‘mapping’
      - Time-Variant: Yes, but pushed to the ‘business domains’
      - Non-Volatile: Yes, but pushed to the ‘business domains’
      Data Fabric
      - What is it? Distributed data architecture where ‘time variant / non volatile’ concerns are pushed down to the source systems
      - Architecture solution features: Centralized tech. On premise or cloud. Sources decentralized
      - Most common development style: Centralized or decentralized depending on business complexity
      - Subject Oriented: Yes, via centralised virtual transformation rules
      - Integrated: Yes, via centralised virtual integration logic
      - Time-Variant: What the operational system provides or by creating a Historical Staging Area in an analytical database
      - Non-Volatile: What the operational system provides or by creating a Historical Staging Area in an analytical database
  • 20. Data Mesh proposes to separate and push down these ‘DWH concerns’, like operational applications do in a microservices-based architecture
  • 21. Based on these foundational principles…
      1. Principle of domain ownership: analytical data should be owned by either the source system or its main consumers.
      2. Data as a product: build data artifacts with a true product (management) mindset.
      3. Self-service data platform: use (shared) infrastructure as a platform (in the cloud?) to build this.
      4. Federated computational governance: a data governance operating model based on federated decision-making and accountability.
  • 23. The ‘Data as a Product’ principle: DWH-created artifacts should become ‘data as a product’. For a consumer, a data product should be: Feasible, Usable, Valuable.
  • 24. Some examples of work you’ll need to do here!
      - Feasible: Can the product be made in an acceptable time and for acceptable costs?
      - Valuable: What are the desires of my customers? What is my market (and how to do marketing)? What is the USP? What ‘price is justified’? Are my customers happy?
      - Usable: Is the product being used? Is the product easy (enough) to use? Are my customers happy?
  • 25. Data Mesh Data Product principles. To see ‘data / information as a product’, it practically needs to be:
      - Discoverable: an easy, Google-like way to find data sets.
      - Addressable: the product needs to have a permanent unique identifier that stays stable over time.
      - Understandable: the product needs to be accompanied by metadata that describes WHAT something is.
      - Trustworthy and ‘truthful’: the product needs to have a lot of data quality metrics and lineage metadata attached.
      - Natively accessible: accessible via any interface that suits the consumer, i.e. as API / via ODBC-SQL / stream ‘topic’.
      - Interoperable and composable: the product needs to be accompanied by metadata on HOW it can be combined with other products.
      - Valuable on its own: usable without the need to first combine it with other data products.
      - Secure: data security / privacy needs to work on the product without needing ‘something else’.
  • 26. This is not completely new… it is an extension / repackaging of the existing FAIR data principles (https://www.go-fair.org/fair-principles/). FAIR principle and the matching Data Mesh principles:
      - Findable: Discoverable, Understandable
      - Accessible: Addressable, Natively accessible, Secure
      - Interoperable: Trustworthy and ‘truthful’, Interoperable and composable, Valuable on its own
      - Reusable: Natively accessible, Trustworthy and ‘truthful’, Understandable
  • 28. Missing from the book: when implementing each data product, these concerns need to be addressed: Subject Oriented, Integrated, Time Variant, Non-Volatile (DWH).
  • 29. Overall, this would be my starting point when MVP-ing a ‘meshy architecture’. ‘Pushing down’ DWH concerns to the operational systems will likely be a long journey. In addition, a lot of the tech mentioned in the book to cover some vital aspects of the data mesh does not exist yet. The alternative I see:
      - Use the DWH 3.0 / Lakehouse pattern.
      - Make sure to cover the mentioned Data Mesh principles there. I think the key there is to use Data Vault or another ELM-based data modeling style as an enabler.
  • 30. Implementing the ‘DWH concerns’ using Data Vault
      - Subject Oriented: create domain-specific and centralized hubs; create domain-specific satellites.
      - Integrated: domain-specific hubs should (obviously) be integrated within a domain. Centralized hubs should be fed from all domains and / or a central MDM source. Centralized same-as links should be created to enable integrating across domains; domain-specific satellites can then be shared too.
      - Time variant and non-volatile: before the Subject Oriented / Integrated step, data should be loaded RAW in a Historical Staging Area first.
  • 32. Implementing ‘data as a product’ in the DWH 3.0 / Lakehouse pattern. Per product principle, my first implementation ‘idea’ / suggestion:
      - Discoverable: data product metadata should be pushed or pulled from all domains towards a data catalog with a good search interface where NO MANUAL CURATION is needed!
      - Addressable: if source entity names change, the entity names should remain stable. That’s also the purpose of a GOOD Data Vault hub.
      - Trustworthy and ‘truthful’: data quality tests should be part of each data product; not having them should block releasing it. The data catalog mentioned under Discoverable should handle lineage.
      - Natively accessible: next to storing data products ‘in a database’ and using ODBC / JDBC to access them, create a data API on top of each data product or make sure it can be used via an API too. Database systems with ‘low friction’ data sharing capabilities could help here.
      - Interoperable and composable: embed metadata from parent / child data products in the data product itself. Again, a data catalog plays a central role here.
      - Valuable on its own: ELM-based data modeling patterns (like Data Vault) and a datamart modeling style like Kimball / Dimensional are still the way to go.
      - Secure: use the native row and column level security features of modern cloud-based analytical databases. Register these policies as metadata. This requires data product consumers to consume using named accounts only.
  • 34. LAST: Where can I find more info?
      - All relevant data mesh info is collected at https://datameshlearning.com/user-stories/
      - Scott Hirleman drives this initiative.
      - He also hosts the accompanying Data Mesh Radio podcast. Listen to Shane Gibson’s (Knowledge Gap presenter) episode here: https://daappod.com/data-mesh-radio/repeatable-patterns-and-data-mesh-shane-gibson/
      - Check out the user journeys here: https://datameshlearning.com/user-stories/
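
A minimal sketch of three of the slide 32 suggestions (Addressable, Trustworthy and ‘truthful’, Discoverable), to make them a bit more concrete. Everything named here (`hub_key`, `DataProduct`, `Catalog`, `release`) is hypothetical and invented for illustration; it is not from the deck or from any product. The in-memory catalog only stands in for a real data catalog, and hashing a trimmed, upper-cased business key with SHA-256 is an assumed convention, not something the slides prescribe.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Row = Dict[str, object]


def hub_key(business_key: str) -> str:
    """Addressable: derive a stable key from the business key only,
    so renamed source tables or columns do not change the identifier."""
    normalized = business_key.strip().upper()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


@dataclass
class DataProduct:
    name: str          # permanent product name
    domain: str        # owning business domain
    description: str   # metadata describing WHAT the product is
    rows: List[Row]
    quality_tests: List[Callable[[List[Row]], bool]] = field(default_factory=list)


class Catalog:
    """Hypothetical in-memory stand-in for a data catalog."""

    def __init__(self) -> None:
        self.entries: Dict[str, Dict[str, str]] = {}

    def register(self, product: DataProduct) -> None:
        # Metadata is pushed by the release step, so no manual curation.
        self.entries[product.name] = {
            "domain": product.domain,
            "description": product.description,
        }

    def search(self, term: str) -> List[str]:
        term = term.lower()
        return sorted(
            name
            for name, meta in self.entries.items()
            if term in name.lower() or term in meta["description"].lower()
        )


def release(product: DataProduct, catalog: Catalog) -> bool:
    """Trustworthy: block the release when tests are missing or failing."""
    if not product.quality_tests:
        return False  # not having tests blocks the release
    if not all(test(product.rows) for test in product.quality_tests):
        return False  # a failing test blocks the release
    catalog.register(product)
    return True
```

In this sketch a product without quality tests is never registered, so it also never becomes discoverable; whether that coupling is desirable in a real platform is a design choice, not a rule from the slides.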

Editor's Notes

  1. Our industry is still immature: https://www.linkedin.com/pulse/note-piergiuseppe-bill-inmon/ Summary: -One of the signs of immaturity of our industry is the practice of depending on vendors to lead the industry. -Because we are a young and immature industry, there are new advancements that occur every day. -Because of the newness of our industry there are very few principles. There are new toys. There are new gadgets. -When a new product or technology comes into the marketplace, the vendor thinks that it is their duty to remove everything that has come before. -There is a secret to combatting the vendors who are telling you that you are dated and old. The secret is to deliver business value to your end user.
  2. Our industry is still immature: https://www.linkedin.com/pulse/note-piergiuseppe-bill-inmon/ Summary: -One of the signs of immaturity of our industry is the practice of depending on vendors to lead the industry. -Because we are a young and immature industry, there are new advancements that occur every day. -Because of the newness of our industry there are very few principles. There are new toys. There are new gadgets. -When a new product or technology comes into the marketplace, the vendor thinks that it is their duty to remove everything that has come before. -There is a secret to combatting the vendors who are telling you that you are dated and old. The secret is to deliver business value to your end user.
  3. What does this number mean? Yes, it is the failure rate of BI, analytics and classical datawarehousing initiatives, but also of big data, data lake, IoT or AI projects. It is the amount of ML work that never sees the light of day in your production environment. And I am not making this up: it is being said again and again by the likes of Gartner, Forrester, CIO.com and Cisco. There are 2 reasons.
  4. Our industry is still immature: https://www.linkedin.com/pulse/note-piergiuseppe-bill-inmon/ Summary: -One of the signs of immaturity of our industry is the practice of depending on vendors to lead the industry. -Because we are a young and immature industry, there are new advancements that occur every day. -Because of the newness of our industry there are very few principles. There are new toys. There are new gadgets. -When a new product or technology comes into the marketplace, the vendor thinks that it is their duty to remove everything that has come before. -There is a secret to combatting the vendors who are telling you that you are dated and old. The secret is to deliver business value to your end user.
  5. My take on the primary reason WHY this lasting failure is still happening: we are still not addressing the data quality problem structurally. Quality is still an afterthought. It is not only Bill Inmon saying this; this is a guy working at Salesforce, a modern cloud-based SaaS company. In my opinion, solving / addressing / governing these data quality issues is implicitly the core of what datawarehousing methodology should address.
  6. My take on the primary reason WHY this lasting failure is still happening: we are still not addressing the data quality problem structurally. Quality is still an afterthought. It is not only Bill Inmon saying this; this is a guy working at Salesforce, a modern cloud-based SaaS company. In my opinion, solving / addressing / governing these data quality issues is implicitly the core of what datawarehousing methodology should address.
  7. IMHO: it is ‘just’ a NEW form of Decentralized Data Warehousing
  8. Is data mesh an architecture? Is it a list of principles? Is it an operating model? After all, we rely on the classification of patterns as a major cognitive function to understand the structure of our world. Hence, I have decided to classify data mesh as a sociotechnical paradigm: an approach that recognizes the interactions between people and the technical architecture and solutions in complex organizations
  9. If people had read the data mesh book, they would know that even Zhamak herself writes that it is still an emerging concept, and that a lot of the tech needed to build what she describes conceptually DOES NOT EVEN EXIST (YET). As such, no one can actually claim to 'sell' a data mesh or claim to have built a 'full-fledged' one. The only claims that could be true are that people are 'on a journey’ towards creating a data mesh, or that vendors sell a tech component that might be applied when designing / building a data mesh.
  10. Is data mesh an architecture? Is it a list of principles? Is it an operating model? After all, we rely on the classification of patterns as a major cognitive function to understand the structure of our world. Hence, I have decided to classify data mesh as a sociotechnical paradigm: an approach that recognizes the interactions between people and the technical architecture and solutions in complex organizations
  11. That is quite a lot of people that will need to be protected, so read up!
  12. Data mesh calls for a fundamental shift in the assumptions, architecture, technical solutions, and social structure of our organizations, in how we manage, use, and own analytical data: Organizationally, it shifts from centralized ownership of data by specialists who run the data platform technologies to a decentralized data ownership model pushing ownership and accountability of the data back to the business domains where data is produced from or is used. Architecturally, it shifts from collecting data in monolithic warehouses and lakes to connecting data through a distributed mesh of data products accessed through standardized protocols. Technologically, it shifts from technology solutions that treat data as a byproduct of running pipeline code to solutions that treat data and code that maintains it as one lively autonomous unit. Operationally, it shifts data governance from a top-down centralized operational model with human interventions to a federated model with computational policies embedded in the nodes on the mesh. Principally, it shifts our value system from data as an asset to be collected to data as a product to serve and delight the data users (internal and external to the organization). Infrastructurally, it shifts from two sets of fragmented and point-to-point integrated infrastructure services—one for data and analytics and the other for applications and operational systems to a well-integrated set of infrastructure for both operational and data systems.
  13. Datawarehousing is an activity, supported by a methodology. It has nothing to do with technology directly; it’s about addressing these data-analytical concerns.
  14. These four words say NOTHING about technology. Zip. Nada. They describe what functionally needs to happen. In traditional DWH modeling approaches you still do this work ‘in one go’. But to be fair, that really is a problem: what about modelling time vs added value, reverse engineering, starting with a data first / data centric architecture? Agility
  15. Sources:
      https://www.gartner.com/smarterwithgartner/gartner-top-10-data-and-analytics-trends-for-2021/
      https://www.slideshare.net/ParisDataEngineers/delta-lake-oss-create-reliable-and-performant-data-lake-by-quentin-ambard
      Data Lakehouse:
      https://www.snowflake.com/guides/what-data-lakehouse
      https://databricks.com/blog/2020/01/30/what-is-a-data-lakehouse.html
      https://medium.com/snowflake/selling-the-data-lakehouse-a9f25f67c906
      Delta lake:
      https://docs.databricks.com/delta/index.html
      Data Mesh:
      https://francois-nguyen.blog/2021/03/07/towards-a-data-mesh-part-1-data-domains-and-teams-topologies/
      https://martinfowler.com/articles/data-monolith-to-mesh.html
      https://dpgmedia-engineering.medium.com/ddd-data-area-at-dpg-media-f0130e4d9766
  16. Data mesh calls for a fundamental shift in the assumptions, architecture, technical solutions, and social structure of our organizations, in how we manage, use, and own analytical data:
  17. Zalando: https://www.youtube.com/watch?v=UrM8yCjmzzw&ab_channel=Databricks
  18. If something is feasible, then you can do it without too much difficulty. When someone asks "Is it feasible?" the person is asking if you'll be able to get something done. =Capable of being done with means at hand and circumstances as they are. synonyms: executable, practicable, viable, workable
  19. All this is work that (NOW) often does not happen and makes sense to think about, determine and measure!
  20. The teams have the responsibility to provide data that is easily discoverable, understandable, accessible, and usable, known as data products. There are established roles such as data product owners in each cross-functional domain team that are responsible for data and sharing it successfully
  21. This is missing from the book!
  22. The teams have the responsibility to provide data that is easily discoverable, understandable, accessible, and usable, known as data products. There are established roles such as data product owners in each cross-functional domain team that are responsible for data and sharing it successfully
  23. The teams have the responsibility to provide data that is easily discoverable, understandable, accessible, and usable, known as data products. There are established roles such as data product owners in each cross-functional domain team that are responsible for data and sharing it successfully
  24. The teams have the responsibility to provide data that is easily discoverable, understandable, accessible, and usable, known as data products. There are established roles such as data product owners in each cross-functional domain team that are responsible for data and sharing it successfully
  25. build own to companies: Building a decentralized DWH in a database Using centralized configurable cloud native infra (Snowflake, Bigquery, Databricks)
  26. And therefore I don’t believe in cloud DWH as the answer that makes datawarehousing successful suddenly Who said this?