© 2023 Thoughtworks
Data Mesh 101
Chris Ford & Pablo Porto
Chris Ford
I am Head of Technology for Thoughtworks Spain. I help clients with architecture, agile
development and organisational effectiveness.
I was a technical reviewer for Zhamak Dehghani’s 2022 book 'Data Mesh'.
Pablo Porto
I am a Lead Developer for Thoughtworks Spain's Data and AI Service Line. I help
clients build distributed systems, platforms and data architectures.
I am currently working as part of one of the most mature Data Mesh implementations in
the healthcare industry.
Part zero: Introduction
Structure of this talk
Promise: What is the value proposition of Data Mesh?
Principles: What are the core elements of Data Mesh?
Practicalities: Why is it important and how can you get started?
“All models are wrong but some are useful.”
George Box, Statistician
(domain-driven design + microservices)
× data
= Data Mesh
sort of...
Part one: Promise
Architectural paradigms
What is the cost/benefit?
In combination, a paradigm’s elements define a kind of promise:
If we fulfill these conditions
We can obtain these benefits
So long as we don’t mind these costs
The promise only pays off when the conditions are satisfied, the benefits are valued
and the costs are acceptable.
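The three-part promise can be read as a simple decision rule. The sketch below is only an illustration: the `Promise` class, its field names and the example context are hypothetical and do not come from the talk.

```python
from dataclasses import dataclass


@dataclass
class Promise:
    """A paradigm's promise, evaluated in one specific context (hypothetical model)."""
    paradigm: str
    conditions_satisfied: bool  # "If we fulfill these conditions"
    benefits_valued: bool       # "We can obtain these benefits"
    costs_acceptable: bool      # "So long as we don't mind these costs"

    def pays_off(self) -> bool:
        # All three parts must hold in this context for the promise to pay off.
        return (
            self.conditions_satisfied
            and self.benefits_valued
            and self.costs_acceptable
        )


# Hypothetical context: the benefits are wanted, but the costs are not acceptable.
in_this_context = Promise(
    "Data Mesh",
    conditions_satisfied=True,
    benefits_valued=True,
    costs_acceptable=False,
)
print(in_this_context.pays_off())  # False
```

The point of the model is that the answer belongs to the context, not to the paradigm: the same paradigm can pay off in one organisation and not in another.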
It’s usually more valuable to consider
“Does this paradigm apply in this context?”
than to try and decide
“Is this paradigm good?”
Microservices promise
If we break things into small pieces
We can change them independently
So long as we don’t mind the added
integration complexity
Domain-driven design promise
If we align our architecture with our
business domain
We can represent our business
accurately, even as it changes
So long as we don’t mind adapting the
technology to the business domain
Data warehouse promise
If we model our data up front in a central
schema
We can run analytical queries across our
business
So long as we don’t mind the effort of
aggregating and reconciling it
Diagram label: ETL
Data lake promise
If we collect our raw data into a central
location
We can post-hoc run queries about
anything we like
So long as we don’t mind dependencies
being implicit
Diagram labels: ETL, ETL
Data mesh promise
If we give responsibility for data to the
people who produce it
We can rapidly incorporate new data
sources and use cases
So long as we don’t mind distributing
skills and investing in self-service
infrastructure
Part two: Principles
Data mesh: Principles
● Domain ownership
● Self-serve data platform
● Data as a product
● Federated computational governance
Domain ownership
What is it?
You give the originators of data authority over it and the responsibility
to make it easily and usefully available.
Where’s the value?
This reduces the distance between producer and consumer (in terms of
handoffs), enabling quicker, easier and richer consumption.
Data as a product
What is it?
You make your data available as products that are designed around the
needs of their consumers.
Where’s the value?
By framing things in terms of product, you put the emphasis on use and
give clear responsibility to the owners to fix anything that interferes
with use (and credit for working on anything that promotes use).
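One way to make "designed around the needs of consumers" concrete is a descriptor that every data product publishes. This is a minimal sketch under assumptions: the `DataProduct` class, its fields and the example product are all hypothetical, not a format prescribed by Data Mesh.

```python
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    """Hypothetical descriptor for a data product; all field names are illustrative."""
    name: str
    owner_team: str          # who fixes anything that interferes with use
    description: str         # what consumers can expect to find
    schema: dict[str, str]   # column name -> type, the published contract
    freshness_hours: int     # how stale the data is allowed to get
    known_consumers: list[str] = field(default_factory=list)

    def usability_gaps(self) -> list[str]:
        """List the things a consumer would trip over before using this product."""
        gaps = []
        if not self.owner_team:
            gaps.append("no owner to contact")
        if not self.description:
            gaps.append("no description")
        if not self.schema:
            gaps.append("no published schema")
        return gaps


orders = DataProduct(
    name="orders_daily",
    owner_team="checkout",
    description="",
    schema={"order_id": "string", "total_eur": "decimal"},
    freshness_hours=24,
)
print(orders.usability_gaps())  # ['no description']
```

The check is deliberately written from the consumer's point of view: a gap is anything that would stop someone from using the product, and the owning team is the one expected to close it.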
Self-serve data platform
What is it?
You have the ability to provision infrastructure for the creation and
consumption of data products without a human in the loop.
Where’s the value?
If you want to decrease lead time and enable on-demand changes, you
need self-service.
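"Without a human in the loop" can be pictured as a function that turns a declarative request straight into resources. The sketch below is illustrative only: `provision`, the request keys and the resource naming scheme are assumptions, and a real platform would call its infrastructure tooling instead of returning strings.

```python
def provision(request: dict) -> dict:
    """Turn a declarative request into resource names, with no approval step.

    Hypothetical sketch: the required keys and naming scheme are made up.
    """
    required = {"domain", "product", "storage"}
    missing = required - set(request)
    if missing:
        # Reject automatically and explain why, rather than queueing a ticket.
        raise ValueError(f"request is missing: {sorted(missing)}")
    prefix = f"{request['domain']}-{request['product']}"
    return {
        "storage": f"{prefix}-{request['storage']}",
        "pipeline": f"{prefix}-pipeline",
        "catalog_entry": f"{prefix}-catalog",
    }


resources = provision(
    {"domain": "logistics", "product": "deliveries", "storage": "table"}
)
print(resources["storage"])  # logistics-deliveries-table
```

Because both success and rejection are automatic, lead time depends on the requesting team and not on another team's queue.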
Federated computational governance
What is it?
You have general guidance that makes the requirements of good
citizenship clear to everyone in the mesh, leaning on automation
whenever possible.
Where’s the value?
By managing with constraints, rather than inspecting individual items,
we empower teams. By using automation, we scale, reduce handoffs
and encourage interoperability.
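Managing with constraints can be sketched as a small set of global policies that run automatically against every data product. The example is an assumption-laden illustration: the policy functions, the descriptor keys (`owner`, `columns`, `pii`) and `GLOBAL_POLICIES` are hypothetical names, not part of any real governance tool.

```python
from typing import Callable, Optional

# A policy inspects a data product descriptor and returns a violation message,
# or None when the product complies. Descriptor keys are hypothetical.
Policy = Callable[[dict], Optional[str]]


def must_have_owner(product: dict) -> Optional[str]:
    return None if product.get("owner") else "no owning domain declared"


def pii_must_be_classified(product: dict) -> Optional[str]:
    unclassified = [
        column
        for column, meta in product.get("columns", {}).items()
        if "pii" not in meta
    ]
    if unclassified:
        return f"columns without a PII classification: {unclassified}"
    return None


# Agreed once by the federation, applied to every product by automation.
GLOBAL_POLICIES: list[Policy] = [must_have_owner, pii_must_be_classified]


def check(product: dict) -> list[str]:
    """Run every global policy; an empty list means the product may be published."""
    return [msg for policy in GLOBAL_POLICIES if (msg := policy(product)) is not None]


product = {
    "owner": "logistics",
    "columns": {"courier_id": {"pii": True}, "city": {}},
}
print(check(product))  # ["columns without a PII classification: ['city']"]
```

Nobody reviews this product by hand: the federation agrees the policies, and each team gets the same automated feedback.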
Data mesh: Principles
Diagram: the four principles again, annotated with the labels "Team" and "Organisation"
Part three: Practicalities
Industry context
Why is now the time to care about Data Mesh?
Rapidly developing new data use cases is increasingly important to businesses.
The upside of Data Mesh is going up.
Data skills and infrastructure are increasingly accessible as technology advances.
The downside of Data Mesh is going down.
Glovo is a delivery service facing strong competition in a market where
innovation and operational efficiency are key to success.
With the pandemic, consumer behaviours and expectations changed.
Understanding customers’ needs and being able to react to them quickly
are crucial to increasing customer loyalty and satisfaction.
A clear vision for data in the organisation
An execution plan on how to deliver on the vision
“A data platform that supports data products and business insights”
Lesson learned #1
Ensure that every system that produces or transforms important data has a clear
owner who is taking care of it.
Lesson learned #2
Introduce clear rules about what aspects of data quality to measure and
communicate.
Roche is one of the biggest life sciences and healthcare companies
in the world, with over 100k employees and operations in multiple
markets.
Roche wanted to unlock its potential in the market by leveraging the rich and
abundant data it possesses.
They chose Data Mesh as the approach and Thoughtworks as a
partner to achieve their vision.
Access to data is hard due to technical and organisational constraints
Data interoperability becomes hard at scale with multiple data platforms
Focus on incremental value
Lesson learned #1
Start your journey with your existing organisational boundaries.
Lesson learned #2
Show value frequently, both to business stakeholders and developers using your
platform.
How to get started
Use cases: What are good candidate use cases with real and achievable value?
Data products: What combination of data products is needed to serve these use cases?
Self-serve data platform: What is the thinnest viable platform that is practical to
support these data products?
Governance: What is the minimum viable cooperation we need to work successfully
together?
Questions?
(We are hiring Senior and Lead Data Engineers!)
(We are remote-friendly and ordinary friendly too)
Resources
References
● "How to Move Beyond a Monolithic Data Lake to a Distributed Data Mesh",
the original article that introduced the concept, by Zhamak Dehghani
● "Data Mesh Principles and Logical Architecture", follow-up article
by Zhamak Dehghani
● "Data Mesh in practice: Getting off to the right start", article
series about Roche’s Data Mesh journey, by Ammara Gafoor,
Ian Murdoch and Kiran Prakash
● "Data mesh: it's not just about tech, it's about ownership and
communication", article series about Glovo’s Data Mesh
journey, by Jorge Agudo, Narek Verdian, Óscar Torres
Fernández, Pablo Giner, Diana Pinto and Javier García
● "Data Mesh Accelerate", workshop description, by Steve Upton
and Paulo Caroli
Where to go next
● "Data Mesh", book by Zhamak Dehghani
Thank you!
Chris Ford
Head of Technology, Thoughtworks Spain
linkedin.com/in/ctford
twitter.com/ctford
chris.ford@thoughtworks.com
Pablo Porto
Lead Developer, Thoughtworks Spain
linkedin.com/in/pabloportoveloso
twitter.com/portovep
pablo.porto@thoughtworks.com
Threat to applicability 1: no scale
“We’ve only got three teams”
Data Mesh gives responsibility for data to its originators.
This is great for reducing the distance between data producer and data consumer.
BUT...
If your organisation is so small that the distance between data producer and data consumer
is naturally short, there’s not much point investing in reducing it.
Threat to applicability 2: no value
“Data’s not key to our business right now”
Data Mesh makes new data use cases quicker and easier.
This is great for experimentation and for getting new data use cases to market quickly.
BUT...
If data use cases are not of value to your business, there’s not much point investing in
enabling them.
Threat to applicability 3: no flow
“We’re not into flow and autonomy”
Data Mesh enables autonomous change within domains.
This is great for achieving fast flow, though it requires investment to pull it off.
BUT...
If your culture or context is not prepared to take advantage of team autonomy, there’s
not much point investing in enabling it.
Myth 1: infra explosion
“The infrastructure will be too expensive”
● Data Mesh emphasises that domains should own their own data.
● Some people interpret that as meaning that they need to provision independent
infrastructure for each domain, which could lead to a cost blowout.
● However, Data Mesh only requires that infrastructure be logically separate and
self-service, so multi-tenant infrastructure used by multiple domains is fine.
Data Mesh is opinionated about how you organise infrastructure, not how you
provision it.
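The difference between logical and physical separation can be sketched with a single shared platform that gives each domain its own namespace. This is a toy model under stated assumptions: `SharedPlatform` and its methods are hypothetical, and an in-memory dictionary stands in for real multi-tenant infrastructure.

```python
class SharedPlatform:
    """One physical platform (a single object here) shared by many domains."""

    def __init__(self) -> None:
        # domain -> dataset name -> rows; each domain gets its own namespace.
        self._namespaces: dict[str, dict[str, list[dict]]] = {}

    def register_domain(self, domain: str) -> None:
        self._namespaces.setdefault(domain, {})

    def write(
        self, acting_domain: str, owner_domain: str, dataset: str, rows: list[dict]
    ) -> None:
        # Logical separation: only the owning domain may change its own data.
        if acting_domain != owner_domain:
            raise PermissionError(
                f"{acting_domain} cannot write to {owner_domain}'s namespace"
            )
        self._namespaces[owner_domain][dataset] = rows

    def read(self, owner_domain: str, dataset: str) -> list[dict]:
        # Reading is open in this sketch; a real platform would apply access policy.
        return self._namespaces[owner_domain][dataset]


platform = SharedPlatform()
platform.register_domain("orders")
platform.register_domain("couriers")
platform.write("orders", "orders", "daily", [{"order_id": "a1"}])
print(platform.read("orders", "daily"))  # [{'order_id': 'a1'}]
```

Both domains run on the same infrastructure, so cost is shared, while ownership stays with each domain.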
Myth 2: data castle
“Data quality will suffer”
● Many organisations have data quality assurance processes where experts own the
data lake or data warehouse and inspect changes to datasets.
● In Data Mesh, governance happens via policy, not by inspection of individual
additions, alterations or consumptions of datasets / data products.
● This works by reducing distance between data producer and data consumer and
giving experts greater leverage by governing the process.
Data Mesh has an alternative way of ensuring data quality, delivering potentially
richer and deeper quality to consumers.
Myth 3: over-enthusiasm
“Data mesh will solve everything”
● Operational systems manage transactional data that changes in real time.
● Analytical systems manage a view of the facts of the business over time.
● Data Mesh is an architectural paradigm specifically aimed at analytical problems.
● Organisations that will benefit from Data Mesh likely have other problems in their
operational systems, but those problems have other solutions.
Data Mesh aims to bring operational and analytical systems into harmony, not to
conflate them.
