Data Management at Scale Piethein Strengholt

Read Anytime Anywhere Easy Ebook Downloads at ebookmeta.com
Data Management at Scale Piethein Strengholt
https://ebookmeta.com/product/data-management-at-scale-
piethein-strengholt/
OR CLICK HERE
DOWLOAD EBOOK
Visit and Get More Ebook Downloads Instantly at https://ebookmeta.com

Recommended digital products (PDF, EPUB, MOBI) that
you can download immediately if you are interested.
Data Management at Scale, Second Edition Piethein
Strengholt
https://ebookmeta.com/product/data-management-at-scale-second-edition-
piethein-strengholt/
ebookmeta.com
Data Management at Scale Best Practices for Enterprise
Architecture 1st Edition Piethein Strengholt
https://ebookmeta.com/product/data-management-at-scale-best-practices-
for-enterprise-architecture-1st-edition-piethein-strengholt/
ebookmeta.com
Data Management at Scale: Modern Data Architecture with
Data Mesh and Data Fabric (2nd Edition) Piethein
Strengholt
https://ebookmeta.com/product/data-management-at-scale-modern-data-
architecture-with-data-mesh-and-data-fabric-2nd-edition-piethein-
strengholt/
ebookmeta.com
Make Me Your Ho Ho Ho Daddy It s Christmas Daddy 2 1st
Edition Cherry Sweet
https://ebookmeta.com/product/make-me-your-ho-ho-ho-daddy-it-s-
christmas-daddy-2-1st-edition-cherry-sweet/
ebookmeta.com

Biofertilizers Study and Impact Inamuddin
https://ebookmeta.com/product/biofertilizers-study-and-impact-
inamuddin/
ebookmeta.com
Wiedergeburt Legend of the Reïncarnated Warrior Volume 6
1st Edition Brandon Varnell
https://ebookmeta.com/product/wiedergeburt-legend-of-the-reincarnated-
warrior-volume-6-1st-edition-brandon-varnell/
ebookmeta.com
How People Learn: A new model of learning and cognition to
improve performance and education 2nd Edition Nick
Shackleton-Jones
https://ebookmeta.com/product/how-people-learn-a-new-model-of-
learning-and-cognition-to-improve-performance-and-education-2nd-
edition-nick-shackleton-jones/
ebookmeta.com
Privatizing Peace How Commerce Can Reduce Conflict in
Space 1st Edition Wendy N Whitman Cobb
https://ebookmeta.com/product/privatizing-peace-how-commerce-can-
reduce-conflict-in-space-1st-edition-wendy-n-whitman-cobb/
ebookmeta.com
The Green Fairy Book 1st Edition Andrew Lang.
https://ebookmeta.com/product/the-green-fairy-book-1st-edition-andrew-
lang/
ebookmeta.com

Confidence Pocketbook : Little Exercises for a Self-
Assured Life 1st Edition Gill Hasson
https://ebookmeta.com/product/confidence-pocketbook-little-exercises-
for-a-self-assured-life-1st-edition-gill-hasson/
ebookmeta.com

Data Management at Scale
Modern Data Architecture with Data Mesh and Data
Fabric
SECOND EDITION
Piethein Strengholt

Data Management at Scale
by Piethein Strengholt
Copyright © 2023 Piethein Strengholt. All rights reserved.
Printed in the United States of America.
Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North,
Sebastopol, CA 95472.
O’Reilly books may be purchased for educational, business, or sales
promotional use. Online editions are also available for most titles
(http://oreilly.com). For more information, contact our
corporate/institutional sales department: 800-998-9938 or
corporate@oreilly.com.
Acquisitions Editor: Michelle Smith
Development Editor: Shira Evans
Production Editor: Katherine Tozer
Copyeditor: Rachel Head
Proofreader: Piper Editorial Consulting, LLC
Indexer: nSight, Inc.
Interior Designer: David Futato
Cover Designer: Karen Montgomery
Illustrator: Kate Dullea
April 2023: Second Edition

Revision History for the Second Edition
2023-04-10: First Release
See https://oreilly.com/catalog/errata.csp?isbn=9781098138868 for
release details.
The O’Reilly logo is a registered trademark of O’Reilly Media, Inc.
Data Management at Scale, the cover image, and related trade dress
are trademarks of O’Reilly Media, Inc.
The views expressed in this work are those of the author, and do not
represent the publisher’s views. While the publisher and the author
have used good faith efforts to ensure that the information and
instructions contained in this work are accurate, the publisher and
the author disclaim all responsibility for errors or omissions,
including without limitation responsibility for damages resulting from
the use of or reliance on this work. Use of the information and
instructions contained in this work is at your own risk. If any code
samples or other technology this work contains or describes is
subject to open source licenses or the intellectual property rights of
others, it is your responsibility to ensure that your use thereof
complies with such licenses and/or rights.
This work is part of a collaboration between O’Reilly and Microsoft.
See our statement of editorial independence.
978-1-098-13886-8
[LSI]

Foreword
Whenever we talk about software, we inevitably end up talking
about data—how much there is, where it all lives, what it means,
where it came from or needs to go, and what happens when it
changes. These questions have stuck with us over the years, while
the technology we use to manage our data has changed rapidly.
Today’s databases provide instantaneous access to vast online
datasets; analytics systems answer complex, probing questions;
event-streaming platforms not only connect different applications but
also provide storage, query processing, and built-in data
management tools.
As these technologies have evolved, so have the expectations of our
users. A user is often connected to many different backend systems,
located in different parts of a company, as they switch from mobile
to desktop to call center, change location, or move from one
application to another. All the while, they expect a seamless and
real-time experience. I think the implications of this are far greater
than many may realize. The challenge involves a large estate of
software, data, and people that must appear—at least to our users—
to be a single joined-up unit.
Managing company-wide systems like this has always been a dark
art, something I got a feeling for when I helped build the
infrastructure that backs LinkedIn. All of LinkedIn’s data is generated
continuously, 24 hours a day, by processes that never stop. But
when I first arrived at the company, the infrastructure for harnessing
that data was often limited to big, slow, batch data dumps at the
end of the day and simplistic lookups, jerry-rigged together with
homegrown data feeds. The concept of “end-of-the-day batch
processing” seemed to me to be some legacy of a bygone era of
punch cards and mainframes. Indeed, for a global business, the day
doesn’t end.
As LinkedIn grew, it too became a sprawling software estate, and it
was clear to me that there was no off-the-shelf solution for this kind

of problem. Furthermore, having built the NoSQL databases that
powered LinkedIn’s website, I knew that there was an emerging
renaissance of distributed systems techniques, which meant
solutions could be built that weren’t possible before. This led to
Apache Kafka, which combined scalable messaging, storage, and
processing over the profile updates, page visits, payments, and
other event streams that sat at the core of LinkedIn.
While Kafka streamlined LinkedIn’s dataflows, it also affected the
way applications were built. Like many Silicon Valley firms at the turn
of the last decade, we had been experimenting with microservices,
and it took several iterations to come up with something that was
both functional and stable. This problem was as much about data
and people as it was about software: a complex, interconnected
system that had to evolve as the company grew. Handling a problem
this big required a new kind of technology, but it also needed a new
skill set to go with it.
Of course, there was no manual for navigating this problem back
then. We worked it out as we went along, but this book may well
have been the missing manual we needed. In it, Piethein provides a
comprehensive strategy for managing data not simply in a solitary
database or application but across the many databases, applications,
microservices, storage layers, and all other types of software that
make up today’s technology landscapes.
He also takes an opinionated view, with an architecture to match,
grounded in a well-thought-out set of principles. These help to
bound the decision space with logical guardrails, inside of which a
host of practical solutions should fit. I think this approach will be
very valuable to architects and engineers as they map their own
problem domain to the trade-offs described in this book. Indeed,
Piethein takes you on a journey that goes beyond data and
applications into the rich fabric of interactions that bind entire
companies together.

Jay Kreps
Cofounder and CEO at Confluent

Preface
Data management is an emerging and disruptive subject.
Datafication is everywhere. This transformation is happening all
around us: in smartphones, TV devices, ereaders, industrial
machines, self-driving cars, robots, and so on. It’s changing our lives
at an accelerating speed.
As the amount of data generated skyrockets, so does its complexity.
Disruptive trends like cloudification, API and ecosystem connectivity,
microservices, open data, software as a service (SaaS), and new
software delivery models have a tremendous effect on data
management. In parallel, we see an enormous number of new
applications transforming our businesses. All these trends are
fragmenting the data landscape. As a result, we are seeing more
point-to-point interfaces, endless discussions about data quality and
ownership, and plenty of ethical and legal dilemmas regarding
privacy, safety, and security. Agility, long-term stability, and clear
data governance compete with the need to develop new business
cases swiftly. We sorely need a clear vision for the future of data
management.
This book’s perspective on data management is informed by my
personal experience driving the data architecture agenda for a large
enterprise as chief data architect. Executing that role showed me
clearly the impact a good data strategy can have on a large
organization. After leaving that company, I started working as the
chief data officer for Microsoft Netherlands. In this exciting new
position, I’ve worked with over 50 large customers discussing and
attempting to come up with a perfect data solution. Here are some
of the common threads I’ve identified across all enterprises:

An overarching data strategy is often missing or not connected
to the business objectives. Discussions about data management
mostly pivot to technology trends and engineering discussions.
What is needed is business engagement: a good strategy and
well-thought-out data management and analysis plan that
includes tangible value in the form of business use cases. To
make my point: the focus must be put on usage and turning
data into business value.
Enterprises have difficulties in interpreting new concepts like the
data mesh and data fabric, because pragmatic guidance and
experiences from the field are missing. In addition to that, the
data mesh fully embraces a decentralized approach, which is a
transformational change not only for the data architecture and
technology, but even more so for organization and processes.
This means the transformation cannot only be led by IT; it’s a
business transformation as well.
Enterprises find it difficult to comprehend the latest technology
trends. They’re unable to interpret nuances or make pragmatic
choices.
Enterprises struggle to get started: large ambitions often end
with limited action; the execution plan and architecture remain
too high-level, too conceptual; top-down commitment from
leadership is missing.
These experiences and my observations across a range of
enterprises inspired me to write this second edition of Data
Management at Scale. You may wonder why this book is worth
reading, over the first edition—let’s take a closer look.

Why I Wrote This Book and Why Now
The first edition was founded on the experience I gained while
working at ABN AMRO as chief data architect.1
In that role, my team
and I practiced the approach of federation: shifting activities and
responsibilities in response to the need for a faster pace of change.
We used governance for balancing the imperatives of centralization
and decentralization. This shift was supported by a central data team
that started to develop platforms for empowering business units to
meet their goals. With platforms, we introduced self-service and
aligned analysts to domains, supporting them in implementing their
use cases. We experimented with domain-driven design and
eventually switched to business architecture for managing the
architectural landscape as a whole. I used all these experiences as
input for writing the first edition.
The term data mesh as a description of a sociotechnical approach to
using data at large was coined at around the time the manuscript for
the first edition was being finalized. When Zhamak Dehghani’s article
describing the concept appeared on Martin Fowler’s website, it
revealed concrete names for concepts we’d already been using at
ABN AMRO for many years. These names became industry terms,
and the concept quickly began to resonate with large organizations
as a solution to the friction enterprises encounter when scaling up.
So, why write a second edition? To start with, it was the data mesh
concept. I love the ideas of bringing data management and software
architecture closer together and businesses taking ownership of their
data, but I firmly believe that, with all the fuss, a more nuanced
view is needed.
In my previous role as an enterprise architect, we had hundreds of
application teams, thousands of services, and many large legacy
applications to manage. In such situations, you approach complexity
differently. With the data mesh architecture, artist, song, and playlist

are often used as data domain examples. This approach of
decomposing data into fine-grained domains might work well when
designing microservices, but it isn’t well suited to (re)structuring
large data landscapes. A different viewpoint is needed for scale.
Next, a more nuanced and pragmatic view of data products is
needed. There are good reasons why data must be managed
holistically and end-to-end. Enterprises have reusability and
consistency concerns. They’re forced by regulation to conform to the
same dimensions for group reporting, accounting, financial
reporting, and auditing and risk management. I know this might
sound controversial, but a data product cannot be advocated to be
managed as a container: something that packages data, metadata,
code, and infrastructure all together in an architecture as tiny as a
microservice. This doesn’t reflect how today’s big data platforms
work. Finally, the data mesh story isn’t complete: it focuses only on
data that is used for analytical purposes, not operational purposes; it
omits master data management;2
the consumer side must be
complemented with an intelligent data fabric; and it doesn’t provide
much data modeling guidance for building data products.
Another incentive for publishing a second edition was concerns
about the book’s practicality. The first version was perceived by
various readers as too abstract. Some critical reviewers even left
comments questioning my hands-on experience. In this second
edition I’ve worked hard to address these concerns, providing many
real-world examples and concrete solution diagrams. From time to
time, I also refer to blog posts that I’ve written about how to
implement designs. One final note on this: there are a large number
of very complex topics to cover, which are also highly context-
sensitive. It would be impossible to provide examples of everything
in a single volume, so I’ve had to use some discretion.
I’m excited to share my thoughts on best practices and observations
from the field, and I hope this book inspires you. Reflecting on my
time working at ABN AMRO, there are lots of good lessons to be

taken from other enterprises. I’ve seen a lot of good approaches.
There’s no right or wrong when building good data architecture; it’s
all about making the right trade-offs and discovering what works
best for your situation.
If you’ve already read the first edition, you should find this one
significantly different and much improved. Structurally it’s more or
less the same, but every chapter has been revised and enhanced. All
the diagrams have also been revised, new content has been added,
and it’s much more practical. Within each chapter you’ll find many
tips, starting points, and references to helpful articles.

Who Is This Book For?
This book is intended for large enterprises, though smaller
organizations may find much of value in it. It’s geared toward:
Executives and architects
Chief data officers, chief technology officers, chief architects,
enterprise architects, and lead data architects
Analytics teams
Data scientists, data engineers, data analysts, and heads of
analytics
Development teams
Data engineers, data scientists, business intelligence engineers,
data modelers and designers, and other data professionals
Compliance and governance teams
Chief information security officers, data protection officers,
information security analysts, regulatory compliance heads, data
stewards, and business analysts
How to Read or Use This Book
It’s important to say up front that this book touches upon a lot of
complex topics that are often interrelated or intertwined with other
subjects. So we’ll be hopping between different technologies,
business methods, frameworks, and architecture patterns. From time
to time I bring in my own operational experience when
implementing different architectures, so we’ll be working at different
levels of abstraction. To describe the journey through the book, I’ll
use the analogy of a helicopter ride.

We’ll start with a zoomed-out view, looking at data management,
data strategy, and data architecture at an abstract and higher level.
From this helicopter view, we’ll start to zoom in and first explore
what data domains and landing zones are. We’ll then fly to the
source system side of our landscape, in which applications are
managed and data is created, and circle until we have covered most
of the areas of data management. Then we’ll fly over to the
consumer side of the landscape and start learning about the
dynamics there. After that, we’ll bring everything we’ve covered
together by putting things into practice.
To help you navigate through the book, the following table gives a
high-level overview of which subjects will be intensively discussed in
each chapter.

Table P-1. Key topics in each chapter
Ch. 1 Ch. 2 Ch. 3 Ch. 4
Data
management
x
Data strategy x x x
Data
architecture
x x
Data
integration
x
Data
modeling
x
Data
governance
Data security
Data quality x
Metadata
management
MDM
Business
intelligence

Ch. 1 Ch. 2 Ch. 3 Ch. 4
Advanced
analytics
Enterprise
architecture
Chapter 1 introduces the topic of data management. It gives a
contextual view of what data management is, how it’s changing, and
how it affects our digital transformation. It provides an assessment
of the state of the field in recent years and guidance for working out
a data strategy. In Chapter 2, we’ll jump into the details of
managing data at large, exploring domain-driven design and
business architecture as methodologies for managing a large data
landscape using data domains. Next, Chapter 3 focuses on
topologies and data landing zones as a way of structuring your data
architecture and aligning with your data domains.
The following chapters discuss the specifics of distributing data.
Chapter 4 focuses on data products, Command Query Responsibility
Segregation (CQRS), and guiding principles, and presents an
example solution design. Chapter 5 discusses API management, and
Chapter 6 covers event and notification management. Chapter 7
brings it all together for a comprehensive overview, complemented
with architecture guidance and experience.
Next, we delve deeper into more advanced aspects of data
management. Chapter 8 examines how to approach data
governance and security in ways that are practical and sustainable
for the long term, even in rapidly changing times. Chapter 9 is a
deep dive into the use, significance, and democratizing potential of
metadata. Chapter 10 offers guidance on using master data

management (MDM) to keep data consistent over distributed, wide-
ranging assets, while Chapter 11 addresses turning data into value.
Chapter 12 concludes the book with an example of making it real
and a vision for the future of data management and enterprise
architecture.
Conventions Used in This Book
The following typographical conventions are used in this book:
Italic
Indicates new terms, URLs, email addresses, filenames, and file
extensions.
Constant width
Used for program listings, as well as within paragraphs to refer to
program elements such as variable or function names, databases,
data types, environment variables, statements, and keywords.
TIP
This element signifies a tip or suggestion.
NOTE
This element signifies a general note.
WARNING
This element indicates a warning or caution.

O’Reilly Online Learning
NOTE
For more than 40 years, O’Reilly Media has provided technology and
business training, knowledge, and insight to help companies succeed.
Our unique network of experts and innovators share their knowledge
and expertise through books, articles, and our online learning
platform. O’Reilly’s online learning platform gives you on-demand
access to live training courses, in-depth learning paths, interactive
coding environments, and a vast collection of text and video from
O’Reilly and 200+ other publishers. For more information, visit
http://oreilly.com.
How to Contact Us
Please address comments and questions concerning this book to the
publisher:
O’Reilly Media, Inc.
1005 Gravenstein Highway North
Sebastopol, CA 95472
800-998-9938 (in the United States or Canada)
707-829-0515 (international or local)
707-829-0104 (fax)
We have a web page for this book, where we list errata, examples,
and any additional information. You can access this page at

https://oreil.ly/data-mgmt-at-scale-2e.
Email bookquestions@oreilly.com to comment or ask technical
questions about this book.
For more information about our books, courses, conferences, and
news, see our website at http://www.oreilly.com.
Find us on Facebook: http://facebook.com/oreilly.
Follow us on Twitter: http://twitter.com/oreillymedia.
Watch us on YouTube: http://youtube.com/oreillymedia.
Acknowledgments
I would like to acknowledge Jessica Strengholt-Geitenbeek for
allowing me to write this book. She has supported me throughout
this journey, taking care of the kids and creating room to allow me
to work on this, and she’s the love of my life.
I also would like to thank ABN AMRO, and especially Santhosh Pillai
for his trust and for guiding me throughout my career at the
company. Many of the initial ideas for this project originated in his
mind. Without the countless discussions he and I had, this book
wouldn’t exist. Next, I would like to thank Microsoft for providing the
support I needed to write this second edition. In addition, many
others provided support and feedback on the book: thanks to Tim
Ward (CEO at CluedIn), Batuhan Tuter, Nasim Merhshid, Rob Worrall,
Frank Leisten, and all the others who contributed in various ways.
Thanks also to the book’s technical reviewers, John Mallinder and
Ole Olesen-Bagneux. Your valuable insights and feedback helped
validate the technical content and make this a better book.
Finally, I would like to thank all the fantastic crew members from
O’Reilly for their support and trust. Shira, thank you for taking care
of me. I enjoyed our conversations, and I’m grateful for your
constructive feedback. Katie, thank you for your continuous support

and transparency. To my fantastic copyeditor Rachel Head, thank
you for your hard work to review and edit all content. You really
have done an outstanding job by debugging the content and
connecting my sentences.
1 The statements and opinions expressed in this book don’t necessarily
reflect the positions of ABN AMRO or Microsoft.
2 The terminology “master/slave” is clearly offensive, and many organizations
have switched to alternatives like “source/replica” or “primary/subordinate.”
We strive to be as inclusive as possible, but will use “master data
management” in this book because the industry hasn’t yet adopted an
alternative.

Chapter 1. The Journey to
Becoming Data-Driven
The pre-COVID-19 world was already fast and highly data-driven,
but the pace of change has accelerated rapidly. Fierce competition, a
digital-first era, ever-increasing customer expectations, and rising
regulatory scrutiny require organizations to transform themselves
into modern data-driven enterprises. This transformation will
inevitably result in future organizations being more digital than those
of today, and having a different view of data. Tomorrow’s
organizations will breathe data and embrace a philosophy that places
it at the heart of their business. They will manage data as a product,
make strategic decisions based on data analysis, and have a culture
that acts on data.
Data-driven isn’t just a buzzword.1
Being data-driven provides an
organization with a significant competitive advantage over other
organizations. It can be proactive, and it can predict what will
happen before it does. By using data correctly, organizations can
quickly react to changes. Using data leads to greater confidence
because decisions are based on facts, not intuition. With data, new
industry trends and business opportunities can be spotted sooner.
Customer retention and satisfaction are improved as well, because
data tells organizations what customers think, how they behave, and
what they want. With data, organizations can be more flexible, agile,
and cost effective, because data provides insights into measured
results, employee loyalty, dependencies, applications, and processes.
So, the imperative for organizations to transform themselves into
data-driven enterprises is definitively there.
Before we jump into the transformation itself, we’ll explore the
present-day challenges that require us to reevaluate how data must

be managed. We’ll establish a common definition of data
management, encompassing all the disciplines and activities related
to managing data as an asset. After that, we’ll zoom in on several
key technology developments and industry trends, and consider their
impact on data management. We’ll look at some best practices from
the last decade of data management, providing insights into why
previous-generation architectures are hard to scale. Finally, we’ll
consider what a next-generation data architecture might look like,
and I’ll present a set of action points that you will need to address
while developing your data strategy.
Recent Technology Developments and
Industry Trends
Transforming an organization to become data-driven isn’t easy. It’s a
long-term process that requires patience and fortitude. With more
data available, traditional architectures can no longer be scaled up
because of their size, complexity, monolithic designs, and centralistic
operating models. Enterprises need a new data strategy and cloud-
based architecture. A paradigm shift and change of culture are
needed, too, because the centralized data and IT operating models
that work today will no longer work when applying federated data
ownership and self-serve consumption models. This requires
organizations to redefine how people, processes, and technology are
aligned with data.
Recent technology developments and industry trends force us to
reevaluate how data must be managed. We need to shift away from
funneling all data into a single silo toward an approach that enables
domains, teams, and users to distribute, consume, and use data
themselves easily and securely. Platforms, processes, and patterns
should simplify the work for others. We need interfaces that are
simple, well documented, fast, and easy to consume. We need an
architecture that works at scale.

Although there are many positives about evolving into a truly data-
driven organization, it’s important to be aware of several technology
developments and industry trends that are impacting data
landscapes. In this chapter, I’ll discuss each of these and its
influence on data management. Firstly, analytics is fragmenting the
data landscape because of use case diversity. Secondly, new
software development methodologies are making data harder to
manage. Thirdly, cloud computing and faster networks are
fragmenting data landscapes. In addition, there are privacy, security,
and regulatory concerns to be aware of, and the rapid growth of
data and intensive data consumption are making operational
systems suffer. Lastly, data monetization requires an ecosystem-to-
ecosystem architecture. The impact these trends have on data
management is tremendous, and they are forcing the whole industry
to rethink how data management must be conducted in the future.
Fortunately, new approaches to data management have emerged
over the past few years, including the ideas of a data mesh and data
fabric:
The data mesh is an exciting new methodology for managing
data at large. The concept foresees an architecture in which
data is highly distributed and a future in which scalability is
achieved by federating responsibilities. It puts an emphasis on
the human factor and addressing the challenges of managing
the increasing complexity of data architectures.
The data fabric is an approach that addresses today’s data
management and scalability challenges by adding intelligence
and simplifying data access using self-service. In contrast to the
data mesh, it focuses more on the technology layer. It’s an
architectural vision using unified metadata with an end-to-end
integrated layer (fabric) for easily accessing, integrating,
provisioning, and using data.

These emerging approaches to data management are
complementary and often overlap. Despite popular belief among
data practitioners and what commercial vendors say, they shouldn’t
be seen as rigid or standalone techniques. In fact, I expect these
approaches to coexist and complement one another and any existing
investments in operational data stores, data warehouses, and data
lakes.
In the transition to becoming data-driven, organizations need to
make trade-offs to balance the imperatives of centralization and
decentralization. Some prefer a high degree of autonomy for their
business teams, while others prioritize quality and control. Some
organizations have a relatively simple structure, while others are
brutally large and complex. Creating the perfect governance
structure and data architecture isn’t easy, so while developing your
strategy, I encourage you to view these approaches to data
management as frameworks. There’s no right or wrong. With the
data mesh approach, for example, you might like some of the best
practices and principles, but not others; you don’t necessarily have
to apply all of them.
In this book, I’ll share my view on data management—one that is
based on observations from the field while working closely with
many large enterprises, and that helps you to make the right
decisions by learning from others. We’ll go beyond the concepts of
data mesh and data fabric, because I strongly believe that a data
strategy should be inclusive of both the operational and analytical
planes, and that the decomposition method for both data domains
and data products must be altered to fit the scale of large
enterprises. To help you in your journey, I’ll share my observations
on the strategies and architectures different enterprises have
designed, why, and what trade-offs they have made.
Before we jump into details, we need to agree on what data
management is, and why it’s important. Next, we need to determine
how to define boundaries and shape our landscape based on various

trade-offs. Finally, we’ll examine how current enterprise data
architectures can be designed and organized for today and
tomorrow.
Let me lay my cards out on the table: decentralization is not a
desired state, but the inevitable future of data. I therefore have a
strong belief that scalability is forcing data management to become
more decentrally organized. Managing data at scale requires you to
federate key responsibilities, set strong standards, and properly align
central and local resources and activities. This change affects
multiple areas: people, processes, and technology. It forces you to
decompose your architecture, dividing and grouping responsibilities.
The shift from centralization to decentralization also contradicts the
established best practice from the past decade: building large data
silos in which all data is collected and integrated before being
served. Although data warehouse and lake architectures are
excellent approaches for utilizing data, these centralized models are
not suited to a decentralized distributed data architecture.
Now that we’ve set the scene, I ask that you take a deep breath and
put your biases aside. Many of us might be deeply invested in
centralized data architectures; this has been a best practice for many
years. I acknowledge that the need for data harmonization and for
bringing large amounts of data into a particular context remains, and
that doing so brings value to organizations, but something we must
consider is the scale at which we want to apply this discipline. In a
highly distributed ecosystem with hundreds or even thousands of
applications, is the best way of managing data to apply centralization
on all dimensions? Is it best to integrate and harmonize all data?
Data Management
The term data management refers to the set of processes and
procedures used to manage data. The Data Management
Association’s Data Management Body of Knowledge (DAMA-DMBOK)

has a more extensive explanation of data management, which it
defines as “the development, execution, and supervision of plans,
policies, programs, and practices that deliver, control, protect, and
enhance the value of data and information assets throughout their
life cycles.”2
The DAMA-DMBOK identifies 11 functional areas of data
management, with data governance at the heart, as shown in
Figure 1-1. It’s crucial to embed all of these deeply into your
organization. Otherwise, you’ll lack insight and become ineffective,
and your data will get out of control. Becoming data-driven—getting
as much value as possible out of your data—will become a
challenge. Analytics, for example, is worth nothing if you have low-
quality data.

Figure 1-1. The 11 functional areas of data management
The activities and disciplines of data management are wide ranging
and cover multiple areas, some closely related to software
architecture.3
In this book, I’ll focus on the aspects of data
management that are most relevant for managing a modern data
architecture at scale. Let’s take a closer look at the 11 areas
identified in this figure and where they’re covered in the book:

Data governance, shown at the heart of Figure 1-1, involves all
activities around implementing and enforcing authority and
control over the management of data, including all
corresponding assets. This area is described in detail in
Chapter 8.
Data architecture involves the definition of the master plan for
your data, including the blueprints,4
reference architectures,
future state vision, and dependencies. Managing these helps
organizations make decisions. The entire book revolves around
data architecture generally, but the discipline and its activities
will be covered fully in Chapters 2 and 3.
Data modeling and design is about structuring and representing
data within a specific context and specific systems. Discovering,
designing, and analyzing data requirements are all part of this
discipline. We’ll discuss these topics in Chapters 4, 7, and 11.
Data storage and operations refers to the management of the
database design, correct implementation, and support in order
to maximize the value of the data. Database management also
includes database operations management. We’ll address this in
Chapter 11.
Data security includes all disciplines and activities that provide
secure authentication, authorization, and access to the data.
These activities include prevention, auditing, and escalation-
mitigating actions. This area is described in more detail in
Chapter 8.
Data integration and interoperability includes all the disciplines
and activities related to moving, collecting, consolidating,
combining, and transforming data in order to move it efficiently
from one context into another. Data interoperability refers to the
capability to communicate, invoke functions, or transfer data
among various applications in a way that requires little or no

knowledge of the application characteristics. Data integration,
on the other hand, is about consolidating data from different
(multiple) sources into a unified view. This process, which I
consider most important, is often supported by extra tools, such
as replication and ETL (extract, transform, and load) tools. It’s
described extensively in Chapters 4, 5, and 6.
Document and content management is the process of managing
data stored in unstructured (media) and data formats. Some
aspects of this will be discussed in Chapters 5 and 6.
Reference and master data management is about managing
critical data to make sure the data is accessible, accurate,
secure, transparent, and trustworthy. This area is described in
more detail in Chapter 10.5
Data warehousing and business intelligence management
includes all the activities that provide business insights and
support decision making. This area, including advanced
analytics, is described in more depth in Chapter 11.
Metadata management involves managing all data that classifies
and describes the data. Metadata can be used to make the data
understandable, ready for integration, and secure. It can also be
used to ensure the quality of data. This area is described in
more detail in Chapter 9.
Data quality management includes all activities related to
managing the quality of data to ensure the data can be used.
Some aspects of this area are described in Chapters 2 and 3.
The part of the DAMA-DMBOK that needs more work, which inspired
me to write the first edition of this book, is the section on data
integration and interoperability. I believe this section is lacking
depth: the relationship to application integration and software
architecture is not clear. It doesn’t discuss decentralized
architectures, and it lacks modern guidance on the interoperability of

data, such as observability best practices and modern data pipeline
management. In addition, the link to metadata management is
weak. Metadata needs integration and interoperability, too, because
it is scattered across many tools, applications, platforms, and
environments in diverse shapes and forms. The interoperability of
metadata—the ability of two or more systems or components to
exchange descriptive data about data—gets insufficient treatment:
building and managing a large-scale architecture is very much about
metadata integration. Interoperability and metadata also aren’t well
connected to the area of data architecture. If metadata is utilized in
the right way, you can see what data passes by, how it can be
integrated, distributed, and secured, and how it connects to
applications, business capabilities, and so on. There’s limited
guidance in the DAMA-DMBOK about managing your data as a whole
by utilizing and connecting metadata.
Another concern I have is the view DAMA and many organizations
have on achieving end-to-end semantic consistency. As of today,
attempts to unify semantics to provide enterprise-wide consistency
are still taking place. This is called a single version of the truth.
However, applications are always unique, and so is data. Designing
applications involves a lot of implicit thinking. The (domain) context
of the business problem influences the design of the application and
finds its way into the data. We pass through this context when we
move from conceptual design into logical application design and
physical application design.6
It’s essential to understand this because
it frames any future architecture. When data is moved across
applications, a data transformation step is always necessary. There’s
no escape from this data transformation dilemma! In the following
chapters, I’ll return to this idea.
Another view I see in many organizations is that data management
should be central and must be connected to the strategic goals of
the enterprise. Some organizations still believe that operational costs
can be reduced by centralizing all data and management activities.

There’s also a deep assumption that a centralized platform can take
away the pain of data integration for its users and consumers.
Companies have invested heavily in their enterprise data platforms,
which include data warehouses, data lakes, and service buses. The
activities of master data management are strongly connected to
these platforms because consolidating allows us to simultaneously
improve the accuracy of our most critical data.
A centralized platform—and the centralized model that comes with it
—will be subject to failure because it won’t be able to keep up with
the developments and trends that underpin decentralization, such as
analytics, cloud computing, new software development
methodologies, real-time decision making, and data monetization.
While they may be aware of these trends, many companies fail to
comprehend the impact they have on data management. Let’s
examine the most important trends and determine the magnitude of
that impact.
Analytics Is Fragmenting the Data Landscape
The most trailblazing trend is advanced analytics, which exploits data
to make companies more responsive, competitive, and innovative.
Why does advanced analytics disrupt the existing data landscape?
With more data available, the number of options and opportunities
skyrockets. Advanced analytics is about making what-if analyses,
projecting future trends and outcomes or events, detecting hidden
relations and behaviors, and automating decision making. Because
of the recognized value and strategic benefits of advanced analytics,
many methodologies, frameworks, and tools have been developed to
use it in divergent ways. We’ve only scratched the surface of what
artificial intelligence (AI), machine learning (ML), and natural
language processing (NLP) will be capable of in the future.

NOTE
OpenAI’s new ChatGPT is a mind-blowing example of what AI is capable
of. The models behind OpenAI, which include the Generative Pre-trained
Transformer (GPT) series, can work on complex tasks such as analyzing
math problems, writing code snippets, producing book essays, creating
recipes using a list of ingredients, and much more.
These analytical trends force data to be distributed across many
analytical applications because every individual use case requires
different data. Unique business problems require unique thinking,
unique data, and optimized technology to provide the best solution.
Take, for example, a marketing business unit whose goal is to
identify new sales opportunities for older and younger customers.
Targeting these two audiences requires different features—
measurable properties for analyzing—in datasets. For example,
prospects that are younger will be segmented and clustered
differently than prospects that are older. Asking the marketing
department to use a single dataset for both target audiences
requires many compromises, and you’ll probably end up with a
feature store that doesn’t add any value to each use case.7
The
optimal solution for generating the most value is to give either use
case its own unique set of features optimized for each learning
algorithm. The increasing popularity of advanced analytics and
resulting use case diversity issues lead to two problems: data
proliferation and data intensiveness.
With data proliferation, data is distributed and scattered across a
myriad of locations, applications, and databases. This is because
consuming domains need to process the data to fit it into their
unique solutions. This data distribution introduces other problems.
For one, when data is repeatedly copied and scattered throughout
the organization, it becomes more difficult to find its origin and
judge its quality. This requires you to develop a single logical view of

the same data that is managed in different locations. Additionally,
extensive data distribution makes controlling the data much more
difficult because data can be spread even further as soon as it leaves
any given application. This requires you to develop a framework for
efficiently reusing data, while applying governance and staying
compliant with external regulations.
The proliferation of analytical techniques is also accelerating the
growth of data intensiveness: the read-versus-write ratio is changing
significantly. Analytical models that are constantly retrained, for
example, constantly read large volumes of data. This impacts
application and database designs because we need to optimize for
data readability. It might also mean that we need to duplicate data
to relieve systems from the pressure of constantly serving it, or to
preprocess the data because of the large number of diverse use case
variations and their associated read patterns. Additionally, we might
need to provide different representations of the same data for many
different consumers. Facilitating a high variety of read patterns while
duplicating data and staying in control isn’t easy. A solution for this
problem will be provided in Chapter 4.
The Speed of Software Delivery Is Changing
In today’s world, software-based services are at the core of most
businesses, which means that new features and functionality must
be delivered quickly. In response to the demands for greater agility,
new ideologies have emerged at companies like Amazon, Netflix,
Meta, Google, and Uber. These companies have advanced their
software development practices based on two beliefs.
The first belief is that software development (Dev) and information
technology operations (Ops) must be combined to shorten the
systems development life cycle and provide continuous delivery with
high software quality. This methodology, called DevOps, requires a

new culture that embraces more autonomy, open communication,
trust, transparency, and cross-discipline teamwork.
The second belief is about the size at which applications must be
developed. Flexibility and speed of development are expected to
increase when applications are transformed into smaller decomposed
services. This development approach incorporates several
buzzwords: microservices, containers, Kubernetes, domain-driven
design, serverless computing, etc. I won’t go into detail on all of
these concepts yet, but it’s important to recognize that this evolution
in software development involves increased complexity and a greater
demand to better control data.
The transformation of monolithic applications into distributed
applications—for example, microservices—creates many difficulties
for data management. When breaking up applications into smaller
pieces, the data is spread across different smaller components.
Development teams must also transition their (single) unique data
stores, where they fully understand the data model and have all the
data objects together, to a design where data objects are spread all
over the place. This introduces several challenges, including
increased network communication, data read replicas that need to
be synchronized, difficulties when combining many datasets, data
consistency problems, referential integrity issues, and so on.
The recent shift in software development trends requires an
architecture that allows more fine-grained applications to distribute
their data. It also requires a new DataOps culture and a different
design philosophy with more emphasis on data interoperability, the
capture of immutable events, and reproducible and loose coupling.
We’ll discuss this in more detail in Chapter 2.
The Cloud’s Impact on Data Management Is

Immeasurable
Networks are becoming faster, and bandwidth increases year after
year. Large cloud vendors have proven that it’s possible to move
terabytes of data in the cloud in minutes, which allows for an
interesting approach: instead of bringing the computational power to
the data—which has been the common best practice because of
network limitations—we can turn it around and bring the data to the
computational power by distributing it. The network is no longer the
bottleneck, so we can move data quickly between environments to
allow applications to consume and use it. This model becomes
especially interesting as software as a service (SaaS) and machine
learning as a service (MLaaS) markets become more popular.
Instead of doing all the complex stuff in-house, we can use networks
to provide large quantities of data to other parties.
This distribution pattern of copying (duplicating) data and bringing it
to the computational power in a different facility, such as a cloud
data center, will fragment the data landscape even more, making a
clear data management strategy more important than ever. It
requires you to provide guidelines, because fragmentation of data
can negatively impact performance due to data access lag. It also
requires you to organize and model your data differently, because
cloud service providers architected separate compute and storage to
make them independently scalable.
Privacy and Security Concerns Are a Top
Priority
Enterprises need to rethink data security in response to the
proliferation of data sources and growing volume of data. Data is
inarguably key for organizations looking to optimize, innovate, or
differentiate themselves, but there is also a darker side with
unfriendly undertones that include data thefts, discrimination, and
political harm undermining democratic values. Let’s take a look at a

few examples to get an idea of the impact of bad data privacy and
security.
UpGuard maintains a long list of the biggest data breaches to date,
many of which have capitalized on the same mistakes. The
Cambridge Analytica breach and 500 million hacked accounts at
Marriott are impressive examples of damaging events. Governments
are increasingly getting involved in security and privacy because all
aspects of our personal and professional lives are now connected to
the internet. The COVID-19 pandemic, which forced so many of us
to work and socialize from home, accelerated this trend. Enterprises
cannot afford to ignore the threats of intellectual property
infringements and data privacy scandals.
The trends of massive data, more powerful advanced analytics, and
faster distribution of data have triggered a debate around the
dangers of data, raising ethical questions and discussions. Let me
share an example from my own country. In the Netherlands, the
Dutch Tax Administration practiced unlawful and discriminatory
activities. They tracked dual nationals in their systems and used
racial and ethnic classifications to train models for entitlements to
childcare benefits. The result: thousands of families were incorrectly
classified as criminals and had their benefits stopped. They were
ordered to repay what they already had received, sometimes
because of technical transgressions such as failing to correctly sign a
form. Some people were forced to sell their homes and possessions
after they were denied access to debt restructuring.
This is just one example of improper use of data. As organizations
inevitably make mistakes and cross ethical lines, I expect
governments to sharpen regulation by demanding more security,
control, and insight. We’ve only scratched the surface of true data
privacy and ethical problems. Regulations, such as the new
European laws on data governance and artificial intelligence will
force large companies to be transparent about what data is collected
and purchased, what data is combined, how data is used within

analytical models, and what data is distributed (sold). Big companies
need to start thinking about transparency and privacy-first
approaches and how to deal with large regulatory issues now, if they
haven’t already.
Regulation is a complex subject. Imagine a situation in which several
cloud regions and different SaaS services are used and data is
scattered. Satisfying regulations, such as the GDPR, CCPA, BCBS
239, and the new Trans-Atlantic Data Privacy Framework is difficult
because companies are required to have insight and control over all
personal data, regardless of where it is stored. Data governance and
correctly handling personal data is at the top of the agenda for many
large companies.8
These stronger regulatory requirements and data ethics concerns
will result in further restrictions, additional processes, and enhanced
control. Insights about where data originated, how models are
trained, and how data is distributed are crucial. Stronger internal
governance is required, but this trend of increased control runs
contrary to the methodologies for fast software development, which
involve less documentation and fewer internal controls. It requires a
different, more defensive viewpoint on how data management is
handled, with more integrated processes and better tools. Several of
these concerns will be addressed in Chapter 8.
Operational and Analytical Systems Need to Be
Integrated
The need to react faster to business events introduces new
challenges. Traditionally, there has been a great divide between
transactional (operational) applications and analytical applications
because transactional systems are generally not sufficient for
delivering large amounts of data or constantly pushing out data. The
accepted best practice has been to split the data strategy into two

parts: operational transactional processing and analytical data
warehousing and big data processing.
However, this divide is subject to disruption. Operational analytics,
which focuses on predicting and improving the existing operational
processes, is expected to work closely with both the transactional
and analytical systems. The analytical results need to be integrated
back into the operational system’s core so that insights become
relevant in the operational context. I could make the same argument
for real-time events: when events carry state, the same events can
be used for operational decision making and data distribution.
This trend requires a different integration architecture, one that
better manages both the operational and analytical systems at the
same time. It also requires data integration to work at different
velocities, as these tend to be different for operational systems and
analytical systems. In this book, we’ll explore the options for
preserving historical data in the original operational context while
simultaneously making it available to both operational and analytical
systems.
Organizations Operate in Collaborative
Ecosystems
Many people think that all business activities take place within the
single logical boundary in which the enterprise operates. The reality
is different, because many organizations work closely with other
organizations. Companies are increasingly integrating their core
business capabilities with third-party services. This collaboration
aspect influences the design of your architecture because you need
to be able to quickly distribute data, incorporate open data,9
make
APIs publicly available, and so on.
These changes mean that data is more often distributed between
environments, and thus is more decentralized. When data is shared

with other companies, or using cloud or SaaS solutions, it ends up in
different places, which makes integration and data management
more difficult. In addition, network bandwidth, connectivity, and
latency issues inevitably arise when moving data between platforms
or environments. Pursuing a single public cloud or single platform
strategy won’t solve these challenges, because decentralization is
the future. This means that if you want APIs and SaaS systems to
work well and use the capabilities of the public cloud, you must be
skilled at data integration, which this book will teach you how to do.
The trends discussed here are major and will affect the way people
use data and the way companies should organize their architectures.
Data growth is accelerating, computing power is increasing, and
analytical techniques are advancing. Data consumption is increasing,
too, which implies that data needs to be distributed quickly. Stronger
data governance is required. Data management also must be
decentralized due to trends like cloud services, SaaS, and
microservices. All of these factors have to be balanced with a short
time to market, thanks to strong competition. This risky combination
challenges us to manage data in a completely different way.
Enterprises Are Saddled with Outdated Data
Architectures
One of the biggest problems many enterprises are dealing with is
getting value out of their current data architectures.10
The majority
of existing data architectures use a monolithic design—either an
enterprise data warehouse (EDW)11
or a data lake—and manage and
distribute data centrally. In a highly distributed and consumer-driven
environment, these architectures won’t fulfill future needs. Let’s look
at some of the characteristics of each.

The Enterprise Data Warehouse: A Single
Source of Truth
Most enterprises’ first-generation data architectures are based on
data warehousing and business intelligence. The philosophy is that
centralization is the silver bullet to solve the problem of data
management. With this approach there’s one central integrated data
repository, containing years’ worth of detailed and stable data, for
the entire organization. But when it comes to distributing data at
large, such a monolithic design has several drawbacks.
Enterprise data unification is an incredibly complex undertaking and
takes many years to complete. Chances are relatively high that the
meaning of data differs across domains,12
departments, and
systems, even if data attributes have similar names. So, we either
end up creating many variations or just accepting the differences
and inconsistencies. The more data we add, and the more conflicts
and inconsistencies that arise, the more difficult it will be to
harmonize the data. We will likely end up with a unified context that
is meaningless for the end users. In addition, for advanced analytics
such as machine learning, leaving context out can be a significant
problem because it’s impossible to correctly predict the future if the
data is meaningless or unrecognizable.
Enterprise data warehouses behave like integration databases, as
illustrated in Figure 1-2. They act as data stores for multiple data-
consuming applications. This means that they’re a point of coupling
between all the applications that want to access the data. It’s often
assumed that centralization is the solution to data management—
this includes centralizing all data and management activities using
one central team, building one data platform, using one ETL
framework and one canonical model, etc. However, with a
centralized architecture, changes need to be carried out carefully
because of the many cross-dependencies between different
applications. Some changes can also trigger a ripple effect of other

Another Random Scribd Document
with Unrelated Content

The Project Gutenberg eBook of The Festival
of Spring, from the Díván of Jeláleddín

This ebook is for the use of anyone anywhere in the United
States and most other parts of the world at no cost and with
almost no restrictions whatsoever. You may copy it, give it away
or re-use it under the terms of the Project Gutenberg License
included with this ebook or online at www.gutenberg.org. If you
are not located in the United States, you will have to check the
laws of the country where you are located before using this
eBook.
Title: The Festival of Spring, from the Díván of Jeláleddín
Author: Maulana Jalal al-Din Rumi
Contributor: W. Hastie
Release date: April 29, 2018 [eBook #57068]
Language: English
Credits: Produced by Turgut Dincer, Chris Pinfield and the Online
Distributed Proofreading Team at http://www.pgdp.net
(This
book was produced from images made available by the
HathiTrust Digital Library.)
*** START OF THE PROJECT GUTENBERG EBOOK THE FESTIVAL OF
SPRING, FROM THE DÍVÁN OF JELÁLEDDÍN ***

Transcriber's Note.
Apparent
typographical errors
have been
corrected.
Variations in
transliteration have
been retained.
References to three
Notes at the end of
the text have been
included in the
Table of Contents.
The Festival of Spring
PUBLISHED BY
JAMES MACLEHOSE AND SONS,
GLASGOW,
Publishers to the University.
MACMILLAN AND CO.,
LTD., LONDON.
New York, The
Macmillan
Co.
London, Simpkin,
Hamilton
and Co.

Cambridge, Macmillan
and Bowes.
Edinburgh, Douglas
and Foulis.
MCMIII.
The Festival of
Spring
from
The Díván of
Jeláleddín
Rendered in English Gazels
after Rückert's Versions
With an Introduction And a
Criticism of the Rubáiyát of
Omar Khayyám
By
William Hastie, D.D.
Professor of Divinity, University
of Glasgow
Glasgow
James MacLehose and
Sons

Publishers to the University
1903
'In Depth of Conception,
as well as in Loftiness of
Flight and Sublimity of
Language, Jeláleddín
surpasses all the Poets of
the East.'
Professor Hermann Ethé.
'The greatest Mystical Poet
of any Age.'
Reynold A. Nicholson.
'And all the Breeze of Fancy
blows,
And every Dew-drop
paints a bow,
The wizard Lightnings
deeply glow,
And every Thought breaks
out a Rose.'
Tennyson.
This Book with its Sincere
Utterances of Love and
Friendship

towards the Highest
is dedicated to
William A.
Sanderson, Esq.
Byethorne, Galashiels
My ever faithful Friend
in Adversity, as in
Prosperity
Old Songs are sweetest
Old Friends are best

Note
The current, popular spelling of Persian Names and Words has
been generally adopted in the following pages, in order to avoid any
appearance of pedantry. The Turkish forms have occasionally been
preferred when in place, e.g. Devlet for Daulat, and Mevlānā for
Maulānā. The exact transliteration of the Persian—such as Jalálu-'d-
Din, Shams-ud-Din, Umar, Ghasal, Díwán—will be found in the foot-
note references to more learned Works.

Contents
Introduction
Jeláleddín as a Persian Poet—Judgments of Scholars
and Experts in Persian Literature since Sir W.
Jones—The Philosophical and Theological Interest
—Hegel—Tholuck—The Poetical Form—The Gazel
—The Divan—Fitzgerald's Omar Khayyám—Burns
—Browning—Keats's Nightingale—Coleridge
echoes the Faith of Jeláleddín.
Fifty Gazels of Jeláleddín
Page
I. Light, 1
II. Death and Life, 2
III. Invocation, 3
IV. Faith, 4
V. Dawn, 5
VI. Allah Hu, 6
VII. Spring, 7
VIII. Spring's Festival, 8
IX. Dependence, 9
X. Mystical Union, 10
XI. Identity, 12
XII. Confession, 13
XIII. Discordia Concors, 14

XIV. Renovation, 15
XV. Revolving in Mystic Dance, 16
XVI. The Soul in All, 17
XVII. Responsibility, 18
XVIII. Action, 19
XIX. Bondage, 20
XX. Love's Freedom, 21
XXI. In My Heart, 22
XXII. Not Deaf to Love, 23
XXIII. Assimilation, 24
XXIV. Cleanliness, 25
XXV. Where is He? 26
XXVI. Love's Slavery, 27
XXVII. Psyche in Tears, 28
XXVIII. Substitutional, 29
XXVIX. God's Throne, 30
XXX. The Lion of God, 31
XXXI. Self-Realisation, 33
XXXII. Thy Hand, 34
XXXIII. The Priests, 35
XXXIV. The Pilgrims, 36
XXXV. Many Faiths, One Lord, 37
XXXVI. Love Absolute, 38
XXXVII. Renunciation, 39
XXXVIII. All Fulness, 40
XXXIX. Friendship, 41
XL. The Friend Supreme, 42
XLI. Immortality, 44
XLII. The First and Last, 45

XLIII. Mystic Love Dance, 46
XLIV. Dream Fear, 47
XLV. The Cry of Love, 48
XLVI. Night Thought, 49
XLVII. Up out of Night, 50
XLVIII. All One, 52
XLIX. O Wake in Me, 53
L. Jeláleddín, 55
Notes
A.
Sir William Jones on the Mystical
Poetry of the Persians.
57
B.
Hegel on the Character of the
Persian Lyrical Poetry.
59
C.
Von Hammer's Account of Omar
Khayyám.
61

Introduction
I.
Jeláleddín Rúmí (A.D. 1207-1273) is now universally recognised by
'those who know,' as the greatest of the Persian Mystical Poets. This
supremacy, in his own sphere, has been unanimously accorded to
him for more than six centuries, by unnumbered myriads of his own
disciples and followers in the Oriental World, who have been wrapt
in devoutest admiration of the great Master to whom they have
owed the highest joy and inspiration of their spiritual life. And at
last, in our own Western World, the great Persian scholars of
Europe, looking at him without personal or national bias, and
through the clear, cold light of the new time, have come more and
more, as with one voice, to join in this chorus of praise. His most
appreciative recent editor and interpreter in England, in presenting a
few leaves plucked with reverent hand from what he calls
Jeláleddín's 'wreath of imperishable Lyric Song,' offers his own
careful and conscientious work to us, as a contribution 'to a better
appreciation of the greatest mystical poet of any age.' And with this
designation, as summing up the judgment of a capable expert and
critic—strange as it may sound—we venture, in all deference and
sincerity, to agree. Jeláleddín is now rising upon our literary horizon
in all his native Splendour—his name appropriately signifying 'The
Splendour of the Faith'—as at once the Dante, the St. Bernard, the
Spenser, the Milton, the Angelus Silesius, and the Novalis of the
Orient. As a religious Lyrical Poet his mellifluous music, his variety of
strain, his captivating charm of words, his purity of feeling, his
joyous faith, and his elevation of thought, have never been
surpassed in their own kind. Taking what Matthew Arnold has called
'the lyrical cry' even in its widest range, it would be doing no one
wrong—although it dare hardly be done as yet—to rank Jeláleddín,

when he comes fully before us 'with all his singing robes about him,'
with the very highest—with Shakespeare, with Wordsworth and
Keats and Shelley, and with Goethe and Heine! He is certainly one of
the most fertile poets of Nature among the Lyrical Singers of all
time, and the most exuberant, if not also the most spiritual, Hymnist
the world outside of Christendom has yet produced.
This estimate, however shaded or qualified, cannot but appear at
first strangely exaggerated, and out of all just proportion, to those
who mayhap read the name of Jeláleddín now for the first time. Let
us listen, then, to the greatest students of Persian Poetry in the
critical Nineteenth Century, the judges who have highest authority
on the subject, and who have the best right to pronounce judgment
on Jeláleddín. And let us hear in the first place, as is his due, the
most learned Historian of Persian Poetry in the Nineteenth Century,
who with indefatigable industry and completest knowledge has
adorned his pages with Extracts from no less than Two Hundred of
Persia's greatest Poets. Joseph von Hammer, the great Austrian
Orientalist (known later as Baron Von Hammer-Purgstall and as the
Historian of Arabic Literature in seven immense volumes, containing
Accounts of nearly ten thousand Authors) says:—
'Jeláleddín Rumi is the greatest Mystical Poet of the East, the Oracle of the
Sofis, the Nightingale of the contemplative life, the Author of the Mesnevi (a
celebrated double-rhymed ascetic poem), and the Founder of the Mevlevi,
the most famous Order of Mystical Dervishes. As Founder of this Order, as
the Legislator of the Contemplative Life, and as the Interpreter of Heavenly
Mysteries, he is highly revered. And as such he has to be estimated by quite
a different standard from that which applies to those Poets whose inspiration
has not soared, like his, to the Vision of Divine things, to the primal Fountain
of Love and Light. He cannot properly be compared either with Firdusi, the
greatest of the Persian Epic Poets, nor with Nizami, the greatest of the
Romantic Poets, nor with Saadi, the first of the moral Didactic Poets, nor with
Hafiz, the chiefest of the erotic Lyrical Poets; for all these won the Palm of
Poetry in entirely different fields from his. The only two great Poets of his
kind, with whom a comparison can be in place, Senayi, the Author of the
Mystical 'Flower Garden,' and Attar, the Author of the Mystical 'Bird
Dialogues.' But both these works stand, as regards poetic merit, far below
the Mesnevi, which is the Text Book of all The Sofis, from the banks of the
Ganges to the shores of the Bosphorus. The Collection of Jeláleddín's 'Lyrical

Poems'—his Divan, properly so called—'is regarded by them as of still higher
value; it is practically the Law Book and the Ritual of all these Mystics. These
outbursts of the highest inspiration of its kind deserve to be more closely
considered, as it is from them that we see shining forth as in clear splendour
the essence of the Oriental Mysticism, the cardinal Doctrine that All is One—
the view of the ultimate Unity of all Being—and giving with it Direction and
Guidance to the highest goal of Perfection by the contemplative Way of
Divine Love. On the wings of the highest religious enthusiasm, the Sofi, rising
above all the outward forms of positive Religions, adores the Eternal Being,
in the completest abstraction from all that is sensuous and earthly, as the
purest Source of Eternal Light. Mevláná Jeláleddín thus soars, not only like
other Lyrical Poets, such as Hafiz, over Suns and Moons, but even above
Space and Time, above the world of Creation and Fate, above the Original
Contract of Predestination, and beyond the Last Judgment, into the Infinite,
where in Eternal Adoration he melts into One with the Eternal Being, and
infinitely loving, becomes One with the Infinite Love—ever forgetting himself
and having only the great All in his view.'[1]
Thus far the learned Von Hammer. But let us also hear the
judgment of the East itself, of which this is only a Western echo, as
it may be gathered from Devletshah, the greatest native biographer
—the Dr. Johnson we may appropriately say—of the Persian Poets.
Of Jeláleddín, he says:—
'His pure Heart is filled with Divine Mysteries, and through his eradiating
Soul streams the Infinite Light. His View of the World leads the thirsty in the
Vale of the Contemplative Life to the refreshing Fountain of Knowledge; and
his Guidance leads those who have wandered in the Wilderness of Ignorance
into the Gardens where Truth is really known. He makes plain to the Pilgrim
the Secrets of the Way of Unity, and unveils the Mysteries of the Path of
Eternal Truth:
As when the foaming Sea high swells in Wave upon Wave,
It casts out Pearls upon Pearls on every Shore they lave.'
And to cite only one Turkish Authority—for the Turks claim
Jeláleddín as their own, although a Persian of royal race, born at
Balkh, old Bactra, on the ground of his having sung and died at
Qoniya, in Asia Minor (the Iconium of Paul and Barnabas and
Timothy and St. Thecla), whence he was called Rumi 'the Roman,'
usually rendered 'the Greek,' as wonning within the confines of old
Oriental Rome. This is how Fehîm Efendi, the Turkish Historian of the

Persian Literature, himself a Poet, begins his Sketch of the Life of the
great poetic Mystagogue:—
'As the ideal of Searchers after Truth here below, as the pattern of the
Pure, the Mevlana is honoured by great and small among the people, by the
aristocrat and the common man. In all circles his words are held in high
honour; among all the wise his knowledge is greatly esteemed; and no pen
has had the power to praise him, and to celebrate his excellence worthily, or
to describe it in fitting terms.
And should the fancy hold it can
His praise completely reach;
Mevlana's praise it ne'er shall scan—
How say it then in speech?'
Rosen, who gives this quotation, and an excellent rhymed German
translation of part of the Mesnevi, refers to that poem as not only
'one of the most celebrated productions of the Persian Mysticism,
but as being regarded by many Mohammedans as almost equal in
holiness to the Koran and the Sunna.' Being attached, at the time he
wrote, to the German Embassy at Constantinople, Rosen also
mentions that not only did the educated Oriental regard the Mesnevi
as the most perfect Book of Edification, which when its contents
were received into his mind and heart, made him certain of
Salvation; but that even the poor Persian retailers of the products of
their home industries, on the streets, could recite with enthusiasm
long passages from the poems of Jeláleddín. We believe that this
holds true to-day, more or less, of the whole Mohammedan world.[2]
But coming to more familiar names, we might gather a whole
cloud of the most approved witnesses in this connection. Thus Sir
William Jones, the first great Anglo-Indian Scholar, the Columbus of
the new Old World of Sanskrit and Persian Literature, enters with
wonderful sympathy and insight into possession of the Persian and
Hindu Mystical Poetry; he refers to their great Maulavi, and his
astonishing work, The Mesnevi; and he translates the celebrated
opening passage in rhyming couplets which would not have been
unworthy of Pope himself.[3] Sir William Jones did not, indeed, touch
Jeláleddín's Lyrics, but he rendered some precious morsels of Hafiz,
'Odes,' as they are called, both in English and French, in a way that

made young European students and poets, like Herder and Goethe,
turn again to the East with yearning expectant eyes. Similar
testimony might be adduced from Henry Thomas Colebrooke, one of
the very greatest of the successors of Sir W. Jones. The chief
Historian of Persia, and the best informed Persian scholar of his day,
Sir John Malcolm (of Langholm), if less sympathetic than Sir W.
Jones in his painstaking account of the Persian Mystics, gives
likewise the first place to Jeláleddín.[4] And then much more
definitely Sir Gore Ouseley, the first English Biographer of the
Persian Poets, gives Jeláleddín due recognition in connection with
the unrivalled Mesnevi.[5] The Journal of the Asiatic Society, an ever
valuable Magazine of Oriental learning, and the parent of many
others of its kind, has been enriched by the contributions of many
enthusiastic English scholars following in the footsteps of Sir W.
Jones, and it contains the earliest fragments of English translations
of Jeláleddín.[6] Robert Alfred Vaughan, in his Hours with the Mystics,
1856, a popular, sympathetic, and still attractive work, appreciates
Jeláleddín, and compares him with Angelus Silesius and Emerson,
but all his knowledge of the Persian Mystic was derived from Tholuck
and Sir W. Jones. At last competent scholars began to deal worthily
with Jelál's poetry in English. Sir James W. Redhouse has translated
the First Book of the Mesnevi in rhyming couplets, with the utmost
fidelity and care; and another distinguished Persian scholar, Mr.
Whinfield, the most faithful English translator of Omar Khayyám, has
given an abridged version of the whole immense work, which in the
Persian original contains about 70,000 lines.[7] The Mesnevi has thus
come now to be pretty well known by English readers interested in
the subject; and in the last edition of the Encyclopædia Britannica,
Professor Hermann Ethé, an unquestionable authority, in his valuable
Articles on Persian Literature and Jeláleddín Rumi, sums him up as
'the greatest Pantheistic Writer of all ages,' and speaks of 'his
matchless Odes in which he soars on the wings of a genuine
enthusiasm, high over Earth and Heaven, up to the Throne of
Almighty God.' Be it noted, in passing, that it is at least remarkable
how two such different writers as the Turkish Devlet Shah and the
learned German Orientalist should both write of Jeláleddín in terms

that undesignedly, but irresistibly, recall by their very
superlativeness, the famous lines of Dr. Johnson on Shakespeare:—
'Each change of many-coloured Life he drew,
Exhausted Worlds and then imagin'd new;
Existence saw him spurn her bounded reign,
And panting Time toil'd after him in vain.'
All this makes it now intelligible that the late lamented Editor of
the Encyclopædia Britannica, Dr. W. Robertson Smith, when
Professor of Arabic at Cambridge, with the fine insight of the far-
seeing scholar, should have directed the attention of a young,
enthusiastic student to the 'Lyrical Poetry of Jeláleddín Rumi'; and it
is to the loyal devotion of this young scholar that we owe the first
appearance from an English Press of a Volume of forty-eight
'Selected Poems' of Jeláleddín, in a critical Persian Text and with
accurate and elegant prose renderings.[8] Mr. Reynold A. Nicholson
has thus established a right to pronounce judgment on the merits of
Jeláleddín, and we now listen to him with deference, and no longer
with astonishment, when he deliberately characterises him as 'the
greatest Mystical Poet of any Age.'
As the object of this Introduction is only to determine, in some
measure, the literary interest of the Lyrical Poetry—the Díván, as it is
technically called—of Jeláleddín, space need not be taken up by
narrating again what is traditionally known of his Life, and it is the
less necessary as excellent accounts are now easily accessible. Sir
James W. Redhouse gives in somewhat abridged translation El
Eflākī's interesting narrative, with its romantic wreath of legend, and
its quaint anecdotes and racy sayings. Mr. Nicholson furnishes an
excellent summary. Professor Hermann Ethé's notice in the Encycl.
Brit. has been already referred to, and reference may also be made
to his Morgenländische Studien, and his popular Lecture in the
Virchow-Holtzendorff Series, 1888, on 'The Mystical, Didactic, and
Lyrical Poetry, and the later Literature of the Persians,' with its fine
characterization, which we would fain have quoted. Rosen translates
into German the Biographical Sketches of Devletshah and Jāmi.

Professor E. G. Browne's recent 'Literary History of Persia,' which
carries the subject down to A.D. 1000, and is undoubtedly so far the
best History of Persian Literature yet produced, contains appreciative
references to Jeláleddín, with a masterly account of the Sufi
Mysticism; and we look forward with much interest to a
comprehensive and judicial summing up of the great Mystic Poet, by
this high authority upon the whole subject.[9]
II.
The interest of the writer in Jeláleddín has been from the first, and
all through, philosophical and theological rather than specially
historical or textual. This interest was awakened in him by Hegel. In
early student days, when to him as to so many then, the Hegelian
Philosophy was the all in all of his thought, he was startled by the
unwonted enthusiasm with which the great thinker at the climax of
his severest exposition, paused to pay a warm tribute to 'the
excellent Jeláleddín,' when he came into view in the light of the
Supreme Idea of his own System.[10] This passage in Hegel, seems
always to have impressed the students of his own writings, and it
has been frequently referred to both by his German and English
expounders. The greatest speculative Thinker of the Nineteenth
Century, seems to have felt a deep satisfaction in recognising the
affinity of the greatest speculative Poet of the East to his own
deepest thought, while at the same time carefully distinguishing the
clearer and higher form of his own conception. Nay more, although
parsimonious to the utmost of his space and words, in this, the most
condensed and compacted Text Book of Philosophy written in any
European language since Aristotle, the stern German Dialectician in
a comparatively long Foot Note, says he 'cannot refrain' from
quoting several passages from the Poet, in order that the reader
may get a clearer knowledge of his ideas; and he quotes them from
Rückert's Versions, to give, at the same time, some specimens of
'the marvellous Art of the translation.' The Reader who is not
acquainted with German will find Hegel's words accurately translated
by the late Dr. W. Wallace, who also gives an English version of the

Data Management at Scale Piethein Strengholt

More Related Content

Similar to Data Management at Scale Piethein Strengholt

Recently uploaded

Data Management at Scale Piethein Strengholt