In many ways, moving data is like moving furniture: an unpleasant process tolerated as an occasional necessary evil. But as the data pipelines of old decay, a new reality is taking shape: the data-native architecture. Unlike traditional data processing for BI and analytics, this approach works on data right where it lives, eliminating the pain of forklifting, narrowing the margin of error, and shortening the time to business benefit. The new architecture embodies new assumptions, some of which we will talk about here.
Register for this episode of The Briefing Room to hear veteran Analyst Mark Madsen of Third Nature explain why this shift is truly tectonic. He'll be briefed by Steve Wooledge of Arcadia Data who will showcase his company's technology, which leverages a data-native architecture to fuel rapid-fire visualization and analysis of both big data and small.
Assumptions about Data and Analysis: Briefing room webcast slides
1. Copyright Third Nature, Inc.
“Your assumptions are your windows on the world. Scrub them off every once in a while, or the light won't come in.”
– Isaac Asimov
2. Copyright Third Nature, Inc.
Schema
The BI concept in the DW is simple: one place to funnel data, one direction of data flow, one model integrated prior to use.
Limited consideration for feedback loops and change
Processing only happens here
Carefully controlled access here
People have limited ability to create new information
Sources homogenous and well understood
Assumes that you have requirements ahead of time; the data is already collected, stored, ready to use.
One way flow
3. Copyright Third Nature, Inc.
Success breeds failure
Organizational use of BI matured over 25 years of data warehouse history.
BI enabled a shift in managing from the center of the organization to the edge, and that drives new requirements.
Needs have moved from basic access to more advanced use, and from the common data to specific, local ad-hoc needs.
7. Copyright Third Nature, Inc.
What people do with data: not just read it
Explore and Understand
Inform and Explain
Convince and Decide
Deliver
Process
Collect
8. Copyright Third Nature, Inc.
Questions that are not asked in BI
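Axes: "What data do I need?" (known or unknown) against "What data is available? Where is it?" (known or unknown)
Need known, availability known: Query
Need unknown, availability known: Browse
Need known, availability unknown: Search
Need unknown, availability unknown: Explore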
9. Copyright Third Nature, Inc.
“No battle plan ever survives first contact with the enemy.”
- Helmuth von Moltke the Elder, talking about ETL specifications
Metadata is what you wished your data looked like.
Reality is not requirements = code
Reality is the data, not the metadata
Exploring data defines metadata
10. Copyright Third Nature, Inc.
Changing analytics design assumptions
Past assumptions
▪ Center of the org
▪ Global use
▪ Common data
▪ Value in what’s known, monitoring
▪ Data requirements found in advance
Present assumptions
▪ Edge of the org
▪ Local use
▪ Specific data
▪ Value in what’s unknown, discovery
▪ Data requirements found during process
11. Copyright Third Nature, Inc.
"Always design a thing by considering it in its next
larger context - a chair in a room, a room in a
house, a house in an environment, an environment
in a city plan." – Eliel Saarinen
13. Copyright Third Nature, Inc.
An architectural history of BI tools
First there were files and reporting programs
We had cubes before we had RDBMSs!
Then we had hand-coded SQL, then QBE
Then semantic layers and SQL-generation
And now we’re back to files and cubes
But also new and improved:
Products that embed local and in-memory datastores inside the tools so they can deliver direct manipulation (WYSIWYG) UIs
14. Copyright Third Nature, Inc.
BI server architecture shifts
The SQL-generating server model of BI scales extremely well but has poor user response time.
Solution 1: pre-cache query results or prebuild datasets on the BI server (i.e. the old OLAP model). Well-known problems with this.
Solution 2: Shove all the data into a BI server repository. Avoids subset problems. Adds potential scaling problems.
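The first of the two solutions on this slide can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `CachingBIServer` class, the `sales` table, and the use of an in-memory SQLite database as a stand-in for the warehouse. The slide does not name the "well-known problems"; the sketch shows staleness as one plausible example, not as the author's list.

```python
import sqlite3


class CachingBIServer:
    """Hypothetical BI server that keeps query results in memory."""

    def __init__(self, conn):
        self.conn = conn
        self.cache = {}      # SQL text -> cached result rows
        self.db_hits = 0     # how many times the database was actually queried

    def query(self, sql):
        # Only go to the database the first time a given query is seen.
        if sql not in self.cache:
            self.db_hits += 1
            self.cache[sql] = self.conn.execute(sql).fetchall()
        return self.cache[sql]


# Stand-in for the warehouse (assumed schema and data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 10), ("west", 20), ("east", 5)],
)

server = CachingBIServer(conn)
sql = "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"

first = server.query(sql)    # hits the database
second = server.query(sql)   # served from the cache

# New data arrives after the cache was filled.
conn.execute("INSERT INTO sales VALUES ('west', 100)")
stale = server.query(sql)            # still the cached, older answer
fresh = conn.execute(sql).fetchall() # what the database would say now
```

The repeated query is answered without touching the database, which is the response-time benefit; the cached answer no longer matches the source once new rows arrive, which is the cost.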
15. Copyright Third Nature, Inc.
There is always a third way
The previous choices were driven by client-server thinking. We have a distributed (cloud) environment.
Possibilities:
Don’t force all the compute into the DB or server.
Don’t force all the compute to the client.
Data on demand, bring it to the analysis from where it is, or execute the analysis local to where the data is.
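The last possibility above contrasts two placements of the same computation. A minimal Python sketch of that contrast follows, using an in-memory SQLite database as a stand-in for a remote data store. The `events` table, the function names, and the "rows moved" count are all hypothetical and chosen only for illustration; this is not a description of any vendor's product.

```python
import sqlite3

# Stand-in for the place where the data lives (assumed schema and data).
store = sqlite3.connect(":memory:")
store.execute("CREATE TABLE events (user_id TEXT, ms INTEGER)")
store.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [("a", 120), ("b", 300), ("a", 80), ("c", 50)],
)


def analyze_at_client(conn):
    """Bring the data to the analysis: fetch every row, aggregate locally."""
    rows = conn.execute("SELECT user_id, ms FROM events").fetchall()
    totals = {}
    for user_id, ms in rows:
        totals[user_id] = totals.get(user_id, 0) + ms
    return totals, len(rows)  # len(rows) = rows that crossed the wire


def analyze_at_data(conn):
    """Execute the analysis where the data is: only the summary moves."""
    rows = conn.execute(
        "SELECT user_id, SUM(ms) FROM events GROUP BY user_id"
    ).fetchall()
    return dict(rows), len(rows)


client_totals, rows_moved_client = analyze_at_client(store)
data_totals, rows_moved_data = analyze_at_data(store)
```

Both functions return the same totals; the difference is how many rows have to move to get them, and that gap grows with the size of the underlying table.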
16. Copyright Third Nature, Inc.
On to Q&A
With that as framing:
▪ How is analysis functionally different from “classic” BI?
▪ What technology capabilities are important in an analysis tool today?
▪ How does running in a cloud environment influence the internal architecture of the product?
17. Copyright Third Nature, Inc.
About the Presenter
Mark Madsen is president of Third Nature, a technology research and consulting firm focused on business intelligence, data integration and data management. Mark is an award-winning author, architect and CTO whose work has been featured in numerous industry publications. Over the past ten years Mark received awards for his work from the American Productivity & Quality Center, TDWI, and the Smithsonian Institution. He is an international speaker, a contributor to Forbes Online and on the O’Reilly Strata program committee. For more information or to contact Mark, follow @markmadsen on Twitter or visit http://ThirdNature.net
18. Copyright Third Nature, Inc.
About Third Nature
Third Nature is a research and consulting firm focused on new and emerging technology and practices in analytics, business intelligence, information strategy and data management. If your question is related to data, analytics, information strategy and technology infrastructure then you’re at the right place.
Our goal is to help organizations solve problems using data. We offer education, consulting and research services to support business and IT organizations as well as technology vendors.
We fill the gap between what the industry analyst firms cover and what IT needs. We specialize in product and technology analysis, so we look at emerging technologies and markets, evaluating technology and how it is applied rather than vendor market positions.