Copyright Third Nature, Inc.
“Your assumptions are your
windows on the world. Scrub
them off every once in a while,
or the light won't come in.”
– Isaac Asimov
Schema
The BI concept in the DW is simple: one place to funnel data,
one direction of data flow, one model integrated prior to use.
Limited consideration for feedback loops and change
Processing only
happens here
Carefully
controlled
access
here
People have limited ability
to create new information
Sources
homogeneous
and well
understood
Assumes that you have requirements
ahead of time; the data is already
collected, stored, ready to use.
One way flow
Success breeds failure
Organizational use of BI
matured over 25 years of data
warehouse history.
BI enabled a shift in managing
from the center of the
organization to the edge, and
that drives new requirements.
Needs have moved from basic
access to more advanced use,
and from the common data to
specific, local ad-hoc needs.
This is what success looks like (with only a hammer)
The primary view of BI self-service is publishing data
The old problem was access; the new problem is analysis
What people do with data: not just read it
Explore and
Understand
Inform and
Explain
Convince
and Decide
Deliver
Process
Collect
Questions that are not asked in BI
What data is available?     What data do I need?
Where is it?                Known       Unknown
Known                       Query       Browse
Unknown                     Search      Explore
“No battle plan ever survives first contact with
the enemy.”
– Helmuth von Moltke the Elder,
talking about ETL specifications
Metadata is what you wished your
data looked like.
Reality is not requirements = code
Reality is the data, not the metadata
Exploring data defines metadata
Changing analytics design assumptions
Past assumptions
▪ Center of the org
▪ Global use
▪ Common data
▪ Value in what’s
known, monitoring
▪ Data requirements
found in advance
Present assumptions
▪ Edge of the org
▪ Local use
▪ Specific data
▪ Value in what’s
unknown, discovery
▪ Data requirements
found during process
"Always design a thing by considering it in its next
larger context - a chair in a room, a room in a
house, a house in an environment, an environment
in a city plan." – Eliel Saarinen
IT reality is multiple data stores and systems
Separate, purpose-built databases and processing systems for
different types of data and query / computing workloads, plus any
access method, are the new norm for information delivery.
[Diagram labels: BI, dashboards, analytics, apps; query processing, data processing, stream processing; source environments (databases, documents, flat files, objects, streams, ERP, SaaS, applications). Each data store is drawn as the same sample table:]
1 Marge Inovera $150,000 Statistician
2 Anita Bath $120,000 Sewer inspector
3 Ivan Awfulitch $160,000 Dermatologist
4 Nadia Geddit $36,000 DBA
An architectural history of BI tools
First there were files and reporting programs
We had cubes before we had RDBMSs!
Then we had hand-coded SQL, then QBE
Then semantic layers and SQL-generation
And now we’re back to files and cubes
But also new and improved:
Products that embed local and in-memory
datastores inside the tools so they can
deliver direct manipulation (WYSIWYG) UIs
BI server architecture shifts
The SQL-generating server model of BI scales
extremely well but has poor user response time.
Solution 1: pre-cache
query results or prebuild
datasets on the BI server
(i.e. the old OLAP model)
Well-known problems
with this.
Solution 2: Shove all the
data into a BI server
repository. Avoids subset
problems. Adds potential
scaling problems.
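A minimal sketch of Solution 1 and the subset problem it runs into, assuming an in-memory SQLite database stands in for the warehouse. The `sales` table, its values, and the function names are hypothetical, chosen only for illustration.

```python
import sqlite3

# Stand-in for the warehouse (hypothetical table and values).
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE sales (region TEXT, product TEXT, amount INTEGER)")
warehouse.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("east", "widget", 100), ("west", "widget", 250), ("east", "gadget", 50)],
)

# Solution 1: prebuild a dataset on the BI server, here totals by region.
prebuilt = dict(
    warehouse.execute("SELECT region, SUM(amount) FROM sales GROUP BY region")
)


def total_for_region(region):
    """Answered from the prebuilt dataset: no database round trip."""
    return prebuilt[region]


def total_for_product(product):
    """Not in the prebuilt subset, so it must go back to the warehouse."""
    row = warehouse.execute(
        "SELECT SUM(amount) FROM sales WHERE product = ?", (product,)
    ).fetchone()
    return row[0]


east_total = total_for_region("east")        # served from the prebuilt dataset
widget_total = total_for_product("widget")   # falls through to the source
```

Questions that fit the prebuilt subset are answered locally. Any question outside it, such as a total by product, either goes back to the source or needs a new prebuilt dataset.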
There is always a third way
The previous choices were driven by client-server
thinking. We have a distributed (cloud) environment.
Possibilities:
Don’t force all the compute
into the DB or server.
Don’t force all the compute
to the client.
Data on demand, bring it to
the analysis from where it is,
or execute the analysis local
to where the data is.
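The two placements of compute can be sketched as follows, assuming an in-memory SQLite database stands in for a remote data store. The `readings` table and both function names are hypothetical. Each function returns the answer together with the number of rows that crossed the boundary between store and analysis.

```python
import sqlite3

source = sqlite3.connect(":memory:")   # stands in for a remote data store
source.execute("CREATE TABLE readings (sensor TEXT, value REAL)")
source.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [("a", 1.0), ("a", 3.0), ("b", 10.0), ("b", 20.0), ("b", 30.0)],
)


def average_by_moving_data(sensor):
    """Bring the data to the analysis: fetch the rows, compute locally."""
    rows = source.execute(
        "SELECT value FROM readings WHERE sensor = ?", (sensor,)
    ).fetchall()
    values = [r[0] for r in rows]
    return sum(values) / len(values), len(rows)   # result, rows transferred


def average_by_moving_compute(sensor):
    """Run the analysis where the data is: only the answer comes back."""
    rows = source.execute(
        "SELECT AVG(value) FROM readings WHERE sensor = ?", (sensor,)
    ).fetchall()
    return rows[0][0], len(rows)                  # result, rows transferred


local_avg, local_rows = average_by_moving_data("b")
pushed_avg, pushed_rows = average_by_moving_compute("b")
```

Both paths give the same average, but the first moves every matching row and the second moves one. Which is preferable depends on data volume and on where spare compute is available, which is why neither placement should be forced.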
On to Q&A
With that as framing:
▪ How is analysis functionally different from “classic” BI?
▪ What technology capabilities are important in an
analysis tool today?
▪ How does running in a cloud environment influence the
internal architecture of the product?
About the Presenter
Mark Madsen is president of Third Nature, a
technology research and consulting firm
focused on business intelligence, data
integration and data management. Mark is
an award-winning author, architect and CTO
whose work has been featured in numerous
industry publications. Over the past ten years
Mark has received awards for his work from the
American Productivity & Quality Center,
TDWI, and the Smithsonian Institute. He is an
international speaker, a contributor to
Forbes Online, and a member of the O’Reilly Strata
program committee. For more information
or to contact Mark, follow @markmadsen on
Twitter or visit http://ThirdNature.net
About Third Nature
Third Nature is a research and consulting firm focused on new and emerging technology
and practices in analytics, business intelligence, information strategy and data
management. If your question is related to data, analytics, information strategy and
technology infrastructure then you’re in the right place.
Our goal is to help organizations solve problems using data. We offer education, consulting
and research services to support business and IT organizations as well as technology
vendors.
We fill the gap between what the industry analyst firms cover and what IT needs. We
specialize in product and technology analysis, so we look at emerging technologies and
markets, evaluating technology and how it is applied rather than vendor market positions.

Assumptions about Data and Analysis: Briefing room webcast slides
