To co-opt an old adage: “If data gets lost and no one knows where to find it, does it still take up hard-drive space?” In the interest of avoiding that unfortunate philosophical end, individual data structures enable sorting, storage, and organization of data so that it can be retrieved and used efficiently. Applying the correct data structure to different types of data—whether master, reference, or analytics—allows your organization to tailor its data management to fit its unique business needs.
In this webinar, we will:
Discuss the various data structures available and when to use each one, as well as different design styles for analytics
Illustrate how data structures should support your organizational data strategy
Demonstrate how each method can contribute to business value
Data Structures - The Cornerstone of Your Data’s Home
1. Tom Gartland & Peter Aiken, PhD
Data Structures
The Cornerstone of your Data's Home
Copyright 2017 by Data Blueprint Slide # 1
Peter Aiken, Ph.D.
• 33+ years in data management
• Repeated international recognition
• DAMA International President 2009-2013
• DAMA International Achievement Award 2001 (with Dr. E. F. "Ted" Codd)
• DAMA International Community Award 2005
• Founder, Data Blueprint (datablueprint.com)
• Associate Professor of IS (vcu.edu)
• DAMA International (dama.org)
• 10 books and dozens of articles
• Experienced w/ 500+ data management practices
• Multi-year immersions:
– US DoD (DISA/Army/Marines/DLA)
– Nokia
– Deutsche Bank
– Wells Fargo
– Walmart
– …
• Books shown on the slide:
– Monetizing Data Management: Unlocking the Value in Your Organization’s Most Important Asset, by Peter Aiken with Juanita Billings, foreword by John Bottega
– The Case for the Chief Data Officer: Recasting the C-Suite to Leverage Your Most Valuable Asset, by Peter Aiken and Michael Gorman
2. Tom Gartland
• A 30+ year veteran of IT, Tom has done everything:
– Quality assurance
– Programming
– Data analysis
– Architecting
– Business intelligence
– Project management
• Across a variety of sectors and industries
– Finance
– Private health care
– Charity health care
– Government services
– Construction
– Discrete manufacturing
– Process manufacturing
– Retail
– Telecommunications
– Consulting
• Tom spends much of his personal time with his wife and 7 Rhodesian Ridgebacks
• Context: Data Management/DAMA/DM BoK/CDMP?
• What is a data structure?
• Structured data storage, a bit of history and context
• Why are data structures important?
• Data Personas/Usage (interest over time)
• Data Topology and alignment to the data audience
• Internal data structures to fit the needs
• Q & A?
Data Structures: The Cornerstone of your Data's Home
3. Maslow's Hierarchy of Needs
You can accomplish Advanced Data Practices without becoming proficient in the Foundational Data Practices; however, this will:
• Take longer
• Cost more
• Deliver less
• Present greater risk
(with thanks to Tom DeMarco)
Data Management Practices Hierarchy
• Advanced Data Practices: MDM, Mining, Big Data, Analytics, Warehousing, SOA
• Foundational Data Practices: Data Platform/Architecture, Data Governance, Data Quality, Data Operations, Data Management Strategy
• Side labels on the diagram: Technologies, Capabilities
4. DMM℠ Structure of 5 Integrated DM Practice Areas
• Labels on the diagram: Data Management Strategy, Data Governance, Data Quality, Data Operations, Platform Architecture, Supporting Processes
• Captions on the diagram: Manage data coherently; Manage data assets professionally; Maintain fit-for-purpose data, efficiently and effectively; Data life cycle management; Data architecture implementation; Organizational support
DMM℠ Structure of 5 Integrated DM Practice Areas
• Scores shown on the diagram: 3, 3, 3, 3 and 1
Strategy is often the weakest link!
5. Data Management Body of Knowledge (DM BoK V2) Practice Areas
To do any of these well requires specific knowledge of the relevant data structures!
6. Without Data Structures ...
Unstructured data cannot be transformed into structured data!
• Water into wine
• Coal into gold
• Proper usage is:
– Semi-structured into more structured
– Non-tabular data into tabular data
– Operational question: how much of it?
Wrappers
7. What is a data structure?
• "An organization of information
• usually in memory (for better algorithm efficiency)
• such as queue, stack, linked list, heap, dictionary, and tree, or
• conceptual unity, such as the name and address of a person.
• It may include redundant information, such as length of the list or number of nodes in a subtree."
• Some data structure characteristics
– Grammar (rules) for data objects
– Constraints for data objects
– Sequential order
– Uniqueness
– Arrangement
• Hierarchical, relational, network, other
– Balance
– Optimality
http://www.nist.gov/dads/HTML/datastructur.html
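As a minimal sketch of the definition above (an illustration added here, not part of the original slides), the Python snippet below shows a stack, a queue, a dictionary holding the "name and address of a person", and a small linked list that keeps redundant information, namely its own length. The class name `CountedList` and all sample values are hypothetical.

```python
from collections import deque


class CountedList:
    """Singly linked list that stores its length redundantly."""

    def __init__(self):
        self.head = None   # first node as (value, next_node), or None when empty
        self.length = 0    # redundant: could be recomputed by walking the list

    def push_front(self, value):
        self.head = (value, self.head)
        self.length += 1

    def values(self):
        node = self.head
        while node is not None:
            value, node = node
            yield value


stack = []                 # last in, first out
stack.append("a")
stack.append("b")
top = stack.pop()

queue = deque()            # first in, first out
queue.append("a")
queue.append("b")
first = queue.popleft()

# Conceptual unity: the name and address of a person kept together.
# The values are made up for illustration.
person = {"name": "Townsend", "address": "10 Main Street"}

names = CountedList()
for n in ("Gartland", "Aiken"):
    names.push_front(n)
```

Keeping `length` on the list trades a little extra bookkeeping for an answer that does not require walking every node, which is the kind of redundancy the quoted definition mentions.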
How are data structures expressed as architectures?
• Details are organized into larger components
• Larger components are organized into models
• Models are organized into architectures
8. How are data structures expressed as architectures?
• Attributes are organized into entities/objects
– Attributes are characteristics of "things"
– Entities/objects are "things" whose information is managed in support of strategy
– For example: person (name, dob, res, kids, phone)
• Entities/objects are organized into models
– Combinations of attributes and entities are structured to represent information requirements
– Poorly structured data constrains organizational information delivery capabilities
– For example: sales model, accounting model, reporting model
• Models are organized into architectures
– When building new systems, architectures are used to plan development
– More often, data managers do not know what existing architectures are and therefore cannot make use of them in support of strategy implementation
– For example: financial architecture or business intelligence architecture
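The layering above (attributes into entities, entities into models, models into architectures) can be sketched in a few lines of Python. This is an illustrative assumption about how one might represent the idea, not something shown in the deck; the class names `Entity`, `Model`, and `Architecture` are hypothetical, while the `person` attributes and the model names come from the slide's examples.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Entity:
    """A "thing" whose information is managed; attributes are its characteristics."""
    name: str
    attributes: List[str]


@dataclass
class Model:
    """Entities structured together to represent information requirements."""
    name: str
    entities: Dict[str, Entity] = field(default_factory=dict)

    def add(self, entity: Entity) -> None:
        self.entities[entity.name] = entity


@dataclass
class Architecture:
    """Models organized into a larger whole."""
    name: str
    models: List[Model] = field(default_factory=list)

    def entity_names(self) -> List[str]:
        # Sorted, de-duplicated names of every entity across all models.
        return sorted({name for m in self.models for name in m.entities})


person = Entity("person", ["name", "dob", "res", "kids", "phone"])

sales = Model("sales model")
sales.add(person)
reporting = Model("reporting model")
reporting.add(person)

bi = Architecture("business intelligence architecture", [sales, reporting])
```

Because both models reference the same `person` entity, `bi.entity_names()` lists it once, which hints at why knowing the existing architecture helps avoid redefining the same "thing" in every system.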
Sample Data Architecture Overview
History (such as it is)
• Automate existing manual processing
• Data management was:
– Running millions of punched cards through banks of sorting, collating & tabulating machines
– Results printed on paper or punched onto more cards
– Data management meant physically storing and hauling around punched cards
• Tasks (check signing, calculating, and machine control) were implemented to provide automated support for departmental-based processing
• Creating information silos
• Data Processing Manager
10. • Data Processing Manager
Chief Information Officer
CFO Necessary Prerequisites/Qualifications
• CPA
• CMA
• Masters of Accountancy
• Other recognized degrees/certifications
• These are necessary but insufficient prerequisites/qualifications
11. CIO Qualifications
• No specific qualifications
• Typically technological fields:
– Computer science
– Software engineering
– Information systems
• Business
– Master of Business Administration
– Master of Science in Management
• Business acumen and strategic perspectives have taken precedence over technical skills.
– CIOs appointed from the business side of the organization
• Especially if they have project management skills.
What do we teach knowledge workers about data?
What percentage of them deal with it daily?
Data Leverage
• Permits organizations to better manage their sole non-depletable, non-degrading, durable, strategic asset: data
– within the organization, and
– with organizational data exchange partners
• Leverage
– Obtained by implementation of data-centric technologies, processes, and human skill sets
– Increased by elimination of data ROT (redundant, obsolete, or trivial)
• The bigger the organization, the greater the potential leverage
• Treating data more asset-like simultaneously
1. lowers organizational IT costs and
2. increases organizational knowledge worker productivity
[Diagram labels: Less ROT, Technologies, Process, People]
13. Data Structure Questions
[Diagram labels: Program D, Program E, Program F, Program G, Program H, Program I; Application domain 2; Application domain 3]
• Who makes decisions about the range and scope of common data usage?
Running Query
14. Optimized Query
Repeat hundreds, thousands, millions of times ...
15. Data structures organized into an Architecture
• How do data structures support organizational strategy?
• Consider the opposite question:
– Were your systems explicitly designed to be integrated or otherwise work together?
– If not, then what is the likelihood that they will work well together?
– In all likelihood your organization is spending between 20% and 40% of its IT budget compensating for poor data structure integration
– Data structures cannot be helpful as long as their structure is unknown
• Two answers/two separate strategies
– Achieving efficiency and effectiveness goals
– Providing organizational dexterity for rapid implementation
16. Data Models Used to Support Strategy
• Flexible, adaptable data structures
• Cleaner, less complex code
• Ensure strategy effectiveness measurement
• Build in future capabilities
• Form/assess merger and acquisition strategies
[Figure: data model relating Employee, Employee Type, Sales Person, Manager, Manager Type, Staff Manager, and Line Manager. Adapted from Clive Finkelstein, Information Engineering: Strategic Systems Development, 1992]
5 Basic Data Structures
• Flat File: Records are typically sorted according to some criteria and must be searched from the beginning for each access
– Program: Must start at the beginning and read each record when looking for person "Townsend"
• Indexed Sequential File: Built-in index permits location of records of persons with last names starting with "T"
– Program: Where is the record for person "Townsend"?
– Index: Start looking here where the "Ts" are stored
• Hierarchical Database: Records are related to each other hierarchically using 'parent child' relationships
• Network Database: Records are related to each other using arranged master records associated with multiple detail records using linked lists and pointers
• Relational Database: Records are related to each other using relationships describable using relational algebra
• Other labels on the slide: Associative, Concept-oriented, Multi-dimensional, XML database, 3NF, Star schema, Data Vault
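The difference between the flat file and the indexed sequential file can be made concrete with a small sketch. The code below is an illustration under stated assumptions, not the deck's own material: the record list, the function names, and the first-letter index are all hypothetical, and the list stands in for a file of records sorted by last name.

```python
# Hypothetical records, sorted by last name as in a flat file.
RECORDS = ["Adams", "Baker", "Jones", "Smith", "Taylor", "Townsend", "Young"]


def flat_file_search(records, name):
    """Read from the beginning until the record is found; count the reads."""
    reads = 0
    for record in records:
        reads += 1
        if record == name:
            return record, reads
    return None, reads


def build_index(records):
    """Map each first letter to the position of its first record."""
    index = {}
    for position, record in enumerate(records):
        index.setdefault(record[0], position)
    return index


def indexed_search(records, index, name):
    """Jump to where the first letter starts, then read sequentially."""
    start = index.get(name[0])
    if start is None:
        return None, 0
    reads = 0
    for record in records[start:]:
        reads += 1
        if record == name:
            return record, reads
        if record[0] != name[0]:
            break  # left the block of records sharing the first letter
    return None, reads


INDEX = build_index(RECORDS)
```

With these sample records, `flat_file_search(RECORDS, "Townsend")` needs six reads, while `indexed_search(RECORDS, INDEX, "Townsend")` starts where the "Ts" are stored and needs two.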
17. One Size does not satisfy all needs
• The thought of a single monolithic data store which can service all of an organization’s information needs has long since been abandoned. In the modern data management topology, multiple data stores are created to service specific processing needs and user groups within the organization.
• Implications:
– The needs and characteristics of the many audiences served by the data structures
– Data lifecycle
– The design styles (old and new) utilized to organize the data to service the audiences
– A breakdown of the various stores
– The resultant store characteristics
[Diagram label: Single Data Store]
Typical System Evolution
• Payroll Application (3rd GL); Payroll Data (database)
• R & D Applications (researcher supported, no documentation); R & D Data (raw)
• Mfg. Applications (contractor supported); Mfg. Data (home grown database)
• Finance Application (3rd GL, batch system, no source); Finance Data (indexed)
• Marketing Application (4th GL, query facilities, no reporting, very large); Marketing Data (external database)
• Personnel App. (20 years old, un-normalized data); Personnel Data (database)
18. The Situation
How many interfaces are required to solve this integration problem?
[Diagram: Application 1 through Application 6, each connected to every other]
15 Interfaces: (N*(N-1))/2 with N = 6
RBC: 200 applications - 4900 batch interfaces
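The formula on the slide counts one interface for every pair of applications. A short Python sketch (the function name is hypothetical) reproduces the slide's figure of 15 for six applications. Note that the formula is a worst case in which every pair is connected; the RBC figure above is reported on the slide and is not derived by this function.

```python
def point_to_point_interfaces(n: int) -> int:
    """Number of interfaces if every pair of n applications is connected once."""
    if n < 0:
        raise ValueError("n must be non-negative")
    # n * (n - 1) is always even, so integer division is exact.
    return n * (n - 1) // 2


six_apps = point_to_point_interfaces(6)
```

The count grows roughly with the square of the number of applications, which is why point-to-point integration becomes hard to sustain as a portfolio grows.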
20. Conclusions
• One data structure is not enough
• Most organizations have far too many different data structures and they become barriers to progress and integration
• Not much expertise is available to figure out these challenges
21. Data Personas (The Requirements)
• Operational Performer: Interested in alerts, notifications and reporting based on current-value (real-time) data. They use the information to make decisions and changes in the transactional systems. These changes are targeted to improve the organization's ability to deliver in the short term.
• Operational Analyst (Manager): Interested in aggregated real-time data for their domain of responsibility. The data is displayed using visualization techniques of scorecards, charts and reports, preferably within a single dashboard. They search for favorable/unfavorable trends that indicate adjustments are needed in the staff & resource allocations.
• Data Analyst: Responsible for supporting detailed and typically complex analysis requests from business users/consumers of data. The analyst role spans both the operational and historical time windows, and thus analysts need to be versed in both the operational and analytic environments.
• Data Miner/Scientist: Responsible for using statistical and machine learning techniques to identify patterns from the data. These patterns are correlated into insights and actions for better business outcomes. The miner may use operational and historical data for research.
• Executive Consumer: Receives the data through summary dashboards with drill down/through capabilities. Requests detailed analysis and reporting on High Value Questions from the Data Analysts and Data Miners. These consumers are looking at the data to make short and long term decisions to improve the organizational efficiency and customer experience.
[Diagram axis: the personas are arranged from Operational to Analytic]
Persona Data Interest
• Operational interest is high when data is introduced to the operational stores. This interest wanes over time.
• Analytic interest is low when data is first introduced. The interest increases as the data is collected and combined with other enterprise data.
[Chart: Operational Interest and Analytic Interest, each plotted as Interest against Time]
22. Development Standards/Concrete Blocks
Example: Set Analysis
from MicroStrategy, Better Business Decisions Every Day: Integrating Business Reporting & Analysis
Data Topology Today Can Be Complex
[Diagram: Master Data; OLTP 1, OLTP 2, ... OLTP n; Operational Data Store (ODS); Enterprise Data Warehouse (EDW); three Data Marts; Event Data Store (Bus OPS Events, Tech OPS Events); Metadata Store (Business Metadata, Technical Metadata)]
24. Data Store Purpose: a review of the Data Topology
• Master Data
– Master Data is the term used to describe the data domains that drive business activities. Master data is the data that must first be in place before business transactions can occur. Master data is often shared across the organizational business units and it is typically at the center of business strategies. The transaction defines the business/process event (order, dispatch, sales) while the Master Data describes the ‘who’ (customers, drivers, account reps), the ‘what’ (load), the ‘when’ (date, time) and the ‘where’ (origin and destination location).
• Online Transaction Processing (OLTP)
– “Transactional data” is the term used to describe the data involved in the execution of the business activities. Transactional data associates master data (e.g. customers and products) to a business activity that often represents a unit of work, such as the creation of an order.
• The Master Data and OLTP stores are where data is initially created and persisted within the organization and thus carry a special classification of System of Record (SOR). They are created to capture the transactional data as it arrives and make the data available for the processes and services. The data arrives into these databases through manual entry or automated feeds. These data stores are logically (and sometimes physically) separated by the transactional subject area they are created to serve.
• Master Data
– Master Data is the term used to describe the data domains that
drive business activities. Master data is the data that must first be in
place before business transactions can occur. Master data is often
shared across the organizational business units and it is typically at
the center of business strategies. The transaction defines the
business/process event (order, dispatch, sales) while the Master
Data describes the ‘who’ (customers, drivers, account reps), the
‘how’ (order delivery type), the ‘when’ (date, time) and the
‘where’ (location, destination).
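To illustrate the point that master data must be in place before a business transaction can occur, here is a minimal Python sketch. It is an assumption-laden illustration rather than anything from the deck: the class names, the in-memory dictionaries standing in for master data stores, and the sample customer and location are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Customer:            # master data: the 'who'
    customer_id: int
    name: str


@dataclass(frozen=True)
class Location:            # master data: the 'where'
    location_id: int
    city: str


@dataclass(frozen=True)
class Order:               # transactional data: the business event
    order_id: int
    customer_id: int       # reference to master data
    destination_id: int    # reference to master data
    placed_at: datetime    # the 'when'


# Hypothetical master data stores.
customers = {1: Customer(1, "Acme Freight")}
locations = {10: Location(10, "Richmond")}


def create_order(order_id, customer_id, destination_id, placed_at):
    """Refuse the transaction unless the master data it needs already exists."""
    if customer_id not in customers:
        raise KeyError(f"unknown customer {customer_id}")
    if destination_id not in locations:
        raise KeyError(f"unknown location {destination_id}")
    return Order(order_id, customer_id, destination_id, placed_at)


order = create_order(100, 1, 10, datetime(2017, 8, 8, 14, 0))
```

The order carries only references to the customer and location; the descriptive details stay in the master data, where they can be shared across business units.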
25. Data Store Purpose: a review of the Data Topology
• Operational Data Store (ODS)
– An Operational Data Store (ODS) is created to integrate data from two or more SORs. The ODS is normally created to satisfy reporting needs across functional SOR boundaries. The ODS should hold very little historical information and should focus on maintaining the most up-to-date data needed by the organization for daily operations. Depending on the application requirements, the ODS may institute a near real-time data feed from the source applications. The ODS is expected to be technically accurate and is considered to be an Authoritative Source. The data it contains can be used for non-critical needs instead of having to access the SOR. The more frequently the data is pushed into the ODS environment, the less reliance there will be on direct access to SORs for data reporting needs.
• Enterprise Data Warehouse (EDW)
– An Enterprise Data Warehouse (EDW) is responsible for collection and integration of data from either SORs or from the Operational Data Store. An EDW has an enterprise scope as it will pull from many (if not all) SORs. The data warehouse is historical in nature and in many instances is loaded with latency (for example, every 24 hours). The data warehouse is created to support historical analytics. The data warehouse is expected to be exhaustive in the data it collects, with a focus on collecting and storing the data.
Data Store Purpose: a review of the Data Topology
• Data Marts
– A Data Mart is a subset of a data warehouse. It is created to address specific questions and/or a subject area of questions. A Data Mart is built and tuned to deliver the data to the end users; it exists to get the data out from the data warehouse.
26. Data Store Purpose: a review of the Data Topology
• Event Data Store
– The Event Data Store logs, stores and reports the discrete business and technical events which occur within the process. This data store is a critical, and often overlooked, data domain for managing, controlling and creating transparency into the business processes. The events are used to report the overall health of the processes in both business and technical terms. This consolidated solution is key to obtaining a 360-degree view of the processes.
• Metadata Store
– Metadata is a broad term which includes descriptive elements in both business and technical terms. It covers: business terms, data element descriptions, element display formats, element valid values, element quality targets, etc. Metadata is critical to an organization as it describes the organization’s business and processing infrastructure in detail. Metadata is entertainingly defined as “data about the data”. That is, Metadata characterizes other data and makes it easier to retrieve, interpret and use information.
Data Store Characteristics
Operational | in contrast with | Analytic
• Subject-Oriented: Databases which are focused on a single or small set of business functions | Integrated: Collecting and semantically aligning data from disparate sources to achieve a homogeneous view
• Volatile: Data which may change frequently | Non-Volatile: Data which, once entered into the database, will not change
• Atomic: Low grain data, each transaction, each order with all of the attributes | Aggregate: A summary of multiple orders or transactions performed to transform the atomic detail into more comprehensible information
• Current Valued: The data and the system represent what is current in this moment; not yesterday, not last week, but now | Time Variant: Data is marked and stored with a date/time element so that questions of what it was yesterday and last week can be answered
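The contrasts above (volatile versus non-volatile, current valued versus time variant, atomic versus aggregate) can be sketched with two small in-memory stores. This is a hypothetical illustration, not the deck's material: the event tuples, the `current` and `history` names, and the helper `status_as_of` are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical atomic events, already in time order:
# (timestamp, order_id, status, amount)
events = [
    ("2017-08-01T09:00", "A1", "placed", 100.0),
    ("2017-08-01T15:00", "A1", "shipped", 100.0),
    ("2017-08-02T10:00", "B2", "placed", 40.0),
]

current = {}   # operational style: volatile, current valued (one row per order)
history = []   # analytic style: non-volatile, time variant (rows are only appended)

for ts, order_id, status, amount in events:
    # Volatile: the existing row is overwritten in place.
    current[order_id] = {"status": status, "amount": amount}
    # Non-volatile: a new timestamped row is added and never updated.
    history.append({"ts": ts, "order_id": order_id,
                    "status": status, "amount": amount})


def status_as_of(order_id, ts):
    """Answer 'what was it at time ts?' from the time-variant store."""
    rows = [r for r in history if r["order_id"] == order_id and r["ts"] <= ts]
    return rows[-1]["status"] if rows else None


# Aggregate: summarize atomic rows into the amount placed per day.
placed_per_day = defaultdict(float)
for row in history:
    if row["status"] == "placed":
        placed_per_day[row["ts"][:10]] += row["amount"]
```

The `current` store can only say that order A1 is shipped now; the `history` store can also say it was still "placed" at noon on 2017-08-01, which is the kind of question a time-variant store exists to answer.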
Data Structure Design Styles
• 3rd Normal Form (3NF)
– Inmon
• Dimensional
– Kimball
• Data Vault
– Linstedt
28. Design Styles – 3NF
• 3rd Normal Form Modeling
• A mathematical data design technique introduced in the early 1970s by E. F. Codd.
• Organizes data in simple rows and columns: Entities
• Creates connections between the entities, called relationships, to show how the data is inter-related
• In its purest form, 3NF removes all data redundancies: a piece of data is stored only once
• 3NF is based on mathematics: given the same facts, different modelers should arrive at the same model.
• Creates a visual (Entity Relationship Diagram, ERD) which may be understood by less technical personnel
• 3NF is the modeling style most popularly used for operationally focused data stores.
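A small sketch of the "stored only once" idea, using Python's built-in `sqlite3` module with an in-memory database. The table and column names are hypothetical and chosen only for illustration; the point is that the customer name lives in one row and orders refer to it by key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE customer_order (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
        order_date  TEXT NOT NULL
    );
""")
conn.execute("INSERT INTO customer VALUES (1, 'Townsend')")
conn.executemany(
    "INSERT INTO customer_order VALUES (?, ?, ?)",
    [(100, 1, "2017-08-01"), (101, 1, "2017-08-02")],
)

# The name is stored once, so changing it touches a single row.
conn.execute("UPDATE customer SET name = 'Townsend-Lee' WHERE customer_id = 1")

rows = conn.execute("""
    SELECT o.order_id, c.name
    FROM customer_order AS o
    JOIN customer AS c ON c.customer_id = o.customer_id
    ORDER BY o.order_id
""").fetchall()
```

Both orders pick up the new name through the join, with no risk of one copy being updated and another left stale. The cost is that reads need joins, which is part of why analytic stores often use other design styles.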
Inmon Implementation
29. Design Styles – Dimensional
• Created and refined by Ralph Kimball in the 80s.
• Organizes data in Facts and Dimensions. Fact tables record the events (what) within the business domain and the Dimension tables describe who, when, how and where.
• The data design style was created to exploit the capabilities of the relational database to retrieve and report against large volumes of data.
• Dimensional modeling sacrifices storage efficiency for analytical processing speed
• There are 2 variations to Dimensional Modeling: Star Schema and Snowflake
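As a hedged illustration of facts and dimensions, the sketch below builds a tiny star schema in an in-memory SQLite database. The table names (`fact_sales`, `dim_customer`, `dim_date`), the columns, and the sample rows are assumptions for the example, not taken from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimensions describe who and when; attributes such as region and month
    -- are repeated on each row rather than normalized out.
    CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, name TEXT, region TEXT);
    CREATE TABLE dim_date     (date_key INTEGER PRIMARY KEY, day TEXT, month TEXT);
    -- The fact table records the events and points at the dimensions.
    CREATE TABLE fact_sales (
        customer_key INTEGER REFERENCES dim_customer(customer_key),
        date_key     INTEGER REFERENCES dim_date(date_key),
        amount       REAL
    );
    INSERT INTO dim_customer VALUES (1, 'Acme', 'East'), (2, 'Globex', 'West');
    INSERT INTO dim_date VALUES (20170801, '2017-08-01', '2017-08'),
                                (20170802, '2017-08-02', '2017-08');
    INSERT INTO fact_sales VALUES (1, 20170801, 100.0),
                                  (1, 20170802, 50.0),
                                  (2, 20170801, 75.0);
""")

sales_by_region = conn.execute("""
    SELECT c.region, d.month, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_customer AS c ON c.customer_key = f.customer_key
    JOIN dim_date     AS d ON d.date_key = f.date_key
    GROUP BY c.region, d.month
    ORDER BY c.region
""").fetchall()
```

Every analytic question follows the same shape: join the fact table to the dimensions of interest, then group and sum. The repeated `region` and `month` values are the storage cost traded for that simplicity.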
Kimball Implementation
30. Design Styles – Data Vault
• One of the newer relational database modeling techniques
• Data Vault modeling was conceived in the 1990s by Dan Linstedt
• Data Vault models are designed for central data warehouses that store non-volatile, time-variant, atomic data
• Relationships are defined through Link structures which promote flexibility and extensibility
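The slide names Link structures; in Data Vault modeling these sit alongside hubs (business keys) and satellites (descriptive attributes over time). The sketch below is a simplified illustration under stated assumptions: table and column names are hypothetical, and plain integer keys are used where a production Data Vault would typically use hash keys and record-source columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hubs: one row per business key.
    CREATE TABLE hub_customer (customer_hk INTEGER PRIMARY KEY, customer_no TEXT UNIQUE, load_ts TEXT);
    CREATE TABLE hub_order    (order_hk INTEGER PRIMARY KEY, order_no TEXT UNIQUE, load_ts TEXT);
    -- Link: the relationship lives in its own table, so new relationships
    -- can be added without altering the hubs.
    CREATE TABLE link_customer_order (
        customer_hk INTEGER REFERENCES hub_customer(customer_hk),
        order_hk    INTEGER REFERENCES hub_order(order_hk),
        load_ts     TEXT
    );
    -- Satellite: descriptive attributes, one row per change (insert-only).
    CREATE TABLE sat_customer (
        customer_hk INTEGER REFERENCES hub_customer(customer_hk),
        load_ts     TEXT,
        name        TEXT,
        PRIMARY KEY (customer_hk, load_ts)
    );
    INSERT INTO hub_customer VALUES (1, 'C-001', '2017-08-01');
    INSERT INTO hub_order    VALUES (1, 'O-100', '2017-08-01');
    INSERT INTO link_customer_order VALUES (1, 1, '2017-08-01');
    INSERT INTO sat_customer VALUES (1, '2017-08-01', 'Acme');
    INSERT INTO sat_customer VALUES (1, '2017-08-05', 'Acme Corp');
""")


def customer_name_as_of(customer_no, as_of):
    """Latest satellite row for the customer loaded on or before as_of."""
    row = conn.execute("""
        SELECT s.name
        FROM hub_customer AS h
        JOIN sat_customer AS s ON s.customer_hk = h.customer_hk
        WHERE h.customer_no = ? AND s.load_ts <= ?
        ORDER BY s.load_ts DESC
        LIMIT 1
    """, (customer_no, as_of)).fetchone()
    return row[0] if row else None
```

Because satellite rows are only ever inserted, the store stays non-volatile and time-variant: `customer_name_as_of("C-001", "2017-08-03")` still returns the earlier name after the later one has been loaded.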
Data Vault Implementation
31. Hybrid Approach
• (http://www.kimballgroup.com/2004/03/03/differences-of-opinion/)
• Learn Data Vault – “dv-in-kimball-bus-architecture”
Summary/Take Aways
DATA STORE | AUDIENCE SERVED | BUILD CHARACTERISTICS | DESIGN STYLE
OPERATIONAL
Master Data | Operations Manager, Operational Analyst | Subject Oriented, Volatile, Atomic, Current Valued | 3NF
OLTP | Operational Performer, Operations Manager | Subject Oriented, Volatile, Atomic, Current Valued | 3NF
ODS | Operational Manager, Operational Analyst, Executive Consumer | Integrated, Volatile, Atomic, Current Valued | 3NF
Event | All Personas | Integrated, Volatile, Atomic, Current Valued | 3NF
ANALYTIC
Data Warehouse | Data Miner/Scientist | Integrated, Non-volatile, Atomic, Time Variant | 3NF trending to Data Vault
Data Mart | Operational Analyst, Data Analyst, Executive Consumer | Subject Oriented, Non-volatile, Atomic -or- Aggregated, Time Variant | Dimensional
32. Outline: Design/Manage Data Structures
• Context: Data Management/DAMA/DM BoK/CDMP?
• What is a data structure?
• Structured data storage, a bit of history and context
• Why are data structures important?
• Data Personas/Usage (interest over time)
• Data Topology and alignment to the data audience
• Internal data structures to fit the needs
• Q & A?
Upcoming Events
September Webinar:
Implementing Big Data, NOSQL, & HADOOP – Bigger is (Usually) Better
September 12, 2017 @ 2:00 PM ET/11:00 AM PT
Sign up here:
• www.datablueprint.com/webinar-schedule
• www.Dataversity.net
33. Questions?
10124 W. Broad Street, Suite C
Glen Allen, Virginia 23060
804.521.4056