Tom Gartland & Peter Aiken, PhD
Data Structures
The Cornerstone of your Data's Home
Copyright 2017 by Data Blueprint Slide # 1
• DAMA International President 2009-2013
• DAMA International Achievement Award 2001 (with
Dr. E. F. "Ted" Codd)
• DAMA International Community Award 2005
Peter Aiken, Ph.D.
• 33+ years in data management
• Repeated international recognition
• Founder, Data Blueprint (datablueprint.com)
• Associate Professor of IS (vcu.edu)
• DAMA International (dama.org)
• 10 books and dozens of articles
• Experienced w/ 500+ data management practices
• Multi-year immersions:
– US DoD (DISA/Army/Marines/DLA)
– Nokia
– Deutsche Bank
– Wells Fargo
– Walmart
– …
PETER AIKEN WITH JUANITA BILLINGS
FOREWORD BY JOHN BOTTEGA
MONETIZING
DATA MANAGEMENT
Unlocking the Value in Your Organization’s
Most Important Asset.
The Case for the
Chief Data Officer
Recasting the C-Suite to Leverage
Your Most Valuable Asset
Peter Aiken and
Michael Gorman
Tom Gartland
• A 30+ year veteran of IT, Tom has done everything:
– Quality assurance
– Programming
– Data analysis
– Architecting
– Business intelligence
– Project management
• Across a variety of sectors and industries
– Finance
– Private health care
– Charity health care
– Government services
– Construction
– Discrete manufacturing
– Process manufacturing
– Retail
– Telecommunications
– Consulting
• Tom spends much of his personal time with his wife and 7 Rhodesian Ridgebacks
• Context: Data Management/DAMA/DM BoK/CDMP?
• What is a data structure?
• Structured data storage, a bit of history and context
• Why are data structures important?
• Data Personas/Usage (interest over time)
• Data Topology and alignment to the data audience
• Internal data structures to fit the needs
• Q & A?
Data Structures: The Cornerstone of your Data's Home
Maslow's Hierarchy of Needs
You can accomplish Advanced Data Practices without becoming proficient in the Foundational Data Practices; however, this will:
• Take longer
• Cost more
• Deliver less
• Present greater risk
(with thanks to Tom DeMarco)
Data Management Practices Hierarchy
Advanced Data Practices
• MDM
• Mining
• Big Data
• Analytics
• Warehousing
• SOA
Foundational Data Practices
• Data Platform/Architecture
• Data Governance
• Data Quality
• Data Operations
• Data Management Strategy
[Figure labels: Technologies, Capabilities]
DMM℠ Structure of 5 Integrated DM Practice Areas
• Data Management Strategy
• Data Governance
• Data Quality
• Data Operations
• Platform Architecture
• Supporting Processes
Diagram callouts:
– Manage data coherently
– Manage data assets professionally
– Maintain fit-for-purpose data, efficiently and effectively
– Data life cycle management
– Data architecture implementation
– Organizational support
DMM℠ Structure of 5 Integrated DM Practice Areas
[Figure: the practice-area diagram (Data Management Strategy, Data Governance, Data Quality, Data Operations, Platform Architecture, Supporting Processes) annotated with example scores of 3, 3, 3, 3, and 1]
Strategy is often the
weakest link!
Data Management Body of Knowledge (DM BoK V2) Practice Areas
To do any of these well requires specific knowledge of the relevant data structures!
Without Data Structures ...
• Water into wine
• Coal into gold
• Proper usage is:
– Semi-structured into more structured
– Non-tabular data into tabular data
– Operational question: how much of it?
Unstructured data cannot be
transformed into structured data!
Wrappers
What is a data structure?
• "An organization of information
• usually in memory (for better algorithm efficiency)
• such as queue, stack, linked list, heap, dictionary, and tree, or
• conceptual unity, such as the name and address of a person.
• It may include redundant information, such as length of the list or
number of nodes in a subtree."
• Some data structure characteristics
– Grammar (rules) for data objects
– Constraints for data objects
– Sequential order
– Uniqueness
– Arrangement
• Hierarchical, relational, network, other
– Balance
– Optimality
http://www.nist.gov/dads/HTML/datastructur.html
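The definition quoted above can be made concrete with a minimal Python sketch. It is an illustration only, not part of the NIST entry: the class name `TreeNode` and all sample values are assumptions. It shows a stack, a queue, a heap, a dictionary used as a "conceptual unity", and a tree node that stores redundant information (the number of nodes in its subtree).

```python
import heapq
from collections import deque
from dataclasses import dataclass, field
from typing import List

# Stack: last in, first out.
stack = [1, 2, 3]
top = stack.pop()                 # removes and returns the last item pushed

# Queue: first in, first out.
queue = deque(["a", "b", "c"])
first = queue.popleft()           # removes and returns the first item queued

# Heap: the smallest item is always removed first.
heap = [5, 1, 4]
heapq.heapify(heap)
smallest = heapq.heappop(heap)

# Dictionary as a conceptual unity: the name and address of a person
# (illustrative values).
person = {"name": "Pat Example", "address": "1 Example Street"}


@dataclass
class TreeNode:
    """Hypothetical tree node that keeps redundant information:
    the number of nodes in its subtree."""
    value: int
    children: List["TreeNode"] = field(default_factory=list)
    subtree_size: int = 1         # redundant: derivable by walking the subtree

    def add_child(self, child: "TreeNode") -> None:
        # Simplification: only the direct parent's count is updated, so
        # subtrees must be fully built before they are attached.
        self.children.append(child)
        self.subtree_size += child.subtree_size


root = TreeNode(1)
branch = TreeNode(2)
branch.add_child(TreeNode(3))
root.add_child(branch)
```

Storing `subtree_size` trades a little extra bookkeeping on every insert for a constant-time answer to "how many nodes are below this one?", which is the kind of redundancy the definition mentions.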
How are data structures expressed as architectures?
• Details are organized into larger components
• Larger components are organized into models
• Models are organized into architectures
[Figure: components A, B, C, and D shown in three different arrangements]
How are data structures expressed as architectures?
• Attributes are organized into entities/objects
– Attributes are characteristics of "things"
– Entities/objects are "things" whose information is managed in support of strategy
– For example: person (name, dob, res, kids, phone)
• Entities/objects are organized into models
– Combinations of attributes and entities are structured to represent information requirements
– Poorly structured data constrains organizational information delivery capabilities
– For example: sales model, accounting model, reporting model
• Models are organized into architectures
– When building new systems, architectures are used to plan development
– More often, data managers do not know what existing architectures are and, therefore, cannot make use of them in support of strategy implementation
– For example: financial architecture or business intelligence architecture
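One way to picture the layering just described is the following Python sketch. It is a hedged illustration, not the presenters' notation: the classes `Model` and `Architecture` and the sample person are hypothetical, and only the `Person` attributes (name, dob, res, kids, phone) come from the slide.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List


@dataclass
class Person:
    """Entity: a 'thing' whose information is managed.
    Each field is an attribute of that thing."""
    name: str
    dob: date
    res: str                                  # residence
    kids: List[str] = field(default_factory=list)
    phone: str = ""


@dataclass
class Model:
    """Hypothetical model: a named collection of entity types."""
    name: str
    entities: List[type]


@dataclass
class Architecture:
    """Hypothetical architecture: a named collection of models."""
    name: str
    models: Dict[str, Model] = field(default_factory=dict)

    def add(self, model: Model) -> None:
        self.models[model.name] = model


sales_model = Model("sales model", [Person])
financial = Architecture("financial architecture")
financial.add(sales_model)

# Illustrative instance of the entity.
pat = Person("Pat Example", date(1980, 1, 1), "Example City")
```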
Sample Data Architecture Overview
History (such as it is)
• Automate existing manual processing
• Data management was:
– Running millions of punched cards through banks of sorting, collating & tabulating machines
– Results printed on paper or punched onto more cards
– Data management meant physically storing and hauling around punched cards
• Tasks (check signing, calculating, and machine control) were implemented to provide automated support for departmental-based processing
• Creating information silos
• Data Processing Manager
• Data Processing Manager
Chief Information Officer
CFO Necessary Prerequisites/Qualifications
• CPA
• CMA
• Master of Accountancy
• Other recognized degrees/certifications
• These are necessary but insufficient prerequisites/qualifications
CIO Qualifications
• No specific qualifications
• Typically technological fields:
– Computer science
– Software engineering
– Information systems
• Business
– Master of Business Administration
– Master of Science in Management
• Business acumen and strategic perspectives have taken
precedence over technical skills.
– CIOs appointed from the business side of the organization
• Especially if they have project management skills.
What do we teach knowledge workers about data?
What percentage of them deal with it daily?
Data Leverage
Less ROT
Technologies
Process
People
• Permits organizations to better manage their sole non-depletable, non-degrading, durable, strategic asset: data
– within the organization, and
– with organizational data exchange partners
• Leverage
– Obtained by implementation of data-centric technologies, processes, and human skill sets
– Increased by elimination of data ROT (redundant, obsolete, or trivial)
• The bigger the organization, the greater the potential leverage
• Treating data more asset-like simultaneously
1. lowers organizational IT costs and
2. increases organizational knowledge worker productivity
Data Structure Questions
[Figure labels: Program D, Program E, Program F, Program G, Program H, Program I; Application domain 2, Application domain 3]
• Who makes decisions about the range and scope of common data usage?
Running Query
Optimized Query
Repeat 100s, thousands, millions of times ...
Data structures organized into an Architecture
• How do data structures support organizational
strategy?
• Consider the opposite question:
– Were your systems explicitly designed to be integrated or otherwise work together?
– If not, then what is the likelihood that they will work well together?
– In all likelihood your organization is spending between 20-40% of its IT budget compensating for poor data structure integration
– Data structures cannot be helpful as long as their structure is unknown
• Two answers/two separate strategies
– Achieving efficiency and effectiveness goals
– Providing organizational dexterity for rapid implementation
Data Models Used to Support Strategy
• Flexible, adaptable data structures
• Cleaner, less complex code
• Ensure strategy effectiveness measurement
• Build in future capabilities
• Form/assess merger and acquisition strategies
[Figure: entity diagram with Employee, Employee Type, Sales Person, Manager, Manager Type, Staff Manager, and Line Manager]
Adapted from Clive Finkelstein, Information Engineering: Strategic Systems Development, 1992
5 Basic Data Structures
• Flat File: Records are typically sorted according to some criteria and must be searched from the beginning for each access
– Program: Must start at the beginning and read each record when looking for person "Townsend"
• Indexed Sequential File: Built-in index permits location of records of persons with last names starting with "T"
– Program: Where is the record for person "Townsend?"
– Index: Start looking here where the "Ts" are stored
• Hierarchical Database: Records are related to each other hierarchically using 'parent child' relationships
• Network Database: Records are related to each other using arranged master records associated with multiple detail records using linked lists and pointers
• Relational Database: Records are related to each other using relationships describable using relational algebra
Other structures and design styles listed on the slide: Associative, Concept-oriented, Multi-dimensional, XML database, 3NF, Star schema, Data Vault
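The difference between the flat file and the indexed sequential file can be sketched in a few lines of Python. This is a simplified in-memory illustration under stated assumptions: the sample names, the function names, and the one-letter index are all hypothetical, and a real indexed file would work on disk blocks rather than a list.

```python
from typing import Dict, List, Optional, Tuple

# Records sorted by last name (illustrative sample data).
records: List[str] = ["Adams", "Baker", "Jones", "Taylor", "Townsend", "Young"]


def flat_file_search(name: str) -> Tuple[Optional[int], int]:
    """Start at the beginning and read each record.
    Returns (position or None, number of records read)."""
    reads = 0
    for pos, rec in enumerate(records):
        reads += 1
        if rec == name:
            return pos, reads
    return None, reads


def build_index(recs: List[str]) -> Dict[str, int]:
    """Map each first letter to the position of the first record that uses it."""
    index: Dict[str, int] = {}
    for pos, rec in enumerate(recs):
        index.setdefault(rec[0], pos)
    return index


index = build_index(records)


def indexed_search(name: str) -> Tuple[Optional[int], int]:
    """Jump to where names with the same first letter start, then scan."""
    start = index.get(name[0])
    if start is None:
        return None, 0
    reads = 0
    for pos in range(start, len(records)):
        if records[pos][0] != name[0]:
            break                 # left the block for this letter
        reads += 1
        if records[pos] == name:
            return pos, reads
    return None, reads


flat_result = flat_file_search("Townsend")
indexed_result = indexed_search("Townsend")
```

With this sample data the flat scan reads five records to reach "Townsend", while the indexed lookup starts where the "Ts" are stored and reads two.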
• The thought of a single monolithic data store which can service all of an organization’s information needs has long since been abandoned. In the modern data management topology, multiple data stores are created to service specific processing needs and user groups within the organization.
• Implications:
– The needs and characteristics of the multitude of audiences served by the data structures
– Data lifecycle
– The design styles (old and new) utilized to organize the data to service the audiences
– A breakdown of the various stores
– The resultant store characteristics
[Figure label: Single Data Store]
One Size does not satisfy all needs
Typical System Evolution
• Payroll Application (3rd GL) with Payroll Data (database)
• R&D Applications (researcher supported, no documentation) with R&D Data (raw)
• Mfg. Applications (contractor supported) with Mfg. Data (home grown database)
• Finance Application (3rd GL, batch system, no source) with Finance Data (indexed)
• Marketing Application (4th GL, query facilities, no reporting, very large) with Marketing Data (external database)
• Personnel App. (20 years old, un-normalized data) with Personnel Data (database)
The Situation
How many interfaces are required to solve this integration problem?
[Figure: Applications 1 through 6, each connected to every other]
• 15 interfaces: (N*(N-1))/2 with N = 6
• RBC: 200 applications - 4900 batch interfaces
The rapidly increasing cost of complexity
[Figure: worst case number of interconnections (axis 0 to 20,000) plotted against number of silos (axis 1 to 201)]
• N silos / worst case number of interconnections
– 6 / 15
– 60 / 1,770
– 600 / 179,700
– 200 / 19,900
– 200 / 5,000 (actual)
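The worst-case figures follow directly from the (N*(N-1))/2 formula. As a small Python check (the function name is an illustrative assumption):

```python
def point_to_point_interfaces(n: int) -> int:
    """Worst-case number of interfaces when each of n silos
    is connected directly to every other silo."""
    return n * (n - 1) // 2


# Reproduce the worst-case values listed on the slide.
worst_case = {n: point_to_point_interfaces(n) for n in (6, 60, 200, 600)}
for n, count in worst_case.items():
    print(f"{n} silos -> {count:,} interfaces")
```

Because the count grows with the square of N, a tenfold increase in silos (6 to 60, or 60 to 600) produces roughly a hundredfold increase in possible interfaces.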
XML-based Integration Solution
[Figure: Applications 1 through 6, each connected to a central XML Processor (HUB)]
3-Way Scalability
Expand the:
1. Number of data items from each system
– How many individual data items are tagged?
2. Number of interconnections between the systems and the hub
– How many systems are connected to the hub?
3. Amount of interconnectability among hub-connected systems
– How many inter-system data item transformations exist in the rule collection?
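The hub idea can be sketched as follows. This is a hypothetical Python illustration, not the XML processor from the slide: it uses plain dictionaries instead of XML, and the class `Hub`, the system names, and the field names (`emp_nm`, `full_name`, `person_name`) are all assumptions made for the example.

```python
from typing import Callable, Dict

Record = Dict[str, str]
Transform = Callable[[Record], Record]


class Hub:
    """Hypothetical hub: each system registers one transformation into a
    shared format and one transformation out of it (the rule collection)."""

    def __init__(self) -> None:
        self.to_hub: Dict[str, Transform] = {}
        self.from_hub: Dict[str, Transform] = {}

    def connect(self, system: str, to_hub: Transform, from_hub: Transform) -> None:
        self.to_hub[system] = to_hub
        self.from_hub[system] = from_hub

    def send(self, source: str, target: str, record: Record) -> Record:
        # Translate source format -> shared format -> target format.
        return self.from_hub[target](self.to_hub[source](record))

    def connection_count(self) -> int:
        return len(self.to_hub)


hub = Hub()
hub.connect(
    "payroll",
    to_hub=lambda r: {"person_name": r["emp_nm"]},
    from_hub=lambda r: {"emp_nm": r["person_name"]},
)
hub.connect(
    "personnel",
    to_hub=lambda r: {"person_name": r["full_name"]},
    from_hub=lambda r: {"full_name": r["person_name"]},
)

result = hub.send("payroll", "personnel", {"emp_nm": "Townsend"})
```

In this arrangement each of N systems needs one connection to the hub, so the number of connections grows with N rather than with (N*(N-1))/2.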
Conclusions
• 1 data structure is not enough
• Most organizations have far too many different data structures, and they become barriers to progress and integration
• There is not much expertise available to figure out these challenges
Data Personas (The Requirements)
Operational Performer
Interested in alerts, notifications and reporting based on current-value (real-time) data. They use the information to make decisions and changes in the transactional systems. These changes are targeted to improve the organization's ability to deliver in the short term.
Operational Analyst (Manager)
Interested in aggregated real-time data for their domain of responsibility. The data is displayed using the visualization techniques of scorecards, charts and reports, preferably within a single dashboard. They search for favorable/unfavorable trends that indicate adjustments are needed in the staff & resource allocations.
Data Analyst
Responsible for supporting detailed and typically complex analysis requests from business users/consumers of data. The analyst role spans both the operational and historical time windows, so analysts need to be versed in both the operational and analytic environments.
Data Miner/Scientist
Responsible for using statistical and machine learning techniques to identify patterns in the data. These patterns are correlated into insights and actions for better business outcomes. The miner may use operational and historical data for research.
Executive Consumer
Receives the data through summary dashboards with drill down/through capabilities. Requests detailed analysis and reporting on High Value Questions from the Data Analysts and Data Miners. These consumers are looking at the data to make short- and long-term decisions to improve organizational efficiency and customer experience.
[Figure labels: Operational, Analytic]
• Operational interest is high when data is introduced to the
operational stores. This interest wanes over time.
• Analytic interest is low when data is first introduced. The
interest increases as the data is collected and combined
with other enterprise data.
Persona Data Interest
[Figure: interest plotted against time, with one curve for Operational Interest and one for Analytic Interest]
Development Standards/Concrete Blocks
Example: Set Analysis
from MicroStrategy, Better Business Decisions Every Day: Integrating Business Reporting & Analysis
Data Topology Today Can Be Complex
[Figure: data topology showing Master Data, OLTP 1 through OLTP n, the Operational Data Store (ODS), the Enterprise Data Warehouse (EDW), three Data Marts, the Event Data Store (Bus OPS Events, Tech OPS Events), and the Metadata Store (Technical Metadata, Business Metadata)]
Data Store Purpose: A Review of the Data Topology
• Master Data
– Master Data is the term used to describe the data domains that drive business activities. Master data is the data that must first be in place before business transactions can occur. Master data is often shared across the organizational business units and is typically at the center of business strategies. The transaction defines the business/process event (order, dispatch, sales) while the Master Data describes the ‘who’ (customers, drivers, account reps), the ‘what’ (load), the ‘when’ (date, time) and the ‘where’ (origin and destination location).
• Online Transaction Processing (OLTP)
– “Transactional data” is the term used to describe the data involved in the execution of business activities. Transactional data associates master data (e.g., customers and products) with a business activity that often represents a unit of work, such as the creation of an order.
• The Master Data and OLTP stores are where data is initially created and persisted within the organization, and thus they carry the special classification of System of Record (SOR). They are created to capture the transactional data as it arrives and make the data available for processes and services. The data arrives in these databases through manual entry or automated feeds. These data stores are logically (and sometimes physically) separated by the transactional subject area they are created to serve.
Data Store Purpose: A Review of the Data Topology
• Online Transaction Processing (OLTP)
– “Transactional data” is the term used to describe the data involved in the execution of business activities. Transactional data associates master data (e.g., customers and products) with a business activity that often represents a unit of work, such as the creation of an order.
– The Master Data and OLTP stores are where data is initially created and persisted within the organization, and thus they carry the special classification of System of Record (SOR). They are created to capture the transactional data as it arrives and make the data available for processes and services. The data arrives in these databases through manual entry or automated feeds. These data stores are logically (and sometimes physically) separated by the transactional subject area they are created to serve.
• Master Data
– Master Data is the term used to describe the data domains that drive business activities. Master data is the data that must first be in place before business transactions can occur. Master data is often shared across the organizational business units and is typically at the center of business strategies. The transaction defines the business/process event (order, dispatch, sales) while the Master Data describes the ‘who’ (customers, drivers, account reps), the ‘how’ (order delivery type), the ‘when’ (date, time) and the ‘where’ (location, destination).
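The relationship between master data and transactional data can be illustrated with a short Python sketch. It is a simplified assumption-laden example: the entity names, identifiers, and the `place_order` check are hypothetical and stand in for whatever referential rules a real System of Record enforces.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List


@dataclass(frozen=True)
class Customer:              # master data: the 'who'
    customer_id: int
    name: str


@dataclass(frozen=True)
class Location:              # master data: the 'where'
    location_id: int
    city: str


@dataclass(frozen=True)
class Order:                 # transactional data: the business event
    order_id: int
    customer_id: int         # reference to master data
    destination_id: int      # reference to master data
    placed_at: datetime      # the 'when'


# Illustrative master data that must be in place first.
customers: Dict[int, Customer] = {1: Customer(1, "Acme Freight")}
locations: Dict[int, Location] = {10: Location(10, "Richmond")}
orders: List[Order] = []


def place_order(order: Order) -> None:
    """Reject a transaction whose master data is not yet in place."""
    if order.customer_id not in customers:
        raise KeyError(f"unknown customer {order.customer_id}")
    if order.destination_id not in locations:
        raise KeyError(f"unknown location {order.destination_id}")
    orders.append(order)


place_order(Order(100, 1, 10, datetime(2017, 8, 8, 14, 0)))
```

The order itself carries only identifiers and the event details; the descriptive facts about the customer and the destination live once in the master data.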
Data Store Purpose: A Review of the Data Topology
• Operational Data Store (ODS)
– An Operational Data Store (ODS) is created to integrate data from two or more SORs. The ODS is normally created to satisfy reporting needs across functional SOR boundaries. The ODS should hold very little historical information and should focus on maintaining the most up-to-date data needed by the organization for daily operations. Depending on the application requirements, the ODS may institute a near real-time data feed from the source applications. The ODS is expected to be technically accurate and is considered to be an Authoritative Source. The data it contains can be used for non-critical needs instead of having to access the SOR. The more frequently the data is pushed into the ODS environment, the less reliance there will be on direct access to SORs for data reporting needs.
• Enterprise Data Warehouse (EDW)
– An Enterprise Data Warehouse (EDW) is responsible for the collection and integration of data from either SORs or the Operational Data Store. An EDW has an enterprise scope, as it will pull from many (if not all) SORs. The focus of the data warehouse is historical in nature, and in many instances it is loaded with latency (for example, every 24 hours). The data warehouse is created to support historical analytics. The data warehouse is expected to be exhaustive in the data it collects, with the focus on collecting and storing the data.
Data Store Purpose: A Review of the Data Topology
• Data Marts
– A Data Mart is a subset of a data warehouse. It is created to address specific questions and/or a subject area of questions. A Data Mart is built and tuned to deliver data to the end users; it exists to get the data out of the data warehouse.
Data Store Purpose: A Review of the Data Topology
• Event Data Store
– The Event Data Store logs, stores and reports the discrete business and technical events which occur within the process. This data store is a critical, and often overlooked, data domain for managing, controlling and creating transparency into the business processes. The events are used to report the overall health of the processes in both business and technical terms. This consolidated solution is key to obtaining a 360 view of the processes.
• Metadata Store
– Metadata is a broad term which includes descriptive elements in both business and technical terms. It covers: business terms, data element descriptions, element display formats, element valid values, element quality targets, etc. Metadata is critical to an organization as it describes the organization’s business and processing infrastructure in detail. Metadata is entertainingly defined as “data about the data”. That is, Metadata characterizes other data and makes it easier to retrieve, interpret and use information.
[Figure labels: Event Data Store (Bus OPS Events, Tech OPS Events); Metadata Store (Technical Metadata, Business Metadata)]
Data Store Characteristics
Operational in contrast with Analytic:
• Subject-Oriented (operational): Databases which are focused on a single or small set of business functions
• Integrated (analytic): Collecting and semantically aligning data from disparate sources to achieve a homogeneous view
• Volatile (operational): Data which may change frequently
• Non-Volatile (analytic): Data which, once entered into the database, will not change
• Atomic (operational): Low grain data: each transaction, each order, with all of the attributes
• Aggregate (analytic): A summary of multiple orders or transactions, performed to transform the atomic detail into more comprehensible information
• Current Valued (operational): The data and the system represent what is current in this moment; not yesterday, not last week, but now
• Time Variant (analytic): Data is marked and stored with a date/time element so that questions of what it was yesterday and last week can be answered
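Two of these contrasts, current valued versus time variant and atomic versus aggregate, can be shown in a small Python sketch. All names and sample values here are hypothetical; the sketch only illustrates the characteristics and is not a design for either kind of store.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, List, Tuple

# Operational style: current valued and volatile. An update overwrites.
current_status: Dict[int, str] = {}

# Analytic style: time variant and non-volatile. Rows are only appended.
status_history: List[Tuple[int, date, str]] = []


def record_status(order_id: int, as_of: date, status: str) -> None:
    current_status[order_id] = status                   # keeps only "now"
    status_history.append((order_id, as_of, status))    # keeps every state


def status_on(order_id: int, day: date) -> str:
    """Answer 'what was it on that day?' from the time-variant history.
    Assumes history rows were appended in date order."""
    answer = "unknown"
    for oid, as_of, status in status_history:
        if oid == order_id and as_of <= day:
            answer = status
    return answer


record_status(1, date(2017, 8, 1), "ordered")
record_status(1, date(2017, 8, 3), "shipped")

# Atomic rows (one per order) summarized into an aggregate per customer.
atomic_orders = [("acme", 120.0), ("acme", 80.0), ("zenith", 50.0)]
totals: Dict[str, float] = defaultdict(float)
for customer, amount in atomic_orders:
    totals[customer] += amount
```

The current-valued dictionary can only say that order 1 is "shipped" now; the history can also say that it was "ordered" on 2 August.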
Data Structure Design Styles
• 3rd Normal Form (3NF)
– Inmon
• Dimensional
– Kimball
• Data Vault
– Linstedt
Design Styles – 3NF
• 3rd Normal Form Modeling
• A mathematical data design technique founded in the early 70s by E.F. Codd.
• Organizes data in simple rows and columns - Entities
• Creates connections between the entities, called relationships, to show how the data is inter-related
• In its purest form, 3NF removes all data redundancies: a piece of data is stored only once
• 3NF is based on mathematics: give the same facts to different modelers and the model should be the same.
• Creates a visual (Entity Relationship Diagram - ERD) which may be understood by less technical personnel
• 3NF is the modeling style most popularly used for operationally focused data stores.
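The "stored only once" property can be demonstrated with Python's built-in `sqlite3` module. This is a minimal sketch with hypothetical table names and sample rows, not a schema from the presentation: the customer's city lives in one row, so a single update is seen by every order that references it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    city        TEXT NOT NULL          -- stored once, here only
);
CREATE TABLE customer_order (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    order_date  TEXT NOT NULL
);
INSERT INTO customer VALUES (1, 'Acme Freight', 'Richmond');
INSERT INTO customer_order VALUES (100, 1, '2017-08-01');
INSERT INTO customer_order VALUES (101, 1, '2017-08-02');
""")

# One update to the single stored copy of the city.
conn.execute("UPDATE customer SET city = 'Glen Allen' WHERE customer_id = 1")

# Every order picks up the change through the relationship.
rows = conn.execute("""
    SELECT o.order_id, c.city
    FROM customer_order AS o
    JOIN customer AS c ON c.customer_id = o.customer_id
    ORDER BY o.order_id
""").fetchall()
```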
Inmon Implementation
Design Styles – Dimensional
• Created and refined by Ralph Kimball in the 80s.
• Organizes data in Facts and Dimensions. Fact tables record the events (what) within the business domain and the Dimension tables describe who, when, how and where.
• The data design style was created to exploit the capabilities of the relational database to retrieve and report against large volumes of data.
• Dimensional modeling sacrifices storage efficiency for analytical processing speed
• There are 2 variations of Dimensional Modeling: Star Schema and Snowflake
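A star schema can be sketched with Python's built-in `sqlite3` module: one fact table surrounded by dimension tables, queried by joining and grouping. The table names (`fact_sales`, `dim_customer`, `dim_date`) and the sample rows are hypothetical, chosen only to illustrate the Facts and Dimensions idea.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimensions describe who and when.
CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, name TEXT, region TEXT);
CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, full_date TEXT, month TEXT);
-- The fact table records the events (what) and points at the dimensions.
CREATE TABLE fact_sales (
    customer_key INTEGER REFERENCES dim_customer(customer_key),
    date_key     INTEGER REFERENCES dim_date(date_key),
    amount       REAL
);
INSERT INTO dim_customer VALUES (1, 'Acme Freight', 'East');
INSERT INTO dim_customer VALUES (2, 'Zenith Haulage', 'West');
INSERT INTO dim_date VALUES (20170801, '2017-08-01', '2017-08');
INSERT INTO dim_date VALUES (20170902, '2017-09-02', '2017-09');
INSERT INTO fact_sales VALUES (1, 20170801, 120.0);
INSERT INTO fact_sales VALUES (1, 20170902, 80.0);
INSERT INTO fact_sales VALUES (2, 20170801, 50.0);
""")

# A typical analytic question: sales by region and month.
rows = conn.execute("""
    SELECT c.region, d.month, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_customer AS c ON c.customer_key = f.customer_key
    JOIN dim_date AS d ON d.date_key = f.date_key
    GROUP BY c.region, d.month
    ORDER BY c.region, d.month
""").fetchall()
```

Note that `month` is stored redundantly alongside `full_date` in the date dimension; that is the storage-for-speed trade mentioned in the bullets.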
Kimball Implementation
Design Styles – Data Vault
• One of the newer relational database modeling techniques
• Data Vault modeling was conceived in the 1990s by Dan
Linstedt
• Data Vault models are designed for central data
warehouses that store non-volatile, time-variant, atomic
data
• Relationships are defined through Link structures which
promote flexibility and extensibility
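As a rough illustration, Data Vault models are commonly built from hubs (business keys), links (relationships between hubs), and satellites (descriptive attributes stamped with a load date). The slide itself names only the Link structures, so the hub and satellite tables, all table and column names, and the sample rows in this Python `sqlite3` sketch are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hubs hold business keys only.
CREATE TABLE hub_customer (customer_hk INTEGER PRIMARY KEY, customer_bk TEXT UNIQUE, load_date TEXT);
CREATE TABLE hub_order (order_hk INTEGER PRIMARY KEY, order_bk TEXT UNIQUE, load_date TEXT);
-- A link records a relationship between hubs.
CREATE TABLE link_customer_order (
    link_hk     INTEGER PRIMARY KEY,
    customer_hk INTEGER REFERENCES hub_customer(customer_hk),
    order_hk    INTEGER REFERENCES hub_order(order_hk),
    load_date   TEXT
);
-- A satellite holds descriptive attributes; each change is a new row.
CREATE TABLE sat_customer (
    customer_hk INTEGER REFERENCES hub_customer(customer_hk),
    load_date   TEXT,
    city        TEXT,
    PRIMARY KEY (customer_hk, load_date)
);
INSERT INTO hub_customer VALUES (1, 'CUST-001', '2017-08-01');
INSERT INTO hub_order VALUES (1, 'ORD-100', '2017-08-01');
INSERT INTO link_customer_order VALUES (1, 1, 1, '2017-08-01');
INSERT INTO sat_customer VALUES (1, '2017-08-01', 'Richmond');
INSERT INTO sat_customer VALUES (1, '2017-08-15', 'Glen Allen');
""")

# Non-volatile and time variant: both versions of the city are kept.
history = conn.execute(
    "SELECT load_date, city FROM sat_customer WHERE customer_hk = 1 ORDER BY load_date"
).fetchall()
```

Adding a new relationship later (say, between orders and drivers) would mean adding another link table rather than altering existing ones, which is one reading of the flexibility and extensibility claim.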
Data Vault Implementation
Hybrid Approach
• (http://www.kimballgroup.com/2004/03/03/differences-of-opinion/)
• Learn Data Vault – “dv-in-kimball-bus-architecture”
Summary/Take Aways
Operational data stores
• Master Data
– Audience served: Operations Manager, Operational Analyst
– Build characteristics: Subject Oriented, Volatile, Atomic, Current Valued
– Design style: 3NF
• OLTP
– Audience served: Operational Performer, Operations Manager
– Build characteristics: Subject Oriented, Volatile, Atomic, Current Valued
– Design style: 3NF
• ODS
– Audience served: Operational Manager, Operational Analyst, Executive Consumer
– Build characteristics: Integrated, Volatile, Atomic, Current Valued
– Design style: 3NF
• Event
– Audience served: All Personas
– Build characteristics: Integrated, Volatile, Atomic, Current Valued
– Design style: 3NF
Analytic data stores
• Data Warehouse
– Audience served: Data Miner/Scientist
– Build characteristics: Integrated, Non-volatile, Atomic, Time Variant
– Design style: 3NF trending to Data Vault
• Data Mart
– Audience served: Operational Analyst, Data Analyst, Executive Consumer
– Build characteristics: Subject Oriented, Non-volatile, Atomic -or- Aggregated, Time Variant
– Design style: Dimensional
Outline: Design/Manage Data Structures
Upcoming Events
September Webinar:
Implementing Big Data, NOSQL, & HADOOP – Bigger is (Usually) Better
September 12, 2017 @ 2:00 PM ET/11:00 AM PT
Sign up here:
• www.datablueprint.com/webinar-schedule
• www.Dataversity.net
Questions?
10124 W. Broad Street, Suite C
Glen Allen, Virginia 23060
804.521.4056
Copyright 2017 by Data Blueprint Slide #
66

More Related Content

What's hot

DataEd Online: Unlock Business Value through Data Governance
DataEd Online: Unlock Business Value through Data GovernanceDataEd Online: Unlock Business Value through Data Governance
DataEd Online: Unlock Business Value through Data GovernanceDATAVERSITY
 
Data-Ed Online: Trends in Data Modeling
Data-Ed Online: Trends in Data ModelingData-Ed Online: Trends in Data Modeling
Data Structures - The Cornerstone of Your Data’s Home

  • 1. Tom Gartland & Peter Aiken, PhD. Data Structures: The Cornerstone of Your Data's Home. Copyright 2017 by Data Blueprint.
 Peter Aiken, Ph.D.
 • DAMA International President 2009-2013
 • DAMA International Achievement Award 2001 (with Dr. E. F. "Ted" Codd)
 • DAMA International Community Award 2005
 • 33+ years in data management
 • Repeated international recognition
 • Founder, Data Blueprint (datablueprint.com)
 • Associate Professor of IS (vcu.edu)
 • DAMA International (dama.org)
 • 10 books and dozens of articles
 • Experienced w/ 500+ data management practices
 • Multi-year immersions:
 – US DoD (DISA/Army/Marines/DLA)
 – Nokia
 – Deutsche Bank
 – Wells Fargo
 – Walmart
 – …
 Books shown: Monetizing Data Management: Unlocking the Value in Your Organization's Most Important Asset (Peter Aiken with Juanita Billings, foreword by John Bottega); The Case for the Chief Data Officer: Recasting the C-Suite to Leverage Your Most Valuable Asset (Peter Aiken and Michael Gorman)
  • 2. Tom Gartland
 • A 30+ year veteran of IT, Tom has done everything: quality assurance, programming, data analysis, architecting, business intelligence, project management
 • Across a variety of sectors and industries: finance, private health care, charity health care, government services, construction, discrete manufacturing, process manufacturing, retail, telecommunications, consulting
 • Tom spends much of his personal time with his wife and 7 Rhodesian Ridgebacks
 Data Structures: The Cornerstone of your Data's Home (outline)
 • Context: Data Management/DAMA/DM BoK/CDMP?
 • What is a data structure?
 • Structured data storage, a bit of history and context
 • Why are data structures important?
 • Data Personas/Usage (interest over time)
 • Data Topology and alignment to the data audience
 • Internal data structures to fit the needs
 • Q & A?
  • 3. Maslow's Hierarchy of Needs
 Data Management Practices Hierarchy
 You can accomplish Advanced Data Practices without becoming proficient in the Foundational Data Practices; however, this will:
 • Take longer
 • Cost more
 • Deliver less
 • Present greater risk
 (with thanks to Tom DeMarco)
 • Advanced Data Practices: MDM, Mining, Big Data, Analytics, Warehousing, SOA
 • Foundational Data Practices: Data Platform/Architecture, Data Governance, Data Quality, Data Operations, Data Management Strategy
 [Diagram labels: Technologies, Capabilities]
  • 4. DMM℠ Structure of 5 Integrated DM Practice Areas
 • Practice areas: Data Management Strategy, Data Governance, Data Quality, Data Operations, Platform Architecture, plus Supporting Processes
 • Diagram annotations: Manage data coherently; Manage data assets professionally; Maintain fit-for-purpose data, efficiently and effectively; Data architecture implementation; Data life cycle management; Organizational support
 • The second diagram repeats the practice areas annotated with the values 3, 3, 3, 3, 1: Strategy is often the weakest link!
  • 5. Data Management Body of Knowledge (DM BoK V2) Practice Areas
 To do any of these well requires specific knowledge of the relevant data structures!
  • 6. Without Data Structures ...
 Unstructured data cannot be transformed into structured data!
 • Water into wine
 • Coal into gold
 • Proper usage is:
 – Semi-structured into more structured
 – Non-tabular data into tabular data
 – Operational question: how much of it?
 [Diagram label: Wrappers]
  • 7. What is a data structure?
 • "An organization of information, usually in memory (for better algorithm efficiency), such as queue, stack, linked list, heap, dictionary, and tree, or conceptual unity, such as the name and address of a person. It may include redundant information, such as length of the list or number of nodes in a subtree." (http://www.nist.gov/dads/HTML/datastructur.html)
 • Some data structure characteristics
 – Grammar (rules) for data objects
 – Constraints for data objects
 – Sequential order
 – Uniqueness
 – Arrangement: hierarchical, relational, network, other
 – Balance
 – Optimality
 How are data structures expressed as architectures?
 • Details are organized into larger components
 • Larger components are organized into models
 • Models are organized into architectures
  • 8. How are data structures expressed as architectures?
 • Attributes are organized into entities/objects
 – Attributes are characteristics of "things"
 – Entities/objects are "things" whose information is managed in support of strategy
 – For example: person (name, dob, res, kids, phone)
 • Entities/objects are organized into models
 – Combinations of attributes and entities are structured to represent information requirements
 – Poorly structured data constrains organizational information delivery capabilities
 – For example: sales model, accounting model, reporting model
 • Models are organized into architectures
 – When building new systems, architectures are used to plan development
 – More often, data managers do not know what existing architectures are and therefore cannot make use of them in support of strategy implementation
 – For example: financial architecture or business intelligence architecture
 Sample Data Architecture Overview
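The attribute, entity, model, architecture layering above can be sketched in Python. This is only an illustration: the `Person` fields follow the slide's example (name, dob, res, kids, phone), while `Order`, `SalesModel`, `total_for`, and the `architecture` dictionary are hypothetical names invented here.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Person:
    """Entity: a 'thing' described by attributes (taken from the slide's example)."""
    name: str
    dob: date
    res: str      # residence
    kids: int
    phone: str


@dataclass
class Order:
    """Hypothetical second entity, added only so the model has something to relate."""
    order_id: int
    buyer: Person
    amount: float


@dataclass
class SalesModel:
    """Model: entities combined to represent one information requirement."""
    people: list = field(default_factory=list)
    orders: list = field(default_factory=list)

    def total_for(self, name: str) -> float:
        # Sum the order amounts placed by the named person.
        return sum(o.amount for o in self.orders if o.buyer.name == name)


# Architecture: a named collection of models (only one model here).
architecture = {"sales": SalesModel()}

townsend = Person("Townsend", date(1970, 1, 1), "Richmond", 2, "555-0100")
architecture["sales"].people.append(townsend)
architecture["sales"].orders.append(Order(1, townsend, 40.0))
architecture["sales"].orders.append(Order(2, townsend, 60.0))
print(architecture["sales"].total_for("Townsend"))
```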
  • 9. History (such as it is)
 • Automate existing manual processing
 • Data management was:
 – Running millions of punched cards through banks of sorting, collating & tabulating machines
 – Results printed on paper or punched onto more cards
 – Data management meant physically storing and hauling around punched cards
 • Tasks (check signing, calculating, and machine control) were implemented to provide automated support for departmental-based processing
 • Creating information silos
 • Data Processing Manager
  • 10. • Data Processing Manager, Chief Information Officer
 CFO Necessary Prerequisites/Qualifications
 • CPA
 • CMA
 • Masters of Accountancy
 • Other recognized degrees/certifications
 • These are necessary but insufficient prerequisites/qualifications
  • 11. CIO Qualifications
 • No specific qualifications
 • Typically technological fields:
 – Computer science
 – Software engineering
 – Information systems
 • Business
 – Master of Business Administration
 – Master of Science in Management
 • Business acumen and strategic perspectives have taken precedence over technical skills.
 – CIOs appointed from the business side of the organization
 • Especially if they have project management skills.
 What do we teach knowledge workers about data? What percentage of them deal with it daily?
  • 12. Data Leverage
 • Permits organizations to better manage their sole non-depletable, non-degrading, durable, strategic asset: data
 – within the organization, and
 – with organizational data exchange partners
 • Leverage
 – Obtained by implementation of data-centric technologies, processes, and human skill sets
 – Increased by elimination of data ROT (redundant, obsolete, or trivial)
 • The bigger the organization, the greater the potential leverage
 • Treating data more asset-like simultaneously
 1. lowers organizational IT costs and
 2. increases organizational knowledge worker productivity
 [Diagram labels: Less ROT, Technologies, Process, People]
  • 13. Data Structure Questions
 • Who makes decisions about the range and scope of common data usage?
 [Diagram labels: Program D through Program I; Application domain 2; Application domain 3]
 Running Query
  • 14. Optimized Query
 Repeat 100s, thousands, millions of times ...
  • 15. Data structures organized into an Architecture
 • How do data structures support organizational strategy?
 • Consider the opposite question:
 – Were your systems explicitly designed to be integrated or otherwise work together?
 – If not, then what is the likelihood that they will work well together?
 – In all likelihood your organization is spending between 20% and 40% of its IT budget compensating for poor data structure integration
 – Data structures cannot be helpful as long as their structure is unknown
 • Two answers/two separate strategies
 – Achieving efficiency and effectiveness goals
 – Providing organizational dexterity for rapid implementation
  • 16. Data Models Used to Support Strategy
 • Flexible, adaptable data structures
 • Cleaner, less complex code
 • Ensure strategy effectiveness measurement
 • Build in future capabilities
 • Form/assess merger and acquisition strategies
 [Diagram entities: Employee, Employee Type, Sales Person, Manager, Manager Type, Staff Manager, Line Manager. Adapted from Clive Finkelstein, Information Engineering: Strategic Systems Development, 1992]
 5 Basic Data Structures
 • Flat File: Records are typically sorted according to some criteria and must be searched from the beginning for each access. Program: Must start at the beginning and read each record when looking for person "Townsend".
 • Indexed Sequential File: Built-in index permits location of records of persons with last names starting with "T". Program: Where is the record for person "Townsend"? Index: Start looking here, where the "Ts" are stored.
 • Hierarchical Database: Records are related to each other hierarchically using 'parent child' relationships
 • Network Database: Records are related to each other using arranged master records associated with multiple detail records using linked lists and pointers
 • Relational Database: Records are related to each other using relationships describable using relational algebra
 Also listed: Associative, Concept-oriented, Multi-dimensional, XML database, 3NF, Star schema, Data Vault
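The difference between the flat file and the indexed sequential file can be made concrete with a small sketch. It assumes a sorted in-memory list of last names standing in for the file; the names other than "Townsend" and both lookup functions are hypothetical, and a real ISAM implementation would index disk blocks rather than list positions.

```python
# Sorted records, standing in for a file ordered by last name.
records = ["Adams", "Baker", "Taylor", "Thomas", "Townsend", "Young"]


def flat_file_lookup(records, name):
    """Flat file: read from the beginning until the record is found.

    Returns (position, number_of_records_read)."""
    reads = 0
    for pos, rec in enumerate(records):
        reads += 1
        if rec == name:
            return pos, reads
    return None, reads


def build_index(records):
    """Index: first letter -> position of the first record with that letter."""
    index = {}
    for pos, rec in enumerate(records):
        index.setdefault(rec[0], pos)
    return index


def indexed_lookup(records, index, name):
    """Indexed sequential: jump to where the matching letter starts, then scan."""
    start = index.get(name[0])
    if start is None:
        return None, 0
    reads = 0
    for pos in range(start, len(records)):
        reads += 1
        if records[pos] == name:
            return pos, reads
        if records[pos][0] != name[0]:
            break  # left the block of names sharing the first letter
    return None, reads


index = build_index(records)
print(flat_file_lookup(records, "Townsend"))       # reads every record before it
print(indexed_lookup(records, index, "Townsend"))  # starts where the "Ts" are stored
```

With these six records the flat scan reads five records to reach "Townsend", while the indexed lookup reads three.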
  • 17. Single Data Store: One Size does not satisfy all needs
 • The thought of a single monolithic data store which can service all of an organization's information needs has long since been abandoned. In the modern data management topology, multiple data stores are created to service specific processing needs and user groups within the organization.
 • Implications:
 – The needs characteristics of the multitude of audiences served by the data structures
 – Data lifecycle
 – The design styles (old and new) utilized to organize the data to service the audiences
 – A breakdown of the various stores
 – The resultant store characteristics
 Typical System Evolution
 • Payroll Application (3rd GL); Payroll Data (database)
 • R & D Applications (researcher supported, no documentation); R & D Data (raw)
 • Mfg. Applications (contractor supported); Mfg. Data (home grown database)
 • Finance Application (3rd GL, batch system, no source); Finance Data (indexed)
 • Marketing Application (4th GL, query facilities, no reporting, very large); Marketing Data (external database)
 • Personnel App. (20 years old, un-normalized data); Personnel Data (database)
  • 18. The Situation
 How many interfaces are required to solve this integration problem?
 • Applications 1 through 6: 15 interfaces, (N*(N-1))/2
 • RBC: 200 applications, 4900 batch interfaces
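The 15-interface figure can be checked by enumerating every pair of applications, which is what the (N*(N-1))/2 formula counts. The sketch below assumes one interface per unordered pair of applications; the function name is made up for illustration.

```python
from itertools import combinations


def point_to_point_interfaces(apps):
    """List every unordered pair of applications: one interface per pair."""
    return list(combinations(apps, 2))


apps = [f"Application {i}" for i in range(1, 7)]
interfaces = point_to_point_interfaces(apps)

n = len(apps)
print(len(interfaces))    # number of pairs found by enumeration
print(n * (n - 1) // 2)   # the slide's closed-form count
```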
  • 19. The rapidly increasing cost of complexity
 [Chart: worst case number of interconnections plotted against number of silos]
 • N silos / worst case interconnections
 – 6 / 15
 – 60 / 1,770
 – 600 / 179,700
 – 200 / 19,900
 – 200 / 5,000 (actual)
 XML-based Integration Solution
 [Diagram: Applications 1 through 6 connected to an XML Processor hub]
 3-Way Scalability. Expand the:
 1. Number of data items from each system
 – How many individual data items are tagged?
 2. Number of interconnections between the systems and the hub
 – How many systems are connected to the hub?
 3. Amount of interconnectability among hub-connected systems
 – How many inter-system data item transformations exist in the rule collection?
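A short sketch shows why a hub changes the growth curve. It reproduces the slide's worst-case figures with n(n-1)/2 and compares them against a hub topology under the assumption, made here and not stated on the slide, that each system needs exactly one connection to the hub. Both function names are hypothetical.

```python
def worst_case_interconnections(n):
    """Point-to-point: every silo may need an interface to every other silo."""
    return n * (n - 1) // 2


def hub_connections(n):
    """Hub topology, assuming one connection per system to the hub."""
    return n


for n in (6, 60, 600, 200):
    print(f"{n:>4} silos: {worst_case_interconnections(n):>7} point-to-point, "
          f"{hub_connections(n):>4} via hub")
```

The point-to-point count grows quadratically while the hub count grows linearly; the work that remains at the hub is the rule collection of inter-system data item transformations.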
  • 20. Conclusions
 • One data structure is not enough
 • Most organizations have far too many different data structures, and they become barriers to progress and integration
 • There is not much expertise available to figure out these challenges
  • 21. Data Personas (The Requirements)
 • Operational Performer: Interested in alerts, notifications and reporting based on current-value (real-time) data. They use the information to make decisions and changes in the transactional systems. These changes are targeted to improve the organization's ability to deliver in the short term.
 • Operational Analyst (Manager): Interested in aggregated real-time data for their domain of responsibility. The data is displayed using visualization techniques such as scorecards, charts and reports, preferably within a single dashboard. They search for favorable/unfavorable trends that indicate adjustments are needed in staff and resource allocations.
 • Data Analyst: Responsible for supporting detailed and typically complex analysis requests from business users/consumers of data. The analyst role spans both the operational and historical time windows, so analysts need to be versed in both the operational and analytic environments.
 • Data Miner/Scientist: Responsible for using statistical and machine learning techniques to identify patterns in the data. These patterns are correlated into insights and actions for better business outcomes. The miner may use operational and historical data for research.
 • Executive Consumer: Receives the data through summary dashboards with drill down/through capabilities. Requests detailed analysis and reporting on High Value Questions from the Data Analysts and Data Miners. These consumers look at the data to make short- and long-term decisions to improve organizational efficiency and customer experience.
 [Diagram labels: Operational, Analytic]
 Persona Data Interest
 • Operational interest is high when data is introduced to the operational stores. This interest wanes over time.
 • Analytic interest is low when data is first introduced. The interest increases as the data is collected and combined with other enterprise data.
 [Chart: Operational Interest and Analytic Interest plotted as Interest against Time]
  • 22. Development Standards/Concrete Blocks
 Example: Set Analysis (from MicroStrategy, Better Business Decisions Every Day: Integrating Business Reporting & Analysis)
  • 23. Data Topology Today Can Be Complex
 [Diagram labels: OLTP 1, OLTP 2, OLTP n...; Master Data; Operational Data Store (ODS); Enterprise Data Warehouse (EDW); Data Mart (three); Event Data Store (Bus OPS Events, Tech OPS Events); Metadata Store (Technical Metadata, Business Metadata)]
  • 24. Data Store Purpose: a review of the Data Topology
 • Master Data – Master Data is the term used to describe the data domains that drive business activities. Master data is the data that must first be in place before business transactions can occur. Master data is often shared across the organizational business units and it is typically at the center of business strategies. The transaction defines the business/process event (order, dispatch, sales) while the Master Data describes the 'who' (customers, drivers, account reps), the 'what' (load), the 'when' (date, time) and the 'where' (origin and destination location).
 • Online Transaction Processing (OLTP) – "Transactional data" is the term used to describe the data involved in the execution of the business activities. Transactional data associates master data (e.g. customers and products) with a business activity that often represents a unit of work, such as the creation of an order.
 • The Master Data and OLTP stores are where data is initially created and persisted within the organization, and thus they carry a special classification of System of Record (SOR). They are created to capture the transactional data as it arrives and to make the data available for the processes and services. The data arrives in these databases through manual entry or automated feeds. These data stores are logically (and sometimes physically) separated by the transactional subject area they are created to serve.
 Data Store Purpose: a review of the Data Topology (second slide, same OLTP and SOR text)
 • Master Data – Master Data is the term used to describe the data domains that drive business activities. Master data is the data that must first be in place before business transactions can occur. Master data is often shared across the organizational business units and it is typically at the center of business strategies. The transaction defines the business/process event (order, dispatch, sales) while the Master Data describes the 'who' (customers, drivers, account reps), the 'how' (order delivery type), the 'when' (date, time) and the 'where' (location, destination).
  • 25. Data Store Purpose: a review of the Data Topology
 • Operational Data Store (ODS) – An Operational Data Store (ODS) is created to integrate data from two or more SORs. The ODS is normally created to satisfy reporting needs across functional SOR boundaries. The ODS should hold very little historical information and should focus on maintaining the most up-to-date data needed by the organization for daily operations. Depending on the application requirements, the ODS may institute a near real-time data feed from the source applications. The ODS is expected to be technically accurate and is considered to be an Authoritative Source. The data it contains can be used for non-critical needs instead of having to access the SOR. The more frequently the data is pushed into the ODS environment, the less reliance there will be on direct access to SORs for data reporting needs.
 • Enterprise Data Warehouse (EDW) – An Enterprise Data Warehouse (EDW) is responsible for collection and integration of data from either SORs or from the Operational Data Store. An EDW has an enterprise scope as it will pull from many (if not all) SORs. The focus of the data warehouse is historical, and in many instances it is loaded with a latency (for example, every 24 hours). The data warehouse is created to support historical analytics. The data warehouse is expected to be exhaustive in the data it collects, with the focus on collecting and storing the data.
 • Data Marts – A Data Mart is a subset of a data warehouse. It is created to address specific questions and/or a subject area of questions. A Data Mart is built and tuned to deliver the data to the end users; it exists to get the data out of the data warehouse.
  • 26. Data Store Purpose: a review of the Data Topology
 • Event Data Store – The data store which logs, stores and reports the discrete business and technical events which occur within the process. This data store is a critical and often overlooked data domain for managing, controlling and creating transparency into the business processes. The events are used to report the overall health of the processes in both business and technical terms. This consolidated solution is key to obtaining a 360 view of the processes.
 • Metadata Store – Metadata is a broad term which includes descriptive elements in both business and technical terms. It covers: business terms, data element descriptions, element display formats, element valid values, element quality targets, etc. Metadata is critical to an organization as it describes the organization's business and processing infrastructure in detail. Metadata is entertainingly defined as "data about the data". That is, Metadata characterizes other data and makes it easier to retrieve, interpret and use information.
 Data Store Characteristics: Operational in contrast with Analytic
 • Subject-Oriented (operational): Databases which are focused on a single or small set of business functions. Integrated (analytic): Collecting and semantically aligning data from disparate sources to achieve a homogeneous view.
 • Volatile (operational): Data which may change frequently. Non-Volatile (analytic): Data which, once entered into the database, will not change.
 • Atomic (operational): Low grain data; each transaction, each order, with all of the attributes. Aggregate (analytic): A summary of multiple orders or transactions, performed to transform the atomic detail into more comprehensible information.
 • Current Valued (operational): The data and the system represent what is current in this moment; not yesterday, not last week, but now. Time Variant (analytic): Data is marked and stored with a date/time element so that questions such as what it was yesterday and last week can be answered.
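Two of the contrasts above, atomic versus aggregate and current valued versus time variant, can be sketched in a few lines. All data, field names, and function names below are invented for illustration; a real store would hold these rows in database tables.

```python
from collections import defaultdict
from datetime import date

# Atomic: one row per order, with all of its attributes (hypothetical data).
orders = [
    {"order_id": 1, "customer": "Acme", "day": date(2017, 8, 1), "amount": 100.0},
    {"order_id": 2, "customer": "Acme", "day": date(2017, 8, 1), "amount": 50.0},
    {"order_id": 3, "customer": "Zenith", "day": date(2017, 8, 2), "amount": 75.0},
]


def aggregate_by_customer(rows):
    """Aggregate: summarize atomic rows into one total per customer."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["customer"]] += row["amount"]
    return dict(totals)


current_status = {}   # current valued: each update overwrites the previous value
status_history = []   # time variant: each update is appended with its date


def record_status(order_id, status, as_of):
    current_status[order_id] = status
    status_history.append((order_id, status, as_of))


def status_as_of(order_id, when):
    """Answer 'what was it on that day?' from the time-variant history."""
    matches = [(d, s) for oid, s, d in status_history if oid == order_id and d <= when]
    return max(matches)[1] if matches else None


record_status(1, "open", date(2017, 8, 1))
record_status(1, "shipped", date(2017, 8, 3))
print(aggregate_by_customer(orders))
print(current_status[1], status_as_of(1, date(2017, 8, 2)))
```

The current-valued dictionary can only say the order is shipped now; the appended history can still answer that it was open on 2 August.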
  • 27. Data Structure Design Styles
 • 3rd Normal Form (3NF) – Inmon
 • Dimensional – Kimball
 • Data Vault – Linstedt
  • 28. Design Styles – 3NF
 • 3rd Normal Form Modeling
 • A mathematical data design technique founded in the early 70s by E.F. Codd.
 • Organizes data in simple rows and columns: Entities
 • Creates connections between the entities, called relationships, to show how the data is inter-related
 • In its purest form, 3NF removes all data redundancies: a piece of data is stored only once
 • 3NF is based on mathematics: given the same facts, different modelers should produce the same model.
 • Creates a visual (Entity Relationship Diagram, ERD) which may be understood by less technical personnel
 • 3NF is the modeling style most popularly used for operationally focused data stores.
 Inmon Implementation
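A minimal sketch of the "stored only once" property, using Python's built-in sqlite3 module. The tables, columns, and data are hypothetical and far smaller than a real operational schema; the point is only that a customer's city lives in one row and every order reaches it through a relationship.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    city        TEXT NOT NULL
);
CREATE TABLE product (
    product_id  INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    unit_price  REAL NOT NULL
);
-- Each order refers to its customer and product by key instead of copying their attributes.
CREATE TABLE customer_order (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    product_id  INTEGER NOT NULL REFERENCES product(product_id),
    quantity    INTEGER NOT NULL
);
INSERT INTO customer VALUES (1, 'Acme', 'Richmond');
INSERT INTO product VALUES (10, 'Widget', 2.50);
INSERT INTO customer_order VALUES (100, 1, 10, 4), (101, 1, 10, 6);
""")

# Because the city is stored once, a single-row update is seen by every order.
conn.execute("UPDATE customer SET city = 'Glen Allen' WHERE customer_id = 1")

rows = conn.execute("""
SELECT o.order_id, c.city, o.quantity * p.unit_price
FROM customer_order o
JOIN customer c ON c.customer_id = o.customer_id
JOIN product  p ON p.product_id  = o.product_id
ORDER BY o.order_id
""").fetchall()
print(rows)
```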
  • 29. Design Styles – Dimensional
 • Created and refined by Ralph Kimball in the 80s.
 • Organizes data in Facts and Dimensions. Fact tables record the events (what) within the business domain and the Dimension tables describe who, when, how and where.
 • The data design style was created to exploit the capabilities of the relational database to retrieve and report against large volumes of data.
 • Dimensional modeling sacrifices storage efficiency for analytical processing speed
 • There are 2 variations of Dimensional Modeling: Star Schema and Snowflake
 Kimball Implementation
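A small star schema sketch, again with Python's sqlite3 module, shows the fact/dimension split described above. The table names, keys, and rows are hypothetical; the fact table records the sales events and the two dimension tables describe when and who.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimensions describe who, when, how and where.
CREATE TABLE dim_date (
    date_key      INTEGER PRIMARY KEY,
    calendar_date TEXT,
    month         TEXT
);
CREATE TABLE dim_customer (
    customer_key  INTEGER PRIMARY KEY,
    name          TEXT,
    region        TEXT
);
-- The fact table records the events (what), keyed by the dimensions.
CREATE TABLE fact_sales (
    date_key      INTEGER REFERENCES dim_date(date_key),
    customer_key  INTEGER REFERENCES dim_customer(customer_key),
    quantity      INTEGER,
    amount        REAL
);
INSERT INTO dim_date VALUES (20170801, '2017-08-01', '2017-08'),
                            (20170901, '2017-09-01', '2017-09');
INSERT INTO dim_customer VALUES (1, 'Acme', 'East'), (2, 'Zenith', 'West');
INSERT INTO fact_sales VALUES (20170801, 1, 4, 10.0),
                              (20170801, 2, 1, 2.5),
                              (20170901, 1, 6, 15.0);
""")

# A typical analytic question: sales by month and region.
sales_by_month_region = conn.execute("""
SELECT d.month, c.region, SUM(f.amount)
FROM fact_sales f
JOIN dim_date d     ON d.date_key = f.date_key
JOIN dim_customer c ON c.customer_key = f.customer_key
GROUP BY d.month, c.region
ORDER BY d.month, c.region
""").fetchall()
print(sales_by_month_region)
```

Note that `month` is stored redundantly alongside `calendar_date` in the date dimension; that is the storage-for-speed trade the slide describes.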
  • 30. Design Styles – Data Vault
 • One of the newer relational database modeling techniques
 • Data Vault modeling was conceived in the 1990s by Dan Linstedt
 • Data Vault models are designed for central data warehouses that store non-volatile, time-variant, atomic data
 • Relationships are defined through Link structures which promote flexibility and extensibility
 Data Vault Implementation
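The following is a deliberately simplified, in-memory sketch of the Data Vault idea. The slide names only Link structures; the hub and satellite terms used here come from general Data Vault practice, and every key, attribute, and function name is hypothetical. It shows append-only, time-variant attribute storage and how a new relationship is added as a new link without altering existing structures.

```python
from datetime import date

# Hubs hold business keys only.
hub_customer = {"C-1"}
hub_product = {"P-10"}

# Links hold relationships between hubs: (customer_key, product_key, load_date).
link_purchase = [("C-1", "P-10", date(2017, 8, 1))]

# Satellites hold descriptive attributes, appended with a load date (non-volatile).
sat_customer = [("C-1", date(2017, 8, 1), {"city": "Richmond"})]


def load_customer_attributes(key, load_date, attrs):
    """Append a new version instead of overwriting the old one."""
    hub_customer.add(key)
    sat_customer.append((key, load_date, attrs))


def customer_as_of(key, when):
    """Return the attribute version in effect on the given date, if any."""
    versions = [(d, a) for k, d, a in sat_customer if k == key and d <= when]
    return max(versions, key=lambda v: v[0])[1] if versions else None


load_customer_attributes("C-1", date(2017, 9, 1), {"city": "Glen Allen"})

# Extensibility: a new relationship becomes a new hub and link;
# the existing hubs, links and satellites are left untouched.
hub_store = {"S-7"}
link_customer_store = [("C-1", "S-7", date(2017, 9, 1))]

print(customer_as_of("C-1", date(2017, 8, 15)))
print(customer_as_of("C-1", date(2017, 9, 2)))
```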
  • 31. Hybrid Approach
 • (http://www.kimballgroup.com/2004/03/03/differences-of-opinion/)
 • Learn Data Vault – "dv-in-kimball-bus-architecture"
 Summary/Take Aways
 (Each row: data store; audience served; build characteristics; design style)
 OPERATIONAL
 • Master Data; Operations Manager, Operational Analyst; Subject Oriented, Volatile, Atomic, Current Valued; 3NF
 • OLTP; Operational Performer, Operations Manager; Subject Oriented, Volatile, Atomic, Current Valued; 3NF
 • ODS; Operational Manager, Operational Analyst, Executive Consumer; Integrated, Volatile, Atomic, Current Valued; 3NF
 • Event; All Personas; Integrated, Volatile, Atomic, Current Valued; 3NF
 ANALYTIC
 • Data Warehouse; Data Miner/Scientist; Integrated, Non-volatile, Atomic, Time Variant; 3NF trending to Data Vault
 • Data Mart; Operational Analyst, Data Analyst, Executive Consumer; Subject Oriented, Non-volatile, Atomic -or- Aggregated, Time Variant; Dimensional
  • 32. Outline: Design/Manage Data Structures
 Upcoming Events
 • September Webinar: Implementing Big Data, NOSQL, & HADOOP – Bigger is (Usually) Better
 • September 12, 2017 @ 2:00 PM ET/11:00 AM PT
 • Sign up here: www.datablueprint.com/webinar-schedule, www.Dataversity.net
  • 33. Questions?
 10124 W. Broad Street, Suite C, Glen Allen, Virginia 23060. 804.521.4056