Peter Aiken, Ph.D. & Steven MacLauchlan
Data Warehousing Strategies
Copyright 2014 by Data Blueprint
Premise
Two types of listeners …
1. Interested in how to
approach the subject of
warehousing data
– Need to integrate disparate
data
– Need more holistic view of
business operations
– Management just discovered
data warehouses and wants
you to "build one"
2. Have complex and/or
messy data warehouse
practices
– Want to improve them
Data Warehousing Strategies
1. Warehousing data in the context of data
management
2. Motivation for integration technologies 

(reporting->BI->Analytics)
3. Warehouse integration technologies
4. Three warehousing architecture foci
5. The use of meta models
6. Guiding principles & best practices
Maslow's Hierarchy of Needs
You can accomplish
Advanced Data Practices
without becoming proficient
in the Foundational Data
Management Practices
however this will:
• Take longer
• Cost more
• Deliver less
• Present greater risk
(with thanks to Tom DeMarco)
Data Management Practices Hierarchy
Advanced Data Practices
• MDM
• Mining
• Big Data
• Analytics
• Warehousing
• SOA
Foundational Data Management Practices
Data Platform/Architecture
Data Governance
Data Quality
Data Operations
Data Management Strategy
Technologies
Capabilities
What is data management?
[Figure labels: Sources; Data Governance; Data Engineering; Data Delivery; Data Storage; Specialized Team Skills; Uses; Reuses]
Understanding the current
and future data needs of an
enterprise and making that
data effective and efficient in
supporting business activities
Aiken, P., Allen, M. D., Parker, B., Mattia, A., "Measuring Data Management's Maturity: A Community's Self-Assessment," IEEE Computer (research feature, April 2007)
Data management practices connect
data sources and uses in an
organized and efficient manner
• Storage
• Engineering
• Delivery
• Governance
When executed, engineering, storage, and delivery implement governance
Note: the diagram does not depict data reuse well
Maintain fit-for-purpose data,
efficiently and effectively
DMM℠ Structure of 5 Integrated DM Practice Areas
Manage data coherently
Manage data assets professionally
Data architecture
implementation
Data engineering
implementation
Organizational support
Data Management Body of Knowledge
Data
Management
Functions
DAMA DM BoK & CDMP
• Data Management Body of
Knowledge (DMBOK)
– Published by DAMA International, the professional association for Data Managers (40 chapters worldwide)
– Organized around primary data management
functions focused around data delivery to the
organization and several environmental
elements
• Certified Data Management
Professional (CDMP)
– Series of 3 exams by DAMA International and
ICCP
– Membership in a distinct group of fellow professionals
– Recognition for specialized knowledge in a choice of 17 specialty areas
– For more information, please visit:
• www.dama.org, www.iccp.org
Data Warehousing & Business Intelligence Management
Warehousing data in the context of data management
Assumes you have
• An overarching data strategy
• A strategy for becoming
familiar with "big data
technologies"
• Made a decision to not make
available (integrating or
storing) needed data
• Decided to increase (or
decrease) the complexity of
existing DM practices
• Decided to learn more about
this DM BoK slice
[Figure labels: Sources; Data Governance; Data Engineering; Data Delivery; Data Storage; Specialized Team Skills; Uses; Reuses]
Typical System Evolution
• Payroll Application (3rd GL)
– Payroll Data (database)
• R & D Applications (researcher supported, no documentation)
– R & D Data (raw)
• Mfg. Applications (contractor supported)
– Mfg. Data (home grown database)
• Finance Application (3rd GL, batch system, no source)
– Finance Data (indexed)
• Marketing Application (4th GL, query facilities, no reporting, very large)
– Marketing Data (external database)
• Personnel App. (20 years old, un-normalized data)
– Personnel Data (database)
... Then Integrate
Multiple sources of (for example) customer data:
• Payroll Data (database)
• R & D Data (raw)
• Mfg. Data (home grown database)
• Finance Data (indexed)
• Marketing Data (external database)
• Personnel Data (database)
... Then Re-architect
Organizational Data
• Payroll Data (database)
• R & D Data (raw)
• Mfg. Data (home grown database)
• Finance Data (indexed)
• Marketing Data (external database)
• Personnel Data (database)
Organizational Data
An organization's integration needs ...
Data Architecture
• Software Package 1
• Software Package 2
• Software Package 3
• Software Package 4
• Software Package 5
• Software Package 6
... map between and across software packages
Defining Data Warehousing, BI/Analytics
• Data Warehousing
– A technology solution supporting … business capabilities such as: query, analysis, reporting and development of these capabilities
– Analysis of information not previously integrated
– Another, often new, set of organizational capabilities
• Business Intelligence (aka decision support)
– Dates at least to 1958
– Support better business decision making
– Technologies, applications and practices for the collection, integration, analysis, and
presentation of business information
– Understanding historical patterns in data to improve future performance
– Use of mathematics in business
• Analytics (aka enterprise decision management, marketing analytics, predictive science, strategy science, credit risk analysis, fraud analytics), often based on computational modeling
• Reframing the question …
– From: what data warehouse should we build?
– To: how can data warehouse-based integration address challenges?
From The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
Descriptive
Ask: What happened? What is happening?
Find: Structured data
Show: Profiles, Bar/pie charts, Narrative
Predictive
Ask: What will happen? Why will it happen?
Find: Structured/unstructured data
Show: Risk Profiles, Pros/Cons, Care Recs
Prescriptive
Ask: What should I do? Why should I do it?
Find: Unstructured/structured data
Show: Strategic Goals, Support Recs
Hemophilia Management Analytics
Target Isn't Just Predicting Pregnancies
http://rmportal.performedia.com/node/1373
http://www.predictiveanalyticsworld.com/patimes/target-really-predict-teens-pregnancy-inside-story/
http://rmportal.performedia.com/rm/paw10/gallery_01#1373
Basics
• Users can "drill" anywhere
• Entire collection
"cube" is
accessible
• Summaries to
transaction-level
detail
Sample questions …
• Cancer patient revenue across all facilities
• Revenue for diseases this year versus last year in the NE region
• Total costs and revenue at top 10 facilities
• Emphasis on the
"cube"
– N dimensions
• Permits different
users to "slice and
dice" subsets of data
• Viewing from different
perspectives
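The "cube" idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the fact rows, the three dimensions (disease, region, year), and the `roll_up` helper are hypothetical and do not come from the slides or from any particular OLAP product.

```python
from collections import defaultdict

# Hypothetical transaction-level facts: (disease, region, year, revenue).
FACTS = [
    ("cancer", "NE", 2013, 120.0),
    ("cancer", "NE", 2014, 150.0),
    ("cancer", "SW", 2014, 90.0),
    ("hemophilia", "NE", 2013, 40.0),
    ("hemophilia", "NE", 2014, 60.0),
]
DIMENSIONS = ("disease", "region", "year")


def roll_up(facts, by, **filters):
    """Sum revenue grouped by the `by` dimensions, after slicing on `filters`."""
    totals = defaultdict(float)
    for row in facts:
        record = dict(zip(DIMENSIONS, row[:3]))
        # "Slice and dice": keep only rows that match every filter.
        if any(record[dim] != value for dim, value in filters.items()):
            continue
        key = tuple(record[dim] for dim in by)
        totals[key] += row[3]
    return dict(totals)


# Cancer patient revenue across all regions (summary level).
cancer_total = roll_up(FACTS, by=("disease",), disease="cancer")
# Revenue per disease and year in the NE region (one step of drill-down).
ne_by_year = roll_up(FACTS, by=("disease", "year"), region="NE")
```

Adding a dimension to `by` drills down toward transaction-level detail; removing one summarizes. A real cube would precompute or index these aggregates rather than scan every row.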
Example: Set Analysis
Portfolio Analysis
• Bank accounts are of varying value and risk
• Cube by
– Social status
– Geographical location
– Net value, etc.
• Strategy or goal: balance return on the loan with risk of default
• How to evaluate the portfolio as a whole?
– Least risk loan may be to the very wealthy, but there are a very
limited number
– Many poor customers, but greater risk
• Solution may combine types of analyses
– When to lend, interest rate charged
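One way to make the return-versus-risk trade-off concrete is a per-segment expected-return calculation. The sketch below is hypothetical: the `Segment` fields, the sample numbers, and the simplifying assumption that a default loses the whole principal are illustrations, not part of the original material.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    name: str            # hypothetical cube cell, e.g. a status/location pair
    customers: int       # how many loans could be made in this segment
    loan_amount: float   # average principal per loan
    rate: float          # annual interest rate charged
    p_default: float     # estimated probability of default


def expected_return(seg: Segment) -> float:
    """One-year expected return for a segment, under the simplifying
    assumption (not from the slides) that a default loses the whole principal."""
    per_loan = seg.loan_amount * (seg.rate * (1 - seg.p_default) - seg.p_default)
    return seg.customers * per_loan


segments = [
    Segment("wealthy / urban", customers=10, loan_amount=100_000,
            rate=0.04, p_default=0.01),
    Segment("low income / rural", customers=1_000, loan_amount=5_000,
            rate=0.12, p_default=0.05),
]
portfolio_return = sum(expected_return(s) for s in segments)
```

Under these made-up numbers the few low-risk loans contribute far less than the many higher-risk ones, which is the tension the slide describes: the answer depends on the whole portfolio, not on any single segment.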
15 years ago, CarMax started as a way to make the car buying experience simple, fair, and fun. Today CarMax is a FORTUNE 500 retailer and one of FORTUNE’s “100 Best Companies to
Work For.” And we are hiring talented individuals who are interested in:

--solving original, wide-ranging, and open-ended business problems

--not only discovering new insights, but successfully implementing them

--making a significant mark on a growing company

--developing the fundamental skills for a rewarding business career



If that sounds like you, the Strategy Analyst position is the unique opportunity you’ve been looking for. The strategy team at CarMax currently consists of over 40 analysts, many of whom
are recent college graduates from top schools with a variety of academic backgrounds (computer science, economics, English, engineering, journalism, math, political science). These
analysts lead advances and decisions in several key business areas:-Inventory and pricing—what is the optimal selection of inventory, how do we acquire it, what should we pay for it, what
should we price it for?

-Expansion planning—which markets should we enter and how do we store those markets? Will each $10-30 million store investment generate a sufficient economic return?

-Credit strategy—how can our bank (CarMax Auto Finance) approve more customers for loans and convert more approvals to sales?

-Marketing and consumer insight—how do we reach our customers, increase traffic to our stores, and best use the internet to drive sales and build our brand

-Industry and competitive research—what middle- and long-term risks are we exposed to, and how best do we prepare to respond? 

-Production—how do we increase vehicle reconditioning quality while reducing cost and production time?

-Sales process and workforce—what is the best way to serve customers in our stores, and how do we manage, motivate and compensate our sales team?



Even early in your career at CarMax, you will have the responsibility to own an area of the business and will be expected to improve it. For example, one undergraduate recruit used data
analysis to reformulate our retail pricing strategy, pitched and sold his idea to the senior executive team, and implemented a new system nationwide in his first 6 months with the company.
That is the kind of impact you can make at CarMax. And as you do this, you will work closely with the senior executives and analytical managers to develop the fundamental and advanced
skills that underpin a successful career in business. In fact, most of our managers in the strategy group started at CarMax as analysts, and our VP of Strategy and Analysis started his
career here through our undergraduate recruiting program. While an MBA is not required to advance or contribute at CarMax, analysts who have chosen to pursue a business degree have
enjoyed superior acceptance rates at their first choice schools, including Harvard, Chicago, UVa, Columbia, and Duke.
Your opportunities to develop, contribute, and lead as an analyst at CarMax are as great as the company’s opportunity to grow. While CarMax is already the largest used car retailer in the
country (with over $8 billion in sales and over 90 superstores across the country), we have only 2% of the 1 to 6-year-old used car market, which, at $280 billion annually, is bigger than the
home improvement or consumer electronics industries. CarMax is already growing at 15% a year, and over the next 10 years plans to have 250-300 stores and achieve $25+ billion in
annual sales. As an analyst, you can be an integral part of that growth, all while enjoying a casual and friendly environment, a diverse group of talented associates, a healthy work-life
balance, and excellent compensation and benefits. 



An ideal candidate will have
--Demonstrated top caliber analytic and problem solving skills --History of achievement demonstrated by top 15% GPA, with a quantitative major(s), and/or other recognition such as
scholarships, awards, honor societies
-- Passion for business and desire to develop into a strong business leader


We encourage you to apply. For more information, please visit us at the career fair, on our website (www.carmax.com/collegerecruiting), or email us at college_recruiting@carmax.com
CarMax Example Job Posting
own an area of the business and will be expected to improve it
--solving original, wide-ranging, and open-ended business problems

--not only discovering new insights, but successfully implementing them

--making a significant mark on a growing company

--developing the fundamental skills for a rewarding business career
Polling Question #1
• Do you have/have
you started data
warehousing, marts
and/or other
warehousing forms
of integration?
a. Last year (2014)
b. This year (2015)
c. Next Year (2016)
d. Nope
Technology
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
• ETL
• Change Management Tools
• Data Modeling Tools
• Data Profiling Tools
• Data Cleansing Tools
• Data Integration Tools
• Reference Data Management Applications
• Master Data Management Applications
• Process Modeling Tools
• Meta-data Repositories
• Business Process and Rule Engines
Warehousing Definitions
• Inmon:
– "A subject oriented, integrated, time variant, and non-volatile
collection of summary and detailed historical data used to support
the strategic decision-making processes of the organization."
• Kimball:
– "A copy of transaction data specifically structured for query and
analysis."
• Key concepts focus on:
– Subjects
– Transactions
– Non-volatility
– Restructuring
Courtesy of: http://www.infosys.com/industries/healthcare/industryofferings/Pages/healthcare - data-warehousing.aspx
Warehousing applied to a specific challenge
Oracle
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
Corporate Information Factory Architecture
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
MetaMatrix Integration Example
• EII (Enterprise Information Integration)
– Sits between ETL and EAI: delivers tailored views of information to users at the time that it is required
Linked Data
Linked Data is about using the Web to connect related data
that wasn't previously linked, or using the Web to lower the
barriers to linking data currently linked using other methods.
More specifically, Wikipedia defines Linked Data as "a term
used to describe a recommended best practice for exposing,
sharing, and connecting pieces of data, information, and
knowledge on the Semantic Web using URIs and RDF."
linkeddata.org
Health Care Provider Data Warehouse
"A roomful of MBAs
can accomplish this
analysis faster!"
• 1.8 million members
• 1.4 million providers
• 800,000 providers no key
• 29% prov_ssn ≠ 9 digits
• 2.2% prov_number = 9 digits (required)
• 1 User
• $30 million
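Figures like the ones above typically come from simple data profiling. A minimal sketch, assuming provider records are available as dictionaries: the field names `prov_ssn` and `prov_number` follow the slide, while the sample records and the `profile_nine_digit` helper are hypothetical.

```python
import re

NINE_DIGITS = re.compile(r"\d{9}")


def profile_nine_digit(values):
    """Return the share of values that are exactly nine digits."""
    if not values:
        return 0.0
    valid = sum(1 for v in values if v is not None and NINE_DIGITS.fullmatch(v))
    return valid / len(values)


# Hypothetical provider records; only the field names come from the slide.
providers = [
    {"prov_number": "123456789", "prov_ssn": "987654321"},
    {"prov_number": "A-17", "prov_ssn": "98765"},
    {"prov_number": None, "prov_ssn": "123-45-6789"},
    {"prov_number": "000112222", "prov_ssn": "111223333"},
]

ssn_valid_share = profile_nine_digit([p["prov_ssn"] for p in providers])
number_valid_share = profile_nine_digit([p["prov_number"] for p in providers])
missing_key = sum(1 for p in providers if not p["prov_number"])
```

Running checks like this before loading is far cheaper than discovering the same problems after the warehouse is built.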
Indiana Jones: Raiders Of The Lost Ark
Causes of Data Warehouse Failure
1. The project is over budget
2. Slipped schedule
3. Unimplemented functions and capabilities
4. Unhappy users
5. Unacceptable performance
6. Poor availability
7. Inability to expand
8. Poor quality data/reports
9. Too complicated for users
10. Project not cost justified
11. Poor quality data
12. Many more values of gender code than (M/F)
13. Incorrectly structured data
14. Provides correct answer to wrong question
15. Bad warehouse design
16. Overly complex
from The Data Administration Newsletter, www.tdan.com and The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
Reframing the question
• From: How shall we build this data
warehouse?
– (Worse) … What should go into this warehouse?
• To: How can warehousing capabilities solve this specific business challenge?
– (Better still) … How can warehousing capabilities solve this class of business challenges?
• Other examples
– Are you ready for a data warehouse?
✓ Foundational practices
– Will you get it right the first time?
✓ Is the business environment constantly evolving?
✓ Do you have an agreed upon enterprise-wide vocabulary?
– Is your data warehouse intended to be the enterprise
audit-able system of record?
✓ Extract, transform and load requirements
✓ Data transformation requirements
– How fast do you need results?
✓ Performance of inserts vs reads
✓ Project deliverables
Inmon Implementation/3NF
[Figure labels: operational systems and flat files; summary data, raw data, and metadata; purchasing, sales, and inventory; analysis, reporting, and mining]
Third Normal Form
• Each attribute in the relation is a fact about a key
– Highly normalized structure
• Use Cases
– Transactional Systems
– Operational Data Stores
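A small third-normal-form layout can be sketched with Python's built-in `sqlite3` module. The tables and rows below are hypothetical; the point is only that each non-key attribute lives in exactly one table (a fact about that table's key) and that answering a business question requires joins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    city        TEXT NOT NULL
);
CREATE TABLE product (
    product_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    unit_price REAL NOT NULL
);
CREATE TABLE sale (
    sale_id     INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    product_id  INTEGER NOT NULL REFERENCES product(product_id),
    quantity    INTEGER NOT NULL
);
INSERT INTO customer VALUES (1, 'Acme', 'Richmond');
INSERT INTO product  VALUES (10, 'Widget', 2.5);
INSERT INTO sale     VALUES (100, 1, 10, 4);
""")

# Reporting on a sale needs two joins, one per referenced table.
row = conn.execute("""
    SELECT c.city, p.name, s.quantity * p.unit_price
    FROM sale s
    JOIN customer c ON c.customer_id = s.customer_id
    JOIN product  p ON p.product_id  = s.product_id
""").fetchone()

# Referential integrity: a sale for an unknown customer is rejected.
try:
    conn.execute("INSERT INTO sale VALUES (101, 999, 10, 1)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```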
Third Normal Form: Pros and Cons
Neo4j.com
• Pros
– Easily understood by business and end users
– Reduced data redundancy
– Enforced referential integrity
– Indexed attributes/flexible querying
• Cons
– Joins can be expensive
– Does not scale
Kimball Implementation/Dimensional
• Comprised of “fact tables” that contain quantitative data, and any
number of adjoining “dimension” tables
• Optimized for business reporting
• Use Cases
– OLAP (Online Analytic Processing)
– BI
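The fact/dimension layout can be illustrated with Python's built-in `sqlite3` module. This is a hedged sketch: the table names (`fact_sales`, `dim_date`, `dim_product`) and the sample rows are hypothetical, chosen only to show one fact table surrounded by dimension tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_date    (date_key INTEGER PRIMARY KEY, year INTEGER, quarter TEXT);
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE fact_sales  (
    date_key    INTEGER REFERENCES dim_date(date_key),
    product_key INTEGER REFERENCES dim_product(product_key),
    revenue     REAL    -- quantitative measure
);
INSERT INTO dim_date    VALUES (1, 2013, 'Q4'), (2, 2014, 'Q1');
INSERT INTO dim_product VALUES (1, 'sedan'), (2, 'truck');
INSERT INTO fact_sales  VALUES (1, 1, 100.0), (2, 1, 150.0), (2, 2, 200.0);
""")

# A typical reporting query: one join per dimension used, then aggregate.
revenue_by_year = conn.execute("""
    SELECT d.year, SUM(f.revenue)
    FROM fact_sales f
    JOIN dim_date d ON d.date_key = f.date_key
    GROUP BY d.year
    ORDER BY d.year
""").fetchall()
```

Every question that can be phrased as "measure by dimension" takes this same shape.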
Star Schema
Wikipedia
Star Schema Pros and Cons
• Pros
– Simple Design
– Fast Queries
– Most major DBMS are optimized for Star Schema Designs
• Cons
– Questions must be built into the design
– Data marts are often centralized on one fact table
Data Vault Implementation
Data Vault
Bukhantsov.org
• Designed to facilitate long-term historical storage, focusing on
ease of implementation
• Retains data lineage information (source/date)
• “All the data, all the time” - hybrid approach of Inmon and Kimball.
• Comprised of Hubs (which contain a list of business keys that do
not change often), Links (Associations/transactions between
hubs), and Satellites (descriptive attributes associated with hubs
and links)
• Use Cases
– Data Warehousing
– Complete Audit-ability
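The hub, link, and satellite roles described above can be sketched as plain Python records. All class, field, and key names here are hypothetical illustrations rather than a reference Data Vault implementation; the sketch only shows business keys in hubs, associations in links, and append-only descriptive history with lineage (source and load date) in satellites.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Hub:
    business_key: str      # stable business key
    load_date: date
    record_source: str     # lineage: where the key was first seen


@dataclass(frozen=True)
class Link:
    hub_keys: tuple        # business keys of the hubs being associated
    load_date: date
    record_source: str


@dataclass(frozen=True)
class Satellite:
    parent_key: str        # hub (or link) the attributes describe
    attributes: dict
    load_date: date
    record_source: str


def current(satellites, parent_key):
    """Latest descriptive attributes for a hub; older rows remain as history."""
    rows = [s for s in satellites if s.parent_key == parent_key]
    return max(rows, key=lambda s: s.load_date).attributes if rows else None


customer = Hub("CUST-1", date(2014, 1, 5), "crm")
order = Hub("ORD-9", date(2014, 2, 1), "orders")
placed = Link((customer.business_key, order.business_key), date(2014, 2, 1), "orders")

# Satellites are append-only: a change adds a row instead of updating one.
satellites = [
    Satellite("CUST-1", {"city": "Richmond"}, date(2014, 1, 5), "crm"),
    Satellite("CUST-1", {"city": "Norfolk"}, date(2014, 6, 1), "billing"),
]
history = [s.attributes["city"] for s in satellites if s.parent_key == "CUST-1"]
```

Because nothing is overwritten, both the current value and the full history (with its source) remain available for audit.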
Data Vault Pros and Cons
• Pros
– Simple integration
– Houses immense amounts of
data with excellent
performance
– Full data lineage captured
• Cons
– Complication is pushed to the
“back end”
– Can be difficult to setup for
many data workers
– No widespread support for ETL
tools yet
Comparison
Polling Question #2
• Do you have?
a. A single enterprise data warehouse
b. Coordinated data marts
c. Both
d. Uncoordinated efforts
e. None
Source:http://dmreview.com/article_sub.cfm?articleID=1000941 used with permission
Meta Data Models
Metadata Data Model
SCREEN ELEMENT: screen element id #, data item id #, screen element descr.
INTERFACE ELEMENT: interface element id #, data item id #, interface element descr.
INPUT ELEMENT: input element id #, data item id #, input element descr.
OUTPUT ELEMENT: output element id #, data item id #, output element descr.
MODEL VIEW: model view element id #, data item id #, model view element des.
DEPENDENCY: dependency elem id #, data item id #, process id #, dependency description
CODE: code id #, data item id #, stored data item #, code location
INFORMATION: information id #, data item id #, information descr., information request
PROCESS: process id #, data item id #, process description
USER TYPE: user type id #, data item id #, information id #, user type description
LOCATION: location id #, information id #, printout element id #, process id #, stored data items id #, user type id #, location description
PRINTOUT ELEMENT: printout element id #, data item id #, printout element descr.
STORED DATA ITEM: stored data item id #, data item id #, location id #, stored data description
DATA ITEM: data item id #, data item description
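Because nearly every entity in this model carries a data item id, a simple "where is this data item used?" lookup becomes possible. The sketch below models only three of the entities; the class names mirror the entity names above, while the sample records and the `where_used` helper are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataItem:
    data_item_id: int
    description: str


@dataclass(frozen=True)
class Process:
    process_id: int
    data_item_id: int
    description: str


@dataclass(frozen=True)
class StoredDataItem:
    stored_data_item_id: int
    data_item_id: int
    location_id: int
    description: str


def where_used(data_item_id, *collections):
    """Return every metadata record that references the given data item."""
    return [rec for coll in collections for rec in coll
            if rec.data_item_id == data_item_id]


# Hypothetical metadata records.
items = [DataItem(1, "provider SSN")]
processes = [Process(10, 1, "claims adjudication"), Process(11, 2, "billing")]
stored = [StoredDataItem(100, 1, 7, "PROVIDER.PROV_SSN column")]

usage = where_used(1, processes, stored)
```

This kind of impact analysis is one practical reason to maintain a metadata model alongside a warehouse.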
Warehouse Management: Warehouse Process, Warehouse Operation
Analysis: Transformation, OLAP, Data Mining, Information Visualization, Business Nomenclature
Resources: Object-Oriented (ObjectModel), Relational, Record-Oriented, Multi-Dimensional, XML
Foundation: Business Information, Data Types, Expressions, Keys Index, Type Mapping, Software Deployment
ObjectModel (Core, Behavioral, Relationships, Instance)
Overview of CWM Metamodel
http://www.omg.org/technology/documents/modeling_spec_catalog.htm
Marco & Jennings's Complete Meta Data Model
Source:http://dmreview.com/article_sub.cfm?articleID=1000941 used with permission
Guiding Principles
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
1. Obtain executive commitment and support
2. Secure business SMEs
3. Let the business drive the priorities
4. Demonstrate data quality is essential
5. Provide incremental value
6. Transparency and self service
7. One size does not fit all: Secure the right tools and products for each of your segments
8. Think and architect globally, act and build locally
9. Collaborate with and integrate all other data initiatives, especially those for data governance, data quality and metadata
10. Start with the end in mind
11. Summarize and optimize last, not first
Data Reengineering Leverage
Data Management Practices
Duplicated but ETLed Data (quality & transformations applied)
"Warehoused" Data
Learning/Feedback
Marts
Analytics Practices
Questions?
It’s your turn!
Use the chat feature or Twitter (#dataed) to submit
your questions to Peter and Steven now.
• www.datablueprint.com/webinar-schedule
• www.Dataversity.net
Upcoming Events
January Webinar:
Developing a Data-centric Strategy & Roadmap
January 12, 2016 @ 2:00 PM – 3:30 PM ET
(11:00 AM-12:30 PM PT)
February Webinar:
The Importance of Master Data Management
February 9, 2016 @ 2:00 PM – 3:30 PM ET
(11:00 AM-12:30 PM PT)
Appendix
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
Goals and Principles
1. To support and enable
effective business analysis
and decision making by
knowledgeable workers
2. To build and maintain the
environment/infrastructure to
support business intelligence
activities, specifically
leveraging all the other data
management functions to
cost effectively deliver
consistent integrated data
for all BI activities
Activities
• Understand BI information needs
• Define and maintain the DW/BI architecture
• Process data for BI
• Implement data warehouse/data marts
• Implement BI tools and user interfaces
• Monitor and tune DW processes
• Monitor and tune BI activities and performance
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
Primary Deliverables
• DW/BI Architecture
• Data warehouses, marts, cubes etc.
• Dashboards-scorecards
• Analytic applications
• File extracts (for data mining, etc.)
• BI tools and user environments
• Data quality feedback mechanism/loop
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
Roles and Responsibilities
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
Suppliers:
• Executives/managers
• Subject Matter Experts
• Data governance council
• Information consumers
• Data producers
• Data architects/analysts
Participants:
• Executives/managers
• Data Stewards
• Subject Matter Experts
• Data Architects
• Data Analysts
• Application Architects
• Data Governance Council
• Data Providers
• Other BI Professionals
Consumers:
• Application Users
• BI and Reporting Users
• Application Developers and Architects
• Data integration Developers and Architects
• BI Vendors and Architects
• Vendors, Customers and Partners
6 Best Practices for Data Warehousing
1. Do some initial architecture envisioning.
2. Model the details just in time (JIT).
3. Prove the architecture early.
4. Focus on usage.
5. Organize your work by requirements.
6. Active stakeholder participation.
http://www.agiledata.org/essays/dataWarehousingBestPractices.html
Kimball's DW Chess Pieces
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
5 Key Business Intelligence Trends
1. There's so much data, but too little insight. More data translates to a greater need to manage it and make it actionable.
2. Market consolidation means fewer choices for business intelligence users.
3. Business intelligence expands from the board room to the front lines. Increasingly, business intelligence tools will be available at all levels of the corporation.
4. The convergence of structured and unstructured data will create better business intelligence.
5. Applications will provide new views of business intelligence data. The next generation of business intelligence applications is moving beyond the pie charts and bar charts into more visual depictions of data and trends.
http://www.cio.com/article/150450/Five_Key_Business_Intelligence_Trends_You_Need_to_Know?page=2&taxonomyId=30
Corporate Information Factory Architecture
from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
References
Additional References
• http://www.information-management.com/infodirect/20050909/1036703-1.html
• http://www.agiledata.org/essays/dataWarehousingBestPractices.html
• http://www.cio.com/article/150450/Five_Key_Business_Intelligence_Trends_You_Need_to_Know?page=2&taxonomyId=3002
• http://www.computerworld.com/s/article/9228736/Business_Intelligence_and_analytics_Conquering_Big_Data?taxonomyId=9
• http://www.enterpriseirregulars.com/5706/the-top-10-trends-for-2010-in-analytics-business-intelligence-and-performance-management/
• http://www.itbusinessedge.com/cm/blogs/vizard/taking-the-analytics-pressure-off-the-data-warehouse/?cs=50698
• http://www.informationweek.com/news/software/bi/240001922

Data-Ed Webinar: Data Warehouse Strategies

  • 1.
    Peter Aiken, Ph.D.& Steven MacLauchlan Data Warehousing Strategies
  • 2.
    2 Copyright 2014 byData Blueprint Premise Two types of listeners … 1. Interested in how to approach the subject of warehousing data – Need to integrate disparate data – Need more holistic view of business operations – Management just discovered data warehouses and wants you to "build one" 2. Have complex and/or messy data warehouse practices – Want to improve them
  • 3.
    Data Warehousing Strategies 3 Copyright2014 by Data Blueprint 1. Warehousing data in the context of data management 2. Motivation for integration technologies 
 (reporting->BI->Analytics) 3. Warehouse integration technologies 4. Three warehousing architecture foci 5. The use of meta models 6. Guiding principles & best practices
  • 4.
    Maslow's Hierarchiy ofNeeds 4 Copyright 2014 by Data Blueprint
  • 5.
    You can accomplish AdvancedData Practices without becoming proficient in the Foundational Data Management Practices however this will: • Take longer • Cost more • Deliver less • Present 
 greater
 risk
 (with thanks to Tom DeMarco) Data Management Practices Hierarchy Advanced 
 Data 
 Practices • MDM • Mining • Big Data • Analytics • Warehousing • SOA Foundational Data Management Practices 5 Copyright 2014 by Data Blueprint Data Platform/Architecture Data Governance Data Quality Data Operations Data Management Strategy Technologies Capabilities
  • 6.
    UsesReuses What is datamanagement? 6 Copyright 2014 by Data Blueprint Sources Data Governance 
 Data Engineering 
 Data 
 Delivery 
 Data
 Storage Specialized Team Skills Understanding the current and future data needs of an enterprise and making that data effective and efficient in supporting 
 business activities

 Aiken, P, Allen, M. D., Parker, B., Mattia, A., 
 "Measuring Data Management's Maturity: 
 A Community's Self-Assessment" 
 IEEE Computer (research feature April 2007) Data management practices connect data sources and uses in an organized and efficient manner • Storage • Engineering • Delivery • Governance When executed, 
 engineering, storage, and 
 delivery implement governance Note: does not well-depict data reuse
  • 7.
    Maintain fit-for-purpose data, efficientlyand effectively DMM℠ Structure of 
 5 Integrated 
 DM Practice Areas 7 Copyright 2014 by Data Blueprint Manage data coherently Manage data assets professionally Data architecture implementation Data engineering implementation Organizational support
  • 8.
    Data Management Body of Knowledge
    Data Management Functions
  • 9.
    DAMA DM BoK & CDMP
      • Data Management Body of Knowledge (DMBOK)
        – Published by DAMA International, the professional association for Data Managers (40 chapters worldwide)
        – Organized around primary data management functions focused on data delivery to the organization, plus several environmental elements
      • Certified Data Management Professional (CDMP)
        – Series of 3 exams by DAMA International and ICCP
        – Membership in a distinct group of fellow professionals
        – Recognition for specialized knowledge in a choice of 17 specialty areas
        – For more information, please visit www.dama.org and www.iccp.org
  • 10.
    Data Warehousing & Business Intelligence Management
  • 11.
    Warehousing data in the context of data management
    Assumes you have
      • An overarching data strategy
      • A strategy for becoming familiar with "big data technologies"
      • Made a decision to not make available (integrating or storing) needed data
      • Decided to increase (or decrease) the complexity of existing DM practices
      • Decided to learn more about this DM BoK slice
    Diagram labels: Sources, Uses, Reuses, Data Governance, Data Engineering, Data Delivery, Data Storage, Specialized Team Skills
  • 13.
    Typical System Evolution: Multiple Sources of (for example) Customer Data
      • Payroll Application (3rd GL) -> Payroll Data (database)
      • R&D Applications (researcher supported, no documentation) -> R&D Data (raw)
      • Mfg. Applications (contractor supported) -> Mfg. Data (home grown database)
      • Finance Application (3rd GL, batch system, no source) -> Finance Data (indexed)
      • Marketing Application (4th GL, query facilities, no reporting, very large) -> Marketing Data (external database)
      • Personnel App. (20 years old, un-normalized data) -> Personnel Data (database)
  • 14.
    ... Then Integrate
    Sources: Payroll Data (database), R&D Data (raw), Mfg. Data (home grown database), Finance Data (indexed), Marketing Data (external database), Personnel Data (database)
    Target: Organizational Data
  • 15.
    ... Then Re-architect
    Sources: Payroll Data (database), R&D Data (raw), Mfg. Data (home grown database), Finance Data (indexed), Marketing Data (external database), Personnel Data (database)
    Target: Organizational Data
  • 16.
    An organization's integration needs ...
    Software Packages 1 through 6
    Data Architecture ... map between and across software packages
  • 17.
    Defining Data Warehousing, BI/Analytics
      • Data Warehousing
        – A technology solution supporting … business capabilities such as: query, analysis, reporting and development of these capabilities
        – Analysis of information not previously integrated
        – Another, often new, set of organizational capabilities
      • Business Intelligence (aka decision support)
        – Dates at least to 1958
        – Support better business decision making
        – Technologies, applications and practices for the collection, integration, analysis, and presentation of business information
        – Understanding historical patterns in data to improve future performance
        – Use of mathematics in business
      • Analytics (aka enterprise decision management, marketing analytics, predictive science, strategy science, credit risk analysis, fraud analytics), often based on computational modeling
      • Reframing the question …
        – From: what data warehouse should we build?
        – To: how can data warehouse-based integration address challenges?
    From The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
  • 18.
    Hemophilia Management Analytics
      • Descriptive
        – Ask: What happened? What is happening?
        – Find: Structured data
        – Show: Profiles, Bar/pie charts, Narrative
      • Predictive
        – Ask: What will happen? Why will it happen?
        – Find: Structured/unstructured data
        – Show: Risk Profiles, Pros/Cons, Care Recs
      • Prescriptive
        – Ask: What should I do? Why should I do it?
        – Find: Unstructured/structured data
        – Show: Strategic Goals, Support Recs
  • 19.
    Target Isn't Just Predicting Pregnancies
      • http://rmportal.performedia.com/node/1373
      • http://www.predictiveanalyticsworld.com/patimes/target-really-predict-teens-pregnancy-inside-story/
      • http://rmportal.performedia.com/rm/paw10/gallery_01#1373
  • 20.
    Basics
      • Users can "drill" anywhere
      • Entire collection "cube" is accessible
      • Summaries to transaction-level detail
  • 21.
    Sample questions …
      – Cancer patient revenue across all facilities
      – Revenue for diseases this year versus last year in the NE region
      – Total costs and revenue at top 10 facilities
      • Emphasis on the "cube"
        – N dimensions
      • Permits different users to "slice and dice" subsets of data
      • Viewing from different perspectives
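    The sample questions above can be sketched as roll-ups over a small fact set. This is only an illustration in plain Python: the facility names, diseases, and revenue figures are invented, and `rollup` is a hypothetical helper, not part of any BI tool. Filtering on a dimension value plays the role of a "slice"; choosing which dimensions to group by is the "dice".

```python
from collections import defaultdict

# Hypothetical transaction-level facts: (facility, disease, region, year, revenue)
FACTS = [
    ("Facility A", "cancer", "NE", 2013, 120.0),
    ("Facility A", "cancer", "NE", 2014, 150.0),
    ("Facility B", "cancer", "SE", 2014, 90.0),
    ("Facility B", "diabetes", "NE", 2014, 60.0),
    ("Facility C", "diabetes", "NE", 2013, 40.0),
]
DIMENSIONS = ("facility", "disease", "region", "year")


def rollup(facts, group_by, **filters):
    """Sum revenue per combination of group_by dimensions, after filtering."""
    totals = defaultdict(float)
    for row in facts:
        record = dict(zip(DIMENSIONS, row[:-1]))
        # The "slice": keep only rows matching every requested dimension value.
        if all(record[dim] == value for dim, value in filters.items()):
            key = tuple(record[dim] for dim in group_by)
            totals[key] += row[-1]
    return dict(totals)


# Cancer patient revenue across all facilities
cancer_by_facility = rollup(FACTS, ["facility"], disease="cancer")

# Revenue per disease and year in the NE region (this year versus last year)
ne_by_disease_year = rollup(FACTS, ["disease", "year"], region="NE")
```

    Because the facts stay at transaction level, the same data can be summarized along any combination of dimensions without being restructured.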
  • 22.
    Example: Set Analysis
  • 23.
    Portfolio Analysis
      • Bank accounts are of varying value and risk
      • Cube by
        – Social status
        – Geographical location
        – Net value, etc.
      • Strategy or goal: balance return on the loan with risk of default
      • How to evaluate the portfolio as a whole?
        – Least risk loan may be to the very wealthy, but there are a very limited number
        – Many poor customers, but greater risk
      • Solution may combine types of analyses
        – When to lend, interest rate charged
  • 24.
    15 years ago, CarMax started as a way to make the car buying experience simple, fair, and fun. Today CarMax is a FORTUNE 500 retailer and one of FORTUNE’s “100 Best Companies to Work For.” And we are hiring talented individuals who are interested in:
 --solving original, wide-ranging, and open-ended business problems
 --not only discovering new insights, but successfully implementing them
 --making a significant mark on a growing company
 --developing the fundamental skills for a rewarding business career
 
 If that sounds like you, the Strategy Analyst position is the unique opportunity you’ve been looking for. The strategy team at CarMax currently consists of over 40 analysts, many of whom are recent college graduates from top schools with a variety of academic backgrounds (computer science, economics, English, engineering, journalism, math, political science). These analysts lead advances and decisions in several key business areas:-Inventory and pricing—what is the optimal selection of inventory, how do we acquire it, what should we pay for it, what should we price it for?
 -Expansion planning—which markets should we enter and how do we store those markets? Will each $10-30 million store investment generate a sufficient economic return?
 -Credit strategy—how can our bank (CarMax Auto Finance) approve more customers for loans and convert more approvals to sales?
 -Marketing and consumer insight—how do we reach our customers, increase traffic to our stores, and best use the internet to drive sales and build our brand
 -Industry and competitive research—what middle- and long-term risks are we exposed to, and how best do we prepare to respond? 
 -Production—how do we increase vehicle reconditioning quality while reducing cost and production time?
 -Sales process and workforce—what is the best way to serve customers in our stores, and how do we manage, motivate and compensate our sales team?
 
 Even early in your career at CarMax, you will have the responsibility to own an area of the business and will be expected to improve it. For example, one undergraduate recruit used data analysis to reformulate our retail pricing strategy, pitched and sold his idea to the senior executive team, and implemented a new system nationwide in his first 6 months with the company. That is the kind of impact you can make at CarMax. And as you do this, you will work closely with the senior executives and analytical managers to develop the fundamental and advanced skills that underpin a successful career in business. In fact, most of our managers in the strategy group started at CarMax as analysts, and our VP of Strategy and Analysis started his career here through our undergraduate recruiting program. While an MBA is not required to advance or contribute at CarMax, analysts who have chosen to pursue a business degree have enjoyed superior acceptance rates at their first choice schools, including Harvard, Chicago, UVa, Columbia, and Duke. Your opportunities to develop, contribute, and lead as an analyst at CarMax are as great as the company’s opportunity to grow. While CarMax is already the largest used car retailer in the country (with over $8 billion in sales and over 90 superstores across the country), we have only 2% of the 1 to 6-year-old used car market, which, at $280 billion annually, is bigger than the home improvement or consumer electronics industries. CarMax is already growing at 15% a year, and over the next 10 years plans to have 250-300 stores and achieve $25+ billion in annual sales. As an analyst, you can be an integral part of that growth, all while enjoying a casual and friendly environment, a diverse group of talented associates, a healthy work-life balance, and excellent compensation and benefits. 
 
 An ideal candidate will have --Demonstrated top caliber analytic and problem solving skills --History of achievement demonstrated by top 15% GPA, with a quantitative major(s), and/or other recognition such as scholarships, awards, honor societies -- Passion for business and desire to develop into a strong business leader 
 We encourage you to apply. For more information, please visit us at the career fair, on our website (www.carmax.com/collegerecruiting), or email us at college_recruiting@carmax.com
    CarMax Example Job Posting (datablueprint.com)
  • 25.
    Polling Question #1
      • Do you have/have you started data warehousing, marts and/or other warehousing forms of integration?
        a. Last year (2014)
        b. This year (2015)
        c. Next year (2016)
        d. Nope
  • 27.
    Technology
      • ETL
      • Change Management Tools
      • Data Modeling Tools
      • Data Profiling Tools
      • Data Cleansing Tools
      • Data Integration Tools
      • Reference Data Management Applications
      • Master Data Management Applications
      • Process Modeling Tools
      • Meta-data Repositories
      • Business Process and Rule Engines
    From The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
  • 28.
    Warehousing Definitions
      • Inmon:
        – "A subject oriented, integrated, time variant, and non-volatile collection of summary and detailed historical data used to support the strategic decision-making processes of the organization."
      • Kimball:
        – "A copy of transaction data specifically structured for query and analysis."
      • Key concepts focus on:
        – Subjects
        – Transactions
        – Non-volatility
        – Restructuring
  • 29.
    Warehousing applied to a specific challenge
    Courtesy of: http://www.infosys.com/industries/healthcare/industryofferings/Pages/healthcare-data-warehousing.aspx
  • 30.
    Oracle
    From The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
  • 31.
    Corporate Information Factory Architecture
    From The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
  • 32.
    MetaMatrix Integration Example
      • EII (Enterprise Information Integration)
        – Sits between ETL and EAI; delivers tailored views of information to users at the time that it is required
  • 33.
    Linked Data
    Linked Data is about using the Web to connect related data that wasn't previously linked, or using the Web to lower the barriers to linking data currently linked using other methods. More specifically, Wikipedia defines Linked Data as "a term used to describe a recommended best practice for exposing, sharing, and connecting pieces of data, information, and knowledge on the Semantic Web using URIs and RDF."
    Source: linkeddata.org
  • 34.
    Health Care Provider Data Warehouse
    "A roomful of MBAs can accomplish this analysis faster!"
      • 1.8 million members
      • 1.4 million providers
      • 800,000 providers no key
      • 29% prov_ssn ≠ 9 digits
      • 2.2% prov_number = 9 digits (required)
      • 1 User
      • $30 million
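    Figures like the ones on this slide typically come out of data profiling. The sketch below shows how such counts could be computed; it is an assumption-laden illustration, not the analysis behind the slide. Only the field names `prov_ssn` and `prov_number` come from the slide; `prov_key`, the sample records, and the `profile` helper are hypothetical.

```python
import re

NINE_DIGITS = re.compile(r"[0-9]{9}")

# Hypothetical provider records; only the field names prov_ssn and
# prov_number come from the slide, everything else is made up.
PROVIDERS = [
    {"prov_key": "P1", "prov_ssn": "123456789", "prov_number": "000000001"},
    {"prov_key": None, "prov_ssn": "12345", "prov_number": "17"},
    {"prov_key": "P3", "prov_ssn": "", "prov_number": "A-204"},
    {"prov_key": None, "prov_ssn": "987654321", "prov_number": "42"},
]


def is_nine_digits(value):
    """True only when the value is a string of exactly nine digits."""
    return isinstance(value, str) and NINE_DIGITS.fullmatch(value) is not None


def profile(records):
    """Count missing keys and compute the share of malformed identifiers."""
    total = len(records)
    if total == 0:
        return {"missing_key": 0, "bad_ssn_pct": 0.0, "valid_number_pct": 0.0}
    missing_key = sum(1 for r in records if not r["prov_key"])
    bad_ssn = sum(1 for r in records if not is_nine_digits(r["prov_ssn"]))
    valid_number = sum(1 for r in records if is_nine_digits(r["prov_number"]))
    return {
        "missing_key": missing_key,
        "bad_ssn_pct": 100.0 * bad_ssn / total,
        "valid_number_pct": 100.0 * valid_number / total,
    }


report = profile(PROVIDERS)
```

    Running checks of this kind against source systems before loading is one way to surface key and format problems early, rather than after the warehouse is built.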
  • 35.
    Indiana Jones: Raiders Of The Lost Ark
  • 36.
    Causes of Data Warehouse Failure
    1. The project is over budget
    2. Slipped schedule
    3. Unimplemented functions and capabilities
    4. Unhappy users
    5. Unacceptable performance
    6. Poor availability
    7. Inability to expand
    8. Poor quality data/reports
    9. Too complicated for users
    10. Project not cost justified
    11. Poor quality data
    12. Many more values of gender code than (M/F)
    13. Incorrectly structured data
    14. Provides correct answer to wrong question
    15. Bad warehouse design
    16. Overly complex
    From The Data Administration Newsletter, www.tdan.com, and The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
  • 37.
    Reframing the question
      • From: How shall we build this data warehouse?
        – (Worse) … What should go into this warehouse?
      • To: How can warehousing capabilities solve this specific business challenge?
        – (Better still) … How can warehousing capabilities solve this class of business challenges?
      • Other examples
        – Are you ready for a data warehouse?
          ✓ Foundational practices
        – Will you get it right the first time?
          ✓ Is the business environment constantly evolving?
          ✓ Do you have an agreed upon enterprise-wide vocabulary?
        – Is your data warehouse intended to be the enterprise audit-able system of record?
          ✓ Extract, transform and load requirements
          ✓ Data transformation requirements
        – How fast do you need results?
          ✓ Performance of inserts vs reads
          ✓ Project deliverables
  • 39.
    Inmon Implementation/3NF
    Diagram labels: Operational System, Operational System, Flat Files; Summary Data, Raw Data, Metadata; Purchasing, Sales, Inventory; Analysis, Reporting, Mining
  • 40.
    Third Normal Form
      • Each attribute in the relationship is a fact about a key
        – Highly normalized structure
      • Use Cases
        – Transactional Systems
        – Operational Data Stores
  • 41.
    Third Normal Form: Pros and Cons
      • Pros
        – Easily understood by business and end users
        – Reduced data redundancy
        – Enforced referential integrity
        – Indexed attributes/flexible querying
      • Cons
        – Joins can be expensive
        – Does not scale
    Image credit: Neo4j.com
  • 42.
    Kimball Implementation/Dimensional
  • 43.
    Star Schema
      • Composed of "fact tables" that contain quantitative data, and any number of adjoining "dimension" tables
      • Optimized for business reporting
      • Use Cases
        – OLAP (Online Analytic Processing)
        – BI
    Image credit: Wikipedia
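    A minimal star schema, as described above, can be sketched with Python's built-in `sqlite3` module: one fact table holding a quantitative measure and two adjoining dimension tables. The table names, column names, and rows here are all hypothetical and chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables hold the descriptive context.
CREATE TABLE dim_facility (
    facility_id INTEGER PRIMARY KEY,
    name        TEXT,
    region      TEXT
);
CREATE TABLE dim_date (
    date_id INTEGER PRIMARY KEY,
    year    INTEGER
);
-- The fact table holds the quantitative data plus keys to each dimension.
CREATE TABLE fact_revenue (
    facility_id INTEGER REFERENCES dim_facility(facility_id),
    date_id     INTEGER REFERENCES dim_date(date_id),
    revenue     REAL
);
INSERT INTO dim_facility VALUES (1, 'Facility A', 'NE'), (2, 'Facility B', 'SE');
INSERT INTO dim_date VALUES (1, 2013), (2, 2014);
INSERT INTO fact_revenue VALUES (1, 1, 100.0), (1, 2, 150.0), (2, 2, 90.0);
""")

# A typical reporting query: join the fact table to its dimensions, then aggregate.
rows = conn.execute("""
    SELECT f.region, d.year, SUM(r.revenue)
    FROM fact_revenue AS r
    JOIN dim_facility AS f ON f.facility_id = r.facility_id
    JOIN dim_date AS d ON d.date_id = r.date_id
    GROUP BY f.region, d.year
    ORDER BY f.region, d.year
""").fetchall()
conn.close()
```

    Every reporting query follows the same shape (fact table joined to a handful of dimensions), which is part of why the design is considered simple and fast to query.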
  • 44.
    Star Schema Pros and Cons
      • Pros
        – Simple Design
        – Fast Queries
        – Most major DBMS are optimized for Star Schema Designs
      • Cons
        – Questions must be built into the design
        – Data marts are often centralized on one fact table
  • 45.
    Data Vault Implementation
  • 46.
    Data Vault
      • Designed to facilitate long-term historical storage, focusing on ease of implementation
      • Retains data lineage information (source/date)
      • "All the data, all the time": a hybrid of the Inmon and Kimball approaches
      • Composed of Hubs (which contain a list of business keys that do not change often), Links (associations/transactions between hubs), and Satellites (descriptive attributes associated with hubs and links)
      • Use Cases
        – Data Warehousing
        – Complete Audit-ability
    Image credit: Bukhantsov.org
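    The hub/link/satellite structure described above can be sketched with plain Python dataclasses. This is an illustrative assumption, not a reference Data Vault implementation: the class fields, the `current_attributes` helper, and the sample customer and account data are all hypothetical. Each record carries its source and load date, mirroring the lineage information the slide mentions.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Hub:
    business_key: str    # stable business identifier, e.g. a customer number
    record_source: str   # lineage: which system supplied the key
    load_date: date      # lineage: when the key was first loaded


@dataclass(frozen=True)
class Link:
    hub_keys: tuple      # business keys of the hubs being associated
    record_source: str
    load_date: date


@dataclass(frozen=True)
class Satellite:
    parent_key: str      # business key of the hub this row describes
    attributes: dict     # descriptive attributes as of load_date
    record_source: str
    load_date: date


def current_attributes(satellites, parent_key):
    """Return the most recently loaded attributes for a hub, or None."""
    rows = [s for s in satellites if s.parent_key == parent_key]
    if not rows:
        return None
    return max(rows, key=lambda s: s.load_date).attributes


customer = Hub("CUST-1", "marketing", date(2014, 1, 5))
account = Hub("ACCT-9", "finance", date(2014, 1, 6))
owns = Link((customer.business_key, account.business_key), "finance", date(2014, 1, 6))

# Satellite rows are only appended, never updated, so history is preserved.
satellites = [
    Satellite("CUST-1", {"city": "Richmond"}, "marketing", date(2014, 1, 5)),
    Satellite("CUST-1", {"city": "Norfolk"}, "personnel", date(2014, 6, 1)),
]

latest = current_attributes(satellites, "CUST-1")
history = [s.attributes["city"] for s in satellites if s.parent_key == "CUST-1"]
```

    Keeping every satellite row, rather than overwriting, is what makes the full change history and its sources auditable.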
  • 47.
    Data Vault Pros and Cons
      • Pros
        – Simple integration
        – Houses immense amounts of data with excellent performance
        – Full data lineage captured
      • Cons
        – Complication is pushed to the "back end"
        – Can be difficult to set up for many data workers
        – No widespread support for ETL tools yet
  • 49.
    Polling Question #2
      • Do you have?
        a. A single enterprise data warehouse
        b. Coordinated data marts
        c. Both
        d. Uncoordinated efforts
        e. None
  • 51.
    Meta Data Models
    Source: http://dmreview.com/article_sub.cfm?articleID=1000941 (used with permission)
  • 52.
    Metadata Data Model
      • SCREEN ELEMENT: screen element id #, data item id #, screen element descr.
      • INTERFACE ELEMENT: interface element id #, data item id #, interface element descr.
      • INPUT ELEMENT: input element id #, data item id #, input element descr.
      • OUTPUT ELEMENT: output element id #, data item id #, output element descr.
      • MODEL VIEW: model view element id #, data item id #, model view element des.
      • DEPENDENCY: dependency elem id #, data item id #, process id #, dependency description
      • CODE: code id #, data item id #, stored data item #, code location
      • INFORMATION: information id #, data item id #, information descr., information request
      • PROCESS: process id #, data item id #, process description
      • USER TYPE: user type id #, data item id #, information id #, user type description
      • LOCATION: location id #, information id #, printout element id #, process id #, stored data items id #, user type id #, location description
      • PRINTOUT ELEMENT: printout element id #, data item id #, printout element descr.
      • STORED DATA ITEM: stored data item id #, data item id #, location id #, stored data description
      • DATA ITEM: data item id #, data item description
  • 53.
    Overview of CWM Metamodel
      • Warehouse Management: Warehouse Process, Warehouse Operation
      • Analysis: Transformation, OLAP, Data Mining, Information Visualization, Business Nomenclature
      • Resources: Object-Oriented (ObjectModel), Relational, Record-Oriented, Multi Dimensional, XML
      • Foundation: Business Information, Data Types, Expressions, Keys Index, Type Mapping, Software Deployment
      • ObjectModel (Core, Behavioral, Relationships, Instance)
    Source: http://www.omg.org/technology/documents/modeling_spec_catalog.htm
  • 54.
    Marco & Jennings's Complete Meta Data Model
    Source: http://dmreview.com/article_sub.cfm?articleID=1000941 (used with permission)
  • 56.
    Guiding Principles
    1. Obtain executive commitment and support
    2. Secure business SMEs
    3. Let the business drive the priorities
    4. Demonstrate data quality is essential
    5. Provide incremental value
    6. Transparency and self service
    7. One size does not fit all: secure the right tools and products for each of your segments
    8. Think and architect globally, act and build locally
    9. Collaborate with and integrate all other data initiatives, especially those for data governance, data quality and metadata
    10. Start with the end in mind
    11. Summarize and optimize last, not first
    From The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
  • 57.
    Data Reengineering Leverage
    Diagram labels: Data Management Practices; Duplicated but ETLed Data (quality & transformations applied); "Warehoused" Data; Marts; Analytics Practices; Learning/Feedback
  • 59.
    Data Warehousing & Business Intelligence Management
  • 60.
    Questions?
    It's your turn! Use the chat feature or Twitter (#dataed) to submit your questions to Peter and Steven now.
  • 61.
    Upcoming Events
      • January Webinar: Developing a Data-centric Strategy & Roadmap, January 12, 2016 @ 2:00 PM – 3:30 PM ET (11:00 AM – 12:30 PM PT)
      • February Webinar: The Importance of Master Data Management, February 9, 2016 @ 2:00 PM – 3:30 PM ET (11:00 AM – 12:30 PM PT)
    Sign up here: www.datablueprint.com/webinar-schedule or www.Dataversity.net
  • 63.
    Goals and Principles
    1. To support and enable effective business analysis and decision making by knowledgeable workers
    2. To build and maintain the environment/infrastructure to support business intelligence activities, specifically leveraging all the other data management functions to cost effectively deliver consistent integrated data for all BI activities
    From The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
  • 64.
    Activities
      • Understand BI information needs
      • Define and maintain the DW/BI architecture
      • Process data for BI
      • Implement data warehouse/data marts
      • Implement BI tools and user interfaces
      • Monitor and tune DW processes
      • Monitor and tune BI activities and performance
    From The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
  • 65.
    Primary Deliverables
      • DW/BI Architecture
      • Data warehouses, marts, cubes, etc.
      • Dashboards and scorecards
      • Analytic applications
      • File extracts (for data mining, etc.)
      • BI tools and user environments
      • Data quality feedback mechanism/loop
    From The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
  • 66.
    Roles and Responsibilities
    Suppliers:
      • Executives/managers
      • Subject Matter Experts
      • Data governance council
      • Information consumers
      • Data producers
      • Data architects/analysts
    Participants:
      • Executives/managers
      • Data Stewards
      • Subject Matter Experts
      • Data Architects
      • Data Analysts
      • Application Architects
      • Data Governance Council
      • Data Providers
      • Other BI Professionals
    Consumers:
      • Application Users
      • BI and Reporting Users
      • Application Developers and Architects
      • Data Integration Developers and Architects
      • BI Vendors and Architects
      • Vendors, Customers and Partners
    From The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
  • 67.
    6 Best Practices for Data Warehousing
    1. Do some initial architecture envisioning.
    2. Model the details just in time (JIT).
    3. Prove the architecture early.
    4. Focus on usage.
    5. Organize your work by requirements.
    6. Active stakeholder participation.
    Source: http://www.agiledata.org/essays/dataWarehousingBestPractices.html
  • 68.
    Kimball's DW Chess Pieces
    From The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
  • 69.
    5 Key Business Intelligence Trends
    1. There's so much data, but too little insight. More data translates to a greater need to manage it and make it actionable.
    2. Market consolidation means fewer choices for business intelligence users.
    3. Business intelligence expands from the board room to the front lines. Increasingly, business intelligence tools will be available at all levels of the corporation.
    4. The convergence of structured and unstructured data will create better business intelligence.
    5. Applications will provide new views of business intelligence data. The next generation of business intelligence applications is moving beyond pie charts and bar charts into more visual depictions of data and trends.
    Source: http://www.cio.com/article/150450/Five_Key_Business_Intelligence_Trends_You_Need_to_Know?page=2&taxonomyId=30
  • 70.
    Corporate Information Factory Architecture
    From The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
  • 76.
    Additional References
      • http://www.information-management.com/infodirect/20050909/1036703-1.html
      • http://www.agiledata.org/essays/dataWarehousingBestPractices.html
      • http://www.cio.com/article/150450/Five_Key_Business_Intelligence_Trends_You_Need_to_Know?page=2&taxonomyId=3002
      • http://www.computerworld.com/s/article/9228736/Business_Intelligence_and_analytics_Conquering_Big_Data?taxonomyId=9
      • http://www.enterpriseirregulars.com/5706/the-top-10-trends-for-2010-in-analytics-business-intelligence-and-performance-management/
      • http://www.itbusinessedge.com/cm/blogs/vizard/taking-the-analytics-pressure-off-the-data-warehouse/?cs=50698
      • http://www.informationweek.com/news/software/bi/240001922