/Redman-DataAssets-DV-March2016 ©Data Quality Solutions, 2000-2016 T.C. Redman, Page 1
What Does
“Manage Data Assets”
Really Mean?
Dataversity Webinar
March 15, 2016
Thomas C. Redman, Ph.D.
Data Quality Solutions
www.dataqualitysolutions.com
Objectives
!  Provide a simple way of thinking about what it takes for data
to qualify as assets.
!  Re-examine the management system for data quality, calling
out the essential roles of data creators and data customers.
!  Show how improving data quality is a legitimate strategy in
its own right.
!  Ask that everyone make a “Friday Afternoon Measurement”
to baseline data quality in their area.
But first, a bit of context
!  Circa 2005: “Manage data as assets” enters the lexicon, without
any content.
"  An odd choice of phrase.
!  2008: wrote Data Driven, partly in response.
!  Soon thereafter: “Data driven” enters the lexicon, though usually in
a narrower context.
!  In parallel: “Big data,” “Data scientist,” “Chief Data Officer” enter
the lexicon.
!  Recently: Witness the disruption caused by Uber, simply by
connecting “I need a ride” with “I’m looking for a fare.”
!  Looking forward: For most, the future depends on data.
BIGGEST CHANGE: THE URGENCY OF RECOGNIZING THAT,
“PROPERLY MANAGED, DATA ARE ASSETS OF GREAT
POWER”
But all is far from well in the data space
The future may depend on data, but:
!  Increasingly apparent that data management has failed
most companies.
!  Fear has replaced apathy as the number one enemy of
all things data.
!  Most organizations can only be judged as “unfit for
data.”
What Does “Manage Data Assets” Really
Mean?
Generally recognized as assets:
!  Capital, in its various forms
!  People, including the knowledge in
their heads.
When one recognizes anything as an
asset, they:
!  Take care of it.
!  Put it to work, in the private sector,
usually to make money.
!  Advance the management system
in recognition of the ways the asset
is different from other assets.
Suppose You’re Good With Numbers….
To advance quantitative skill to the level of an asset, you
would:
Take Care of that Skill:
!  Do the KenKen.
!  Build that skill by getting a technical degree.
Put that Skill to Work:
!  Become a data scientist.
!  Build a career based on those growing skills.
Advance your personal management of the skill:
!  Manage yourself to take full advantage of the skill.
For Data, “taking care” is about quality,
security and privacy
Prescription 1:
!  Possess and acquire the right kinds of data.
!  Make sure people can access, understand, and trust
them.
!  Ensure they are of high enough quality to withstand market
scrutiny.
!  Improve/build on the most important data.
!  Keep them safe from loss or theft.
Putting data to work
Prescription 2: Use data to create new revenue
!  Sell them directly in the market.
!  Build them into other products and services.
!  Uncover and exploit the “hidden nuggets” that lie therein
(big data/analytics).
!  Create and exploit an “information asymmetry.”
!  Make better decisions/improve the day-in, day-out
running of the business.
Critical point: Management must think through how they
will put data to work to create new value.
So far, I’ve identified eighteen distinct
ways to “put data to work”
Provide (Sell) Content
!  New Content
!  Re-package
!  Informationalization
!  Unbundling
!  Exploiting Asymmetries
!  Closing Asymmetries
Facilitators
!  Own the Identifiers
!  Infomediation
!  Big Data/Advanced Analytics
!  Privacy and security
!  Training
!  New Marketplaces
!  Infrastructure technologies
!  Information appliances
!  Tools
Internally
!  Improve operational efficiency
!  360°-view
!  Data-Driven Culture
Working out “what’s right for us” is the key challenge for senior
leadership! And “forcing the conversation” is the number one
priority of the “Top Data Job.”
Four basic strategies for competing with
data (with dozens of variants)
!  Innovation (Big Data/Advanced Analytics): Find
hidden nuggets in the data and,…
!  Content: Provide or exploit content that others don’t
have.
"  Informationalization.
"  Infomediation (e.g., Google).
"  Asymmetry (e.g., Hedge fund).
!  Build a Data-Driven Culture: Make better decisions,
bottom-to-top and across the company.
!  Be the low-cost provider: Superior data quality keeps
costs down!
Uber: Informationalizing the Taxi
Business
Witness the disruption caused by connecting “I need a ride” with
“I’m looking for a fare.”
Advance the management system
Prescription 3:
!  Recognize that data have unique properties:
"  Example: Unlike other assets, data can be shared
"  Data may be the “ultimate proprietary asset.”
!  Craft a “data-friendly organization”:
"  Not led out of IT.
"  Top Data Job. In time, true C-band Chief Data
Officers.
"  Federated structure.
"  Many, many, more people!
!  To improve quality, recognize data customers and
creators with specific responsibilities to behave as such.
Data possess properties unlike any other asset.
We are just beginning to understand these properties
Property | Example/Implication for DQ
Data multiply | Sheer rate of growth strains ability to manage them
Data are more complex than they appear | Much (data model, correct values, presentation) has to go right
Data can be digitized | They are (technically) easy to share. And steal.
Data create value “on the move” | Data sitting in a database are profoundly uninteresting
Data are intangible | They have no physical properties, complicating measurement
Data are organic | Data morph as they move to suit different needs
Data are the means by which organizations encode knowledge | They have very long lifetimes.
Data are subtle and nuanced. They have become the organization’s lingua franca | The distinction between the “right data” and “almost right data” is akin to the distinction between “lightning” and a “lightning bug” (after Twain)
Each organization’s data are uniquely its own | They are the “ultimate proprietary technology”
Data Quality
1.  What are the implications of “data are
organic?”
2.  Does Improving Data Quality present a
legitimate strategic choice?
3.  First Step: Perform a Friday Afternoon
Measurement
For Data, Only Two Moments Really Matter
[Diagram: DATA CREATOR (The Moment of Creation) -> PATH FROM CREATOR TO CUSTOMER -> DATA CUSTOMER (The Moment of Use)]
The whole point of data quality management is to connect the two!
Note that they DO NOT occur in IT.
Implications
The “management system” must explicitly call out the
responsibilities of:
!  Data customers, to clarify their needs.
!  Data creators, to understand those needs and meet
them.
!  A data quality team and leadership, to connect the two
for the organization’s most important data.
A Note on Tech: While active data customers and data
creators are ravenous consumers of technology, Tech is
singularly ill-positioned to lead the effort.
The Hidden Data Factory
[Flowchart: Input data -> Data ok? -> yes: Complete our value-added work; no: Fix errors]
Said differently, if the “straight-through path” costs a dollar,
then the “fix errors path” costs ten.
The Data Doc’s Rule of Ten: “It costs ten times as much to
complete a unit of work when the data are flawed in any way
as it does when they are perfect!”
Rule of Ten Illustrated
The costs associated with this “fix errors path” can
mount up very quickly.
Consider “our data is 99% accurate.” Sounds pretty good!
And suppose:
!  It takes twelve pieces of data to complete a simple operation (many
simple operations require 10-25 pieces of data).
!  Further suppose the organization completes 100 work items/day.
!  Recall the straight-through path costs $1.00.
Then:
!  The “value-added cost” is $100 (100 items at $1.00 each).
!  About twelve percent (12%) of the work items have defective data
(twelve pieces of data, each with a 1% chance of error).
These cost $10 each, or $120 total.
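A minimal sketch of where the 12% comes from, assuming (the slide does not say so) that errors in the twelve pieces of data occur independently. The function name is illustrative only. The exact figure is a little under 12%; the slide's number is the simple approximation of twelve pieces times a 1% error rate.

```python
def fraction_defective(accuracy: float, pieces_per_item: int) -> float:
    """Chance that at least one piece of data in a work item is wrong,
    assuming each piece is wrong independently of the others."""
    return 1.0 - accuracy ** pieces_per_item


# 99% accurate data, twelve pieces of data per work item (from the slide)
exact = fraction_defective(0.99, 12)
approx = 12 * (1 - 0.99)  # the slide's rule of thumb: pieces x error rate

print(f"exact: {exact:.1%}, rule of thumb: {approx:.1%}")
```

At the upper end of the slide's range (25 pieces of data per operation), the same function gives a noticeably higher share of defective work items.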
Data Quality as the means to cost
reduction
Cost-of-Poor-Data Quality (COPDQ) Calculator
Cost to complete 88 defect-free ($1 each) = $88.00
Cost to complete 12 defectives ($10 each) = $120.00
Total cost = $208.00
Non-value-added cost = $108.00
%COPDQ ($108/$208) = 52%
This is an extremely important point: Even a “seemingly low”
error rate can lead to an enormous increase in operational costs.
IMPROVING DATA QUALITY CAN REDUCE SUCH COSTS BY TWO-
THIRDS OR MORE!
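The COPDQ calculation above can be sketched as a small function. The function and parameter names are illustrative, not part of the original deck; the inputs are the slide's own figures (100 work items, 12 defective, $1.00 straight-through cost, Rule of Ten multiplier).

```python
def copdq(work_items: int, defective_items: int,
          unit_cost: float = 1.00, multiplier: float = 10.0) -> dict:
    """Cost-of-Poor-Data-Quality for one batch of work items."""
    defect_free = work_items - defective_items
    cost_defect_free = defect_free * unit_cost
    cost_defective = defective_items * unit_cost * multiplier
    total = cost_defect_free + cost_defective
    value_added = work_items * unit_cost       # cost if every item were defect-free
    non_value_added = total - value_added      # extra cost caused by bad data
    return {
        "cost_defect_free": cost_defect_free,
        "cost_defective": cost_defective,
        "total": total,
        "non_value_added": non_value_added,
        "pct_copdq": non_value_added / total,
    }


result = copdq(work_items=100, defective_items=12)
print(f"Total cost = ${result['total']:.2f}")
print(f"Non-value-added cost = ${result['non_value_added']:.2f}")
print(f"%COPDQ = {result['pct_copdq']:.0%}")
```

Re-running the function with a smaller `defective_items` value shows how quickly the non-value-added cost falls as quality improves.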
Bad data adds other costs as well
!  Some errors leak through, and making good for
customers or undoing bad decisions is even more
costly.
!  Data customers spend excessive time looking for what
they need.
!  “Systems don’t talk”/shadow systems
IMPROVING DATA QUALITY (e.g., CLOSING THE HIDDEN
DATA FACTORY) MAY BE THE SIMPLEST, MOST-EFFECTIVE
WAY TO PURSUE A STRATEGY OF COST REDUCTION.
A powerful first step: The Friday
Afternoon Measurement
The Friday afternoon measurement aims to help
answer the question,
“Do I need to worry about data quality?”
Friday Afternoon Measurement Protocol
1.  Assemble “last 100 records”
2.  Assemble 2-3 experts
3.  Mark “obviously-erred” data in red
4.  Summarize and interpret results
Assemble the last 100 records
After Fig 18.2, Redman, Data Quality: The Field Guide

             Attribute 1:     Attribute 2: Size   Attribute 3: Amount   etc
Record A     Jane Doe         Null                $472.13
Record B     John Smith       Medium              $126.93
C            Stuart Madnick   XXXL                Null
D            Thoams Jones
Record 100   James Olsen      One Locked Place    $76.24

Select 10–15 most important attributes
Mark the obvious errors
After Fig 18.2, Redman, Data Quality: The Field Guide

             Attribute 1:     Attribute 2: Size   Attribute 3: Amount   etc
Record A     Jane Doe         Null                $472.13
Record B     John Smith       Medium              $126.93
C            Stuart Madnick   XXXL                Null
D            Thoams Jones
Record 100   James Olsen      One Locked Place    $76.24
Rate the record as “perfect” or not
After Fig 18.2, Redman, Data Quality: The Field Guide

             Attribute 1:     Attribute 2: Size   Attribute 3: Amount   etc   Record perfect? (y/n)
Record A     Jane Doe         Null                $472.13                     n
Record B     John Smith       Medium              $126.93                     y
C            Stuart Madnick   XXXL                Null                        n
D            Thoams Jones                                                     n
Record 100   James Olsen      One Locked Place    $76.24                      n
Count the “perfects”
After Fig 18.2, Redman, Data Quality: The Field Guide

             Attribute 1:     Attribute 2: Size   Attribute 3: Amount   etc   Record perfect? (y/n)
Record A     Jane Doe         Null                $472.13                     n
Record B     John Smith       Medium              $126.93                     y
C            Stuart Madnick   XXXL                Null                        n
D            Thoams Jones                                                     n
Record 100   James Olsen      One Locked Place    $76.24                      n

Count perfect: 67
Data Quality = 67%
Here the interpretation is that a full third of recent
customer orders had a serious DQ issue.
A worry indeed!
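A minimal sketch of the scoring step, assuming the experts' red marks have been captured as one list of true/false flags per record (true meaning "obviously wrong"). The function names and the flag layout are hypothetical, and the input below is a stand-in that simply reproduces the slide's count of 67 perfect records out of 100; it is not the actual marked-up data.

```python
def record_is_perfect(error_flags) -> bool:
    """A record is 'perfect' only if none of its attributes was marked in red."""
    return not any(error_flags)


def dq_score(marked_records) -> float:
    """Fraction of records with no marked errors."""
    perfect = sum(record_is_perfect(flags) for flags in marked_records)
    return perfect / len(marked_records)


# Stand-in markings: 67 records with no flags, 33 with one flagged attribute
marked = [[False, False, False]] * 67 + [[False, True, False]] * 33

score = dq_score(marked)
print(f"Data Quality = {score:.0%}")
```

A record counts against the score as soon as a single attribute is flagged, which is why the measurement is deliberately demanding.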
Summary Against Objectives
!  To qualify as assets, data must pass three deliberately
demanding (yet common-sense) criteria:
"  You must take care of them
"  You must put them to work
"  You must advance the management system
!  Re-examined the management system for data quality,
emphasizing the essential roles of data creators and data
customers.
!  Showed how improving data quality is a legitimate “low-cost
provider” strategy in its own right.
!  Explained the simple elegance of the “Friday Afternoon
Measurement” and asked everyone to make one.
Questions?
Thomas C. Redman, Ph.D.
“the Data Doc”
+1 732-933-4669
tomredman@dataqualitysolutions.com
www.navesinkconsultinggroup.com
Thomas C. Redman, “the Data Doc”
!  Ph.D., Statistics, Florida State, 1980.
!  Conceived and led the Data Quality Lab at AT&T
Bell Labs.
!  Formed Data Quality Solutions in 1996.
!  Latest and greatest: “Data’s Credibility Problem,”
HBR, Dec, 2013.
!  Data Driven: Profiting from Your Most Important
Business Asset, Harvard Business School Press,
2008.
!  Known bias: “Data are quite obviously the key
asset of the Information Age. Yet today’s
organizations are unfit for data. Further, despite
enormous potential, few are yet considering how
they will compete with data. Finally, high-quality
data is pre-requisite. These crystallize THE
management challenges of the 21st century.”
Data Quality: The Non-delegatable Choice
Unmanaged
To Clean Up The Lake, One Must First Eliminate The Sources Of Pollutant
Typical Result
Each error not made saves an average of $500. Quickly millions!
[Control chart: “First-time, on-time results.” Y-axis: fraction of perfect records (0.5 to 1). X-axis: month (0 to 20). Series: accuracy rate, average, lower control limit, upper control limit, target. Annotations: Program start; 1. Rqmts defined; 2. First Meas; 3. Improvements; 4. Control.]
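The slide does not say how its control limits were computed. One common choice for a monthly fraction-of-perfect-records series is a p-chart with three-sigma limits, sketched below as an assumption rather than as the author's method. The function name and the monthly counts are hypothetical, made up for illustration only.

```python
import math


def p_chart_limits(perfect_counts, sample_size):
    """Center line and three-sigma control limits for a fraction-perfect series,
    assuming the same number of records is sampled each month."""
    p_bar = sum(perfect_counts) / (len(perfect_counts) * sample_size)
    sigma = math.sqrt(p_bar * (1 - p_bar) / sample_size)
    lower = max(0.0, p_bar - 3 * sigma)  # a fraction cannot fall below 0
    upper = min(1.0, p_bar + 3 * sigma)  # or rise above 1
    return p_bar, lower, upper


# Hypothetical monthly counts of perfect records out of 100 sampled each month
monthly_perfect = [90, 92, 94, 96]
center, lcl, ucl = p_chart_limits(monthly_perfect, sample_size=100)
print(f"average = {center:.3f}, LCL = {lcl:.3f}, UCL = {ucl:.3f}")
```

A month whose fraction falls outside the limits is a signal worth investigating rather than ordinary month-to-month variation.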
