DDMA Data Quality Award 2010 - Presentation T-Mobile Netherlands - Jos Leber
Event: DDMA Data Quality Awards 2010
Speaker: Jos Leber – T-Mobile Netherlands
Date: 14 October 2010, Klooster Noordwijk
How to get Data Quality
Sr. Data Manager
Date: October 14th 2010
How does Data Quality start?
What is Data Quality within T-Mobile Netherlands
Data Quality monitoring and Tools
Cost of poor Data Quality
TMNL Data Quality mission statement
Data Management Maturity Models
T-Mobile offers the worldwide network.
Started in 1999 as
Since 2003 T-Mobile
Leading company in mobile
One of the three strategic business
units of Deutsche Telekom.
Worldwide almost 151 million mobile
In the Netherlands in 2009 an annual revenue of
1.807 billion euro and over 2000
Regarding Data Quality Management:
More than 100 systems
And 60 tools
T-Mobile international network.
• T-Mobile USA
• T-Mobile UK
• T-Mobile Germany
• T-Mobile Czech Republic
• T-Mobile Hungary
• T-Mobile Austria
• T-Mobile Croatia
• T-Mobile Slovenia
• T-Mobile Macedonia
• T-Mobile Montenegro
• ERA Poland
• OTE group (Greece,
In addition, T-Mobile has roaming agreements with over 400 roaming
partners in more than 185 countries.
Complexity of Mobile communications
• Ca 100 different barrings and SMS
• Block dialing all 0900 numbers
• Block 0900 for 18+ numbers
• Block downloading games etc
• Block in cases of bad debt
• Voice mail on / off?
• Caller ID display (“Nummerweergave”)
• Etc, etc
• Family plan
• Mobile to Mobile (business customers
• Information technology
• Network technology
The start for data quality in 2001
NEN 5825 Standard for street and city names
Category        Field                              %
Name data       title(s)                           0%
                initial(s)                       100%
                prefix(es)                        21%
                surname                          100%
                suffix(es)                         0%
                full business name                 1%
Address data    street name                      100%
                PO box                             0%
                house / PO box number            100%
                house number addition             15%
                postcode                         100%
                city                             100%
Telephone data  area code                          0%
                subscriber number                  0%
                area code and subscriber number   42%
Open / Solved cases
Customer Data Inconsistencies (2004)
Differences between the CRM and Billing system
Steps during the Phoenix project
Q4 2004 Q1 2005 Q2 2005 Q3 2005 Q4 2005 Q1 2006
0. Scope definition; Phoenix Go Live
1. Data Cleaning
Data Cleaning (‘X’ issues)
Criteria for data cleaning
Data Mapping versus data standard
2. Data Migration
DAT Data Acceptance Test plan
Data Base Attributes List
Business rule book
Data Display Tests
Special Test Cases
3. Tooling & Process
Compare & Quality tools
Daily DQM meetings
Reporting to management (IPB) 21/10/2010
Data Quality definition
Data are of high quality if they are fit for their intended uses in
operations, decision making, and planning (after Joseph Juran)
Data that are fit for use are:
Free of defects:
- accessible
- accurate
- timely
- complete
- consistent with other sources, etc.
Possess desired features:
- relevant
- comprehensive
- proper level of detail
- easy to read
- easy to interpret, etc.
What is Data Quality for T-Mobile Netherlands?
Definition of Data Quality according to a simple keyword: A.C.C.U.
Actual: is the data still up to date?
(e.g. outdated data is corrected to the new/changed data)
Correct: data is filled in within the confirmed standards
(not correct if, e.g., empty or not in the agreed format (numeric, NEN-conform, etc.))
Complete: is any information missing?
Unique: is it unique, with no duplicate relations (within a single system)?
What is Inconsistency?
The same information differs between two or more systems
Data are only of high quality if those who use them say so.
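The four A.C.C.U. checks lend themselves to a small rule set. The following is a minimal sketch only: the record layout, the field names, the one-year freshness window and the "house number must be numeric" rule are all assumptions made for illustration, not TMNL definitions.

```python
from datetime import date, timedelta

# Assumed for illustration: a record counts as "actual" if it was
# verified within the last year, and these four fields are required.
MAX_AGE = timedelta(days=365)
REQUIRED = ("customer_id", "name", "house_number", "last_verified")

def accu_issues(records, today):
    """Return (customer_id, failed_check) pairs for one single system."""
    issues = []
    seen_keys = set()
    for rec in records:
        cid = rec.get("customer_id")
        # Actual: is the data still up to date?
        verified = rec.get("last_verified")
        if verified is not None and today - verified > MAX_AGE:
            issues.append((cid, "actual"))
        # Correct: is the value in the agreed format (here: numeric)?
        house_number = rec.get("house_number")
        if house_number and not house_number.isdigit():
            issues.append((cid, "correct"))
        # Complete: is any required information missing?
        if any(not rec.get(field) for field in REQUIRED):
            issues.append((cid, "complete"))
        # Unique: no duplicate relation within this system.
        key = (rec.get("name"), house_number)
        if key in seen_keys:
            issues.append((cid, "unique"))
        seen_keys.add(key)
    return issues

sample = [
    {"customer_id": 1, "name": "JANSEN", "house_number": "12",
     "last_verified": date(2010, 9, 1)},
    {"customer_id": 2, "name": "JANSEN", "house_number": "12",
     "last_verified": date(2010, 9, 1)},
    {"customer_id": 3, "name": "DE VRIES", "house_number": "12a",
     "last_verified": date(2008, 1, 1)},
    {"customer_id": 4, "name": "", "house_number": "7",
     "last_verified": date(2010, 9, 1)},
]
issues = accu_issues(sample, today=date(2010, 10, 14))
for customer_id, check in issues:
    print(customer_id, check)
```

Inconsistency is deliberately left out of this sketch, because it needs the same record from at least two systems rather than one.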
Example of a data standard
In a Data standard attributes (or fields) are defined for e.g. Dutch Postcode:
how it is named
for what purpose do we use and maintain this attribute
what is the master?
its validation rules
Columns: Standard name; Standard NL name; Attribute name (Logical Data Model); Screen name (Clarify); Description and objective; Norm Y/N; Measured Y/N?; Format; Master; Mandatory; Rules and values; Comment

Postcode
  Standard NL name: Postcode
  Attribute name (Logical Data Model): ZIPcode
  Screen name (Clarify): Postcode
  Description and objective: The Postal code of the formal physical location where a Customer is settled/established.
  Format: Text20
  Master: x
  Mandatory: Yes
  Rules and values: Capitals
  Comment: NL Postcode is stored as dddd AA (with single space). In case of a foreign Postal code (ZIP) the format is free text with a maximum of 20

Overruled
  Standard NL name: Afwijzing (overschrijven)
  Attribute name (Logical Data Model): Overruled
  Screen name (Clarify): (not displayed)
  Description and objective: Indicates whether the postcode check is overruled by Supervisor
  Format: Boolean
  Master: x
  Mandatory: No

X-coordinate
  Standard NL name: X-coördinaat
  Attribute name (Logical Data Model): X_X_COORDINATE
  Screen name (Clarify): Coordinates x/y
  Description and objective: Geological x-coordinate of the location identified by postcode and house number; is used to calculate the location of the “Home Zone” in the GSM network (unconfirmed).
  Format: Number
  Master: x
  Mandatory: No
  Rules and values: Selected from Geo-tool table
  Comment: X-coordinate is specified using the ‘abc’ notation. According to this standard the X-coordinate can be maximally 6 digits long

Y-coordinate
  Standard NL name: Y-coördinaat
  Attribute name (Logical Data Model): X_Y_COORDINATE
  Screen name (Clarify): Coordinates x/y
  Description and objective: Geological y-coordinate of the location identified by postcode and house number; is used to calculate the location of the “Home Zone” in the GSM network (unconfirmed).
  Format: Number
  Master: x
  Mandatory: No
  Rules and values: Selected from ‘xyx’-tool table
  Comment: Y-coordinate is specified using the ‘abc’ notation. According to this standard the Y-coordinate can be maximally 7 digits long
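The postcode rules in this example standard (capitals, NL postcode stored as dddd AA with a single space, free text up to 20 for foreign codes) can be turned into a validation rule. The sketch below is an illustration under assumptions: the function names are hypothetical, the limit of 20 is read as 20 characters (matching the Text20 format), and only the dddd AA pattern from the slide is checked, not whether the postcode actually exists.

```python
import re

# Pattern taken from the standard: four digits, one space, two capitals.
NL_POSTCODE = re.compile(r"[0-9]{4} [A-Z]{2}")
MAX_LENGTH = 20  # assumed to mean 20 characters (format Text20)

def is_valid_postcode(value, country="NL"):
    """Check a stored postcode against the rules of the data standard."""
    if not value or len(value) > MAX_LENGTH:
        return False
    if country == "NL":
        return NL_POSTCODE.fullmatch(value) is not None
    return True  # foreign postal code: free text within the length limit

def normalise_nl_postcode(raw):
    """Bring input such as '2201ab' into 'dddd AA'; None if not possible."""
    compact = re.sub(r"\s+", "", raw).upper()
    if re.fullmatch(r"[0-9]{4}[A-Z]{2}", compact):
        return compact[:4] + " " + compact[4:]
    return None

print(is_valid_postcode("2201 AB"))       # stored in the agreed format
print(is_valid_postcode("2201ab"))        # missing space, lower case
print(normalise_nl_postcode("2201ab"))    # repaired to the agreed format
```

Separating validation from normalisation keeps the monitoring rule strict while still giving a cleaning script a way to repair the common cases.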
Data Monitoring and Data Inspection tools
Chart: Bi-weekly Quality Monitoring, measurements from 2005-11-17 to 2006-11-08.
Series: % active Customers with an inconsistency issue without customer impact;
% active Customers with an inconsistency issue with direct customer impact
To measure is to know
Meten = weten
Messen ist Wissen
Technical concept DQ Dashboard
Diagram labels: Extract & Compare; Quality issues (CSV); persistent details file; details file used to correct data; Weekly Statistics (Excel); Summary Sheet (Excel); Report (Powerpoint)

Blocking = % active Customers with a blocking data quality error (count with impact error)
Non-blocking = % active Customers with a non-blocking data quality error (count without impact error)

Date        # active customers  With impact  Blocking  Without impact  Non-blocking  Target(s)
2008-02-08  592496              13279        2.2412%   22365           3.7747%       0.50%
2008-03-17  581846              12240        2.1036%   15708           2.6997%       0.50%  4.00%
2008-04-23  730445              13451        1.8415%   14149           1.9370%       0.50%  3.00%
2008-05-14  735347              12390        1.6849%   8871            1.2064%       0.50%  2.00%
2008-08-11  766990              11818        1.5408%   8490            1.1069%       0.50%
2008-08-21  767645              5607         0.7304%   8991            1.1712%       0.50%
2008-09-08  768004              4424         0.5760%   5984            0.7792%       0.50%
2008-09-18  768244              3628         0.4722%   2624            0.3416%       0.50%
2008-10-22  768690              3034         0.3947%   1581            0.2057%       0.50%
2008-11-22  769156              2988         0.3885%   961             0.1249%       0.50%
2008-12-08  769156              2905         0.3777%   883             0.1148%       0.50%
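The percentages in the dashboard table are the error count divided by the number of active customers. A minimal sketch of that calculation and of the comparison against a target follows; the function names are hypothetical, and applying the 0.50% target to the blocking (with impact) column is an assumption about how the slide should be read.

```python
def error_rate_pct(error_count, active_customers):
    """Share of active customers with an error, as a percentage."""
    return 100.0 * error_count / active_customers

def meets_target(error_count, active_customers, target_pct):
    """True when the error rate is at or below the target percentage."""
    return error_rate_pct(error_count, active_customers) <= target_pct

# (date, # active customers, count with impact), copied from the table.
rows = [
    ("2008-02-08", 592496, 13279),
    ("2008-09-08", 768004, 4424),
    ("2008-09-18", 768244, 3628),
]
for day, active, with_impact in rows:
    rate = error_rate_pct(with_impact, active)
    status = "on target" if meets_target(with_impact, active, 0.50) else "above target"
    print(f"{day}  {rate:.4f}%  {status}")
```

With these rows the measurement of 2008-09-18 is the first to fall below 0.50%, which matches the percentages shown in the table.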
Compare statistics & details file examples
Compare statistic active customers     0   01-08-10  08-08-10  15-08-10  21-08-10  29-08-10  Delta  Impact
Customer records compared              1   2190973   2195402   2199179   2203809   2211593   7784
Inconsistencies for category Customer  2   1823      1773      1766      1760      3366      1606
Customer.type                          5   8         8         8         8         8         0
Customer.account_number                6   118       118       120       120       1694      1574   Wrong account number on acceptgiro or direct debit (A.I.)
Customer.payment_method                7   96        99        100       97        122       25     Wrong payment method: acceptgiro or direct debit (A.I.)
Customer.status                        8   33        39        39        38        39        1      Customer inactive in system A and active in system B
Customer.name                          9   508       505       505       504       501       -3
Customer.billing_address.street        10  224       218       213       213       215       2      Mail not sent to the correct address (SWL, invoice)
Customer.billing_address.housenr       11  178       173       171       171       172       1      Mail not sent to the correct address (SWL, invoice)
Customer.billing_address.housenr_add   12  62        63        63        63        62        -1     Mail not sent to the correct address (SWL, invoice)
Customer.billing_address.city          13  256       255       254       254       256       2      Mail not sent to the correct address (SWL, invoice)
Customer.billing_address.zipcode       14  222       217       214       213       214       1      Mail not sent to the correct address (SWL, invoice)
Customer.billing_address.country       15  10        10        11        11        13        2      Mail not sent to the correct address (SWL, invoice)
Customer.billing_address.bill_line_2   16  108       68        68        68        70        2
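The Delta column is the difference between the two most recent compare runs, which makes sudden jumps such as the one for Customer.account_number easy to spot. A small sketch, with hypothetical function names and an alert threshold of 100 that is purely an assumption:

```python
def latest_delta(counts):
    """Difference between the two most recent compare runs."""
    return counts[-1] - counts[-2]

def spikes(weekly_counts, threshold):
    """Attributes whose inconsistency count rose by more than threshold."""
    return sorted(
        attribute
        for attribute, counts in weekly_counts.items()
        if latest_delta(counts) > threshold
    )

# Counts for 01-08-10 to 29-08-10, copied from the compare statistics.
weekly_counts = {
    "Customer.account_number": [118, 118, 120, 120, 1694],
    "Customer.payment_method": [96, 99, 100, 97, 122],
    "Customer.name": [508, 505, 505, 504, 501],
}
for attribute, counts in weekly_counts.items():
    print(attribute, latest_delta(counts))
print(spikes(weekly_counts, threshold=100))
```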
Details file example
attribute primary_key found_in_system_a found_in_system_b value_in_system_a value_in_system_b first_detected_date last_detected_date customer_status
Customer.account_number <<number>> Y Y 494291400 603238416 14-Aug-10 29-Aug-10 active
Customer.account_number <<number>> Y Y 3243325 14-Aug-10 29-Aug-10 active
Customer.account_number <<number>> Y Y 420657096 21-Aug-10 29-Aug-10 active
Customer.account_number <<number>> Y Y 6287953 28-Aug-10 29-Aug-10 active
Customer.account_number <<number>> Y Y 3674570 2838689 28-Aug-10 29-Aug-10 active
Customer.account_number <<number>> Y Y 2195502 463859987 28-Aug-10 29-Aug-10 active
Customer.account_number <<number>> Y Y 546149928 559308787 28-Aug-10 29-Aug-10 active
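A details file like this can be produced by comparing the same attribute for the same primary key in two systems, and by keeping the first detection date while refreshing the last one on every run. The sketch below is an assumption about how such a compare could work, not the TMNL tooling; `compare_systems`, `update_details` and the in-memory dictionaries are hypothetical stand-ins for the extracts and the persistent details file.

```python
from datetime import date

def compare_systems(system_a, system_b, attributes):
    """Return {(attribute, primary_key): (value_a, value_b)} for every
    record present in both systems whose values differ."""
    found = {}
    for key in sorted(system_a.keys() & system_b.keys()):
        for attribute in attributes:
            value_a = system_a[key].get(attribute)
            value_b = system_b[key].get(attribute)
            if value_a != value_b:
                found[(attribute, key)] = (value_a, value_b)
    return found

def update_details(details, found, run_date):
    """Merge one compare run into the persistent details."""
    for ident, (value_a, value_b) in found.items():
        # first_detected is only set the first time an issue is seen.
        row = details.setdefault(ident, {"first_detected": run_date})
        row.update(value_a=value_a, value_b=value_b, last_detected=run_date)
    return details

crm = {1001: {"account_number": "494291400"},
       1002: {"account_number": "3243325"}}
billing = {1001: {"account_number": "603238416"},
           1002: {"account_number": "3243325"}}

details = {}
for run_date in (date(2010, 8, 14), date(2010, 8, 29)):
    found = compare_systems(crm, billing, ["account_number"])
    update_details(details, found, run_date)
print(details)
```

Keeping both dates shows how long an inconsistency has been open, which is what the first_detected_date and last_detected_date columns express.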
Data Quality check examples
attribute primary_key parent_key ext_ref source value first_detected_date last_detected_date customer_status
Cla_contact.birthname 123456677 1.11591288/0 System a 01/07/1955 03-Oct-10 04-Oct-10 active
Cla_contact.birthname 269859164 1.11689312/0 System a BUSCH 03-Oct-10 04-Oct-10 active
Cla_contact.birthname 269859164 1.11846553/0 System a CHEN 03-Oct-10 04-Oct-10 active
Cla_contact.birthname 269859164 1.11870069/0 System a ROUS 03-Oct-10 04-Oct-10 active
Cla_contact.birthname 270042435 1.11872380/0 System a VERHAGEN 03-Oct-10 04-Oct-10 active
Cla_contact.birthname 270060987 1.11890909/0 System a SEWPERSAD 03-Oct-10 04-Oct-10 active
Cla_contact.birthname 270070652 1.11900568/0 System a PIETERSE 03-Oct-10 04-Oct-10 active
attribute primary_key parent_key ext_ref source value first_detected_date last_detected_date customer_status
Cla_contact.initials 272300706 1.11580066/1905523 System b JJ 03-Oct-10 04-Oct-10 active
Cla_contact.initials 272300752 1.11580066/1905569 System b DJ 03-Oct-10 04-Oct-10 active
Cla_contact.initials 272300754 1.11580066/1905571 System b MCA 03-Oct-10 04-Oct-10 active
Cla_contact.initials 272300810 1.11580066/1905627 System b HJM 03-Oct-10 04-Oct-10 active
Cla_contact.initials 272300982 1.11580066/1905799 System b HWA 03-Oct-10 04-Oct-10 active
Cla_contact.initials 272300989 1.11580066/1905806 System b PFM 03-Oct-10 04-Oct-10 active
Cla_contact.initials 272301111 1.11580067/1905928 System b r 03-Oct-10 04-Oct-10 active
Data Quality & Inconsistency Monitor
Total System Overview - Direct Impact
System Overview X % X %
<<amount>> X% ? X%
New compare New compare
Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
System A versus B System X Quality System L versus H
System C versus D System h / K Barrings System K retrospectively
dbA versus dbB System Z Quality System L versus P
System E / F retrospectively Other (a,b,c,d,e,f,) System K retrospectively
Percentage of total customers Overall target %
DQM three-party consultation (“Driehoeksoverleg”) and Tooling
System Originator Business IT/NT Ops Development DQM Fix Tools
System a name name name
System b name name
Out of Scope SLA ’s
IT Service IT
Data Quality improvement and 6 Sigma
DMAIC methodology - How to systematically improve data quality
Define: What is the problem in data quality?
Measure: How big is the data quality problem?
• Direct customer impact?
• Indirect customer impact?
• No customer impact?
Analyze: What is the root cause of the data problem?
• Work around required?
• Communicate solution/work around to
• Create known error (KER)
Improve: What is the fix to the data problem?
• Design and develop data fix
• Implement data fix
Control: How can we make the data improvement sustainable?
• Document new
• Set in place
Cost of poor Data Quality (summary)
In the period from July 13th until August 13th 2009,
‘x amount’ calls to Customer Service were
related to data quality issues.
‘y amount’ of these calls needed a case to
The majority of these calls were related to:
Loyalty (e.g. not receiving gifts or points)
Customers unable to use certain services
(e.g. outgoing calls, service ‘b’)
The total costs for TMNL in 31 days are € z
This means that the direct initial costs on a
yearly basis for TMNL/Customer Service
caused by data quality issues are € y amount
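The slide does not state how the yearly figure is derived from the 31-day total. One plausible reading is a simple linear extrapolation, sketched below; the method, the function name and the example amount are assumptions, since the real amounts are masked in the presentation.

```python
def yearly_cost(period_cost_eur, period_days=31, days_per_year=365):
    """Linearly extrapolate a cost measured over period_days to a year."""
    return period_cost_eur * days_per_year / period_days

# Hypothetical amount, used only to show the arithmetic.
example_period_cost = 31000.0
print(yearly_cost(example_period_cost))
```

A linear extrapolation ignores seasonality in call volumes, so it should be read as a first-order estimate of the direct cost only.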
TMNL Data Quality mission statement 2007
TMNL Mission statement
The goal of Data Quality Management is to initiate, stimulate,
coordinate and support activities that improve and maintain the
quality of data of T-Mobile Netherlands so that data can be trusted
and used to support company business processes internally and
externally in the most efficient and effective way.
Key areas of the function are:
Representing the business interest in data quality for customer, contract and
product data in all parts of T-Mobile Netherlands where data quality is involved, to
ensure that the elements of data management are part of operational processes.
Ensuring that the appropriate tools are in place to measure data quality,
regularly reporting on the status of data quality and making sure that, where there
are problems with data quality or inconsistencies, the appropriate
measures are taken to solve them.
Evangelising the information culture, representing the right behaviours for a mature
information-based organization and raising the profile of Data Quality as a
business issue by making the business value of data quality clear.
Continuously looking for new opportunity areas where data quality can be
improved, and gaining the support from the relevant departments to undertake
new data quality initiatives.
Maturity Models Overview 2010
Maslow’s Hierarchy of Needs (1943): the psychologist proposed a model with 5
levels of human needs
Richard Nolan’s SGM (Stages of Growth Model) 1970 – 1979; maturity of
CMM - Capability Maturity Model for Software (also known as CMM and SW-
CMM) published by Software Engineering Institute (SEI) and Carnegie Mellon
University and defines software development maturity of organizations based on
procedures and processes
CMMI-SE/SW CMM Integration (CMMI) ; successor of CMM
CMM - ITSM; IT Service Management
Data Warehouse Maturity models
VDC Maturity Model - Virtual Data Center (VDC) of tomorrow--the data center
where virtualization technologies work together to deliver applications.
Internet Marketing Maturity Models
Gartner's web analytics maturity model, presented by Bill Grassman at eMetrics
San Francisco, analyzes the vendors themselves in comparison to the
data they can provide.
The Architecture Maturity Model is organised into 5 levels, based on the
Carnegie-Mellon Software Engineering Institute’s Capability Maturity Model for
Maturity Models 2010 (continued)
New Services Maturity Model technology professional services maturity model
The Professional Services Maturity Model
The study has been developed to measure the correlation between process maturity
and service performance excellence.
Project management maturity model
Corporate Sustainability: Capability Maturity Model:
The first step in developing a sustainability program is to assess where your firm is
and where you want it to be on the following five-level corporate sustainability
capability maturity model.
BPM Maturity Model: alignment to a BPM Maturity model helps to ensure that the
overall organisational BPM initiative is in alignment with a solid internal BPM
SOA Maturity Model has become a great foundation for companies worldwide who
have approached application integration using a service-oriented architecture (SOA).
It provides IT decision makers with a simple framework for benchmarking the
strategic value of their SOA planning and implementation—and a model for
visualizing future success.
E-Business Maturity Model
Why do you need a Data Maturity Model?
Data Quality Management
From Reactive to Adaptive Data Management
Our approach to managing data quality is to continue the operational cleaning
started with Phoenix and, in parallel, to establish conceptual data management that
reduces the required cleaning effort
Reactive
Incident & problem management; clean/repair data when problems become visible
• Clean data manually or via script
• Find and fix the root
Proactive
Preventive testing & data inconsistency monitoring in order to proactively identify and correct errors/problems
• ITT and UAT testing
• End to end testing
• Data Acceptance testing
• Data Monitoring
• Create incidents/problems
• Work around scripts
Adaptive
Make sure new projects and changes are in line with business and data model
• Develop business model
• Logical data model
• Technical data model
• Data Distribution matrix
• Glossary of terms
• Data standard
• GUI design standards
• Interface architecture
• Business & validation rules
• Monitor data quality
• KPI’s for data quality in SLA &
Current focus is on reactive data management: troubleshooting when problems get identified
Chart (Jan to Dec, percentage scale up to 6%): Clarify Quality; My T-Mobile; Clarify - BSCS inconsistency
T-Mobile Enterprise Data Management Maturity model 2006
Initial Repeatable Defined Managed Optimizing
People – Who is involved and what contributions must they make?
Process – What activities must be performed?
Technology – What investments in technology must be made?
Risk and Reward – What risks does the organization face at the current stage and what
could it gain from progressing forward?
The IBM model for Data Management Maturity (2008)
Stage 1: Uncertainty (ad hoc); Stage 2: Awakening (repeatable); Stage 3: Enlightenment (defined); Stage 4: Wisdom (managed); Stage 5: Certainty (optimizing)

1. Strategy and Understanding
* Executive Interest
* Alignment of Business and Information Strategy
* Communication on Data Projects
Stage 1: Execs are not aware of data governance. No coordinated information strategy. Projects executed in ad hoc way. No communication on data-projects
Stage 2: IT execs support data management. Limited, informal, talk on data initiatives. Elements of information strategy exist. Initiatives coordinate on stand alone basis.
Stage 3: IT and business teaming on data-related projects. Cross-dept information strategy in place, aligned with business strategy. Regular communications on data-projects & results.
Stage 4: Execs support data governance financially, incl personal emphasis. Benefits tracked; strategy adjusted to maximize benefits and support business priorities.
Stage 5: Execs manage data assets as driver of efficiency, performance and comp. differentiation. Partners support info strategy. Data is 'talk of the town'.

2. Organization
* Business & IT roles in Information Lifecycle
* Data Skills, Learning and Training
Stage 1: Business/IT roles in data management and projects not clearly defined. Inconsistent business participation. Data skills not always available.
Stage 2: Data management roles and responsibilities in business/IT are defined. Data management skills and training available across the IT organization.
Stage 3: Roles & responsibilities assigned, not always executed. Business directing data mgt priorities. Consistent development of data skills.
Stage 4: Strategic business planning leads efforts to bring info innovation into business plans. Deep role-based training on data mgt in business & IT.
Stage 5: Business/IT roles implemented and adaptive. Changing in- and external environments supported by ongoing development of in- and external data skills

3. Processes
* Processes for obtaining information
  Customer, Service, Product Data Definition Processes
* Alignment of Business Processes and Data Mgt
Stage 1: Data collection takes up most of the time. Sources of data often siloed. Information is non-integrated. Changes to data are uncontrolled. No common data definitions.
Stage 2: Some data integration. Controls developed around changes of data definitions. Some common data definitions. Different guidelines and processes around definitions and requirements gathering.
Stage 3: Services-based data apps. Integration of data silos. Key data available. Data management processes rationalized. Common data definitions, shared between business and IT. Controlled
Stage 4: Business process integration via information services. Data is seamless, shared and available throughout processes, enabling process innovation. Definitions shared and centrally managed
Stage 5: In- and external data shared and readily available. Additional sources easily added. High level of standards-centric information definition, creation and use across business and IT.

4. Governance
* Data Governance Org.
* Stewardship & Ownership
* Policies & Procedures
* Data aspects in projects/processes
Stage 1: No data governance organization, policies or standards. Data aspects of business & IT projects seldom linked or addressed. No Data Stewardship. Data owned at departmental level
Stage 2: Stronger, informal governance role and policies exists. Departmental processes address data aspects of/between IT/business projects. Data stewardship and ownership on departmental level.
Stage 3: Governance organization in place. Standard processes to address data aspects of projects. Data Stewardship implemented. Accountability and authority over data definitions and changes coordinated.
Stage 4: Data governance in place, linked to key internal processes. Preventive action. Deliveries of projects that address data aspects are reviewed. Data linked to exec. performance. Policies stored and accessible.
Stage 5: Governance extended to bus. partners. Prevention has main focus. All demand and supply processes address data aspects. Stewardship extended to bus. Partners. Adherence to policies is enforced and trained.

5. Data and Data Quality
* Data Architecture & Standards
* Master Data Management
  DQ Management
* DQ Metrics & Standards
  Metadata Management
Stage 1: Decisions cannot be made due to unreliable data: no quality checks. Ad hoc efforts to meet quality needs. Manual effort to coordinate master data. Capture of metadata when it adds value. No version of the truth.
Stage 2: Data Quality monitored. Preventative data quality processes. Ad hoc correction efforts. Loose, not uniform, master data mgt. Silos of metadata. High level architectural standards. Multiple versions of the truth
Stage 3: Enterprise data architecture developed and managed. Quality requirements governed by business/IT. Processes to validate data quality compliance. Master data owned and controlled across processes and depts. Metadata captured and used consistently.
Stage 4: Flexible data architecture - information as a service. DQ metrics embedded in processes and systems. DQ approach adjusted when bus. strategy changes. Metadata integrated across processes/technologies. Single version of truth.
Stage 5: Partners managed to use data architecture. Master data controlled across bus. partners. DQ meets industry quality standards. Self-healing DQ capabilities. Metadata capturing and exchange with business partners.
Key Elements of Data Maturity
Level 1: Ad Hoc (1998 – 2004)
Executives are not aware of data management
No data management organisation, policies or standards
Ad hoc efforts to meet quality needs (project oriented)
Level 2: Repeatable (2005 – 2009)
Full Time Data Manager (role)
Some common data definitions
Data Quality monitoring “To measure is to know”
Level 3: Defined (2009 – 201x)
Data Quality Management/Governance processes in place
Level 4: Managed
Data Quality Budget
Level 5: Optimized
DQM maturity timelines
Timeline 2002 to 2010, covering Stage 1 “Ad hoc”, Stage 2 “Repeatable” and Stage 3 “Defined”
Data Quality Background
Milestones shown on the timeline:
• Data inconsistency meetings were held
• New Data Quality compares were created for CS
• Financial effect DQ issues made visible
• The first draft for a Customer Data Quality standard was created
• Data inconsistency reports were created
• Data Quality Targets officialized
• First draft on Product- and Contract Data Standard
• Data Manager role was
• Data cleaning was started to prepare for customer data
• Data Manager role
• Standardization in reporting
• Customer Data Standard realized and officialized
• Data Quality Dashboard
• Second Data Manager
• A start was made with
Summary: Why do we need a data maturity model?
You need to know at what stage you are currently and why you are
You can understand the risks associated with undervalued data
It helps you understand the benefits and costs associated with a move to
the next stage
To improve you have to change the entire culture of your
organization – from personnel to technology to management
You can accurately set goals for data maturity (and it takes time)
This will help you to move to the next stage (to-be)
Current Stage + Best practice
Roadmap for mature data
Thank you for your attention.