WHITE PAPER
Data quality:
A key factor for successful master data management
Master Data Management systems, also referred to as MDM sys-
tems, enable a «single version of truth» to be represented for a
wide range of company data.
This means that all the relevant data systems of a company and
their contents are consolidated, so that searching for information in
several different, isolated systems becomes unnecessary. Irrespec-
tive of which MDM system architecture has been chosen, both the
high quality of the content data and the meta data are key factors
for the success of the project. This applies to the implementation
phase and the hoped-for increased efficiency of use of the com-
pany data.
The following White Paper describes the importance of high data
quality in the MDM environment. In this respect, all the phases of a
system which is being planned or is to be implemented are consid-
ered. Furthermore, appropriate suggestions for (a) optimizing the
data quality and (b) keeping it at a high level during day-to-day
operation are also provided for existing MDM systems.
All company and product names and logos used in this document are trade names and/or registered trademarks of the
respective companies.
Contents
Master Data Management:
what exactly does it mean?
Data quality and
master data management
The quality of the meta data
The quality of the content data
The Data Quality Life Cycle
Master data management systems:
efficiency in use
Uniserv Data Quality Products:
effective options for
integration in your MDM system
It’s time to get on board:
The Data Quality Audit
List of references
Master Data Management:
what exactly does it mean?
One of the great challenges in the corporate
environment nowadays is effective master data
management (MDM). The precise meaning of this
term can be explained best of all by the following
definition of David Loshin (2009):
«[MDM is …] envisioning how to organize an enterprise view of the
organization's key business information objects and to govern their
quality, use, and synchronization to optimize the use of information
to achieve the organization's operational and strategic business
objectives.»

The aim is to obtain the so-called «single version of truth»: only
one system has to be consulted to display all the relevant company
data for the aspect in question. Searching for information across
different systems becomes unnecessary, and working with the data is
more efficient.
From this, it is clear that MDM is not just a software solution but,
more importantly, a question of how company data is organized and
defined internally. In a final step, this data is synchronized in
various ways and made available to the end user. Depending on the
complexity of a company's existing systems and data, the successful
implementation of an MDM system can take several months or even
years.
THE IMPLEMENTATION IS TYPICALLY DIVIDED INTO DIFFER-
ENT PHASES:
–– First of all, the actual master data must be identified
and a list drawn up of the systems in which it is stored.
–– The next step is to agree on a Master Data
Object Model and ensure that the meta data
is consistent.
Once these fundamental aspects have been clari-
fied, the various requirements for the data are
decided on. Firstly, the requirements of the data users in the
operative area must be considered. However, analytical demands also
have to be taken into account in the implementation if MDM is to be
the basis of a Data Warehouse or of Business Intelligence analyses.
Further important points are issues such as data
protection or compliance with anti-terrorism regula-
tions. In short, all the legal, operative and analyti-
cal requirements for the data must be considered.
Moreover, the data should be available in an
appropriate quality, so that the content require-
ments for the data can be satisfied.
Setting up a master data management system always
means implementing a company-wide project.
Projects of this size demand a great deal of brain-
work from all concerned, especially if they involve
the creation of the Master Data Object Model or the
standardization of the meta data. The current trend is
to call in external consultants who can liaise between
the various stakeholders and departments and the
personnel responsible for the various data sources
with a degree of detachment in these phases.
Data quality and
master data management
One issue which runs through all phases of
master data management is data quality. This
concerns very basic aspects of data governance,
i.e. specific solutions for the master meta data,
standardized data formats, standardized refer-
ence keys, databases which are duplicate-free
as far as possible, postally correct addresses of
customers and suppliers and much more.
It also concerns the appropriate data quality
measures for each phase of the MDM project
and the subsequent productive system.
No matter from which aspect an MDM system is considered, high data
quality is critical for its success and therefore also for its
efficient use in the company.
What does high data quality mean here, precisely? Firstly, it is
certainly important that the data contents are correct, in order to
pass the right information on to the end user in the operative or
analytical business. However, basic qualities such as
unambiguousness, completeness, timeliness and editability (among
others) also play a big part.
On the one hand, many of the attributes stated
here can be obtained by means of an appropriate
system architecture which enables the relevant busi-
ness rules to be mapped. On the other hand, characteristics of the
data itself, such as accuracy and objectivity, determine whether it
is of high quality.
Two different types of data can be distin-
guished in MDM:
–– The meta data. This is data which is used to
define the attributes of the content data or the
actual user data. This also includes correct
linking of the fields of different data sources,
description of the reference keys, etc.
–– The content data. Both the quality of the contents and the
syntactic quality are at issue here. The correct address of a
contact is an example, as are a standardized date format or a
duplicate-free database of master data.
Both data types are equally crucial for data quality: the high (or
low) quality level of one directly affects the level of the other.
Regardless of whether an MDM system has been
implemented or an MDM project is planned for the
future, there are various possibilities for influencing
the quality of the data at the different levels. The
quality of the meta data and the quality of the con-
tent data will be considered separately to provide
a clearer overview.
The quality of the meta data
As mentioned above, meta data is data which is
used to describe the attributes of the content data (user
data). This includes the field names as well as value
ranges or specified data formats. In a wider sense,
this category also includes information which makes
statements about data models, e.g. the linking of
tables in a database.
THE FOLLOWING EXAMPLES PROVIDE AN OVERVIEW OF
POSSIBLE PROBLEMS IN THE AREA OF META DATA.
–– If a large number of different data sources
are involved, it is often noticeable that fields
have the same names but contain different
data or information. In the opposite case,
different data sources have the same data
contents and information, but the field name
varies. Figure 1 illustrates this.
Field names

Data source 1      Data source 2
First name         Name
Last name          Name
Street             Address field 1
Postcode           Address field 2
Place              Address field 3
Telephone 1        Phone, business
Telephone 2        Mobile

Fig. 1: The different field names of the two data sources refer to
the same data. The meta data of data source 2 is ambiguous and leaves
a great deal of room for interpretation. The same applies to the
telephone data in data source 1.
–– Agreement on data formats is just as important.
Dates can be displayed in a variety of formats:
System 1: 2010-06-11
System 2: 10-06-11
System 3: 11th June 2010
System 4: 11.6.2010
The format in System 2 allows two interpretations: 10th June
2011 or 11th June 2010 (a normalization sketch follows this list).
–– The complexity of the meta data and the
meaning of field contents is illustrated using
the example of the field “Customer status”.
Meaning in system 1:
Customers are those who
have received adverti-
sing material.
Meaning in system 2:
Customers are those who
have registered on the
company website.
Meaning in system 3:
Customers are those who
have paid an invoice.
(Example record from the accompanying figure: Roland Pfeiffer,
Rastatter Str. 13, 75179 Pforzheim, telephone 07231 936-1000.)
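Picking up the date-format example above, here is a minimal Python
sketch (the format list is an illustrative assumption) that
normalizes unambiguous dates to ISO 8601 and refuses to guess on the
ambiguous two-digit form used by System 2:

```python
from datetime import datetime

# Candidate formats, matching the unambiguous examples above
# (illustrative assumption; System 3's «11th June 2010» would need
# its own parser and is left out here).
FORMATS = [
    "%Y-%m-%d",   # System 1: 2010-06-11
    "%d.%m.%Y",   # System 4: 11.6.2010
]

def normalize_date(raw: str) -> str:
    """Return an ISO 8601 date string, refusing to guess on ambiguity."""
    parts = raw.split("-")
    # "10-06-11" (System 2) could be 2010-06-11 or 2011-06-10.
    if len(parts) == 3 and all(len(p) == 2 for p in parts):
        raise ValueError(f"ambiguous two-digit date: {raw!r}")
    for fmt in FORMATS:
        try:
            return datetime.strptime(raw, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {raw!r}")

print(normalize_date("2010-06-11"))   # -> 2010-06-11
print(normalize_date("11.6.2010"))    # -> 2010-06-11
```

Refusing to guess, rather than silently picking one reading, is the
safer default: an ambiguous value should be routed to a clarification
step instead of entering the master data.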
Data profiling is recommended, in order to obtain
a comprehensive overview of the data, especially
of the meta data of the source systems. In this
respect, the meta data is identified in a reverse
engineering process on the basis of content data.
Profiling software provides an overview of the meta
data in a very short time and uses a «drill-down»
function to provide an overview of the data itself.
The number of fields, their name, content type,
value range, primary key potential and much
more besides can be quickly viewed and ana-
lyzed. Ideally, further dependencies between
the source data tables can also be identified by
means of join and dependency analyses. Finally,
ongoing problems of the data or the filling of the
data fields in the source systems can be inves-
tigated in the profiling. The problems identified
here often cannot be remedied with automatic
cleansing at the press of a button but require cus-
tomization of the processes instead.
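As a rough illustration of what profiling software computes, the
following sketch (plain Python over hypothetical records; real tools
work directly on the source databases) derives fill rate,
distinct-value count and primary-key potential per field:

```python
from collections import Counter

def profile(records: list[dict]) -> dict:
    """Derive simple per-field statistics from a list of records."""
    fields = {key for record in records for key in record}
    stats = {}
    for field in sorted(fields):
        filled = [r[field] for r in records if r.get(field) not in (None, "")]
        distinct = Counter(filled)
        stats[field] = {
            "fill_rate": len(filled) / len(records),
            "distinct_values": len(distinct),
            # A field that is always filled and never repeats a value
            # is a primary-key candidate.
            "key_candidate": len(filled) == len(records) == len(distinct),
        }
    return stats

# Hypothetical extract from a source system
rows = [
    {"ID": "0815", "Place": "Pforzheim"},
    {"ID": "0815", "Place": "Forzheim"},
    {"ID": "4711", "Place": ""},
]
for field, s in profile(rows).items():
    print(field, s)   # «ID» is not a key candidate: 0815 occurs twice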
After the structure of all data sources which are to
fill the master data management system has been
ascertained, a Master Meta Data Model must
be chosen. This model describes the data in the MDM system; by this
point, agreement on the meaning of the field names and field contents
has been reached across the company. The creation
of a Master Meta Data Model is important for
the data quality, because it specifies which fields
of the data sources fill which fields in the MDM
system and which information the fields contain.
[Figure: Representation of the standardized meta data in the MDM
system and the mapping from the source systems. Fields such as FIRST
NAME, LAST NAME and PRODUCT NAME in the MDM system are filled from
data source 1 (CRM: first name, last name, street, postcode, place),
data source 2 (ERP: product no., product name, price, ...) and data
source 3 (ordering: name 1, name 2, product, amount, ...).]
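At its core, such a model is a mapping table from source fields to
master fields. The sketch below shows one hedged way to express and
apply it in Python; the field names come from Figure 1, while the
MAPPING structure and helper are hypothetical:

```python
# Source-to-master field mapping (field names from Fig. 1; the
# structure and helper below are illustrative assumptions).
MAPPING = {
    "source1": {
        "First name": "first_name", "Last name": "last_name",
        "Street": "street", "Postcode": "postcode", "Place": "place",
    },
    "source2": {
        # «Name» is ambiguous in data source 2 (see Fig. 1); mapping it
        # to last_name is a project decision that must be agreed on.
        "Name": "last_name",
        "Address field 1": "street",
        "Address field 2": "postcode",
        "Address field 3": "place",
    },
}

def to_master(source: str, record: dict) -> dict:
    """Rename a source record's fields to the master model's fields.
    Unmapped fields are dropped here; in practice they would be
    reported so that the model can be extended."""
    mapping = MAPPING[source]
    return {mapping[k]: v for k, v in record.items() if k in mapping}

print(to_master("source2", {"Name": "Pfeiffer", "Address field 3": "Pforzheim"}))
# -> {'last_name': 'Pfeiffer', 'place': 'Pforzheim'}
```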
The quality of the content data

The quality of the content data is just as important as the quality
and unambiguousness of the meta data in a Master Meta Data Model.
Even if the format of the content data matches and the mandatory
fields have been filled, this says nothing about the quality of the
data. The following example illustrates this aspect:

Master Meta Data Model   Data record 1       Data record 2      Data record 3
ID (Primary key)         0815                0815               0815
First name               M.                  Roland             Max
Last name                Diener              Pfeiffer           Mustermann
Street & house number    Rastaterstrasse 14  Rastatter Str. 13  Sommer-Strasse
Postcode                 75197               75197              12345
Place name               Forzheim            Pforzheim          Neustadt
Telephone, business      07231 936-0         07231 936 0        -
Fax, business            -                   info@uniserv.com   -

The example shows that even if the meta data is unambiguous, the
quality of the content data can leave much to be desired. Contrary to
expectations, the ID used as the primary key is ambiguous. Although
the street line is always filled, the streets are either spelled
incorrectly or do not exist; the same applies to the postcodes and
the place names. And although the last field of the table should
contain an unambiguous fax number, an e-mail address is found there.
The e-mail address was obviously not considered to be part of the
master data: either no field was provided for this information, or no
syntactic validation took place when the data was created.

The example may appear trivial, but automatic cleansing is already
available, particularly in the area of addresses, both in batch
processing and in online mode. This automated address cleansing can
be used, for example, to validate that an address is correct. It is
also possible to arrange duplicate or similar data records in groups,
to attach information from one data record to another, and to
eliminate clear duplicates if required, or at least to mark them as
such.
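One simple way to surface duplicate candidates like data records 1
and 2 above is to group on a normalized match key; production tools
use genuinely error-tolerant matching, so treat this Python sketch
(with a hypothetical phone field) as a minimal illustration:

```python
import re
from collections import defaultdict

def phone_key(record: dict) -> str:
    """Reduce a phone number to its digits, so that «07231 936-0» and
    «07231 936 0» produce the same key."""
    return re.sub(r"\D", "", record.get("phone", ""))

def duplicate_groups(records: list[dict]) -> list[list[dict]]:
    """Group records that share a normalized phone key; every group
    with more than one member is a duplicate candidate for review."""
    groups = defaultdict(list)
    for record in records:
        key = phone_key(record)
        if key:   # records without a phone number cannot be grouped
            groups[key].append(record)
    return [group for group in groups.values() if len(group) > 1]

records = [
    {"last_name": "Diener",     "phone": "07231 936-0"},
    {"last_name": "Pfeiffer",   "phone": "07231 936 0"},
    {"last_name": "Mustermann", "phone": ""},
]
print(duplicate_groups(records))   # Diener and Pfeiffer form one group
```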
The Data Quality Life Cycle
One of the challenges is the continuous operation of the MDM system.
The data is never isolated or at a standstill; instead, all the data
is linked together and continually on the move.
With regard to the data quality, periodic cleansing
of the customer data is only possible to a very
limited extent. Each update, no matter whether it is
a measure for optimization of the data quality
across all the data or concerns the input of
additional data records (e.g. after the purchase of
further customer data or after the merger of two
companies), is inevitably carried out with the
system in operation.
It is therefore all the more important that all the data
records entered in the master data management
system meet a certain quality standard before they
are entered.
THE UNISERV DATA QUALITY LIFE CYCLE SHOWS WHAT
THIS COULD LOOK LIKE:
1ST STEP :
The relevant master data in the various source
systems must be identified first of all. These data
sources can then be analyzed by means of
profiling tools, which Uniserv also offers.
Problems at the meta data level and in the data
itself can thereby be identified and measures to
remedy them initiated.
2ND STEP :
In a further step, the (initial) cleansing takes
place on the basis of the knowledge gained in
the profiling. All the content data is first of all
validated for correctness here.
In the case of data from the address environment, this
means that the addresses are validated and corrected
and supplemented as required. Another component of
the initial cleansing could be a duplicate search or the
arrangement of identical or at least similar data records
from the same or additional data sources in groups.
Consolidation of the data records or enhancement
with (external) additional information are also possible.
Various parameters can be used to group or link data
records from different data sources. Identical or similar
names and / or addresses would be conceivable.
Merging data records from different data sources (e.g.
customer data with information from the ERP system)
can also be carried out by means of unambiguous
keys (identifiers).
3RD STEP :
The quality of the data records is validated at the front-
end of the data sources during data entry in a real-time
check (also referred to as a Data Quality Firewall). The
correctness and completeness of the address data is
verified first of all. In a further step, it is also checked
whether the system already contains a data record with
the same contents. High performance of these validations is essential
in the real-time check: if they take too much time, users bypass
them, and the master data management system risks being
«contaminated» by unvalidated data.
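Conceptually, a Data Quality Firewall is a gate function in front of
every insert. The skeleton below is an illustrative sketch, not the
Uniserv API: both validators are stand-ins for real
address-validation and duplicate-search services.

```python
def is_address_complete(record: dict) -> bool:
    # Stand-in for a real postal validation service.
    return all(record.get(f) for f in ("street", "postcode", "place"))

def find_similar(record: dict, store: list[dict]) -> list[dict]:
    # Stand-in for an error-tolerant duplicate search.
    return [r for r in store
            if r.get("last_name") == record.get("last_name")
            and r.get("postcode") == record.get("postcode")]

def firewall_insert(record: dict, store: list[dict]) -> str:
    """Admit a record into the master data store only if it passes
    the quality gates."""
    if not is_address_complete(record):
        return "rejected: incomplete address"
    if find_similar(record, store):
        # An interactive system would show the candidates and let the
        # user confirm; this sketch simply refuses the insert.
        return "rejected: possible duplicate"
    store.append(record)
    return "accepted"

store: list[dict] = []
print(firewall_insert({"last_name": "Pfeiffer", "street": "Rastatter Str. 13",
                       "postcode": "75179", "place": "Pforzheim"}, store))
print(firewall_insert({"last_name": "Pfeiffer", "street": "Rastatter Str. 13",
                       "postcode": "75179", "place": "Pforzheim"}, store))
# -> accepted, then rejected: possible duplicate
```

As the text stresses, such a gate must answer within milliseconds; a
check that is too slow will simply be bypassed.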
4TH STEP :
Even with the greatest possible care at data entry, permanent
monitoring of the data quality in the MDM system is advisable.
Business rules specified in the business processes can be kept under
surveillance through skilful monitoring. If previously specified
threshold values of the Key Performance Indicators (KPIs) are
exceeded, countermeasures can be implemented in good time, in order
to guarantee a permanently high quality of the master data.
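Such monitoring boils down to evaluating KPIs against thresholds on a
schedule. The sketch below uses hypothetical KPI names and threshold
values; a real monitor would compute the metrics from the MDM
database and send notifications:

```python
# Hypothetical KPIs: name -> (current value, allowed threshold).
def check_kpis(kpis: dict[str, tuple[float, float]]) -> list[str]:
    """Return an alert for every KPI that exceeds its threshold."""
    return [
        f"ALERT {name}: {value:.1%} exceeds threshold {limit:.1%}"
        for name, (value, limit) in kpis.items()
        if value > limit
    ]

kpis = {
    "duplicate_rate":        (0.031, 0.020),   # fires
    "invalid_address_rate":  (0.008, 0.050),
    "missing_postcode_rate": (0.004, 0.010),
}
for alert in check_kpis(kpis):
    print(alert)
```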
5TH STEP :
Especially when larger volumes of data are
repeatedly added to the existing master data,
periodic cleansing measures with the following
objectives are recommended:
–– To detect errors which have entered the system in spite of the
Data Quality Firewall, either because not all the rules have been
mapped there or because the warnings for these rules were overridden
by user intervention (e.g. a suspected duplicate with a low
probability).
–– The data is subject to an ageing process, particularly in the
case of customer MDM systems. Towns are merged in local government
reorganizations, postcodes change, streets are renamed and
individuals move house, often without the person or the company
informing you. As a result, the error rate in your MDM system
typically increases by 1-2 percent of records per month. Periodic
cleansing counteracts this ageing (the compounding effect is
quantified in the sketch below).
It should be closely examined which master data is
affected. It is also important that existing links between
the data records of different data sources or tables are
not jeopardized in the periodic cleansing measures.
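The quoted 1-2 percent per month compounds over time, which is why
purely annual cleansing tends to come too late. A quick
back-of-the-envelope calculation:

```python
# Share of records still correct after n months at a given monthly
# decay rate (the text above quotes 1-2 percent per month).
def still_correct(monthly_rate: float, months: int) -> float:
    return (1 - monthly_rate) ** months

for rate in (0.01, 0.02):
    stale = 1 - still_correct(rate, 12)
    print(f"{rate:.0%} per month -> {stale:.1%} stale after one year")
# 1% per month -> 11.4% stale after one year
# 2% per month -> 21.5% stale after one year
```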
In this sequence, the individual steps of the Data
Quality Life Cycle create a sound basis as a whole
for bringing the quality of the master data of a
company to a high level and keeping it there in the
long term. Nevertheless, master data management
systems are living systems. The concept of «Cycle»
(i.e. recurring) should therefore always be
considered, especially against the background of
changes and amendments to business rules and
the increase or modification of the demands made
on the data.
[Figure: The closed Data Quality Cycle. 1. Profiling: implementation
of data profiling and investigation of the data. 2. Cleansing:
analysis of the data quality and cleansing of customer, transaction,
order, financial and statistical data. 3. Real-time check: securing
the data quality directly at input. 4. Monitoring: continuous
monitoring of the data quality and of compliance with the business
rules for transaction, order, financial and statistical data.
5. Maintaining: integration of external data, provision of data for
external systems, application of change reports from third-party
companies (anti-ageing).]

The process for data quality in master data management consists of
the initial clean-up with the steps profiling (1) and cleansing (2),
followed by a real-time check at data entry (3), and monitoring (4)
and maintaining (5) in a closed loop.
Master data management systems: efficiency in use
Two points in particular must be considered in
order to ensure high efficiency when handling the
master data:
–– First of all, error-tolerant searching for data
records should be mentioned. A search for the
same or supposedly the same data records is
activated when a new data record is created.
The search can be actively started at the press
of a button. However, the search also runs in
the background without it being actively start-
ed. If the system finds identical data records or potential
duplicates, the similar records are displayed to the end user in the
form of a select list, so that they can decide whether the customer
being entered is actually new or already exists.
–– Another important factor is the speed of this
error-tolerant search. The end user depends on quick response times
from the system: if users have to wait too long for the results of a
duplicate check, they bypass it, and the quality of the master data
deteriorates.
However, error-tolerant searching for data records
is not only an important criterion in initial data crea-
tion. The search must also take place quickly and
provide reliable results when e.g. data records are
searched for in a service centre.
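Error-tolerant searching means ranking candidates by similarity
instead of demanding exact matches. The minimal sketch below uses
Python's standard-library difflib as a stand-in for a production
matching engine:

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Similarity between two strings in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def search(query: str, records: list[dict], threshold: float = 0.75) -> list[dict]:
    """Return records whose name resembles the query, best match first."""
    scored = [(similarity(query, r["name"]), r) for r in records]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [r for score, r in scored if score >= threshold]

records = [{"name": "Roland Pfeiffer"}, {"name": "Max Mustermann"}]
# The typo «Pfeifer» still finds the existing record, so the user can
# decide that this is not a new customer.
print(search("Roland Pfeifer", records))
```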
One characteristic of an MDM system is that
the data stream flows in two directions: data is
transferred to the system (e.g. initial creation of a
customer record) and data is retrieved from the
system by the user. Needless to say, this data
also has to be correctly linked to all the other
data in the respective context.
It must therefore be guaranteed that when the
data of the customer Roland Pfeiffer is retrieved,
the system not only supplies his contact data but
also e.g. the products which he has purchased at
a certain price at a given point in time, any com-
plaints he has made and which advertising mate-
rial he has received (Single View of Customer).
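Technically, the Single View of Customer is a keyed join across data
domains at retrieval time. The sketch below is a deliberately tiny
in-memory version with hypothetical tables; in a real MDM system
these would be database joins or service calls:

```python
# Hypothetical, deliberately tiny «tables», all linked by customer_id.
customers  = {42: {"name": "Roland Pfeiffer", "place": "Pforzheim"}}
orders     = [{"customer_id": 42, "product": "Product A", "price": 99.0}]
complaints = [{"customer_id": 42, "text": "late delivery"}]
mailings   = [{"customer_id": 42, "campaign": "spring catalogue"}]

def single_view(customer_id: int) -> dict:
    """Assemble everything known about one customer in one structure."""
    view = dict(customers[customer_id])
    view["orders"]     = [o for o in orders     if o["customer_id"] == customer_id]
    view["complaints"] = [c for c in complaints if c["customer_id"] == customer_id]
    view["mailings"]   = [m for m in mailings   if m["customer_id"] == customer_id]
    return view

print(single_view(42))
```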
Uniserv Data Quality Products: effective options for integration in
your MDM system
Where high data quality and the associated efficiency of the MDM
system are concerned, it is irrelevant which architecture is chosen.
The quality of the data should always be at a high level, no matter
whether the basic approach of a listing variant, a completely
consolidated master data set or an intermediate stage of the two was
selected. The most interesting question in this respect is how
quality assurance measures can be integrated most effectively into
the data stream.

Using the Data Quality Explorer in the planning phase of the master
data management project is a good idea. The profiling of the
different data sources described above can be carried out quickly and
simply by means of the DQ Explorer. Well-developed workflow functions
in a client/server architecture enable long-term optimization of the
source data systems. Any irregularities can be attached to the
respective data records in the form of notes, and jobs concerning
problematic data can be assigned to personnel by e-mail. Continuous
optimization and the preparation for the initial transfer of the data
to the MDM system can thereby be pursued over the term of the
project.

The Data Quality Monitor supports the monitoring of KPIs, so that
optimizations of the source systems can be checked by means of
business rules. It can also be used in existing master data
management systems. Monitoring jobs can be configured very easily and
started automatically; automatic notification of the status of a
monitoring job is also possible.

Where the initial consolidation of different systems and an initial
cleansing of the data are concerned, the Uniserv Data Quality Batch
Suite provides an excellent option for preparing the data of several
source systems for input into the MDM system.
In the DQ Batch Suite, addresses are not only validated for postal
correctness and, if necessary, corrected; the data of the various
source systems can also be arranged in groups and sorted in an
individual process tailored to the respective business rules.
After the data has been grouped, it can be consolidated. This
guarantees that no important data is lost and that the data sources
are appropriately linked. Needless to say, primary keys can also be
generated for the subsequent matching on retrieval of the data.
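Consolidation after grouping typically means building one «golden
record» per duplicate group plus a new primary key, while keeping the
source keys so that nothing is lost. A sketch of that survivorship
step, under the assumption of a simple first-non-empty rule:

```python
import itertools

_next_master_id = itertools.count(1)

def consolidate(group: list[dict]) -> dict:
    """Merge one group of duplicate records into a golden record.
    Survivorship rule (an illustrative assumption): per field, take
    the first non-empty value in group order."""
    golden = {
        "master_id": next(_next_master_id),
        # Keep the source keys so the links back to the original
        # systems are not lost.
        "source_ids": [record.get("id") for record in group],
    }
    fields = {f for record in group for f in record if f != "id"}
    for field in sorted(fields):
        golden[field] = next(
            (record[field] for record in group if record.get(field)), None)
    return golden

group = [
    {"id": "crm-7",  "last_name": "Pfeiffer", "place": ""},
    {"id": "erp-12", "last_name": "Pfeiffer", "place": "Pforzheim"},
]
print(consolidate(group))
# -> {'master_id': 1, 'source_ids': ['crm-7', 'erp-12'],
#     'last_name': 'Pfeiffer', 'place': 'Pforzheim'}
```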
Master data management systems are also characterised by the fact
that data records are repeatedly changed through initial data
creation, supplementary information or corrections. The system
architecture must also guarantee that a requested data record is
returned quickly.

One concept for executing these changes and data calls is the
integration of a Service-Oriented Architecture (SOA), through which
business rules can be implemented across the participating
applications. With respect to data quality and the use of the Uniserv
Real-Time Services, this means that duplicate data records can be
prevented and address data is corrected automatically.
It's time to get on board: The Data Quality Audit

The Uniserv DQ Audit can be used to make statements about the status
quo of the in-house data. The audit is the instrumental first step
towards reliable decisions and marks your personal introduction to
the project «Data Quality in your MDM system».

During the audit, the quality of the addresses in particular is
evaluated with the support of the data quality tools from Uniserv. In
a second step, the possible causes of deficient data quality can be
traced to their roots in a process analysis. So the best thing to do
is to contact us right away!

For further information about MDM, please visit our web page
www.uniserv.com/MDM or contact us directly. We look forward to
advising and supporting you throughout your project.

List of references

–– Loshin, D. 2009. Master Data Management. Morgan Kaufmann OMG
Press. 274 pp. ISBN 978-0-12-374225-4
Uniserv
Uniserv is the largest specialised supplier of data quality solutions in Europe
with an internationally usable software portfolio and services for the quality as-
surance of data in business intelligence, CRM applications, data warehousing,
eBusiness and direct and database marketing.
With several thousand installations worldwide, Uniserv
supports hundreds of customers in their endeavours to
map the Single View of Customer in their customer data-
base. Uniserv employs more than 110 people at its head-
quarters in Pforzheim and its subsidiary in Paris, France,
and serves a large number of prestigious customers in all
sectors of industry and commerce, such as ADAC, Al-
lianz, BMW, Commerzbank, DBV Winterthur, Deutsche
Bank, Deutsche Börse Group, France Telecom, Green-
peace, GEZ, Heineken, Johnson & Johnson, Nestlé,
Payback, PSA Peugeot Citroën as well as Time Life and
Union Investment.
Further information is available
at www.uniserv.com
Experience: over 40 years. Market position: largest European
supplier. Employees: more than 110 people.

Application areas: direct marketing, BI/BDW, CPM, CRM, ERP,
e-commerce, data migration projects, SOA, on-premise/on-demand,
MDM/CDI, compliance.
UNISERV GmbH
Rastatter Straße 13 • 75179 Pforzheim • Germany • T +49 7231 936-0 •
F +49 7231 936-3002 • E info@uniserv.com • www.uniserv.com
© Copyright Uniserv • Pforzheim/Germany • All rights reserved.
