Data quality management

In this file, you can find useful information about data quality management, such as data quality
management forms, tools for data quality management, and data quality management strategies. If
you need more assistance with data quality management, please leave a comment at the end of
the file.
Other useful material for data quality management:
• qualitymanagement123.com/23-free-ebooks-for-quality-management
• qualitymanagement123.com/185-free-quality-management-forms
• qualitymanagement123.com/free-98-ISO-9001-templates-and-forms
• qualitymanagement123.com/top-84-quality-management-KPIs
• qualitymanagement123.com/top-18-quality-management-job-descriptions
• qualitymanagement123.com/86-quality-management-interview-questions-and-answers
I. Contents of data quality management
==================
Many companies struggle to manage the cyclical data quality process. A majority of
organizations use only a fraction of their enterprise information to gain the kind of
actionable insight needed to facilitate superior business performance. Additionally, they
fail to realize the substantial cost associated with the presence of subpar, inaccurate and
inconsistent data.
The significant amount of revenue that is lost to bad information compels a shift in data
quality strategies from occasional data cleansing to an ongoing cycle of data quality
created by incorporating governance plans. Data governance is a continuous quality
improvement process, embraced at all levels of the organization, to filter bad information
by defining and enforcing policies and approval procedures for achieving and
maintaining data quality.
Below are five best practices for data governance and quality management. These best
practices are being leveraged by companies that have successfully achieved -- and
benefited from -- peak data quality in their enterprise.
Conduct a Data Quality Assessment
Start tackling your data quality management problems by performing a complete analysis
of the current state of your data. Information with errors, inconsistencies, duplicates or
missing fields can often be difficult to identify and correct. That's because bad data can
be buried deep within legacy systems, or is received from external sources such as third-
party data providers, external applications and social media channels like Facebook and
Twitter.
An independent analysis will provide the organization with an in-depth report that
includes accurate and detailed statistics about the quality of the organization’s data. The
business can then formulate or refine a data quality management strategy tailored to its
unique organizational needs, and develop governance policies that address specific data
management requirements.
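As a rough illustration of what such an assessment measures, the sketch below profiles a small list of records for missing fields and duplicates. The function name profile_records, the field names and the sample customers are all hypothetical and chosen only for this example; a real assessment would cover many more checks.

```python
from collections import Counter

def profile_records(records, required_fields, key_fields):
    """Return simple quality statistics for a list of dict records."""
    # Count records where a required field is absent, None, or blank.
    missing = Counter()
    for rec in records:
        for field in required_fields:
            value = rec.get(field)
            if value is None or str(value).strip() == "":
                missing[field] += 1
    # Treat records as duplicates when their key fields match after
    # trimming whitespace and ignoring case.
    keys = Counter(
        tuple(str(rec.get(f, "")).strip().lower() for f in key_fields)
        for rec in records
    )
    duplicates = sum(count - 1 for count in keys.values() if count > 1)
    return {
        "total": len(records),
        "missing_by_field": dict(missing),
        "duplicate_records": duplicates,
    }

# Hypothetical sample data, for illustration only.
customers = [
    {"id": 1, "name": "Ann Lee", "email": "ann@example.com"},
    {"id": 2, "name": "ann lee ", "email": "ANN@example.com"},
    {"id": 3, "name": "Bo Chen", "email": ""},
    {"id": 4, "name": "Cy Diaz", "email": None},
]
report = profile_records(customers, ["name", "email"], ["name", "email"])
print(report)
```

Statistics like these give the business a starting point for deciding which governance policies to write first.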
Build a Data Quality Firewall
Data is a strategic information asset, and the organization should treat it as such. Like any
other corporate asset, the data contained within the organization's information systems
has financial value. The value of the data increases and correlates to the number of people
who are able to make use of it. Feeding inaccurate data into your data warehouse or
mastering systems will not only make it difficult to obtain clear business insights and
gather actionable information, it will also damage good data.
A virtual data quality firewall detects and blocks bad data at the point it enters the
environment, acting to proactively prevent bad data from polluting enterprise information
sources. A comprehensive data quality management solution that includes a data quality
firewall will dynamically identify invalid or corrupt data as it is generated or as it flows
in from external sources, based on pre-defined business rules.
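The idea of blocking bad data at the point of entry can be sketched as a set of pre-defined rules applied to each incoming record. This is a minimal sketch, not a description of any particular product: the rule names, the simplified e-mail pattern and the age range are assumptions made for the example.

```python
import re

# Deliberately simple pattern; real e-mail validation is more involved.
EMAIL_PATTERN = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")

def valid_email(value):
    return isinstance(value, str) and EMAIL_PATTERN.fullmatch(value) is not None

def valid_age(value):
    return isinstance(value, int) and 0 <= value <= 130

# Hypothetical business rules: field name -> predicate on the field value.
RULES = {"email": valid_email, "age": valid_age}

def firewall(records, rules=RULES):
    """Split records into accepted ones and rejected ones with reasons."""
    accepted, rejected = [], []
    for rec in records:
        failures = [name for name, ok in rules.items() if not ok(rec.get(name))]
        if failures:
            rejected.append((rec, failures))  # keep reasons for later review
        else:
            accepted.append(rec)
    return accepted, rejected

incoming = [
    {"email": "a@b.com", "age": 30},
    {"email": "not-an-email", "age": 30},
    {"email": "c@d.org", "age": -5},
]
accepted, rejected = firewall(incoming)
print(len(accepted), "accepted;", len(rejected), "rejected")
```

Keeping the failed rule names next to each rejected record lets data stewards correct the source rather than silently dropping data.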
Unify Data Management and Business Intelligence
Even the best data governance policies are not enough on their own to protect
data. The sheer volume of data that flows through enterprise systems can make it
particularly challenging to maintain peak data quality at all times. It simply isn't possible
to manage quality record-by-record, or to attempt to govern every piece of data that is
collected by an organization. The key to success is to identify and prioritize the type and
volume of data that requires data governance.
Business intelligence (BI) solutions allow organizations to determine which data sets are
most likely to be utilized and should be targeted for quality management and governance.
Astute data management processes can then be used to collect that data -- for example,
customer preferences or purchasing information -- and move it to a repository for
cleansing and analysis as a high priority.
Make Business Users Data Stewards
Advanced organizations realize business professionals need to take ownership of the data
they are helping to create and feed into IT systems. This has prompted many companies
to create a data governance role to manage data quality from end-to-end.
The data governance director is typically chosen from a business group, and is the
primary focal point for all data-related needs within that group. Some organizations have
multiple roles for data governance to represent different areas of the business. These data
overseers take a leadership role in resolving data integrity issues, and act as liaisons with
the IT group that manages the underlying information management infrastructure.
Create a Data Governance Board
The primary objective for instituting a data governance board is to mitigate business risks
that arise from highly data-driven decision-making processes and systems in the current
business environment. These boards include business and IT users and are responsible for
setting data policies and standards, ensuring that there is a mechanism for resolving
data-related issues, facilitating and enforcing data quality improvement efforts, and taking
proactive measures to stop data-related problems before they occur.
Wrapping up
Successful data governance starts with a solid, well-defined data management strategy,
and relies upon the selection and implementation of a cutting-edge data quality
management solution. The key to effective data quality management is to create data
integrity teams, comprised of a combination of IT staff and business users, with business
users taking the lead and maintaining primary ownership for preserving the quality of any
incoming data.
While data integrity teams will drive the data quality management plan forward, it is also
important to have a comprehensive data quality management solution in place. This will
make the strategy more effective by enabling data governance professionals to profile,
transform and standardize information.
To best support data quality goals, the quality management solution should be
Web-enabled and must be intuitive to use so operational business users can play a vital role in
data governance activities. When data strategy and governance are led from a business
perspective and enabled by a complete solution, true data integrity can be ensured across
the organization.
==================
II. Quality management tools
1. Check sheet
The check sheet is a form (document) used to collect data
in real time at the location where the data is generated.
The data it captures can be quantitative or qualitative.
When the information is quantitative, the check sheet is
sometimes called a tally sheet.
The defining characteristic of a check sheet is that data
are recorded by making marks ("checks") on it. A typical
check sheet is divided into regions, and marks made in
different regions have different significance. Data are
read by observing the location and number of marks on
the sheet.
Check sheets typically employ a heading that answers the
Five Ws:
• Who filled out the check sheet
• What was collected (what each check represents,
an identifying batch or lot number)
• Where the collection took place (facility, room,
apparatus)
• When the collection took place (hour, shift, day
of the week)
• Why the data were collected
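A check sheet maps naturally onto a small data structure: a Five Ws heading plus a tally per region. The sketch below is only one possible representation; the CheckSheet class, its field names and the sample defect categories are assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field

@dataclass
class CheckSheet:
    # Heading that answers the Five Ws.
    who: str
    what: str
    where: str
    when: str
    why: str
    # One tally per region (category) of the sheet.
    tallies: Counter = field(default_factory=Counter)

    def check(self, category, count=1):
        """Record one or more marks in the given region."""
        self.tallies[category] += count

    def render(self):
        """Return the tallies as text, one row of marks per category."""
        return "\n".join(
            f"{cat:<12} {'|' * n} ({n})"
            for cat, n in sorted(self.tallies.items())
        )

# Hypothetical heading and observations.
sheet = CheckSheet(
    who="J. Smith",
    what="Paint defects, lot 17",
    where="Line 2",
    when="Monday, first shift",
    why="Weekly defect audit",
)
for defect in ["scratch", "dent", "scratch", "run", "scratch"]:
    sheet.check(defect)
print(sheet.render())
```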
2. Control chart

Control charts, also known as Shewhart charts
(after Walter A. Shewhart) or process-behavior
charts, in statistical process control are tools used
to determine if a manufacturing or business
process is in a state of statistical control.
If analysis of the control chart indicates that the
process is currently under control (i.e., is stable,
with variation only coming from sources common
to the process), then no corrections or changes to
process control parameters are needed or desired.
In addition, data from the process can be used to
predict the future performance of the process. If
the chart indicates that the monitored process is
not in control, analysis of the chart can help
determine the sources of variation, because
uncontrolled variation degrades process performance.[1] A
process that is stable but operating outside of
desired (specification) limits (e.g., scrap rates
may be in statistical control but above desired
limits) needs to be improved through a deliberate
effort to understand the causes of current
performance and fundamentally improve the
process.
The control chart is one of the seven basic tools of
quality control.[3] Control charts are typically
used for time-series data. They can also be used
for data that have logical comparability (e.g., to
compare samples that were all taken at the same
time, or the performance of different individuals),
but the type of chart used for this requires
careful consideration.
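To make the notion of statistical control concrete, the sketch below computes the center line and control limits for one common chart type, the individuals chart, and flags points outside the limits. The choice of chart, the function names and the sample measurements are assumptions for illustration. Sigma is estimated from the average moving range divided by 1.128, the standard d2 constant for subgroups of size two.

```python
from statistics import mean

def individuals_limits(values):
    """Return (lower limit, center line, upper limit) for an individuals chart."""
    if len(values) < 2:
        raise ValueError("need at least two observations")
    center = mean(values)
    # Moving range: absolute difference between consecutive observations.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    # 1.128 is the d2 constant for subgroups of size 2.
    sigma = mean(moving_ranges) / 1.128
    return center - 3 * sigma, center, center + 3 * sigma

def out_of_control(values, baseline):
    """Indexes of values outside limits computed from baseline data."""
    lcl, _, ucl = individuals_limits(baseline)
    return [i for i, v in enumerate(values) if v < lcl or v > ucl]

# Hypothetical measurements from a stable period, then new observations.
baseline = [10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 10.1, 9.9]
new_points = [10.0, 10.3, 11.2, 9.9]
lcl, center, ucl = individuals_limits(baseline)
flagged = out_of_control(new_points, baseline)
print(f"limits: {lcl:.3f} .. {ucl:.3f}; flagged indexes: {flagged}")
```

This checks only for points beyond the control limits. Practical control charting usually adds further run rules, which are omitted here.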
3. Pareto chart

A Pareto chart, named after Vilfredo Pareto, is a type
of chart that contains both bars and a line graph, where
individual values are represented in descending order
by bars, and the cumulative total is represented by the
line.
The left vertical axis is the frequency of occurrence,
but it can alternatively represent cost or another
important unit of measure. The right vertical axis is
the cumulative percentage of the total number of
occurrences, total cost, or total of the particular unit of
measure. Because the reasons are in decreasing order,
the cumulative function is a concave function. For
example, in a Pareto chart of reasons for late arrivals
in which the first three reasons account for 78% of
occurrences, resolving those three issues is sufficient
to lower the number of late arrivals by 78%.
The purpose of the Pareto chart is to highlight the
most important among a (typically large) set of
factors. In quality control, it often represents the most
common sources of defects, the highest occurring type
of defect, or the most frequent reasons for customer
complaints, and so on. Wilkinson (2006) devised an
algorithm for producing statistically based acceptance
limits (similar to confidence intervals) for each bar in
the Pareto chart.
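The numbers behind a Pareto chart are simple to compute: sort the categories in descending order and accumulate the percentage. The sketch below does this for late-arrival reasons. The counts are invented for illustration and were chosen so that the first three reasons sum to 78%; pareto_table is a hypothetical helper name.

```python
def pareto_table(counts):
    """Return (cause, count, cumulative percent) rows in descending order."""
    total = sum(counts.values())
    if total == 0:
        return []
    rows, running = [], 0
    for cause, n in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        running += n
        rows.append((cause, n, round(100 * running / total, 1)))
    return rows

# Hypothetical counts of reasons for late arrivals.
late_reasons = {
    "weather": 10,
    "traffic": 40,
    "overslept": 7,
    "child care": 25,
    "emergency": 5,
    "public transport": 13,
}
rows = pareto_table(late_reasons)
for cause, n, cumulative in rows:
    print(f"{cause:<18} {n:>3} {cumulative:>6}%")
```

The bars of the chart correspond to the counts and the line to the cumulative percentages.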
4. Scatterplot Method
A scatter plot, scatterplot, or scattergraph is a type of
mathematical diagram using Cartesian coordinates to
display values for two variables for a set of data.
The data is displayed as a collection of points, each
having the value of one variable determining the position
on the horizontal axis and the value of the other variable
determining the position on the vertical axis.[2] This kind
of plot is also called a scatter chart, scattergram, scatter
diagram,[3] or scatter graph.
A scatter plot is used when a variable exists that is under
the control of the experimenter. If a parameter exists that
is systematically incremented and/or decremented by the
other, it is called the control parameter or independent
variable and is customarily plotted along the horizontal
axis. The measured or dependent variable is customarily
plotted along the vertical axis. If no dependent variable
exists, either type of variable can be plotted on either axis
and a scatter plot will illustrate only the degree of
correlation (not causation) between two variables.
A scatter plot can suggest various kinds of correlations
between variables with a certain confidence interval. For
example, when plotting weight against height, weight would be on
the x axis and height on the y axis. Correlations may be
positive (rising), negative (falling), or null (uncorrelated).
If the pattern of dots slopes from lower left to upper right,
it suggests a positive correlation between the variables
being studied. If the pattern of dots slopes from upper left
to lower right, it suggests a negative correlation. A line of
best fit (alternatively called 'trendline') can be drawn in
order to study the correlation between the variables. An
equation for the correlation between the variables can be
determined by established best-fit procedures. For a linear
correlation, the best-fit procedure is known as linear
regression and is guaranteed to generate a correct solution
in a finite time. No universal best-fit procedure is
guaranteed to generate a correct solution for arbitrary
relationships. A scatter plot is also very useful when we
wish to see how two comparable data sets agree with each
other. In this case, an identity line, i.e., a y=x line, or a
1:1 line, is often drawn as a reference. The more the two
data sets agree, the more the scatters tend to concentrate in
the vicinity of the identity line; if the two data sets are
numerically identical, the scatters fall on the identity line
exactly.
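For the linear case, the line of best fit and the strength of the correlation can be computed directly with ordinary least squares. The sketch below is a minimal illustration; the function name linear_fit and the sample points are assumptions, and it expects at least two distinct x values and two distinct y values.

```python
from statistics import mean

def linear_fit(xs, ys):
    """Return (slope, intercept, r) for the least-squares line y = slope*x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    # Pearson correlation coefficient: +1 rising, -1 falling, near 0 uncorrelated.
    r = sxy / (sxx * syy) ** 0.5
    return slope, intercept, r

# Hypothetical points that lie exactly on y = 2x + 1.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
slope, intercept, r = linear_fit(xs, ys)
print(slope, intercept, r)

# Reversing the y values gives a falling pattern, so r becomes negative.
_, _, r_falling = linear_fit(xs, list(reversed(ys)))
print(r_falling)
```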

5. Ishikawa diagram
Ishikawa diagrams (also called fishbone diagrams,
herringbone diagrams, cause-and-effect diagrams, or
Fishikawa) are causal diagrams created by Kaoru
Ishikawa (1968) that show the causes of a specific
event.[1][2] Common uses of the Ishikawa diagram are
product design and quality defect prevention, to identify
potential factors causing an overall effect. Each cause or
reason for imperfection is a source of variation. Causes
are usually grouped into major categories to identify these
sources of variation. The categories typically include
• People: Anyone involved with the process
• Methods: How the process is performed and the
specific requirements for doing it, such as policies,
procedures, rules, regulations and laws
• Machines: Any equipment, computers, tools, etc.
required to accomplish the job
• Materials: Raw materials, parts, pens, paper, etc.
used to produce the final product
• Measurements: Data generated from the process
that are used to evaluate its quality
• Environment: The conditions, such as location,
time, temperature, and culture in which the process
operates
6. Histogram method

A histogram is a graphical representation of the
distribution of data. It is an estimate of the probability
distribution of a continuous variable (quantitative
variable) and was first introduced by Karl Pearson.[1] To
construct a histogram, the first step is to "bin" the range of
values -- that is, divide the entire range of values into a
series of small intervals -- and then count how many
values fall into each interval. A rectangle is drawn with
height proportional to the count and width equal to the bin
size, so that rectangles abut each other. A histogram may
also be normalized to display relative frequencies. It then
shows the proportion of cases that fall into each of several
categories, with the sum of the heights equaling 1. The
bins are usually specified as consecutive, non-overlapping
intervals of a variable. The bins (intervals) must be
adjacent, and are usually of equal size.[2] The rectangles of a
histogram are drawn so that they touch each other to
indicate that the original variable is continuous.[3]
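The binning and normalization steps described above can be sketched in a few lines. This is a minimal example using equal-width bins; the function names and sample values are assumptions for illustration.

```python
def histogram(values, bin_count):
    """Return (bin edges, counts) using equal-width, adjacent bins."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("values must span a non-zero range")
    width = (hi - lo) / bin_count
    counts = [0] * bin_count
    for v in values:
        # Bins are half-open [edge, next edge), except the last bin,
        # which also includes the maximum value.
        idx = min(int((v - lo) / width), bin_count - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(bin_count + 1)]
    return edges, counts

def relative_frequencies(counts):
    """Normalize counts so that the heights sum to 1."""
    total = sum(counts)
    return [c / total for c in counts]

# Hypothetical sample values.
values = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9]
edges, counts = histogram(values, 4)
print(edges)
print(counts)
print(relative_frequencies(counts))
```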
III. Other topics related to data quality management (pdf download)
quality management systems
quality management courses
quality management tools
iso 9001 quality management system
quality management process
quality management system example
quality system management
quality management techniques
quality management standards
quality management policy
quality management strategy
quality management books