Making Data Quality a Way of Life
By adopting a rigorous operational and governance process for
data quality, organizations can ensure the veracity and business
impact of data big and small.
Executive Summary
Many organizations have contemplated, or are already on the journey to, making data quality a way of life. What many of them miss, however, is the necessary operational context of data quality and its interplay with a well-honed governance program that enables the functions that make data quality a way of life.

This white paper explores the operational activities needed to make data quality a way of life. It also examines why no data, including what is normally deemed unstructured data and big data, can be excluded from such a program, and the governance interplays required to make data quality efforts less of a chore and more a matter of normal operating procedure.
Data Quality
Data quality comprises the processes that allow business stewards to extract, capture and originate value from high-quality data by ensuring, through a repeatable process, that all data goes through a reasonability check. More specifically, it means:

•	Supplying the highest-quality data to those who use it in the form of knowledge, learned inferences, heard inferences and innovations as a vehicle to capture, originate or extract value for the organization.
•	Providing a means for those who count on data being trustworthy to validate it through an always-available audit trail that records the lineage of transformations and references to reasonability exceptions.
•	Holding data quality to the same standard long after a development effort is completed as at the successful completion of the development project.
Done correctly, operational data quality provides a vehicle to ensure the trustworthiness of information, which allows for more expedient and effective execution of actions. It enhances the value of data captured from the marketplace, reduces the risks associated with internal processes and with disruptions originating outside the organization, and amplifies the market disruptions associated with innovations the organization introduces into the marketplace. The combined impact of value captured, risks mitigated and innovations launched is a material component of the value of the organization's data assets.
To ensure that its data is highly relevant and accurate, an organization must:

•	Engage a continuous process that certifies the accuracy of data made available for recurring and extraordinary business processes.
•	Ensure that the data made available is current and delivered just in time for recurring and extraordinary business processes. Data quality processes that hold up the certification of data negatively impact the value of the data utilized for business processes.
•	Ensure that the processes employed for certification of data made available for recurring and extraordinary business processes are repeatable and auditable.
•	Encapsulate the operational data quality processes within proactive data governance processes that ensure the data is relevant and that identified changes to the status quo (internal process and system changes, market disruptions, etc.) are judiciously reviewed to ensure they do not negatively impact the relevance, timeliness or quality of the data.
Why Operational Data Quality Is Different from Data Quality Projects
Operational data quality, while related to the data quality efforts that occur during development projects, requires a mind-set shift among data quality practitioners. Some processes that are prerequisites for operational data quality are often not enabled out of the box by the major data quality products. The baselining process, which we detail later in this white paper, is necessary for identifying data quality exceptions, which are then categorized in the operational data quality processes.
This concept of operational data quality is a departure from what many organizations use their data quality processes for, which is often limited to creating a set of rules that elevate the quality of information consumed during a development project by enriching, de-duplicating or cleansing the information. These data quality efforts are commonly redeployed after the completion of the development initiative to help with the next development initiative, but reasonability checks to ensure the ongoing quality of information such as messages, queries and files are rarely employed.
Thus, the once-pristine data quality associated with the development project often dissipates into less-than-trustworthy information.

Figure 1: How Data Drives Value. Relevant, actionable, trustworthy and focused data drives insight, action and value in the form of knowledge, learned inferences, heard inferences and innovations, with value extracted, captured or originated across market, consumer, investor, supply chain, regulatory and value chain contexts.
The vendors that publish data quality suites are mostly focused on assisting with ensuring pristine data during a delivery initiative. They rarely have complete solutions, and commonly have no capabilities for operational data quality controls. Four major components are mandatory to implement operational data quality capabilities (a minimal sketch of how they fit together follows this list):
•	Translate the rules developed for data quality into a repeatable workflow that can be executed either prior to or in conjunction with extract, transform and load (ETL) processes. Depending on the system, this can be done using either a message/transaction process or a file (or query)/batch process. The workflow must be made operational to be useful. It is important to note that many data quality solutions available in the marketplace cannot be made operational and therefore cannot be used as an operational data quality engine.
•	Isolate a place to record exceptions identified during the operational data quality process. Exceptions can either be flagged in the normal data stream (appropriate when isolating them would negatively impact the financial performance of the organization, such as failing to process customer orders because of data quality issues) or held for cleansing and resubmission.
•	 Develop a series of baselines used to
determine the reasonability of incoming data.
•	 Categorize the baseline exceptions and
dispatch exceptions to the originators (i.e.,
the suppliers of source data that is validated
for reasonability using operational data quality
controls) for exposed transformation errors
and to compliance and management teams for
process exceptions (market disruptions that
require management attention, fraud, etc.).
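To make these four components concrete, the following minimal Python sketch wires them together: a repeatable rule-driven workflow, an exception store, a baseline-derived check and a categorization/dispatch step. All rule names, record fields, thresholds and routing destinations are hypothetical; this illustrates the structure described above under stated assumptions, not the implementation of any particular data quality product.

```python
from dataclasses import dataclass, field
from typing import Callable

# A baseline rule pairs a reasonability check with the category its exceptions fall into.
@dataclass
class BaselineRule:
    name: str
    check: Callable[[dict], bool]   # returns True when the record looks reasonable
    category: str                   # e.g. "transformation_error", "process_exception"

@dataclass
class ExceptionRecord:
    rule: str
    category: str
    record: dict

@dataclass
class OperationalDQWorkflow:
    rules: list                                            # DQ rules made repeatable
    exception_store: list = field(default_factory=list)    # place to record exceptions

    def run(self, batch: list) -> list:
        """Execute the workflow prior to (or alongside) ETL: pass reasonable
        records through, hold exceptions for categorization and dispatch."""
        passed = []
        for record in batch:
            failures = [r for r in self.rules if not r.check(record)]
            if failures:
                for r in failures:
                    self.exception_store.append(
                        ExceptionRecord(rule=r.name, category=r.category, record=record))
            else:
                passed.append(record)
        return passed

    def dispatch(self) -> dict:
        """Group stored exceptions by destination: originators for transformation
        errors, compliance/management teams for process exceptions."""
        routing = {"transformation_error": "source_system_owner",
                   "process_exception": "compliance_and_management"}
        outbox: dict = {}
        for exc in self.exception_store:
            outbox.setdefault(routing.get(exc.category, "dq_team"), []).append(exc)
        return outbox

# Hypothetical usage: a reasonability rule on daily sales derived from a baseline
# (mean 100,000 and standard deviation 12,000 are invented numbers).
rules = [BaselineRule("daily_sales_within_3_sigma",
                      check=lambda rec: abs(rec["sales"] - 100_000) <= 3 * 12_000,
                      category="process_exception")]
workflow = OperationalDQWorkflow(rules)
clean = workflow.run([{"sales": 98_500}, {"sales": 210_000}])
print(len(clean), list(workflow.dispatch().keys()))
```

In practice, the check functions would be generated from the baselines described below, and the dispatch targets would align with the organization's own governance and compliance structures.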
There is a movement to manage data as a non-expiring asset of the organization, making this a primary function of the newly created position of chief data officer. If information is trustworthy, it can be wielded with an understanding that the underlying data is relevant and accurate. Furthermore, its derivation can be reviewed at any point in time, should that derivation be questioned. Highly relevant and accurate information that can be wielded just in time is simply worth more to the organization because it can be used to generate greater degrees of value. Measuring the value of the data asset is the next great challenge for the business community. Peter Aiken's Data Blueprint describes (through data strategy and other means) the processes to ensure the accuracy and relevance of data and to measure the value of the information exposed to the data blueprint processes.
When the focus of data quality is on a project or program that either consumes data from or supplies data assets to other programs, the interplay between the organization's data governance and data quality functions comes into focus. At organizations that treat their data as a key asset, the data governance function is accountable for prioritizing the means to elevate the value of data assets, one of which is data quality. The symbiosis between data quality and data governance is especially strong and is explored later in this white paper.

Figure 2: Gauging Trustworthiness. The questions data consumers ask: Do I have the most current data? Is there more relevant data I should be using? What was done to it before I received it? What is the lineage of the data? Does the data mean what I think it means? What is the context of the data? Can I trust the data? Will my analyses and decisions be tainted by bad data?
The Secret to Making Data Quality a Way of Life
Data assets used to derive value for a business entity can and should be mapped to a business model to assess the organizational risks associated with the business model artifacts that consume or supply data to the business. These exposures will be of two types, originated data sources and transformed data sources, both of which should be carefully scrutinized as part of a well-honed data trustworthiness assurance process. Only then can a business measure and ensure the creation, capture and origination of value from the data assets used throughout the enterprise (see Figure 3).
The value of data can be measured by the incremental benefits it provides to the organization: fine-tuning processes, shepherding innovations for market disruption, capitalizing on opportunities presented in the marketplace, or mitigating risks sourced within the organization (process misfires, fraud, etc.) or outside the organization (competitors cherry-picking customer bases, competitive market disrupters, etc.).
Operational data quality, taken to its natural end state, is the process for determining which components generated from within the organization or collected from external data sources fall outside expected tolerances. Statistically known as an outlier, such a data point can be accurate good news, accurate bad news or something that requires research. It does not mean that the data point is bad; it simply means that what arrived through the operational data quality processes was an outlier and presents either a data quality remediation opportunity or something requiring management action. The latter is, in fact, the gateway to data-driven risk management: a detector for something that is not meeting expected data tolerances.

Figure 3: Going to the Source. Internal data sources (process data, change data capture, transaction, reference and master data), acquired sources (reference, market, crowd-sourced and social media data), log-sourced data (web, mobile and call center logs, workflows) and other data (customer-provided, Internet-accessible and RFID-provided data, acquired inferences, heard inferences, innovations) pass through data origination and data transformation trustworthiness assurance before feeding the business model artifacts (offerings, interactions, channels, monetization, key activities, reporting) and their data governance interfaces.
There are four data stores or databases required to perform operational data quality. It is important to note that in most cases the data stores provided by third-party vendors do not provide out-of-the-box capabilities to execute operational data quality; in most cases they are designed to augment delivery projects.
•	Data stores:
	 To validate the reasonability of incoming data, the total population of previously processed data is required. This will be used to compute the variance from expected values (typically two to three standard deviations from the mean for analyzable information such as metrics, numeric data, etc.) and the alignment to employed taxonomies, ontologies and reference data for non-tabular and textual data.
	 A place to house the acceptability or baseline criteria. These criteria are used at run-time to determine whether incoming data is reasonable. An example baseline: sales with a variance of more than one standard deviation from the mean should be held for review. This does not mean the information is bad; it just means that it requires further investigation.
	 A place to house data that is deemed worthy of further investigation prior to committing it to the intended target.
	 A place to house baseline statistics, or the measures used to orchestrate operational data quality. An example of a baseline statistic is that a +/-5% tolerance is acceptable for the day-over-day valuation of a managed asset within a trading desk portfolio; day-over-day variations greater than +/-5% would therefore be recorded as an operational data quality exception (a sketch of such a tolerance check appears after this list). It is important to note that the operational data quality team will be accountable for reviewing the number of exceptions triggered for each measure to determine whether the rules used to trigger operational data quality exceptions were too lax or too stringent. Accordingly, in coordination with the data governance team, they would make the necessary changes to the baseline statistics.

(Figure 4: Making Sense of Data Reasonability. Incoming data, optionally captured by a listener, is profiled against baseline criteria and previously stored data; reasonable data passes through while suspect data is suspended, scored on a data quality scorecard and alerted on.)
•	Program code used to orchestrate operational data quality:
	 This is a means of interrogating incoming transactions. It can be done in one of three ways:
	»» A batch cycle that interrogates data reasonableness after the fact through regularly scheduled routines. This means of validating requires the ability to flag data written into the intended target with some designation indicating that the data is suspect and under investigation. Batch operations do not have the stringent performance constraints of online processing environments. Therefore, while they are important, the impact of data quality processes on the overall performance characteristics of production operations will require less scrutiny, thereby allowing additional in-line or parallel processing to identify and flag or set aside data suspected of significant data quality anomalies. The flagging of data (setting a data quality flag or setting data aside) is the primary means of identifying suspect data that needs further investigation.
•	Generally, when target environments are complex, such as a big data lake or a complex data environment such as SAP, PeopleSoft or a federated data warehouse, it is less manageable to flag data designated for reasonableness validation. This is because such complex environments house data in ways that require program code to locate it (e.g., ABAP code for accessing data in SAP, or NoSQL interfaces for big data). The batch approach to reasonability does not impair the performance characteristics of production operations, but it does allow decisions to be made on suspect data until the reasonability flag can be set.
»» A listener-based technology, which grabs a copy of data bound for the target environment while it is being processed in order to validate its reasonability. It then either sets a flag or writes data to a new data store as close to the writing of data to the intended target as practicable. While many data quality tools do not possess this capability out of the box, it can be implemented with minimal effort. This approach does not impair the performance characteristics of production operations and minimizes the timeframe in which there is a risk that decisions can be made using suspect data.
»» Stream data quality routines that validate the quality of information as part of the job stream and either set a flag to designate data that needs validation or write data to a suspense data store. This approach eliminates the timeframe in which decisions can be made using suspect data, but introduces a risk that the performance characteristics of production operations will be inconsistent due to in-stream data quality checks.
	 A means to send out data quality alerts concerning data items worthy of review. It is important to prioritize the alerts so that those who need every data alert get them, while managers who need to review the alerts (e.g., the CFO for financial alerts) receive only material alerts and summary statistics.
	 A means to ensure that data items flagged for review are actually reviewed and dispensed with. Typically, the data quality team accountable for operational data quality produces aging reports of the exceptions and publishes the categorized aging through a dashboard or scorecard.
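As a concrete illustration of the baseline-statistic approach described above (the +/-5% day-over-day tolerance), here is a small batch-style sketch, assuming pandas is available. The column names, dates and valuations are invented; a real implementation would read the previously processed population from the DQ data stores and record flagged rows as exceptions rather than just printing them.

```python
import pandas as pd

# Hypothetical daily valuations for one managed asset; in practice this data
# would come from the previously processed population held in the DQ data store.
valuations = pd.DataFrame({
    "business_date": pd.to_datetime(["2014-11-03", "2014-11-04", "2014-11-05", "2014-11-06"]),
    "asset_value":   [1_000_000.0, 1_004_000.0, 1_120_000.0, 1_118_000.0],
})

TOLERANCE = 0.05  # the +/-5% day-over-day baseline statistic described above

# Batch-style reasonability pass: compute day-over-day variation and set a
# suspect flag instead of rejecting the data, so downstream use can continue
# while the exception is investigated.
valuations["dod_change"] = valuations["asset_value"].pct_change()
valuations["dq_suspect"] = valuations["dod_change"].abs() > TOLERANCE

exceptions = valuations[valuations["dq_suspect"]]
print(exceptions[["business_date", "dod_change"]])

# The flagged rows would be recorded as operational data quality exceptions; if
# too many (or too few) rows trip the rule, the tolerance would be revisited in
# coordination with the data governance team.
```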
A new class of software has emerged to perform data quality checks at every data touch-point (e.g., when data is combined with other data, moved to another state in the job flow or otherwise transformed). The rule base, which typically serves as the means to judge the tolerances for numeric data variances and the reasonableness of transactions, is used to score the incoming data and determine whether it should be reviewed prior to committing it as an executed transaction or transmission file.
It should be expected that the data lineage process will create a large amount of data, and that understanding in advance which systems data touches throughout its flow will be quite difficult without some level of listener technology (i.e., technology that tracks the movement of data rather than explicitly pre-recording the hops the data is expected to traverse).
Such oversight is required for new regulatory processes in the financial brokerage (e.g., Dodd-Frank) and insurance (e.g., Solvency II) spaces. But there are additional outside data touch-points in other industries, such as trading exchanges, where there is an inherent trust that data is transmitted properly, as well as royalty-based solutions and other transaction environments where the lineage of data needs to be understood (e.g., fraud detection, validated systems, Basel III).
Enabling Operational Data Quality
To effect operational data quality, it is critical that data is pre-validated to assure reasonability before being made available for reporting and analytics. This requires operational interfaces to assure the reasonability of data before it arrives in staging environments, data warehouses and reporting marts.

Such a process will become unwieldy if the business lacks an organization that is tasked with orchestrating data quality validation processes. This organization should be accountable for:
•	 Completeness of the baselines (baselines are
written for information used in key decision-
making processes).
•	Assurance that the processes used to ensure information reasonableness are operating and that they are not negatively impacting the stability and performance of the business's operational procedures. Without this, it is unlikely that the business will embark on the operational data quality journey.
•	Validation that information flagged as questionable is dispensed with. Aging of the reasonability flags, or of information held in suspense, will be used as one of the primary vehicles (a sketch of such an aging report follows this list).
•	A process to review information flagged for reasonability and bucket exceptions as origination issues, transformation issues, procedural issues, procedural exceptions (e.g., fraud) and market disruptions (new innovations from competitors, etc.). Note that this requires an understanding of the business represented by the data in order to make these calls. The bucketed exception summary and highlight lists will be provided to the governance and/or compliance functions. Governance will utilize the list to prioritize initiatives, while compliance will use the information to fine-tune compliance rules and potentially build surveillance cases.
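A minimal sketch of the aging vehicle mentioned above: bucketing open exceptions by how long they have awaited dispensation and summarizing them by category for the dashboard or scorecard. The exception IDs, categories, dates and bucket boundaries are all hypothetical.

```python
from datetime import date

# Hypothetical open exceptions: (exception id, category, date raised).
open_exceptions = [
    ("DQ-101", "transformation_error", date(2014, 10, 1)),
    ("DQ-102", "origination_issue",    date(2014, 10, 28)),
    ("DQ-103", "procedural_exception", date(2014, 11, 10)),
]

def aging_bucket(raised: date, as_of: date) -> str:
    """Bucket an open exception by how long it has been awaiting dispensation."""
    days = (as_of - raised).days
    if days <= 7:
        return "0-7 days"
    if days <= 30:
        return "8-30 days"
    return "over 30 days"

def aging_report(exceptions, as_of: date) -> dict:
    """Summarize open exceptions by category and age for publication on the
    operational data quality dashboard or scorecard."""
    summary: dict = {}
    for _exc_id, category, raised in exceptions:
        key = (category, aging_bucket(raised, as_of))
        summary[key] = summary.get(key, 0) + 1
    return summary

for (category, bucket), count in sorted(aging_report(open_exceptions, date(2014, 11, 14)).items()):
    print(f"{category:22s} {bucket:12s} {count}")
```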
Figure 5: Data Integrity Scorecard. Data integrity functionality compared across Data Quality, ETL and Data Lineage capabilities (✓ marks indicate support):

•	Validate data conformance to standards and business rules during loads, inserts, audits, into a system. ✓ ✓ ✓
•	Identify errors as data moves between systems. ✓ ✓ ✓
•	Recognize in situ errors caused during manual data updates. ✓
•	Identify false positives (data conforms, but isn't actually valid). ✓ ✓
•	Spot data that is missing. ✓ ✓
•	Spot data that is abnormal (not in line with norms/trending). ✓ ✓
•	Maintain system lineage, history and context for each data value event across all systems for the term of the data (who, what, where, when). ✓
•	Consolidate data value consistency and conformance across all data portfolio sources for each LOB in a dashboard view. ✓
•	Provide an enterprise data integrity analysis repository. ✓
•	Provide troubleshooting ability to crawl backward/forward along the value chain and recreate a point-in-time view. ✓
•	Certifies data's conformance to standards and rules. ✓
The process, which accommodates both message-based and file/query-based operations, has several important components, which will be evaluated section by section (see Figure 6).
Operational Data Quality Step 1: Ingestion
The data quality process will never have, nor should it have, the performance characteristics of operational processes. Therefore, it should never sit in line with operational processes, because it would upset the consistently favorable performance characteristics of operational systems.

The ingestion process either attaches to a backup or read-only copy of data, or ingests messages and files into the DQ environment for analysis.
Operational Data Quality Step 2: Processing
From the data cache, data profiling rules will be executed using baselining rules that identify and dispense with exceptions, or data that fails reasonability tests.

Data that passes all reasonability tests will be ignored by the operational data quality processes.
Operational Data Quality Step 3: Categorizing
The categorization process will segregate exceptions into the following categories (a classification sketch follows this list):

•	Market disruptions: items requiring management attention for potential strategy refinement.
•	Procedural exceptions: items requiring compliance attention for potential fraud and other exceptions.
•	Procedural issues.
•	Transformation errors: items requiring the attention of ETL teams.
•	Origination errors: items requiring the attention of development teams.
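A hedged sketch of how the categorization step might be codified. The attributes examined (where the exception was detected, whether it matches a fraud pattern, whether it reflects a market-wide shift) and the recipient mapping are assumptions for illustration; a real classifier would draw on the baseline statistics and the business context of each source.

```python
# Hypothetical routing of the five exception categories to their recipients.
CATEGORY_RECIPIENTS = {
    "market_disruption":    "management (strategy refinement)",
    "procedural_exception": "compliance (fraud and other exceptions)",
    "procedural_issue":     "business process owners",
    "transformation_error": "ETL teams",
    "origination_error":    "development teams of the originating system",
}

def categorize(exception: dict) -> str:
    """Assign one of the five exception categories based on where the
    reasonability failure was detected and what it looks like."""
    if exception["detected_at"] == "origination":
        return "origination_error"
    if exception["detected_at"] == "transformation":
        return "transformation_error"
    if exception.get("matches_fraud_pattern"):
        return "procedural_exception"
    if exception.get("marketwide_shift"):
        return "market_disruption"
    return "procedural_issue"

# Illustrative exception detected after load, showing a market-wide shift.
exc = {"detected_at": "post_load", "matches_fraud_pattern": False, "marketwide_shift": True}
category = categorize(exc)
print(category, "->", CATEGORY_RECIPIENTS[category])
```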
Operational Data Quality Step 4: Communicating
Once categorized, the exceptions are sent:

•	To the supplier of the source exposed to operational data quality processes, for data fixes.
•	To the governance council, for addressing changes to the priorities that impact the valuation of the organization's data assets.
•	To the compliance team, for dispensation of fraud and other issues.

Figure 6: High Level Data Quality Processes. Internally sourced files and messages, change data capture, log-centric data, partner data and reference data are ingested into a DQ cache; the data is profiled, cleansed, standardized and baselined; baselining exceptions (data that fails the baselining scoring algorithms) are classified as market disruptions, procedural exceptions, procedural issues, transformation errors or origination errors; and alerts and scorecards are communicated to the governance council, compliance and the suppliers of source data, while high-quality data flows on to ETL processes and the data warehouse or operational data store.
Scorecards are also constructed for measuring
the continued improvement of data quality and
aging of operational data quality exceptions.
Operational Data Quality: Supporting Data Stores
The supporting data stores for the operational data quality process are:

•	The rules used for operational data quality.
•	The source exposed to the scrutiny of operational data quality.
•	The identified baselining exceptions.
•	The reference data and standardization rules utilized.
•	The quality mart used to measure increased data quality.
Figures 7-9: Data Sources that Need Quality Safeguarding; Data Quality Workflow; Lack of Data Quality — the Dangers. These figures depict the sources routed through the DQ cache (internally sourced files and messages, change data capture, log-centric data, partner data and reference data), the profiling, cleansing, standardizing and baselining workflow, and the classification of baselining exceptions into market disruptions, procedural exceptions, procedural issues, transformation errors and origination errors.
Getting on the Road to Operational Data Quality
There are several ways to build a well-honed operational data quality program, one devised to make data quality a way of life. Very few of the components are technical in nature, but those that are play critical, specific roles.
Leadership
While leadership is a subset of people, it is an important component of the operational data quality roadmap. Leadership of the data quality organization will shepherd the organized transformation from a project data quality focus (largely a development effort) to an operational data quality focus (largely a support, not a maintenance, effort). Leadership will also justify the necessary financing for the execution of operational data quality controls.
Without proper leadership, securing buy-in to a potentially disruptive process will become an issue for those who have the greatest need for operational data quality controls, because dispensing with the exceptions identified by the controls creates additional work for them. (The process appears disruptive because it creates work for the greatest sources of suspect data.) In the past, these exceptions have been lumped into a general pool of challenges to data trustworthiness.
Leadership from the data quality organization will also be accountable for spotlighting recurring data quality themes so that the data governance organization can prioritize and sponsor the necessary development efforts to eliminate these data quality challenges.
Leadership from both the data quality and governance programs will be continuously required to measure and communicate the value of data assets and the program's impact on that valuation.
People
Trained experts must be available to manage the codification of the baseline and categorization processes in order to execute an operational data quality program. These individuals will be involved in several components of the operational data quality program:

•	Facilitator and caretaker for the baseline rules used for operational data quality.
•	Process owner for the categorization of operational data quality exceptions.
•	Process owner for the computation of operational data quality metrics, including those that measure the positive incremental value of data assets associated with the operational data quality program.
•	Process owner for the operational data quality scorecard.
•	Process owner for the aging of operational data quality exceptions.
•	Facilitator for the processes associated with the publication of alerts to source systems, governance and compliance.

Figure 10: Data Quality Program Flow; Figure 11: Inputs on Data Quality Management. Reference data, standardization rules and the data quality mart feed the operational DQ scorecard, the suspect-data aging scorecard and alerts routed to the governance council, compliance and the suppliers of messages or data files for data fixes.
Metrics
Metrics are the message that the operational data quality program is functioning as expected and that it is having the intended effects on the organization's data assets. Key metric categories that should be monitored include the following (a trend-check sketch follows the list):
•	 Usage metrics, such as data quality sessions,
volume of data accessed by data quality
facilities, number of fixes made to data
through cleansing algorithms, number of
records enriched through standardization and
other routines, etc. Usage metrics should track
upward over time.
•	 Impact metrics, such as the improvement
in customer information, increased revenue
associated with better quality, cost reductions
due to reduced data validation expenditures,
etc. Impact metrics should track upward
over time.
•	Effort metrics, such as the hours of training provided for data quality, the number of presentations made about data quality, the effort in obtaining data that requires cleansing, etc. Effort metrics should track downward per initiative over time.
•	 Cost metrics, such as the cost of software,
hardware, acquired data and manpower
required to perform the data quality function.
Both direct and indirect costs should be
tracked, and these should track downward per
data quality initiative over time.
•	 Operational effectiveness metrics, which
measure the success of managing data as
an organizational asset and the reduction of
operational data quality exceptions caught per
interfaced application over time.
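A small sketch of how these metric families might be tracked against their expected directions. The metric names mirror the categories above, but the observations and the crude first-to-last trend test are purely illustrative assumptions.

```python
# Expected direction for a few hypothetical metrics from each family above.
EXPECTED_DIRECTION = {
    "usage_dq_sessions":              "up",
    "impact_revenue_uplift":          "up",
    "effort_training_hours":          "down",
    "cost_per_initiative":            "down",
    "operational_exceptions_per_app": "down",
}

# Invented quarterly observations; a real program would pull these from the
# data quality mart and scorecard history.
observations = {
    "usage_dq_sessions":              [120, 180, 240],
    "impact_revenue_uplift":          [0.0, 1.2, 2.9],    # $M, illustrative
    "effort_training_hours":          [300, 260, 210],
    "cost_per_initiative":            [95, 90, 92],       # $K, illustrative
    "operational_exceptions_per_app": [40, 31, 22],
}

def trend(series: list) -> str:
    """Crude period-over-period trend: compare the last value to the first."""
    return "up" if series[-1] > series[0] else "down"

for metric, series in observations.items():
    status = "on track" if trend(series) == EXPECTED_DIRECTION[metric] else "review"
    print(f"{metric:32s} trend={trend(series):4s} expected={EXPECTED_DIRECTION[metric]:4s} -> {status}")
```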
Process
The processes of data quality will be directed by the governance council. A key requirement is that the process employed for operational data quality fits the organization, rather than trying to change the organization's culture and operational processes to fit the operational data quality program.
Technology
The technology components required to orchestrate an operational data quality program are minimal, but they must satisfy three requirements:

•	Direct access to the files and messages used in the company's operations without adversely impacting the consistently acceptable performance of production operations. Many data quality facilities cannot analyze incoming data without placing it in an environment augmented for data quality operations; such a requirement adds an unacceptable delay to making data available just in time for regularly occurring and extraordinary business processes.
•	Integration as the primary interface (or interfaces) to an operational data quality program without significant uplift in data quality effort.
•	Operation without manual intervention, since such intervention would eliminate the benefits achieved from an operational data quality program.
Data
Data is the raw material of an operational data quality program. Without data flowing through the operational data quality program, the process is simply a test case and will have minimal if any impact on incrementally improving the value of the organization's data assets.
Looking Ahead
One closing remark on the roadmap to operational data quality as a vehicle for making data quality a way of life: not all of the items on the roadmap happen at once. There will also be organizational strife, which hopefully can be defused by the governance council. Without a functional, active data governance council, reducing the organizational strife associated with initiating an operational data quality roadmap must be the responsibility of the leadership of the data quality program.

A scorecard (as depicted in Figure 12) is traditionally the communication vehicle of choice for reporting on the activity and success of the operational data quality program and its effectiveness as a major cog in managing data as a non-expiring asset of the organization and making data quality a way of life.
Figure 12: An Operational Data Quality Scorecard (sample). The Operational Data Quality Assurance Program's data exception statistics panels include the number of origination errors captured by the ODQ program, the number of transformation errors captured by the ODQ program, the average time required to address data quality exceptions, and the number of programs subscribing to the ODQ program.
About the Author
Mark Albala is the Lead for Cognizant’s Data Quality and Governance Service Line. This service line
provides the expertise in data quality and governance solutions to the portfolio of activities tackled by
the Enterprise Information Management Business Unit. A graduate of Syracuse University, Mark has
held senior thought leadership, advanced technical and trusted advisory roles for organizations focused
on the disciplines of information management for over 20 years. He can be reached at Mark.Albala@cognizant.com.
About Cognizant
Cognizant (NASDAQ: CTSH) is a leading provider of information technology, consulting, and business process outsourcing services, dedicated to helping the world's leading companies build stronger businesses. Headquartered in Teaneck, New Jersey (U.S.), Cognizant combines a passion for client satisfaction, technology innovation, deep industry and business process expertise, and a global, collaborative workforce that embodies the future of work. With over 75 development and delivery centers worldwide and approximately 199,700 employees as of September 30, 2014, Cognizant is a member of the NASDAQ-100, the S&P 500, the Forbes Global 2000, and the Fortune 500 and is ranked among the top performing and fastest growing companies in the world. Visit us online at www.cognizant.com or follow us on Twitter: Cognizant.
World Headquarters
500 Frank W. Burr Blvd.
Teaneck, NJ 07666 USA
Phone: +1 201 801 0233
Fax: +1 201 801 0243
Toll Free: +1 888 937 3277
Email: inquiry@cognizant.com
European Headquarters
1 Kingdom Street
Paddington Central
London W2 6BD
Phone: +44 (0) 20 7297 7600
Fax: +44 (0) 20 7121 0102
Email: infouk@cognizant.com
India Operations Headquarters
#5/535, Old Mahabalipuram Road
Okkiyam Pettai, Thoraipakkam
Chennai, 600 096 India
Phone: +91 (0) 44 4209 6000
Fax: +91 (0) 44 4209 6060
Email: inquiryindia@cognizant.com
­­© Copyright 2014, Cognizant. All rights reserved. No part of this document may be reproduced, stored in a retrieval system, transmitted in any form or by any
means, electronic, mechanical, photocopying, recording, or otherwise, without the express written permission from Cognizant. The information contained herein is
subject to change without notice. All other trademarks mentioned herein are the property of their respective owners.
 
Green Rush: The Economic Imperative for Sustainability
Green Rush: The Economic Imperative for SustainabilityGreen Rush: The Economic Imperative for Sustainability
Green Rush: The Economic Imperative for Sustainability
 
Policy Administration Modernization: Four Paths for Insurers
Policy Administration Modernization: Four Paths for InsurersPolicy Administration Modernization: Four Paths for Insurers
Policy Administration Modernization: Four Paths for Insurers
 
The Work Ahead in Utilities: Powering a Sustainable Future with Digital
The Work Ahead in Utilities: Powering a Sustainable Future with DigitalThe Work Ahead in Utilities: Powering a Sustainable Future with Digital
The Work Ahead in Utilities: Powering a Sustainable Future with Digital
 
AI in Media & Entertainment: Starting the Journey to Value
AI in Media & Entertainment: Starting the Journey to ValueAI in Media & Entertainment: Starting the Journey to Value
AI in Media & Entertainment: Starting the Journey to Value
 
Operations Workforce Management: A Data-Informed, Digital-First Approach
Operations Workforce Management: A Data-Informed, Digital-First ApproachOperations Workforce Management: A Data-Informed, Digital-First Approach
Operations Workforce Management: A Data-Informed, Digital-First Approach
 
Five Priorities for Quality Engineering When Taking Banking to the Cloud
Five Priorities for Quality Engineering When Taking Banking to the CloudFive Priorities for Quality Engineering When Taking Banking to the Cloud
Five Priorities for Quality Engineering When Taking Banking to the Cloud
 
Getting Ahead With AI: How APAC Companies Replicate Success by Remaining Focused
Getting Ahead With AI: How APAC Companies Replicate Success by Remaining FocusedGetting Ahead With AI: How APAC Companies Replicate Success by Remaining Focused
Getting Ahead With AI: How APAC Companies Replicate Success by Remaining Focused
 
Crafting the Utility of the Future
Crafting the Utility of the FutureCrafting the Utility of the Future
Crafting the Utility of the Future
 
Utilities Can Ramp Up CX with a Customer Data Platform
Utilities Can Ramp Up CX with a Customer Data PlatformUtilities Can Ramp Up CX with a Customer Data Platform
Utilities Can Ramp Up CX with a Customer Data Platform
 
The Work Ahead in Intelligent Automation: Coping with Complexity in a Post-Pa...
The Work Ahead in Intelligent Automation: Coping with Complexity in a Post-Pa...The Work Ahead in Intelligent Automation: Coping with Complexity in a Post-Pa...
The Work Ahead in Intelligent Automation: Coping with Complexity in a Post-Pa...
 

Making Data Quality a Way of Life: Operational Contexts and Governance Interplays

Why Operational Data Quality Is Different from Data Quality Projects
Operational data quality, while related to the data quality efforts that occur during development projects, requires a mind-set shift among data quality practitioners. Some processes that are prerequisites for operational data quality are often not enabled out of the box by the major data quality products. The baselining process, which we detail later in this white paper, is necessary for identifying data quality exceptions, which are then categorized in the operational data quality processes.
This concept of operational data quality is a departure from what many organizations use their data quality process for, which is often limited to creating a set of rules to elevate the quality of information consumed during a development project by enriching, de-duplicating or cleansing the information. These data quality efforts are commonly redeployed after the completion of the development initiative to help with the next development initiative, and the reasonability checks that ensure the quality of information such as messages, queries and files are rarely employed. Thus, the once pristine data quality associated with the development project often dissipates into less-than-trustworthy information.
Figure 1: How Data Drives Value.
The vendors that publish data quality suites are mostly focused on ensuring pristine data during a delivery initiative. They rarely have complete solutions, and commonly have no capabilities for operational data quality controls. To implement operational data quality capabilities, four major components are mandatory:
• Translate the rules developed for data quality into a repeatable workflow that can be executed either prior to or in conjunction with extract, transform and load (ETL) processes. Depending on the system, this can be done using either a message/transaction process or a file (or query)/batch process. The workflow must be made operational to be useful. It is important to note that many data quality solutions available in the marketplace cannot be made operational and therefore cannot be used as an operational data quality engine.
• Isolate a place to record exceptions identified during the operational data quality process. This can be done by flagging exceptions in the normal data stream (e.g., if isolating exceptions would negatively impact the financial performance of the organization, such as not processing customer orders because of data quality issues) or by holding them for cleansing and resubmission.
• Develop a series of baselines used to determine the reasonability of incoming data.
• Categorize the baseline exceptions and dispatch them to the originators (i.e., the suppliers of source data that is validated for reasonability using operational data quality controls) for exposed transformation errors, and to compliance and management teams for process exceptions (market disruptions that require management attention, fraud, etc.).
There is a movement to manage data as a non-expiring asset of the organization, making this a primary function of the newly created position of chief data officer. If information is trustworthy, it can be wielded with an understanding that the underlying data is relevant and accurate. Furthermore, it means that the information's derivation can be reviewed at any point in time, should that derivation be questioned. Highly relevant and accurate information that can be wielded just in time is simply worth more to the organization because it can be used to generate greater degrees of value. Measuring the value of the data asset is the next great challenge for the business community. Peter Aiken's Data Blueprint describes (through data strategy and other means) the processes to ensure the accuracy and relevance of data and to measure the value of the information exposed to the data blueprint processes.
Figure 2: Gauging Trustworthiness. Can I trust the data? Do I have the most current data? Is there more relevant data I should be using? What was done to it before I received it? What is the lineage and context of the data? Does the data mean what I think it means? Will my analyses and decisions be tainted by bad data?
When the focus of data quality is on a project or program either consuming data from or supplying data assets to other programs, the interplay
between the organization's data governance and data quality functions comes into focus. At organizations that treat their data as a key asset, the data governance function is accountable for prioritizing the means to elevate the value of data assets, one of which is data quality. The symbiosis between data quality and data governance is especially strong and will be explored later in this white paper.

The Secret to Making Data Quality a Way of Life
Data assets used to derive value for a business entity can and should be mapped to a business model to assess the organizational risks associated with the business model artifacts that consume or supply data to the business. These exposures will be of one of two types, originated data sources or transformed data sources, and both should be carefully scrutinized as part of a well-honed data trustworthiness assurance process. Only then can a business measure and ensure the creation, capture and origination of value from the data assets used throughout the enterprise (see Figure 3).
Figure 3: Going to the Source. Internal, acquired, log-sourced and other data sources feed the business model artifacts through data origination and data transformation trustworthiness assurance, with interfaces to data governance.
The value of data can be measured by the incremental benefits it provides to the organization: fine-tuning processes, shepherding innovations for market disruption, capitalizing on opportunities presented in the marketplace, or mitigating risks sourced within the organization (e.g., process misfires, fraud) or outside it (e.g., competitors cherry-picking customer bases, competitive market disrupters). Operational data quality taken to its natural end state is the process for determining which components generated from within the organization, or collected from external data sources, are outside the expected tolerances.
Statistically known as an outlier, such a data point can be accurate good or bad news, or something that requires research. It does not mean that the data point is bad; it simply means that what arrived through the operational data quality processes was an outlier, and it presents either a data quality remediation opportunity or something requiring management action. Something that requires management action is in fact the gateway to data-driven risk management: a detector for something that is not meeting expected data tolerances.
There are four data stores or databases required to perform operational data quality; a minimal sketch of how they and the orchestration code fit together follows the component list below. It is important to note that in most cases the data stores provided by third-party vendors do not offer out-of-the-box capabilities to execute operational data quality; they are usually designed to augment delivery projects.
• Data stores:
»» To validate the reasonability of incoming data, the total population of previously processed data is required. This is used to compute the variance from expected values (typically two to three standard deviations from the mean for analyzable information such as metrics and numeric data) and the alignment to employed taxonomies, ontologies and reference data for non-tabular and textual data.
»» A place to house the acceptability, or baseline, criteria. The criteria are used at run-time to determine whether incoming data is reasonable. An example baseline: sales with a variance of more than one standard deviation (one sigma) from the mean of sales should be held for review. This does not mean the information is bad; it just means it requires further investigation.
»» A place to house data that is deemed worthy of further investigation prior to committing it to the intended target.
»» A place to house baseline statistics, the measures used to orchestrate operational data quality. An example of a baseline statistic is that a +/-5% tolerance is acceptable for the day-over-day valuation of a managed asset within a trading desk portfolio; day-over-day variations greater than +/-5% would therefore be recorded as an operational data quality exception. The operational data quality team is accountable for reviewing the number of exceptions triggered for each measure to determine whether the triggering rules were too lax or too stringent and, in coordination with the data governance team, making the necessary changes to the baseline statistics.
Figure 4: Making Sense of Data Reasonability. Incoming data is profiled against baseline criteria and previously stored data; reasonable data passes through (a listener, if used, discards it but keeps statistics), while suspect data is suspended, alerted on and reflected in a scorecard of profiled quality defects and reasonability scores.
• Program code used to orchestrate operational data quality: This is a means of interrogating incoming transactions, and it can be done in one of three ways:
»» A batch cycle that interrogates data reasonableness after the fact through regularly scheduled routines. This approach requires the ability to flag data written into the intended target with a designation indicating that the data is suspect and under investigation. Batch operations do not have the stringent performance constraints of online processing environments; while performance still matters, the impact of data quality processes on production operations requires less scrutiny, which allows additional in-line or parallel processing to identify and flag, or set aside, data suspected of significant quality anomalies. Flagging data (setting a data quality flag or setting the data aside) is the primary means of identifying suspect data that needs further investigation. Generally, when target environments are complex (a big data lake, or a complex data environment such as SAP, PeopleSoft or a federated data warehouse), it is less manageable to flag data designated for reasonableness validation, because such environments house data in ways that require program code to locate it (e.g., ABAP code for accessing data in SAP, or NoSQL interfaces for big data). The batch approach does not impair the performance characteristics of production operations, but it leaves a window in which decisions can be made on suspect data until the reasonability flag is set.
»» A listener-based technology, which grabs a copy of data destined for the target environment while it is being processed, validates its reasonability and then either sets a flag or writes the data to a new data store as close to the write to the intended target as practicable. While many data quality tools do not possess this capability out of the box, it can be implemented with minimal effort. This approach does not impair the performance characteristics of production operations and minimizes the timeframe in which decisions could be made using suspect data.
»» Stream data quality routines that validate the quality of information as part of the job stream and either set a flag to designate data that needs validation or write the data to a suspense data store. This approach eliminates the timeframe in which decisions can be made using suspect data, but introduces a risk that the performance characteristics of production operations will be inconsistent because of the in-stream checks.
The orchestration code must also provide:
»» A means to send out data quality alerts concerning data items worthy of review. It is important to prioritize the alerts so that those who need every data alert receive them, while managers who need to review the alerts (e.g., the CFO for financial alerts) receive only material alerts and summary statistics.
»» A means to ensure that data items flagged for review are actually reviewed and dispensed with. Typically, the data quality team accountable for operational data quality produces aging reports of the exceptions and publishes the categorized aging through a dashboard or scorecard.
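To make the interplay between the baseline stores and the orchestration code concrete, the following is a minimal sketch, in Python, of a reasonability check in the spirit of the components above. The record layout, store names, two-sigma limit and +/-5% day-over-day tolerance are illustrative assumptions, not a prescribed implementation.

"""Minimal sketch of an operational data quality reasonability check.
Record fields, store names and tolerances are illustrative assumptions."""
from dataclasses import dataclass, field
from datetime import datetime, timezone
from statistics import mean, stdev


@dataclass
class BaselineCriteria:
    """Acceptability criteria held in the baseline store."""
    sigma_limit: float = 2.0              # flag values more than 2 standard deviations from the mean
    day_over_day_tolerance: float = 0.05  # +/-5% tolerance for day-over-day valuations


@dataclass
class SuspenseRecord:
    """Suspect data held for review instead of (or in addition to) the target."""
    record: dict
    reason: str
    received_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


def check_reasonability(record: dict, history: list[float],
                        criteria: BaselineCriteria) -> SuspenseRecord | None:
    """Return a SuspenseRecord if the incoming value fails the baseline, else None."""
    value = record["valuation"]

    # Variance from expected values: compare against previously processed data.
    if len(history) >= 2:
        mu, sigma = mean(history), stdev(history)
        if sigma > 0 and abs(value - mu) > criteria.sigma_limit * sigma:
            return SuspenseRecord(record, f"outside {criteria.sigma_limit} sigma of history")

    # Day-over-day tolerance: the +/-5% style baseline statistic.
    if history:
        previous = history[-1]
        change = abs(value - previous) / abs(previous) if previous else float("inf")
        if change > criteria.day_over_day_tolerance:
            return SuspenseRecord(record, f"day-over-day change {change:.1%} exceeds tolerance")

    return None  # reasonable: pass straight through to ETL / the intended target


if __name__ == "__main__":
    history = [100.0, 101.5, 99.8, 100.9]            # previously processed valuations
    incoming = {"asset_id": "A-17", "valuation": 112.0}
    suspect = check_reasonability(incoming, history, BaselineCriteria())
    if suspect is not None:
        print("held for review:", suspect.reason)     # goes to the suspense store and an alert
    else:
        print("committed to target")

In a listener-based deployment this check would run against a copy of the in-flight data and keep only statistics for reasonable records; in a batch deployment it would run after the fact and set a reasonability flag on rows already written to the target.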
A new class of software has emerged to perform data quality checks at every data touch-point (e.g., when data is combined with other data, moved to another state in the job flow or otherwise transformed). The rule base, which typically serves as the means to judge the tolerances for numeric data variances and the reasonableness of transactions, is used to score the incoming data and determine whether it should be reviewed prior to committing it as an executed transaction or transmission file.
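As an illustration of this kind of touch-point scoring, the sketch below combines several rule results into a single reasonability score and compares it with a commit threshold. The rule names, weights and threshold are assumptions made for the example and do not describe any particular product's rule base.

"""Illustrative touch-point scoring against a rule base (rule names and weights are assumptions)."""
from typing import Callable

# Each rule returns True when the record looks reasonable for that aspect.
Rule = Callable[[dict], bool]

RULE_BASE: dict[str, tuple[Rule, float]] = {
    "amount_within_tolerance": (lambda r: 0 < r["amount"] < 1_000_000, 0.5),
    "currency_in_reference_data": (lambda r: r["currency"] in {"USD", "EUR", "GBP"}, 0.2),
    "counterparty_present": (lambda r: bool(r.get("counterparty")), 0.3),
}

COMMIT_THRESHOLD = 0.8  # assumed: below this, hold the transaction for review


def reasonability_score(record: dict) -> float:
    """Weighted share of rules the record satisfies, in [0, 1]."""
    total = sum(weight for _, weight in RULE_BASE.values())
    passed = sum(weight for rule, weight in RULE_BASE.values() if rule(record))
    return passed / total


def disposition(record: dict) -> str:
    """Decide whether to commit the record or route it for review."""
    return "commit" if reasonability_score(record) >= COMMIT_THRESHOLD else "review"


if __name__ == "__main__":
    trade = {"amount": 250_000, "currency": "USD", "counterparty": ""}
    # Missing counterparty drops the score to 0.70, below the threshold.
    print(disposition(trade), f"(score={reasonability_score(trade):.2f})")  # review (score=0.70)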
It should be expected that the data lineage process will create a large amount of data, and that understanding which systems a piece of data touches throughout its flow will be quite difficult to know in advance without some level of listener technology (i.e., technology that tracks the movement of data rather than explicitly pre-recording the hops the data is expected to traverse); a minimal sketch of such a listener appears at the end of this section. Such oversight is required for new regulatory processes in the financial brokerage (e.g., Dodd-Frank) and insurance (e.g., Solvency II) spaces. But there are additional outside data touch-points in other industries, such as trading exchanges, where there is an inherent trust that data is transmitted properly, or royalty-based solutions and other transaction environments where the lineage of data needs to be understood (e.g., fraud detection, validated systems, Basel III).

Enabling Operational Data Quality
To effect operational data quality, it is critical that data is pre-validated to assure reasonability before being made available for reporting and analytics. This requires operational interfaces that assure the reasonability of data before it arrives in staging environments, data warehouses and reporting marts.
Such a process will become unwieldy if the business lacks an organization tasked with orchestrating data quality validation processes. This organization should be accountable for:
• Completeness of the baselines (baselines are written for information used in key decision-making processes).
• Assurance that the processes used to ensure information reasonableness are operating and are not negatively impacting the stability and performance of the business's operational procedures. Without this, it is unlikely that the business will embark on the operational data quality journey.
• Validation that information flagged as questionable is dispensed with. Aging of the reasonability flags, or of information held in suspense, will be one of the primary vehicles.
• A process to review information flagged for reasonability and bucket exceptions as origination issues, transformation issues, procedural issues, procedural exceptions (i.e., fraud) and market disruptions (new innovations from competitors, etc.). Note that this requires an understanding of the business represented by the data exceptions.
The bucketed exception summary and highlight lists will be provided to the governance and/or compliance functions. Governance will utilize the list to prioritize initiatives, while compliance will use the information to fine-tune compliance rules and potentially build surveillance cases.
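Returning to the listener-style lineage tracking referenced at the start of this section, the sketch below records each observed hop for a record rather than pre-recording the hops it is expected to traverse. The event fields and system names are illustrative assumptions.

"""Minimal sketch of listener-style data lineage tracking (fields and names are illustrative)."""
from dataclasses import dataclass, field
from datetime import datetime, timezone
from collections import defaultdict


@dataclass
class LineageEvent:
    """One observed hop: who/what/where/when for a data value event."""
    record_id: str
    system: str           # system that touched the data, e.g. "order-intake"
    operation: str        # e.g. "ingested", "standardized", "combined", "loaded"
    observed_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


class LineageListener:
    """Tracks movement as it happens instead of relying on a pre-declared data flow."""

    def __init__(self) -> None:
        self._log: dict[str, list[LineageEvent]] = defaultdict(list)

    def observe(self, record_id: str, system: str, operation: str) -> None:
        self._log[record_id].append(LineageEvent(record_id, system, operation))

    def trace(self, record_id: str) -> list[LineageEvent]:
        """Crawl backward/forward along the chain for a point-in-time view."""
        return list(self._log[record_id])


if __name__ == "__main__":
    listener = LineageListener()
    listener.observe("trade-42", "exchange-feed", "ingested")
    listener.observe("trade-42", "dq-engine", "standardized")
    listener.observe("trade-42", "warehouse", "loaded")
    for event in listener.trace("trade-42"):
        print(event.system, event.operation, event.observed_at.isoformat())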
Figure 5: Data Integrity Scorecard. Data integrity functionality versus data quality, ETL and data lineage capabilities, with check marks as in the original figure:
• Validate data conformance to standards and business rules during loads, inserts and audits into a system. ✓ ✓ ✓
• Identify errors as data moves between systems. ✓ ✓ ✓
• Recognize in situ errors caused during manual data updates. ✓
• Identify false positives (data conforms, but isn't actually valid). ✓ ✓
• Spot data that is missing. ✓ ✓
• Spot data that is abnormal (not in line with norms/trending). ✓ ✓
• Maintain system lineage, history and context for each data value event across all systems for the term of the data (who, what, where, when). ✓
• Consolidate data value consistency and conformance across all data portfolio sources for each line of business in a dashboard view. ✓
• Provide an enterprise data integrity analysis repository. ✓
• Provide troubleshooting ability to crawl backward/forward along the value chain and recreate a point-in-time view. ✓
• Certify data's conformance to standards and rules. ✓
The process, which accommodates both message-based and file/query-based operations, has several important components, which are evaluated step by step below.

Operational Data Quality Step 1: Ingestion
The data quality process will never have, nor should it have, the performance characteristics of operational processes. Therefore, it should never sit in line with operational processes, because doing so would upset the consistently favorable performance characteristics of operational systems. The ingestion process either attaches to a backup or read-only copy of data, or ingests messages and files into the DQ environment for analysis.

Operational Data Quality Step 2: Processing
From the data cache, data profiling rules are executed using baselining rules that identify and dispense with exceptions, or data that fails reasonability tests. Data that passes all reasonability tests is ignored by the operational data quality processes.

Operational Data Quality Step 3: Categorizing
The categorization process segregates exceptions into the following buckets (a minimal classification sketch follows this list):
• Market disruptions: items requiring management attention for potential strategy refinement.
• Procedural exceptions: items requiring compliance attention for potential fraud and other exceptions.
• Procedural issues.
• Transformation errors, requiring the attention of ETL teams.
• Origination errors, requiring the attention of development teams.
Figure 6: High-Level Data Quality Processes (ingestion, processing with standardization, profiling, cleansing and baselining, categorizing, and communicating; exceptions are routed to the governance council, compliance and the supplier of the message or data file).
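A minimal sketch of how such a classification step might be expressed is shown below. The category names come from the list above; the decision rules and exception attributes are illustrative assumptions, since in practice this call requires an understanding of the business represented by the exceptions.

"""Illustrative classification of baselining exceptions into the categories listed above."""
from dataclasses import dataclass
from enum import Enum, auto


class ExceptionCategory(Enum):
    MARKET_DISRUPTION = auto()      # management attention, strategy refinement
    PROCEDURAL_EXCEPTION = auto()   # compliance attention, potential fraud
    PROCEDURAL_ISSUE = auto()
    TRANSFORMATION_ERROR = auto()   # ETL teams
    ORIGINATION_ERROR = auto()      # development teams / source suppliers


@dataclass
class BaseliningException:
    source_system: str
    failed_rule: str
    lineage_ok: bool        # did the value arrive as the source sent it?
    source_value_ok: bool   # was the value already unreasonable at the source?
    suspected_fraud: bool


def categorize(exc: BaseliningException) -> ExceptionCategory:
    """Toy decision rules; a real program would rely on business review of each bucket."""
    if exc.suspected_fraud:
        return ExceptionCategory.PROCEDURAL_EXCEPTION
    if not exc.lineage_ok:
        return ExceptionCategory.TRANSFORMATION_ERROR   # broken somewhere between systems
    if not exc.source_value_ok:
        return ExceptionCategory.ORIGINATION_ERROR      # already wrong when it was created
    if exc.failed_rule.startswith("market_"):
        return ExceptionCategory.MARKET_DISRUPTION      # accurate but outside expectations
    return ExceptionCategory.PROCEDURAL_ISSUE


if __name__ == "__main__":
    exc = BaseliningException("order-intake", "market_day_over_day_valuation",
                              lineage_ok=True, source_value_ok=True, suspected_fraud=False)
    print(categorize(exc).name)  # MARKET_DISRUPTION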
Operational Data Quality Step 4: Communicating
Once categorized, the exceptions are sent off:
• To the supplier of the source exposed to operational data quality processes.
• To the governance council, for addressing changes to the priorities that impact the valuation of the organization's data assets.
• To the compliance team, for dispensation of fraud and other issues.
Scorecards are also constructed for measuring the continued improvement of data quality and the aging of operational data quality exceptions.

Operational Data Quality: Supporting Data Stores
The data stores supporting the operational data quality process are:
• The rules used for operational data quality.
• The source exposed to the scrutiny of operational data quality.
• The identified baselining exceptions.
• The reference data and standardization rules utilized.
• The quality mart used to measure increased data quality.
Figures 7, 8 and 9: Data Sources that Need Quality Safeguarding; Data Quality Workflow; Lack of Data Quality: The Dangers.
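The sketch below illustrates one way the communicating step and the exception-aging scorecard could be wired together: categorized exceptions are routed to the supplier, the ETL team, the governance council or compliance, and open exceptions are aged for the scorecard. The routing table and aging buckets are assumptions for the example.

"""Illustrative routing and aging of categorized exceptions (routing table and buckets are assumptions)."""
from collections import Counter
from datetime import datetime, timedelta, timezone

# Where each category from the classification step is sent.
ROUTING = {
    "ORIGINATION_ERROR": "supplier of the source",
    "TRANSFORMATION_ERROR": "ETL team",
    "PROCEDURAL_ISSUE": "governance council",
    "MARKET_DISRUPTION": "governance council",
    "PROCEDURAL_EXCEPTION": "compliance team",
}

AGE_BUCKETS = [(7, "under 1 week"), (30, "1-4 weeks"), (10**6, "over 30 days")]


def route(category: str) -> str:
    """Return the destination for an exception category."""
    return ROUTING.get(category, "data quality operations")


def aging_scorecard(open_exceptions: list[dict], now: datetime) -> Counter:
    """Count open exceptions per (category, age bucket) for the dashboard."""
    counts: Counter = Counter()
    for exc in open_exceptions:
        age_days = (now - exc["opened_at"]).days
        bucket = next(label for limit, label in AGE_BUCKETS if age_days <= limit)
        counts[(exc["category"], bucket)] += 1
    return counts


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    backlog = [
        {"category": "TRANSFORMATION_ERROR", "opened_at": now - timedelta(days=3)},
        {"category": "TRANSFORMATION_ERROR", "opened_at": now - timedelta(days=45)},
        {"category": "PROCEDURAL_EXCEPTION", "opened_at": now - timedelta(days=10)},
    ]
    for exc in backlog:
        print(exc["category"], "->", route(exc["category"]))
    print(dict(aging_scorecard(backlog, now)))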
Getting on the Road to Operational Data Quality
There are several ways to build a well-honed operational data quality program, one devised to make data quality a way of life. Very few of the components are technical in nature, but those that are play critical, specific roles.

Leadership
While leadership is a subset of people, it is an important component of the operational data quality roadmap. It is the leadership of the data quality organization that will shepherd the transformation from a project data quality focus (largely a development effort) to an operational data quality focus (largely a support effort, not a maintenance effort). Leadership will also justify the necessary financing for the execution of operational data quality controls.
Without proper leadership, ensuring buy-in to a potentially disruptive process will become an issue for those who have the greatest need for these operational data quality controls, because dispensing with the exceptions the controls identify creates additional work for them. (The process appears disruptive because it creates work for the greatest sources of suspect data.) These exceptions have in the past been lumped into a general pool of challenges to data trustworthiness.
Leadership from the data quality organization will also be accountable for spotlighting recurring data quality themes so that the data governance organization can prioritize and sponsor the development efforts necessary to eliminate these data quality challenges. Leadership from both the data quality and governance programs will be continuously required to measure and communicate the value of data assets and their impact on the valuation of the organization's data assets.
Figure 10: Data Quality Program Flow. Figure 11: Inputs on Data Quality Management.

People
Trained experts must be available to manage the codification of the baseline and categorization processes in order to execute an operational data quality program. These individuals will be involved in several components of the program:
• Facilitator and caretaker for the baseline rules used for operational data quality.
• Process owner for the categorization of operational data quality exceptions.
• Process owner for the computation of operational data quality metrics, including those that measure the positive incremental value of data assets associated with the operational data quality program.
• Process owner for the operational data quality scorecard.
• Process owner for the aging of operational data quality exceptions.
• Facilitator for the processes associated with the publication of alerts to source systems, governance and compliance.

Metrics
The metrics are the message that the operational data quality program is functioning as expected and that it is having the intended effects on the organization's data assets. Key categories that should be monitored include the following (a minimal tracking sketch appears at the end of this roadmap section):
• Usage metrics, such as data quality sessions, the volume of data accessed by data quality facilities, the number of fixes made to data through cleansing algorithms, the number of records enriched through standardization and other routines, etc. Usage metrics should track upward over time.
• Impact metrics, such as the improvement in customer information, increased revenue associated with better quality, cost reductions due to reduced data validation expenditures, etc. Impact metrics should track upward over time.
• Effort metrics, such as the hours of training provided for data quality, the number of presentations made about data quality, the effort spent obtaining data that requires cleansing, etc. Effort metrics should track downward per initiative over time.
• Cost metrics, such as the cost of software, hardware, acquired data and the manpower required to perform the data quality function. Both direct and indirect costs should be tracked, and these should track downward per data quality initiative over time.
• Operational effectiveness metrics, which measure the success of managing data as an organizational asset and the reduction of operational data quality exceptions caught per interfaced application over time.

Process
The processes of data quality will be directed by the governance council. A key requirement is that the process employed for operational data quality fits the organization, rather than trying to change the organization's culture and operational processes to fit the operational data quality program.

Technology
There are minimal technology components required to orchestrate an operational data quality program. The technology employed must meet three requirements:
• It can have direct access to the files and messages used in the company's operations without adversely impacting the consistently acceptable performance of production operations. Many data quality facilities cannot analyze incoming data without placing it in an environment augmented for data quality operations; such a requirement adds an unacceptable delay to making data available just in time for regularly occurring and extraordinary business processes.
• It can be integrated as the primary interface or interfaces to an operational data quality program without a significant uplift in data quality effort.
• It can be operated without manual intervention. Such manual intervention would eliminate the benefits achieved from an operational data quality program.

Data
Data is the raw material for an operational data quality program. Without data flowing through the operational data quality program, the process is simply a test case and will have minimal, if any, impact on incrementally improving the value of the organization's data assets.
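As a simple illustration of how these metric categories could be tracked against their expected direction, here is a minimal sketch; the metric names, sample values and trend rule are assumptions for the example, not prescribed measures.

"""Illustrative tracking of operational data quality metric categories against their expected trends."""

# Expected direction per category: usage and impact up, effort, cost and exceptions per application down.
EXPECTED_TREND = {"usage": "up", "impact": "up", "effort": "down", "cost": "down",
                  "exceptions_per_application": "down"}


def trend(series: list[float]) -> str:
    """Crude trend: compare the last observation with the first."""
    if len(series) < 2 or series[-1] == series[0]:
        return "flat"
    return "up" if series[-1] > series[0] else "down"


def scorecard(history: dict[str, list[float]]) -> dict[str, str]:
    """Mark each category as on-track or off-track against its expected direction."""
    return {
        category: "on-track" if trend(series) == EXPECTED_TREND[category] else "off-track"
        for category, series in history.items()
    }


if __name__ == "__main__":
    quarterly = {
        "usage": [1200, 1450, 1900],                # data quality sessions per quarter
        "impact": [0.20, 0.35, 0.50],               # e.g., revenue uplift attributed to quality
        "effort": [300, 260, 240],                  # hours per initiative
        "cost": [90, 95, 99],                       # spend per initiative, trending the wrong way
        "exceptions_per_application": [40, 32, 25], # exceptions caught per interfaced application
    }
    print(scorecard(quarterly))  # cost is reported as off-track; everything else is on-track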
Looking Ahead
One closing remark on the roadmap to operational data quality as a vehicle for making data quality a way of life: not all of the items on the roadmap happen at once. There will also be organizational strife, which the governance council can hopefully defuse. Without a functional, active data governance council, reducing the organizational strife associated with initiating an operational data quality roadmap must be the responsibility of the leadership of the data quality program.
A scorecard (as depicted in Figure 12) is traditionally the communication vehicle of choice to report on the activity and success of the operational data quality program and its effectiveness as a major cog in managing data as a non-expiring asset of the organization and making data quality a way of life.
Figure 12: An Operational Data Quality Scorecard (sample). Data exception statistics include the number of origination errors captured by the ODQ program, the number of transformation errors captured by the ODQ program, the average time required to address data quality exceptions and the number of programs subscribing to the ODQ program, each with drill-down and explanation views.

About the Author
Mark Albala is the Lead for Cognizant's Data Quality and Governance Service Line, which provides expertise in data quality and governance solutions to the portfolio of activities tackled by the Enterprise Information Management Business Unit. A graduate of Syracuse University, Mark has held senior thought leadership, advanced technical and trusted advisory roles for organizations focused on the disciplines of information management for over 20 years. He can be reached at Mark.Albala@cognizant.com.

About Cognizant
Cognizant (NASDAQ: CTSH) is a leading provider of information technology, consulting and business process outsourcing services, dedicated to helping the world's leading companies build stronger businesses. Headquartered in Teaneck, New Jersey (U.S.), Cognizant combines a passion for client satisfaction, technology innovation, deep industry and business process expertise, and a global, collaborative workforce that embodies the future of work. With over 75 development and delivery centers worldwide and approximately 199,700 employees as of September 30, 2014, Cognizant is a member of the NASDAQ-100, the S&P 500, the Forbes Global 2000 and the Fortune 500, and is ranked among the top performing and fastest growing companies in the world. Visit us online at www.cognizant.com or follow us on Twitter: Cognizant.

World Headquarters: 500 Frank W. Burr Blvd., Teaneck, NJ 07666 USA. Phone: +1 201 801 0233. Fax: +1 201 801 0243. Toll Free: +1 888 937 3277. Email: inquiry@cognizant.com.
European Headquarters: 1 Kingdom Street, Paddington Central, London W2 6BD. Phone: +44 (0) 20 7297 7600. Fax: +44 (0) 20 7121 0102. Email: infouk@cognizant.com.
India Operations Headquarters: #5/535, Old Mahabalipuram Road, Okkiyam Pettai, Thoraipakkam, Chennai, 600 096 India. Phone: +91 (0) 44 4209 6000. Fax: +91 (0) 44 4209 6060. Email: inquiryindia@cognizant.com.

© Copyright 2014, Cognizant. All rights reserved. No part of this document may be reproduced, stored in a retrieval system, transmitted in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the express written permission of Cognizant. The information contained herein is subject to change without notice. All other trademarks mentioned herein are the property of their respective owners.