Tackling the enterprise Data Quality challenge
Cognitivo Consulting
January 2020
2
COMPETING IN THE DIGITAL AGE
In a connected world, competing effectively in the digital age means making the right decisions at pace
3
UNLOCKING THE VALUE OF AI
Machine learning algorithms rely on data to learn for themselves
AI could potentially create $3.5 trillion to $5.8 trillion in annual value in the global economy
Source: McKinsey Global Institute, 2018
4
DATA-DRIVEN ORGANISATIONS
Leaders in the digital age are able to make strategic and operational decisions based on data, at scale
THIS IS THE DATA-DRIVEN ORGANISATION
5
BUILDING ON STABLE FOUNDATIONS
The quality of your decisions will be proportional to the quality of your data
Data Quality is a foundational element of achieving digital success
6
DQ MUST BE COORDINATED ACROSS THE ENTERPRISE
Poor DQ is a symptom of poor processes and systems, and addressing it requires coordination across the enterprise
DATA QUALITY MUST SUPPORT PROCESS ASSURANCE AND IMPROVEMENT ACROSS THE ENTERPRISE
Enterprise Architecture aligned end-to-end DQ approach
7
A successful DQ initiative relies on alignment to existing enterprise and risk management frameworks and assets
Information Architecture
Business / Process Architecture
Integration Architecture
Application & Infrastructure Architecture
1. Risk-based approach to identify key processes / use cases in-scope for DQ improvement
2. Definition of customer journeys and process value chains with customer & organisational outcomes defined
3. Definition of a business conceptual data model & business rules based on in-scope processes
4. Agreement of definitions (decomposition of metrics and critical data elements), sources of truth, RACI (e.g. owners)
5. Document data lineage (data flows) between key systems for each in-scope process / use case
6. Catalogue systems and critical data sets and controls environment within an Information Assets Register or Source Catalogue
[Figure: risk matrix with axes Impact and Likelihood, each rated Low / Medium / High, marking Inherent Risk (‘gross’ risk), DQ Treatment, Process improvement and Risk Tolerance]
[Figure: conceptual data model with entities <<Party>>, <<Item>>, Service, Arrangement, <<Event>>, Location and <<Classification>>, linked by relationships labelled "Owns, rents, buys, sells, leases", "Enters", "Provides, consumes", "Uses, maintains", "Creates conditions", "Type of", "Has", "Occurs at", "Involved in", "Triggers", "Consists of" and "Creates"]
3
[Figure: integration landscape. Channels: Web, Mobile, Broker, Contract Centre, Branch. Systems and functions: CRM, Product, Origination, Fulfilment, Risk / Capital Mgt, Follow up, Servicing, Credit Approval, Settlement, Payments, Finance, KYC, Sanctions, Performance. Also shown: Integration (Message/Stream + Batch) and Cloud Data Asset]
Figure panel titles:
Customer Journey & Associated Business Value Chain
Operational Risk Matrix
Business Conceptual Model
Business Metric Decomposition & Business Definitions
Integration Landscape / Data Lineage
Information Asset Register / Source Catalogue
Enterprise Data Decomposition Tree
Note: EcoProfit = NPAT - Cost of Equity ($) + IEL (CCA) + Imputation Credits
= NPAT - Cost of Equity (%) x Eco Cap ($)
ROE = NPAT / Book Equity, where Book Equity = EcoCap = Total Reg Cap
Credit Risk Capital
Other
Revenue
IEL (CCA)
X
Eco Cap ($)
Expenses
Cost of Capital
Rate (%)
-
NPAT - EL
basis ($)
Cost of Capital
+
Franking
Credits – Tax
Allocated
Expenses
Controllable
Expenses
mRWA
Loss Data
(ELD/ILD)
Economic
Profit ($)
ROE (%)
Tenor
Customer
Asset Class
Credit RWA
Capital Ratio
X
Market Risk
Capital
Reg EL
Op Risk Capital
Investment Stakes, Fixed Assets,
Deferred Acquisition
Exposure At
Default (EAD)
Provisions &
Delinquencies
Collective
Provisions
Retail Pooling &
Segmentation
Probability of
Default
Loss Given Default
Loan Amount /
Limit
Product Features
Individual
Provisions
On/Off Balance
Sheet
Revocability
Industry / ANZSIC
Salient Financials
CCRDomicile Country
SIHeld Collateral
Pricing
(Rates / Fees)
Bank Capital
*Set by Group Treasury
oRWA
Illustrative
X
Capital Buffer
(Stress Test)
TSR
(Total Shareholder Return)
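The note at the top of the decomposition tree can be worked through numerically. The following is a minimal sketch, assuming that Cost of Equity ($) equals Cost of Equity (%) multiplied by Eco Cap ($), which is one reading of the note. The function names and all input figures are hypothetical and are not taken from the deck.

```python
def economic_profit(npat, cost_of_equity_rate, eco_cap,
                    iel_cca=0.0, imputation_credits=0.0):
    """EcoProfit = NPAT - Cost of Equity ($) + IEL (CCA) + Imputation Credits.

    Assumption: Cost of Equity ($) = Cost of Equity (%) x Eco Cap ($).
    """
    cost_of_equity_dollars = cost_of_equity_rate * eco_cap
    return npat - cost_of_equity_dollars + iel_cca + imputation_credits


def return_on_equity(npat, book_equity):
    """ROE = NPAT / Book Equity."""
    if book_equity == 0:
        raise ValueError("book_equity must be non-zero")
    return npat / book_equity


# Illustrative inputs only (in $ millions); not figures from the deck.
npat = 120.0
eco_cap = 1000.0
cost_of_equity_rate = 0.10

eco_profit = economic_profit(npat, cost_of_equity_rate, eco_cap)
# The note sets Book Equity = EcoCap, so the same figure is reused here.
roe = return_on_equity(npat, book_equity=eco_cap)
```

Each input to these two functions is itself a node in the tree, which is why a defect in a leaf such as Exposure At Default would flow through to the headline metric.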
Principles of Cognitivo’s DQ approach
8
A pragmatic approach that doesn’t “boil the ocean” is required to focus on priority use cases while leveraging organisational assets and AI to scale
• Risk & Policy Based – Identify key processes that possess material data risk as prioritised areas to perform DQ diagnosis and treatment
• Process (use-case) Centric – Identify data flows that underpin key processes and address data quality across the entire system data flow
• Metadata Driven – Development or use of a conceptual data model as an abstraction layer to work with business stakeholders to agree definitions and business rules that are subsequently mapped to physical data models
• Analytics & ML Enabled – Use of data science techniques (such as ML, text analytics, vision) to build industry- and organisation-specific data matching and data quality diagnosis techniques
• Embedded in Business-As-Usual – Roll out of DQ controls and measurement (dashboard) as part of the organisation’s quality assurance processes, rather than constructing a new data KPI and consequence management framework
Example DQ use cases to improve key business outcomes
9
Cognitivo has extensive experience in executing data quality programmes within Financial Services, Government and Accounting business domains
Use Case
Compliance
• KYC / AML / CTF (Assurance of data feeds)
• CPS220, AIRB Accreditation
• APRA / ABS Regulatory Reporting (e.g. report on interest-only loans)
• Basel III Liquidity (FI/non-FI review)
• APS 120 Securitisation (Loan doc reconciliation)
• FATCA, GATCA
• OTC Reform, MiFID II (Cleanse LEI / SWIFT Code, Legal Form, Country of incorporation etc.)
• APS910 – SCV assurance
• IFRS9 / IFRS17 Assurance
• Staff Benefits Review (Review of former employees still on staff benefits programmes)
• Advice Compliance (SOA, PDS vs fees and charges review)
• …
Business Management
• Payroll Assurance
• Financial Management reporting (Line of Business)
• Finance cube, business unit, GL structure review
• …
Customer / Sales
• Customer Contact Details (marketing, product service)
• Consent status
• Customer Age Review
• Customer Address Review (e.g. Suburb / postal code combination)
• Customer segmentation review
• CRM – customer structure review (e.g. customer legal structure, customer groupings)
• …
DQ Execution
Lifecycle
Data Quality execution lifecycle
10
Cognitivo’s DQ Execution Lifecycle is linked to broader data demand management and IT planning lifecycles
Data Risk Demand
Management
Process / System
Improvement
Diagnosis
Conduct qualitative sizing, define requirements and business rules, and conduct root cause analysis
Profiling - Size/quantify magnitude of each DQ issue.
• Profile key data elements for validity / completeness issues
• Correlate data across systems to identify integrity issues; this can include techniques such as financial reconciliation (checksums)
• Deploy an analytical process to find illogical combinations of data, outliers etc.
• Use higher complexity techniques such as text analytics and computer vision to correlate with unstructured data sources
• Use machine learning approaches to identify patterns for acceptable values / ranges
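As an illustration of the first profiling step, the sketch below sizes completeness and validity for a single data element. It is a minimal example under stated assumptions: the function name `profile_field`, the field name `postcode` and the four-digit pattern are hypothetical and do not come from the deck.

```python
import re


def profile_field(records, field, pattern=None):
    """Return completeness and validity percentages for one data element.

    Completeness: share of records where the field is populated.
    Validity: share of populated values matching `pattern` (if given).
    """
    total = len(records)
    populated = [r[field] for r in records if r.get(field) not in (None, "")]
    if pattern is None:
        valid = len(populated)
    else:
        valid = sum(1 for v in populated if re.fullmatch(pattern, str(v)))
    return {
        "field": field,
        "completeness_pct": round(100.0 * len(populated) / total, 1) if total else 0.0,
        "validity_pct": round(100.0 * valid / len(populated), 1) if populated else 0.0,
    }


# Hypothetical customer records: one good value, one blank, one missing,
# and one with a letter "O" typed in place of a zero.
customers = [
    {"postcode": "2000"},
    {"postcode": ""},
    {},
    {"postcode": "20O0"},
]
postcode_profile = profile_field(customers, "postcode", pattern=r"[0-9]{4}")
```

Reporting the two percentages separately keeps "not captured" apart from "captured incorrectly", which usually have different root causes.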
Holistic view of DQ issues & prioritisation
• Organisation-wide DQ issues register with self-assessment process to periodically assess level of DQ risk
• DQ deep-dives through workshop / interviews for high risk areas
• Prioritise high impact and high occurrence issues to go into ‘fix process’
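The prioritisation step above can be sketched as a simple impact-by-likelihood score over the issues register. This is an illustrative sketch only: the three-point scale, the multiplication rule, the threshold of 6 and the register entries are all assumptions rather than details from the deck.

```python
from dataclasses import dataclass

# Assumed three-point scale, mirroring a Low / Medium / High risk matrix.
SCALE = {"low": 1, "medium": 2, "high": 3}


@dataclass
class DQIssue:
    name: str
    impact: str       # "low" | "medium" | "high"
    likelihood: str   # how often the issue occurs, on the same scale


def risk_score(issue):
    """Score an issue as impact x likelihood, giving a value from 1 to 9."""
    return SCALE[issue.impact] * SCALE[issue.likelihood]


def prioritise(issues, threshold=6):
    """Return issues at or above the threshold, highest score first."""
    selected = [i for i in issues if risk_score(i) >= threshold]
    return sorted(selected, key=risk_score, reverse=True)


# Hypothetical register entries for illustration.
register = [
    DQIssue("Missing customer date of birth", impact="high", likelihood="high"),
    DQIssue("Suburb / postcode mismatch", impact="medium", likelihood="high"),
    DQIssue("Legacy fax number format", impact="low", likelihood="medium"),
]
fix_queue = prioritise(register)
```

Issues below the threshold stay on the register for the next periodic self-assessment rather than entering the fix process.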
Correction process
• Obtain correct values and subsequent cleansing / system update.
• Automated through cross-system and 3rd party data source lookup or derivation
• As a final step, client outreach may be required (e.g. establish call-centre process)
Cleansing process
• Establish process for bulk update, testing and roll back within core systems
• For systems where bulk update is not possible, develop RPA and manual update capabilities.
Systemic Fix
• Make recommendations for systemic fixes through the organisation’s broader change / fail-fix agenda
1
Monitoring & Reporting
• DQ issues profiled / fixed all trace back to a business unit, hence DQ metrics can form process / compliance KPIs for business owners.
• DQ scorecards can automate existing QA processes / operational risk controls by quantifying instances where data entry is missing / incorrect.
• Trend analysis on DQ results for each responsible business unit
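A scorecard of this kind can be sketched as an aggregation of rule results by business unit, with a trend taken between two runs. The sketch below is a minimal illustration; the tuple layout of the results, the business unit names and the counts are hypothetical.

```python
from collections import defaultdict


def scorecard(results):
    """Aggregate rule results into a pass rate (%) per business unit.

    `results` is an iterable of (business_unit, records_checked, records_failed).
    """
    checked = defaultdict(int)
    failed = defaultdict(int)
    for unit, n_checked, n_failed in results:
        checked[unit] += n_checked
        failed[unit] += n_failed
    return {
        unit: round(100.0 * (checked[unit] - failed[unit]) / checked[unit], 1)
        for unit in checked
        if checked[unit] > 0
    }


def trend(previous, current):
    """Change in pass rate (percentage points) per unit between two runs."""
    return {u: round(current[u] - previous[u], 1) for u in current if u in previous}


# Hypothetical monthly results for two business units.
last_month = scorecard([("Retail", 1000, 80), ("Business Banking", 500, 25)])
this_month = scorecard([
    ("Retail", 1000, 50),
    ("Retail", 200, 10),
    ("Business Banking", 500, 30),
])
movement = trend(last_month, this_month)
```

Because every result row carries its owning business unit, the same output can feed an existing QA report without a separate data KPI framework.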
[Figure: DQ execution lifecycle stages: A Diagnosis, B Profiling, C Correction & Cleansing, D Monitoring & Reporting, with DQ Discovery]
Scalable Data Quality DevOps
11
Cognitivo’s data quality workflow incorporates analytical tools, business testing and deployment into a DQ DevOps process
De-duplicate customers within systems (collapse entities)
Customer Outreach (high-priority/risk)
Cross system & 3rd party lookup
Document / correspondence lookup
Customer self service
Client Applications
Correct (obtain correct values & validate)
Issue and Workflow Management
Manual testing values to update via risk-based sampling
Database Bulk Update / Amend
Front-End Data Entry (Inc. use of RPA – robotic process automation)
Cleanse (System Update)
Source Systems & Processes
Profiling
Monitoring & Reporting
Business Unit QA to include DQ measures
Customer Matching & Analytical Environment
Issue prioritisation (based on risk i.e. likelihood and consequence)
Results Dashboard
Raise new DQ rules
To be updated
DQ Rules Engine
DQ Development Workspace
Ingestion
DQ Production Deployment
Business / Quality Assurance
Root Cause Analysis
Process & System Fix
Operational Environment
Legend: Automated processes / Manual processes
DQ Workspace & Platform Architecture
12
Cognitivo has a vendor-agnostic DQ technical reference architecture that can be implemented on any cloud or on-premises environment
To be updated
Data Pipeline
Ingestion
Client Source Systems
Client Data Warehouse(s)
Document Repository
New Extracts for checksums etc
Batch Pipeline
Real Time Pipeline (APIs / Messaging)
User Interface
DQ Policy & Rules Configuration
Case Management
Conformed / derived values to expedite DQ rule execution and provide a history of values for outlier / drift detection
Raw source system data to derive row counts and perform validity checks
User Interface
Data Lake
Data Science / ML Discovery Environment
Source Data Layer
Linked data (lightly integrated)
Scheduler
Rules Store
Self-service data Ingestion
Data Science Tools / Workbench
Execution of analytical workloads
DQ dashboards for consumption by data stewards and business stakeholders (data owners)
DQ Rule Execution (Python)
DQ Rules based on the derived semantic data model stored in JSON format within the rules store
Scheduler to execute DQ Rules on a periodic basis
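To make the rules store and Python rule execution concrete, the sketch below loads rule definitions from JSON and runs them over a batch of records. It is a hedged illustration, not Cognitivo's implementation: the rule schema (`id`, `field`, `check`, `params`), the check names and the sample records are all assumptions.

```python
import json
import re

# Hypothetical rule definitions as they might sit in a JSON rules store.
RULES_JSON = """
[
  {"id": "R1", "field": "postcode", "check": "regex",
   "params": {"pattern": "^[0-9]{4}$"}},
  {"id": "R2", "field": "customer_type", "check": "allowed",
   "params": {"values": ["individual", "business"]}},
  {"id": "R3", "field": "age", "check": "range",
   "params": {"min": 0, "max": 100}}
]
"""

# Each check takes a value and the rule's params and returns True on pass.
CHECKS = {
    "regex": lambda v, p: v is not None and re.fullmatch(p["pattern"], str(v)) is not None,
    "allowed": lambda v, p: v in p["values"],
    "range": lambda v, p: v is not None and p["min"] <= v <= p["max"],
}


def run_rules(records, rules):
    """Return a list of (rule_id, record_index) for every failed check."""
    failures = []
    for rule in rules:
        check = CHECKS[rule["check"]]
        for idx, record in enumerate(records):
            if not check(record.get(rule["field"]), rule["params"]):
                failures.append((rule["id"], idx))
    return failures


rules = json.loads(RULES_JSON)
records = [
    {"postcode": "2000", "customer_type": "individual", "age": 42},
    {"postcode": "20O0", "customer_type": "trust", "age": 130},
]
failures = run_rules(records, rules)
```

Keeping rules as data means a data steward can add or adjust a rule without a code release; a scheduler would then call `run_rules` on each periodic run and write the failures to the results store.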
DQ Rules Engine
Text Extraction & OCR
Results Dashboard (PowerBI / Qlik)
Data Workspace Provisioning
DQ Profile Result Store
Semantic / Conformed Data (with history)
Data Science Development & Collaboration Tools e.g. Git, Jupyter
Cross-system table linking to correlate values across matched customers
Store DQ profiling output results. Contains historical values to allow historical trend analysis of DQ
Management of DQ rules, tolerances and business owners of DQ events
Case management tool for logging, investigating and remediating DQ issues
Provision of persistent temporary storage and access to access-controlled data sets (specific to department, user and use cases)
Data import tools for unmanaged datasets (used for discovery purposes)
Text extract and analytics libraries e.g. Tesseract
Batch data ingestion using file (CSV) or ODBC/JDBC
Real-time integration
Key Selected Technologies
DQ profiling techniques to be employed
13
Cognitivo’s analytical DQ framework deploys a number of analytical tests across structured and unstructured data sources
Test for Completeness
• Record count anomalies
• Financial Reconciliation (check-sums)
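The two completeness tests can be sketched as a control-total reconciliation between two extracts and a tolerance check on daily record counts. This is a minimal sketch under assumptions: the function names, the `amount` field, the 20% tolerance and the sample figures are hypothetical.

```python
from decimal import Decimal


def reconcile(source_rows, target_rows, amount_field="amount"):
    """Compare row counts and control totals between two extracts."""
    # Decimal avoids binary floating-point drift when summing money.
    src_total = sum(Decimal(str(r[amount_field])) for r in source_rows)
    tgt_total = sum(Decimal(str(r[amount_field])) for r in target_rows)
    return {
        "row_count_diff": len(source_rows) - len(target_rows),
        "amount_diff": src_total - tgt_total,
        "reconciled": len(source_rows) == len(target_rows) and src_total == tgt_total,
    }


def count_anomaly(history, today, tolerance=0.2):
    """Flag today's record count if it deviates from the historical mean
    by more than `tolerance` (expressed as a fraction of the mean)."""
    mean = sum(history) / len(history)
    return abs(today - mean) > tolerance * mean


# Hypothetical extracts: one payment is missing from the target system.
source = [{"amount": "100.10"}, {"amount": "250.00"}, {"amount": "49.90"}]
target = [{"amount": "100.10"}, {"amount": "250.00"}]
recon = reconcile(source, target)

dropped_feed = count_anomaly([1000, 1020, 980], today=700)
normal_feed = count_anomaly([1000, 1020, 980], today=1050)
```

A non-zero `amount_diff` points to missing or altered records even when the row counts happen to agree.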
Test for Validity
• Data Type & Format checks (Regex pattern match)
• Allowable values / reference data lookup
• Null Value check
Test for Accuracy
• Illogical combinations of multiple data fields (e.g. individual with a business name)
• Single Field based logic check (e.g. age > 100)
• 3rd party cross reference
• Cross system value cross-reference
• Reasonable value check (record anomaly / outlier, value drift over time)
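Three of the accuracy tests above lend themselves to a short sketch: the illogical combination and single-field examples quoted on the slide, plus a reasonable value check. The z-score method is a swapped-in technique chosen for brevity, not one named in the deck, and the field names and sample values are hypothetical.

```python
from statistics import mean, stdev


def illogical_combination(record):
    """Slide example: an individual should not carry a business name."""
    return record.get("customer_type") == "individual" and bool(record.get("business_name"))


def implausible_age(record, max_age=100):
    """Slide example: flag ages outside 0..max_age."""
    age = record.get("age")
    return age is not None and (age < 0 or age > max_age)


def outliers(values, z_threshold=3.0):
    """Indices of values more than z_threshold sample standard deviations
    from the mean (a simple z-score test)."""
    if len(values) < 3:
        return []
    mu, sigma = mean(values), stdev(values)
    if sigma == 0:
        return []
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > z_threshold]


# Hypothetical monthly balances with one suspicious spike. A lower
# threshold is used because a single extreme value in a sample this
# small cannot reach a z-score of 3.
balances = [100, 102, 98, 101, 99, 100, 500]
suspect = outliers(balances, z_threshold=2.0)
```

For small samples, a median-based measure would be more robust than the mean and standard deviation used here.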
Test for Timeliness
• Data Ingestion (ETL/ELT) Synchronisation review
Document Text Extraction & Cross Reference
Computer Vision: Image recognition & object classification
Test for Uniqueness
• Duplicates within systems
• Cross-system master data reconciliation
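Duplicate detection within a system can be sketched by grouping records on a normalised match key. This is a deliberately simple, exact-match illustration; the choice of name plus date of birth as the key, and the sample records, are assumptions. Production matching would typically add fuzzy comparison.

```python
import re
from collections import defaultdict


def match_key(record):
    """Normalised key: lower-cased name with punctuation and repeated
    whitespace collapsed, combined with date of birth."""
    name = re.sub(r"[^a-z0-9]+", " ", record["name"].lower()).strip()
    return (name, record["dob"])


def find_duplicates(records):
    """Return lists of record ids that share the same match key."""
    groups = defaultdict(list)
    for record in records:
        groups[match_key(record)].append(record["id"])
    return [ids for ids in groups.values() if len(ids) > 1]


# Hypothetical customer records: ids 1 and 2 differ only in case,
# spacing and a trailing full stop.
customers = [
    {"id": 1, "name": "John  Smith", "dob": "1980-01-01"},
    {"id": 2, "name": "john smith.", "dob": "1980-01-01"},
    {"id": 3, "name": "Jane Smith", "dob": "1980-01-01"},
]
duplicate_groups = find_duplicates(customers)
```

The same key function can be reused across systems as a first pass at cross-system master data reconciliation.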
Case Management
DQ Analytical
Engine
Cognitivo’s DQ Platform Capabilities
14
Cognitivo has a DQ application framework that can be deployed onto private clouds via containers or accessed as a SaaS offering
Data Steward Portal (UX)
• Create profiling rules
• Diagnose DQ issues through reports and dashboards
• Workflow to approve data changes and case manage remediation
• APIs to integrate with 3rd party applications and check valid data entry based on data quality rules
Core DQ Engine
• Semantic model of parameters for data stewards to create DQ rules
• DQ rule templates (e.g. regex functions, address validity, ABN format etc.)
• Analytical engine to run complex data accuracy / integrity rules
• API to allow 3rd party and customer automation, extension and access to DQ results
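As an example of what a rule template might look like, the sketch below validates an Australian Business Number. It goes one step beyond the "ABN format" wording on the slide by also applying the published ABN check-digit algorithm (subtract 1 from the first digit, apply the weights, and test divisibility by 89). The function name is hypothetical and this is not presented as the platform's own template.

```python
import re

# Published weighting factors for the 11 ABN digits.
ABN_WEIGHTS = (10, 1, 3, 5, 7, 9, 11, 13, 15, 17, 19)


def is_valid_abn(value):
    """Return True if `value` is 11 digits (spaces allowed) and passes
    the ABN modulus-89 check."""
    digits = re.sub(r"\s+", "", str(value))
    if not re.fullmatch(r"[0-9]{11}", digits):
        return False
    numbers = [int(d) for d in digits]
    numbers[0] -= 1  # the algorithm subtracts 1 from the leading digit
    total = sum(w * n for w, n in zip(ABN_WEIGHTS, numbers))
    return total % 89 == 0


well_formed = is_valid_abn("51 824 753 556")   # passes format and checksum
bad_checksum = is_valid_abn("51 824 753 557")  # right format, wrong check
too_short = is_valid_abn("12345")
```

A template like this catches keying errors that a pure format check would accept, since a mistyped digit normally breaks the checksum.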
Data Pipeline
• Securely connect on-prem data sources to cloud environments in an encrypted manner (Gateway)
• Database to store multiple time-stamped sampled extracts from source systems
• Efficient data ingestion pipeline with connectors for key council systems (e.g. Dynamics, ..)
Embedding DQ processes
• Build continuous improvement initiatives within directorates based on DQ analysis (e.g. asking additional questions when customers call/visit)
• Set DQ KPIs within process metrics (e.g. accuracy of mandatory data capture)
Investigation
(Jupyter)
Reporting
Dashboard
Customer Data Sources
DQ Profiling
Datastore
Data Stewards (Users)
User Interface
Data Quality
Hub
Scheduler DQ Profiler
Connectors
DQ Managed
Parameters
(semantic model)
Gateway
Mobile App
API
DQ Rules Library
Web Interface
Cognitivo DQ Platform Screenshots (1/2)
15
[Screenshot of the DQ platform; image not reproduced]
Cognitivo DQ Platform Screenshots (2/2)
16
[Screenshot of the DQ platform; image not reproduced]

More Related Content

What's hot

Improving data quality & complying with BCBS239
Improving data quality & complying with BCBS239Improving data quality & complying with BCBS239
Improving data quality & complying with BCBS239Alrick Dupuis
 
Business Intelligence System and instrumental level multi dimensional database
Business Intelligence System and instrumental level multi dimensional database Business Intelligence System and instrumental level multi dimensional database
Business Intelligence System and instrumental level multi dimensional database Rolta
 
Creating a Business Case for Big Data
Creating a Business Case for Big DataCreating a Business Case for Big Data
Creating a Business Case for Big DataPerficient, Inc.
 
London Financial Modelling Group 2015 04 30 - Model driven solutions to BCBS239
London Financial Modelling Group 2015 04 30 - Model driven solutions to BCBS239London Financial Modelling Group 2015 04 30 - Model driven solutions to BCBS239
London Financial Modelling Group 2015 04 30 - Model driven solutions to BCBS239Greg Soulsby
 
Aligning finance , risk and compliance
Aligning finance , risk and complianceAligning finance , risk and compliance
Aligning finance , risk and complianceJAMES OKARIMIA
 
IBOR Middle Office Information Delivery
IBOR Middle Office Information DeliveryIBOR Middle Office Information Delivery
IBOR Middle Office Information DeliveryBurak S. Arikan
 
KPMG - BCBS239_Bracing for Change
KPMG - BCBS239_Bracing for ChangeKPMG - BCBS239_Bracing for Change
KPMG - BCBS239_Bracing for ChangeNanda Thiruvengadam
 
James Okarimia Aligning Finance , Risk and Compliance to Meet Regulation
James Okarimia   Aligning Finance , Risk and Compliance to Meet RegulationJames Okarimia   Aligning Finance , Risk and Compliance to Meet Regulation
James Okarimia Aligning Finance , Risk and Compliance to Meet RegulationJAMES OKARIMIA
 
James Okarimia - Aligning Finance, Risk and Data Analytics in Meeting the Req...
James Okarimia - Aligning Finance, Risk and Data Analytics in Meeting the Req...James Okarimia - Aligning Finance, Risk and Data Analytics in Meeting the Req...
James Okarimia - Aligning Finance, Risk and Data Analytics in Meeting the Req...JAMES OKARIMIA
 
Trillium Software Building the Business Case for Data Quality
Trillium Software Building the Business Case for Data QualityTrillium Software Building the Business Case for Data Quality
Trillium Software Building the Business Case for Data QualityTrillium Software
 
The Changing Data Quality & Data Governance Landscape
The Changing Data Quality & Data Governance LandscapeThe Changing Data Quality & Data Governance Landscape
The Changing Data Quality & Data Governance LandscapeTrillium Software
 
James Okarimia - Aligning Finance , Risk and Data Analytics in Meeting the R...
James Okarimia -  Aligning Finance , Risk and Data Analytics in Meeting the R...James Okarimia -  Aligning Finance , Risk and Data Analytics in Meeting the R...
James Okarimia - Aligning Finance , Risk and Data Analytics in Meeting the R...JAMES OKARIMIA
 
Legal Entity Risk and Counter-Party Exposure April 2016
Legal Entity Risk and Counter-Party Exposure  April 2016Legal Entity Risk and Counter-Party Exposure  April 2016
Legal Entity Risk and Counter-Party Exposure April 2016bfreeman1987
 
Mis2013 chapter 12 business intelligence and knowledge management
Mis2013   chapter 12 business intelligence and knowledge managementMis2013   chapter 12 business intelligence and knowledge management
Mis2013 chapter 12 business intelligence and knowledge managementAndi Iswoyo
 
Achieving Digital Transformation in Regulatory
Achieving Digital Transformation in RegulatoryAchieving Digital Transformation in Regulatory
Achieving Digital Transformation in RegulatoryCary Smithson
 

What's hot (16)

Improving data quality & complying with BCBS239
Improving data quality & complying with BCBS239Improving data quality & complying with BCBS239
Improving data quality & complying with BCBS239
 
Business Intelligence System and instrumental level multi dimensional database
Business Intelligence System and instrumental level multi dimensional database Business Intelligence System and instrumental level multi dimensional database
Business Intelligence System and instrumental level multi dimensional database
 
Creating a Business Case for Big Data
Creating a Business Case for Big DataCreating a Business Case for Big Data
Creating a Business Case for Big Data
 
Bcbs 239 v4 30 oct
Bcbs 239 v4 30 octBcbs 239 v4 30 oct
Bcbs 239 v4 30 oct
 
London Financial Modelling Group 2015 04 30 - Model driven solutions to BCBS239
London Financial Modelling Group 2015 04 30 - Model driven solutions to BCBS239London Financial Modelling Group 2015 04 30 - Model driven solutions to BCBS239
London Financial Modelling Group 2015 04 30 - Model driven solutions to BCBS239
 
Aligning finance , risk and compliance
Aligning finance , risk and complianceAligning finance , risk and compliance
Aligning finance , risk and compliance
 
IBOR Middle Office Information Delivery
IBOR Middle Office Information DeliveryIBOR Middle Office Information Delivery
IBOR Middle Office Information Delivery
 
KPMG - BCBS239_Bracing for Change
KPMG - BCBS239_Bracing for ChangeKPMG - BCBS239_Bracing for Change
KPMG - BCBS239_Bracing for Change
 
James Okarimia Aligning Finance , Risk and Compliance to Meet Regulation
James Okarimia   Aligning Finance , Risk and Compliance to Meet RegulationJames Okarimia   Aligning Finance , Risk and Compliance to Meet Regulation
James Okarimia Aligning Finance , Risk and Compliance to Meet Regulation
 
James Okarimia - Aligning Finance, Risk and Data Analytics in Meeting the Req...
James Okarimia - Aligning Finance, Risk and Data Analytics in Meeting the Req...James Okarimia - Aligning Finance, Risk and Data Analytics in Meeting the Req...
James Okarimia - Aligning Finance, Risk and Data Analytics in Meeting the Req...
 
Trillium Software Building the Business Case for Data Quality
Trillium Software Building the Business Case for Data QualityTrillium Software Building the Business Case for Data Quality
Trillium Software Building the Business Case for Data Quality
 
The Changing Data Quality & Data Governance Landscape
The Changing Data Quality & Data Governance LandscapeThe Changing Data Quality & Data Governance Landscape
The Changing Data Quality & Data Governance Landscape
 
James Okarimia - Aligning Finance , Risk and Data Analytics in Meeting the R...
James Okarimia -  Aligning Finance , Risk and Data Analytics in Meeting the R...James Okarimia -  Aligning Finance , Risk and Data Analytics in Meeting the R...
James Okarimia - Aligning Finance , Risk and Data Analytics in Meeting the R...
 
Legal Entity Risk and Counter-Party Exposure April 2016
Legal Entity Risk and Counter-Party Exposure  April 2016Legal Entity Risk and Counter-Party Exposure  April 2016
Legal Entity Risk and Counter-Party Exposure April 2016
 
Mis2013 chapter 12 business intelligence and knowledge management
Mis2013   chapter 12 business intelligence and knowledge managementMis2013   chapter 12 business intelligence and knowledge management
Mis2013 chapter 12 business intelligence and knowledge management
 
Achieving Digital Transformation in Regulatory
Achieving Digital Transformation in RegulatoryAchieving Digital Transformation in Regulatory
Achieving Digital Transformation in Regulatory
 

Similar to Cognitivo - Tackling the enterprise data quality challenge

SD Basel process automation seminar presentation
SD Basel process automation seminar presentationSD Basel process automation seminar presentation
SD Basel process automation seminar presentationsarojkdas
 
ClearCost Introduction 2015
ClearCost Introduction 2015ClearCost Introduction 2015
ClearCost Introduction 2015Mark S. Mahre
 
TOP_407070357-Data-Governance-Playbook.pptx
TOP_407070357-Data-Governance-Playbook.pptxTOP_407070357-Data-Governance-Playbook.pptx
TOP_407070357-Data-Governance-Playbook.pptxSabrinaLameiras1
 
Fuel your Data-Driven Ambitions with Data Governance
Fuel your Data-Driven Ambitions with Data GovernanceFuel your Data-Driven Ambitions with Data Governance
Fuel your Data-Driven Ambitions with Data GovernancePedro Martins
 
Ebookblogv2 120116015321-phpapp01
Ebookblogv2 120116015321-phpapp01Ebookblogv2 120116015321-phpapp01
Ebookblogv2 120116015321-phpapp01Shubhashish Biswas
 
Get Smart About Technical Debt
Get Smart About Technical DebtGet Smart About Technical Debt
Get Smart About Technical DebtCAST
 
Business Intelligence Industry Perspective Session I
Business Intelligence   Industry Perspective Session IBusiness Intelligence   Industry Perspective Session I
Business Intelligence Industry Perspective Session IPrithwis Mukerjee
 
Presentation to HWVP
Presentation to HWVPPresentation to HWVP
Presentation to HWVPpricew
 
Rovi Business Solutions
Rovi Business Solutions Rovi Business Solutions
Rovi Business Solutions William Francis
 
Maclear’s IT GRC Tools – Key Issues and Trends
Maclear’s  IT GRC Tools – Key Issues and TrendsMaclear’s  IT GRC Tools – Key Issues and Trends
Maclear’s IT GRC Tools – Key Issues and TrendsMaclear LLC
 
Is Your Data Ready to Drive Your Company's Future?
Is Your Data Ready to Drive Your Company's Future?Is Your Data Ready to Drive Your Company's Future?
Is Your Data Ready to Drive Your Company's Future?Edgewater
 
Data analytics - Alteryx Spotlight.pdf
Data analytics - Alteryx Spotlight.pdfData analytics - Alteryx Spotlight.pdf
Data analytics - Alteryx Spotlight.pdfssuser43b9f8
 
Capabilities Overview 20100414 V1
Capabilities Overview 20100414 V1Capabilities Overview 20100414 V1
Capabilities Overview 20100414 V1nbcoenen
 
Financial Analytics pafp 11-21-13
Financial Analytics   pafp 11-21-13Financial Analytics   pafp 11-21-13
Financial Analytics pafp 11-21-13gristak
 
Pmac It Project Management 2010
Pmac It Project Management 2010Pmac It Project Management 2010
Pmac It Project Management 2010nseiersen
 
ADV Slides: How to Improve Your Analytic Data Architecture Maturity
ADV Slides: How to Improve Your Analytic Data Architecture MaturityADV Slides: How to Improve Your Analytic Data Architecture Maturity
ADV Slides: How to Improve Your Analytic Data Architecture MaturityDATAVERSITY
 

Similar to Cognitivo - Tackling the enterprise data quality challenge (20)

Data Governance
Data GovernanceData Governance
Data Governance
 
SD Basel process automation seminar presentation
SD Basel process automation seminar presentationSD Basel process automation seminar presentation
SD Basel process automation seminar presentation
 
ClearCost Introduction 2015
ClearCost Introduction 2015ClearCost Introduction 2015
ClearCost Introduction 2015
 
TOP_407070357-Data-Governance-Playbook.pptx
TOP_407070357-Data-Governance-Playbook.pptxTOP_407070357-Data-Governance-Playbook.pptx
TOP_407070357-Data-Governance-Playbook.pptx
 
Fuel your Data-Driven Ambitions with Data Governance
Fuel your Data-Driven Ambitions with Data GovernanceFuel your Data-Driven Ambitions with Data Governance
Fuel your Data-Driven Ambitions with Data Governance
 
Ebookblogv2 120116015321-phpapp01
Ebookblogv2 120116015321-phpapp01Ebookblogv2 120116015321-phpapp01
Ebookblogv2 120116015321-phpapp01
 
Get Smart About Technical Debt
Get Smart About Technical DebtGet Smart About Technical Debt
Get Smart About Technical Debt
 
Business Intelligence Industry Perspective Session I
Business Intelligence   Industry Perspective Session IBusiness Intelligence   Industry Perspective Session I
Business Intelligence Industry Perspective Session I
 
Presentation to HWVP
Presentation to HWVPPresentation to HWVP
Presentation to HWVP
 
Rovi Business Solutions
Rovi Business Solutions Rovi Business Solutions
Rovi Business Solutions
 
Maclear’s IT GRC Tools – Key Issues and Trends
Maclear’s  IT GRC Tools – Key Issues and TrendsMaclear’s  IT GRC Tools – Key Issues and Trends
Maclear’s IT GRC Tools – Key Issues and Trends
 
Risk Product.pptx
Risk Product.pptxRisk Product.pptx
Risk Product.pptx
 
Is Your Data Ready to Drive Your Company's Future?
Is Your Data Ready to Drive Your Company's Future?Is Your Data Ready to Drive Your Company's Future?
Is Your Data Ready to Drive Your Company's Future?
 
Data analytics - Alteryx Spotlight.pdf
Data analytics - Alteryx Spotlight.pdfData analytics - Alteryx Spotlight.pdf
Data analytics - Alteryx Spotlight.pdf
 
Capabilities Overview 20100414 V1
Capabilities Overview 20100414 V1Capabilities Overview 20100414 V1
Capabilities Overview 20100414 V1
 
50 Shades of Metrics
50 Shades of Metrics50 Shades of Metrics
50 Shades of Metrics
 
Strategy For Data Quality
Strategy For Data QualityStrategy For Data Quality
Strategy For Data Quality
 
Financial Analytics pafp 11-21-13
Financial Analytics   pafp 11-21-13Financial Analytics   pafp 11-21-13
Financial Analytics pafp 11-21-13
 
Pmac It Project Management 2010
Pmac It Project Management 2010Pmac It Project Management 2010
Pmac It Project Management 2010
 
ADV Slides: How to Improve Your Analytic Data Architecture Maturity
ADV Slides: How to Improve Your Analytic Data Architecture MaturityADV Slides: How to Improve Your Analytic Data Architecture Maturity
ADV Slides: How to Improve Your Analytic Data Architecture Maturity
 

Recently uploaded

Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxJohnnyPlasten
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130Suhani Kapoor
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsappssapnasaifi408
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...Florian Roscheck
 

Cognitivo - Tackling the enterprise data quality challenge

  • 1. Tackling the enterprise Data Quality challenge. Cognitivo Consulting, January 2020.
  • 2. COMPETING IN THE DIGITAL AGE. In a connected world, competing effectively in the digital age means making the right decisions at pace.
  • 3. UNLOCKING THE VALUE OF AI. Machine learning algorithms rely on data to learn for themselves. AI could potentially create $3.5 trillion to $5.8 trillion in annual value in the global economy (Source: McKinsey Global Institute, 2018).
  • 4. DATA DRIVEN ORGANISATIONS. Leaders in the digital age are able to make strategic and operational decisions based on data, at scale. THIS IS THE DATA-DRIVEN ORGANISATION.
  • 5. BUILDING ON STABLE FOUNDATIONS. The quality of your decisions will be proportional to the quality of your data. Data Quality is a foundational element of achieving digital success.
  • 6. DQ MUST BE COORDINATED ACROSS THE ENTERPRISE. Poor DQ is a symptom of poor processes and systems, and fixing it requires coordination across the enterprise. DATA QUALITY MUST SUPPORT PROCESS ASSURANCE AND IMPROVEMENT ACROSS THE ENTERPRISE.
  • 7. Enterprise Architecture aligned end-to-end DQ approach. A successful DQ initiative relies on alignment to existing enterprise and risk management frameworks and assets. The approach spans four architecture layers (Business / Process Architecture, Information Architecture, Integration Architecture, Application & Infrastructure Architecture) and six steps:
     1. Risk-based approach to identify key processes / use cases in scope for DQ improvement.
     2. Definition of customer journeys and process value chains, with customer and organisational outcomes defined.
     3. Definition of a business conceptual data model and business rules based on in-scope processes.
     4. Agreement of definitions (decomposition of metrics and critical data elements), sources of truth and RACI (e.g. owners).
     5. Document data lineage (data flows) between key systems for each in-scope process / use case.
     6. Catalogue systems, critical data sets and the controls environment within an Information Assets Register or Source Catalogue.
     [Figure: Customer Journey & Associated Business Value Chain.]
     [Figure: Operational Risk Matrix. Axes: Impact (Low / Med / High) against Likelihood (Low / Medium / High). Labels: Inherent Risk (‘gross’ risk), DQ Treatment, Process improvement, Risk Tolerance.]
     [Figure: Business Conceptual Model. Entities: Party, Item, Service, Arrangement, Event, Location, Classification. Relationships: owns / rents / buys / sells / leases, enters, provides / consumes, uses / maintains, creates conditions, type of, has, occurs at, involved in, triggers, consists of, creates.]
     [Figure: Integration Landscape / Data Lineage. Channels (Web, Mobile, Broker, Contact Centre, Branch), CRM, Product Origination, Fulfilment, Risk / Capital Mgt, Follow up, Servicing, Credit Approval, Settlement, Payments, Finance, KYC, Sanctions, Performance, Cloud Data Asset, Integration (Message / Stream + Batch).]
     [Figure: Information Asset Register / Source Catalogue.]
     [Figure: Business Metric Decomposition & Business Definitions, shown as an illustrative Enterprise Data Decomposition Tree. Top-level metrics: Economic Profit ($), ROE (%), TSR (Total Shareholder Return). Other nodes: NPAT, NPAT - EL basis ($), Other Revenue, Expenses (Allocated Expenses, Controllable Expenses), Tax, Cost of Capital, Cost of Capital Rate (%), Franking Credits, Eco Cap ($), Bank Capital, Capital Ratio, Capital Buffer (Stress Test), Credit Risk Capital, Market Risk Capital, Op Risk Capital, Credit RWA, mRWA, oRWA, Reg EL, IEL (CCA), Loss Data (ELD/ILD), Investment Stakes / Fixed Assets / Deferred Acquisition, Exposure At Default (EAD), Probability of Default, Loss Given Default, Provisions & Delinquencies (Collective Provisions, Individual Provisions), Retail Pooling & Segmentation, Loan Amount / Limit, Product Features, On/Off Balance Sheet, Revocability, Industry / ANZSIC, Salient Financials, CCR, Domicile Country, SI, Held Collateral, Pricing (Rates / Fees), Tenor, Customer, Asset Class. Legend: * Set by Group Treasury.]
     Note: EcoProfit = NPAT - Cost of Equity ($) + IEL (CCA) + Imputation Credits = NPAT - Cost of Equity (%) x Eco Cap ($); ROE = NPAT / Book Equity; Book Equity = EcoCap = Total Reg Cap.
  • 8. Principles of Cognitivo’s DQ approach. A pragmatic approach that doesn’t “boil-the-ocean” is required to focus on priority use cases while leveraging organisational assets and AI to scale.
     • Risk & Policy Based: identify key processes that carry material data risk and prioritise them for DQ diagnosis and treatment.
     • Process (use-case) Centric: identify the data flows that underpin key processes and address data quality across the entire system data flow.
     • Metadata Driven: develop or reuse a conceptual data model as an abstraction layer for agreeing definitions and business rules with business stakeholders; these are subsequently mapped to physical data models.
     • Analytics & ML Enabled: use data science techniques (such as ML, text analytics and vision) to build industry- and organisation-specific data matching and data quality diagnosis techniques.
     • Embedded in Business-As-Usual: roll out DQ controls and measurement (dashboards) as part of the organisation’s quality assurance processes, rather than constructing a new data KPI and consequence management framework.
  • 9. Example DQ use cases to improve key business outcomes. Cognitivo has extensive experience in executing data quality programmes within Financial Services, Government and Accounting business domains. Use cases by area:
     Compliance
     • KYC / AML / CTF (Assurance of data feeds)
     • CPS220, AIRB Accreditation
     • APRA / ABS Regulatory Reporting (e.g. report on interest-only loans)
     • Basel III Liquidity (FI / non-FI review)
     • APS 120 Securitisation (Loan doc reconciliation)
     • FATCA, GATCA
     • OTC Reform, MiFID II (Cleanse LEI / SWIFT Code, Legal Form, Country of incorporation etc.)
     • APS910 SCV assurance
     • IFRS9 / IFRS17 Assurance
     • Staff Benefits Review (Review of former employees still on staff benefits programmes)
     • Advice Compliance (SOA, PDS vs fees and charges review)
     • …
     Business Management
     • Payroll Assurance
     • Financial Management reporting (Line of Business)
     • Finance cube, business unit, GL structure review
     • …
     Customer / Sales
     • Customer Contact Details (marketing, product service)
     • Consent status
     • Customer Age Review
     • Customer Address Review (e.g. Suburb / postal code combination)
     • Customer segmentation review
     • CRM customer structure review (e.g. customer legal structure, customer groupings)
     • …
  • 10. Data Quality execution lifecycle. Cognitivo’s DQ Execution Lifecycle is linked to broader data demand management and IT planning lifecycles (Data Risk, Demand Management, Process / System Improvement). Lifecycle stages: A. Diagnosis, B. Profiling, C. Correction & Cleansing, D. Monitoring & Reporting.
     Diagnosis: conduct qualitative sizing, define requirements and business rules, and conduct root cause analysis.
     Profiling: size / quantify the magnitude of each DQ issue.
     • Profile key data elements for validity / completeness issues
     • Correlate data across systems to identify integrity issues; this can include techniques such as financial reconciliation (checksums)
     • Deploy an analytical process to find illogical combinations of data, outliers etc.
     • Higher complexity techniques such as text analytics and computer vision to correlate with unstructured data sources
     • Machine learning approaches to identify patterns for acceptable values / ranges
     Holistic view of DQ issues & prioritisation
     • Organisation-wide DQ issues register with a self-assessment process to periodically assess the level of DQ risk
     • DQ deep-dives through workshops / interviews for high risk areas
     • Prioritise high impact and high occurrence issues to go into the ‘fix process’
     Correction process
     • Obtain correct values, followed by cleansing / system update
     • Automated through cross-system and 3rd party data source lookup or derivation
     • As a final step, client outreach may be required (e.g. establish a call-centre process)
     Cleansing process
     • Establish a process for bulk update, testing and roll back within core systems
     • For systems where bulk update is not possible, develop RPA and manual update capabilities
     Systemic Fix
     • Make recommendations for systemic fixes through the organisation’s broader change / fail-fix agenda
     Monitoring & Reporting
     • DQ issues profiled / fixed all trace back to a business unit, hence DQ metrics can form process / compliance KPIs for business owners
     • DQ scorecards can automate existing QA processes / operational risk controls by quantifying instances where data entry is missing / incorrect
     • Trend analysis on DQ results for each responsible business unit
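
The cross-system profiling step on slide 10 (correlating data across systems, including financial reconciliation) could look roughly like the following. This is a minimal sketch and not Cognitivo's implementation: the field names `account_id` and `balance`, the function `reconcile` and the sample extracts are all assumptions made for illustration.

```python
from decimal import Decimal


def reconcile(source_rows, target_rows, key="account_id", amount="balance"):
    """Compare record counts and per-key amounts between two extracts."""

    def totals(rows):
        # Sum amounts per key; Decimal avoids float rounding on money values.
        out = {}
        for row in rows:
            out[row[key]] = out.get(row[key], Decimal("0")) + Decimal(str(row[amount]))
        return out

    src, tgt = totals(source_rows), totals(target_rows)
    return {
        "count_diff": len(source_rows) - len(target_rows),
        "total_diff": sum(src.values(), Decimal("0")) - sum(tgt.values(), Decimal("0")),
        "missing_in_target": sorted(set(src) - set(tgt)),
        "missing_in_source": sorted(set(tgt) - set(src)),
        "mismatched": sorted(k for k in set(src) & set(tgt) if src[k] != tgt[k]),
    }


# Hypothetical extracts from a core system and a downstream ledger.
core = [
    {"account_id": "A1", "balance": "100.00"},
    {"account_id": "A2", "balance": "250.50"},
    {"account_id": "A3", "balance": "75.25"},
]
ledger = [
    {"account_id": "A1", "balance": "100.00"},
    {"account_id": "A2", "balance": "205.50"},
]

report = reconcile(core, ledger)
```

A non-zero `count_diff` or `total_diff` signals a completeness or integrity issue; the key lists show which records to investigate.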
  • 11. Scalable Data Quality DevOps. Cognitivo’s data quality workflow incorporates analytical tools, business testing and deployment into a DQ DevOps process.
     [Figure: DQ DevOps workflow, marked “To be updated”. Legend: Automated processes, Manual processes.]
     • DQ Discovery: Source Systems & Processes, Ingestion, DQ Development Workspace, Customer Matching & Analytical Environment, Raise new DQ rules, DQ Rules Engine, DQ Production Deployment
     • Profiling: Results Dashboard, Issue prioritization (based on risk, i.e. likelihood and consequence), Issue and Workflow Management
     • Correct (obtain correct values & validate): De-duplicate customers within systems (collapse entities), Customer Outreach (high-priority / risk), Cross system & 3rd party lookup, Document / correspondence lookup, Customer self service, Client Applications
     • Manual testing of values to update via risk-based sampling
     • Cleanse (System Update): Database Bulk Update / Amend, Front-End Data Entry (inc. use of RPA, robotic process automation)
     • Monitoring & Reporting: Business Unit QA to include DQ measures, Business / Quality Assurance
     • Root Cause Analysis, Process & System Fix, Operational Environment
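
Slide 11 says issues are prioritised on risk, i.e. likelihood and consequence, and the risk matrix on slide 7 uses Low / Medium / High bands. The slides do not say how the two are combined, so the sketch below assumes a simple multiplicative score and a hypothetical `tolerance` threshold; the issue records are invented for illustration.

```python
# Assumed numeric mapping for the Low / Medium / High bands.
LEVELS = {"Low": 1, "Medium": 2, "High": 3}


def risk_score(likelihood, consequence):
    """Score an issue as likelihood x consequence (an assumed convention)."""
    return LEVELS[likelihood] * LEVELS[consequence]


def prioritise(issues, tolerance=3):
    """Return issues scoring above the risk tolerance, highest score first."""
    scored = [
        dict(issue, score=risk_score(issue["likelihood"], issue["consequence"]))
        for issue in issues
    ]
    above = [issue for issue in scored if issue["score"] > tolerance]
    return sorted(above, key=lambda issue: issue["score"], reverse=True)


# Hypothetical entries from a DQ issues register.
issues = [
    {"id": "DQ-1", "likelihood": "High", "consequence": "High"},
    {"id": "DQ-2", "likelihood": "Low", "consequence": "Medium"},
    {"id": "DQ-3", "likelihood": "Medium", "consequence": "High"},
    {"id": "DQ-4", "likelihood": "High", "consequence": "Low"},
]

queue = prioritise(issues)
```

Issues at or below the tolerance stay on the register but do not enter the fix queue; an organisation with an existing operational risk matrix would substitute its own scoring.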
  • 12. DQ Workspace & Platform Architecture. Cognitivo has a vendor-agnostic DQ technical reference architecture that can be implemented on any cloud or on-premise environment.
     [Figure: reference architecture, marked “To be updated”. Legend: Key, Selected Technologies.]
     Sources
     • Client Source Systems, Client Data Warehouse(s), Document Repository, new extracts for checksums etc.
     Data Pipeline / Ingestion
     • Batch Pipeline: batch data ingestion using file (CSV) or ODBC / JDBC
     • Real Time Pipeline (APIs / Messaging): real-time integration
     • Self-service data ingestion: data import tools for unmanaged datasets (used for discovery purposes)
     • Text Extraction & OCR: text extraction and analytics libraries, e.g. Tesseract
     Data Lake
     • Source Data Layer: raw source system data to derive row counts and perform validity checks
     • Linked data (lightly integrated): cross-system table linking to correlate values across matched customers
     • Semantic / Conformed Data (with history): conformed / derived values to expedite DQ rule execution and provide a history of values for outlier / drift detection
     • DQ Profile Result Store: stores DQ profiling output results, including historical values to allow trend analysis of DQ
     DQ Rules Engine
     • Rules Store: DQ rules based on the derived semantic data model, stored in JSON format
     • Scheduler: executes DQ rules on a periodic basis
     • DQ Rule Execution (Python)
     Data Science / ML Discovery Environment
     • Data Workspace Provisioning: provision of persistent temporary storage and access to access-controlled data sets (specific to department, user and use case)
     • Data Science Tools / Workbench: execution of analytical workloads
     • Data Science Development & Collaboration Tools, e.g. Git, Jupyter
     User Interface
     • DQ Policy & Rules Configuration: management of DQ rules, tolerances and business owners of DQ events
     • Case Management: case management tool for logging, investigating and remediating DQ issues
     • Results Dashboard (PowerBI / Qlik): DQ dashboards for consumption by data stewards and business stakeholders (data owners)
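
Slide 12 describes DQ rules stored as JSON in a rules store and executed in Python. The slides do not show the rule schema, so the following is only a sketch under assumptions: the rule fields (`id`, `field`, `type`, `pattern`, `values`, `limit`), the four rule types and the sample records are hypothetical.

```python
import json
import re

# Hypothetical rules-store content; the real schema is not given in the slides.
RULES_JSON = """
[
  {"id": "R1", "field": "email", "type": "not_null"},
  {"id": "R2", "field": "postcode", "type": "regex", "pattern": "^[0-9]{4}$"},
  {"id": "R3", "field": "segment", "type": "allowed_values", "values": ["Retail", "Business"]},
  {"id": "R4", "field": "age", "type": "max", "limit": 100}
]
"""


def passes(rule, record):
    """Return True when the record satisfies a single rule."""
    value = record.get(rule["field"])
    if rule["type"] == "not_null":
        return value not in (None, "")
    if value in (None, ""):
        return True  # missing values are reported by not_null rules only
    if rule["type"] == "regex":
        return re.fullmatch(rule["pattern"], str(value)) is not None
    if rule["type"] == "allowed_values":
        return value in rule["values"]
    if rule["type"] == "max":
        return value <= rule["limit"]
    raise ValueError(f"unknown rule type: {rule['type']}")


def run_rules(rules, records):
    """Map each rule id to the indexes of the records that fail it."""
    failures = {rule["id"]: [] for rule in rules}
    for index, record in enumerate(records):
        for rule in rules:
            if not passes(rule, record):
                failures[rule["id"]].append(index)
    return failures


records = [
    {"email": "a@example.com", "postcode": "2000", "segment": "Retail", "age": 34},
    {"email": "", "postcode": "20O0", "segment": "Retail", "age": 134},
    {"email": "c@example.com", "postcode": "3000", "segment": "Wholesale", "age": 51},
]

results = run_rules(json.loads(RULES_JSON), records)
```

Keeping rules as data rather than code is what lets data stewards add or tune rules through a configuration interface, and lets a scheduler re-run the same rule set periodically and store the failure counts for trend analysis.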
  • 13. DQ profiling techniques to be employed. Cognitivo’s analytical DQ framework deploys a number of analytical tests across structured and unstructured data sources.
     Test for Completeness
     • Record count anomalies
     • Financial reconciliation (check-sums)
     Test for Validity
     • Data type & format checks (regex pattern match)
     • Allowable values
     • Reference data lookup
     • Null value check
     Test for Accuracy
     • Illogical combinations of multiple data fields (e.g. individual with a business name)
     • Single-field logic check (e.g. age > 100)
     • 3rd party cross-reference
     • Cross-system value cross-reference
     • Reasonable value check (record anomaly / outlier, value drift over time)
     Test for Timeliness
     • Data ingestion (ETL / ELT) synchronisation review
     Test for Uniqueness
     • Duplicates within systems
     • Cross-system master data reconciliation
     Also listed on the slide
     • Document text extraction & cross-reference
     • Computer vision: image recognition & object classification
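
Two of the tests on slide 13, the accuracy check for illogical field combinations and the uniqueness check for duplicates within a system, can be sketched as below. The record layout (`customer_type`, `business_name`, `name`, `date_of_birth`) and the exact-match-after-normalisation rule are assumptions; real customer matching would typically need fuzzier techniques.

```python
from collections import defaultdict


def illogical_combinations(records):
    """Accuracy test: flag individuals that carry a business name."""
    return [
        r["id"]
        for r in records
        if r["customer_type"] == "Individual" and r.get("business_name")
    ]


def normalise(text):
    """Lower-case and collapse whitespace so trivial variants compare equal."""
    return " ".join(text.lower().split())


def duplicate_groups(records, fields=("name", "date_of_birth")):
    """Uniqueness test: group record ids that share the same normalised key."""
    groups = defaultdict(list)
    for r in records:
        key = tuple(normalise(str(r[f])) for f in fields)
        groups[key].append(r["id"])
    return [ids for ids in groups.values() if len(ids) > 1]


# Hypothetical customer records.
customers = [
    {"id": 1, "customer_type": "Individual", "name": "Jane Citizen",
     "date_of_birth": "1980-02-14", "business_name": ""},
    {"id": 2, "customer_type": "Individual", "name": "jane  citizen",
     "date_of_birth": "1980-02-14", "business_name": "Citizen Pty Ltd"},
    {"id": 3, "customer_type": "Business", "name": "Acme Holdings",
     "date_of_birth": "", "business_name": "Acme Holdings"},
]

flagged = illogical_combinations(customers)
duplicates = duplicate_groups(customers)
```

Here record 2 is flagged for the illogical combination, and records 1 and 2 are reported as a likely duplicate pair for review before any entities are collapsed.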
  • 14. Cognitivo’s DQ Platform Capabilities. Cognitivo has a DQ application framework that can be deployed onto private clouds via containers or accessed as a SaaS offering.
     Data Steward Portal (UX)
     • Create profiling rules
     • Diagnose DQ issues through reports and dashboards
     • Workflow to approve data changes and case manage remediation
     • APIs to integrate with 3rd party applications and check valid data entry based on data quality rules
     Core DQ Engine
     • Semantic model of parameters for data stewards to create DQ rules
     • DQ rule templates (e.g. regex functions, address validity, ABN format etc.)
     • Analytical engine to run complex data accuracy / integrity rules
     • API to allow 3rd party and customer automation, extension and access to DQ results
     Data Pipeline
     • Securely connect on-prem data sources to cloud environments in an encrypted manner (Gateway)
     • Database to store multiple time-stamped sampled extracts from source systems
     • Efficient data ingestion pipeline with connectors for key council systems (e.g. Dynamics, ..)
     Embedding DQ processes
     • Build continuous improvement initiatives within directorates based on DQ analysis (e.g. asking additional questions when customers call / visit)
     • Set DQ KPIs within process metrics (e.g. accuracy of mandatory data capture)
     [Figure: platform components. Customer Data Sources, Gateway, Connectors, Data Quality Hub (Scheduler, DQ Profiler, DQ Analytical Engine, DQ Rules Library, DQ Managed Parameters (semantic model), DQ Profiling Datastore), API, User Interface (Web Interface, Mobile App, Reporting Dashboard, Case Management, Investigation (Jupyter)), Data Stewards (Users).]
  • 15. Cognitivo DQ Platform Screenshots (1/2)
  • 16. Cognitivo DQ Platform Screenshots (2/2)