DATA QUALITY: Getting Investment for a Weird Area
Zeeman van der Merwe
Manager: Information Strategy & Planning, ACC
Zeeman.vanderMerwe@ACC.co.nz
16 February 2012
Why a weird area … ?
• Everyone talks about it
• Very few understand what it is about
• Lots of interpretation and confusion
• Mysticism ?
• How organisations understand & interpret data quality has bearing
• It is more about evolution
• This is a journey … that began in 2008
• With an assessment …
Background: ACC in 2008
• Inconsistent statistics for Ministerials
• Inconsistent data terminology
• Duplicate/inconsistent datasets
• Inconsistent business rules
• Unmanaged data in Data Warehouse
• Uncoordinated Data Analyst community
• Minimal standards & processes
Common Interpretation: Data Quality is bad
Why Data Governance ?
• Oversees all aspects of data and information management
• Exercises authority and control (planning, monitoring, and enforcement) over the management of ACC’s data and information assets
To be achieved by:
• Defining and communicating strategies, policies, standards, architecture, procedures, and metrics relating to data and information
• Developing regulatory procedures for data and information
• Overseeing data management projects
• Providing governance and oversight of ACC data and information related issues
• Communicating the value of ACC’s data and information assets
Source: ACC Data Governance Terms of Reference (Version 1)
THE DATA USER’S BILL OF RIGHTS
Data users have the right to know what the data means
1. The right to know the definition of the data.
2. The right to know where the data came from.
3. The right to know how the data was calculated or manipulated.
Data users have the right to know how risks to the data have (or have not) been managed
4. The right to know what Security risks weren't eliminated.
5. The right to know what Quality risks weren't eliminated.
6. The right to know what Privacy risks weren't eliminated.
7. The right to know what Compliance requirements influenced data processing and usage.
Data users have the right to know who made decisions about managing the data, according to what rules
8. The right to know who made data-related decisions.
9. The right to know what decision-making checks-and-balances were in place.
10. The right to know how issues have been and will be resolved.
• Data Governance
  The execution and enforcement of authority over the management of data assets and the performance of data functions
• Data Stewardship
  The formalization of accountability for the management of data resources
Steward: Old English “Sty Ward”; “Keeper of the sty”
Data Management
[Diagram: data management functions arranged around Data Governance]
• Data Governance
  – Strategy
  – Organisation & Roles
  – Policies & Standards
  – Projects & Services
  – Issues
  – Valuation
• Data Architecture Management
  – Enterprise Data Modelling
  – Value Chain Analysis
  – Related Data Architecture
• Data Development
  – Analysis
  – Data Modelling
  – Database Design
  – Implementation
• Database Operations Management
  – Acquisition
  – Recovery
  – Tuning
  – Retention
  – Purging
• Data Security Management
  – Standards
  – Classification
  – Administration
  – Authentication
  – Auditing
• Reference & Master Data Management
  – External Codes
  – Internal Codes
  – Customer Data
  – Product Data
  – Dimension Mgmt
• Data Warehousing & Business Intelligence Management
  – Architecture
  – Implementation
  – Training & Support
  – Monitoring & Tuning
• Document & Content Management
  – Acquisition & Storage
  – Backup & Recovery
  – Content Mgmt
  – Retrieval
  – Retention
• Meta Data Management
  – Architecture
  – Integration
  – Control
  – Delivery
• Data Quality Management
  – Specification
  – Analysis
  – Measurement
  – Improvement
[Diagram: six-stage data governance implementation roadmap]
• Data Governance Assessment
  – Assess maturity: Executive awareness, Data “ownership”, Data “stewardship”, Organisation structures, Processes & procedures
  – Establish requirements
• Develop Data Governance Strategy
  – Define/Recommend: Approach, High Level Structures, Roles/Responsibilities, Change Management, Budget
  – Educate/Present
• Mandate Data Governance Strategy
  – Reporting line
  – Agree strategy
  – Get mandate to implement
  – Identify Business & Technical Sponsor
  – Identify/Appoint Implementation Team
  – Appoint Principal Data Steward
• Create Data Excellence Awareness
  – Change Management Plan: Importance, Commitment
  – Messages: This is serious, Long term, Everyone responsible, Responsibility, Impact on others
  – Launch Data Excellence
  – Educate
• Establish Data Governance Structures
  – Identify members: Roles, Areas
  – Define/Agree Data Governance Committee: Responsibilities, Operations, Sanctions
  – The Data Council Organisation
• Develop Data Stewardship Functions
  – Subject Areas
  – Responsibilities
  – “Recruit” Data Stewards by area
  – Training & Mentoring
DMBOK Data Governance Issue Resolution
[Diagram: issue resolution pyramid, supported by the Data Governance Office]
• Strategic: Data Governance Council, Executive Data Stewards: < 5%
• Tactical: Data Stewardship Steering Committee, Coordinating Data Stewards: < 20%
• Operational: Data Stewardship Teams, Business Data Stewards: < 80 - 85%
History in ACC
• 2006 Information Governance Committee
  – Failed: no terms of reference
• BI Strategy identified data governance
  – CSF for Strategy
• Data Quality Effectiveness Review
• Data Quality Working Group
• Recommended Data Governance
  – Received Mandate
  – Formed Data Governance Committee
  – Identified and appointed Data Stewards
  – Formed Data Council
All because of data quality!
ACC Executive Mandate
• Defining & implementing a data governance framework across ACC
• Management of data
• Oversee the implementation of the ACC Business Intelligence strategy where it supports the aims and objectives of data governance
• Responsible for all data quality initiatives and resolution of data issues across ACC
• Incorporating existing initiatives where appropriate
Data Governance Structures
[Diagram: organisation chart, top to bottom]
• ACC Board
• Chief Executive Officer
• Executive Management Team
• Data Governance Committee: Overall responsibility for data governance
• Data Governance Office: Coordinates and supports data governance in ACC
• Data Council (“Data in Action”): Reviews and ratifies recommendations
  – Data Quality Workgroup(s): Resolves data related issues
  – Data Related Coordination Group: Coordinates data management initiatives across all projects
  – Steering Committees: Governs data related projects
ACC Data Quality Guiding Principles
1. ACC will manage data as a core organisational asset with decisions made based on value and the greater good of ACC and its stakeholders
2. The Data Council is mandated as the only forum for ratifying semantics, definitions and business rules for the use of data
3. Use industry and international data standards whenever, and as current, as possible
4. Data quality will be measured across the value chain and all data consumers will have a voice in specifying data quality service level agreements
5. Business process owners will agree to and abide by data quality Service Level Agreements
6. Data masters will be the primary source for any further use of that data
7. Validate data instances and data sets against defined business rules
8. Apply data corrections at the original source, if possible
9. Data entry and system integration will be automated whenever possible with validation applied on entry
10. System user interfaces will be designed to assist and encourage data quality
11. Reference data is current and relevant
12. All data entities will be semantically unique and defined
13. Metadata will be available to all data consumers
14. All report development will be peer reviewed
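As an illustration of principle 7 (validating data instances and data sets against defined business rules), here is a minimal Python sketch. The record fields, the three rules and the function names are hypothetical and chosen only for the example; they are not ACC's actual rules or systems.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Iterable, List, Optional


@dataclass
class ClientRecord:
    # Hypothetical fields, for illustration only.
    client_id: str
    date_of_birth: Optional[date]
    date_of_death: Optional[date]


@dataclass
class BusinessRule:
    name: str
    check: Callable[[ClientRecord], bool]  # returns True when the record passes


# Illustrative rules; real rules would be ratified by a body such as the Data Council.
RULES: List[BusinessRule] = [
    BusinessRule("client_id is present", lambda r: bool(r.client_id.strip())),
    BusinessRule("date_of_birth is present", lambda r: r.date_of_birth is not None),
    BusinessRule(
        "date_of_death is not before date_of_birth",
        lambda r: r.date_of_birth is None
        or r.date_of_death is None
        or r.date_of_death >= r.date_of_birth,
    ),
]


def validate(record: ClientRecord, rules: List[BusinessRule] = RULES) -> List[str]:
    """Validate one data instance: return the names of the rules it fails."""
    return [rule.name for rule in rules if not rule.check(record)]


def pass_rate(records: Iterable[ClientRecord], rules: List[BusinessRule] = RULES) -> float:
    """Validate a data set: fraction of records that pass every rule."""
    records = list(records)
    if not records:
        return 1.0
    passed = sum(1 for r in records if not validate(r, rules))
    return passed / len(records)
```

Keeping the rules as named data rather than scattered `if` statements means the same list can drive validation on entry (principle 9) and measurement across a data set (principle 4).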
Data Quality Initiatives
Improving Client Information
• Reducing number of client duplicates created
  – 350,000 duplicates (60,000 P.A.)
  – Staff were trained and Eos improved
  – Rate reduced from 18% to 8%
• Improving NHI, Date of Death (DOD)
  – Collaboration with MOH
  – Improve valid NHI from 49% to 81%
  – Provide DOD for 345,000 clients
RFI: Data Quality Tools
Data Quality Tools & Processes
• To be used for:
  – Analysis of data quality
  – Address cleansing, standardisation and validation
  – Duplicate identification and elimination
  – Monitoring and managing data quality
• First application
  Client address standardisation and verification
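To make "standardisation" and "duplicate identification" concrete, the sketch below shows the basic idea in Python: standardise the text, build a match key, and group records that share it. This is not how any particular vendor tool works. The record layout, the abbreviation table and the definition of `duplicate_rate` are assumptions made for the example, and commercial tools add fuzzy matching and address reference data on top of this.

```python
import re
from collections import defaultdict

# Hypothetical abbreviation table; a real tool would use a maintained
# address reference set. Note that "st" can also mean "saint", which
# this simple lookup does not handle.
ABBREVIATIONS = {"st": "street", "rd": "road", "ave": "avenue"}


def normalise(text):
    """Standardise free text: lower-case, drop punctuation, expand abbreviations."""
    text = re.sub(r"[^a-z0-9\s]", " ", text.lower())
    words = [ABBREVIATIONS.get(word, word) for word in text.split()]
    return " ".join(words)


def match_key(record):
    """Records with the same key are treated as candidate duplicates."""
    return (normalise(record["name"]), record["dob"], normalise(record["address"]))


def find_duplicates(records):
    """Return lists of record ids that share a match key."""
    groups = defaultdict(list)
    for record in records:
        groups[match_key(record)].append(record["id"])
    return [ids for ids in groups.values() if len(ids) > 1]


def duplicate_rate(records):
    """Assumed definition: redundant records as a fraction of all records."""
    records = list(records)
    if not records:
        return 0.0
    redundant = sum(len(ids) - 1 for ids in find_duplicates(records))
    return redundant / len(records)
```

An exact-key match like this only catches duplicates that become identical after standardisation; misspelt names or transposed dates need similarity scoring, which is part of what the "Matching/Relationship Identification" capability in such tools provides.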
Data Quality Management is a process …
[Diagram not recoverable. SOURCE: SAS/Dataflux]
Evaluation Criteria (Gartner)
1. Connectivity/Adapters
2. Data Quality Assessment and Visualization
3. Parsing
4. Standardization and Cleansing
5. Matching/Relationship Identification
6. Monitoring
7. Subject-Area-Specific Support
8. Address Validation/Geocoding
9. International Support
10. Data Quality Workflow
11. Enrichment
12. Metadata
13. Configuration Environment
14. Deployment Modes and Runtime Environment
15. Operations and Administration
16. Architecture and Integration
17. Service Enablement
Vendor/product image, relationship with ACC & Support
Total Cost of Ownership
The Business Case
• Financial Savings
  – NZ Post
  – Rework
  – Error recovery
• Reputational Risk Mitigation
  – Minister/Public/Clients
• Process Support
  – Data cleansing/verification
  – Data matching/lookup (Master Data)
• Ease of development/reuse/monitoring
This is on paper …
Not on paper
• Data governance was working
  – Used as a “sanctioning committee”
• Data quality must be addressed
  – Tools will help to do this
  – We must have tools
• Confusion
  – Data Quality vs Text mining
  – What do the tools actually do?
  – How do you use them (development)
  – How can they help?
• Current Image: Anything to do with data can be fixed with the tools!
Summary
• Procurement highly influenced by:
  – Non-paper justification
  – Reputation of Data Governance
  – Data quality is a “big” problem
• Lots of confusion
  – Needs lots of education/mitigation
  – Evaluation panel needed lots of coaching
• The organisation was “ready”
• The business case … well …
Useful Sites
• Data Management Association (DAMA): www.dama.org
• The Data Administrator Newsletter: www.tdan.com
• The Data Governance Institute (DGI): www.datagovernance.com