This document discusses data quality management (DQM). It covers DQM concepts and activities, including developing data quality awareness, defining data quality requirements, profiling and assessing data quality, and defining metrics. The key DQM approach is the Deming cycle of planning, deploying, monitoring, and acting to continuously improve data quality. Data quality requirements are identified by reviewing business policies and rules to understand dimensions like accuracy, completeness, consistency and more.
2. Objectives:
• 12.1 Introduction
• 12.2 Concepts and Activities
• 12.2.1 Data Quality Management Approach
• 12.2.2 Develop and Promote Data Quality Awareness
• 12.2.3 Define Data Quality Requirements
• 12.2.4 Profile, Analyze and Assess Data Quality
• 12.2.5 Define Data Quality Metrics
• 12.2.6 Define Data Quality Business Rules
• 12.2.7 Test and Validate Data Quality Requirements
• 12.2.8 Set and Evaluate Data Quality Service Levels
• 12.2.9 Continuously Measure and Monitor Data Quality
• 12.2.10 Manage Data Quality Issues
• 12.2.11 Clean and Correct Data Quality Defects
• 12.2.12 Design and Implement Operational DQM Procedures
3. Objectives:
• 12.3 Data Quality Tools
• 12.3.1 Data Profiling
• 12.3.2 Parsing and Standardization
• 12.3.3 Data Transformation
• 12.3.4 Identity Resolution and Matching
• 12.3.5 Enhancement
• 12.3.6 Reporting
4. 12 Data Quality Management
• Data Quality Management (DQM) is the tenth data management function in the data management framework introduced in Chapter 1.
• It is the ninth data management function that interacts with, and is influenced by, the Data Governance function.
• In this chapter, we define the Data Quality Management function and explain the concepts and activities involved.
5. 12.1 Introduction:
• DQM is a critical process that mandates the IT function to blend data sources, create golden copies of data, populate data, and integrate it; it must be properly supported.
• A data quality program is necessary to provide an economical solution for improving data quality and integrity.
• The program involves more than just correcting data:
• It involves managing the lifecycle of data creation, transformation, and transmission to ensure that the resulting information meets the needs of data consumers within the organization.
• The process begins with identifying the business needs, then determining the best ways to measure, monitor, control, and report on the quality of data, and then notifying data stewards to take corrective action.
• It is a continuous process for defining the parameters of acceptable levels of data quality to meet business needs, and then ensuring that data quality meets those levels.
• DQM analysis involves instituting inspection and control processes, applying parsing, standardization, cleansing, and consolidation when necessary.
• Lastly, DQM incorporates issue tracking as a way of monitoring compliance with defined data quality SLAs.
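To make the inspection and control activities named above more concrete, the following minimal Python sketch (not taken from the DMBOK; the field, rules, and record layout are hypothetical) parses and standardizes a phone-number column, cleanses values where it can, and routes records it cannot correct to a data-steward review queue:

```python
import re

def standardize_phone(raw):
    """Parse a raw phone value and return a 10-digit standardized form,
    or None if the value cannot be cleansed automatically."""
    digits = re.sub(r"\D", "", raw or "")         # parsing: keep digits only
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]                       # cleansing: drop leading country code
    return digits if len(digits) == 10 else None  # standardization check

records = [
    {"id": 1, "phone": "(412) 555-0187"},
    {"id": 2, "phone": "1-412-555-0199"},
    {"id": 3, "phone": "n/a"},
]

steward_queue = []                                # issues routed to data stewards
for rec in records:
    std = standardize_phone(rec["phone"])
    if std is None:
        steward_queue.append(rec)                 # could not be corrected automatically
    else:
        rec["phone"] = std                        # corrected in place

print("standardized:", records)
print("needs steward review:", steward_queue)
```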
6.
7. 12.2 Concepts and Activities
• Data quality expectations provide the inputs necessary to define
the data quality framework.
• The framework includes defining the requirements, inspection
policies, measures, and monitors that reflect changes in data
quality and performance.
• These requirements reflect three aspects of business data
expectations:
• A manner in which to record the expectation in business rules.
• A way to measure the quality of data within that dimension.
• An acceptability threshold.
8. 12.2.1 Data Quality Management Approach
• The Deming cycle is the general approach to DQM (Figure 12.2).
• Deming proposed a problem-solving model known as “Plan-Do-Study-Act”.
• When applied to data quality within the constraints of defined data
quality SLAs, it involves:
• Planning for the assessment of the current state and identification of key
metrics for measuring data quality.
• Deploying processes for measuring and improving the quality of data.
• Monitoring and measuring the levels in relation to the defined business
expectations.
• Acting to resolve any identified issues to improve data quality and better meet
business expectations.
10. 12.2.1 Data Quality Management Approach
• The DQM Cycle:
• It begins by identifying the data issues that are critical to the achievement of business objectives, the business requirements for data quality, the key data quality dimensions, and the business rules critical to ensuring high quality data.
• In the plan stage, assess the scope of known issues, determine the cost and impact of the issues, and evaluate alternatives for addressing them.
• In the deploy stage, profile the data and institute inspections and monitors to identify data issues when they occur, and fix flawed processes.
• The monitor stage actively monitors the quality of data as measured against the defined business rules and the defined thresholds for acceptability.
• The act stage takes action to address and resolve emerging data quality issues.
• A new cycle begins as new data sets come under investigation, or as new data quality requirements are identified for existing data sets.
12. 12.2.2 Develop and Promote Data Quality Awareness
• Promoting data quality awareness is essential to ensure buy-in of
necessary stakeholders in the organization.
• Awareness includes:
• Relating material impacts to data issues.
• Ensuring systematic approaches to the regulation and oversight of the quality of organizational data.
• Socializing the concept that data quality problems cannot be addressed solely by technology solutions.
• Establishing a data governance framework for data quality involves tasks such as:
• Engaging business partners who will work with the data quality team and
champion the DQM program.
• Identifying data ownership roles and responsibilities
• Assigning accountability and responsibility for critical data elements and DQM
• Identifying key data quality areas to address, and directing the organization around these key areas.
• Synchronizing data elements used across the lines of business and providing clear,
unambiguous definitions, use of value domains, and data quality rules.
• Continuously reporting on the measured levels of data quality.
• Introducing the concepts of data requirements analysis as part of the overall
system development life cycle.
• Tying high quality data to individual performance objectives.
13. 12.2.2 Develop and Promote Data Quality Awareness
• The Data Quality Oversight Board is a council of roles accountable for the policies and procedures that oversee the data quality community.
• The Guidance provided includes:
• Setting priorities for data quality.
• Developing and maintaining standards for data quality.
• Reporting relevant measurements of enterprise-wide data quality.
• Providing guidance that facilitates staff involvement.
• Establishing communications mechanisms for knowledge sharing.
• Developing and applying certification and compliance policies.
• Monitoring and reporting on performance.
• Identifying opportunities for improvements and building consensus for
approval.
• Resolving variations and conflicts.
• The constituent participants work together to define a data quality strategy and framework:
• They develop, formalize, and approve information policies and data quality standards and protocols, and certify line-of-business conformance to the desired level of business user expectations.
14. 12.2.3 Define Data Quality Requirements
• The quality of data should be understood within the context of “fitness for use”.
• Data quality requirements are often hidden within defined business
policies.
• Incremental, detailed review and iterative refinement of business policies help identify those requirements (the data quality rules).
• Incremental detailed review steps include:
1. Identifying key data components associated with business policies.
2. Determining how identified data assertions affect the business.
3. Evaluating how data errors are categorized within a set of data quality
dimensions.
4. Specifying the business rules that measure the occurrence of data errors.
5. Providing a means for implementing measurement processes that assess conformance to those business rules.
15. 12.2.3 Define Data Quality Requirements
• The dimensions of data quality frame the business rules in terms of measurements at the level of the data value, data element, data record, and data table. These dimensions include the following (a small code sketch follows the list):
• Accuracy: refers to the degree that data correctly represents the ‘real-life’ entities
they model.
• Completeness: rules indicate that certain attributes always have assigned values in a data set, and that all appropriate rows in the data set are present.
• Consistency: refers to ensuring that data values in one data set are consistent
with values in another data set.
• Currency: refers to the degree to which information is current with the world that
it models.
• Precision: refers to the level of detail of the data element.
• Privacy: refers to the need for access control and usage monitoring.
• Reasonableness: used to consider consistency expectations relevant within
specific operational contexts.
• Referential Integrity: the condition that exists when all intended references from data in one column to a column of the same or a different table are valid.
• Timeliness: refers to the time expectation for the accessibility and availability of information.
• Uniqueness: no entity exists more than once within the data set.
• Validity: refers to whether data instances are stored, exchanged, or presented in a
format consistent with the domain of values, and consistent with other similar
attribute values.
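• As a minimal, illustrative sketch (not from the source text), a few of these dimensions can be measured over an in-memory record set; all field names, values, and the email pattern below are hypothetical.
```python
import re

# Hypothetical customer records; 'region' must reference the region set (referential integrity).
customers = [
    {"id": 1, "email": "a@example.com", "region": "EU"},
    {"id": 2, "email": None,            "region": "NA"},
    {"id": 2, "email": "c@example",     "region": "XX"},
]
regions = {"EU", "NA", "APAC"}
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def completeness(rows, field):
    """Share of rows where the attribute has an assigned value."""
    return sum(r[field] is not None for r in rows) / len(rows)

def uniqueness(rows, field):
    """Share of rows whose key value occurs exactly once in the data set."""
    values = [r[field] for r in rows]
    return sum(values.count(v) == 1 for v in values) / len(rows)

def validity(rows, field, pattern):
    """Share of non-null values that match the expected format."""
    vals = [r[field] for r in rows if r[field] is not None]
    return sum(bool(pattern.match(v)) for v in vals) / len(vals)

def referential_integrity(rows, field, reference):
    """Share of rows whose value exists in the reference set."""
    return sum(r[field] in reference for r in rows) / len(rows)

print("completeness(email): ", completeness(customers, "email"))                  # ~0.67
print("uniqueness(id):      ", uniqueness(customers, "id"))                       # ~0.33
print("validity(email):     ", validity(customers, "email", EMAIL_RE))            # 0.5
print("ref. integrity:      ", referential_integrity(customers, "region", regions))  # ~0.67
```
• Each function returns a score in [0, 1]; acceptability thresholds for such scores are discussed in the sections on metrics and SLAs below.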
16. 12.2.4 Profile, Analyze and Assess Data Quality
• To define quality metrics, there are two approaches to performing assessments:
• Bottom-up assessment of existing data quality issues: involves inspection and evaluation of the data sets, highlighting potential issues based on the results of automated processes such as frequency analysis and duplicate analysis.
• Top-down approach to data quality assessment: involves engaging business
• Top-down approach to data quality assessment involves engaging business
users to document their business processes and the corresponding critical
data dependencies.
• The data quality analyst can then assess the kinds of business impacts that are associated with data issues. The steps of the analysis process are:
• Identify a data set for review
• Catalog the business uses of that data set.
• Subject the data set to empirical analysis using data profiling tools and
techniques.
• List all potential anomalies.
• For each anomaly:
• Review the anomaly with SME to determine if it represents a true data flaw.
• Evaluate potential business impacts
• Prioritize criticality of important anomalies in preparation for defining data
quality metrics.
17. 12.2.5 Define Data Quality Metrics
• In DQM, defining metrics occurs as part of the strategy/design/plan step for implementing the function in an organization.
• The characteristics of information quality metrics are:
• Measurability: a data quality metric must be measurable and quantifiable within a discrete range.
• Business relevance: the value of the metric is limited if it cannot be related to
some aspect of business operations or performance.
• Acceptability: the quality of data meets business expectations on specified
acceptability thresholds.
• Accountability / Stewardship: when the metric indicates that quality does not meet expectations, the accountable parties must take appropriate corrective action.
• Trackability: Quantifiable metrics enable an organization to measure data
quality improvement over time.
18. 12.2.5 Define Data Quality Metrics
The process for defining data quality metrics is summarized as:
1. Select one of the identified critical business impacts.
2. Evaluate the dependent data elements, and data create and update
processes associated with that business impact.
3. For each data element, list any associated data requirements.
4. For each data expectation, specify the associated dimension of data quality
and one or more business rules to use to determine conformance of the
data expectation.
5. For each selected business rule, describe the process for measuring
conformance.
6. For each business rule, specify an acceptability threshold.
• The results of defining data quality metrics (a minimal sketch follows):
• A set of measurement processes provides raw data quality scores that can be rolled up to quantify conformance to data quality expectations.
• Measurements that do not meet the specified acceptability thresholds indicate nonconformance, and data remediation is necessary.
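• As a hedged illustration of steps 4 to 6 above (not prescribed by the source), the sketch below ties one hypothetical business rule to a measurement process and an acceptability threshold, then reports conformance or the need for remediation.
```python
from dataclasses import dataclass
from typing import Callable, Iterable

@dataclass
class DataQualityMetric:
    name: str
    dimension: str                      # e.g. "Completeness"
    rule: Callable[[dict], bool]        # business rule applied per record
    acceptability_threshold: float      # minimum acceptable conformance score

    def measure(self, records: Iterable[dict]) -> float:
        """Raw data quality score: fraction of records conforming to the rule."""
        records = list(records)
        return sum(self.rule(r) for r in records) / len(records)

    def evaluate(self, records: Iterable[dict]) -> str:
        score = self.measure(records)
        status = "conformant" if score >= self.acceptability_threshold else "remediation required"
        return f"{self.name} [{self.dimension}]: score={score:.2%} -> {status}"

# Hypothetical metric: every order record must carry a customer identifier.
metric = DataQualityMetric(
    name="Order has customer id",
    dimension="Completeness",
    rule=lambda order: order.get("customer_id") is not None,
    acceptability_threshold=0.98,
)
orders = [{"customer_id": 17}, {"customer_id": None}, {"customer_id": 42}]
print(metric.evaluate(orders))   # score=66.67% -> remediation required
```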
19. 12.2.6 Define Data Quality Business Rules
• Instituting the measurement of conformance to specific business rules first requires that those rules be defined. Monitoring conformance to these business rules requires:
• Segregating data values, records, and collections of records that do not meet
business needs from the valid ones.
• Generating a notification event alerting a data steward of a potential data quality
issue.
• Establishing an automated or event driven process for aligning or possibly
correcting flawed data within business expectations.
• The first process uses assertions about expectations of the data. Templates can be used to specify these business rules (a sketch follows the list), such as:
• Value Domain membership
• Definitional Conformance
• Range conformance
• Format Compliance
• Mapping Conformance
• Value presence and record completeness
• Consistency rules
• Accuracy verification
• Uniqueness verification
• Timeliness validation
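• A hedged sketch of how a few of these templates (value domain membership, range conformance, and format compliance) might be expressed as reusable predicates; the domains, ranges, and patterns below are invented for illustration, not drawn from the source.
```python
import re

def value_domain_membership(domain):
    """Rule template: value must belong to an enumerated value domain."""
    return lambda value: value in domain

def range_conformance(low, high):
    """Rule template: value must fall within an allowed range."""
    return lambda value: value is not None and low <= value <= high

def format_compliance(pattern):
    """Rule template: value must match a defined format."""
    regex = re.compile(pattern)
    return lambda value: value is not None and bool(regex.fullmatch(str(value)))

# Hypothetical rules built from the templates.
rules = {
    "country_code": value_domain_membership({"US", "CA", "GB", "DE"}),
    "discount_pct": range_conformance(0, 100),
    "postal_code":  format_compliance(r"\d{5}(-\d{4})?"),
}

record = {"country_code": "GB", "discount_pct": 115, "postal_code": "12345-6789"}
violations = [field for field, rule in rules.items() if not rule(record[field])]
print(violations)   # ['discount_pct']
```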
20. 12.2.7 Test and Validate Data Quality Requirements
• Data Profiling tools analyze data to find potential anomalies.
• Most of these tools allow data analysts to:
• Define data rules for validation.
• Assess frequency distributions and corresponding measurements.
• Apply the defined rules against the data sets.
• Review the results, and verify whether non-conformant data is actually incorrect.
• Review business rules with business clients to ensure these rules correspond to business requirements.
• Use data rules to characterize levels of conformance with an objective measure of data quality.
• The organization can then distinguish those records that conform to defined data quality expectations from those that do not.
21. 12.2.8 Set and Evaluate Data Quality Service Levels
• Data quality inspection and monitoring are used to measure and
monitor compliance with defined data quality rules.
• SLAs Specify the organization’s expectations for response and
remediation.
• There is an expectation that the operational procedures will provide a scheme for remediation of the root cause within an agreed-upon timeframe.
• The operational data quality control defined in a data quality SLA includes (see the configuration sketch after this list):
• The data elements covered by the agreement.
• The business impacts associated with data flaws.
• The data quality dimensions associated with each data element.
• The expectations for quality for each data element for each of the identified
dimensions in each application in the value chain.
• The methods for measuring against those expectations.
• The acceptability threshold for each measurement.
• The individual(s) to be notified in case the acceptability threshold is not met.
• The timelines and deadlines for expected resolution or remediation of the issue.
• The escalation strategy, and possible rewards and penalties, tied to whether the resolution times are met.
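• The SLA elements listed above can be captured as structured configuration so that monitoring and escalation can be automated. A minimal sketch, with hypothetical field names, thresholds, and contacts (none taken from the source), might look like this:
```python
# Hypothetical data quality SLA entry covering one data element.
dq_sla = {
    "data_element": "customer.email",
    "business_impact": "Failed outbound communications and lost revenue",
    "dimension": "Completeness",
    "expectation": "Email is populated for all active customers",
    "measurement_method": "Batch completeness check against the customer master",
    "acceptability_threshold": 0.98,
    "notify_on_breach": ["data.steward@example.com"],
    "resolution_deadline_hours": 48,
    "escalation": {
        "after_hours": 72,
        "escalate_to": ["dq.manager@example.com"],
    },
}

def check_sla(measured_score: float, sla: dict) -> None:
    """Compare a measured score to the SLA threshold and name who to notify."""
    if measured_score < sla["acceptability_threshold"]:
        print(f"{sla['data_element']}: {measured_score:.2%} below threshold "
              f"{sla['acceptability_threshold']:.0%}; notify {sla['notify_on_breach']}")
    else:
        print(f"{sla['data_element']}: within agreed service level")

check_sla(0.95, dq_sla)   # below threshold; notify the data steward
```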
22. 12.2.8 Set and Evaluate Data Quality Service Levels
• The data quality SLA also defines the roles and responsibilities
associated with performance of operational data quality procedures.
• These roles provide reports on conformance to the defined business rules and monitor staff performance in reacting to data quality incidents.
• When data stewards and operational data quality staff are upholding the agreed level of data quality service, they should take data quality SLA constraints into consideration and connect data quality to their individual performance plans.
• An escalation process applies when existing issues are not addressed within the specified resolution times; non-observance of the level of service is communicated up the management chain.
• The data quality SLA establishes the time limits for notification generation, the names of those in that management chain, and when escalation needs to occur.
• The data quality team can monitor both the compliance of the data with business expectations and how well the team performs the procedures associated with data errors.
23. 12.2.9 Continuously Measure and Monitor Data Quality
• The Operational DQM Procedures depend on available services for
measuring and monitoring the quality of data.
• For conformance to data quality business rules, two contexts for control and measurement exist: in-stream and batch.
• Measurements are further divided by three levels of granularity, namely data element value, data instance or record, and data set, making six possible measures.
• Batch activities are performed on collections of data instances assembled in a data set, usually in persistent storage.
• Continuous monitoring is provided by incorporating control and measurement processes into the information processing flow.
• For data sets, the only in-stream points are when full data sets are handed off between processing stages.
• Data quality rules are incorporated using the techniques detailed in Table 12.1 (a small sketch of the in-stream versus batch distinction follows).
• Incorporating control and measurement processes, operational procedures, and a reporting framework enables monitoring of the levels of data quality.
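• To make the in-stream versus batch distinction concrete, the sketch below (illustrative only; Table 12.1 itself is not reproduced here, and the rule and fields are hypothetical) applies the same rule per record as data flows through a processing step, and again over a complete data set in batch.
```python
from typing import Callable, Iterable, Iterator

def in_stream_measure(records: Iterable[dict], rule: Callable[[dict], bool],
                      counters: dict) -> Iterator[dict]:
    """Measure while data flows through a processing stage, without a separate pass."""
    for record in records:
        counters["checked"] += 1
        if not rule(record):
            counters["failed"] += 1
        yield record                      # record continues downstream unchanged

def batch_measure(data_set: list, rule: Callable[[dict], bool]) -> float:
    """Measure over a complete data set assembled in persistent storage."""
    return sum(rule(r) for r in data_set) / len(data_set)

rule = lambda r: r.get("amount", 0) >= 0          # hypothetical non-negativity rule
stream = [{"amount": 10}, {"amount": -3}, {"amount": 7}]

counters = {"checked": 0, "failed": 0}
processed = list(in_stream_measure(stream, rule, counters))
print(counters)                                   # {'checked': 3, 'failed': 1}
print(batch_measure(processed, rule))             # 0.666...
```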
24. 12.2.10 Manage Data Quality Issues
• Data quality incident reporting: the system should have the capability to report and track data quality incidents and the activities for researching and resolving them, including evaluation, logging, diagnosis, and the subsequent actions associated with data quality events.
• Many organizations already have incident reporting systems for
tracking and managing software, hardware and network issues.
• Incorporating data quality incident tracking focuses on organizing the categories of data issues into incident hierarchies.
• It also focuses on training staff to recognize when data issues appear and how they are to be classified, logged, and tracked according to the data quality SLA (the next slide lists directives for applying this).
• Benefits of implementing a data quality issue tracking system:
• Information and knowledge sharing can improve performance and reduce
duplication of effort.
• Analysis of all the issues will help data quality team members determine any
repetitive patterns, their frequency, and potential source of issues.
• Depending on the governance, data quality SLA reporting can be
monthly, quarterly or annually, particularly in cases focused on
rewards and penalties.
25. 12.2.10 Manage Data Quality Issues
• The steps involve some or all of the following incident reporting system directives (a minimal sketch of an incident record follows the list):
• Standardize data quality issues and activities: because concepts and terms for data issues vary, standardization is useful for classification and reporting, and makes it easier to measure the volume of issues and identify patterns and interdependencies between systems and participants.
• Provide an assignment process for data issues: the assignment process
should be driven within the incident tracking system, by suggesting those
individuals with specific areas of expertise.
• Manage issue escalation procedures: helps expedite efficient handling and
resolution of data issues.
• Manage the data quality resolution workflow: to track progress with issue diagnosis and resolution.
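• A hedged sketch of how such directives might translate into a minimal incident record with standardized categories, an assignee, and a resolution workflow; all categories, field names, and the SLA default are hypothetical.
```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

CATEGORIES = {"missing value", "invalid format", "duplicate record", "referential break"}
WORKFLOW = ["open", "diagnosed", "resolving", "resolved"]

@dataclass
class DataQualityIncident:
    category: str
    data_set: str
    assigned_to: str
    opened_at: datetime = field(default_factory=datetime.now)
    status: str = "open"

    def advance(self) -> None:
        """Move the incident to the next step in the resolution workflow."""
        i = WORKFLOW.index(self.status)
        if i < len(WORKFLOW) - 1:
            self.status = WORKFLOW[i + 1]

    def needs_escalation(self, sla_hours: int = 48) -> bool:
        """Escalate when an unresolved incident exceeds the agreed resolution time."""
        overdue = datetime.now() - self.opened_at > timedelta(hours=sla_hours)
        return self.status != "resolved" and overdue

incident = DataQualityIncident("missing value", "customer_master", "jane.steward")
incident.advance()
print(incident.status, incident.needs_escalation())   # diagnosed False
```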
26. 12.2.11 Clean and Correct Data Quality Defects
• The use of business rules for monitoring conformance to
expectations leads to two operational activities:
• Determining and eliminating the root cause of the introduction of errors.
• Isolating the data items that are incorrect, and bringing the data into conformance with expectations.
• There are three general ways to perform data correction (see the sketch after this list):
• Automated correction: submit the data to cleansing using a collection of data transformations, standardizations, normalizations, and corrections, committed without manual intervention.
• Manual directed correction: use automated tools to cleanse and correct data
but require manual review before committing the corrections to persistent
storage.
• Manual Correction: Data stewards inspect invalid records and determine the
correct values, make the corrections, and commit the updated records.
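• In practice these correction modes are often combined: automated fixes are applied where the rules are confident, and the rest is routed for manual (directed) review. A minimal sketch under that assumption; the country mappings and confidence scores are invented for illustration.
```python
def standardize_country(value: str):
    """Attempt an automated correction and return (corrected_value, confidence)."""
    known = {"usa": "US", "u.s.": "US", "united states": "US", "uk": "GB"}
    cleaned = value.strip().lower()
    if cleaned in known:
        return known[cleaned], 1.0          # unambiguous mapping
    if len(cleaned) == 2 and cleaned.isalpha():
        return cleaned.upper(), 0.7         # plausible ISO code, lower confidence
    return value, 0.0                       # no automated correction available

def correct(records, threshold=0.9):
    auto_corrected, review_queue = [], []
    for r in records:
        fixed, confidence = standardize_country(r["country"])
        if confidence >= threshold:
            auto_corrected.append({**r, "country": fixed})    # automated correction
        else:
            review_queue.append(r)          # manual review before committing
    return auto_corrected, review_queue

records = [{"country": "USA"}, {"country": "de"}, {"country": "Neverland"}]
done, pending = correct(records)
print(done)      # [{'country': 'US'}]
print(pending)   # [{'country': 'de'}, {'country': 'Neverland'}]
```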
27. 12.2.12 Design and Implement Operational DQM Procedures
• Defined rules for validating data quality provide a means of integrating data inspection into a set of operational procedures associated with active DQM.
• Integrate the data quality rules into application services or data services that supplement the data life cycle, whether through the introduction of data quality tools and technology, the use of rules engines and reporting tools for monitoring and reporting, or custom-developed applications for data quality inspection.
• The operational framework requires these services to be available to
the applications and data services, and the results presented to the
data quality team members.
28. 12.2.12 Design and Implement Operational DQM Procedures
• The operational framework requires the team to design and implement detailed procedures for operationalizing four activities:
• 1. Inspection and monitoring. The analyst must:
• Review the measurements and metrics.
• Determine whether any acceptability threshold is not met.
• Create a new data quality incident report.
• Assign the incident for diagnosis and evaluation.
• 2. Diagnosis and evaluation of remediation alternatives. The analyst must:
• Review the data issue and trace the error back to its source.
• Evaluate whether a change in the environment caused the error.
• Evaluate whether process issues contributed to the incident.
• Determine whether any external data provider issues affect data quality.
• Evaluate alternatives for addressing the issue.
• Provide updates to the data quality incident tracking system.
29. 12.2.12 Design and Implement Operational DQM Procedures
• The operational framework requires the team to design and implement detailed procedures for operationalizing four activities (continued):
• 3. Resolving the issue. The analyst must:
• Assess the relative costs and merits of the alternatives.
• Recommend one of the alternatives.
• Provide a plan for resolution, including modifying the processes and correcting the flawed data.
• Implement the resolution.
• Provide updates to the data quality incident tracking system.
• 4. Reporting. Reports include:
• The data quality scorecard.
• Data quality trends.
• Data quality performance.
• These reports should align with the metrics in the data quality SLA, so that the areas important to achieving the data quality SLA appear, at some level, in internal team reports.
30. 12.2.13 Monitor Operational DQM Procedures and Performance
• Accountability is critical to the governance of data quality control.
• All issues must be assigned to some number of individuals, groups,
departments or organizations.
• The incident tracking system will collect performance data relating
to issue resolution, work assignments, volume of issues, frequency
of occurrence, as well as the time to respond, diagnose, plan a
solution, and resolve issues.
• These metrics can provide:
• Valuable insights into the effectiveness of the current workflow
• systems and resource utilization
• important management data points that can drive continuous operational
improvement for data quality control.
32. 12.3 Data Quality Tools
• DQM employs well-established tools and techniques, ranging from:
• Empirically assessing the quality of data through data analysis.
• Normalizing data values in accordance with defined business rules.
• Identifying and resolving duplicate records into a single representation.
• Scheduling inspections and changes on a regular basis.
• Data Quality tools can be segregated into four categories of
activities:
• Analysis
• Cleansing
• Enhancement
• Monitoring
• The principal tools used are:
• Data Profiling
• Parsing and standardization
• Data transformation
• Identity resolution and matching
• Enhancement
• Reporting
34. Data Quality Tools
12.3.1 Data Profiling
• Data profiling applies a set of algorithms for the analysis and discovery of data values, for two purposes:
• Statistical analysis and assessment of the quality of data values within a data set.
• Exploring the relationships that exist between value collections within and across data sets.
• Profiling can summarize key characteristics of the values within each column, such as the minimum, maximum, and average values (a small sketch follows).
• Data profiling also analyzes and assesses data anomalies by performing cross-column and inter-table analysis.
• It can also proactively test data against a set of defined or discovered business rules to support data quality reporting processes.
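• A minimal, illustrative profiling pass over a single column (basic column statistics plus a frequency distribution), using only the standard library; real profiling tools add cross-column and inter-table analysis on top of this. The sample data is hypothetical.
```python
from collections import Counter
from statistics import mean

def profile_column(rows, column):
    """Summarize key characteristics of the values in one column."""
    values = [r.get(column) for r in rows]
    non_null = [v for v in values if v is not None]
    return {
        "count": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(set(non_null)),
        "min": min(non_null) if non_null else None,
        "max": max(non_null) if non_null else None,
        "mean": mean(non_null) if non_null else None,
        "frequency": Counter(non_null).most_common(3),
    }

orders = [{"qty": 2}, {"qty": 2}, {"qty": 5}, {"qty": None}, {"qty": 40}]
print(profile_column(orders, "qty"))
# {'count': 5, 'nulls': 1, 'distinct': 3, 'min': 2, 'max': 40, 'mean': 12.25,
#  'frequency': [(2, 2), (5, 1), (40, 1)]}
```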
35. Data Quality Tools
12.3.2 Parsing and Standardization
• Data parsing tools enable the data analyst to define sets of patterns that feed into a rules engine used to distinguish between valid and invalid data values.
• Actions are triggered on matching these patterns; when a valid pattern is parsed, the tokens are rearranged into a standard representation.
• When an invalid pattern is recognized, the tool attempts to transform the value into one that meets expectations.
• Parsing and standardizing data values is valuable when values are confusing or ambiguous.
• Example: a telephone number can be parsed into its segments (area code, exchange, and line number) and re-emitted in a standard format, as in the sketch below.
• The ability to build patterns relies on human ability to recognize meaningful structure in data values.
• Pattern-based parsing can automate the recognition and subsequent
standardization of meaningful value components.
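• The telephone-number example above can be illustrated with a small pattern-based parser: values matching a known pattern are tokenized into area code, exchange, and line number and re-emitted in a standard representation. This is a sketch, not any vendor's tool, and the accepted formats are assumptions.
```python
import re

# Accept several observed phone formats and capture the three components.
PHONE_PATTERNS = [
    re.compile(r"^\((\d{3})\)\s*(\d{3})-(\d{4})$"),   # (555) 123-4567
    re.compile(r"^(\d{3})[-. ](\d{3})[-. ](\d{4})$"), # 555-123-4567 / 555.123.4567
    re.compile(r"^(\d{3})(\d{3})(\d{4})$"),           # 5551234567
]

def standardize_phone(value: str):
    """Parse a value against known patterns and rearrange tokens into a standard form."""
    for pattern in PHONE_PATTERNS:
        match = pattern.match(value.strip())
        if match:
            area, exchange, line = match.groups()
            return f"({area}) {exchange}-{line}"
    return None   # invalid pattern: flag for review or further transformation

for raw in ["555.123.4567", "(555) 123-4567", "5551234567", "12345"]:
    print(raw, "->", standardize_phone(raw))
# 555.123.4567 -> (555) 123-4567
# (555) 123-4567 -> (555) 123-4567
# 5551234567 -> (555) 123-4567
# 12345 -> None
```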
36. Data Quality Tools
12.3.3 Data Transformation
• When data rules are triggered by identified errors, data transformation converts the flawed data into a format acceptable to the target architecture.
• Standardization is performed by mapping data from some source pattern into a corresponding target representation.
• Example: “Customer name” may be represented in thousands of different forms.
• Data transformation builds on these types of standardization
techniques.
• By mapping data values in their original formats and patterns into a target
representation.
• Parsed components of a pattern are subjected to rearrangement, corrections,
or any changes as directed by the rules in the knowledge base.
• Standardization is a special case of transformation, employing rules
that capture context, linguistics, and idioms recognized as common
over time, through repeated analysis by the rules analyst or tool
vendor.
37. Data Quality Tools
12.3.4 Identity Resolution and Matching
• Identity resolution and matching are performed to identify and evaluate the “similarity” of records.
• Common data quality problem involves two side of the same coin:
• Multiple data instances that actually refer to the same real-world entity.
• The perception, by analyst or application, that a record does not exist for a
real-world entity, when in fact it really does.
• Both situations arise when:
• Something has introduced similar, yet variant, representations of data values into the system.
• A slight variation in representation prevents the identification of an exact match with the existing record in the data set.
• In both situations, similarity analysis compares records and scores the degree of similarity so that the variations can be recognized (a small sketch follows).
• Deterministic and probabilistic are the two approaches to matching.
• Deterministic matching uses defined patterns and rules for assigning weights and scores to determine similarity.
• Probabilistic matching, the alternative, relies on statistical techniques for assessing the probability that a pair of records represents the same entity.
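• A hedged sketch of deterministic matching: each attribute comparison is given a fixed weight, the weighted similarities are summed into a score, and a threshold decides whether two records likely refer to the same entity. The weights, fields, and threshold are illustrative assumptions, not values from the source.
```python
from difflib import SequenceMatcher

# Deterministic matching: fixed weights per attribute and a decision threshold.
WEIGHTS = {"name": 0.5, "email": 0.3, "postal_code": 0.2}
MATCH_THRESHOLD = 0.8

def field_similarity(a, b) -> float:
    """Similarity of two attribute values in [0, 1], tolerant of slight variations."""
    if a is None or b is None:
        return 0.0
    return SequenceMatcher(None, str(a).lower(), str(b).lower()).ratio()

def match_score(rec_a: dict, rec_b: dict) -> float:
    """Weighted similarity score between two records."""
    return sum(w * field_similarity(rec_a.get(f), rec_b.get(f)) for f, w in WEIGHTS.items())

a = {"name": "Jon Smith",  "email": "jon.smith@example.com", "postal_code": "10001"}
b = {"name": "John Smith", "email": "jon.smith@example.com", "postal_code": "10001"}

score = match_score(a, b)
print(f"score={score:.2f}", "-> same entity" if score >= MATCH_THRESHOLD else "-> distinct")
# score is roughly 0.97 -> same entity
```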
38. Data Quality Tools
12.3.5 Enhancement
• Data enhancement is a method of adding value to information by accumulating additional information about a base set of entities and then merging all the sets to provide an enhanced view of the data.
• Data parsing helps in determining potential sources of added benefit.
• Example of data enhancement:
• Time/Date stamps: document the time and date that data items are created,
modified, or retired, which can help to track historical data events.
• Auditing Information: document data lineage, which also is important for
historical tracking as well as validation.
• Contextual Information: several geographic enhancements are possible, such as address standardization and geocoding, including regional coding,
municipality, neighborhood mapping, latitude / longitude pairs, or other
kinds of location-based data.
• Demographic Information: demographic enhancements such as customer age, marital status, gender, and income, or, for business entities, annual revenue, number of employees, size of occupied space, etc.
• Psychographic Information: use these kinds of enhancements to segment the
target population by specified behaviors, such as product and brand
preferences, organization memberships, etc.
39. Data Quality Tools
12.3.6 Reporting
• All of the following are supported by good reporting:
• Inspection and monitoring of conformance to data quality expectations.
• Monitoring performance of data stewards conforming to data quality SLAs.
• Workflow processing for data quality incidents.
• Manual oversight of data cleansing and correction
• It is optimal to have a user interface to report results associated with
data quality measurement, metrics, and activity.
• It is wise to incorporate visualization and reporting for standard
reports, scorecards, dashboards, and for provision of ad hoc queries
as part of the functional requirements for any acquired data quality
tools.