The document discusses data quality management (DQM) concepts and activities. It describes the DQM approach as a continuous cycle of planning, deployment, monitoring, and acting. Key activities include developing data quality awareness, defining requirements, profiling/assessing data, defining metrics/rules, testing requirements, setting service levels, continuously measuring/monitoring quality, and managing issues. DQM aims to ensure data meets fitness-for-use expectations and business needs.
2. Objectives:
• 12.1 Introduction
• 12.2 Concepts and Activities
• 12.2.1 Data Quality Management Approach
• 12.2.2 Develop and Promote Data Quality Awareness
• 12.2.3 Define Data Quality Requirements
• 12.2.4 Profile, Analyze and Assess Data Quality
• 12.2.5 Define Data Quality Metrics
• 12.2.6 Define Data Quality Business Rules
• 12.2.7 Test and Validate Data Quality Requirements
• 12.2.8 Set and Evaluate Data Quality Service Levels
• 12.2.9 Continuously Measure and Monitor Data Quality
• 12.2.10 Manage Data Quality Issues
• 12.2.11 Clean and Correct Data Quality Defects
• 12.2.12 Design and Implement Operational DQM Procedures
3. Objectives:
• 12.3 Data Quality Tools
• 12.3.1 Data Profiling
• 12.3.2 Parsing and Standardization
• 12.3.3 Data Transformation
• 12.3.4 Identity Resolution and Matching
• 12.3.5 Enhancement
• 12.3.6 Reporting
4. 12 Data Quality Management
• Data Quality Management (DQM) is the tenth data management function in the data management framework described in Chapter 1.
• It is the ninth data management function that interacts with, and is influenced by, the Data Governance function.
• In this chapter, we define the Data Quality Management function and explain the concepts and activities involved.
5. 12.1 Introduction:
• DQM is a critical process: mandates for the IT function to blend data sources, create golden copies of data, and populate and integrate data should be supported by data quality management.
• A data quality program is necessary to provide an economical solution to improving data quality and integrity.
• The program involves more than just correcting data:
• It involves managing the lifecycle of data creation, transformation, and transmission to ensure that the resulting information meets the needs of data consumers within the organization.
• The process begins with identifying business needs, then determining the best ways to measure, monitor, control, and report on the quality of data, and then notifying data stewards to take corrective action.
• It is a continuous process for defining the parameters of acceptable levels of data quality to meet business needs, and then ensuring that data quality meets those levels.
• DQM analysis involves instituting inspection and control processes, with parsing, standardization, cleansing, and consolidation applied when necessary.
• Lastly, DQM incorporates issue tracking as a way of monitoring compliance with defined data quality SLAs.
7. 12.2 Concepts and Activities
• Data quality expectations provide the inputs necessary to define the data quality framework.
• The framework includes defining the requirements, inspection policies, measures, and monitors that reflect changes in data quality and performance.
• These requirements reflect three aspects of business data expectations (illustrated in the sketch after this list):
• A manner of recording the expectation in business rules.
• A way to measure the quality of data within that dimension.
• An acceptability threshold.
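The three aspects above can be captured as one small structure per requirement. The following Python sketch is illustrative only and is not taken from the DMBOK: the pandas dependency, the email column name, and the 98% threshold are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable

import pandas as pd


@dataclass
class DataQualityRequirement:
    """One business data expectation: a rule, a way to measure it, and a threshold."""
    name: str
    dimension: str                               # e.g. "Completeness"
    rule: Callable[[pd.DataFrame], pd.Series]    # row-level boolean conformance test
    acceptability_threshold: float               # minimum acceptable share of conforming rows

    def measure(self, df: pd.DataFrame) -> float:
        """Measurement: the fraction of rows that conform to the rule."""
        return float(self.rule(df).mean())

    def is_acceptable(self, df: pd.DataFrame) -> bool:
        """Acceptability: does the measured quality meet the threshold?"""
        return self.measure(df) >= self.acceptability_threshold


# Hypothetical requirement: the customer email must be populated (Completeness, 98%).
email_populated = DataQualityRequirement(
    name="customer_email_populated",
    dimension="Completeness",
    rule=lambda df: df["email"].notna() & (df["email"] != ""),
    acceptability_threshold=0.98,
)

customers = pd.DataFrame({"email": ["a@x.com", None, "c@x.com"]})
print(email_populated.measure(customers), email_populated.is_acceptable(customers))
```

In practice a data quality tool or rules engine plays this role; the point is only that every requirement pairs a business rule with a measurement and an acceptability threshold.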
8. 12.2.1 Data Quality Management Approach
• The Deming cycle is the general approach to DQM (Figure 12.2).
• It proposes a problem-solving model known as "plan-do-study-act".
• When applied to data quality within the constraints of defined data quality SLAs, it involves:
• Planning for the assessment of the current state and identification of key metrics for measuring data quality.
• Deploying processes for measuring and improving the quality of data.
• Monitoring and measuring the levels in relation to the defined business expectations.
• Acting to resolve any identified issues to improve data quality and better meet business expectations.
10. 12.2.1 Data Quality Management Approach
• The DQM cycle (a code sketch of one pass through the cycle follows below):
• It begins by identifying the data issues that are critical to the achievement of business objectives, the business requirements for data quality, the key data quality dimensions, and the business rules critical to ensuring high-quality data.
• In the plan stage, assess the scope of known issues, determining the cost and impact of the issues and evaluating alternatives for addressing them.
• In the deploy stage, profile the data and institute inspections and monitors to identify data issues when they occur; flawed processes are fixed at this stage.
• In the monitor stage, actively monitor the quality of data as measured against the defined business rules and the defined acceptability thresholds.
• In the act stage, take action to address and resolve emerging data quality issues.
• A new cycle begins as new data sets come under investigation, or as new data quality requirements are identified for existing data sets.
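As a concrete illustration of the cycle, the sketch below walks through one pass of plan, deploy, monitor, and act on a toy customer table. It is a hypothetical example, not the DMBOK's own procedure: the metric names, thresholds, and columns are invented, and the plan and deploy stages are reduced to simple functions.

```python
# Illustrative single pass through the DQM cycle (plan, deploy, monitor, act).
import pandas as pd


def plan() -> dict:
    """Plan: identify key metrics and acceptability thresholds for critical data elements."""
    return {"email_completeness": 0.95, "customer_id_uniqueness": 1.00}


def deploy(df: pd.DataFrame) -> dict:
    """Deploy: institute measurement processes that produce raw data quality scores."""
    return {
        "email_completeness": float(df["email"].notna().mean()),
        "customer_id_uniqueness": float((~df["customer_id"].duplicated()).mean()),
    }


def monitor(scores: dict, thresholds: dict) -> list:
    """Monitor: compare measured levels against the defined business expectations."""
    return [metric for metric, score in scores.items() if score < thresholds[metric]]


def act(breaches: list) -> None:
    """Act: resolve issues; here we only notify, a real process would route to a data steward."""
    for metric in breaches:
        print(f"data quality issue: {metric} is below its acceptability threshold")


if __name__ == "__main__":
    customers = pd.DataFrame({"customer_id": [1, 2, 2], "email": ["a@x.com", None, "c@x.com"]})
    thresholds = plan()
    scores = deploy(customers)
    act(monitor(scores, thresholds))
```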
12. 12.2.2 Develop and Promote Data Quality Awareness
• Promoting data quality awareness is essential to ensure buy-in from the necessary stakeholders in the organization.
• Awareness includes:
• Relating material business impacts to data issues.
• Ensuring systematic approaches to regulation and oversight of the quality of organizational data.
• Socializing the concept that data quality problems cannot be addressed by technology solutions alone.
• Establishing a data governance framework for data quality, whose tasks include:
• Engaging business partners who will work with the data quality team and champion the DQM program.
• Identifying data ownership roles and responsibilities.
• Assigning accountability and responsibility for critical data elements and for DQM.
• Identifying key data quality areas to address, and directing the organization's attention to those key areas.
• Synchronizing data elements used across lines of business and providing clear, unambiguous definitions, value domains, and data quality rules.
• Continuously reporting on the measured levels of data quality.
• Introducing data requirements analysis as part of the overall system development life cycle.
• Tying high-quality data to individual performance objectives.
13. 12.2.2 Develop and Promote Data Quality Awareness
• The Data Quality Oversight Board is a council of roles accountable for the policies and procedures that oversee the data quality community.
• The guidance it provides includes:
• Setting priorities for data quality.
• Developing and maintaining standards for data quality.
• Reporting relevant measurements of enterprise-wide data quality.
• Providing guidance that facilitates staff involvement.
• Establishing communication mechanisms for knowledge sharing.
• Developing and applying certification and compliance policies.
• Monitoring and reporting on performance.
• Identifying opportunities for improvement and building consensus for approval.
• Resolving variations and conflicts.
• The constituent participants work together to define the data quality strategy and framework:
• They develop, formalize, and approve information policies and data quality standards and protocols, and certify line-of-business conformance to the desired level of business user expectations.
14. 12.2.3 Define Data Quality Requirements
• The quality of data should be understood within the context of "fitness for use".
• Data quality requirements are often hidden within defined business policies.
• Incremental, detailed review and iterative refinement of business policies help identify those requirements ("data quality rules").
• Incremental detailed review steps include:
1. Identifying key data components associated with business policies.
2. Determining how identified data assertions affect the business.
3. Evaluating how data errors are categorized within a set of data quality dimensions.
4. Specifying the business rules that measure the occurrence of data errors.
5. Providing a means of implementing measurement processes that assess conformance to those business rules.
15. 12.2.3 Define Data Quality Requirements
• The dimensions of data quality frame business rules in terms of measurements at levels such as the data value, data element, data record, and data table. These dimensions include (a few are illustrated in the sketch after this list):
• Accuracy: the degree to which data correctly represents the "real-life" entities it models.
• Completeness: rules indicating that certain attributes always have assigned values in a data set, and that all appropriate rows in a data set are present.
• Consistency: ensuring that data values in one data set are consistent with values in another data set.
• Currency: the degree to which information is current with the world that it models.
• Precision: the level of detail of the data element.
• Privacy: the need for access control and usage monitoring.
• Reasonableness: consistency expectations relevant within specific operational contexts.
• Referential integrity: the condition that exists when all intended references from data in one column to data in another column of the same or a different table are valid.
• Timeliness: the time expectation for accessibility and availability of information.
• Uniqueness: no entity exists more than once within the data set.
• Validity: whether data instances are stored, exchanged, or presented in a format consistent with the domain of values, and consistent with other similar attribute values.
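A few of these dimensions lend themselves to direct measurement. The pandas sketch below shows assumed checks for completeness, uniqueness, referential integrity, and currency; the tables, columns, and the one-year freshness rule are invented for illustration.

```python
# Illustrative pandas checks for a few of the dimensions listed above.
import pandas as pd

customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "updated_at": pd.to_datetime(["2024-01-05", "2022-06-01", "2024-02-10"]),
})
orders = pd.DataFrame({"order_id": [10, 11, 12], "customer_id": [1, 2, 9]})

# Completeness: the attribute always has an assigned value.
completeness = customers["customer_id"].notna().mean()

# Uniqueness: no entity appears more than once in the data set.
uniqueness = 1.0 - customers["customer_id"].duplicated().mean()

# Referential integrity: every order references an existing customer.
referential_integrity = orders["customer_id"].isin(customers["customer_id"]).mean()

# Currency: share of records refreshed within the last year (an assumed freshness rule).
as_of = pd.Timestamp("2024-03-01")
currency = ((as_of - customers["updated_at"]) < pd.Timedelta(days=365)).mean()

print(completeness, uniqueness, referential_integrity, currency)
```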
16. 12.2.4 Profile, Analyze and Assess Data Quality
• Before defining quality metrics, there are two approaches to performing assessments:
• A bottom-up assessment of existing data quality issues involves inspection and evaluation of the data sets, highlighting potential issues based on the results of automated processes such as frequency analysis and duplicate analysis.
• A top-down approach to data quality assessment involves engaging business users to document their business processes and the corresponding critical data dependencies.
• The data quality analyst can then assess the kinds of business impacts that are associated with data issues. The steps of the analysis process are (a profiling sketch follows after this list):
• Identify a data set for review.
• Catalog the business uses of that data set.
• Subject the data set to empirical analysis using data profiling tools and techniques.
• List all potential anomalies.
• For each anomaly:
• Review the anomaly with a subject matter expert (SME) to determine whether it represents a true data flaw.
• Evaluate potential business impacts.
• Prioritize the criticality of important anomalies in preparation for defining data quality metrics.
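The empirical analysis step typically starts with simple profiling. The sketch below shows what a minimal bottom-up pass (null counts, frequency analysis, duplicate analysis) might look like in pandas; the sample data and the choice of checks are assumptions, not a prescribed tool.

```python
# Minimal bottom-up profiling sketch (the sample data is invented).
import pandas as pd

df = pd.DataFrame({
    "customer_id": [1, 2, 2, 4],
    "country": ["US", "US", "us", None],
})

for column in df.columns:
    print(f"--- {column} ---")
    print("null count:", df[column].isna().sum())
    print("distinct values:", df[column].nunique(dropna=True))
    # Frequency analysis: unexpected spellings ("us" vs "US") and rare values stand out here.
    print(df[column].value_counts(dropna=False).head(10))

# Duplicate analysis: candidate duplicate records on what should be a unique key.
print("duplicate customer_id rows:")
print(df[df["customer_id"].duplicated(keep=False)])
```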
17. 12.2.5 Define Data Quality Metrics
• In DQM, Metrics occurs as part of the strategy/ design/ plan step in
order to implement the function in an organization.
• The characteristics of information quality metrics are:
• Measurability: a data quality metric must be measurable and quantifiable within a
discrete range.
• Business relevance: the value of the metric is limited if it cannot be related to
some aspect of business operations or performance.
• Acceptability: the quality of data is judged against specified acceptability
thresholds that reflect business expectations.
• Accountability / Stewardship: when the metric indicates that quality does not
meet expectations, identified parties are accountable for taking appropriate corrective action.
• Trackability: Quantifiable metrics enable an organization to measure data
quality improvement over time.
18. 12.2.5 Define Data Quality Metrics
The process for defining data quality metrics is summarized as:
1. Select one of the identified critical business impacts.
2. Evaluate the dependent data elements, and the create and update
processes associated with that business impact.
3. For each data element, list any associated data requirements.
4. For each data expectation, specify the associated dimension of data quality
and one or more business rules to use to determine conformance of the
data expectation.
5. For each selected business rule, describe the process for measuring
conformance.
6. For each business rule, specify an acceptability threshold.
• The results of defining data quality metrics:
• A set of measurement processes provides raw data quality scores that can roll
up to quantify conformance to data quality expectations.
• Measurements that do not meet the specified acceptability thresholds indicate
nonconformance, and data remediation is necessary.
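A minimal sketch of how raw measurement results might roll up into a score that is compared to an acceptability threshold; the rule names, counts, and threshold are illustrative assumptions.

```python
# Minimal sketch: rolling raw measurement results up into a data quality score
# and comparing it to an acceptability threshold. All names are illustrative.
from dataclasses import dataclass

@dataclass
class RuleResult:
    rule_name: str
    records_tested: int
    records_conforming: int

def metric_score(results: list[RuleResult]) -> float:
    # Aggregate conformance across all measurement processes for the metric.
    tested = sum(r.records_tested for r in results)
    conforming = sum(r.records_conforming for r in results)
    return conforming / tested if tested else 1.0

results = [RuleResult("email_present", 1000, 970), RuleResult("state_in_domain", 1000, 995)]
score = metric_score(results)
threshold = 0.98   # acceptability threshold agreed with the business
if score < threshold:
    print(f"Nonconformance: score {score:.3f} below threshold {threshold}; remediation needed")
```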
19. 12.2.6 Define Data Quality Business Rules
• The process of instituting the measurement of conformance to specific
business rules requires definition. Monitoring conformance to these
business rules requires:
• Segregating data values, records, and collections of records that do not meet
business needs from the valid ones.
• Generating a notification event alerting a data steward of a potential data quality
issue.
• Establishing an automated or event driven process for aligning or possibly
correcting flawed data within business expectations.
• The first process uses assertions about expectations of the data. Templates can be
used to specify these business rules (a small sketch of three templates follows the
list), such as:
• Value Domain membership
• Definitional Conformance
• Range conformance
• Format Compliance
• Mapping Conformance
• Value presence and record completeness
• Consistency rules
• Accuracy verification
• Uniqueness verification
• Timeliness validation
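A minimal sketch of three of these templates (value domain membership, range conformance, and format compliance) expressed as simple Python predicates; the domain, range, and pattern shown are illustrative assumptions. Records failing any rule could then be segregated and a data steward notified, as described above.

```python
# Minimal sketch of three rule templates as predicates over a single value.
# The domain, range, and pattern below are illustrative assumptions.
import re

US_STATES = {"NY", "CA", "TX"}                       # value domain membership
AGE_RANGE = (0, 120)                                 # range conformance
PHONE_PATTERN = re.compile(r"^\d{3}-\d{3}-\d{4}$")   # format compliance

def in_domain(value, domain=US_STATES) -> bool:
    return value in domain

def in_range(value, low=AGE_RANGE[0], high=AGE_RANGE[1]) -> bool:
    return low <= value <= high

def matches_format(value, pattern=PHONE_PATTERN) -> bool:
    return bool(pattern.match(str(value)))

record = {"state": "ZZ", "age": 34, "phone": "212-555-0100"}
failures = [name for name, ok in [
    ("state domain", in_domain(record["state"])),
    ("age range", in_range(record["age"])),
    ("phone format", matches_format(record["phone"])),
] if not ok]
# Non-conforming records can be segregated and a data steward notified.
```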
20. 12.2.7 Test and Validate Data Quality Requirements
• Data Profiling tools analyze data to find potential anomalies.
• Most of these tools allow data analysts to:
• Define data rules for validation.
• Assess frequency distributions and corresponding measurements.
• Apply the defined rules against the data sets.
• Review the results and verify whether non-conformant data is actually incorrect.
• Review business rules with business clients to ensure these rules correspond
to business requirements.
• Use data rules to characterize levels of conformance, providing an objective
measure of data quality.
• The organization can then distinguish the records that conform to defined data
quality expectations from those that do not.
21. 12.2.8 Set and Evaluate Data Quality Service Levels
• Data quality inspection and monitoring are used to measure and
monitor compliance with defined data quality rules.
• SLAs specify the organization’s expectations for response and
remediation.
• There is an expectation that the operational procedures will provide a
scheme for remediation of the root cause within an agreed-to
timeframe.
• Operational data quality control defined in a data quality SLA includes:
• The data elements covered by the agreement.
• The business impacts associated with data flaws.
• The data quality dimensions associated with each data element.
• The expectations for quality for each data element for each of the identified
dimensions in each application in the value chain.
• The methods for measuring against those expectations.
• The acceptability threshold for each measurement.
• The individual(s) to be notified in case the acceptability threshold is not met.
• The timelines and deadlines for expected resolution or remediation of the
issue.
• The escalation strategy, and possible rewards and penalties, when the
resolution times are not met.
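A minimal sketch of how the operational controls of a data quality SLA might be captured as structured configuration; every element, threshold, contact, and timeframe shown is an illustrative assumption.

```python
# Minimal sketch: the operational controls of a data quality SLA as a structured
# configuration. All elements, thresholds, and contacts are illustrative.
data_quality_sla = {
    "data_element": "customer_email",
    "business_impact": "failed customer communications",
    "dimension": "completeness",
    "expectation": "email populated for all active customers",
    "measurement_method": "daily batch completeness check",
    "acceptability_threshold": 0.98,
    "notify_on_breach": ["data.steward@example.com"],
    "resolution_deadline_hours": 48,
    "escalation_contact": "dq.manager@example.com",
}
```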
22. 12.2.8 Set and Evaluate Data Quality Service Levels
• The data quality SLA also defines the roles and responsibilities
associated with performance of operational data quality procedures:
• Providing reports on conformance to the defined business rules.
• Monitoring staff performance in reacting to data quality incidents.
• When data stewards and operational data quality staff are upholding the level of
data quality service, they should take data quality SLA constraints into
consideration and connect data quality to individual performance plans.
• An escalation process applies when existing issues are not addressed within
the specified resolution times, communicating non-observance of the level of
service up the management chain.
• The data quality SLA establishes the time limits for notification generation, the
names of those in that management chain, and when escalation needs to occur.
• The data quality team can monitor both compliance of the data with business
expectations and how well the data quality team performs the procedures
associated with data errors.
23. 12.2.9 Continuously Measure and Monitor Data Quality
• The operational DQM procedures depend on available services for
measuring and monitoring the quality of data.
• For conformance to data quality business rules, two contexts for
control and measurement exist: in-stream and batch.
• Measurements can be taken at three levels of granularity, namely data element
value, data instance or record, and data set, making six possible measures.
• Batch activities are performed on collections of data instances assembled in a data
set, most likely in persistent storage.
• Continuous monitoring is provided by incorporating control and
measurement processes into the information processing flow.
• For full data sets, the only in-stream measurement points are at hand-offs
between processing stages.
• Incorporate data quality rules using the techniques detailed in Table
12.1
• Incorporating control and measurement processes, together with
operational procedures and a reporting framework, enables
monitoring of the levels of data quality (a small sketch of the two
measurement contexts follows).
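A minimal sketch contrasting the two measurement contexts; check_record stands in for whatever rules are applied at data element or record granularity, and the function names are assumptions.

```python
# Minimal sketch of the two measurement contexts: in-stream and batch.
# check_record is a stand-in for record-level data quality rules.
from typing import Callable, Iterable

def in_stream_monitor(records: Iterable[dict], check_record: Callable[[dict], bool]):
    """Apply checks as each record flows through the processing pipeline."""
    for record in records:
        record["dq_pass"] = check_record(record)   # measurement travels with the record
        yield record

def batch_monitor(data_set: list[dict], check_record: Callable[[dict], bool]) -> float:
    """Apply the same checks to a full data set assembled in persistent storage."""
    passed = sum(check_record(r) for r in data_set)
    return passed / len(data_set) if data_set else 1.0
```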
24. 12.2.10 Manage Data Quality Issues
• Data quality incident reporting: the system should be able to report and
track data quality incidents and the activities for researching and resolving
them, through logs of evaluation, diagnosis, and subsequent actions
associated with data quality events.
• Many organizations already have incident reporting systems for
tracking and managing software, hardware and network issues.
• Incorporating data quality incident tracking focuses on organizing
the categories of data issues into incident hierarchies.
• It also focuses on training staff to recognize when data issues appear
and how they are to be classified, logged, and tracked according to the
data quality SLA (the next slide lists directives for applying this).
• Benefits of implementing data quality issues tracking system:
• Information and knowledge sharing can improve performance and reduce
duplication of effort.
• Analysis of all the issues will help data quality team members determine any
repetitive patterns, their frequency, and the potential sources of issues.
• Depending on the governance, data quality SLA reporting can be
monthly, quarterly or annually, particularly in cases focused on
rewards and penalties.
25. 12.2.10 Manage Data Quality Issues
• The steps involve some or all of the following incident reporting system directives:
• Standardize data quality issues and activities: because the concepts and terms for
data issues vary, standardization is useful for classification and reporting, and makes it
easier to measure the volume of issues and identify patterns and interdependencies
between systems and participants.
• Provide an assignment process for data issues: the assignment process
should be driven within the incident tracking system, by suggesting those
individuals with specific areas of expertise.
• Manage issue escalation procedures: helps expedite efficient handling and
resolution of data issues.
• Manage data quality resolution workflow: to track progress with issue
diagnosis and resolution.
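A minimal sketch of an incident record supporting these directives: standardized categories, assignment by area of expertise, and SLA-driven escalation; the categories, routing table, and time limit are illustrative assumptions.

```python
# Minimal sketch of an incident record: standardized category, assignment,
# escalation check, and a simple resolution workflow status.
from dataclasses import dataclass, field
from datetime import datetime, timedelta

CATEGORY_OWNERS = {"completeness": "steward_a", "format": "steward_b"}  # assignment by expertise
RESOLUTION_LIMIT = timedelta(hours=48)                                  # from the data quality SLA

@dataclass
class DataQualityIncident:
    category: str                      # standardized issue category
    description: str
    opened_at: datetime = field(default_factory=datetime.utcnow)
    assigned_to: str = ""
    status: str = "new"                # new -> diagnosed -> resolved

    def assign(self) -> None:
        # Suggest an owner based on the category's area of expertise.
        self.assigned_to = CATEGORY_OWNERS.get(self.category, "dq_team")

    def needs_escalation(self, now: datetime) -> bool:
        # Escalate when the issue is still open past the SLA resolution limit.
        return self.status != "resolved" and now - self.opened_at > RESOLUTION_LIMIT
```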
26. 12.2.11 Clean and Correct Data Quality Defects
• The use of business rules for monitoring conformance to
expectations leads to two operational activities:
• Determine and eliminate the root cause of the introduction of errors.
• Isolate the data items that are incorrect, and bring the data into conformance
with expectations.
• Three general ways to perform data correction (sketched after this list):
• Automated correction: submit data for cleansing using a collection of data
transformations, standardizations, normalizations, and corrections, without
manual intervention.
• Manual directed correction: use automated tools to cleanse and correct data
but require manual review before committing the corrections to persistent
storage.
• Manual Correction: Data stewards inspect invalid records and determine the
correct values, make the corrections, and commit the updated records.
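A minimal sketch of how the three correction modes might be distinguished in code; the correction rule and field names are illustrative assumptions.

```python
# Minimal sketch of the three correction modes: automated, manual directed,
# and manual. The standardization rule and field names are illustrative.
def standardize_state(value: str) -> str | None:
    aliases = {"new york": "NY", "calif.": "CA"}
    cleaned = value.strip().lower()
    if cleaned in aliases:
        return aliases[cleaned]
    return value if len(value.strip()) == 2 else None   # None: cannot auto-correct

def correct_record(record: dict, auto_commit: bool):
    proposed = dict(record, state=standardize_state(record.get("state", "")))
    if proposed["state"] is None:
        return record, "manual_correction"             # steward must determine the value
    if auto_commit:
        return proposed, "automated_correction"        # committed without manual intervention
    return proposed, "manual_directed_correction"      # reviewed before committing to storage
```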
27. 12.2.12 Design and Implement Operational DQM Procedures
• Defined rules for validation of data quality provide a means of
integrating data inspection into a set of operational procedures
associated with active DQM.
• Integrate the data quality rules into the application services or data
services that support the data life cycle, either through the
introduction of data quality tools and technology, the use of rules
engines and reporting tools for monitoring and reporting, or
custom-developed applications for data quality inspection.
• The operational framework requires these services to be available to
the applications and data services, and the results presented to the
data quality team members.
28. 12.2.12 Design and Implement Operational DQM Procedures
• The operational framework requires the team to design and
implement detailed procedures for operationalizing four activities:
• 1 – Inspection and monitoring. The analyst must:
• Review the measurements and metrics.
• Determine whether any acceptability thresholds are not met.
• Create a new data quality incident report.
• Assign the incident for diagnosis and evaluation.
• 2 – Diagnosis and evaluation of remediation alternatives. The analyst must:
• Review the data issue and trace the error back to its origin.
• Evaluate whether changes in the environment caused the error.
• Evaluate whether process issues contributed to the incident.
• Determine whether any external data provider issues affect data quality.
• Evaluate alternatives for addressing the issue.
• Provide updates to the data quality incident tracking system.
29. 12.2.12 Design and Implement Operational DQM Procedures
• The operational framework requires the team to design and
implement detailed procedures for operationalizing four activities (continued):
• 3 – Resolving the issue. The analyst must:
• Assess the relative costs and merits of the alternatives.
• Recommend one of the alternatives.
• Provide a plan for resolution, including modifying the process and correcting
flawed data.
• Implement the resolution.
• Provide updates to the data quality incident tracking system.
• 4 – Reporting. The analyst must produce:
• Data quality scorecard.
• Data quality trends.
• Data quality performance.
• These reports should align with the metrics in the data quality SLA, so that the
areas important to the achievement of the data quality SLA appear, at some
level, in internal team reports.
30. 12.2.13 Monitor Operational DQM Procedures and Performance
• Accountability is critical to the governance of data quality control.
• All issues must be assigned to some number of individuals, groups,
departments or organizations.
• The incident tracking system will collect performance data relating
to issue resolution, work assignments, volume of issues, frequency
of occurrence, as well as the time to respond, diagnose, plan a
solution, and resolve issues.
• These metrics can provide:
• Valuable insights into the effectiveness of the current workflow
• systems and resource utilization
• important management data points that can drive continuous operational
improvement for data quality control.
32. 12.3 Data Quality Tools
• DQM employs well-established tools and techniques that focus on:
• Empirically assessing the quality of data through data analysis.
• Normalizing data values in accordance with defined business rules.
• Identifying and resolving duplicate records into a single representation.
• Scheduling inspections and changes on a regular basis.
• Data Quality tools can be segregated into four categories of
activities:
• Analysis
• Cleansing
• Enhancement
• Monitoring
• The principal tools used are:
• Data Profiling
• Parsing and standardization
• Data transformation
• Identity resolution and matching
• Enhancement
• Reporting
34. Data Quality Tools
12.3.1 Data Profiling
• Data profiling is a set of algorithms for the analysis and discovery of data
values, used for two purposes:
• Statistical analysis and assessment of the quality of data values within a data
set.
• Exploring relationships that exist between value collections within and across
data sets.
• Profiling can summarize key characteristics of the values within each column,
such as the minimum, maximum, and average values (a small sketch follows).
• Data profiling can also analyze and assess data anomalies by performing
cross-column and inter-table analysis.
• It can also proactively test data against a set of defined or discovered
business rules to support data quality reporting processes.
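A minimal sketch (Python with pandas) of column-level profiling that summarizes non-null counts, distinct values, and min/max/mean per column; the function name and output layout are assumptions.

```python
# Minimal sketch: column-level profiling of a pandas DataFrame.
import pandas as pd

def profile_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Summarize key characteristics of the values within each column."""
    rows = []
    for col in df.columns:
        values = df[col].dropna()
        rows.append({
            "column": col,
            "non_null": len(values),
            "distinct": values.nunique(),
            "min": values.min() if not values.empty else None,
            "max": values.max() if not values.empty else None,
            "mean": values.mean() if pd.api.types.is_numeric_dtype(values) else None,
        })
    return pd.DataFrame(rows)
```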
35. Data Quality Tools
12.3.2 Parsing and Standardization
• Data parsing tools enable the data analyst to define sets of patterns that
feed into a rules engine used to distinguish between valid and invalid
data values.
• When a valid pattern is parsed, the matching actions extract the tokens and
rearrange them into a standard representation.
• When an invalid pattern is recognized, the tool attempts to transform the value
into one that meets expectations.
• Parsing and standardizing data values is valuable where values are confusing
or ambiguous.
• Example: a standard format for telephone numbers segments the value into
area code, exchange, and line number (sketched after this slide).
• The ability to build these patterns draws on human pattern recognition.
• Pattern-based parsing can automate the recognition and subsequent
standardization of meaningful value components.
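A minimal sketch of pattern-based parsing and standardization for the telephone-number example; the regular expression and the target format are illustrative assumptions.

```python
# Minimal sketch: recognize a valid US-style phone number pattern, extract the
# tokens (area code, exchange, line number), and rearrange them into a standard
# representation. Pattern and target format are illustrative assumptions.
import re

PHONE = re.compile(r"^\(?(\d{3})\)?[\s.-]?(\d{3})[\s.-]?(\d{4})$")

def standardize_phone(value: str) -> str | None:
    match = PHONE.match(value.strip())
    if not match:
        return None                       # invalid pattern: flag for further handling
    area, exchange, line = match.groups()
    return f"({area}) {exchange}-{line}"  # standard representation

# standardize_phone("212.555.0100") -> "(212) 555-0100"
```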
36. Data Quality Tools
12.3.3 Data Transformation
• Upon identification of data errors, data rules trigger transformations of the
flawed data into a format that is acceptable to the target architecture.
• Standardization is performed by mapping data from a source pattern
into a corresponding target representation.
• Example: “Customer name” may be represented in thousands of different forms
(a small mapping sketch follows).
• Data transformation builds on these types of standardization
techniques.
• By mapping data values in their original formats and patterns into a target
representation.
• Parsed components of a pattern are subjected to rearrangement, corrections,
or any changes as directed by the rules in the knowledge base.
• Standardization is a special case of transformation, employing rules
that capture context, linguistics, and idioms recognized as common
over time, through repeated analysis by the rules analyst or tool
vendor.
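A minimal sketch of transforming parsed customer-name variants into one target representation; the source patterns handled and the target format are illustrative assumptions.

```python
# Minimal sketch: map "customer name" variants into a single target
# representation. Handles only two illustrative source patterns.
def standardize_name(raw: str) -> str:
    raw = " ".join(raw.replace(".", "").split())      # strip punctuation and extra spaces
    if "," in raw:                                     # "Smith, John" -> last-name-first pattern
        last, first = [part.strip() for part in raw.split(",", 1)]
    else:                                              # "John Smith" -> first-name-first pattern
        first, _, last = raw.rpartition(" ")
    return f"{last.upper()}, {first.title()}"          # target representation

# standardize_name("john   smith") and standardize_name("SMITH, JOHN")
# both map to "SMITH, John"
```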
37. Data Quality Tools
12.3.4 Identity Resolution and Matching
• Identity resolution and matching are performed in strategic initiatives to identify
and evaluate the “similarity” of records.
• A common data quality problem involves two sides of the same coin:
• Multiple data instances that actually refer to the same real-world entity.
• The perception, by an analyst or application, that a record does not exist for a
real-world entity, when in fact it does.
• In the first situation, something has introduced similar, yet variant,
representations of data values into the system; in the second, a slight variation
in representation prevents the identification of an exact match to the existing
record in the data set.
• For both situations, similarity analysis scores the degree of similarity between
records in order to recognize the variations.
• Deterministic and probabilistic are the two approaches to matching (a small
deterministic sketch follows):
• Deterministic matching relies on defined patterns and rules for assigning weights and
scores to determine similarity.
• Probabilistic matching, the alternative, relies on statistical techniques for assessing the
probability that a pair of records represents the same entity.
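A minimal sketch of deterministic matching: defined weights are applied to field-level similarity scores and the total is compared to a threshold; the weights, threshold, and fields are illustrative assumptions, and a probabilistic matcher would instead estimate them statistically from the data.

```python
# Minimal sketch of deterministic matching: defined rules assign weights to
# field-level similarity scores; the total decides whether two records are
# likely the same real-world entity. Weights and threshold are illustrative.
from difflib import SequenceMatcher

WEIGHTS = {"name": 0.5, "address": 0.3, "birth_date": 0.2}

def field_similarity(a: str, b: str) -> float:
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def match_score(rec_a: dict, rec_b: dict) -> float:
    return sum(w * field_similarity(rec_a[f], rec_b[f]) for f, w in WEIGHTS.items())

a = {"name": "Jon Smith", "address": "12 Main St", "birth_date": "1980-01-02"}
b = {"name": "John Smith", "address": "12 Main Street", "birth_date": "1980-01-02"}
likely_same_entity = match_score(a, b) >= 0.85   # illustrative threshold
```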
38. Data Quality Tools
12.3.5 Enhancement
• Data enhancement is a method of adding value to information by
accumulating additional information about a base set of entities and
then merging all the sets to provide an enhanced view of the data.
• Data parsing helps in determining potential sources of added benefit.
• Examples of data enhancement:
• Time/Date stamps: document the time and date that data items are created,
modified, or retired, which can help to track historical data events.
• Auditing Information: document data lineage, which also is important for
historical tracking as well as validation.
• Contextual Information: several geographic enhancements are possible,
such as address standardization and geocoding, including regional coding,
municipality, neighborhood mapping, latitude / longitude pairs, or other
kinds of location-based data.
• Demographic Information: demographic enhancements such as customer
age, marital status, gender, and income; or, for business entities, annual
revenue, number of employees, size of occupied space, etc.
• Psychographic Information: use these kinds of enhancements to segment the
target population by specified behaviors, such as product and brand
preferences, organization memberships, etc.
39. Data Quality Tools
12.3.6 Reporting
• All of the following are supported by good reporting:
• Inspection and monitoring of conformance to data quality expectations.
• Monitoring performance of data stewards conforming to data quality SLAs.
• Workflow processing for data quality incidents.
• Manual oversight of data cleansing and correction
• It is optimal to have a user interface to report results associated with
data quality measurement, metrics, and activity.
• It is wise to incorporate visualization and reporting for standard
reports, scorecards, dashboards, and for provision of ad hoc queries
as part of the functional requirements for any acquired data quality
tools.