CDMP - Certified Data
Management Professional
DMBOK V.2
Trainer :
Hery Purnama, SE., MM.
MCP, PMP, ITILF, CISA, CISM, CISSP, CDMP, COBIT, CTFL,
TOGAF9
Mr. Hery Purnama is an IT practitioner, lecturer, and IT
consultant in Bandung with more than 20 years of
experience in various IT projects, specializing in
System Development, Big Data, Data Science, Internet of
Things, ISO, Project Management, IT Service Management,
I.S Governance, InfoSec Governance, Data Governance,
Enterprise Architecture, Quality Assurance, and IT Audit.
He still works actively as a consultant and
trainer, with clients in Government, state-owned enterprises (BUMN), Mining,
Industry, Banking, and Telecommunications.
Some of the international certifications he holds are:
MCP, PMP, ITILF, COBIT, CGEIT, CDMP, CISA, CISM, CISSP, CTFL,
TOGAF 9
Exam Overview
CDMP - Certified Data Management Professional
DMBOK V.2
CDMP - Requirements
100 Questions Covering 14 Topics of DMBOK2
1. Data Management Process – 2%
2. Data Ethics – 2%
3. Data Governance – 11%
4. Data Architecture – 6%
5. Data Modelling and Design – 11%
6. Data Storage and Operations – 6%
7. Data Security – 6%
8. Data Integration and Interoperability – 6%
9. Document and Content Management – 6%
10. Master and Reference Data Management – 10%
11. Data Warehousing and Business Intelligence – 10%
12. Metadata Management – 11%
13. Data Quality – 11%
14. Big Data – 2%
Tips
Registration
• https://cdmp.info/exams/
Chapter 1 : Data Management
• LET’S PRACTICE
• https://wato.xyz/cdmppractice1
passcode : cdmp
Introduction
• Data Management is the development, execution, and supervision of
plans, policies, programs, and practices that deliver, control, protect,
and enhance the value of data and information assets throughout
their lifecycles.
• Data Management Professional is any person who works in any facet
of data management
• Data is the ‘currency’, the ‘life blood’, and even the ‘new oil’ of the
information economy.
• Business Driver : Data Asset Value
• Data Management Goals
Essential Concept
• VARIOUS DATA DEFINITIONS:
• Common definitions of data emphasize its role in representing facts about the
world. (Common)
• In IT, data is information that has been stored in digital form. (IT)
• Facts: Data is a means of representation and needs context
(Metadata) to be meaningful.
Essential Concept
• DATA VS INFORMATION :
• DATA PYRAMID (DIKW):
1. DATA (RAW) ->
2. INFORMATION (WHO, WHAT, WHEN, WHERE) ->
3. KNOWLEDGE (HOW) ->
4. WISDOM (WHY)
(Figure: DIKW pyramid, with labels KNOWLEDGE & WISDOM, DATA & INFORMATION, GOALS)
Example Data vs Information
(“Here is a sales report for the last quarter [information]. It is based on
data from our data warehouse [data]. Next quarter these results [data]
will be used to generate our quarter-over-quarter performance measures
[information]”)
Essential Concept
• Data as an Organizational Asset (an economic
resource):
just as goodwill now shows up as an item on the Profit and Loss
Statement (P&L), data is increasingly valued and monetized;
organizations use it to make more effective decisions and to
operate more efficiently
• Data Management Principles >
• Data Management Challenges (Data differs from other assets,
Data valuation, Data quality, Planning for better data,
Metadata and Data Management, Cross-
functionality ...)
Data Lifecycle
The focus of data management on the data lifecycle
IMPLICATIONS :
•Creation and usage are the most critical points in the data lifecycle
•Data Quality must be managed throughout the data lifecycle
•Metadata Quality must be managed through the data lifecycle
•Data Security must be managed throughout the data lifecycle
•Data Management efforts should focus on the most critical data
Data and Risk
• Low Quality Data (Inaccurate, Incomplete, Out of Date)
• Misunderstood, Misused
“Information gaps: the difference between what we know and what
we need to know to make an effective decision.
Information gaps represent enterprise liabilities with potentially
profound impacts on operational effectiveness and profitability.”
• The increased role of information as an organizational asset across all
sectors has led to an increased focus by regulators and legislators on
the potential uses and abuses of information
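The low-quality categories named on this slide (incomplete, out of date) can be turned into simple record-level checks. The following is a minimal sketch only: the record layout, the field names, and the 365-day freshness window are assumptions made for illustration, not part of DMBOK2.

```python
from datetime import date

# Hypothetical customer records; field names are illustrative only.
records = [
    {"id": 1, "email": "ana@example.com", "updated": date(2024, 5, 1)},
    {"id": 2, "email": None, "updated": date(2024, 4, 20)},
    {"id": 3, "email": "budi@example.com", "updated": date(2019, 1, 15)},
]

def quality_issues(record, today, max_age_days=365):
    """Return a list of low-quality flags for one record."""
    issues = []
    # Incomplete: a required field is missing or empty.
    if not record.get("email"):
        issues.append("incomplete")
    # Out of date: not refreshed within the allowed window (assumed: 365 days).
    if (today - record["updated"]).days > max_age_days:
        issues.append("out_of_date")
    return issues

today = date(2024, 6, 1)  # fixed date so the result is reproducible
report = {r["id"]: quality_issues(r, today) for r in records}
print(report)
```

Accuracy is deliberately left out of the sketch, because checking it usually requires comparing against a trusted reference source rather than inspecting the record alone.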
Data Management Strategy
The components of a data management strategy :
•A compelling vision for data management
•A summary business case for data management, with selected examples
•Guiding principles, values, and management perspectives
•The mission and long-term directional goals of data management
•Proposed measures of data management success
•Short-term (12-24 months) Data Management program objectives that are SMART
(specific, measurable, actionable, realistic, time-bound)
•Descriptions of data management roles and organizations, along with a summary of their
responsibilities and decision rights
•Descriptions of Data Management program components and initiatives
•A prioritized program of work with scope boundaries
•A draft implementation roadmap with projects and action items
Data Management Framework
• Strategic Alignment Model
• Amsterdam Information Model
Data Management Framework
DMBOK Pyramid (Aiken)
DAMA Data Management Framework Evolved
DAMA WHEEL EVOLVED
• LET’S PRACTICE
• https://wato.xyz/cdmppractice1
passcode : cdmp
1. D
2. A, B, D, E
3. A
4. B, C, D, E, F
5. A, B, E, F
6. A
7. A
8. A
Chapter 2 : Data Handling Ethics
• LET’S PRACTICE
• https://wato.xyz/cdmppractice2
passcode : cdmp
Introduction
• Data handling ethics are concerned with how to procure, store,
manage, use, and dispose of data in ways that are aligned with ethical
principles.
• Handling data in an ethical manner is necessary to the long-term
success of any organization that wants to get value from its data.
Core Concept
• Impact on people: Because data represents characteristics of
individuals and is used to make decisions that affect people’s lives,
there is an imperative to manage its quality and reliability.
• Potential for misuse: Misusing data can negatively affect people and
organizations, so there is an ethical imperative to prevent the misuse
of data.
• Economic value of data: Data has economic value. Ethics of data
ownership should determine how that value can be accessed and by
whom.
Context Diagram >
• There is an ethical
imperative not only to
protect data, but also to
manage quality
Business Driver
• Ethical data handling can increase the trustworthiness of an
organization and the organization’s data and process outcomes.
• This can create better relationships between the organization and its
stakeholders.
“The emerging roles of Chief Data Officer, Chief Risk Officer, Chief Privacy Officer,
and Chief Analytics Officer are focused on controlling risk by establishing
acceptable practices for data handling “
Ethical Principles for Data
• Respect for Persons: treat people in a way that respects their dignity and autonomy as
human individuals
• Beneficence: do no harm; maximize possible benefits and
minimize possible harms.
• Justice: fair and equitable treatment of people
Principles Behind Data Privacy Law
Risks of Unethical Data Handling Practices
• Timing
• Misleading Visualizations
• Unclear Definitions or Invalid Comparisons
• Bias (Data collection for a pre-defined result, Biased use of data collected,
Hunch and search, Biased sampling methodology, Context and culture)
• Transforming and Integrating Data (Limited knowledge of data’s origin
and lineage, Data of poor quality, Unreliable Metadata, No
documentation)
• Obfuscation / Redaction of Data (Data aggregation, Data marking, Data
masking)
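To make the data masking item concrete, here is a small sketch of two common obfuscation techniques: partial masking of a displayed value and salted hashing of an identifier. The function names, the sample values, and the salt are hypothetical; a real program would follow the organization's own security standards, and obfuscated data can sometimes still be re-identified when combined with other data.

```python
import hashlib

def mask_email(email):
    """Mask the local part of an email, keeping the first character and the domain.

    Assumes a well-formed address with a non-empty local part.
    """
    local, _, domain = email.partition("@")
    return local[0] + "*" * (len(local) - 1) + "@" + domain

def pseudonymize(value, salt):
    """Replace an identifier with a salted SHA-256 digest (truncated for display)."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()[:12]

masked = mask_email("hery@example.com")
token = pseudonymize("3273-0001", salt="demo-salt")  # illustrative ID and salt
print(masked, token)
```

The same input and salt always give the same token, so masked data sets can still be joined on the pseudonym without exposing the original identifier.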
Establishing an Ethical Data Culture
STEPS :
• Review Current State Data Handling
Practices
• Identify Principles, Practices, and Risk
Factors
• Create an Ethical Data Handling Strategy
and Roadmap
• Adopt a Socially Responsible Ethical Risk
Model
Data analytics data handling ethics
Data Ethics and Governance
• Oversight for the appropriate handling of data falls under both data
governance and legal counsel.
• Keep up-to-date on legal changes
• CDMP formal code of ethics
• LET’S PRACTICE
• https://wato.xyz/cdmppractice2
passcode : cdmp
Chapter 3 : Data Governance
• LET’S PRACTICE
• https://wato.xyz/cdmppractice3
passcode : cdmp
GOVERNANCE VS MANAGEMENT
DATA GOVERNANCE = ensure that data is managed properly
DATA MANAGEMENT = ensure that an organization gets value out of its data
“The scope and focus of a Data Governance program depend on
organizational needs.”
Context Diagram
DATA GOVERNANCE PROGRAMS :
• Strategy: Defining, communicating, and driving execution of Data Strategy and Data
Governance Strategy
• Policy: Setting and enforcing policies related to data and Metadata management, access,
usage, security, and quality
• Standards and quality: Setting and enforcing Data Quality and Data Architecture standards
• Oversight: Providing hands-on observation, audit, and correction in key areas of quality,
policy, and data management (often referred to as stewardship)
• Compliance: Ensuring the organization can meet data-related regulatory compliance
requirements
• Issue management: Identifying, defining, escalating, and resolving issues related to data
security, data access, data quality, regulatory compliance, data ownership, policy, standards,
terminology, or data governance procedures
• Data management projects: Sponsoring efforts to improve data management practices
• Data asset valuation: Setting standards and processes to consistently define the business
value of data assets
Goals
1. Enable an organization to
manage its data as an asset.
2. Define, approve,
communicate, and implement
principles, policies, procedures,
metrics, tools, and
responsibilities for data
management.
3. Monitor and guide policy
compliance, data usage, and
management activities.
DG Business Driver
• Common driver: regulatory compliance, especially in heavily
regulated industries
• Other drivers focus on reducing risks or improving processes:
Reducing Risk: General risk management, Data security, Privacy
Improving Processes: Regulatory compliance, Data quality improvement,
Metadata Management, Efficiency in development projects (SDLC), Vendor
management
DG - Improving Processes
Data Governance vs IT Governance
Data governance is separate from IT governance.
• IT governance makes decisions about IT investments, the IT
application portfolio, and the IT project portfolio – in other words,
hardware, software, and overall technical architecture.
• IT governance aligns the IT strategies and investments with enterprise
goals and strategies.
• The COBIT (Control Objectives for Information and Related
Technology) framework provides standards for IT governance.
DG Goals and Principles
The goal of Data Governance is to enable an organization to manage data as an
asset. A DG program must be sustainable, embedded, and measured.
DG Essential Concept
Data governance represents an inherent separation of duty between
oversight and execution
Data Governance Organization
Data Governance Operating Model Types
• Centralized
• Replicated
• Federated
Data Stewardship
• Data Stewardship is the most common label to describe
accountability and responsibility for data and processes that ensure
effective control and use of data assets
• Core activities : Creating and managing core Metadata,
Documenting rules and standards, Managing data quality issues,
Executing operational data governance activities
Types of Data Stewards
Develop Data Governance Strategy
Chapter 4 : Data Architecture
• LET’S PRACTICE
• https://wato.xyz/cdmppractice4
passcode : cdmp
What is Architecture ?
• Architecture refers to an organized arrangement of component
elements intended to optimize the function, performance, feasibility,
cost, and aesthetics of an overall structure or system
Data Architecture Perspective
Data Architecture will be considered from the following perspectives:
•Data Architecture outcomes
•Data Architecture activities
•Data Architecture behavior
Together, these three form the essential components of Data
Architecture.
Introduction
• The most detailed Data Architecture design document is a formal
enterprise data model, containing data names, comprehensive data
and Metadata definitions, conceptual and logical entities and
relationships, and business rules.
• Physical data models are included, but as a product of data modeling
and design, rather than Data Architecture.
Business Driver
Goals
1. Identify data storage and
processing requirements.
2. Design structures and plans to
meet the current and long-term
data requirements of the
enterprise.
3. Strategically prepare
organizations to quickly evolve
their products, services, and data
to take advantage of business
opportunities inherent in
emerging technologies.
Enterprise Architecture Domains
Zachman Framework Columns
• What (the inventory column): Entities used to build the architecture
• How (the process column): Activities performed
• Where (the distribution column): Business location and technology
location
• Who (the responsibility column): Roles and organizations
• When (the timing column): Intervals, events, cycles, and schedules
• Why (the motivation column): Goals, strategies, and means
Enterprise Architecture (Zachman)
Example
Enterprise Data Architecture
Enterprise Data Architecture defines standard terms and designs
for the elements that are important to the organization.
• Enterprise Data Model (EDM): The EDM is a holistic, enterprise-level,
implementation-independent conceptual or logical data model
providing a common, consistent view of data across the enterprise.
• Data Flow Design: Defines the requirements and master blueprint for
storage and processing across databases, applications, platforms, and
networks (the components).
Project Development Method
• Waterfall methods: Understand the requirements and construct
systems in sequential phases as part of an overall enterprise design.
• Incremental methods: Learn and construct in gradual steps (i.e.,
mini-waterfalls). This method creates prototypes based on vague
overall requirements. The initiation phase is crucial.
• Agile, iterative methods: Learn, construct, and test in discrete
delivery packages (called ‘sprints’) that are small enough that if work
needs to be discarded, not much is lost.
Modeling Tool
Project Activities
• Define scope
• Understand business requirements
• Design
• Implement: (When buying, When reusing data, When building)
Chapter 5 : Data Modeling & Design
• LET’S PRACTICE
• https://wato.xyz/cdmppractice5
passcode : cdmp
Introduction
Context Diagram
Deliverables /Level of Detail:
- Conceptual Data Model
- Logical Data Model
- Physical Data Model
Goals and Principles
• The goal of data modeling is to confirm and document understanding of
different perspectives, which leads to applications that more closely align
with current and future business requirements, and creates a foundation
to successfully complete broad-scoped initiatives such as Master Data
Management and data governance programs.
• Confirming and documenting understanding of different perspectives
facilitates :
- Formalization
- Scope Definition
- Knowledge retention/documentation
Types of Data that are Modeled
• Category information: Data used to classify and assign types to
things.
• Resource information: Basic profiles of resources needed to conduct
operational processes, such as Product, Customer, Supplier, Facility,
Organization, and Account.
• Business event information: Data created while operational
processes are in progress.
• Detail transaction information: Detailed transaction information is
often produced through point-of-sale systems (either in stores or
online).
• These four types all describe ‘data at rest’.
Data Model Components
• Entity
• Relationship
• Domain
Arity of Relationships - Model component
Relational data in model Scheme
• A foreign key is used in physical and sometimes logical
relational data modeling schemes to represent a
relationship
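As a sketch of how a foreign key represents a relationship in a physical model, the example below uses Python's built-in sqlite3 module. The customer and sales_order tables and their columns are hypothetical names chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE sales_order ("
    " order_id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customer(customer_id))"
)
conn.execute("INSERT INTO customer VALUES (1, 'Ana')")
conn.execute("INSERT INTO sales_order VALUES (100, 1)")  # parent row exists

try:
    # Customer 99 does not exist, so the relationship cannot be formed.
    conn.execute("INSERT INTO sales_order VALUES (101, 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

order_count = conn.execute("SELECT COUNT(*) FROM sales_order").fetchone()[0]
print(rejected, order_count)
```

The NOT NULL on sales_order.customer_id makes the relationship mandatory on the order side: every order must belong to exactly one existing customer.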
Model Scheme
Domain – Model Component
In data modeling, a domain is the complete set of
possible values that an attribute can be assigned.
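A domain can be expressed in code as a rule that an attribute value must satisfy. The sketch below assumes two illustrative domains, a list domain for an order status and a range domain for a month number; neither the names nor the values come from the slides.

```python
# List domain: the attribute may take only these values (hypothetical list).
ORDER_STATUS_DOMAIN = frozenset({"OPEN", "SHIPPED", "CLOSED"})

def in_list_domain(value, domain):
    """True when the value is one of the explicitly allowed values."""
    return value in domain

def in_range_domain(value, low, high):
    """True when the value is an integer within the inclusive range."""
    return isinstance(value, int) and low <= value <= high

checks = {
    "status SHIPPED": in_list_domain("SHIPPED", ORDER_STATUS_DOMAIN),
    "status LOST": in_list_domain("LOST", ORDER_STATUS_DOMAIN),
    "month 12": in_range_domain(12, 1, 12),
    "month 13": in_range_domain(13, 1, 12),
}
print(checks)
```

Defining the domain once and reusing it for every attribute that shares it is what keeps values consistent across a model.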
Object Oriented Model (UML)
• Time-based patterns are used when data values must be
associated in chronological order and with specific time
values.
Chapter 6 : Data Storage & Operations
• LET’S PRACTICE
• https://wato.xyz/cdmppractice678
passcode : cdmp
Introduction
• Data Storage and Operations includes the design, implementation,
and support of stored data, to maximize its value throughout its
lifecycle, from creation/acquisition to disposal.
• Data Storage and Operations includes two sub-activities:
1. Database support
2. Database technology support
• Key role: Database Administrator (DBA)
Context Diagram
The goals of data storage
and operations include:
• Managing the
availability of data
throughout the data
lifecycle
• Ensuring the integrity
of data assets
• Managing the
performance of data
transactions
SLA
Service Level Agreement Principles Practice:
• The Service Level Agreement(SLA) can reflect DBA-recommended and
developer-accepted methods of ensuring data integrity and data
security. The SLA should reflect the transfer of responsibility from the
DBAs to the development team if the development team will be
coding their own database update procedures or data access layer.
• This prevents an ‘all or nothing’ approach to standards.
Procedural and Development DBAs
Procedural DBAs:
• Lead the review and administration of procedural database objects.
• Specialize in the development and support of procedural logic controlled
and executed by the DBMS.
Development DBAs:
• Focus on data design activities, including creating
and managing special-use databases.
Database Architecture Types
Database Processing Types
CAP (BREWER’S THEOREM): Consistency, Availability, and Partition Tolerance
A distributed system can fully provide at most two of the three, so it closely matches either:
• ACID (Atomicity, Consistency, Isolation, Durability)
• BASE (Basically Available, Soft State, Eventual Consistency)
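The Atomicity and Consistency parts of ACID can be illustrated with a single-node transaction: either every step of a transfer is applied, or none is. This is a minimal sketch using Python's built-in sqlite3 module; the account table, the balances, and the transfer function are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"  # consistency rule
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move an amount between accounts as one atomic unit of work."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer(conn, 1, 2, 30)       # succeeds
failed = transfer(conn, 1, 2, 500)  # debit would go negative; the credit is undone too
balances = dict(conn.execute("SELECT id, balance FROM account"))
print(ok, failed, balances)
```

A BASE-style system would instead accept the write on one node and let replicas converge later, trading immediate consistency for availability.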
ACID vs BASE
CAP
Other Knowledge Concerns
• MANAGE SLA
• MANAGE DATABASE ACCESS CONTROL
• MANAGE DATABASE PERFORMANCE
• MANAGE DATABASE BACKUP RECOVERY, REPLICATION
• MANAGE PHYSICAL STORAGE ENVIRONMENT & IMPLEMENTATION
• MANAGE DATASET & MIGRATION
• TOOLS & TECHNIQUES
• IMPLEMENTATION GUIDELINES
Chapter 7 : Data Security
Introduction
• Data Security includes the planning, development, and execution of
security policies and procedures to provide proper authentication,
authorization, access, and auditing of data and information assets.
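The authorization, access, and auditing terms in this definition can be sketched as a role-based check that logs every request. Everything here is an assumption for illustration: the roles, users, and log fields are invented, and the user lookup is only a stand-in for real authentication, which would verify identity with credentials.

```python
from datetime import datetime, timezone

# Hypothetical role-to-permission and user-to-role assignments.
PERMISSIONS = {"dba": {"read", "write"}, "analyst": {"read"}}
USERS = {"ana": "dba", "budi": "analyst"}
audit_log = []

def request_access(user, action, asset):
    """Authorize a request and record it, whether or not it was allowed."""
    role = USERS.get(user)  # stand-in for authentication: is the user known?
    allowed = role is not None and action in PERMISSIONS.get(role, set())
    # Auditing: every attempt is logged, including denied ones.
    audit_log.append({
        "time": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "action": action,
        "asset": asset,
        "allowed": allowed,
    })
    return allowed

results = [
    request_access("budi", "read", "customer"),   # analyst may read
    request_access("budi", "write", "customer"),  # analyst may not write
    request_access("eve", "read", "customer"),    # unknown user
]
print(results, len(audit_log))
```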
Context Diagram
Business Driver
RISK
RISK CLASSIFICATION
- CRITICAL RISK DATA (CRD)
- HIGH RISK DATA (HRD)
- MODERATE RISK DATA (MRD)
Confidential Data (REQUIREMENT)
Data Security Laws & Regulations
E.g., Regulations
Define Data Security Policy
Assess Current Security Risks
• The sensitivity of the data stored or in transit
• The requirements to protect that data, and
• The current security protections in place
Other Concerns
• Regulatory Requirements
• Data Security Standards
• Data Security Roles
• Tools & Technique
• Guidelines
Chapter 8 : Data Integration and
Interoperability
Introduction
Data Integration and Interoperability (DII) describes processes related to the movement
and consolidation of data within and between data stores, applications and organizations.
Data management functions that depend on DII:
• Data migration and conversion
• Data consolidation into hubs or marts
• Integration of vendor packages into an
organization’s application portfolio
• Data sharing between applications and across
organizations
• Distributing data across data stores and data
centers
• Archiving data
• Managing data interfaces
• Obtaining and ingesting external data
• Integrating structured and unstructured data
• Providing operational intelligence and
management decision support
Data management areas that DII depends on:
• Data Governance
• Data Architecture
• Data Security
• Metadata
• Data Storage and Operations
• Data Modeling and Design
Essential Concepts
• Extract, Transform, and Load (ETL)
• Extract, Load, and Transform (ELT)
• LATENCY, REPLICATION …
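The difference between ETL and ELT is the point at which the transformation runs. The sketch below applies the same hypothetical transformation in both orders; the source rows, the field names, and the in-memory lists standing in for staging and target stores are assumptions for illustration.

```python
def transform(row):
    """Standardize one source row (illustrative rules only)."""
    return {
        "name": row["name"].strip().title(),
        "amount": round(float(row["amount"]), 2),
    }

# Extract: rows as they arrive from a hypothetical source system.
source = [{"name": " ana ", "amount": "10.50"}, {"name": "BUDI", "amount": "3"}]

# ETL: transform in the integration layer, then load the clean rows.
etl_target = [transform(r) for r in source]

# ELT: load the raw rows into the target first, then transform them there.
elt_raw_in_target = list(source)
elt_target = [transform(r) for r in elt_raw_in_target]

print(etl_target == elt_target, len(elt_raw_in_target))
```

Both orders end with the same clean rows; the ELT variant also keeps the raw rows in the target, which is why it is often paired with stores that can hold and process large volumes of untransformed data.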
Change Data Capture - Technique
Interaction Model
• Point-to-point (systems pass data directly to each other)
• Hub-and-spoke (consolidates shared data in a central hub)
• Publish-subscribe (publishing systems push data out; subscribing systems
pull it, or it is distributed to subscribers)
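The publish-subscribe model can be sketched with a small in-memory broker: publishers do not know who consumes their data, and each subscriber receives every message on the topics it registered for. The Broker class, the topic name, and the inbox lists are hypothetical and stand in for a real messaging platform.

```python
class Broker:
    """Minimal in-memory publish-subscribe hub (illustrative only)."""

    def __init__(self):
        self._subscribers = {}  # topic -> list of handler callables

    def subscribe(self, topic, handler):
        self._subscribers.setdefault(topic, []).append(handler)

    def publish(self, topic, message):
        """Deliver the message to every subscriber; return the delivery count."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(message)
        return len(handlers)

crm_inbox, billing_inbox = [], []
broker = Broker()
broker.subscribe("customer.updated", crm_inbox.append)
broker.subscribe("customer.updated", billing_inbox.append)

delivered = broker.publish("customer.updated", {"id": 1, "city": "Bandung"})
undelivered = broker.publish("order.created", {"id": 100})  # no subscribers yet
print(delivered, undelivered, crm_inbox, billing_inbox)
```

Compared with point-to-point interfaces, adding a third consumer here needs only one more subscribe call and no change to the publisher.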
DII Architecture Concepts
Application Coupling
Chapter 9 : Document & Content
Management
Introduction
• Document and Content Management entails controlling the capture,
storage, access, and use of data and information stored outside
relational databases
• In some Organizations unstructured data has a direct relationship to
structured data.
• Management decisions about such content should be applied
consistently.
Business Driver
• Regulatory compliance
• The ability to respond to litigation and e-discovery requests
• Business continuity requirements
• Good records management can also help organizations become more
efficient
• Well-organized, searchable websites that result from effective
management of ontologies
• E-discovery is the process of finding electronic records that might
serve as evidence in a legal action.
Goals & Principles
ARMA International Principles - 2009
Generally Accepted Recordkeeping Principles® (GARP)
• Principle of Accountability
• Principle of Integrity
• Principle of Protection
• Principle of Compliance
• Principle of Availability
• Principle of Retention
• Principle of Disposition
• Principle of Transparency
Essential Concepts
• Content: A document is to content what a bucket is to water: a container. Content refers
to the data and information inside the file, document, or website.
• Controlled Vocabularies: A controlled vocabulary is a defined list of explicitly allowed terms used to index,
categorize, tag, sort, and retrieve content through browsing and searching.
• Documents and Records: Documents are electronic or paper objects that contain
instructions for tasks, requirements for how and when to perform a task or function,
and logs of task execution and decisions. Documents can communicate and share
information and knowledge. Examples of documents include procedures, protocols,
methods, and specifications.
• Data Map: An inventory of all ESI (electronically stored information) data sources, applications, and IT environments
that includes the owners of the applications, custodians, relevant geographical
locations, and data types
• E-Discovery, etc.
Controlled Vocabulary
Create Content Handling Policies
Activities - Plan for Record Management
• Records management starts with a clear definition of what
constitutes a record.
• Managing electronic records requires decisions about where to store
current, active records and how to archive older records
Activities - Manage the Lifecycle
• Capture Records and Content : Capturing content is the first step to
managing it. Electronic content is often already in a format to be stored in
electronic repositories.
• Manage Versioning and Control : Formal, Revision, Custody
• Backup and Recovery : The document / record management system needs
to be included in the organization’s overall corporate backup and recovery
activities, including business continuity and disaster recovery planning.
• Manage Retention and Disposal : Effective document / records
management requires clear policies and procedures, especially regarding
retention and disposal of records.
• Audit Documents / Records : Document / records management requires
periodic auditing to ensure that the right information is getting to the right
people at the right time for decision-making or performing operational
activities
Manage Versioning and Control
ANSI Standard 859 has three levels of control of data:
• Formal control requires formal change initiation, thorough
evaluation for impact, decision by a change authority, and full status
accounting of implementation and validation to stakeholders
• Revision control is less formal, notifying stakeholders and
incrementing versions when a change is required
• Custody control is the least formal, merely requiring safe storage
and a means of retrieval
Chapter 10 : Master Data Management
• LET’S TRY OUT (90 questions in 80 minutes )
•https://wato.xyz/cdmptryout
passcode : cdmp
Introduction
• In any organization, certain data is required across business areas,
processes, and systems.
• The overall organization and its customers benefit if this data is
shared and all business units can access the same customer lists,
geographic location codes, business unit lists, delivery options, part
lists, accounting cost center codes, governmental tax codes, and
other data used to run the business.
Business Driver
Master Data Management
• Meeting organizational data requirements
• Managing data quality
• Managing the costs of data integration
• Reducing risk
The drivers for managing Reference Data are similar. Centrally managed
Reference Data enables organizations to:
• Meet data requirements for multiple initiatives and reduce the risks and
costs of data integration through use of consistent Reference Data
• Manage the quality of Reference Data
Differences Between Master and Reference Data
• Different types of data play different roles within an organization. They
also have different management requirements.
• Six-layer taxonomy of data that includes Metadata, Reference Data,
enterprise structure data, transaction structure data, transaction
activity data, and transaction audit data (Chisholm, 2008; Talburt and
Zhou, 2015).
• Master Data as an aggregation of Reference Data, enterprise structure
data, and transaction structure data
Master Data - Trusted Source, Golden Record
• A Trusted Source is recognized as the ‘best version of the truth’ based
on a combination of automated rules and manual stewardship of data
content.
• A trusted source may also be referred to as a Single View, 360° View.
• Any MDM system should be managed so that it is a trusted source.
Within a trusted source, records that represent the most accurate
data about entity instances can be referred to as Golden Records.
• ‘Golden Record’ does not mean that it is always a 100% complete and
100% accurate representation of all the entities within the
organization (especially in organizations that have multiple SOR’s
supplying data to the Master Data environment).
Data Sharing Architecture
Three basic approaches to implementing a Master Data hub
environment :
• Registry
• Transaction Hub
• Consolidated (a hybrid of Registry and Transaction Hub)
Party Master Data
• Party Master Data includes data about individuals, organizations, and
the roles they play in business relationships.
• In the commercial environment, parties include customers,
employees, vendors, partners, and competitors.
• In the public sector, parties are usually citizens
• Customer Relationship Management (CRM) systems manage Master
Data about customers. The goal of CRM is to provide complete and
accurate information about each and every customer.
Master Data Management Key Processing Steps
• Key processing steps for MDM include data model management;
data acquisition; data validation, standardization, and enrichment;
entity resolution; and stewardship and sharing.
• Product Master Data can focus on an organization’s internal
products and services or on industry-wide (including competitor)
products and services.
• Different types of product Master Data solutions support different
business functions.
Entity Resolution and Identifier Management
Entity resolution is the process of determining
whether two references to real world objects refer
to the same object or to different objects (Talburt,
2011).
Entity resolution is a decision-making process
Entity Resolution and Identifier Management
Matching, or candidate identification, is the process of identifying how
different records may relate to a single entity. The risks with this
process are:
• False positives: Two references that do not represent the same entity
are linked with a single identifier. This results in one identifier that
refers to more than one real-world entity instance.
• False negatives: Two references represent the same entity but they
are not linked with a single identifier. This results in multiple
identifiers that refer to the same real-world entity when each
instance is expected to have one-and-only-one identifier.
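Both matching risks can be reproduced with a simple similarity threshold. The sketch below uses difflib.SequenceMatcher from the Python standard library as a stand-in for a real matching engine; the names, the 0.85 threshold, and the assumption about which names refer to the same person are invented for illustration.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Case-insensitive string similarity between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def is_match(a, b, threshold=0.85):
    """Treat two references as the same entity when they are similar enough."""
    return similarity(a, b) >= threshold

# Assumed ground truth for this example:
#   "Jon Smith" and "John Smith"   are the same person (spelling variant)
#   "John Smith" and "Joan Smith"  are different people
#   "Robert Brown" and "Bob Brown" are the same person (nickname)
true_positive = is_match("Jon Smith", "John Smith")
false_positive = is_match("John Smith", "Joan Smith")
false_negative = not is_match("Robert Brown", "Bob Brown")
print(true_positive, false_positive, false_negative)
```

Raising the threshold reduces false positives but produces more false negatives, which is why matching rules are usually combined with manual stewardship review of borderline candidates.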
CDMP SLIDE TRAINER .pptx
CDMP SLIDE TRAINER .pptx
CDMP SLIDE TRAINER .pptx

CDMP SLIDE TRAINER .pptx

  • 1.
    CDMP - CertifiedData Management Professional DMBOK V.2 Trainer : Hery Purnama, SE., MM. MCP, PMP, ITILF, CISA, CISM, CISSP, CDMP, COBIT, CTFL, TOGAF9
  • 2.
    Mr. Hery Purnamais an IT Practitioner, Lecturer and IT Consultant in Bandung, with more than 20 years of experience in various IT projects with specialization in System Development, Bigdata, Data Science, Internet of Things, ISO, Project Management, IT Service Management, I.S Governance, InfoSec Governance, Data Governance , Enterprise Architect , Quality Assurance, and IT Audit Until now he is still actively working as a consultant and also a trainer with clients from the Government, BUMN, Mining, Industrial Banking, Telecommunications. Some of the international certifications he holds are: MCP, PMP, ITILF, COBIT, CGEIT, CDMP, CISA, CISM, CISSP, CTFL, TOGAF 9
  • 3.
    Exam Overview CDMP -Certified Data Management Professional DMBOK V.2
  • 4.
  • 5.
    100 Questions Covers14 Topics of DMBOK2 1. Data Management Process – 2% 2. Data Ethics – 2% 3. Data Governance – 11% 4. Data Architecture – 6% 5. Data Modelling and Design – 11% 6. Data Storage and Operations – 6% 7. Data Security – 6% 8. Data Integration and Interoperability – 6% 9. Document and Content Management – 6% 10. Master and Reference Data Management – 10% 11. Data Warehousing and Business Intelligence – 10% 12. Metadata Management – 11% 13. Data Quality – 11% 14. Big Data – 2%
  • 6.
  • 7.
  • 8.
    Chapter 1 :Data Management
  • 9.
    • LET’S GETEXERCISE •https://wato.xyz/cdmppractice1 passcode : cdmp
  • 10.
    Introduction • Data Managementis the development, execution, and supervision of plans, policies, programs, and practices that deliver, control, protect, and enhance the value of data and information assets throughout their lifecycles. • Data Management Professional is any person who works in any facet of data management • Data is the ‘currency’, the ‘life blood’, and even the ‘new oil’ of the information economy. • Business Driver : Data Asset Value • Data Management Goals
  • 11.
    Essential Concept • VARIOUSDATA DEFINITIONS : • data emphasize its role in representing facts about the world. (Common) • Data information that has been stored in digital form (IT) • Facts : Data is a mean representation  Need Context (Metadata)
  • 12.
    Essential Concept • DATAVS INFORMATION : • DATA PYRAMID DIKW : 1.DATA (RAW) -> 2.INFORMATION (WHO,WHAT,WHEN, WHERE) -> 3.KNOWLEDGE (HOW) -> 4.WISDOM (WHY) KNOWLEDGE & WISDOM DATA & INFORMATION GOALS DKIW Example Data vs Information (“Here is a sales report for the last quarter [information]. It is based on data from our data warehouse [data]. Next quarter these results [data] will be used to generate our quarter-over-quarter performance measures [information]”)
  • 13.
    Essential Concept • Dataas an Organizational Asset (Economic Resources : shows up as an item on the Profit and Loss Statement (P&L) , & to make more effective decisions and to operate more efficiently • Data Management Principles > • Data Management Challenges (Differs, Valuation, Quality, Planning for Better Data, Metadata and Meta management , Cross functionality..)
  • 14.
  • 15.
    The focus ofdata management on the data lifecycle IMPLICATIONS : •Creation and usage are the most critical points in the data lifecycle •Data Quality must be managed throughout the data lifecycle •Metadata Quality must be managed through the data lifecycle •Data Security must be managed throughout the data lifecycle •Data Management efforts should focus on the most critical data
  • 16.
    Data and Risk •Low Quality Data (Inaccurate, Incomplete, Out of Date) • Missunderstood,Missused “Information Gaps : the difference between what we know and what we need to know to make an effective decision. Information gaps represent enterprise liabilities with potentially profound impacts on operational effectiveness and profitability. “ • The increased role of information as an organizational asset across all sectors has led to an increased focus by regulators and legislators on the potential uses and abuses of information
  • 17.
    Data Management Strategy • The components of a data management strategy : • A compelling vision for data management • A summary business case for data management, with selected examples • Guiding principles, values, and management perspectives • The mission and long-term directional goals of data management • Proposed measures of data management success • Short-term (12-24 months) Data Management program objectives that are SMART (specific, measurable, actionable, realistic, time-bound) • Descriptions of data management roles and organizations, along with a summary of their responsibilities and decision rights • Descriptions of Data Management program components and initiatives • A prioritized program of work with scope boundaries • A draft implementation roadmap with projects and action items
  • 18.
    Data Management Framework • Strategic Alignment Model • Amsterdam Information Model
  • 19.
  • 20.
  • 21.
    DAMA Data Management Framework Evolved
  • 22.
  • 23.
    • LET’S GET EXERCISE • https://wato.xyz/cdmppractice1 passcode : cdmp
  • 24.
  • 25.
  • 26.
  • 27.
  • 28.
  • 29.
  • 30.
  • 31.
  • 32.
    Chapter 2 : Data Handling Ethics
  • 33.
    • LET’S GET PRACTICE • https://wato.xyz/cdmppractice2 passcode : cdmp
  • 34.
    Introduction • Data handling ethics are concerned with how to procure, store, manage, use, and dispose of data in ways that are aligned with ethical principles. • Handling data in an ethical manner is necessary to the long-term success of any organization that wants to get value from its data.
  • 35.
    Core Concept • Impact on people: Because data represents characteristics of individuals and is used to make decisions that affect people’s lives, there is an imperative to manage its quality and reliability. • Potential for misuse: Misusing data can negatively affect people and organizations, so there is an ethical imperative to prevent the misuse of data. • Economic value of data: Data has economic value. Ethics of data ownership should determine how that value can be accessed and by whom.
  • 36.
    Context Diagram > •There is an ethical imperative not only to protect data, but also to manage quality
  • 37.
    Business Driver • Ethical data handling can increase the trustworthiness of an organization and the organization’s data and process outcomes. • This can create better relationships between the organization and its stakeholders. • “The emerging roles of Chief Data Officer, Chief Risk Officer, Chief Privacy Officer, and Chief Analytics Officer are focused on controlling risk by establishing acceptable practices for data handling.”
  • 38.
    Ethical Principles for Data • Respect for Persons : treat people in a way that respects their dignity and autonomy as human individuals • Beneficence : do no harm; maximize possible benefits and minimize possible harms • Justice : fair and equitable treatment of people
  • 39.
  • 40.
    Risks of Unethical Data Handling Practices • Timing • Misleading Visualizations • Unclear Definitions or Invalid Comparisons • Bias (Data collection for a pre-defined result, Biased use of data collected, Hunch and search, Biased sampling methodology, Context and culture) • Transforming and Integrating Data (Limited knowledge of data’s origin and lineage, Data of poor quality, Unreliable Metadata, No documentation) • Obfuscation / Redaction of Data (Data aggregation, Data marking, Data masking)
  • 41.
    Establishing an Ethical Data Culture • STEPS : • Review Current State Data Handling Practices • Identify Principles, Practices, and Risk Factors • Create an Ethical Data Handling Strategy and Roadmap • Adopt a Socially Responsible Ethical Risk Model
  • 42.
    Data analytics data handling ethics
  • 43.
    Data Ethics and Governance • Oversight for the appropriate handling of data falls under both data governance and legal counsel. • Keep up to date on legal changes • CDMP formal code of ethics
  • 44.
    • LET’S GET PRACTICE • https://wato.xyz/cdmppractice2 passcode : cdmp
  • 45.
    Chapter 3 : Data Governance
  • 46.
    • LET’S GET PRACTICE • https://wato.xyz/cdmppractice3 passcode : cdmp
  • 48.
    GOVERNANCE VS MANAGEMENT • DATA GOVERNANCE = ensure that data is managed properly • DATA MANAGEMENT = ensure an organization gets value out of its data • “The scope and focus of a Data Governance program depend on organizational needs.”
  • 50.
    Context Diagram • DATA GOVERNANCE PROGRAMS : • Strategy: Defining, communicating, and driving execution of Data Strategy and Data Governance Strategy • Policy: Setting and enforcing policies related to data and Metadata management, access, usage, security, and quality • Standards and quality: Setting and enforcing Data Quality and Data Architecture standards • Oversight: Providing hands-on observation, audit, and correction in key areas of quality, policy, and data management (often referred to as stewardship) • Compliance: Ensuring the organization can meet data-related regulatory compliance requirements • Issue management: Identifying, defining, escalating, and resolving issues related to data security, data access, data quality, regulatory compliance, data ownership, policy, standards, terminology, or data governance procedures • Data management projects: Sponsoring efforts to improve data management practices • Data asset valuation: Setting standards and processes to consistently define the business value of data assets
  • 52.
    Goals 1. Enable an organization to manage its data as an asset. 2. Define, approve, communicate, and implement principles, policies, procedures, metrics, tools, and responsibilities for data management. 3. Monitor and guide policy compliance, data usage, and management activities.
  • 54.
    DG Business Driver • Common driver : regulatory compliance, especially for heavily regulated industries • Focus on reducing risks or improving processes : • Reducing risk : general risk management, data security, privacy • Improving processes : regulatory compliance, data quality improvement, Metadata management, efficiency in development projects (SDLC), vendor management
  • 56.
    DG - Improving Processes
  • 58.
    Data Governance vs IT Governance • Data governance is separate from IT governance. • IT governance makes decisions about IT investments, the IT application portfolio, and the IT project portfolio – in other words, hardware, software, and overall technical architecture. • IT governance aligns the IT strategies and investments with enterprise goals and strategies. • The COBIT (Control Objectives for Information and Related Technology) framework provides standards for IT governance.
  • 60.
    DG Goals and Principles • The goal of Data Governance is to enable an organization to manage data as an asset. • A DG program must be :
  • 62.
    DG Essential Concept • Data governance represents an inherent separation of duty between oversight and execution
  • 64.
  • 66.
    Data Governance Operating Model Types • Centralized • Replicated • Federated
  • 68.
    Data Stewardship • Data Stewardship is the most common label to describe accountability and responsibility for data and processes that ensure effective control and use of data assets • Core activities : Creating and managing core Metadata, Documenting rules and standards, Managing data quality issues, Executing operational data governance activities
  • 70.
    Types of Data Stewards
  • 71.
    > Types of Data Stewards
  • 73.
  • 75.
    Chapter 4 : Data Architecture
  • 76.
    • LET’S GET PRACTICE • https://wato.xyz/cdmppractice4 passcode : cdmp
  • 78.
    What is Architecture? • Architecture refers to an organized arrangement of component elements intended to optimize the function, performance, feasibility, cost, and aesthetics of an overall structure or system
  • 79.
    Data Architecture Perspective • Data Architecture will be considered from the following perspectives: • Data Architecture outcomes • Data Architecture activities • Data Architecture behavior • Together, these three form the essential components of Data Architecture.
  • 81.
    Introduction • The most detailed Data Architecture design document is a formal enterprise data model, containing data names, comprehensive data and Metadata definitions, conceptual and logical entities and relationships, and business rules. • Physical data models are included, but as a product of data modeling and design, rather than Data Architecture.
  • 83.
  • 85.
    Goals 1. Identify data storage and processing requirements. 2. Design structures and plans to meet the current and long-term data requirements of the enterprise. 3. Strategically prepare organizations to quickly evolve their products, services, and data to take advantage of business opportunities inherent in emerging technologies.
  • 87.
  • 89.
    Zachman Framework Columns • What (the inventory column): Entities used to build the architecture • How (the process column): Activities performed • Where (the distribution column): Business location and technology location • Who (the responsibility column): Roles and organizations • When (the timing column): Intervals, events, cycles, and schedules • Why (the motivation column): Goals, strategies, and means
  • 90.
  • 91.
  • 93.
    Enterprise Data Architecture • Enterprise Data Architecture defines standard terms and designs for the elements that are important to the organization. • Enterprise Data Model (EDM): The EDM is a holistic, enterprise-level, implementation-independent conceptual or logical data model providing a common consistent view of data across the enterprise. • Data Flow Design: Defines the requirements and master blueprint for storage and processing across databases, applications, platforms, and networks (the components).
  • 96.
    Project Development Method • Waterfall methods: Understand the requirements and construct systems in sequential phases as part of an overall enterprise design. • Incremental methods: Learn and construct in gradual steps (i.e., mini-waterfalls). This method creates prototypes based on vague overall requirements. The initiation phase is crucial. • Agile, iterative methods: Learn, construct, and test in discrete delivery packages (called ‘sprints’) that are small enough that if work needs to be discarded, not much is lost.
  • 98.
  • 99.
    Project Activities • Define scope • Understand business requirements • Design • Implement (when buying, when reusing data, when building)
  • 100.
    Chapter 5 : Data Modeling & Design
  • 101.
    • LET’S GET PRACTICE • https://wato.xyz/cdmppractice5 passcode : cdmp
  • 103.
  • 105.
    Context Diagram • Deliverables / Level of Detail: - Conceptual Data Model - Logical Data Model - Physical Data Model
  • 107.
    Goals and Principles • The goal of data modeling is to confirm and document understanding of different perspectives, which leads to applications that more closely align with current and future business requirements, and creates a foundation to successfully complete broad-scoped initiatives such as Master Data Management and data governance programs. • Confirming and documenting understanding of different perspectives facilitates : - Formalization - Scope definition - Knowledge retention / documentation
  • 109.
    Types of Data that are Modeled • Category information: Data used to classify and assign types to things. • Resource information: Basic profiles of resources needed to conduct operational processes, such as Product, Customer, Supplier, Facility, Organization, and Account. • Business event information: Data created while operational processes are in progress. • Detail transaction information: Detailed transaction information is often produced through point-of-sale systems (either in stores or online). • Data at Rest
  • 112.
    Data Model Components • Entity • Relationship • Domain
  • 114.
    Arity of Relationships - Model Component
  • 116.
    Relational Data Model Scheme • A foreign key is used in physical and sometimes logical relational data modeling schemes to represent a relationship
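To make the foreign key idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The customer and sales_order tables and their columns are hypothetical names chosen for illustration, not taken from the DMBOK. The sketch shows the relationship "an order belongs to a customer" being represented, and enforced, by a foreign key.

```python
import sqlite3

# Hypothetical two-table schema: each order row must reference an existing customer.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE sales_order ("
    " order_id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customer(customer_id))"
)
conn.execute("INSERT INTO customer VALUES (1, 'Acme')")
conn.execute("INSERT INTO sales_order VALUES (100, 1)")  # valid: customer 1 exists


def try_insert_orphan(connection):
    """Return True if the DBMS rejects an order that points at a missing customer."""
    try:
        connection.execute("INSERT INTO sales_order VALUES (101, 999)")
    except sqlite3.IntegrityError:
        return True
    return False


orphan_rejected = try_insert_orphan(conn)
order_count = conn.execute("SELECT COUNT(*) FROM sales_order").fetchone()[0]
```

In this sketch the orphan order is refused, so only the valid order remains in sales_order. Other database products enable or declare foreign keys differently.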
  • 118.
  • 120.
    Domain – Model Component • In data modeling, a domain is the complete set of possible values that an attribute can be assigned.
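A domain can be pictured as a membership test on attribute values. The sketch below is only an illustration: the three domains (a list of valid values, a numeric range, and a date range) and their names are assumptions made for the example.

```python
from datetime import date

# Hypothetical domains for three attributes of an order.
ORDER_STATUS_DOMAIN = {"OPEN", "SHIPPED", "CLOSED"}  # list-of-values domain


def in_status_domain(value):
    """True when the value is one of the explicitly allowed status codes."""
    return value in ORDER_STATUS_DOMAIN


def in_quantity_domain(value):
    """Range domain: whole numbers from 1 to 1000 inclusive."""
    return isinstance(value, int) and 1 <= value <= 1000


def in_order_date_domain(value):
    """Range domain on dates: on or after 1 January 2000."""
    return isinstance(value, date) and value >= date(2000, 1, 1)
```

A value outside its domain, such as a status of "PENDING" or a quantity of 0, would be rejected before it is assigned to the attribute.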
  • 122.
  • 124.
    • Time-based patterns are used when data values must be associated in chronological order and with specific time values.
  • 125.
    Chapter 6 : Data Storage & Operations
  • 126.
    • LET’S GET PRACTICE • https://wato.xyz/cdmppractice678 passcode : cdmp
  • 128.
    Introduction • Data Storage and Operations includes the design, implementation, and support of stored data, to maximize its value throughout its lifecycle, from creation/acquisition to disposal. • Data Storage and Operations includes two sub-activities: 1. Database Support 2. Database Technology Support • Key role : Database Administrator (DBA)
  • 130.
    Context Diagram • The goals of data storage and operations include: • Managing the availability of data throughout the data lifecycle • Ensuring the integrity of data assets • Managing the performance of data transactions
  • 132.
    SLA (Service Level Agreement) Principles • Practice: • The Service Level Agreement (SLA) can reflect DBA-recommended and developer-accepted methods of ensuring data integrity and data security. The SLA should reflect the transfer of responsibility from the DBAs to the development team if the development team will be coding their own database update procedures or data access layer. • This prevents an ‘all or nothing’ approach to standards.
  • 134.
    Procedural and Development DBAs • Procedural DBAs : lead the review and administration of procedural database objects, and specialize in the development and support of procedural logic controlled and executed by the DBMS • Development DBAs : focus on data design activities, including creating and managing special-use databases
  • 135.
  • 137.
    Database Processing Types • CAP (Brewer’s Theorem) : Consistency, Availability, and Partition Tolerance; describes how closely a distributed system can match either : • ACID (Atomicity, Consistency, Isolation, Durability) • BASE (Basically Available, Soft State, Eventual Consistency)
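The "A" in ACID (atomicity) can be illustrated with a small sketch using Python's built-in sqlite3 module. The account table, its CHECK constraint, and the transfer function are hypothetical names for the example. The point is that a transfer either applies both updates or neither.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(connection, source, target, amount):
    """Move funds atomically; return True on commit, False on rollback."""
    try:
        with connection:  # commits on success, rolls back if an exception escapes
            connection.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, target)
            )
            connection.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, source)
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(connection):
    """Return the current balances as {account id: balance}."""
    return dict(connection.execute("SELECT id, balance FROM account ORDER BY id"))


overdraft_ok = transfer(conn, 1, 2, 500)   # would make account 1 negative
after_overdraft = balances(conn)           # credit to account 2 is rolled back too
normal_ok = transfer(conn, 1, 2, 30)
after_normal = balances(conn)
```

A BASE-style system would instead accept the updates on separate nodes and reconcile them later, trading immediate consistency for availability.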
  • 139.
  • 141.
  • 142.
    Other Knowledge Concerns • MANAGE SLA • MANAGE DATABASE ACCESS CONTROL • MANAGE DATABASE PERFORMANCE • MANAGE DATABASE BACKUP, RECOVERY, REPLICATION • MANAGE PHYSICAL STORAGE ENVIRONMENT & IMPLEMENTATION • MANAGE DATASET & MIGRATION • TOOLS & TECHNIQUES • IMPLEMENTATION GUIDELINES
  • 143.
    Chapter 7 : Data Security
  • 145.
    Introduction • Data Security includes the planning, development, and execution of security policies and procedures to provide proper authentication, authorization, access, and auditing of data and information assets.
  • 147.
  • 149.
  • 151.
    RISK • RISK CLASSIFICATION - CRITICAL RISK DATA (CRD) - HIGH RISK DATA (HRD) - MODERATE RISK DATA (MRD)
  • 153.
  • 154.
    Data Security Laws & Regulations • Example regulations
  • 156.
  • 157.
    Assess Current Security Risks • The sensitivity of the data stored or in transit • The requirements to protect that data, and • The current security protections in place
  • 158.
    Other Concerns • Regulatory Requirements • Data Security Standards • Data Security Roles • Tools & Techniques • Guidelines
  • 159.
    Chapter 8 : Data Integration and Interoperability
  • 161.
    Introduction • Data Integration and Interoperability (DII) describes processes related to the movement and consolidation of data within and between data stores, applications, and organizations. • Data management functions on which most organizations depend, enabled by DII : • Data migration and conversion • Data consolidation into hubs or marts • Integration of vendor packages into an organization’s application portfolio • Data sharing between applications and across organizations • Distributing data across data stores and data centers • Archiving data • Managing data interfaces • Obtaining and ingesting external data • Integrating structured and unstructured data • Providing operational intelligence and management decision support • Data management areas that DII depends on : • Data Governance • Data Architecture • Data Security • Metadata • Data Storage and Operations • Data Modeling and Design
  • 165.
    Essential Concepts • Extract, Transform, and Load (ETL) • Extract, Load, and Transform (ELT) • LATENCY, REPLICATION …
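The difference between ETL and ELT is the order of the last two steps. The sketch below is a toy illustration in plain Python: the source rows, the dictionary standing in for the target system, and the function names are all assumptions made for the example.

```python
# Hypothetical source rows; amounts arrive as strings and need cleaning.
SOURCE_ROWS = [{"id": 1, "amount": " 10.50 "}, {"id": 2, "amount": "7"}]


def extract():
    """Copy rows out of the source system."""
    return [dict(row) for row in SOURCE_ROWS]


def transform(rows):
    """Clean the data: trim whitespace and convert amounts to numbers."""
    return [{"id": r["id"], "amount": float(r["amount"].strip())} for r in rows]


def run_etl(target):
    # ETL: transform in the integration layer, then load only the cleaned rows.
    target["sales"] = transform(extract())


def run_elt(target):
    # ELT: load raw rows first, then transform inside the target system.
    target["raw_sales"] = extract()
    target["sales"] = transform(target["raw_sales"])


etl_target, elt_target = {}, {}
run_etl(etl_target)
run_elt(elt_target)
```

Both targets end up with the same cleaned sales rows. Only the ELT target also keeps the raw rows, which is why ELT is common where the target platform has the capacity to store and transform raw data.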
  • 167.
  • 169.
    Interaction Model • Point-to-point (passes data directly between systems) • Hub-and-spoke (consolidates shared data in a central hub) • Publish - Subscribe (systems push data out, other systems pull data in; data is distributed to subscribers)
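The publish-subscribe model can be sketched as a tiny in-memory broker. This is a toy illustration only: the Broker class, the topic name, and the inbox lists are hypothetical and do not correspond to any real messaging product's API.

```python
from collections import defaultdict


class Broker:
    """Toy in-memory publish-subscribe hub (hypothetical, not a real messaging API)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        """Register a callback to receive every message published on a topic."""
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        """Deliver the message to each subscriber; return how many received it."""
        for callback in self._subscribers[topic]:
            callback(message)
        return len(self._subscribers[topic])


billing_inbox, crm_inbox = [], []
broker = Broker()
broker.subscribe("customer.updated", billing_inbox.append)
broker.subscribe("customer.updated", crm_inbox.append)
delivered = broker.publish("customer.updated", {"customer_id": 42})
```

The publisher does not need to know which systems consume the data. In a point-to-point design, by contrast, the sender would call each receiving system directly.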
  • 170.
  • 171.
    Chapter 9 : Document & Content Management
  • 173.
    Introduction • Document and Content Management entails controlling the capture, storage, access, and use of data and information stored outside relational databases. • In some organizations, unstructured data has a direct relationship to structured data. • Management decisions about such content should be applied consistently.
  • 175.
    Business Driver • Regulatory compliance • The ability to respond to litigation and e-discovery requests, and business continuity requirements • Good records management can also help organizations become more efficient • Well-organized, searchable websites that result from effective management of ontologies • E-discovery is the process of finding electronic records that might serve as evidence in a legal action.
  • 176.
  • 178.
    ARMA International Principles - 2009 Generally Accepted Recordkeeping Principles® (GARP) • Principle of Accountability • Principle of Integrity • Principle of Protection • Principle of Compliance • Principle of Availability • Principle of Retention • Principle of Disposition • Principle of Transparency
  • 180.
    Essential Concepts • Content : A document is to content what a bucket is to water: a container. Content refers to the data and information inside the file, document, or website. • Controlled Vocabularies : A controlled vocabulary is a defined list of explicitly allowed terms used to index, categorize, tag, sort, and retrieve content through browsing and searching. • Documents and Records : Documents are electronic or paper objects that contain instructions for tasks, requirements for how and when to perform a task or function, and logs of task execution and decisions. Documents can communicate and share information and knowledge. Examples of documents include procedures, protocols, methods, and specifications. • Data Map : A data map is an inventory of all ESI data sources, applications, and IT environments that includes the owners of the applications, custodians, relevant geographical locations, and data types. • E-Discovery, etc.
  • 182.
  • 184.
  • 186.
    Activities - Plan for Records Management • Records management starts with a clear definition of what constitutes a record. • Managing electronic records requires decisions about where to store current, active records and how to archive older records.
  • 188.
    Activities - Manage the Lifecycle • Capture Records and Content : Capturing content is the first step to managing it. Electronic content is often already in a format to be stored in electronic repositories. • Manage Versioning and Control : Formal, Revision, Custody • Backup and Recovery : The document / records management system needs to be included in the organization’s overall corporate backup and recovery activities, including business continuity and disaster recovery planning. • Manage Retention and Disposal : Effective document / records management requires clear policies and procedures, especially regarding retention and disposal of records. • Audit Documents / Records : Document / records management requires periodic auditing to ensure that the right information is getting to the right people at the right time for decision-making or performing operational activities.
  • 189.
    Manage Versioning and Control • ANSI Standard 859 has three levels of control of data: • Formal control requires formal change initiation, thorough evaluation for impact, decision by a change authority, and full status accounting of implementation and validation to stakeholders • Revision control is less formal, notifying stakeholders and incrementing versions when a change is required • Custody control is the least formal, merely requiring safe storage and a means of retrieval
  • 190.
    Chapter 10 : Master Data Management
  • 191.
    • LET’S TRYOUT (90 questions in 80 minutes ) •https://wato.xyz/cdmptryout passcode : cdmp
  • 193.
    Introduction • In any organization, certain data is required across business areas, processes, and systems. • The overall organization and its customers benefit if this data is shared and all business units can access the same customer lists, geographic location codes, business unit lists, delivery options, part lists, accounting cost center codes, governmental tax codes, and other data used to run the business.
  • 196.
    Business Driver • Master Data Management : • Meeting organizational data requirements • Managing data quality • Managing the costs of data integration • Reducing risk • The drivers for managing Reference Data are similar. Centrally managed Reference Data enables organizations to: • Meet data requirements for multiple initiatives and reduce the risks and costs of data integration through use of consistent Reference Data • Manage the quality of Reference Data
  • 198.
    Differences Between Master and Reference Data • Different types of data play different roles within an organization. They also have different management requirements. • A six-layer taxonomy of data includes Metadata, Reference Data, enterprise structure data, transaction structure data, transaction activity data, and transaction audit data (Chisholm, 2008; Talburt and Zhou, 2015). • Within this taxonomy, Master Data is defined as an aggregation of Reference Data, enterprise structure data, and transaction structure data.
  • 200.
    Master Data - Trusted Source, Golden Record • A Trusted Source is recognized as the ‘best version of the truth’ based on a combination of automated rules and manual stewardship of data content. • A trusted source may also be referred to as a Single View or 360° View. • Any MDM system should be managed so that it is a trusted source. Within a trusted source, records that represent the most accurate data about entity instances can be referred to as Golden Records. • ‘Golden Record’ does not mean that it is always a 100% complete and 100% accurate representation of all the entities within the organization (especially in organizations that have multiple SORs supplying data to the Master Data environment).
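One way to picture the "automated rules" behind a Golden Record is a survivorship rule that merges several source records for the same entity. The sketch below assumes a simple rule, "the most recently updated non-empty value wins". That rule, the field names, and the sample records are illustrative assumptions, not a method prescribed by the DMBOK.

```python
def build_golden_record(records):
    """Merge source records for one entity: latest non-empty value per attribute wins."""
    golden = {}
    # Process oldest first so that newer values overwrite older ones.
    for record in sorted(records, key=lambda r: r["updated"]):
        for field, value in record.items():
            if field in ("source", "updated"):
                continue  # bookkeeping fields are not part of the golden record
            if value not in (None, ""):
                golden[field] = value
    return golden


# Hypothetical records for the same customer from two systems of record.
source_records = [
    {"source": "CRM", "updated": "2023-01-10", "name": "Acme Corp",
     "phone": "555-0100", "email": ""},
    {"source": "Billing", "updated": "2023-06-01", "name": "ACME Corporation",
     "phone": None, "email": "ap@acme.example"},
]
golden = build_golden_record(source_records)
```

Here the name and email come from the newer Billing record, while the phone number survives from CRM because Billing has none. Cases the rule cannot settle would go to a data steward for manual review.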
  • 202.
    Data Sharing Architecture • Three basic approaches to implementing a Master Data hub environment : • A Registry • A Transaction Hub • A Consolidated approach
  • 204.
    Party Master Data • Party Master Data includes data about individuals, organizations, and the roles they play in business relationships. • In the commercial environment, parties include customers, employees, vendors, partners, and competitors. • In the public sector, parties are usually citizens. • Customer Relationship Management (CRM) systems manage Master Data about customers. The goal of CRM is to provide complete and accurate information about each and every customer.
  • 206.
    Master Data Management Key Processing Steps • Key processing steps for MDM include data model management; data acquisition; data validation, standardization, and enrichment; entity resolution; and stewardship and sharing.
  • 208.
    • Product Master Data can focus on an organization’s internal products and services or on industry-wide (including competitor) products and services. • Different types of product Master Data solutions support different business functions.
  • 210.
    Entity Resolution and Identifier Management • Entity resolution is the process of determining whether two references to real-world objects refer to the same object or to different objects (Talburt, 2011). • Entity resolution is a decision-making process.
  • 211.
    Entity Resolution and Identifier Management • Matching, or candidate identification, is the process of identifying how different records may relate to a single entity. The risks with this process are: • False positives: Two references that do not represent the same entity are linked with a single identifier. This results in one identifier that refers to more than one real-world entity instance. • False negatives: Two references represent the same entity but they are not linked with a single identifier. This results in multiple identifiers that refer to the same real-world entity when each instance is expected to have one-and-only-one identifier.
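The trade-off between false positives and false negatives can be seen in a small matching sketch. It uses `difflib.SequenceMatcher` from the Python standard library as a stand-in similarity measure. The normalization steps, the 0.85 default threshold, and the sample names are assumptions for illustration; real MDM tools typically use richer matching rules.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lower-case, drop periods and commas, and collapse repeated whitespace."""
    return " ".join(name.lower().replace(".", "").replace(",", "").split())


def similarity(a, b):
    """Similarity score between 0.0 and 1.0 for two normalized names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def is_match(a, b, threshold=0.85):
    """Treat two references as the same entity when similarity reaches the threshold."""
    return similarity(a, b) >= threshold


# A threshold that is too lax links unrelated parties (false positive risk).
lax_link = is_match("Acme Corp", "Zenith Ltd", threshold=0.0)
# A threshold that is too strict misses a likely duplicate (false negative risk).
strict_miss = not is_match("Jon Smith", "John Smith", threshold=1.0)
```

Lowering the threshold reduces false negatives but raises false positives, and raising it does the opposite. Borderline scores are usually routed to a data steward rather than decided automatically.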