Data Quality Architecture
 Phase 1 – Account Verification

          Art Nicewick
Project Scope
• Define an architectural flow diagram that
  provides a basis for data governance, data
  quality, and impact analysis

• Create a framework to report on
  inconsistencies in data (Initial emphasis on
  Accounts)
FISMA
• The architecture provides a foundation for verifying
  that Accounts are deleted after the employees leave
  the Gallery
• The Exceptions Facility provides the ability for an
  application administrator to request that a non-AD
  account be left on file
   – Audit trails
   – Non-standard accounts (e.g. TDP as Custodian)
   – CIO can approve/deny and give timelines for resolution
• Focus of first phase of the initiative
Why Consistency Reports
• Common Practice (Asset Inventory, …)
• Ensures that data is corrected in the proper
  manner
• Re-validates automated processes
• Some changes need to be communicated to the system
  manager (e.g. they should know if someone
  has a new last name)
• Links into existing manual practices
General Data Quality Process
1. Identify data stores (Based on priority)
2. Identify authoritative data
3. Identify interfaces / replicated / redundant
   data
4. Identify consistency analysis process
5. Correct and continuous monitoring
Identify data stores
• 1.1. Create a list of all known data applications
   – Define the name of the data application
   – Define the contacts related to the application
      • TDP Contact
      • Application Administrator
   – Categorize the application
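
As a minimal sketch of what one inventory entry from step 1.1 might hold, the record below carries the four attributes listed above. The class name, field names, application names, and addresses are hypothetical and are not taken from the Gallery's systems.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataApplication:
    """One row of the data-application inventory (hypothetical layout)."""
    name: str
    tdp_contact: str
    application_administrator: str
    category: str


# Illustrative entries only; real names and contacts would come from the survey.
inventory = [
    DataApplication("ExampleHR", "tdp@example.org", "hr.admin@example.org", "Personnel"),
    DataApplication("ExampleCollections", "tdp@example.org", "coll.admin@example.org", "Collections"),
]


def by_category(apps):
    """Group application names by category for review."""
    grouped = {}
    for app in apps:
        grouped.setdefault(app.category, []).append(app.name)
    return grouped
```

Grouping by category is one simple way to prioritize which data stores to examine first.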
Identify data stores
• 1.2. Link data into data flow representation
  for a visual analysis on enterprise data flows
Identify authoritative data
• 2.1. Review Application data to determine
  – What type of data is supported
  – Is data authoritative
Identify Interfaces / replicated / redundant data
• 3.1. Review Application data to determine
   –   Where the data is sent
   –   Where the data is received from
   –   Data Quality
   –   Note: Source assumed by reverse lookup of target definitions
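
The note about reverse lookup can be illustrated with a short sketch: if each application records only where it sends data, the sources of every application follow by inverting that mapping. The application names and the `targets` structure are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical target definitions: application -> applications it sends data to.
targets = {
    "ExampleHR": ["ActiveDirectory"],
    "ActiveDirectory": ["ExampleMail", "ExampleCollections"],
    "ExampleCollections": [],
}


def infer_sources(targets):
    """Invert the 'sends to' mapping to obtain 'receives from' for each application."""
    sources = defaultdict(list)
    for sender, receivers in targets.items():
        for receiver in receivers:
            sources[receiver].append(sender)
    return dict(sources)


sources = infer_sources(targets)
```

Recording only targets avoids maintaining both directions by hand, at the cost of missing any source that was never registered as a sender.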
Identify Interfaces / replicated / redundant data
• Diagram linkages between data stores for
  visual review and impact analysis
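
Beyond the visual review, the same linkages could support a programmatic impact analysis. The sketch below walks a hypothetical flow graph breadth-first to list every data store downstream of a given one; the graph contents are invented for illustration.

```python
from collections import deque

# Hypothetical data flows: store -> stores that receive data from it.
flows = {
    "ExampleHR": ["ActiveDirectory"],
    "ActiveDirectory": ["ExampleMail", "ExampleCollections"],
    "ExampleCollections": ["ExampleWeb"],
    "ExampleMail": [],
    "ExampleWeb": [],
}


def downstream(flows, start):
    """Return every store reachable from `start`, i.e. what a change there may affect."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in flows.get(node, []):
            if nxt != start and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


impacted = downstream(flows, "ActiveDirectory")
```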
Identify consistency analysis process
Review participating data sources and
determine how to define consistency

* At this point only “SQL” methods are used.
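
One plausible shape for an SQL-based consistency check on accounts is an anti-join: list application userids that have no matching Active Directory account. The sketch uses an in-memory SQLite database so it can be run as-is; the table names, column names, and most userids are assumptions, not the Gallery's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE ad_accounts  (userid TEXT PRIMARY KEY);  -- authoritative source
    CREATE TABLE app_accounts (userid TEXT PRIMARY KEY);  -- application under review
    INSERT INTO ad_accounts  VALUES ('jdoe'), ('asmith');
    INSERT INTO app_accounts VALUES ('jdoe'), ('asmith'), ('oldhire'), ('TMSAdmin');
    """
)

# Anti-join: application accounts with no corresponding AD account.
ORPHANS_SQL = """
    SELECT a.userid
    FROM app_accounts AS a
    LEFT JOIN ad_accounts AS d ON d.userid = a.userid
    WHERE d.userid IS NULL
"""

orphans = [row[0] for row in conn.execute(ORPHANS_SQL)]
```

Each userid returned is a candidate inconsistency that still needs either correction or an approved exception.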
Correct and continuous monitoring

• Inconsistencies are periodically sent to end users for “correction” or
  “exceptions”
• Valid exceptions may be
    – “Supervisor Accounts Outside Active Directory” (e.g. TMSAdmin)
    – Ex-Employees with data attached to userid
    – Contractor or testing userid
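
A sketch of how approved exceptions might be applied before a report goes out: inconsistencies covered by an approved, unexpired exception are dropped and the rest remain for correction. The record layout, the `resolve_by` deadline field, and the sample userids other than TMSAdmin are assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class AccountException:
    """A request to leave a non-AD account on file (hypothetical layout)."""
    userid: str
    reason: str
    approved: bool      # approve/deny decision
    resolve_by: date    # timeline given for resolution


def outstanding(inconsistent_userids, exceptions, today):
    """Return userids not covered by an approved exception that is still in date."""
    covered = {
        e.userid for e in exceptions if e.approved and e.resolve_by >= today
    }
    return sorted(u for u in inconsistent_userids if u not in covered)


exceptions = [
    AccountException("TMSAdmin", "Supervisor account outside Active Directory", True, date(2030, 1, 1)),
    AccountException("tester1", "Testing userid", False, date(2030, 1, 1)),
]
remaining = outstanding(["TMSAdmin", "oldhire", "tester1"], exceptions, date(2024, 1, 1))
```

A denied or expired exception leaves the account in the report, so it keeps surfacing until it is resolved.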
Correct and continuous monitoring
• Users can review and update exceptions
  online
Correct and continuous monitoring
• Administrators can create schedules and Email
  recipients
Correct and continuous monitoring
• Email can be sent to
  as many people as
  desired and as
  frequently (or
  infrequently) as
  desired.
Target Data
•   Userids (First Phase and Proof of concept)
•   Object Data
•   Location data
•   Employee Names and Titles
•   Other …
Challenges
•   Object data (Portfolio)
•   Non-SQL Data (Filemaker)
•   Secure Data (Tradewin)
•   Desktop Data (Excel)
•   Offsite data (FMS)
•   Other …
