The Data Driven University
Automating Data Governance & Stewardship in
Autonomous & Decentralized Environments
Pieter De Leenheer, PhD
Cofounder and VP Innovation
What we talk about when we talk about no Data Governance
Who approved this?
I wish these guys spoke our language
I can’t understand this report!
I’ve never seen this funding code! Who introduced this?
Are we sure this definition of ‘professor’ is correct?
The Problem
This rule is different on our campus!
Are we allowed to share this student data with IR?
Glossary Search
• How frequently do you look up a word for your business?
• To what purpose?
  - Clarification
  - Differentiation
• What are your main sources?
• Hierarchy-based navigation or keyword-based search?
• Authoritative Truth or trust?
Overview
• Data Governance Operating Framework
  - Data Governance
  - Data Stewardship
  - Data Management
• Implementations
  - Stanford University Data Stewardship (SUDS)
  - George Washington University
  - Brigham Young University
• The Bigger Picture
  - Inter-university Data Governance in the Flanders Research Information Space
Data Governance Framework
Data Governance Council: Governance Operating Model
Roles & Responsibilities
Processes & Workflow
Asset Types & Traceability
Data Governance Organization
Data Stewardship Activities
Data Quality Development
IT / Operational Data Management Activities
Data Modeling
Metadata Lineage
Establishes & drives
Aligns & Coordinates
Reports & Escalates
Monitors & Remediates
Metadata Scanning
Reference Data Authoring
Data Integration
Collibra Business Semantics Glossary (BSG)
Collibra Reference Data Accelerator (RDA)
Hierarchy Management
Business & Data Definitions
Business Traceability
Semantic Modeling
Mapping Specifications
Policy Management
Business Rules
Data Quality Rules
Data Quality Reporting
Issue Management
Reference Data Crosswalks
Master Data Stewardship
Data Quality Profiling
DQ Defect Resolution
Collibra Data Stewardship Manager (DSM)
Collibra Platform
Other Data Management Vendor products
...
https://compass.collibra.com/display/COOK/Data+Governance+Operating+Model
Stanford University Data Stewardship
(SUDS)
• All Materials available here
dg.stanford.edu
• Establish foundation for
Institutional Research
• Data Quality
How many faculty do we have?
• Context and Meaning
What does faculty mean in which
context?
How is faculty data structured and
where is it stored?
• Data Usage Request
Am I allowed to use faculty or student
name and age for external reporting?
SUDS: Approach
• Decentralized
  - 1 DG coordinator (also show vacancy)
  - Project staff
  - Cross-functional working groups: natural scope and resources
  - Focus on BI reporting, with input from above projects
  - Sign-off by DG coordinator and end user through usage (full cycle)
• Step by step; success by success
SUDS: First Success in OBIEE
reporting
REST / JSON / CSV / Excel
DG Operating Model
• What do we want to capture?
Asset Type: Business Terms, Policies, Rules, Code
Values
Attribute/Relation Type: Name, Definition, Example,
Derivations, Specializations
• Who should be involved in this process?
Communities: Finance, HR, Student, Research
Domains / subject areas: Task Management
Users and User groups
• How to execute and Monitor the process?
Key events and workflow chains
Validation rules
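The three questions above can be made concrete with a small sketch. The asset types, attribute names, and communities are taken from the slide; the class layout, the `validate` helper, the domain name, and the rule that a definition is required are assumptions for illustration only, and none of this is the Collibra API.

```python
from dataclasses import dataclass, field

# Values taken from the slide; everything else here is hypothetical.
ASSET_TYPES = {"Business Term", "Policy", "Rule", "Code Value"}
COMMUNITIES = {"Finance", "HR", "Student", "Research"}
REQUIRED_ATTRIBUTES = ("Definition",)  # assumed validation rule


@dataclass
class Asset:
    name: str
    asset_type: str
    community: str
    domain: str
    attributes: dict[str, str] = field(default_factory=dict)


def validate(asset: Asset) -> list[str]:
    """Return a list of problems; an empty list means the asset passes."""
    problems = []
    if asset.asset_type not in ASSET_TYPES:
        problems.append(f"unknown asset type: {asset.asset_type}")
    if asset.community not in COMMUNITIES:
        problems.append(f"unknown community: {asset.community}")
    for attr in REQUIRED_ATTRIBUTES:
        # An attribute that is absent or blank counts as missing.
        if not asset.attributes.get(attr, "").strip():
            problems.append(f"missing attribute: {attr}")
    return problems


faculty = Asset(
    name="Faculty",
    asset_type="Business Term",
    community="HR",
    domain="Glossary",  # hypothetical domain name
    attributes={"Definition": "", "Example": "Professor"},
)
print(validate(faculty))  # ['missing attribute: Definition']
```

Keeping the rule set in plain data (`REQUIRED_ATTRIBUTES`) is one way to let stewards extend validation without touching the checking logic.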
SUDS Data Dictionary Example
+4000 data elements
Community context: Finance, HR,
Research and Student
Custom attribute types and relation types
What attribute- and relation-types do we want to capture?
Out of the box but also custom
attribute types and relation types
What attribute- and relation-types do we want to capture?
• https://stanford.app.box.com/CollibraQuickReference
• https://stanford.box.com/UsingCollibraFields
Who is involved in the
process?
• https://compass.collibra.com/display/COOK/Role+Types
Responsible, Accountable, Informed, Consulted
Who? User groups and Dashboards
How to execute and monitor?
From Best Practice to Auto-Validation Rules
http://web.stanford.edu/dept/pres-provost/cgi-bin/dg/wordpress/?p=577
(generic example – not from SUDS)
How to execute and monitor?
• Status Types and Workflows
E.g., For Domains, Terms, Users, and later for Issues and Data Sharing
Agreements, we first define a “finite state machine” and then a set of
workflows that each define a transition between states. This means
workflows can trigger each other and form a complex chain.
BUSINESS SEMANTICS GLOSSARY
(State diagram; the numbers on its transitions refer to the workflows listed below.)
Entry point: Term requested on the domain page
States: Candidate, In Progress, Under Review, Accepted, In Revision, Rejected, Deprecated
Workflows
1 Propose Business Term
2 Edit Business Term
3 Onboarding Business Term
4 Deprecate Business Term
5 Reactivate Business Term
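A minimal sketch of the "finite state machine plus workflows" idea follows. The state names and workflow names come from the slide. The transition table is an assumption, because the arrows of the original diagram are not recoverable from this transcript, and the rejection branch is left out for brevity. `run_chain` is a hypothetical helper, not a Collibra function.

```python
# (current state, workflow) -> next state
# ASSUMED mapping for illustration; the real diagram may differ.
TRANSITIONS = {
    ("Candidate", "Edit Business Term"): "In Progress",
    ("In Progress", "Onboarding Business Term"): "Under Review",
    ("Under Review", "Onboarding Business Term"): "Accepted",
    ("Accepted", "Edit Business Term"): "In Revision",
    ("In Revision", "Onboarding Business Term"): "Under Review",
    ("Accepted", "Deprecate Business Term"): "Deprecated",
    ("Deprecated", "Reactivate Business Term"): "Candidate",
}


def run_chain(state: str, workflows: list[str]) -> str:
    """Apply workflows in order, rejecting any step the table does not allow."""
    for workflow in workflows:
        key = (state, workflow)
        if key not in TRANSITIONS:
            raise ValueError(f"{workflow!r} is not allowed in state {state!r}")
        state = TRANSITIONS[key]
    return state


final = run_chain(
    "Candidate",
    ["Edit Business Term", "Onboarding Business Term", "Onboarding Business Term"],
)
print(final)  # Accepted
```

Because each workflow only moves a term from one state to the next, a chain is just a sequence of lookups, and an out-of-order step fails loudly instead of silently skipping a review.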
How is it to be governed? Onboarding Workflow
(Not Stanford content - illustrative example only)
How is it to be governed? Approval Workflow
(Not Stanford content - illustrative example only)
Stanford DG Program Key Results
(from http://web.stanford.edu/dept/pres-provost/cgi-bin/dg/wordpress/wp-content/uploads/2014/11/Stanford_DS_CAIR_v2.pdf)
• Understand data from multiple
perspectives
• Central repository of verified information
(and better data infrastructure)
• Easier access to information; less reliance
on ‘oral tradition’
• Improved data quality, consistency
• Increased understanding; thoughtful
decision-making around data
SUDS Future Directions
• Continue building engagement around
data governance (define policy), in
addition to data stewardship (enforce
policy)
• Continue building engagement, especially
by executive-level leadership
• Continue increasing visibility and
consumption of definitions and other
metadata
George Washington University
(by courtesy of Ron Layne, GWU)
• centralized
• run by the DG Office division of IT
• mapping data dictionaries, rules and metrics and data sharing
agreements
• Integration with Informatica Data Quality
Flanders Research Information Space
• Providing Scientific Research Information and
Services
• Easy
• Transparent
• Open
• Timely
• Unambiguous
• Supported by Data Governance
• Qualitative meta data: e.g., definition for
project, funding codes, mappings,
classifications, etc.
• Roles and responsibilities for Information
Providers and Stiweto
• Collaborative workflows between Information
Providers and Stiweto
By courtesy of G. Van Grootel, EWI
FRIS’ Data-driven Innovation Engine
By courtesy of G. Van Grootel, EWI
The Data providers landscape
Universities
Research Institutes
Funders
Others
Strategic Research Centers
University Colleges
By courtesy of G. Van Grootel, EWI
FRIS Metamodel: an example
By courtesy of G. Van Grootel, EWI
Traceability diagram
Node                           Description
JRC (Joint Research Centre)    The Business Term representing the Funding Source
Zevende Kader Programma..      The Business Term representing the parent Funding Source
3723                           Generation 1 Funding Code Value
258                            Generation 2 Funding Code Value
G3                             The Funding Stream Code Value
By courtesy of G. Van Grootel, EWI
Conclusions
• Case by Case, success by success
• Identify key events and design workflow
‘chains’ to automate governance
• To support your specific use case and the
growing DG platform, you need to extend
asset, relation, and attribute types
• Collaboration and business user
friendliness
• BOK http://compass.collibra.com
Questions For Audience
• What percentage of data users need to look up
the definition of a term?
• What percentage want to know where the data
around a term is stored?
• How many business terms do you have?
• Who is in charge of data quality /
governance?
• What percentage of data definition decisions
depend on the business?

Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
 
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 
Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
 
Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
 
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
 
RESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for studentsRESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for students
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
 
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
 
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with SlackLet's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
 
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
 
“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”
 

The Data Driven University - Automating Data Governance and Stewardship in Autonomous and Decentralized University Environments

  • 1. The Data Driven University Automating Data Governance & Stewardship in Autonomous & Decentralized Environments Pieter De Leenheer, PhD Cofounder and VP Innovation
  • 2. What we talk about when we talk about no Data Governance Who approved this? I wish these guys spoke our language I can’t understand this report! I’ve never seen this funding code! Who introduced this? Are we sure this definition of ‘professor’ is correct? The Problem This rule is different on our campus! Are we allowed to share this student data with IR?
  • 3. Glossary Search • How frequently do you look up a word for your business? • To what purpose? Clarification Differentiation • What are your main sources? • Hierarchy-based navigation or keyword-based search? • Authoritative Truth or trust?
  • 4. Overview • Data Governance Operating Framework Data Governance Data Stewardship Data Management • Implementations Stanford University Data Stewardship (SUDS) George Washington University Brigham Young University • The Bigger Picture Inter-university Data Governance in the Flanders Research Information Space
  • 5. Data Governance Framework Data Governance Council: Governance Operating Model Roles & Responsibilities Processes & Workflow Asset Types & Traceability Data Governance Organization Data Stewardship Activities Data Quality Development IT / Operational Data Management Activities Data Modeling Metadata Lineage Establishes & Drives Aligns & Coordinates Reports & Escalates Monitors & Remediates Metadata Scanning Reference Data Authoring Data Integration Collibra Business Semantics Glossary (BSG) Collibra Reference Data Accelerator (RDA) Hierarchy Management Business & Data Definitions Business Traceability Semantic Modeling Mapping Specifications Policy Management Business Rules Data Quality Rules Data Quality Reporting Issue Management Reference Data Crosswalks Master Data Stewardship Data Quality Profiling DQ Defect Resolution Collibra Data Stewardship Manager (DSM) Collibra Platform Other Data Management Vendor products ... https://compass.collibra.com/display/COOK/Data+Governance+Operating+Model
  • 6. Stanford University Data Stewardship (SUDS) • All Materials available here dg.stanford.edu • Establish foundation for Institutional Research • Data Quality How many faculty do we have? • Context and Meaning What does faculty mean in which context? How is faculty data structured and where is it stored? • Data Usage Request Am I allowed to use faculty or student name and age for external reporting?
  • 7. SUDS: Approach • Decentralized ◦ 1 DG coordinator (also show vacancy) ◦ Project staff ◦ Cross-functional working groups: natural scope and resources ◦ Focus on BI reporting, with input from above projects ◦ Sign-off by DG coordinator and end user through usage (full cycle) • Step by step; success by success
  • 8. SUDS: First Success in OBIEE reporting REST / JSON / CSV / Excel
  • 9. DG Operating Model • What do we want to capture? Asset Type: Business Terms, Policies, Rules, Code Values Attribute/Relation Type: Name, Definition, Example, Derivations, Specializations • Who should be involved in this process? Communities: Finance, HR, Student, Research Domains / subject areas: Task Management Users and User groups • How to execute and Monitor the process? Key events and workflow chains Validation rules
  • 10. SUDS Data Dictionary Example +4000 data elements Community context: Finance, HR, Research and Student Custom attribute types and relation types
  • 11. What attribute- and relation-types do we want to capture? Out of the box but also custom attribute types and relation types
  • 12. What attribute- and relation-types do we want to capture? • https://stanford.app.box.com/CollibraQuickReference • https://stanford.box.com/UsingCollibraFields
  • 13. Who is involved in the process? • https://compass.collibra.com/display/COOK/Role+Types Responsible Accountable Informed Consulted
  • 14. Who? User groups and Dashboards
  • 15. Who? – User groups and Dashboards
  • 16. How to execute and monitor? From Best Practice to Auto-Validation Rules http://web.stanford.edu/dept/pres-provost/cgi-bin/dg/wordpress/?p=577 (generic example – not from SUDS)
  • 17. How to execute and monitor? • Status Types and Workflows E.g., for Domains, Terms, Users, and later for Issues and Data Sharing Agreements, we first define a “finite state machine” and then a set of workflows that each define a transition between states. This means workflows can trigger each other and form a complex chain. BUSINESS SEMANTICS GLOSSARY States: Candidate, In Progress, Under Review, Accepted, In Revision, Rejected, Deprecated. Entry point: Term requested on the domain page. Workflows: 1 Propose Business Term, 2 Edit Business Term, 3 Onboarding Business Term, 4 Deprecate Business Term, 5 Reactivate Business Term.
  • 18. How is it to be governed? Onboarding Workflow (not Stanford content - illustrative example only)
  • 19. How is it to be governed? Approval Workflow (not Stanford content - illustrative example only)
  • 20. Stanford DG Program Key Results (from http://web.stanford.edu/dept/pres-provost/cgi-bin/dg/wordpress/wp-content/uploads/2014/11/Stanford_DS_CAIR_v2.pdf) • Understand data from multiple perspectives • Central repository of verified information (and better data infrastructure) • Easier access to information; less reliance on ‘oral tradition’ • Improved data quality, consistency • Increased understanding; thoughtful decision-making around data
  • 21. SUDS Future Directions • Continue building engagement around data governance (define policy), in addition to data stewardship (enforce policy) • Continue building engagement, especially by executive-level leadership • Continue increasing visibility and consumption of definitions and other metadata
  • 22. George Washington University (by courtesy of Ron Layne, GWU) • centralized • run by the DG Office division of IT • mapping data dictionaries, rules and metrics and data sharing agreements • Integration with Informatica Data Quality
  • 23. Flanders Research Information Space • Providing Scientific Research Information and Services • Easy • Transparent • Open • Timely • Unambiguous • Supported by Data Governance • Qualitative meta data: e.g., definition for project, funding codes, mappings, classifications, etc. • Roles and responsibilities for Information Providers and Stiweto • Collaborative workflows between Information Providers and Stiweto By courtesy of G. Van Grootel, EWI
  • 24. FRIS’ Data-driven Innovation Engine By courtesy of G. Van Grootel, EWI
  • 25. The Data providers landscape: Universities, Research Institutes, Funders, Strategic Research Centers, University Colleges, Others. By courtesy of G. Van Grootel, EWI
  • 26. FRIS Metamodel: an example By courtesy of G. Van Grootel, EWI
  • 27. Traceability diagram (Node: Description) • JRC (Joint Research Centre): the Business Term representing the Funding Source • Zevende Kader Programma..: the Business Term representing the parent Funding Source • 3723: Generation 1 Funding Code Value • 258: Generation 2 Funding Code Value • G3: the Funding Stream Code Value By courtesy of G. Van Grootel, EWI
  • 28. Conclusions • Case by case, success by success • Identify key events and design workflow ‘chains’ to automate governance • To support your specific use case and the growing DG platform, you need to extend asset, relation, and attribute types • Collaboration and business user friendliness • BOK http://compass.collibra.com
  • 29. Questions For Audience • What percentage of data users need to look up the definition of a term? • What percentage want to know where the data around a term is stored? • How many business terms do you have? • Who is in charge of data quality / governance? • What percentage of data definition decisions depend on the business?
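
Slide 17 describes term governance as a finite state machine in which each workflow defines a transition between statuses. The idea can be sketched in a few lines of Python. The status and workflow names below come from the slide, but the transition table is an assumption made for illustration (the edges of the original diagram are not recoverable from this transcript), the Rejected path is left out, and the class and function names are hypothetical rather than part of any Collibra API.

```python
# Hypothetical transition table: (workflow, current status) -> next status.
# Names are taken from slide 17; which workflow links which statuses is assumed.
TRANSITIONS = {
    ("Propose Business Term", None): "Candidate",
    ("Edit Business Term", "Candidate"): "In Progress",
    ("Edit Business Term", "Accepted"): "In Revision",
    ("Onboarding Business Term", "In Progress"): "Under Review",
    ("Onboarding Business Term", "In Revision"): "Under Review",
    ("Onboarding Business Term", "Under Review"): "Accepted",
    ("Deprecate Business Term", "Accepted"): "Deprecated",
    ("Reactivate Business Term", "Deprecated"): "Accepted",
}


class BusinessTerm:
    """A glossary term whose status only changes through a workflow."""

    def __init__(self, name):
        self.name = name
        self.status = None   # no status until the term has been proposed
        self.history = []    # audit trail of (workflow, resulting status)

    def run(self, workflow):
        """Apply a workflow; reject it if it is not allowed in the current status."""
        key = (workflow, self.status)
        if key not in TRANSITIONS:
            raise ValueError(
                f"{workflow!r} is not allowed while {self.name!r} is in status {self.status!r}"
            )
        self.status = TRANSITIONS[key]
        self.history.append((workflow, self.status))
        return self.status


def run_chain(term, workflows):
    """Run several workflows in order, mimicking a workflow 'chain'."""
    for workflow in workflows:
        term.run(workflow)
    return term.status
```

A term would then move through the chain with, for example, `run_chain(BusinessTerm("professor"), ["Propose Business Term", "Edit Business Term", "Onboarding Business Term", "Onboarding Business Term"])`. Keeping the transitions in a single table makes the allowed paths easy to review and leaves an audit trail of who moved what, which is the kind of traceability the slides ask for.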

Editor's Notes

  1. Audience from various academic institutions. Before Collibra I was a researcher and assistant professor at 3 universities. In fact, Collibra is a university spin-off valorising the research on data governance, ontologies and the semantic web.
  2. We should know how universities tick. From my own experience as an employee and as a vendor, I think I know how university departments thrive as decentralized and autonomous entities. Having no data governance does not mean data quality cannot be managed well. It is just that globalization and increased data servicing between university entities make the quality and truth of data relative, so we increasingly have to rely on mutual trust.
  3. Put yourself in the shoes of the user of your data, and consider these questions.
  4. Stewardship activities as currently done: scattered, with no unified operating model and no clear view of the results.
  5. A good case to illustrate this is the Flemish Department of Economy, Science and Innovation (EWI).
  6. Logos of all university colleges after the merger.
  7. Illustrates the implemented FRIS Metamodel in the DGC operational model. Allows for formal named relations between the different FRIS Business leveling model concepts