Data Governance
in a Big Data Era
Pieter De Leenheer, PhD
Stanford University
Nov 3, 2016
Misconceptions of Data Governance that
impede Data Valuation
• Data governance is a published repository of common definitions.
• Data governance is a concern of – and hence managed by – IT.
• Data governance is just data quality (DQ) and master data
management (MDM).
• Data governance is siloed by business function.
• Data governance provides no value or participation for the data-
consuming community.
Admin
• http://www.slideshare.net/pdeleenh/data-governance-in-the-big-data-era
Hierarchical Data
Management
• Formal
• Operational and analytical data
• Inward Focus:
• Improve Internal/external coordination
• Understand customer
• Predict next transaction
• Controlled by Central Provider
• MDM, DWH, DM, Dashboards
• Tedious Waterfall
• Compromised by obsolete cost assumptions
• Consumer
• Small Elite C-level
Hierarchical Data
Governance
• Wikipedia: “a set of processes that ensures that important
data assets are formally managed throughout the enterprise.
Data governance ensures that data can be trusted and that
people can be made accountable for any adverse event that
happens because of low data quality”.
• Biased by Total (Data) Quality Management practice
• Suggests ‘policing’ rather than ‘empowerment’
• How to evolve to a democratic networked approach?
• Involves individual contributors (ICs) and middle management
• With less middleman slack
• Dealing with Big Data
Data Big Bang
• Phenomenon: connectivity between
• Social
• Knowledge
• Technology
• Draws curiosity
• Web Science (Pentland, etc)
• Big Data Native Market Entrants (23andMe, Uber,
Inventure)
• Disruption
• Bottom up
• Starting From data
• Low end
• +80% unstructured data or ‘dark matter’
Three Forces Shaping
the Digital Economy (1)
1. Digitalization of the Physical
• Entertainment, Wealth, Biology,
Chemistry
• MPx, PayPal, Bitcoin, 3D printing, IoT, VR
2. Sustained and accelerated growth of
digital power (despite the slowdown of
Moore’s Law)
• Mass parallelization (Hadoop and Hive)
• Move function and reliability to
software
• Miniaturization
Three Forces Shaping the
Digital Economy (2)
3. Modular and Generative Programmability
“By carefully excluding features that are not universally useful
Internet technologies became easily adopted on a massive
scale and gave the Web a generative [i.e. self-reproductive]
character” (Zittrain, 2009).
• This opens new business models unimaginable before:
• apps extend function of a smartphone
• aggregations of components in complex machines
• once digitized opens new ways of manipulation and
transport
The “Dark Matter” of the Big Data Universe
• Observed consequences of these forces:
1. Consumerization of Digital Technologies pivoting around 2000
2. Grassroots Participation / Peer-based
3. Digitalization of Trust
• All contribute to Big Data
• (2) and (3) contribute to Social Capital: Dark Matter (aka
unstructured data)?
• Human communication, Text heavy
• Context: emphasis, emotion, location at moment of capturing
changes meaning:
• “I did not say Peter’s talk stinks”
Data-driven Hierarchies, Networks & Hybrids
• Hierarchical vs. networked
• Network peers provide ideas and feedback, but also service (the Uber driver is analogous to the data scientist)
• Product ownership vs. service (hence data) access
• Example: Uber doesn’t own the vehicles. It only dispatches information about them to riders and focuses on lifetime value and retention.
• Data analogy: access to data is more important than owning it, as the cost of IS is marginal and is replaced by the data value appreciation of the using community
• Passive resources (material, goods) vs. active resources (data, consumer)
• Value-in-exchange vs. value-in-use
• Acquisition vs. retention
• Example: SaaS, Netflix, Costco, etc.
• Data analogy: from formal roles and responsibilities that support internal processes to trust based on social capital
• Process vs. relations
• Provider pushes vs. consumer pulls
• Example: feedback, mods on games, user participation, A/B testing, etc.
• Data analogy: data helpdesk
• Consumerization of tech, grassroots participation, digitalization of trust
Shift in Data Governance Approaches
• The consequences of the digital forces pose a gigantic risk to organizations, even those with hierarchical governance
• Hierarchical data governance
• Few consumers served by a central, opaque provider
• Inward
• Compromised by old, obsolete cost assumptions about digital power
• Use of digital optimizes only to some extent
• Not scalable to big data used by larger ‘data scientist’ populations
• Combine with Networked Approach
• Democratization (production)
• Breadlines
• Consumerization of BI and cheap digital power
• Many serve many
• Supports customer
• Amazonification (consumption)
• Access, SLA, Trust, etc
• Outward
Big Data Analytics Challenges
• When everybody has data scientists, predicting the next
transaction is no longer a competitive advantage
• from 'predict next transaction' to life-long relation
building and value creation
• reduce search and navigation for the customer with
better apps
• crowdsourcing to cross-compare with and learn
from other customers (Opower, INRIX, Zillow)
• get trust from customer through branded non-intrusive
apps: personal health monitoring, Nest
• Retention analysis example
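A retention analysis of this kind can be sketched in a few lines. This is only an illustration under stated assumptions, not the analysis shown in the talk: the event tuples, the customer ids, and the function name `retention_by_period` are all hypothetical. It groups customers into cohorts by the first period in which they appear and reports the share of each cohort still active in later periods.

```python
from collections import defaultdict


def retention_by_period(events):
    """Compute cohort retention from (customer_id, period) pairs.

    `period` is an integer index (e.g. month number). Returns
    {cohort_period: {offset: retained_fraction}}.
    """
    # The first period in which a customer appears defines the cohort.
    first_seen = {}
    for customer, period in events:
        if customer not in first_seen or period < first_seen[customer]:
            first_seen[customer] = period

    # Distinct customers active at each offset from their cohort period.
    active = defaultdict(lambda: defaultdict(set))
    for customer, period in events:
        cohort = first_seen[customer]
        active[cohort][period - cohort].add(customer)

    result = {}
    for cohort, by_offset in active.items():
        cohort_size = len(by_offset[0])
        result[cohort] = {
            offset: len(customers) / cohort_size
            for offset, customers in sorted(by_offset.items())
        }
    return result


# Hypothetical activity log: (customer id, period index).
events = [
    ("a", 0), ("b", 0), ("c", 0), ("d", 0),
    ("a", 1), ("b", 1), ("e", 1),
    ("a", 2), ("e", 2),
]
table = retention_by_period(events)
```

Here `table[0]` describes the customers who first appeared in period 0, and each offset gives the fraction of that cohort still active that many periods later. Tracking this curve over the lifetime of a cohort, rather than only predicting the next transaction, is the shift the slide argues for.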
Big Data Governance Challenges
• Scalable Balance between (hierarchical) control and (networked) empowerment
• Minimize search for data sets
• Advanced descriptors such as business glossary
• Manage attention drift in case of proliferation
• Usage (page ranking): data sets that are reused more are more relevant
• Digitalization of Trust
• Authenticity: lineage and provenance
• data sets owned by people in your social capital
• Price: pricing may be a mechanism, but it is difficult to identify a fair price and
establish a currency-based market for data assets (see Infonomics)
• Service level agreements
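The usage-based ranking idea above (data sets that are reused more are more relevant) can be sketched as a PageRank-style power iteration over a reuse graph. This is a minimal sketch under assumptions: the function name `rank_data_sets`, the data set names, and the damping factor of 0.85 are illustrative choices, not something the slides specify.

```python
def rank_data_sets(reuse, damping=0.85, iterations=50):
    """Rank data sets by reuse with a simple PageRank power iteration.

    `reuse` maps each data set to the list of data sets it reuses
    (is derived from). A data set gains rank when others reuse it.
    """
    nodes = set(reuse)
    for sources in reuse.values():
        nodes.update(sources)
    nodes = sorted(nodes)
    n = len(nodes)

    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = reuse.get(node, [])
            if targets:
                # Pass this data set's weight on to the data sets it reuses.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A data set that reuses nothing spreads its weight evenly.
                share = damping * rank[node] / n
                for other in nodes:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical reuse graph: each key is derived from the listed data sets.
reuse = {
    "churn_report": ["customers", "orders"],
    "revenue_dashboard": ["orders"],
    "orders": ["customers"],
}
ranks = rank_data_sets(reuse)
ranking = sorted(ranks, key=ranks.get, reverse=True)
```

In this toy graph `customers` ranks first and `orders` second, because they are the data sets other assets build on. In a catalog, such a score could be one signal for ordering search results, alongside the glossary descriptors and trust signals listed above.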
Digitalization of Trust
Challenges
• In hierarchical data governance, trust is
• established by a centrally sanctioned competence center
• or by externally appointed trustees with formal roles: stewards,
owners, architects
• In networked peer-driven approach Trust is more complicated:
• Authenticity: is the data factual or opinionated?
• Intention: does this data have good intentions? Can I use
it without peril? Hidden privacy concerns I should be
aware of?
• Assess expertise or quality: are people involved skilled or
certified stewards?
• Is it accurately representing our business reality, i.e.
customer base?
• Is it complete and up to date?
• Has it been certified through a standard process?
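One way to make these trust questions operational in a peer-driven setting is to record the answers next to each data set. The following is a hedged sketch only: the class `TrustAssessment`, its field names, and the 30-day freshness threshold are hypothetical and simply mirror the questions listed above.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class TrustAssessment:
    factual: bool            # authenticity: factual rather than opinionated
    privacy_reviewed: bool   # intention: no hidden privacy concerns
    steward_certified: bool  # expertise: skilled or certified stewards involved
    representative: bool     # accurately represents the business reality
    complete: bool
    last_updated: date
    process_certified: bool  # certified through a standard process

    def open_questions(self, today, max_age_days=30):
        """Return the names of the trust checks that still fail.

        `max_age_days` is an assumed freshness threshold.
        """
        fresh = today - self.last_updated <= timedelta(days=max_age_days)
        checks = {
            "factual": self.factual,
            "privacy_reviewed": self.privacy_reviewed,
            "steward_certified": self.steward_certified,
            "representative": self.representative,
            "complete": self.complete,
            "up_to_date": fresh,
            "process_certified": self.process_certified,
        }
        return [name for name, passed in checks.items() if not passed]


# Hypothetical assessment of a single data set.
assessment = TrustAssessment(
    factual=True,
    privacy_reviewed=False,
    steward_certified=True,
    representative=True,
    complete=True,
    last_updated=date(2016, 9, 1),
    process_certified=True,
)
pending = assessment.open_questions(today=date(2016, 11, 3))
```

Here `pending` lists the checks a consumer would still want answered before relying on the data set, which is the kind of signal a centrally sanctioned competence center provides implicitly in the hierarchical model.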
Danger of the old paradigm models
• Weapons of Math Destruction (WMD) are
models
• Threaten to destabilize
• Equality
• Democracy
• Traits of WMDs
• Opaque
• Unregulated
• Uncontestable
• …hence : ungoverned
The Rise of the Chief Data Officer (CDO) [6]
Data governance & stewardship provide the right level of control and trust in data
Data Infrastructure (IT)
• Leadership: CIO
• Roles: Information Manager, Data Architect, Data Modeler
• Technology: Hadoop, Databases, Data Integration
Data Consumers (Business)
• Leadership: CEO, CFO, VP, Marketing
• Roles: Data Scientist, Business Analyst
• Technology: Visualization, Self-service BI
Need: Data Authority
Data Authority
• Leadership: Chief Data Officer
• Roles: Data Governance Manager, Data Steward
• Technology: Data Stewardship Platform
Recommendations for the Chief Data Officer
• Collaboration: inwards / outwards
• Data Space: traditional data / big
data
• Value Impact: service / strategy
• Join our MIT Sloan CDO Research
• http://www.iscdo.org/
Conclusion
• Digital forces have digitally empowered individuals in the organization
• Hybrid data governance approach should combine
• Hierarchical control of critical data assets to enhance internal coordination
• Networked peer-driven empowerment to drive ‘serendipity’
• On a shared platform
• Key challenges are:
• Digitalization of trust with focus on social capital
• Big data analytics that drives life-time value for customer
• Get rid of old models that are opaque, unregulated and uncontestable
• Recognize CDO Leadership and Role transition
Recommended Reading
• O’Neil, C.: Weapons of Math Destruction
• Franks, B.: Taming the Big Data Tidal Wave
• Sundararajan, A.: The Sharing Economy
• Pentland, S.: Social Physics: How Good Ideas Spread
• Madnick, R. et al.: A Cubic Framework for the Chief Data Officer
• Zittrain, J.: The Future of the Internet
• https://www.collibra.com/blog/unleash-the-data-democracy-5-misconceptions-of-data-governance/
• https://www.collibra.com/blog/the-rise-of-the-chief-data-officer-cdo/

Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Aggregage
 
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
UiPathCommunity
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Albert Hoitingh
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
Sri Ambati
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
Safe Software
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
OnBoard
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
Product School
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
ControlCase
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
Thijs Feryn
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
DianaGray10
 
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdfSAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
Peter Spielvogel
 
Quantum Computing: Current Landscape and the Future Role of APIs
Quantum Computing: Current Landscape and the Future Role of APIsQuantum Computing: Current Landscape and the Future Role of APIs
Quantum Computing: Current Landscape and the Future Role of APIs
Vlad Stirbu
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
ThousandEyes
 

Recently uploaded (20)

The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
Welocme to ViralQR, your best QR code generator.
Welocme to ViralQR, your best QR code generator.Welocme to ViralQR, your best QR code generator.
Welocme to ViralQR, your best QR code generator.
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
 
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdfSAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
 
Quantum Computing: Current Landscape and the Future Role of APIs
Quantum Computing: Current Landscape and the Future Role of APIsQuantum Computing: Current Landscape and the Future Role of APIs
Quantum Computing: Current Landscape and the Future Role of APIs
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
 

Data Governance in the Big Data Era

  • 1. Data Governance in a Big Data Era Pieter De Leenheer, PhD Stanford University Nov 3, 2016
  • 2. Misconceptions of Data Governance that impede Data Valuation • Data governance is a published repository of common definitions. • Data governance is a concern of – and hence managed by – IT. • Data governance is just data quality (DQ) and master data management (MDM). • Data governance is siloed by business function. • Data governance provides no value or participation for the data-consuming community.
  • 4. Hierarchical Data Management • Formal • Operational and analytical data • Inward Focus: • Improve Internal/external coordination • Understand customer • Predict next transaction • Controlled by Central Provider • MDM, DWH, DM, Dashboards • Tedious Waterfall • Compromised by Obsolete Cost assumption • Consumer • Small Elite C-level
  • 5. Hierarchical Data Governance • Wikipedia: “a set of processes that ensures that important data assets are formally managed throughout the enterprise. Data governance ensures that data can be trusted and that people can be made accountable for any adverse event that happens because of low data quality”. • Biased by Total (Data) Quality Management practice • Suggests ‘policing’ rather than ‘empowerment’ • How to evolve to a democratic networked approach? • Involves ICs and middle management • With less middle-men slack • Dealing with Big Data
  • 6. Data Big Bang • Phenomenon: connectivity between • Social • Knowledge • Technology • Draws curiosity • Web Science (Pentland, etc) • Big Data Native Market Entrants (23andMe, Uber, Inventure) • Disruption • Bottom up • Starting From data • Low end • +80% unstructured data or ‘dark matter’
  • 7. Three Forces Shaping the Digital Economy (1) 1. Digitalization of the Physical • Entertainment, Wealth, Biology, Chemistry • MPx, PayPal, Bitcoin, 3D printing, IoT, VR 2. Sustained and accelerated growth of digital power (despite the slowdown of Moore’s Law) • Mass parallelization (Hadoop and Hive) • Move function and reliability to software • Miniaturization
  • 8. Three Forces Shaping the Digital Economy (2) 3. Modular and Generative Programmability “By carefully excluding features that are not universally useful Internet technologies became easily adopted on a massive scale and gave the Web a generative [i.e. self-reproductive] character” (Zittrain, 2009). • This opens new business models unimaginable before: • apps extend function of a smartphone • aggregations of components in complex machines • once digitized opens new ways of manipulation and transport
  • 9. The “Dark Matter” of Big Data Universe • Observed consequence of these forces: 1. Consumerization of Digital Technologies pivoting around 2000 2. Grassroot Participation / Peer-based 3. Digitalization of Trust • All contribute to Big Data • (2) and (3) contribute to Social Capital: Dark Matter (aka unstructured data)? • Human communication, Text heavy • Context: emphasis, emotion, location at moment of capturing changes meaning: • “I did not say Peter’s talk stinks”
  • 10. Data-driven Hierarchies, Networks & Hybrids • Hierarchical vs. Networked • Network peers provide ideas and feedback, but also service (Uber driver; data analogy: data scientist) • Product ownership vs. service (hence data) access • Example: Uber doesn’t own. It only dispatches information about rolling material to riders and focuses on lifetime value retention. • Data analogy: access to data is more important than owning it, as the cost of IS is marginal and is replaced by data value appreciation by the using community • Passive resources (material, goods) vs. active resources (data, consumer) • Value-in-exchange vs. value-in-use • Acquisition vs. retention • Example: SaaS, Netflix, Costco, etc. • Data analogy: from formal roles and responsibilities that support internal processes to trust based on social capital • Process vs. relations • Provider pushes vs. consumer pulls • Example: feedback, mods on games, user participation, A/B testing, etc. • Data analogy: data helpdesk • Consumerization of tech, grassroots participation, digitalization of trust
  • 11. Shift in Data Governance Approaches • Consequences of digital forces pose a gigantic risk to organizations, even with hierarchical governance • Hierarchical data governance • Few consumers served by a central, opaque provider • Inward • Compromises on old, obsolete cost assumptions of digital power • Use of digital optimizes to some extent • Not scalable for big data used by larger ‘data scientist’ populations • Combine with Networked Approach • Democratization (production) • Breadlines • Consumerization of BI and cheap digital power • Many serve many • Supports customer • Amazonification (consumption) • Access, SLA, Trust, etc. • Outward
  • 12. Big Data Analytics Challenges • When everybody has data scientists, predicting the next transaction is not a competitive advantage anymore • From 'predict next transaction' to life-long relation building and value creation • Reduce search and navigation for the customer with better apps • Crowdsourcing to cross-compare with and learn from other customers (Opower, INRIX, Zillow) • Get trust from the customer through branded, non-intrusive apps: personal health monitoring, Nest • Retention analysis example
  • 13. Big Data Governance Challenges • Scalable balance between (hierarchical) control and (networked) empowerment • Minimize search for data sets • Advanced descriptors such as a business glossary • Manage attention drift in case of proliferation • Usage (page ranking): data sets that are reused more are more relevant • Digitalization of Trust • Authenticity: lineage and provenance • Data sets owned by people in your social capital • Price: price may be a mechanism, but it is difficult to identify a fair price and establish a currency-based market for data assets: see Infonomics • Service level agreements
  • 14. Digitalization of Trust Challenges • In hierarchical data governance, trust is • established by a centrally sanctioned competence center • or by externally appointed trustees with formal roles: stewards, owners, architects • In a networked peer-driven approach, trust is more complicated: • Authenticity: is the data factual or opinionated? • Intention: does this data have good intentions? Can I use it without peril? Are there hidden privacy concerns I should be aware of? • Assess expertise or quality: are the people involved skilled or certified stewards? • Is it accurately representing our business reality, i.e. customer base? • Is it complete and up to date? • Has it been certified through a standard process?
  • 15. Danger of the old paradigm models • Weapons of Math Destruction (WMD) are models • Threaten to destabilize • Equality • Democracy • Traits of WMDs • Opaque • Unregulated • Uncontestable • …hence : ungoverned
  • 16. The Rise of the Chief Data Officer (CDO) [6] Data governance & stewardship provide the right level of control and trust in data Data Infrastructure (IT) Data Consumers (Business) LEADERSHIP CEO, CFO, VP, Marketing ROLES Data Scientist, Business Analyst TECHNOLOGY Visualization, Self-service BI NEED Data Authority LEADERSHIP CIO ROLES Information Manager, Data Architect, Data Modeler TECHNOLOGY Hadoop, Databases, Data Integration Data Authority LEADERSHIP Chief Data Officer ROLES Data Governance Manager, Data Steward TECHNOLOGY Data Stewardship Platform
  • 17. Recommendations for the Chief Data Officer • Collaboration: inwards / outwards • Data Space: traditional data / big data • Value Impact: service / strategy • Join our MIT Sloan CDO Research • http://www.iscdo.org/
  • 18. Conclusion • Digital forces have digitally empowered individuals in the organization • Hybrid data governance approach should combine • Hierarchical control of critical data assets to enhance internal coordination • Networked peer-driven empowerment to drive ‘serendipity’ • On a shared platform • Key challenges are: • Digitalization of trust with focus on social capital • Big data analytics that drives lifetime value for the customer • Get rid of old models that are opaque, unregulated and uncontestable • Recognize CDO leadership and role transition
  • 19. Recommended Reading • O’Neil, C.: Weapons of Math Destruction • Franks, B.: Taming the Big Data Tidal Wave • Sundararajan, A.: The Sharing Economy • Pentland, S.: Social Physics: How Good Ideas Spread • Madnick, R. et al.: A Cubic Framework for the Chief Data Officer • Zittrain, J.: The Future of the Internet • https://www.collibra.com/blog/unleash-the-data-democracy-5-misconceptions-of-data-governance/ • https://www.collibra.com/blog/the-rise-of-the-chief-data-officer-cdo/
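
Slide 13 proposes ranking data sets by usage ("page ranking"): data sets that are reused more are more relevant. The sketch below is one possible reading of that idea, not the method used in the talk. It assumes reuse is recorded as (consumer, source) pairs and applies a plain PageRank-style power iteration; the function name `rank_datasets`, the sample data set names, and the damping factor of 0.85 are all illustrative assumptions.

```python
# Hypothetical reuse log: (consumer, source) means `consumer` reuses `source`.
REUSE = [
    ("churn_report", "customers"),
    ("sales_dashboard", "customers"),
    ("sales_dashboard", "orders"),
    ("forecast", "orders"),
    ("forecast", "customers"),
]


def rank_datasets(reuse_edges, damping=0.85, iterations=50):
    """Score data sets so that heavily reused ones rank higher.

    Returns a dict mapping each data set name to a score; scores sum to 1.
    """
    nodes = sorted({name for edge in reuse_edges for name in edge})
    out_links = {node: [] for node in nodes}
    for consumer, source in reuse_edges:
        out_links[consumer].append(source)

    n = len(nodes)
    score = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Score held by data sets that reuse nothing is spread evenly.
        dangling = sum(score[node] for node in nodes if not out_links[node])
        new_score = {
            node: (1.0 - damping) / n + damping * dangling / n for node in nodes
        }
        # Each consumer passes its score on to the sources it reuses.
        for consumer, sources in out_links.items():
            if not sources:
                continue
            share = damping * score[consumer] / len(sources)
            for source in sources:
                new_score[source] += share
        score = new_score
    return score


ranking = sorted(rank_datasets(REUSE).items(), key=lambda item: -item[1])
for name, value in ranking:
    print(f"{name}: {value:.3f}")
```

In this toy log, `customers` is reused by three consumers and `orders` by two, so both rank above the reports that merely consume them. A real catalog would likely weight reuse by recency or by the trust placed in the consumer, which this sketch leaves out.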

Editor's Notes

  1. "Data governance is a published repository of common definitions." This is an incomplete definition of data governance. Of course, a common glossary is a foundational component of many data governance initiatives. However, a repository is only trustworthy if a meaningful and transparent process and responsive ownership are in place to maintain it. Trust is an essential value for achieving democratic data governance.

     "Data governance is a concern of – and hence managed by – IT." This definition excludes the business side of data governance. Indeed, IT plays a crucial role in the underlying identification of authoritative sources and verification of their lineage. Yet the business, as a consumer, has an indispensable role in certifying the business context of the data assets you manage.

     "Data governance is just data quality (DQ) and master data management (MDM)." It’s true that data quality and MDM are data management activities which have to be governed. Yet DQ and MDM are about finding a mathematical truth for data in terms of quantifiable dimensions such as accuracy and completeness. Data governance goes beyond DQ and MDM by building trust in data, which only human beings can qualify. Again, trust comes into the picture as an essential value in democratic data governance.

     "Data governance is siloed by business function." Your organization may be extremely decentralized and geographically distributed. Yet that doesn’t mean you can’t establish a coordinated approach to data governance among autonomous sub-organizations. Many decentralized and geographically distributed organizations, such as universities and global banks, have successfully implemented a shared platform. Moreover, organizations can gain competitive advantage from the broader perspective on the business that global data governance provides.

     "Data governance provides no value or participation for the data-consuming community." This definition is clearly wrong. Self-service BI tools empower more and more consumers to also produce data and reports for their own applications. Data governance policies help define how confidential data can be used and how to ensure data security and quality. If trust is an essential value in the holistic governance of data, then it should be grounded in transparency and equal participation for all data citizens, which necessarily includes the consumers of the data. Together, they are your sentinels, who can identify data issues at a finer granularity than traditional monitoring could.
  2. 23andMe: geno... Inventure: credit scoring in emerging markets.
  3. If all these communication channels are in place, we can trace whereabouts and usage of every data product individually, from the definition level down to the storage.
  4. Outwards: e.g., manufacturing company may agree with his suppliers and distributors on 1 global product ID Traditional: enterprise-level MDM, BI and Analytics Big Data: more on the application level, more self-service BI, more data scientist experimenting with big data require appropr. Approvals for data usage and sharing Data as a service as immediate need to improve service quality, regulatory compliance, reputation of the company Data as a strategy: build aggregated data products and resell them as a strategy: e.g., the ab company selling GPS information of cabs to Google.