Data Quality, Data Engineering
and Data Science for
Better Insights – Series of Questions
Tom Redman
Prashanth Southekal
Our plan
• Look holistically at the union of data quality,
engineering, and science.
• An open-ended discussion
• Use The Leader’s Data Manifesto to start
or continue important conversations about
managing data assets
• Use these conversations to initiate action
within your organization to better manage
data assets
• Support this movement and show you
are committed to change by signing the
manifesto online at www.dataleaders.org
© 2017 dataleaders.org
Bad Data is a hidden killer
Baseline:
• Best estimate is 45% of newly-created data records have a critical
error.
• Best estimate is that the cost of poor data quality (CoPDQ) is ~20% of revenue.
In Data Science:
• Impact is different:
• Errors may “cancel out.”
• Bad data → Bad decision/prediction → Impacts thousands
(e.g., financial crisis)
• Bad data → Bad algorithm → Damage potentially unlimited
• etc
• Aligning data sources is a far bigger challenge.
• Aligning decision makers is a far bigger challenge
• Etc
©DQS, 2000-2017
Data for Business Performance
My book, Data for Business
Performance, is in line with what you
have just said.
Specifically, the book has three key
elements that make it special.
1. The book is holistic
2. The book is for practitioners
3. The book is technology
agnostic
Relationships between Different Data Types
[Figure labels: Business Data, Technical Data, Reference Data, Master Data, Transactional Data, Metadata]
© 2017 DBP-Institute
Example of Integrated Business Data
[Figure labels: Reference Data, Master Data, Transactional Data]
© 2017 DBP-Institute
Structured vs. Unstructured data
Depending on the manner in which data is initially created or recorded, data
can be categorized into two main forms.
• Structured Data. Data that resides in a fixed field within a record or file is
structured data.
• Unstructured Data. Unstructured data is data in its native state, i.e.,
data that doesn't have any predefined data structure when created.
[Figure labels: Structured Data (Customer Identifier, 10-digit numeric code); Unstructured Data (Description with 25 characters)]
© 2017 DBP-Institute
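The difference between the two forms shows up when you try to check a value mechanically. A minimal sketch in Python, assuming the customer identifier must be exactly 10 numeric digits and the description is free text of at most 25 characters; the function names and constants are hypothetical and not taken from the presentation:

```python
import re

# Assumption: a customer identifier is exactly 10 ASCII digits.
CUSTOMER_ID_PATTERN = re.compile(r"[0-9]{10}")
# Assumption: the description field holds at most 25 characters.
MAX_DESCRIPTION_LENGTH = 25


def is_valid_customer_id(value: str) -> bool:
    """Structured field: the whole value must match the fixed format."""
    return CUSTOMER_ID_PATTERN.fullmatch(value) is not None


def is_valid_description(value: str) -> bool:
    """Free-text field: only the length can be checked, not the meaning."""
    return len(value) <= MAX_DESCRIPTION_LENGTH
```

The structured field can be accepted or rejected on its format alone, whereas the free-text description passes any length check without saying anything about whether its content is usable.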
Taxonomy holds the key in Unstructured Data
© 2017 DBP-Institute
Data Science in the Big Picture
“Data revolution” or not, we need to get better at practically
everything:
• Day-in, day-out work: Largely quality.
• Management. Planning.
• Put data to work in new and exciting ways. My current list:
– Making better decisions
– Innovation
– Informationalization
– Providing content
– Infomediation
– Creating and Leveraging asymmetries
• Seeking out, leveraging, and protecting proprietary data.
©DQS, 2000-2017
Putting Data to work: End-to-end process
©DQS, 2000-2017
The D4 Process:
• Data: Acquire and understand “potentially interesting” data
• Discovery (data science): Find something “truly interesting” in that data
• Delivery: Deliver the discovery in the form of a product/service/report
• “Dollars”: “Monetize” the discovery
Dominance in the Data Lifecycle (DLC)
Stages: Origination, Capture, Validation, Processing, Distribution, Aggregation, Interpretation, Consumption, Data Storage, Data Security
Disciplines: Data Engineering, Data Science
8 of the 10 stages in the DLC pertain to Data Engineering
© 2017 DBP-Institute
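The "8 of the 10" split can be written down as a small lookup. A minimal sketch in Python: the stage names come from the slide, but the assignment of each stage to a discipline is an assumption (Interpretation and Consumption are taken to be Data Science, everything else Data Engineering), and the names `Discipline`, `DLC_STAGES`, and `stages_for` are hypothetical:

```python
from enum import Enum


class Discipline(Enum):
    DATA_ENGINEERING = "Data Engineering"
    DATA_SCIENCE = "Data Science"


# Stage names are from the slide. The stage-to-discipline mapping is an
# assumption chosen to be consistent with "8 of the 10 stages".
DLC_STAGES = {
    "Origination": Discipline.DATA_ENGINEERING,
    "Capture": Discipline.DATA_ENGINEERING,
    "Validation": Discipline.DATA_ENGINEERING,
    "Processing": Discipline.DATA_ENGINEERING,
    "Distribution": Discipline.DATA_ENGINEERING,
    "Aggregation": Discipline.DATA_ENGINEERING,
    "Interpretation": Discipline.DATA_SCIENCE,
    "Consumption": Discipline.DATA_SCIENCE,
    "Data Storage": Discipline.DATA_ENGINEERING,
    "Data Security": Discipline.DATA_ENGINEERING,
}


def stages_for(discipline: Discipline) -> list[str]:
    """Return the stage names assigned to a discipline, in slide order."""
    return [stage for stage, owner in DLC_STAGES.items() if owner is discipline]
```

Under this assumed mapping, `stages_for(Discipline.DATA_ENGINEERING)` returns eight stages and `stages_for(Discipline.DATA_SCIENCE)` returns the remaining two.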
Data Engineering vs. Data Cleansing
Data Engineering: Origination, Capture, Validation, Processing, Distribution, Aggregation
Data Cleansing: 60% of the effort in deriving insights
© 2017 DBP-Institute
Where do the data scientists sit?
[Figure: analytical “sophistication” (basic process improvements; new, sophisticated algorithms; fundamental new discovery) against the data scientists' “home” (in the line, where everyone is involved; in a “lab”)]
©DQS, 2000-2017
Data Lab and Data Factory
• Lab for discovery, new products: Different
management mind-set, people, goals
• Factory for scale, control, profit
• Connect the two!
©DQS, 2000-2017
What’s required in the factory?
Building blocks of Data Factory:
1. Manage core business processes in the SoR
2. Manage Reference and Master data with standards
3. Enable data integration using standards
4. Position Data Governance as a business function
© 2017 DBP-Institute
Most important takeaways:
Prashanth:
• Business Performance can
be achieved by aligning
data to the business goals,
key questions, and KPIs.
• There is no data
management endeavour
without a customer.
• Data Quality is a journey
and NOT a destination
Tom:
• The “data space” is
advancing too slowly and it
is time for data practitioners
to push far harder.
• The “easiest” place to begin
is with quality. And the
benefits stun.
• Data practitioners must also
take on the tough
organizational issues.
Our Profiles
Dr. Prashanth H Southekal is the Managing Principal of
DBP-Institute. He brings over 20 years of Information
Management experience from companies such as SAP AG, Accenture,
Deloitte, P&G, and General Electric. Dr. Southekal has
published three books on Information
Management, including “Data for Business Performance”.
Dr. Thomas C Redman, “the Data Doc”, is an Advisor at
DBP-Institute. Dr. Redman is a world-renowned thought leader
who has helped blue-chip companies such as Chevron,
Shell, JP Morgan, and AT&T make big improvements in
Information Management. He has written dozens of papers
and five books, including the most popular, “Data Driven:
Profiting from Your Most Important Business Asset”.

More Related Content

What's hot

Conformed Dimensions of Data Quality – An Organized Approach to Data Quality ...
Conformed Dimensions of Data Quality – An Organized Approach to Data Quality ...Conformed Dimensions of Data Quality – An Organized Approach to Data Quality ...
Conformed Dimensions of Data Quality – An Organized Approach to Data Quality ...DATAVERSITY
 
DI&A Slides: Data Lake vs. Data Warehouse
DI&A Slides: Data Lake vs. Data WarehouseDI&A Slides: Data Lake vs. Data Warehouse
DI&A Slides: Data Lake vs. Data WarehouseDATAVERSITY
 
Balancing Data and Processes to Achieve Organizational Maturity
Balancing Data and Processes to Achieve Organizational MaturityBalancing Data and Processes to Achieve Organizational Maturity
Balancing Data and Processes to Achieve Organizational MaturityDATAVERSITY
 
Data Monetization
Data MonetizationData Monetization
Data MonetizationDATAVERSITY
 
Metadata Strategies - Data Squared
Metadata Strategies - Data SquaredMetadata Strategies - Data Squared
Metadata Strategies - Data SquaredDATAVERSITY
 
Governing Quality Analytics
Governing Quality AnalyticsGoverning Quality Analytics
Governing Quality AnalyticsDATAVERSITY
 
Real-World Data Governance Webinar: Using Data Governance to Achieve Data Qua...
Real-World Data Governance Webinar: Using Data Governance to Achieve Data Qua...Real-World Data Governance Webinar: Using Data Governance to Achieve Data Qua...
Real-World Data Governance Webinar: Using Data Governance to Achieve Data Qua...DATAVERSITY
 
Mastering Data Modeling for NoSQL Platforms
Mastering Data Modeling for NoSQL PlatformsMastering Data Modeling for NoSQL Platforms
Mastering Data Modeling for NoSQL PlatformsDATAVERSITY
 
RWDG Slides: Using Agile to Justify Data Governance
RWDG Slides: Using Agile to Justify Data GovernanceRWDG Slides: Using Agile to Justify Data Governance
RWDG Slides: Using Agile to Justify Data GovernanceDATAVERSITY
 
Self-Service Data Analysis, Data Wrangling, Data Munging, and Data Modeling –...
Self-Service Data Analysis, Data Wrangling, Data Munging, and Data Modeling –...Self-Service Data Analysis, Data Wrangling, Data Munging, and Data Modeling –...
Self-Service Data Analysis, Data Wrangling, Data Munging, and Data Modeling –...DATAVERSITY
 
DI&A Webinar: Big Data Analytics
DI&A Webinar: Big Data AnalyticsDI&A Webinar: Big Data Analytics
DI&A Webinar: Big Data AnalyticsDATAVERSITY
 
DI&A Webinar: Building a Flexible and Scalable Analytics Architecture
DI&A Webinar: Building a Flexible and Scalable Analytics ArchitectureDI&A Webinar: Building a Flexible and Scalable Analytics Architecture
DI&A Webinar: Building a Flexible and Scalable Analytics ArchitectureDATAVERSITY
 
CDO Webinar: Coordinating Your Data Strategies – When Data Management Worlds ...
CDO Webinar: Coordinating Your Data Strategies – When Data Management Worlds ...CDO Webinar: Coordinating Your Data Strategies – When Data Management Worlds ...
CDO Webinar: Coordinating Your Data Strategies – When Data Management Worlds ...DATAVERSITY
 
Trends in Data Analytics - From Database to Analyst
Trends in Data Analytics - From Database to AnalystTrends in Data Analytics - From Database to Analyst
Trends in Data Analytics - From Database to AnalystDATAVERSITY
 
Is a Data Governance Charter Necessary?
Is a Data Governance Charter Necessary?Is a Data Governance Charter Necessary?
Is a Data Governance Charter Necessary?DATAVERSITY
 
RWDG Slides: Three Approaches to Data Stewardship
RWDG Slides: Three Approaches to Data StewardshipRWDG Slides: Three Approaches to Data Stewardship
RWDG Slides: Three Approaches to Data StewardshipDATAVERSITY
 
Data Governance Strategies - With Great Power Comes Great Accountability
Data Governance Strategies - With Great Power Comes Great AccountabilityData Governance Strategies - With Great Power Comes Great Accountability
Data Governance Strategies - With Great Power Comes Great AccountabilityDATAVERSITY
 
Big Challenges in Data Modeling: Modeling Metadata
Big Challenges in Data Modeling: Modeling MetadataBig Challenges in Data Modeling: Modeling Metadata
Big Challenges in Data Modeling: Modeling MetadataDATAVERSITY
 
Data Architecture Best Practices for Today’s Rapidly Changing Data Landscape
Data Architecture Best Practices for Today’s Rapidly Changing Data LandscapeData Architecture Best Practices for Today’s Rapidly Changing Data Landscape
Data Architecture Best Practices for Today’s Rapidly Changing Data LandscapeDATAVERSITY
 
LDM Slides: Conceptual Data Models - How to Get the Attention of Business Use...
LDM Slides: Conceptual Data Models - How to Get the Attention of Business Use...LDM Slides: Conceptual Data Models - How to Get the Attention of Business Use...
LDM Slides: Conceptual Data Models - How to Get the Attention of Business Use...DATAVERSITY
 

What's hot (20)

Conformed Dimensions of Data Quality – An Organized Approach to Data Quality ...
Conformed Dimensions of Data Quality – An Organized Approach to Data Quality ...Conformed Dimensions of Data Quality – An Organized Approach to Data Quality ...
Conformed Dimensions of Data Quality – An Organized Approach to Data Quality ...
 
DI&A Slides: Data Lake vs. Data Warehouse
DI&A Slides: Data Lake vs. Data WarehouseDI&A Slides: Data Lake vs. Data Warehouse
DI&A Slides: Data Lake vs. Data Warehouse
 
Balancing Data and Processes to Achieve Organizational Maturity
Balancing Data and Processes to Achieve Organizational MaturityBalancing Data and Processes to Achieve Organizational Maturity
Balancing Data and Processes to Achieve Organizational Maturity
 
Data Monetization
Data MonetizationData Monetization
Data Monetization
 
Metadata Strategies - Data Squared
Metadata Strategies - Data SquaredMetadata Strategies - Data Squared
Metadata Strategies - Data Squared
 
Governing Quality Analytics
Governing Quality AnalyticsGoverning Quality Analytics
Governing Quality Analytics
 
Real-World Data Governance Webinar: Using Data Governance to Achieve Data Qua...
Real-World Data Governance Webinar: Using Data Governance to Achieve Data Qua...Real-World Data Governance Webinar: Using Data Governance to Achieve Data Qua...
Real-World Data Governance Webinar: Using Data Governance to Achieve Data Qua...
 
Mastering Data Modeling for NoSQL Platforms
Mastering Data Modeling for NoSQL PlatformsMastering Data Modeling for NoSQL Platforms
Mastering Data Modeling for NoSQL Platforms
 
RWDG Slides: Using Agile to Justify Data Governance
RWDG Slides: Using Agile to Justify Data GovernanceRWDG Slides: Using Agile to Justify Data Governance
RWDG Slides: Using Agile to Justify Data Governance
 
Self-Service Data Analysis, Data Wrangling, Data Munging, and Data Modeling –...
Self-Service Data Analysis, Data Wrangling, Data Munging, and Data Modeling –...Self-Service Data Analysis, Data Wrangling, Data Munging, and Data Modeling –...
Self-Service Data Analysis, Data Wrangling, Data Munging, and Data Modeling –...
 
DI&A Webinar: Big Data Analytics
DI&A Webinar: Big Data AnalyticsDI&A Webinar: Big Data Analytics
DI&A Webinar: Big Data Analytics
 
DI&A Webinar: Building a Flexible and Scalable Analytics Architecture
DI&A Webinar: Building a Flexible and Scalable Analytics ArchitectureDI&A Webinar: Building a Flexible and Scalable Analytics Architecture
DI&A Webinar: Building a Flexible and Scalable Analytics Architecture
 
CDO Webinar: Coordinating Your Data Strategies – When Data Management Worlds ...
CDO Webinar: Coordinating Your Data Strategies – When Data Management Worlds ...CDO Webinar: Coordinating Your Data Strategies – When Data Management Worlds ...
CDO Webinar: Coordinating Your Data Strategies – When Data Management Worlds ...
 
Trends in Data Analytics - From Database to Analyst
Trends in Data Analytics - From Database to AnalystTrends in Data Analytics - From Database to Analyst
Trends in Data Analytics - From Database to Analyst
 
Is a Data Governance Charter Necessary?
Is a Data Governance Charter Necessary?Is a Data Governance Charter Necessary?
Is a Data Governance Charter Necessary?
 
RWDG Slides: Three Approaches to Data Stewardship
RWDG Slides: Three Approaches to Data StewardshipRWDG Slides: Three Approaches to Data Stewardship
RWDG Slides: Three Approaches to Data Stewardship
 
Data Governance Strategies - With Great Power Comes Great Accountability
Data Governance Strategies - With Great Power Comes Great AccountabilityData Governance Strategies - With Great Power Comes Great Accountability
Data Governance Strategies - With Great Power Comes Great Accountability
 
Big Challenges in Data Modeling: Modeling Metadata
Big Challenges in Data Modeling: Modeling MetadataBig Challenges in Data Modeling: Modeling Metadata
Big Challenges in Data Modeling: Modeling Metadata
 
Data Architecture Best Practices for Today’s Rapidly Changing Data Landscape
Data Architecture Best Practices for Today’s Rapidly Changing Data LandscapeData Architecture Best Practices for Today’s Rapidly Changing Data Landscape
Data Architecture Best Practices for Today’s Rapidly Changing Data Landscape
 
LDM Slides: Conceptual Data Models - How to Get the Attention of Business Use...
LDM Slides: Conceptual Data Models - How to Get the Attention of Business Use...LDM Slides: Conceptual Data Models - How to Get the Attention of Business Use...
LDM Slides: Conceptual Data Models - How to Get the Attention of Business Use...
 

Similar to Webinar: Data Quality, Data Engineering, and Data Science

Why Data Science Projects Fail
Why Data Science Projects FailWhy Data Science Projects Fail
Why Data Science Projects FailSense Corp
 
Why Data Science Projects Fail
Why Data Science Projects FailWhy Data Science Projects Fail
Why Data Science Projects FailSense Corp
 
Business Intelligence (BI) and Data Management Basics
Business Intelligence (BI) and Data Management  Basics Business Intelligence (BI) and Data Management  Basics
Business Intelligence (BI) and Data Management Basics amorshed
 
The Data Lake - Balancing Data Governance and Innovation
The Data Lake - Balancing Data Governance and Innovation The Data Lake - Balancing Data Governance and Innovation
The Data Lake - Balancing Data Governance and Innovation Caserta
 
The New Age Data Quality
The New Age Data QualityThe New Age Data Quality
The New Age Data QualityRanjeet202050
 
DataEd Webinar: Implementing Successful Data Strategies - Developing Organiza...
DataEd Webinar: Implementing Successful Data Strategies - Developing Organiza...DataEd Webinar: Implementing Successful Data Strategies - Developing Organiza...
DataEd Webinar: Implementing Successful Data Strategies - Developing Organiza...DATAVERSITY
 
Data-Ed: Business Value From MDM
Data-Ed: Business Value From MDM Data-Ed: Business Value From MDM
Data-Ed: Business Value From MDM Data Blueprint
 
Data-Ed Online Webinar: Business Value from MDM
Data-Ed Online Webinar: Business Value from MDMData-Ed Online Webinar: Business Value from MDM
Data-Ed Online Webinar: Business Value from MDMDATAVERSITY
 
Data Governance Best Practices, Assessments, and Roadmaps
Data Governance Best Practices, Assessments, and RoadmapsData Governance Best Practices, Assessments, and Roadmaps
Data Governance Best Practices, Assessments, and RoadmapsDATAVERSITY
 
Data Modeling & Data Integration
Data Modeling & Data IntegrationData Modeling & Data Integration
Data Modeling & Data IntegrationDATAVERSITY
 
Data Governance and Data Science to Improve Data Quality
Data Governance and Data Science to Improve Data QualityData Governance and Data Science to Improve Data Quality
Data Governance and Data Science to Improve Data QualityDATAVERSITY
 
The Role of Data Governance in a Data Strategy
The Role of Data Governance in a Data StrategyThe Role of Data Governance in a Data Strategy
The Role of Data Governance in a Data StrategyDATAVERSITY
 
Data-Ed Slides: Best Practices in Data Stewardship (Technical)
Data-Ed Slides: Best Practices in Data Stewardship (Technical)Data-Ed Slides: Best Practices in Data Stewardship (Technical)
Data-Ed Slides: Best Practices in Data Stewardship (Technical)DATAVERSITY
 
The Rise of the CDO in Today's Enterprise
The Rise of the CDO in Today's EnterpriseThe Rise of the CDO in Today's Enterprise
The Rise of the CDO in Today's EnterpriseCaserta
 
Data-Ed Slides: Data-Centric Strategy & Roadmap - Supercharging Your Business
Data-Ed Slides: Data-Centric Strategy & Roadmap - Supercharging Your BusinessData-Ed Slides: Data-Centric Strategy & Roadmap - Supercharging Your Business
Data-Ed Slides: Data-Centric Strategy & Roadmap - Supercharging Your BusinessDATAVERSITY
 
Managing Data Warehouse Growth in the New Era of Big Data
Managing Data Warehouse Growth in the New Era of Big DataManaging Data Warehouse Growth in the New Era of Big Data
Managing Data Warehouse Growth in the New Era of Big DataVineet
 
Getting Data Quality Right
Getting Data Quality RightGetting Data Quality Right
Getting Data Quality RightDATAVERSITY
 
Big Data - Bridging Technology and Humans
Big Data - Bridging Technology and HumansBig Data - Bridging Technology and Humans
Big Data - Bridging Technology and HumansMark Laurance
 
Big Data Expo 2015 - Trillium software Big Data and the Data Quality
Big Data Expo 2015 - Trillium software Big Data and the Data QualityBig Data Expo 2015 - Trillium software Big Data and the Data Quality
Big Data Expo 2015 - Trillium software Big Data and the Data QualityBigDataExpo
 

Similar to Webinar: Data Quality, Data Engineering, and Data Science (20)

Why Data Science Projects Fail
Why Data Science Projects FailWhy Data Science Projects Fail
Why Data Science Projects Fail
 
Why Data Science Projects Fail
Why Data Science Projects FailWhy Data Science Projects Fail
Why Data Science Projects Fail
 
Business Intelligence (BI) and Data Management Basics
Business Intelligence (BI) and Data Management  Basics Business Intelligence (BI) and Data Management  Basics
Business Intelligence (BI) and Data Management Basics
 
The Data Lake - Balancing Data Governance and Innovation
The Data Lake - Balancing Data Governance and Innovation The Data Lake - Balancing Data Governance and Innovation
The Data Lake - Balancing Data Governance and Innovation
 
The New Age Data Quality
The New Age Data QualityThe New Age Data Quality
The New Age Data Quality
 
DataEd Webinar: Implementing Successful Data Strategies - Developing Organiza...
DataEd Webinar: Implementing Successful Data Strategies - Developing Organiza...DataEd Webinar: Implementing Successful Data Strategies - Developing Organiza...
DataEd Webinar: Implementing Successful Data Strategies - Developing Organiza...
 
Data-Ed: Business Value From MDM
Data-Ed: Business Value From MDM Data-Ed: Business Value From MDM
Data-Ed: Business Value From MDM
 
Data-Ed Online Webinar: Business Value from MDM
Data-Ed Online Webinar: Business Value from MDMData-Ed Online Webinar: Business Value from MDM
Data-Ed Online Webinar: Business Value from MDM
 
Why data governance is the new buzz?
Why data governance is the new buzz?Why data governance is the new buzz?
Why data governance is the new buzz?
 
Data Governance Best Practices, Assessments, and Roadmaps
Data Governance Best Practices, Assessments, and RoadmapsData Governance Best Practices, Assessments, and Roadmaps
Data Governance Best Practices, Assessments, and Roadmaps
 
Data Modeling & Data Integration
Data Modeling & Data IntegrationData Modeling & Data Integration
Data Modeling & Data Integration
 
Data Governance and Data Science to Improve Data Quality
Data Governance and Data Science to Improve Data QualityData Governance and Data Science to Improve Data Quality
Data Governance and Data Science to Improve Data Quality
 
The Role of Data Governance in a Data Strategy
The Role of Data Governance in a Data StrategyThe Role of Data Governance in a Data Strategy
The Role of Data Governance in a Data Strategy
 
Data-Ed Slides: Best Practices in Data Stewardship (Technical)
Data-Ed Slides: Best Practices in Data Stewardship (Technical)Data-Ed Slides: Best Practices in Data Stewardship (Technical)
Data-Ed Slides: Best Practices in Data Stewardship (Technical)
 
The Rise of the CDO in Today's Enterprise
The Rise of the CDO in Today's EnterpriseThe Rise of the CDO in Today's Enterprise
The Rise of the CDO in Today's Enterprise
 
Data-Ed Slides: Data-Centric Strategy & Roadmap - Supercharging Your Business
Data-Ed Slides: Data-Centric Strategy & Roadmap - Supercharging Your BusinessData-Ed Slides: Data-Centric Strategy & Roadmap - Supercharging Your Business
Data-Ed Slides: Data-Centric Strategy & Roadmap - Supercharging Your Business
 
Managing Data Warehouse Growth in the New Era of Big Data
Managing Data Warehouse Growth in the New Era of Big DataManaging Data Warehouse Growth in the New Era of Big Data
Managing Data Warehouse Growth in the New Era of Big Data
 
Getting Data Quality Right
Getting Data Quality RightGetting Data Quality Right
Getting Data Quality Right
 
Big Data - Bridging Technology and Humans
Big Data - Bridging Technology and HumansBig Data - Bridging Technology and Humans
Big Data - Bridging Technology and Humans
 
Big Data Expo 2015 - Trillium software Big Data and the Data Quality
Big Data Expo 2015 - Trillium software Big Data and the Data QualityBig Data Expo 2015 - Trillium software Big Data and the Data Quality
Big Data Expo 2015 - Trillium software Big Data and the Data Quality
 

More from DATAVERSITY

Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...DATAVERSITY
 
Data at the Speed of Business with Data Mastering and Governance
Data at the Speed of Business with Data Mastering and GovernanceData at the Speed of Business with Data Mastering and Governance
Data at the Speed of Business with Data Mastering and GovernanceDATAVERSITY
 
Exploring Levels of Data Literacy
Exploring Levels of Data LiteracyExploring Levels of Data Literacy
Exploring Levels of Data LiteracyDATAVERSITY
 
Building a Data Strategy – Practical Steps for Aligning with Business Goals
Building a Data Strategy – Practical Steps for Aligning with Business GoalsBuilding a Data Strategy – Practical Steps for Aligning with Business Goals
Building a Data Strategy – Practical Steps for Aligning with Business GoalsDATAVERSITY
 
Make Data Work for You
Make Data Work for YouMake Data Work for You
Make Data Work for YouDATAVERSITY
 
Data Catalogs Are the Answer – What is the Question?
Data Catalogs Are the Answer – What is the Question?Data Catalogs Are the Answer – What is the Question?
Data Catalogs Are the Answer – What is the Question?DATAVERSITY
 
Data Catalogs Are the Answer – What Is the Question?
Data Catalogs Are the Answer – What Is the Question?Data Catalogs Are the Answer – What Is the Question?
Data Catalogs Are the Answer – What Is the Question?DATAVERSITY
 
Data Modeling Fundamentals
Data Modeling FundamentalsData Modeling Fundamentals
Data Modeling FundamentalsDATAVERSITY
 
Showing ROI for Your Analytic Project
Showing ROI for Your Analytic ProjectShowing ROI for Your Analytic Project
Showing ROI for Your Analytic ProjectDATAVERSITY
 
How a Semantic Layer Makes Data Mesh Work at Scale
How a Semantic Layer Makes  Data Mesh Work at ScaleHow a Semantic Layer Makes  Data Mesh Work at Scale
How a Semantic Layer Makes Data Mesh Work at ScaleDATAVERSITY
 
Is Enterprise Data Literacy Possible?
Is Enterprise Data Literacy Possible?Is Enterprise Data Literacy Possible?
Is Enterprise Data Literacy Possible?DATAVERSITY
 
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...DATAVERSITY
 
Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?DATAVERSITY
 
Data Governance Trends - A Look Backwards and Forwards
Data Governance Trends - A Look Backwards and ForwardsData Governance Trends - A Look Backwards and Forwards
Data Governance Trends - A Look Backwards and ForwardsDATAVERSITY
 
Data Governance Trends and Best Practices To Implement Today
Data Governance Trends and Best Practices To Implement TodayData Governance Trends and Best Practices To Implement Today
Data Governance Trends and Best Practices To Implement TodayDATAVERSITY
 
2023 Trends in Enterprise Analytics
2023 Trends in Enterprise Analytics2023 Trends in Enterprise Analytics
2023 Trends in Enterprise AnalyticsDATAVERSITY
 
Data Strategy Best Practices
Data Strategy Best PracticesData Strategy Best Practices
Data Strategy Best PracticesDATAVERSITY
 
Who Should Own Data Governance – IT or Business?
Who Should Own Data Governance – IT or Business?Who Should Own Data Governance – IT or Business?
Who Should Own Data Governance – IT or Business?DATAVERSITY
 
Data Management Best Practices
Data Management Best PracticesData Management Best Practices
Data Management Best PracticesDATAVERSITY
 
MLOps – Applying DevOps to Competitive Advantage
MLOps – Applying DevOps to Competitive AdvantageMLOps – Applying DevOps to Competitive Advantage
MLOps – Applying DevOps to Competitive AdvantageDATAVERSITY
 

More from DATAVERSITY (20)

Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
 
Data at the Speed of Business with Data Mastering and Governance
Data at the Speed of Business with Data Mastering and GovernanceData at the Speed of Business with Data Mastering and Governance
Data at the Speed of Business with Data Mastering and Governance
 
Exploring Levels of Data Literacy
Exploring Levels of Data LiteracyExploring Levels of Data Literacy
Exploring Levels of Data Literacy
 
Building a Data Strategy – Practical Steps for Aligning with Business Goals
Building a Data Strategy – Practical Steps for Aligning with Business GoalsBuilding a Data Strategy – Practical Steps for Aligning with Business Goals
Building a Data Strategy – Practical Steps for Aligning with Business Goals
 
Make Data Work for You
Make Data Work for YouMake Data Work for You
Make Data Work for You
 
Data Catalogs Are the Answer – What is the Question?
Data Catalogs Are the Answer – What is the Question?Data Catalogs Are the Answer – What is the Question?
Data Catalogs Are the Answer – What is the Question?
 
Data Catalogs Are the Answer – What Is the Question?
Data Catalogs Are the Answer – What Is the Question?Data Catalogs Are the Answer – What Is the Question?
Data Catalogs Are the Answer – What Is the Question?
 
Data Modeling Fundamentals
Data Modeling FundamentalsData Modeling Fundamentals
Data Modeling Fundamentals
 
Showing ROI for Your Analytic Project
Showing ROI for Your Analytic ProjectShowing ROI for Your Analytic Project
Showing ROI for Your Analytic Project
 
How a Semantic Layer Makes Data Mesh Work at Scale
How a Semantic Layer Makes  Data Mesh Work at ScaleHow a Semantic Layer Makes  Data Mesh Work at Scale
How a Semantic Layer Makes Data Mesh Work at Scale
 
Is Enterprise Data Literacy Possible?
Is Enterprise Data Literacy Possible?Is Enterprise Data Literacy Possible?
Is Enterprise Data Literacy Possible?
 
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
 
Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?
 
Data Governance Trends - A Look Backwards and Forwards
Data Governance Trends - A Look Backwards and ForwardsData Governance Trends - A Look Backwards and Forwards
Data Governance Trends - A Look Backwards and Forwards
 
Data Governance Trends and Best Practices To Implement Today
Data Governance Trends and Best Practices To Implement TodayData Governance Trends and Best Practices To Implement Today
Data Governance Trends and Best Practices To Implement Today
 
2023 Trends in Enterprise Analytics
2023 Trends in Enterprise Analytics2023 Trends in Enterprise Analytics
2023 Trends in Enterprise Analytics
 
Data Strategy Best Practices
Data Strategy Best PracticesData Strategy Best Practices
Data Strategy Best Practices
 
Who Should Own Data Governance – IT or Business?
Who Should Own Data Governance – IT or Business?Who Should Own Data Governance – IT or Business?
Who Should Own Data Governance – IT or Business?
 
Data Management Best Practices
Data Management Best PracticesData Management Best Practices
Data Management Best Practices
 
MLOps – Applying DevOps to Competitive Advantage
MLOps – Applying DevOps to Competitive AdvantageMLOps – Applying DevOps to Competitive Advantage
MLOps – Applying DevOps to Competitive Advantage
 

Webinar: Data Quality, Data Engineering, and Data Science

  • 1. Data Quality, Data Engineering and Data Science for Better Insights – Series of Questions Tom Redman Prashanth Southekal
  • 2. Our plan • Look holistically at the union of data quality, engineering, and science. • An open-ended discussion
  • 3. • Use The Leader’s Data Manifesto to start or continue important conversations about managing data assets • Use these conversations to initiate action within your organization to better manage data assets • Support this movement and show you are committed to change by signing the manifesto online at www.dataleaders.org © 2017 dataleaders.org
  • 4. Bad Data is a hidden killer Baseline: • Best estimate is 45% of newly-created data records have a critical error. • Best estimate is CoPDQ ~ 20% of revenue. In Data Science: • Impact is different: • Errors may “cancel out.” • Bad data → Bad decision/prediction → Impacts thousands (e.g., financial crisis) • Bad data → Bad algorithm → Damage potentially unlimited • etc • Aligning data sources is a far bigger challenge. • Aligning decision makers is a far bigger challenge • Etc ©DQS, 2000-2017
  • 5. Data for Business Performance My Book – Data for Business Performance is in line with what you have just said. Specifically, the book has three key elements that make it special. 1. The book is holistic 2. The book is for practitioners 3. The book is technology agnostic
  • 6. Relationships between Different Data Types: Reference Data, Master Data, Transactional Data, Metadata; Business Data, Technical Data © 2017 DBP-Institute
  • 7. Example of Integrated Business Data: Reference Data, Reference Data, Master Data, Master Data, Transactional Data, Reference Data, Transactional Data © 2017 DBP-Institute
  • 8. Structured vs. Unstructured data Depending on the manner in which data is initially created or recorded, data can be categorized into two main forms. • Structured Data. Data that resides in a fixed field within a record or file is structured data. • Unstructured Data. Unstructured data is data in its native state, i.e., data that doesn't have any predefined data structure when created. Customer Identifier 10 Digit Numeric Code Description with 25 characters Structured Data Unstructured Data © 2017 DBP-Institute
  • 9. Taxonomy holds the key in Unstructured Data © 2017 DBP-Institute
  • 10. Data Science in the Big Picture “Data revolution” or not, we need to get better at practically everything: • Day-in, day-out work: Largely quality. • Management. Planning. • Put data to work in new and exciting ways. My current list: – Making better decisions – Innovation – Informationalization – Providing content – Infomediation – Creating and Leveraging asymmetries • Seeking out, leveraging, and protecting proprietary data. ©DQS, 2000-2017
  • 11. Putting Data to work: End-to-end process The D4 Process: Data: Acquire and understand “potentially interesting” data; Discovery (data science): Find something “truly interesting” in that data; Delivery: Deliver the discovery in the form of a product/service/report; “Dollars”: “Monetize” the discovery ©DQS, 2000-2017
  • 12. Dominance in the Data Lifecycle (DLC): Origination, Capture, Validation, Processing, Distribution, Aggregation, Interpretation, Consumption, Data Storage, Data Security. Data Engineering, Data Science. 8 of the 10 stages in the DLC pertain to Data Engineering © 2017 DBP-Institute
  • 13. Data Engineering vs. Data Cleansing Data Engineering: Origination, Capture, Validation, Processing, Distribution, Aggregation. Data Cleansing 60% of the Effort in deriving Insights © 2017 DBP-Institute
  • 14. Where do the data scientists sit? Basic Process Improvements New, sophisticated algorithms Fundamental New Discovery In the line: And everyone is involved In a “lab” Analytical “sophistication” “Home” ©DQS, 2000-2017
  • 15. Data Lab and Data Factory • Lab for discovery, new products: Different management mind-set, people, goals • Factory for scale, control, profit • Connect the two! ©DQS, 2000-2017
  • 16. Building blocks of Data Factory: What’s required in the factory? Manage Core business processes in the SoR; Manage Reference and Master data with Standards; Enable Data Integration using Standards; Position Data Governance as a Business Function © 2017 DBP-Institute
  • 17. Most important takeaways: Prashanth: • Business Performance can be achieved by aligning data to the business goals, key questions, and KPIs. • There is no data management endeavour without a customer. • Data Quality is a journey and NOT a destination. Tom: • The “data space” is advancing too slowly and it is time for data practitioners to push far harder. • The “easiest” place to begin is with quality. And the benefits stun. • Data practitioners must also take on the tough organizational issues.
  • 18. Our Profiles Dr. Prashanth H Southekal is the Managing Principal of DBP-Institute. He brings over 20 years of Information Management experience from companies such as SAP AG, Accenture, Deloitte, P&G, and General Electric. Dr. Southekal has published three books on Information Management including “Data for Business Performance”. Dr. Thomas C Redman, “the Data Doc”, is an Advisor at DBP-Institute. Dr. Redman is a world-renowned thought leader who has helped blue-chip companies such as Chevron, Shell, JP Morgan, and AT&T make big improvements in Information Management. He has written dozens of papers and five books including the most popular “Data Driven: Profiting from Your Most Important Business Asset”.