Keeping the Pulse of Your Data: Why You Need Data Observability
Speakers
Paul Rasmussen, Principal Product Manager
Shalaish Koul, Principal Sales Engineer
Data Observability
• Introduction to Data Observability
• Why now?
• Use cases
• Overall approach
• Q&A
Data integrity is a business imperative
• 47% of newly created data records have at least one critical error
• 68% of organizations say disparate data negatively impacts their organization
• 84% of CEOs say that they are concerned about the integrity of the data they are making decisions on
Building at Scale
• How do semiconductor companies manufacture a microchip with over 2 trillion transistors on less than 2 inches, and double the capacity every 2 years?
• How do auto companies build a car on a production line with over 30,000 parts spanning different raw materials and manufacturing processes?
• How do software and data engineers develop, merge and deploy millions of lines of code in near real-time continuous delivery pipelines?
• W. Edwards Deming, “The Father of Quality Management,” started the observability concept about 100 years ago
• Observability is a key foundational concept of SPC, Lean, Six Sigma and any process that depends on building quality into repetitive tasks
• Statistical methods are used to control complex processes and ensure quality data products over time:
1. Continually improve by tightening your limits and flagging data issues
2. Identify special-cause (infrequent) and common-cause (bad data) root causes
3. Provide context into data with lineage, sourcing and parentage
4. Trigger automatic actions such as data quality remediation, model retraining, issue escalation and data pipeline activities
How? Observability
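The SPC idea above can be sketched in a few lines: establish control limits from a baseline of a data metric (here, daily row counts, an illustrative choice) and flag new observations outside them as special-cause variation. The 3-sigma limits and sample data are assumptions for illustration, not any specific product's algorithm.

```python
# Minimal SPC sketch: 3-sigma control limits over a data metric.
# Metric choice (daily row counts) and limits are illustrative.
from statistics import mean, stdev

def control_limits(baseline, sigmas=3.0):
    """Compute (lower, upper) control limits from baseline observations."""
    mu, sd = mean(baseline), stdev(baseline)
    return mu - sigmas * sd, mu + sigmas * sd

def flag_special_causes(baseline, new_points, sigmas=3.0):
    """Return (index, value) for new observations outside the limits."""
    lo, hi = control_limits(baseline, sigmas)
    return [(i, x) for i, x in enumerate(new_points) if not lo <= x <= hi]

# Daily row counts for a feed: a stable baseline, then a sudden drop.
baseline = [1000, 1012, 995, 1003, 990, 1008, 1001]
incoming = [1005, 998, 412, 1002]  # day index 2 is anomalous
print(flag_special_causes(baseline, incoming))  # → [(2, 412)]
```

Tightening `sigmas` over time is one way to realize the "continually improve by tightening your limits" point above.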
Why Now?
• Businesses are more data-driven than ever
• Problematic events are infrequent but can be catastrophic
• Users’ data expertise has evolved, along with expectations to do more with it
• Data proliferation and technology diversification
• AI has evolved to support the complexity of the problem
Examples: cloud, on-premises and hybrid cloud environments; Snowflake, Delta Lake, Oracle, MS SQL Server, BigQuery and Redshift; streaming data, databases and files; SAP, Salesforce, and other ERP & CRM systems
• QA is done at the time of development
• Random issues are surfaced
• Users find and report defects
Typical Data Products and Pipelines
Traditionally, the quality of a data product or pipeline is ensured during the
development process and not throughout the operational lifecycle.
Data Sources #1–#4 (status unknown, “?”) feed the process: Create and/or Source the Data → Transform Data → Enrich / Blend / Merge Data → Publish and Expose Data → Data Product(s), where a defect (“X”) surfaces only at the end.
Data Pipelines with Observability
Data Observability monitors the performance of data products and processes in order to detect significant variations before they result in erroneous work products in reports, analytics, insights and outcomes.
With observability applied across the same process (Create and/or Source the Data → Transform Data → Enrich / Blend / Merge Data → Publish and Expose Data), an alert (“!”) on Data Source #4 is raised early: issues are identified and resolved prior to the final Data Product(s).
Show this in action
Data Observability: Impact of Unexpected Data
Data anomalies have downstream impacts, but not every
issue impacts the process in the same way.
The sooner you can detect anomalies, the sooner you
can assess the impacts and effectively remediate.
Data Observability benefits
1. … of your data with continuous measuring and monitoring
2. … into your data landscape and dependencies with intuitive self-discovery capabilities
3. … when outliers and anomalies are identified using artificial intelligence
4. … when identified by intelligent analysis
5. … when issues occur by understanding the cause of the issue
Data Observability is proactive, not reactive
Data Observability and Quality
Rules
Metadata
• Alerts and dashboards for overall data health
trending and threshold analysis
• Anomaly detection based on volume, freshness,
distribution and schema metadata
• Predictive analysis simulating human intelligence
to identify potential adverse data integrity events
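The metadata-based checks listed above (volume, freshness, schema) can be illustrated with a minimal sketch; the snapshot fields, baseline profile and thresholds are invented for illustration and do not reflect a particular tool's data model.

```python
# Hedged sketch of metadata-driven checks: volume, freshness and schema
# anomalies detected from feed metadata alone, without reading row contents.
from datetime import datetime, timedelta

def metadata_checks(snapshot, baseline, now, max_staleness_hours=24):
    """Compare a feed's latest metadata snapshot against a baseline profile."""
    alerts = []
    # Volume: row count far outside the expected band.
    lo, hi = baseline["expected_rows"]
    if not lo <= snapshot["row_count"] <= hi:
        alerts.append(f"volume: {snapshot['row_count']} outside [{lo}, {hi}]")
    # Freshness: last successful load happened too long ago.
    if now - snapshot["last_loaded"] > timedelta(hours=max_staleness_hours):
        alerts.append("freshness: feed is stale")
    # Schema: columns added or dropped relative to the baseline.
    drift = set(snapshot["columns"]) ^ set(baseline["columns"])
    if drift:
        alerts.append(f"schema: drift in columns {sorted(drift)}")
    return alerts

now = datetime(2024, 5, 2, 9, 0)
baseline = {"expected_rows": (900, 1100), "columns": ["id", "amount", "currency"]}
snapshot = {"row_count": 5400,
            "last_loaded": datetime(2024, 4, 30, 9, 0),
            "columns": ["id", "amount"]}
for alert in metadata_checks(snapshot, baseline, now):
    print(alert)
```

A real system would learn the expected band and staleness window from history rather than hard-coding them, as the bullet on anomaly detection suggests.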
“Observability is the missing piece today to give our data stewards access
to data discovery insights without having to go to IT for queries or reports”
- Jean Paul Otte, CDO, Degroof Petercam
How Data Observability Works
Intelligent Analysis Identifies Anomalies
AI identifies trends that traditional methods cannot easily find
Alerts and Impacts
Volume Alert
Impacts
Use Case Examples
Data Observability: Impact of Unexpected Values
An incorrect currency type on an order created an inflated revenue amount, which would have led to an incorrect total revenue figure. The error occurred because the currency conversion table was not updated.
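A hedged sketch of how a check like this could catch the issue: orders are converted with a rates table, and a missing rate or out-of-range result is flagged before revenue is totaled. The table contents, order values and threshold are hypothetical.

```python
# Illustrative currency-conversion check; rates, orders and the
# typical-order threshold are invented for this example.
rates_to_usd = {"USD": 1.0, "EUR": 1.08}  # "JPY" missing: table not updated

def convert_orders(orders, rates, typical_max_usd=10_000):
    """Convert orders to USD, flagging missing rates and inflated amounts."""
    ok, suspect = [], []
    for order in orders:
        rate = rates.get(order["currency"])
        if rate is None:
            suspect.append((order["id"], "no conversion rate"))
            continue
        usd = order["amount"] * rate
        if usd > typical_max_usd:
            suspect.append((order["id"], f"inflated amount {usd:.2f} USD"))
        else:
            ok.append((order["id"], usd))
    return ok, suspect

orders = [
    {"id": 1, "currency": "EUR", "amount": 250.0},
    {"id": 2, "currency": "JPY", "amount": 300_000.0},
]
ok, suspect = convert_orders(orders, rates_to_usd)
print(suspect)  # → [(2, 'no conversion rate')]
```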
Data Observability: Unexpected Data Volumes Impact Operations
A single-day spike of 500% in the dollar amount of orders occurred because the company expanded into a new geography without notifying all affected areas within the company.
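A minimal sketch of a spike check matching this scenario: compare each day's order total to a trailing average and alert when the ratio exceeds a threshold (5x here, mirroring the 500% figure). The window size and sample totals are assumptions.

```python
# Illustrative day-over-trailing-average spike detector.
# Window and threshold are assumptions, not a product's defaults.
def daily_spikes(totals, window=7, threshold=5.0):
    """Return (day index, ratio) where a day exceeds threshold x trailing avg."""
    alerts = []
    for i in range(window, len(totals)):
        trailing_avg = sum(totals[i - window:i]) / window
        if totals[i] > threshold * trailing_avg:
            alerts.append((i, totals[i] / trailing_avg))
    return alerts

# A week of steady order totals, then one day at ~6.2x the average.
totals = [10_000] * 7 + [62_000]
print(daily_spikes(totals))  # → [(7, 6.2)]
```

Flagging the ratio, not just the raw total, makes the alert meaningful across feeds of different sizes.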
Data Observability: Data Exploration through Self-Service
Understand your data assets and the changes in them. Explore critical data elements such as customer, product, etc.
How many critical data assets are complete, unique, etc.?
What kinds of inconsistencies exist in that data?
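The completeness and uniqueness questions above can be sketched as a small profiling routine; the sample customer records and column names are invented for illustration.

```python
# Illustrative per-column profiling: completeness (share of non-null
# values) and uniqueness (distinct non-null values as a share of records).
def profile(records, columns):
    """Return completeness and uniqueness ratios for each column."""
    stats = {}
    n = len(records)
    for col in columns:
        non_null = [r.get(col) for r in records if r.get(col) is not None]
        stats[col] = {
            "completeness": len(non_null) / n,
            "uniqueness": len(set(non_null)) / n,
        }
    return stats

customers = [
    {"id": 1, "email": "a@x.com"},
    {"id": 2, "email": "a@x.com"},  # duplicate email: inconsistency
    {"id": 3, "email": None},       # missing email: incomplete
]
print(profile(customers, ["id", "email"]))
```

Low uniqueness on a column that should be a key, or low completeness on a critical element, is exactly the kind of inconsistency self-service exploration surfaces.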
Use Case Recap
1. Data anomaly impacted downstream processes
2. Unexpected values caused by an invalid currency type
3. Unexpected data volumes caused by a lack of internal communication
4. Data exploration to uncover data inconsistencies
The modular, interoperable Precisely Data
Integrity Suite contains everything you need
to deliver accurate, consistent, contextual
data to your business - wherever and
whenever it’s needed.
Proactively uncover data
anomalies and take action
before they become costly
downstream issues
Questions?
Thank you
https://www.precisely.com/product/data-integrity/precisely-data-integrity-suite/data-observability
