Data Lineage
Series: Foundational Strategies for Trust in Big Data – Part 3
Webcast Audio
• Today’s webcast audio is streamed through your computer
speakers.
• If you need technical assistance with the web interface or audio,
please reach out to us using the Q&A box.
Questions Welcome
• Submit your questions at any time during the presentation using the
Q&A box.
• We will answer them during our Q&A session following the
presentation.
Recording and slides
• This webcast is being recorded. You will receive an email following
the webcast with a link to download both the recording and the
slides.
Housekeeping
Andy Reid
Director, Product Marketing
Arianna Valentini
Product Marketing Manager
What You Will Learn Today
• Review of the ingredients of successful Big Data
• What is the cost of lost data governance?
• Overcoming data lineage challenges
• How one company is using DI + DQ for lineage that
fuels their anti-money laundering requirements
• What you can do in the next 90 days to take action on
data lineage
• Wrap up with Q&A
Ingredients of Successful Big Data
1. Clear Business Case
2. Extract Data
3. Understand Data
4. Trace Lineage
Data Governance
64% of IT executives have
trouble finding and cleaning
the right data for strategic
data projects
Sierra Venture, 2020
90% of executives are concerned
about how misused data
can impact corporate
reputation
PWC, 22nd Annual Global CEO Survey, 2019
Only 2% of firms consider
themselves fully CCPA
compliant today
International Association of Privacy Professionals,
October 2019
The Cost of Lost Governance
GDPR Fines 2019: 27
$462,635,765
https://alpin.io/blog/gdpr-fines-list/, December 15, 2019
The importance of data quality
and integration in the enterprise:
• Compliance
• Decision making
• Customer centricity
• Brand reputation
• Risk Mitigation
Goals and Challenges of
Data Governance
GOALS
• Regulatory compliance
• Understand data context,
meaning
• Accuracy, completeness,
consistency, relevancy,
timeliness, validity of data
CHALLENGES
• Multi-platform, data
volume and complexity
• Diversity and consistency of
sources
• Compliance demands:
broader, deeper & evolving
Regulation Pressures Continue to Grow
Broader and deeper compliance & regulation
Volume and complexity of data are growing
May 2018 Jan 2020
Data Governance Requires a Multi-Faceted Approach
Quality Security Lineage
Why is Data Lineage
Important for Data
Governance?
• See linkages to external data sources and
targets
• Gain insight into the flow of data across
the enterprise
• Trace usage and assess the impact of
changes across the data lifecycle
• Diagnose problems faster
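One way to picture "assess the impact of changes" is as a walk over a lineage map. The sketch below is illustrative only: the dataset names and the `downstream_impact` helper are hypothetical, and a real lineage tool would likely store far richer metadata than a plain dictionary.

```python
from collections import deque

# Hypothetical lineage map: each dataset points to the datasets built from it.
LINEAGE = {
    "mainframe.accounts": ["staging.accounts"],
    "crm.customers": ["staging.customers"],
    "staging.accounts": ["lake.customer_360"],
    "staging.customers": ["lake.customer_360"],
    "lake.customer_360": ["reports.aml_alerts", "ml.fraud_features"],
}


def downstream_impact(lineage, changed):
    """Return every dataset reachable from `changed`, in breadth-first order."""
    seen = set()
    order = []
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for child in lineage.get(node, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


# Which datasets should be re-checked if the mainframe accounts feed changes?
impacted = downstream_impact(LINEAGE, "mainframe.accounts")
print(impacted)
```

The same traversal can help diagnose problems faster: starting from a suspect source, it lists every target that might have inherited the issue.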
Challenges to effective Data Lineage
• Growing regulatory requirements
• Rising data volumes, sources, and variety
• Increasing data lineage complexity
• Transitioning to new cloud deployments
Growing Regulations
• Track data from access to integration to ensure sensitive
data is being used in a compliant way
• Regardless of the data source (mainframe, IBM i, or cloud),
establish a process for lineage analysis
• See the flow of any piece of data through a job
• Consider how next-gen projects such as Machine
Learning might affect your data lineage processes
• Do you have what is needed for audits?
Data needs to meet quality levels but also be traced to original source
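For audit purposes, tracing data back to its original source can be sketched as an upstream walk over recorded job steps. This is a minimal illustration under stated assumptions: the edge list, job names, and the `original_sources` helper are all hypothetical and do not describe any specific product.

```python
# Hypothetical lineage edges: (source dataset, job name, target dataset).
EDGES = [
    ("mainframe.transactions", "cdc_replicate", "staging.transactions"),
    ("rdbms.accounts", "nightly_extract", "staging.accounts"),
    ("staging.transactions", "cleanse_and_match", "lake.aml_monitoring"),
    ("staging.accounts", "cleanse_and_match", "lake.aml_monitoring"),
]


def original_sources(edges, dataset):
    """Walk upstream from `dataset`; return datasets with no recorded parent."""
    parents = {}
    for source, _job, target in edges:
        parents.setdefault(target, []).append(source)

    roots = set()
    seen = set()
    stack = [dataset]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        upstream = parents.get(node, [])
        if not upstream:
            roots.add(node)  # nothing feeds this dataset, so it is an origin
        stack.extend(upstream)
    return sorted(roots)


sources = original_sources(EDGES, "lake.aml_monitoring")
print(sources)
```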
Rising Data Volumes, Sources, and Variety
• Consider how you will address data lineage for a growing
expanse of data
• Do the integration solutions you use today create data
lineage challenges for source data?
• Example: mainframe data to a cloud data warehouse
• Establish data lineage processes that can cover requirements
for both batch and real-time data delivery
• Do not forget data quality!
Regardless of complexities, continuous trusted data delivery is a must
Increasing Data Lineage Complexity
• Consider whether you have auditability and transparency in your
current data lineage processes
• You need full insight into the flow of data across the enterprise
• Is there a clear link to external data sources and targets?
• As data moves through its life cycle, can you clearly trace usage
and assess the impact of changes?
As the complexity of your environment grows, you must have a data
lineage map to follow data throughout the enterprise
Remember Data Lineage is also Multi-Faceted
Business Technical
The Reality is…
Cloud is Here
46% of IT professionals have said that
cloud or hybrid-cloud computing
was part of their 2019 initiatives
Data Trends for 2019, Syncsort 2019
84% of organizations have a
multi-cloud strategy
State of the Cloud 2019, Flexera
Transitioning to New Cloud Deployments
• When moving from source to cloud target, you need to pass
source-to-cluster data lineage information on
• Understand how a hybrid, multi-cloud, or full cloud deployment can
affect your data governance scalability
• Ask: How will this affect my current data lineage process?
• Consider which elements of your current DI/DQ strategy
need to adapt
Cloud deployments need to satisfy governance and compliance needs
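Passing lineage information along with the data could look something like the sketch below, where each record carries a list of the hops it has made. The `LineageEnvelope` class, its fields, and the system and job names are hypothetical, chosen only to illustrate the idea.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class LineageEnvelope:
    """A record plus the list of systems and jobs it has passed through."""

    payload: dict
    hops: list = field(default_factory=list)

    def record_hop(self, system, job):
        # Append one lineage entry; the timestamp is recorded in UTC.
        self.hops.append(
            {
                "system": system,
                "job": job,
                "at": datetime.now(timezone.utc).isoformat(),
            }
        )
        return self


# Hypothetical move from a mainframe source to a cloud warehouse target.
record = LineageEnvelope(payload={"account_id": 42, "balance": 100.0})
record.record_hop("mainframe", "extract").record_hop("cloud_warehouse", "load")

path = [hop["system"] for hop in record.hops]
print(path)
```

Keeping the hops with the record means the target side can still answer where the data came from, even when the source system is not reachable from the cloud environment.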
Global Bank
Building an AML process with DI + DQ
Goal
Meet AML transaction monitoring
and Financial Conduct Authority
(FCA) compliance
Challenges
• Data volume too large and
too widely scattered to analyze
• Disparate data sources –
Mainframe, RDBMS, Cloud,
etc.
• Maximize the value/ROI of the
data lake
Requirements
• Consolidated and clean data
• End-to-end data lineage
• Secure integrations
• Unmodified mainframe data
for archive/backup
Global Bank
Results: Data Integration Driving Improved CX
Solution
• Connect CDC
• Connect for Big Data
• Trillium for Big Data
Benefits Achieved
• High performance AML
results
• Faster time to value
• Data lake is trusted source
• Data feeding critical
machine learning-based
fraud detection
What’s Next
• Expanding to additional
Customer Engagement
solutions and applications
Looking at the Next 90 Days…
• Determine if you have an understanding of your
organizational data
• Consider how you use data lineage to support
governance today
• How will you use business lineage AND technical
lineage to ensure governance?
Questions?