Paraskevi Zerva
Cognition & Knowledge Representation Lead
p.zerva@elsevier.com
Supporting GDPR Compliance through effectively governing
Data Lineage & Data Provenance
Context
❖ Introductions
❖ Definitions
❖ EDG Metadata Governance Platform & Use Cases
❖ EDG Showcase Data Lineage
❖ GDPR in a Nutshell
❖ How effectively governing Data Lineage supports GDPR Compliance
❖ GDPR Use Case for Time Limits of Personal Data Erasure – Data Retention
❖ GDPR Policies and Compliance
❖ GDPR Compliance Use Case
Introductions
❖ WhoAmI?
▪ Paraskevi Zerva
▪ Cognition & Knowledge Representation Lead (Entellect, Elsevier)
▪ Previously worked as an Information Architect for Enterprise Data Governance at
JPMorgan Chase.
▪ PhD in “Provenance of Data for Compositions of Services”.
❖ What’s my focus?
▪ Work on the data governance strategy for Elsevier Entellect to support effective data
governance across Entellect’s software development life-cycle.
▪ Build a common representation for analysis & validation of Elsevier Entellect’s data.
▪ Consolidate data lineage & provenance information with other data assets to provide
a unified data governance ecosystem.
❖ What am I going to talk about?
▪ How effectively governing data lineage/provenance supports GDPR compliance within
the Enterprise Data Governance Platform.
Definitions
❖ Data governance:
▪ is a set of processes that ensures data assets are efficiently managed and
gives you control over, and a better understanding of, your data,
▪ ensures that data can be trusted and organizations can show accountability about
their data assets with regards to data quality, retention, data lineage etc.,
▪ describes an evolutionary process for a company setting up the processes to handle
information so that it may be utilized by the entire organization,
▪ encompasses data/metadata collection, analysis and validation of rules involving
data (e.g., business (domain) rules, standards, data quality, entitlements, SOR, etc.)
❖ Data lineage refers to capturing the sequence of data flows involving a data element. It
can be represented visually to trace the movement of a data artefact from its source to
its destination and so understand where it originates.
❖ Data provenance refers to recording the processing activities applied to data (e.g.,
through provenance loggers).
❖ GDPR is the General Data Protection Regulation.
Enterprise Data Governance
❖ Unified platform for Corporate Technology to support the efficient data governance and
metadata management.
❖ Team’s mission:
✓ Integrate CT metadata from various sources in one place in a common way (RDF), regardless
of the input format
✓ Consolidate lineage/provenance information together with other metadata.
❖ EDG ingests different formats like XML, JSON, CSV (Collect)
❖ EDG translates the data/metadata into a common language format (RDF) (Standardize)
✓ Schemas are expressed as OWL ontologies.
✓ SHACL (Shapes Constraint Language) is used for interface building
and for different user representations over the same underlying core schema.
✓ SPIN is used for transformation.
❖ We form a connected graph data structure queryable
across all internal and external reference datasets (Connect)
Collect
Standardize
Connect
Refine
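The Collect, Standardize and Connect steps can be sketched in a few lines. This is a minimal illustration only, not the EDG implementation: it uses plain Python tuples instead of an RDF library, and the `http://example.org/edg/` namespace and the helper names are assumptions made for the example.

```python
import csv
import io
import json

# Hypothetical namespace, used only for this illustration.
BASE = "http://example.org/edg/"


def record_to_triples(record_id, record):
    """Standardize: turn one flat record into (subject, predicate, object) triples."""
    subject = f"{BASE}{record_id}"
    return [(subject, f"{BASE}{key}", str(value)) for key, value in record.items()]


def collect_csv(text, id_field):
    """Collect rows from CSV text and emit triples for each row."""
    triples = []
    for row in csv.DictReader(io.StringIO(text)):
        triples.extend(record_to_triples(row[id_field], row))
    return triples


def collect_json(text, id_field):
    """Collect objects from a JSON array and emit triples for each object."""
    triples = []
    for obj in json.loads(text):
        triples.extend(record_to_triples(obj[id_field], obj))
    return triples


csv_source = "appId,appName\n35632,Payroll Application\n"
json_source = '[{"appId": "35632", "inScopeForGDPR": "YES"}]'

# Connect: both sources describe the same subject, so their triples
# merge into one graph regardless of the input format.
graph = set(collect_csv(csv_source, "appId")) | set(collect_json(json_source, "appId"))
subjects = {s for s, _, _ in graph}
```

Because both inputs resolve to the same subject identifier, the duplicate `appId` triple collapses and the graph holds one connected description of the application.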
Enterprise Data Governance Ecosystem
✓ Enterprise Metadata
✓ LDMs/PDMs
✓ Req Reports
✓ Glossaries
✓ Taxonomies
✓ Codelists
✓ External Standards
Ingestion
✓ Data Models
✓ Provenance Logging
✓ Movement
✓ Feedback
Unified Data/Metadata Hub (EDG)
Sources
People, Processes, Tools, Services, Conformed Data
✓ RACI (roles)
✓ APIs
✓ Discovery
✓ Reporting
Uses
Data Governance Business Cases
❖ Capture/manage governance requirements for the complete portfolio of CT applications.
❖ Support the software development lifecycle, compliance/regulatory requirements (GDPR).
❖ Demonstrate data lineage* (where the different data sources originated) to show
accountability for control and understanding of the data for regulatory purposes.
❖ Exhibit data provenance** (how the data is processed/transformed across the platform).
❖ Track data movement/data transfers between applications (traceability***).
❖ Provide contextual alignment with firm-wide standards, taxonomies and glossaries.
❖ Provide validation capabilities for data quality and data accuracy.
❖ Exhibit accountability with regards to entitlements (by effectively governing data provenance).
❖ Ontologies are extended to provide crosswalks between models and ecosystems so we can
answer questions such as :
✓ Which applications contain (S)PI data affected by GDPR? (Regulatory Reporting)
✓ Saudi Arabia has changed its retention policy: which applications are impacted? (Reporting)
✓ Who are the owners of particular data requirements documentation? (RACI)
* Data lineage refers to capturing the sequence of data flows involving a data element. It can be represented visually to
discover the data flow/movement from its source to its destination.
** Data provenance refers to the recording activity (through provenance loggers).
*** Traceability indicates the ability to track a data construct back to the construct it was derived from, e.g., the original
system where it was created.
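The reporting questions on this slide amount to walking links between applications, record classes and policies. The sketch below shows the idea with plain dictionaries; the identifiers, country codes and field names are invented for illustration and do not come from the EDG model.

```python
# Illustrative data only: identifiers and field names are assumptions.
apps = {
    "app-1": {"contains_pi": True, "record_classes": ["RC-A"]},
    "app-2": {"contains_pi": False, "record_classes": ["RC-B"]},
    "app-3": {"contains_pi": True, "record_classes": ["RC-A", "RC-B"]},
}
# Record class -> countries for which a retention policy exists.
policies = {"RC-A": {"SA", "GB"}, "RC-B": {"GB"}}


def apps_with_pi(apps):
    """Which applications contain personal data and so fall under GDPR reporting?"""
    return sorted(app for app, meta in apps.items() if meta["contains_pi"])


def apps_impacted_by(country, apps, policies):
    """Which applications use a record class whose policy covers `country`?"""
    impacted = {rc for rc, countries in policies.items() if country in countries}
    return sorted(
        app for app, meta in apps.items() if impacted & set(meta["record_classes"])
    )
```

In a graph store the same questions become queries over the crosswalks between ontologies; the dictionaries here only stand in for those links.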
EDG – Metadata Governance Model
❖ The diagram depicts conceptually how metadata from various sources is connected in EDG.
❖ Data Lineage flow connects the following artefacts:
➢ Business Terms (Data Dictionary Metadata)
➢ Data Requirements
➢ Logical Data Model Artefacts
➢ Physical Data Model Artefacts
➢ Application/Deployment (Technical) Metadata
❖ Data Traceability indicates the ability to
track the links from one artefact to another.
❖ Data Provenance allows tracking of:
➢ Ownership/Entitlements/Access Control Metadata
➢ Business Capability/Process Metadata
Data Lineage
Data Traceability
Data Provenance
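A lineage flow across these artefact types can be pictured as a walk over linked nodes. The following is a minimal sketch assuming a simple adjacency map; the artefact identifiers are hypothetical and the traversal is a generic breadth-first walk, not the EDG lineage engine.

```python
from collections import deque

# Each artefact maps to the artefacts it links to (hypothetical identifiers).
links = {
    "physical_column:EMP.SALARY": [
        "logical_attribute:Employee.salary",
        "application:payroll",
    ],
    "logical_attribute:Employee.salary": ["data_requirement:REQ-1"],
    "data_requirement:REQ-1": ["business_term:Salary"],
}


def trace(start, links):
    """Breadth-first walk returning every artefact reachable from `start`, in visit order."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for nxt in links.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return order


lineage = trace("physical_column:EMP.SALARY", links)
```

Starting from a physical column, the walk reaches the logical attribute, the owning application, the data requirement and finally the business term, which is the chain the lineage flow on this slide describes.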
EDG – Showcase Data Lineage
Data Assets
Logical Data Model
Physical Data Model
Physical Database Realization
Link to Technical Metadata
Logical Data Model
Logical Entity
Logical Attribute
Link to Data Requirement
Link to Standard Glossary
Physical Data Model
Physical Table
Physical Column
Data Lineage Diagram Logical/Physical Data Elements
[Diagram labels: LDM Artefacts; PDM Artefacts; PDM to LDM mapping; Mapping to Data Requirement; Mapping to Standard Glossary; Mapping to Technical Asset]
Data Lineage Logical Data Elements
[Diagram labels: LDM Artefacts; PDM to LDM mapping; Mapping to Data Requirement; Mapping to Standard Glossary]
Data Lineage Physical Data Elements
[Diagram labels: PDM Artefacts; Mapping to Technical Asset]
GDPR in a Nutshell
❖ GDPR is an EU regulation governing how organizations handle personal data. It replaces
the 1995 Data Protection Directive (implemented in the UK by the Data Protection Act 1998) and has applied since 25 May 2018.
❖ According to GDPR, personal data should be:
➢ Processed fairly and lawfully
➢ Accurate and kept up to date
➢ Kept no longer than is necessary (retention period)
➢ Processed in a secure way
❖ Controller and processor terms are used in GDPR to describe the parties involved in
processing personal data (PI).
❖ Controller: the party that decides what data is extracted, the purpose it is used for, and who is
involved in the processing.
✓ should be able to demonstrate compliance (accountability metrics).
✓ should be able to report on the purposes of processing and the categories of PI it controls.
❖ Processor: the party responsible for processing the data on behalf of the controller.
✓ should maintain records of the categories of processing activities of PI & the means
by which it is processed.
✓ should be able to report on the data transfers of personal data to a third country or an
international organization and can be held responsible for a data breach (requirement
for breach notification).
How Governing Data Lineage Supports GDPR Compliance
❖ GDPR Challenge:
➢ Records of personal data processing are required for evidencing/demonstrating compliance.
➢ Organizations are required to record every point where processing activities of personal data
take place and showcase accountability.
❖ Solution:
➢ GDPR makes data governance even more critical on the lineage aspect.
➢ Governance of data lineage lets you understand your data-flow activities and
identify and document the legal justification for each type of activity.
➢ When data lineage is represented visually it allows discovery of the data flow/movement
from its source to destination via various changes and how the data is transformed.
➢ On top of that, the GDPR requires evidence of records of personal data processing, which
implies the need for Data Provenance.
➢ Data Provenance refers to the recording activity of how the data were derived/generated
and processed. It allows you to verify that the process and steps used to obtain a result comply
with a set of given requirements.
➢ In our business case the given requirements are GDPR regulatory requirements therefore
data lineage and provenance become the tools to showcase accountability with regards to
GDPR compliance.
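As a rough illustration of what a provenance logger does, the sketch below records each processing activity together with what it used, what it generated and who ran it, then checks the log against one requirement: every activity needs a documented legal justification. The class, field names and example activities are assumptions for illustration only.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ProvenanceLog:
    """In-memory provenance log; a real logger would persist these records."""

    records: list = field(default_factory=list)

    def record(self, activity, used, generated, agent, legal_basis=None):
        # One entry per processing activity: inputs, outputs, actor, justification.
        self.records.append(
            {
                "activity": activity,
                "used": used,
                "generated": generated,
                "agent": agent,
                "legal_basis": legal_basis,
                "at": datetime.now(timezone.utc).isoformat(),
            }
        )

    def violations(self):
        """Return the activities that lack a documented legal justification."""
        return [r["activity"] for r in self.records if not r["legal_basis"]]


log = ProvenanceLog()
log.record("load_payroll", "hr_export.csv", "payroll_table", "etl-service", "contract")
log.record("export_report", "payroll_table", "report.pdf", "reporting-service")
```

Here `log.violations()` flags `export_report`, which is the kind of gap a compliance report would need to surface.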
GDPR Governance Use Case
❖ GDPR Article 30 Data Requirement
➢ Provide time limits for erasure of the different categories of data required per record
retention policy.
❖ The regulatory requirement is translated into a report and accountability
metrics that:
➢ Return the applications in scope of GDPR for Corporate Technology.
➢ Return the record class codes of in-scope applications based on the record retention
policies per country.
➢ Notify application owners when record retention policies change and
verify that the changes comply with GDPR regulatory requirements.
* Record class codes are used to determine how long to keep each record for each jurisdiction.
** A record class code (RCC) is a category used to group similar types of records in JPMC’s master record retention schedule.
*** Record retention requirements are categorized by record class code, by country and in some cases by the business function
of the record.
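The notification step of this use case can be sketched as a diff between two versions of the retention schedule followed by a lookup of affected owners. All identifiers, periods and owner names below are invented for the example; how EDG actually detects changes is not described on the slide.

```python
def changed_policies(old, new):
    """Return the (record class code, country) keys whose retention period differs."""
    return sorted(key for key in new if old.get(key) != new[key])


def owners_to_notify(changed, app_classes, app_owners):
    """Return owners of applications that use any record class with a changed policy."""
    changed_classes = {rcc for rcc, _country in changed}
    return sorted(
        {
            app_owners[app]
            for app, rccs in app_classes.items()
            if changed_classes & set(rccs)
        }
    )


# Retention period in years, keyed by (record class code, country). Illustrative values.
old_schedule = {("RCC-1", "GB"): 6, ("RCC-2", "GB"): 3}
new_schedule = {("RCC-1", "GB"): 7, ("RCC-2", "GB"): 3}
app_classes = {"app-1": ["RCC-1"], "app-2": ["RCC-2"]}
app_owners = {"app-1": "owner-a", "app-2": "owner-b"}

changed = changed_policies(old_schedule, new_schedule)
notify = owners_to_notify(changed, app_classes, app_owners)
```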
Retention Conceptual Model
ia.jpmc:SEAL_103249 a edg:BusinessApplication
ia.jpmc.gov:GRM_FUN1030 a ia.jpmc:RecordRetentionClass
ia.jpmc:applicationRecordRetentionClass
ia.jpmc.gov:GRM_FUN10300-AE a ia.jpmc:dataRetentionPolicy
ia.jpmc:dataRetentionPolicy
SOR: SEAL
SOR: Retention Manager
Record Retention Policy Ontology
Technical Standard Ontology
GDPR In Scope Business Application
GDPR compliance information
Provenance information
GDPR in Scope – contains pi
OBK1060 | Payroll Services (GRM)
PAY100 | Employee Compensation Contribution (GRM)
record retention class
Record Retention Class
Record Retention Code
Data Retention Record Class
[Screenshot label: data retention policy]
Record Retention Policy
[Screenshot fields: country; retention period; retention period unit; retention event; disposition]
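The policy fields shown here (country, retention period, period unit, retention event, disposition) are enough to derive a time limit for erasure. The sketch below is one possible reading: it assumes the period is counted from the date of the retention event and that the unit is either years or months. The field values in the example are illustrative, not taken from a real schedule.

```python
import calendar
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class RetentionPolicy:
    country: str
    retention_period: int
    retention_period_unit: str  # this sketch accepts "years" or "months"
    retention_event: str
    disposition: str


def erasure_due(policy: RetentionPolicy, event_date: date) -> date:
    """Return the date on which the retention period ends, counted from the event date."""
    if policy.retention_period_unit == "years":
        months = policy.retention_period * 12
    elif policy.retention_period_unit == "months":
        months = policy.retention_period
    else:
        raise ValueError(f"unsupported unit: {policy.retention_period_unit}")
    total = event_date.year * 12 + (event_date.month - 1) + months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    # Clamp the day for shorter target months (e.g. 31 January plus one month).
    day = min(event_date.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


policy = RetentionPolicy("GB", 6, "years", "end of employment", "delete")
due = erasure_due(policy, date(2018, 5, 25))
```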
EDG – RCC Code Diagram
[Diagram label: data retention policy]
EDG – RCC Policy Diagram
[Diagram fields: country; disposition; retention period unit; retention period; retention event]
Query RCC Data for GDPR in scope Apps
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix edg: <http://www.edg.topbraid.solutions/model/>
prefix ia.jpmc: <http://ia.jpmc.com/dg/>
SELECT DISTINCT ?appId ?appName ?gdprScope ?lob ?rccClassCode ?rccLabel ?rccPolicy
FROM <urn:x-evn-master:seal>
FROM <urn:x-evn-master:grm>
WHERE
{
{
# Business applications flagged as in scope for GDPR in the CT line of business.
?app a edg:BusinessApplication .
?app edg:name ?appName .
?app edg:identifier ?appId .
?app ia.jpmc:inScopeForGDPR ?gdprScope .
?app ia.jpmc:lineOfBusiness ?lob .
FILTER regex(?gdprScope, "YES")
FILTER regex(?lob, "CT")
# The patterns that bind ?rccClassCode, ?rccLabel and ?rccPolicy are not shown on the slide.
}
}
appId | appName | gdprScope | lob | rccClassCode | rccPolicy
35632 | Payroll Application | YES | CT | GRM_AUD_PAY_1060 | Payroll Services
38537 | KPMG Link – Global Business Travel | YES | CT | GRM_AUD_PAY_1080 | Payroll Accounting
SPIN Rules (1) - Inferencing
# STEP 401: Create Record Classes
CONSTRUCT
{
?recordClassCodeU a ia.jpmc:DataRetentionRecordClass .
?recordClassCodeU rdfs:label ?rccLabel .
?recordClassCodeU edg:identifier ?rccClassCode .
?recordClassCodeU edg:name ?rccName .
?recordClassCodeU edg:description ?rccDescription .
?recordClassCodeU ia.jpmc.go:dataRetentionPolicy ?rccPolicyU .
}
WHERE
{
?this a RetentionExport:RetentionExport .
# Read the record class fields from the export row.
BIND (spl:object(?this, RetentionExport:recordClassCode) AS ?rccClassCode) .
BIND (spl:object(?this, RetentionExport:country) AS ?country) .
BIND (spl:object(?this, RetentionExport:countryCode) AS ?countryCode) .
BIND (spl:object(?this, RetentionExport:recordClassName) AS ?rccName) .
BIND (spl:object(?this, RetentionExport:recordClassDescription) AS ?rccDescription) .
BIND (ia.jpmc:BuildDataRetentionPolicyClassURI(?rccClassCode) AS ?recordClassCodeU) .
# Match the export's country code against the reference country data.
?countryU country:countryId ?RDICountryCode .
BIND (str(?RDICountryCode) AS ?cntryCodeLabel) .
FILTER (?countryCode = ?cntryCodeLabel) .
BIND (fn:concat(?rccClassCode, "|", ?rccName, "(GRM)") AS ?rccLabel) .
BIND (ia.jpmc:BuildDataRetentionPolicyRecordURI(?rccClassCode, ?RDICountryCode) AS ?rccPolicyU) .
}
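For readers less familiar with SPIN, the effect of the rule above on a single export row can be approximated in plain Python. This is only an analogy: the URI layout and the `base` namespace are assumptions, since the slide does not show how `BuildDataRetentionPolicyClassURI` or `BuildDataRetentionPolicyRecordURI` construct their results, and the sample row is invented.

```python
def build_record_class(row, base="http://example.org/retention/"):
    """Map one retention export row to the properties the rule constructs.

    The URI patterns are hypothetical stand-ins for the rule's URI-building functions.
    """
    code = row["recordClassCode"]
    name = row["recordClassName"]
    return {
        "uri": f"{base}class/{code}",
        "type": "DataRetentionRecordClass",
        "identifier": code,
        "name": name,
        "description": row["recordClassDescription"],
        # Same concatenation as the rule: code, "|", name, "(GRM)".
        "label": f"{code}|{name}(GRM)",
        "dataRetentionPolicy": f"{base}policy/{code}-{row['countryCode']}",
    }


sample_row = {
    "recordClassCode": "RCC-1",
    "recordClassName": "Example Records",
    "recordClassDescription": "Illustrative record class",
    "countryCode": "GB",
}
record_class = build_record_class(sample_row)
```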
SPIN Rules (2) - Inferencing
# STEP 402: Create Record Retention Policy
CONSTRUCT
{
?rccPolicyU a edg:DataRetentionPolicy .
?rccPolicyU rdfs:label ?policyLabel .
?rccPolicyU edg:identifier ?policyIdentifier .
?rccPolicyU edg:name ?policyName .
?rccPolicyU ia.jpmc.go:retentionDisposition ?disposition .
?rccPolicyU ia.jpmc.go:retentionEvent ?retentionEvent .
?rccPolicyU ia.jpmc.go:retentionPeriod ?retentionPeriod .
?rccPolicyU ia.jpmc.go:retentionPeriodUnit ?retentionPeriodUnit .
?rccPolicyU edg:country ?country .
}
WHERE
{
?this a RetentionExport:RetentionExport .
# Read the policy fields from the export row.
BIND (spl:object(?this, RetentionExport:policyIdentifier) AS ?policyIdentifier) .
BIND (spl:object(?this, RetentionExport:retentionDisposition) AS ?disposition) .
BIND (spl:object(?this, RetentionExport:retentionEvent) AS ?retentionEvent) .
BIND (spl:object(?this, RetentionExport:retentionPeriod) AS ?retentionPeriod) .
BIND (spl:object(?this, RetentionExport:retentionPeriodUnit) AS ?retentionPeriodUnit) .
BIND (spl:object(?this, RetentionExport:country) AS ?country) .
# The next line is truncated on the slide, so it is left commented out as it appears:
# BIND (ia.jpmc:BuildDataRetentionPolicyClassURI (?recordClassCode) AS
# Note: the slide does not show where ?policyName is bound.
BIND (fn:concat(?policyIdentifier, "|", ?policyName, "(GRM)") AS ?policyLabel) .
BIND (ia.jpmc:BuildDataRetentionPolicyURI(?policyIdentifier, ?country, ?policyName) AS ?rccPolicyU) .
}
SHACL Property Constraint
GDPR POLICY & COMPLIANCE
GDPR POLICIES & COMPLIANCE
❖ Policies define guidelines for handling and implementing specific security or
regulatory issues.
❖ Focusing on the policy requirements for data protection, we have built a
policy/compliance model:
➢ Aimed at validating GDPR compliance for the compliance objects under a policy target.
➢ Showcasing accountability with regards to GDPR policy requirements.
Objectives
❖ Bridge the gap between GDPR legislation obligations and operational level
technology controls using semantic modelling to model the critical policy and
compliance aspects.
❖ Use inferencing to preserve accountability of processing activities that
handle PI data subject to regulatory compliance.
GDPR RETENTION COMPLIANCE
[Diagram nodes: edg:Policy; edg:DataPolicy; edg:DataRetentionPolicy; edg:ComplianceAspect; edg:PolicyRequirement; edg:RequirementAsset; edg:GDPRRegulatoryRequirement; ia.jpmc:DataRetentionRecordClass; ia.jpmc:Country; ia.jpmc:BusinessFunction]
[Diagram edges: rdfs:subclass; rdf:type; edg:compliesWith; edg:hasRequirement; ia.jpmc:dataRetentionPolicy; ia.jpmc:categorizedByCountry; ia.jpmc:categorizedByBusinessFunction]
GDPR REGULATORY COMPLIANCE EXAMPLE
Thanks for attending ☺
Q & A
Name: Paraskevi Zerva
Email: p.zerva@elsevier.com
Linkedin: linkedin.com/in/paraskevizerva_Profile_URL
BACKUP SLIDES
Extending Data Lineage to Data Provenance
[Diagram nodes: prov:Activity; Non-Linear Activity; Linear Activity; Workflow; Pipeline; Software Program; Software Program Execution; Software Program Computation; prov:Agent; prov:Server; prov:Entity; Processable]
[Diagram edges: rdfs:subclass; composedOf/contains; computationOf; executionOf; runsOn; input (subproperty of prov:wasUsed); output (subproperty of prov:wasGeneratedBy); prov:wasDerivedFrom]

More Related Content

What's hot

GraphQL and its schema as a universal layer for database access
GraphQL and its schema as a universal layer for database accessGraphQL and its schema as a universal layer for database access
GraphQL and its schema as a universal layer for database accessConnected Data World
 
Using a Semantic and Graph-based Data Catalog in a Modern Data Fabric
Using a Semantic and Graph-based Data Catalog in a Modern Data FabricUsing a Semantic and Graph-based Data Catalog in a Modern Data Fabric
Using a Semantic and Graph-based Data Catalog in a Modern Data FabricCambridge Semantics
 
Knowledge Graph Discussion: Foundational Capability for Data Fabric, Data Int...
Knowledge Graph Discussion: Foundational Capability for Data Fabric, Data Int...Knowledge Graph Discussion: Foundational Capability for Data Fabric, Data Int...
Knowledge Graph Discussion: Foundational Capability for Data Fabric, Data Int...Cambridge Semantics
 
GraphDB Cloud: Enterprise Ready RDF Database on Demand
GraphDB Cloud: Enterprise Ready RDF Database on DemandGraphDB Cloud: Enterprise Ready RDF Database on Demand
GraphDB Cloud: Enterprise Ready RDF Database on DemandOntotext
 
Solving the Disconnected Data Problem in Healthcare Using MongoDB
Solving the Disconnected Data Problem in Healthcare Using MongoDBSolving the Disconnected Data Problem in Healthcare Using MongoDB
Solving the Disconnected Data Problem in Healthcare Using MongoDBMongoDB
 
Modern Data Discovery and Integration in Retail Banking
Modern Data Discovery and Integration in Retail BankingModern Data Discovery and Integration in Retail Banking
Modern Data Discovery and Integration in Retail BankingCambridge Semantics
 
How Graphs Continue to Revolutionize The Prevention of Financial Crime & Frau...
How Graphs Continue to Revolutionize The Prevention of Financial Crime & Frau...How Graphs Continue to Revolutionize The Prevention of Financial Crime & Frau...
How Graphs Continue to Revolutionize The Prevention of Financial Crime & Frau...Connected Data World
 
Building Knowledge Graphs in 10 steps
Building Knowledge Graphs in 10 stepsBuilding Knowledge Graphs in 10 steps
Building Knowledge Graphs in 10 stepsOntotext
 
Knowledge Graph for Machine Learning and Data Science
Knowledge Graph for Machine Learning and Data ScienceKnowledge Graph for Machine Learning and Data Science
Knowledge Graph for Machine Learning and Data ScienceCambridge Semantics
 
From Knowledge Graphs to AI-powered SEO: Using taxonomies, schemas and knowle...
From Knowledge Graphs to AI-powered SEO: Using taxonomies, schemas and knowle...From Knowledge Graphs to AI-powered SEO: Using taxonomies, schemas and knowle...
From Knowledge Graphs to AI-powered SEO: Using taxonomies, schemas and knowle...Connected Data World
 
Applying Data Engineering and Semantic Standards to Tame the "Perfect Storm" ...
Applying Data Engineering and Semantic Standards to Tame the "Perfect Storm" ...Applying Data Engineering and Semantic Standards to Tame the "Perfect Storm" ...
Applying Data Engineering and Semantic Standards to Tame the "Perfect Storm" ...Cambridge Semantics
 
Vital AI: Big Data Modeling
Vital AI: Big Data ModelingVital AI: Big Data Modeling
Vital AI: Big Data ModelingVital.AI
 
Using Semantic Technology to Drive Agile Analytics - SLIDES
Using Semantic Technology to Drive Agile Analytics - SLIDESUsing Semantic Technology to Drive Agile Analytics - SLIDES
Using Semantic Technology to Drive Agile Analytics - SLIDESDATAVERSITY
 
Graph-based Discovery and Analytics at Enterprise Scale
Graph-based Discovery and Analytics at Enterprise ScaleGraph-based Discovery and Analytics at Enterprise Scale
Graph-based Discovery and Analytics at Enterprise ScaleCambridge Semantics
 
Power of the Run Graph
Power of the Run GraphPower of the Run Graph
Power of the Run GraphVaticle
 
Using the Semantic Web Stack to Make Big Data Smarter
Using the Semantic Web Stack to Make  Big Data SmarterUsing the Semantic Web Stack to Make  Big Data Smarter
Using the Semantic Web Stack to Make Big Data SmarterMatheus Mota
 
Optimizing the
 Data Supply Chain
 for Data Science
Optimizing the
 Data Supply Chain
 for Data ScienceOptimizing the
 Data Supply Chain
 for Data Science
Optimizing the
 Data Supply Chain
 for Data ScienceVital.AI
 
Scalability and Graph Analytics with Neo4j - Stefan Kolmar, Neo4j
Scalability and Graph Analytics with Neo4j - Stefan Kolmar, Neo4jScalability and Graph Analytics with Neo4j - Stefan Kolmar, Neo4j
Scalability and Graph Analytics with Neo4j - Stefan Kolmar, Neo4jNeo4j
 

What's hot (20)

GraphQL and its schema as a universal layer for database access
GraphQL and its schema as a universal layer for database accessGraphQL and its schema as a universal layer for database access
GraphQL and its schema as a universal layer for database access
 
Using a Semantic and Graph-based Data Catalog in a Modern Data Fabric
Using a Semantic and Graph-based Data Catalog in a Modern Data FabricUsing a Semantic and Graph-based Data Catalog in a Modern Data Fabric
Using a Semantic and Graph-based Data Catalog in a Modern Data Fabric
 
Knowledge Graph Discussion: Foundational Capability for Data Fabric, Data Int...
Knowledge Graph Discussion: Foundational Capability for Data Fabric, Data Int...Knowledge Graph Discussion: Foundational Capability for Data Fabric, Data Int...
Knowledge Graph Discussion: Foundational Capability for Data Fabric, Data Int...
 
GraphDB Cloud: Enterprise Ready RDF Database on Demand
GraphDB Cloud: Enterprise Ready RDF Database on DemandGraphDB Cloud: Enterprise Ready RDF Database on Demand
GraphDB Cloud: Enterprise Ready RDF Database on Demand
 
Solving the Disconnected Data Problem in Healthcare Using MongoDB
Solving the Disconnected Data Problem in Healthcare Using MongoDBSolving the Disconnected Data Problem in Healthcare Using MongoDB
Solving the Disconnected Data Problem in Healthcare Using MongoDB
 
Tara Raafat
Tara RaafatTara Raafat
Tara Raafat
 
Modern Data Discovery and Integration in Retail Banking
Modern Data Discovery and Integration in Retail BankingModern Data Discovery and Integration in Retail Banking
Modern Data Discovery and Integration in Retail Banking
 
How Graphs Continue to Revolutionize The Prevention of Financial Crime & Frau...
How Graphs Continue to Revolutionize The Prevention of Financial Crime & Frau...How Graphs Continue to Revolutionize The Prevention of Financial Crime & Frau...
How Graphs Continue to Revolutionize The Prevention of Financial Crime & Frau...
 
Building Knowledge Graphs in 10 steps
Building Knowledge Graphs in 10 stepsBuilding Knowledge Graphs in 10 steps
Building Knowledge Graphs in 10 steps
 
Knowledge Graph for Machine Learning and Data Science
Knowledge Graph for Machine Learning and Data ScienceKnowledge Graph for Machine Learning and Data Science
Knowledge Graph for Machine Learning and Data Science
 
From Knowledge Graphs to AI-powered SEO: Using taxonomies, schemas and knowle...
From Knowledge Graphs to AI-powered SEO: Using taxonomies, schemas and knowle...From Knowledge Graphs to AI-powered SEO: Using taxonomies, schemas and knowle...
From Knowledge Graphs to AI-powered SEO: Using taxonomies, schemas and knowle...
 
Applying Data Engineering and Semantic Standards to Tame the "Perfect Storm" ...
Applying Data Engineering and Semantic Standards to Tame the "Perfect Storm" ...Applying Data Engineering and Semantic Standards to Tame the "Perfect Storm" ...
Applying Data Engineering and Semantic Standards to Tame the "Perfect Storm" ...
 
Vital AI: Big Data Modeling
Vital AI: Big Data ModelingVital AI: Big Data Modeling
Vital AI: Big Data Modeling
 
Sebastian Hellmann
Sebastian HellmannSebastian Hellmann
Sebastian Hellmann
 
Using Semantic Technology to Drive Agile Analytics - SLIDES
Using Semantic Technology to Drive Agile Analytics - SLIDESUsing Semantic Technology to Drive Agile Analytics - SLIDES
Using Semantic Technology to Drive Agile Analytics - SLIDES
 
Graph-based Discovery and Analytics at Enterprise Scale
Graph-based Discovery and Analytics at Enterprise ScaleGraph-based Discovery and Analytics at Enterprise Scale
Graph-based Discovery and Analytics at Enterprise Scale
 
Power of the Run Graph
Power of the Run GraphPower of the Run Graph
Power of the Run Graph
 
Using the Semantic Web Stack to Make Big Data Smarter
Using the Semantic Web Stack to Make  Big Data SmarterUsing the Semantic Web Stack to Make  Big Data Smarter
Using the Semantic Web Stack to Make Big Data Smarter
 
Optimizing the
 Data Supply Chain
 for Data Science
Optimizing the
 Data Supply Chain
 for Data ScienceOptimizing the
 Data Supply Chain
 for Data Science
Optimizing the
 Data Supply Chain
 for Data Science
 
Scalability and Graph Analytics with Neo4j - Stefan Kolmar, Neo4j
Scalability and Graph Analytics with Neo4j - Stefan Kolmar, Neo4jScalability and Graph Analytics with Neo4j - Stefan Kolmar, Neo4j
Scalability and Graph Analytics with Neo4j - Stefan Kolmar, Neo4j
 

Similar to Supporting GDPR Compliance through effectively governing Data Lineage and Data Provenance

Henninger_MakingReferenceDataMoreMeaningful-Final
Henninger_MakingReferenceDataMoreMeaningful-FinalHenninger_MakingReferenceDataMoreMeaningful-Final
Henninger_MakingReferenceDataMoreMeaningful-FinalScott Henninger
 
RFT for Business Intelligence and Data Strategy
RFT for Business Intelligence and Data StrategyRFT for Business Intelligence and Data Strategy
RFT for Business Intelligence and Data StrategySustainableEnergyAut
 
Intro to big data and applications -day 3
Intro to big data and applications -day 3Intro to big data and applications -day 3
Intro to big data and applications -day 3Parviz Vakili
 
Credit Suisse, Reference Data Management on a Global Scale
Credit Suisse, Reference Data Management on a Global ScaleCredit Suisse, Reference Data Management on a Global Scale
Credit Suisse, Reference Data Management on a Global ScaleOrchestra Networks
 
Who changed my data? Need for data governance and provenance in a streaming w...
Who changed my data? Need for data governance and provenance in a streaming w...Who changed my data? Need for data governance and provenance in a streaming w...
Who changed my data? Need for data governance and provenance in a streaming w...DataWorks Summit
 
Implementar una estrategia eficiente de gobierno y seguridad del dato con la ...
Implementar una estrategia eficiente de gobierno y seguridad del dato con la ...Implementar una estrategia eficiente de gobierno y seguridad del dato con la ...
Implementar una estrategia eficiente de gobierno y seguridad del dato con la ...Denodo
 
GDPR Noncompliance: Avoid the Risk with Data Virtualization
GDPR Noncompliance: Avoid the Risk with Data VirtualizationGDPR Noncompliance: Avoid the Risk with Data Virtualization
GDPR Noncompliance: Avoid the Risk with Data VirtualizationDenodo
 
Information architecture overview
Information architecture overviewInformation architecture overview
Information architecture overviewJames M. Dey
 
¿En qué se parece el Gobierno del Dato a un parque de atracciones?
¿En qué se parece el Gobierno del Dato a un parque de atracciones?¿En qué se parece el Gobierno del Dato a un parque de atracciones?
¿En qué se parece el Gobierno del Dato a un parque de atracciones?Denodo
 
Introduction to data interoperability across the data value chain.pdf
Introduction to data interoperability across the data value chain.pdfIntroduction to data interoperability across the data value chain.pdf
Introduction to data interoperability across the data value chain.pdfAhmedHany Sayed
 
Michael Josephs
Michael JosephsMichael Josephs
Michael JosephsdaveGBE
 
Big Data Analytics Architecture PowerPoint Presentation Slides
Big Data Analytics Architecture PowerPoint Presentation SlidesBig Data Analytics Architecture PowerPoint Presentation Slides
Big Data Analytics Architecture PowerPoint Presentation SlidesSlideTeam
 
data collection, data integration, data management, data modeling.pptx
data collection, data integration, data management, data modeling.pptxdata collection, data integration, data management, data modeling.pptx
data collection, data integration, data management, data modeling.pptxSourabhkumar729579
 
General Data Protection Regulation (GDPR) Implications for Canadian Firms
General Data Protection Regulation (GDPR) Implications for Canadian FirmsGeneral Data Protection Regulation (GDPR) Implications for Canadian Firms
General Data Protection Regulation (GDPR) Implications for Canadian Firmsaccenture
 
CXAIR for Data Migration
CXAIR for Data MigrationCXAIR for Data Migration
CXAIR for Data MigrationConnexica
 
Building the enterprise data architecture
Building the enterprise data architectureBuilding the enterprise data architecture
Building the enterprise data architectureCosta Pissaris
 
Data privacy and security in uae
Data privacy and security in uaeData privacy and security in uae
Data privacy and security in uaeRishalHalid1
 
Data Science Operationalization: The Journey of Enterprise AI
Data Science Operationalization: The Journey of Enterprise AIData Science Operationalization: The Journey of Enterprise AI
Data Science Operationalization: The Journey of Enterprise AIDenodo
 

Similar to Supporting GDPR Compliance through effectively governing Data Lineage and Data Provenance (20)

Henninger_MakingReferenceDataMoreMeaningful-Final
Henninger_MakingReferenceDataMoreMeaningful-FinalHenninger_MakingReferenceDataMoreMeaningful-Final
Henninger_MakingReferenceDataMoreMeaningful-Final
 
RFT for Business Intelligence and Data Strategy
RFT for Business Intelligence and Data StrategyRFT for Business Intelligence and Data Strategy
RFT for Business Intelligence and Data Strategy
 
Intro to big data and applications -day 3
Intro to big data and applications -day 3Intro to big data and applications -day 3
Intro to big data and applications -day 3
 
Credit Suisse, Reference Data Management on a Global Scale





Supporting GDPR Compliance through effectively governing Data Lineage and Data Provenance

  • 1. Paraskevi Zerva Cognition & Knowledge Representation Lead p.zerva@elsevier.com Supporting GDPR Compliance through effectively governing Data Lineage & Data Provenance
  • 2. Context
    ❖ Introductions
    ❖ Definitions
    ❖ EDG Metadata Governance Platform & Use Cases
    ❖ EDG Showcase Data Lineage
    ❖ GDPR in a Nutshell
    ❖ How governing effectively Data Lineage supports GDPR Compliance
    ❖ GDPR Use Case for Time Limits of Personal Data Erasure – Data Retention
    ❖ GDPR Policies and Compliance
    ❖ GDPR Compliance Use Case
  • 3. Introductions
    ❖ Who am I?
      ▪ Paraskevi Zerva
      ▪ Cognition & Knowledge Representation Lead (Entellect, Elsevier)
      ▪ Previously worked as an Information Architect for Enterprise Data Governance at JP Morgan & Chase.
      ▪ PhD in "Provenance of Data for Compositions of Services".
    ❖ What's my focus?
      ▪ Work on the data governance strategy for Elsevier Entellect to support effective data governance across Entellect's software development life-cycle.
      ▪ Build a common representation for analysis & validation of Elsevier Entellect's data.
      ▪ Consolidate data lineage & provenance information with other data assets to provide a unified data governance ecosystem.
    ❖ What am I going to talk about?
      ▪ How effectively governing data lineage/provenance supports GDPR compliance within the Enterprise Data Governance Platform.
  • 4. Definitions
    ❖ Data governance:
      ▪ is a set of processes that ensures data assets are managed efficiently and gives an organization control over, and a better understanding of, its data,
      ▪ ensures that data can be trusted and that organizations can show accountability for their data assets with regard to data quality, retention, data lineage, etc.,
      ▪ describes an evolutionary process in which a company sets up the processes to handle information so that it can be used by the entire organization,
      ▪ encompasses data/metadata collection, analysis and validation of rules involving data (e.g., business (domain) rules, standards, data quality, entitlements, SOR, etc.)
    ❖ Data lineage refers to capturing the sequence of data flows involving a data element. It can be represented visually to show the movement of a data artefact from its source to its destination and so reveal where it originates.
    ❖ Data provenance refers to recording the processing activities applied to data (e.g., through provenance loggers).
    ❖ GDPR is the General Data Protection Regulation.
  • 5. Enterprise Data Governance
    ❖ Unified platform for Corporate Technology (CT) to support efficient data governance and metadata management.
    ❖ Team's mission:
      ✓ Integrate CT metadata from various sources in one place in a common way (RDF), regardless of the input format.
      ✓ Consolidate lineage/provenance information together with other metadata.
    ❖ EDG ingests different formats such as XML, JSON and CSV (Collect).
    ❖ EDG translates the data/metadata into a common language format, RDF (Standardize).
      ✓ Schemas are expressed as OWL ontologies.
      ✓ SHACL (Shapes Constraint Language) is used for interface building and for giving different users their own representation over the same underlying core schema.
      ✓ SPIN is used for transformation.
    ❖ We form a connected graph data structure that is queryable across all internal and external reference datasets (Connect).
    Stages shown on the slide: Collect, Standardize, Connect, Refine.
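The Collect / Standardize / Connect stages above can be illustrated with a minimal sketch. It is not EDG's implementation: plain Python tuples stand in for RDF triples, and the `urn:example:` subject prefix, the function names and the sample records are all assumptions made for the illustration.

```python
import csv
import io
import json

def json_to_triples(text, id_key):
    """Collect + Standardize: flat JSON objects become (subject, predicate, object) tuples."""
    triples = []
    for record in json.loads(text):
        subject = f"urn:example:{record[id_key]}"  # hypothetical URI scheme
        for key, value in record.items():
            if key != id_key:
                triples.append((subject, key, str(value)))
    return triples

def csv_to_triples(text, id_key):
    """Same target shape as json_to_triples, but reading CSV rows."""
    triples = []
    for row in csv.DictReader(io.StringIO(text)):
        subject = f"urn:example:{row[id_key]}"
        for key, value in row.items():
            if key != id_key:
                triples.append((subject, key, value))
    return triples

# Two sources in different formats describing the same (made-up) application.
json_source = '[{"id": "app1", "name": "Payroll Application"}]'
csv_source = "id,owner\napp1,Finance\n"

# Connect: once both sources share one shape, they merge into a single queryable set.
graph = set(json_to_triples(json_source, "id")) | set(csv_to_triples(csv_source, "id"))
about_app1 = sorted((p, o) for s, p, o in graph if s == "urn:example:app1")
print(about_app1)
```

The point of the sketch is only that, after standardization, a question about one subject can be answered across sources without knowing which input format each fact came from.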
  • 6. Enterprise Data Governance Ecosystem ✓ Enterprise Metadata ✓ LDMs/PDMs ✓ Req Reports ✓ Glossaries ✓ Taxonomies ✓ Codelists ✓ External Standards Ingestion ✓ Data Models ✓ Provenance Logging ✓ Movement ✓ Feedback Unified Data/Metadata Hub (EDG) Sources People, Processes, Tools, Services, Conformed Data ✓ RACI (roles) ✓ APIs ✓ Discovery ✓ Reporting Uses
  • 7. Data Governance Business Cases
    ❖ Capture/manage governance requirements for the complete portfolio of CT applications.
    ❖ Support the software development lifecycle and compliance/regulatory requirements (GDPR).
    ❖ Demonstrate Data Lineage*, i.e. where the different data sources originated, to show accountability for control and understanding of the data for regulatory purposes.
    ❖ Exhibit Data Provenance** of how the data is processed/transformed across the platform.
    ❖ Track data movement/data transfers between applications (Traceability***).
    ❖ Provide contextual alignment with firm-wide standards, taxonomies and glossaries.
    ❖ Provide validation capabilities for data quality and data accuracy.
    ❖ Exhibit accountability with regard to entitlements (by effectively governing data provenance).
    ❖ Ontologies are extended to provide crosswalks between models and ecosystems so we can answer questions such as:
      ✓ Which applications contain (S)PI data affected by GDPR? (Regulatory Reporting)
      ✓ S. Arabia has changed its retention policy. Which applications are impacted? (Reporting)
      ✓ Who are the owners of particular data requirements documentation? (RACI)
    * Data lineage refers to capturing the sequence of data flows involving a data element. It can be represented visually to show the data flow/movement from its source to its destination.
    ** Data provenance refers to the recording activity (through provenance loggers).
    *** Traceability indicates the ability to track a data construct back to the construct it was derived from, e.g., the original system where it was created.
  • 8. EDG – Metadata Governance Model
    ❖ The diagram depicts how, conceptually, metadata from various sources is connected in EDG.
    ❖ The Data Lineage flow connects the following artefacts:
      ➢ Business Terms (Data Dictionary Metadata)
      ➢ Data Requirements
      ➢ Logical Data Model Artefacts
      ➢ Physical Data Model Artefacts
      ➢ Application/Deployment (Technical) Metadata
    ❖ Data Traceability indicates the ability to track the links from one artefact to another.
    ❖ Data Provenance allows tracking of:
      ➢ Ownership/Entitlements/Access Control Metadata
      ➢ Business Capability/Process Metadata
    Diagram labels: Data Lineage, Data Traceability, Data Provenance.
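As a rough illustration of the lineage flow described above, the sketch below walks from a physical data element up to the logical element, data requirement and business term it maps to. The artefact identifiers and the `LINEAGE` mapping are invented for the example; EDG stores these links in an RDF graph, not in a Python dictionary.

```python
from collections import deque

# Hypothetical lineage edges: each artefact points to the artefacts it maps to.
LINEAGE = {
    "pdm:EMP_TBL.SALARY": ["ldm:Employee.salary"],
    "ldm:Employee.salary": ["req:PayrollReporting", "term:Salary"],
    "req:PayrollReporting": [],
    "term:Salary": [],
}

def trace(start, edges):
    """Breadth-first walk returning every artefact reachable from start, in visit order."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for nxt in edges.get(node, []):
            if nxt not in seen:  # guard against cycles and repeated mappings
                seen.add(nxt)
                queue.append(nxt)
    return order

path = trace("pdm:EMP_TBL.SALARY", LINEAGE)
print(path)
```

Starting the walk from a physical column answers "which requirement and glossary term does this column serve?"; reversing the edges would answer the opposite question.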
  • 9. EDG – Showcase Data Lineage
  • 14. Link to Technical Metadata
  • 18. Link to Data Requirement
  • 19. Link to Standard Glossary
  • 23. Data Lineage Diagram: Logical/Physical Data Elements (diagram labels: Mapping to Technical Asset; LDM Artefacts; PDM Artefacts; Mapping to Data Requirement; Mapping to Standard Glossary; PDM to LDM mapping)
  • 24. Data Lineage: Logical Data Elements (diagram labels: LDM Artefacts; Mapping to Data Requirement; Mapping to Standard Glossary; PDM to LDM mapping)
  • 25. Data Lineage: Physical Data Elements (diagram labels: Mapping to Technical Asset; PDM Artefacts)
  • 26. GDPR in a Nutshell
    ❖ GDPR is an EU regulation governing how organizations handle personal data. It replaced the 1995 Data Protection Directive (in the UK, the Data Protection Act 1998 was replaced by the Data Protection Act 2018) and has applied since 25 May 2018.
    ❖ According to GDPR, personal data should be:
      ➢ Processed fairly and lawfully
      ➢ Accurate and kept up to date
      ➢ Kept no longer than is necessary (retention period)
      ➢ Processed in a secure way
    ❖ The terms controller and processor are used in GDPR to describe the parties involved in processing personal data (PI).
    ❖ Controller: the party that decides what data is extracted, the purpose it is used for, and who is involved in the processing.
      ✓ should be able to demonstrate compliance (accountability metrics).
      ✓ should be able to report on the purposes of processing and the categories of PI it controls.
    ❖ Processor: the party responsible for processing the data on behalf of the controller.
      ✓ should maintain records of the categories of processing activities of PI and the means by which it is processed.
      ✓ should be able to report on transfers of personal data to a third country or an international organization, and can be held responsible for a data breach (requirement for breach notification).
  • 27. How governing Data Lineage Supports GDPR Compliance
    ❖ GDPR Challenge:
      ➢ Records of personal data processing are required as evidence of compliance.
      ➢ Organizations are required to record every point where personal data is processed and to show accountability.
    ❖ Solution:
      ➢ GDPR makes data governance even more critical, particularly its lineage aspect.
      ➢ Governing data lineage makes it possible to understand your data-flow activities and to identify and document the legal justification for each type of activity.
      ➢ When data lineage is represented visually, it shows the data flow/movement from source to destination, the changes along the way, and how the data is transformed.
      ➢ In addition, GDPR requires records of personal data processing as evidence, which implies the need for Data Provenance.
      ➢ Data Provenance refers to recording how the data was derived/generated and processed. It allows verifying that the process and steps used to obtain a result comply with a given set of requirements.
      ➢ In our business case the given requirements are GDPR regulatory requirements, so data lineage and provenance become the tools for showing accountability with regard to GDPR compliance.
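The idea of recording every processing activity so that it can later be produced as evidence can be sketched as a small provenance log. This is an illustrative assumption, not the provenance logger used in EDG: the class name, the fields (including `legal_basis`) and the sample activities are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class ProvenanceLog:
    """Append-only record of processing activities (hypothetical structure)."""
    records: list = field(default_factory=list)

    def record(self, activity, inputs, outputs, legal_basis):
        # Each entry says what ran, on which data, what it produced, and why it was allowed.
        self.records.append({
            "activity": activity,
            "inputs": list(inputs),
            "outputs": list(outputs),
            "legal_basis": legal_basis,
            "at": datetime.now(timezone.utc).isoformat(),
        })

    def activities_touching(self, data_element):
        """Return the activities that read or wrote the given data element, in log order."""
        return [
            r["activity"]
            for r in self.records
            if data_element in r["inputs"] or data_element in r["outputs"]
        ]

log = ProvenanceLog()
log.record("extract_payroll", ["hr.salary"], ["staging.salary"], "contract")
log.record("aggregate_costs", ["staging.salary"], ["report.total_cost"], "legitimate interest")
log.record("load_catalogue", ["products.csv"], ["catalogue"], "n/a")

touched = log.activities_touching("staging.salary")
print(touched)
```

Keeping the documented justification next to each activity is what lets the log double as accountability evidence rather than just an execution trace.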
  • 28. GDPR Governance Use Case
    ❖ GDPR Article 30 Data Requirement
      ➢ Provide time limits for erasure of the different categories of data, as required per record retention policy.
    ❖ The regulatory requirement is translated into a report and accountability metrics that:
      ➢ Return the applications in scope of GDPR for Corporate Technology.
      ➢ Return the record class codes of in-scope applications, based on the record retention policies per country.
      ➢ Notify application owners when record retention policies change, and verify that the changes comply with GDPR regulatory requirements.
    Notes:
      ▪ Record class codes are used to determine how long to keep each record in each jurisdiction.
      ▪ A record class code (RCC) is a category used to group similar types of records in JPMC's master record retention schedule.
      ▪ Record retention requirements are categorized by record class code, by country and, in some cases, by the business function of the record.
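The notification step of this use case can be sketched as follows: compare two versions of the retention schedule, find the record class codes whose policy changed, and list the owners of in-scope applications that use them. Everything here is a hypothetical stand-in (application records, owner addresses, record class codes and retention periods); the real report is produced by querying the EDG graph.

```python
# Hypothetical application inventory; only the fields needed for the sketch.
APPLICATIONS = [
    {"id": "app-1", "owner": "owner.one@example.com", "gdpr_in_scope": True, "record_classes": {"RCC-A"}},
    {"id": "app-2", "owner": "owner.two@example.com", "gdpr_in_scope": True, "record_classes": {"RCC-B"}},
    {"id": "app-3", "owner": "owner.three@example.com", "gdpr_in_scope": False, "record_classes": {"RCC-A"}},
]

def policy_changes(old, new):
    """Return the (record class, country) keys whose retention period is new or different."""
    return {key for key, period in new.items() if old.get(key) != period}

def owners_to_notify(apps, changed):
    """Owners of GDPR in-scope applications that use any changed record class."""
    changed_classes = {rcc for rcc, _country in changed}
    return sorted(
        app["owner"]
        for app in apps
        if app["gdpr_in_scope"] and app["record_classes"] & changed_classes
    )

# Retention periods in years, keyed by (record class code, country code).
old_schedule = {("RCC-A", "AE"): 5, ("RCC-B", "AE"): 7}
new_schedule = {("RCC-A", "AE"): 6, ("RCC-B", "AE"): 7}

changed = policy_changes(old_schedule, new_schedule)
notify = owners_to_notify(APPLICATIONS, changed)
print(changed, notify)
```

Note that app-3 uses the changed record class but is skipped because it is out of GDPR scope, which mirrors the "in scope" filter in the use case.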
  • 29. Retention Conceptual Model (diagram labels: ia.jpmc:SEAL_103249 a edg:BusinessApplication; ia.jpmc.gov:GRM_FUN1030 a ia.jpmc:RecordRetentionClass; ia.jpmc:application; RecordRetentionClass; ia.jpmc.gov:GRM_FUN10300-AE a ia.jpmc:dataRetentionPolicy; ia.jpmc:dataRetentionPolicy; SOR: SEAL; SOR: Retention Manager; Record Retention Policy Ontology; Technical Standard Ontology)
  • 30. GDPR In Scope Business Application (screenshot labels: GDPR compliance information; Provenance information; GDPR in Scope – contains PI; record retention class: OBK1060 | Payroll Services (GRM), PAY100 | Employee Compensation Contribution (GRM))
  • 31. Record Retention Code (screenshot labels: Data Retention Record Class; data retention policy)
  • 33. EDG – RCC Code Diagram (diagram label: data retention policy)
  • 34. EDG – RCC Policy Diagram (fields shown: country; disposition; retention period unit; retention period; retention event)
  • 35. Query RCC Data for GDPR in scope Apps
      prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
      prefix edg: <http://www.edg.topbraid.solutions/model/>
      prefix ia.jpmc: <http://ia.jpmc.com/dg/>
      SELECT DISTINCT ?appId ?appName ?gdprScope ?lob ?rccClassCode ?rccLabel ?rccPolicy
      FROM <urn:x-evn-master:seal>
      FROM <urn:x-evn-master:grm>
      WHERE {
        {
          ?app a edg:BusinessApplication .
          ?app edg:name ?appName .
          ?app edg:identifier ?appId .
          ?app ia.jpmc:inScopeForGDPR ?gdprScope .
          ?app ia.jpmc:lineOfBusiness ?lob .
          # The patterns binding ?rccClassCode, ?rccLabel and ?rccPolicy are not shown on the slide.
          FILTER regex(?gdprScope, "YES")
          FILTER regex(?lob, "CT")
        }
      }
    Result:
      appId | appName                            | gdprScope | lob | rccClassCode     | rccPolicy
      35632 | Payroll Application                | YES       | CT  | GRM_AUD_PAY_1060 | Payroll Services
      38537 | KPMG Link – Global Business Travel | YES       | CT  | GRM_AUD_PAY_1080 | Payroll Accounting
  • 36. SPIN Rules (1) - Inferencing
      # STEP 401: Create Record Classes
      CONSTRUCT {
        ?recordClassCodeU a ia.jpmc:DataRetentionRecordClass .
        ?recordClassCodeU rdfs:label ?rccLabel .
        ?recordClassCodeU edg:identifier ?rccClassCode .
        ?recordClassCodeU edg:name ?rccName .
        ?recordClassCodeU edg:description ?rccDescription .
        ?recordClassCodeU ia.jpmc.go:dataRetentionPolicy ?rccPolicyU .
      }
      WHERE {
        ?this a RetentionExport:RetentionExport .
        BIND (spl:object(?this, RetentionExport:recordClassCode) AS ?rccClassCode) .
        BIND (spl:object(?this, RetentionExport:country) AS ?country) .
        BIND (spl:object(?this, RetentionExport:countryCode) AS ?countryCode) .
        BIND (spl:object(?this, RetentionExport:recordClassName) AS ?rccName) .
        BIND (spl:object(?this, RetentionExport:recordClassDescription) AS ?rccDescription) .
        BIND (ia.jpmc:BuildDataRetentionPolicyClassURI(?rccClassCode) AS ?recordClassCodeU) .
        # Match the exported country code against the reference country data.
        ?countryU country:countryId ?RDICountryCode .
        BIND (str(?RDICountryCode) AS ?cntryCodeLabel) .
        FILTER (?countryCode = ?cntryCodeLabel) .
        BIND (fn:concat(?rccClassCode, "|", ?rccName, "(GRM)") AS ?rccLabel) .
        BIND (ia.jpmc:BuildDataRetentionPolicyRecordURI(?rccClassCode, ?RDICountryCode) AS ?rccPolicyU) .
      }
  • 37. SPIN Rules (2) - Inferencing

      # STEP 402: Create Record Retention Policy
      CONSTRUCT {
        ?rccPolicyU a edg:DataRetentionPolicy .
        ?rccPolicyU rdfs:label ?policyLabel .
        ?rccPolicyU edg:identifier ?policyIdentifier .
        ?rccPolicyU edg:name ?policyName .
        ?rccPolicyU ia.jpmc.go:retentionDisposition ?disposition .
        ?rccPolicyU ia.jpmc.go:retentionEvent ?retentionEvent .
        ?rccPolicyU ia.jpmc.go:retentionPeriod ?retentionPeriod .
        ?rccPolicyU ia.jpmc.go:retentionPeriodUnit ?retentionPeriodUnit .
        ?rccPolicyU edg:country ?country .
      }
      WHERE {
        ?this a RetentionExport:RetentionExport .
        # Read the export row's columns into variables
        BIND (spl:object(?this, RetentionExport:policyIdentifier) AS ?policyIdentifier) .
        BIND (spl:object(?this, RetentionExport:retentionDisposition) AS ?disposition) .
        BIND (spl:object(?this, RetentionExport:retentionEvent) AS ?retentionEvent) .
        BIND (spl:object(?this, RetentionExport:retentionPeriod) AS ?retentionPeriod) .
        BIND (spl:object(?this, RetentionExport:retentionPeriodUnit) AS ?retentionPeriodUnit) .
        BIND (spl:object(?this, RetentionExport:country) AS ?country) .
        # Incomplete on the slide (no target variable), so left commented out:
        # BIND (ia.jpmc:BuildDataRetentionPolicyClassURI (?recordClassCode) AS
        # Build the label and mint the policy URI
        BIND (fn:concat(?policyIdentifier, "|", ?policyName, "(GRM)") AS ?policyLabel) .
        BIND (ia.jpmc:BuildDataRetentionPolicyURI(?policyIdentifier, ?country, ?policyName) AS ?rccPolicyU) .
      }
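The two SPIN rules above map each row of a retention export to a record class and a retention policy. A minimal Python sketch of the same mapping follows, assuming one export row is available as a dictionary. The namespace `BASE`, the helpers `class_uri` and `policy_uri` (stand-ins for the `ia.jpmc:Build...URI` functions, whose real URI scheme is not shown on the slides), and every value in the sample row except the record class code are hypothetical. For brevity the sketch uses a single policy URI helper and emits only a subset of the properties.

```python
BASE = "http://example.org/retention/"  # assumed namespace, not from the slides


def class_uri(record_class_code):
    """Stand-in for ia.jpmc:BuildDataRetentionPolicyClassURI (URI scheme assumed)."""
    return f"{BASE}class/{record_class_code}"


def policy_uri(policy_identifier, country):
    """Stand-in for the policy URI builder functions (URI scheme assumed)."""
    return f"{BASE}policy/{policy_identifier}/{country}"


def row_to_triples(row):
    """Turn one export row into (subject, predicate, object) triples,
    following the shape of the CONSTRUCT templates in steps 401 and 402."""
    rc = class_uri(row["recordClassCode"])
    pol = policy_uri(row["policyIdentifier"], row["country"])
    label = f'{row["recordClassCode"]}|{row["recordClassName"]}(GRM)'
    return [
        # Step 401: the record class and its link to the policy
        (rc, "rdf:type", "ia.jpmc:DataRetentionRecordClass"),
        (rc, "rdfs:label", label),
        (rc, "edg:identifier", row["recordClassCode"]),
        (rc, "ia.jpmc.go:dataRetentionPolicy", pol),
        # Step 402: the retention policy itself
        (pol, "rdf:type", "edg:DataRetentionPolicy"),
        (pol, "ia.jpmc.go:retentionPeriod", row["retentionPeriod"]),
        (pol, "ia.jpmc.go:retentionPeriodUnit", row["retentionPeriodUnit"]),
        (pol, "ia.jpmc.go:retentionEvent", row["retentionEvent"]),
        (pol, "ia.jpmc.go:retentionDisposition", row["retentionDisposition"]),
        (pol, "edg:country", row["country"]),
    ]


# Hypothetical export row: only the record class code appears on the slides.
row = {
    "recordClassCode": "GRM_AUD_PAY_1060",
    "recordClassName": "Example Record Class",
    "policyIdentifier": "POL-001",
    "country": "GB",
    "retentionPeriod": 7,
    "retentionPeriodUnit": "YEARS",
    "retentionEvent": "End of employment",
    "retentionDisposition": "DELETE",
}

triples = row_to_triples(row)
for triple in triples:
    print(triple)
```

Keeping URI construction in dedicated helpers, as the rules do, means both rules derive the same policy URI from the same inputs, so the record class and the policy stay linked.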
  • 39. GDPR POLICY & COMPLIANCE
  • 40. GDPR POLICIES & COMPLIANCE
      ❖ Policies define guidelines for handling and implementing specific security or regulatory issues.
      ❖ Focusing on the policy requirements for data protection, we have built a policy/compliance model that:
        ➢ validates GDPR compliance for the compliance objects under a policy target;
        ➢ demonstrates accountability with regard to GDPR policy requirements.
      Objectives
      ❖ Bridge the gap between GDPR legislative obligations and operational-level technology controls by using semantic modelling to capture the critical policy and compliance aspects.
      ❖ Use inferencing to preserve accountability for processing activities that handle PI data subject to regulatory compliance.
  • 41. GDPR RETENTION COMPLIANCE
      [Diagram: semantic model for retention compliance]
      Nodes: edg:Policy, edg:DataPolicy, edg:DataRetentionPolicy, edg:PolicyRequirement, edg:GDPRRegulatoryRequirement, edg:RequirementAsset, edg:ComplianceAspect, ia.jpmc:DataRetentionRecordClass, ia.jpmc:Country, ia.jpmc:BusinessFunction
      Edge labels: rdfs:subclass, rdf:type, edg:compliesWith, edg:hasRequirement, ia.jpmc:dataRetentionPolicy, ia.jpmc:categorizedByCountry, ia.jpmc:categorizedByBusinessFunction
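To illustrate how a retention policy with a period, a unit, and a triggering event can drive a compliance check, here is a minimal Python sketch. It is not the EDG inferencing mechanism. The `RetentionPolicy` class, the unit vocabulary ("DAYS", "YEARS"), the "DELETE" disposition value, and the rule that a record past its deadline must no longer be stored are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class RetentionPolicy:
    period: int
    unit: str         # "DAYS" or "YEARS" (assumed vocabulary)
    disposition: str  # e.g. "DELETE" (assumed value)


def add_years(d, years):
    """Shift a date by whole years, mapping 29 February to 28 February
    when the target year is not a leap year."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def erasure_deadline(policy, event_date):
    """Date by which the record must be disposed of, counted from the retention event."""
    if policy.unit == "DAYS":
        return event_date + timedelta(days=policy.period)
    if policy.unit == "YEARS":
        return add_years(event_date, policy.period)
    raise ValueError(f"unsupported retention period unit: {policy.unit}")


def is_compliant(policy, event_date, today, still_stored):
    """A record complies if it is still within its retention period,
    or if it is past the deadline and no longer stored."""
    return today <= erasure_deadline(policy, event_date) or not still_stored


policy = RetentionPolicy(period=7, unit="YEARS", disposition="DELETE")
print(erasure_deadline(policy, date(2015, 1, 1)))
print(is_compliant(policy, date(2010, 1, 1), date(2018, 5, 25), still_stored=True))
```

In a semantic model the same check would typically be expressed as a rule or constraint over the policy's properties rather than as application code; the sketch only makes the date arithmetic explicit.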
  • 43. Thanks for attending ☺ Q & A Name: Paraskevi Zerva Email: p.zerva@elsevier.com LinkedIn: linkedin.com/in/paraskevizerva_Profile_URL
  • 45. Extending Data Lineage to Data Provenance
      [Diagram: provenance model extending data lineage]
      Nodes: prov:Activity, Non-Linear Activity, Linear Activity, Workflow, Pipeline, Software Program, Software Program Execution, Software Program Computation, prov:Agent, prov:Server, prov:Entity, Processable
      Edge labels: rdfs:subclass, composedOf/contains, computationOf, executionOf, runsOn, input (subproperty of prov:wasUsed), output (subproperty of prov:wasGeneratedBy), prov:wasDerivedFrom
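Derivation links such as prov:wasDerivedFrom make it possible to trace a data asset back to its sources, which is the basis for answering "where did this personal data come from?". A minimal Python sketch of such a trace follows, assuming the derivation edges have been loaded into a dictionary. The entity names and the function `upstream` are hypothetical.

```python
from collections import deque

# entity -> entities it was directly derived from (hypothetical names)
DERIVED_FROM = {
    "report.csv": ["cleaned.parquet"],
    "cleaned.parquet": ["raw_hr.json", "raw_payroll.json"],
    "raw_hr.json": [],
    "raw_payroll.json": [],
}


def upstream(entity, derived_from):
    """Return every entity reachable through derivation edges,
    in breadth-first order, visiting each entity once."""
    seen = set()
    order = []
    queue = deque(derived_from.get(entity, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue
        seen.add(current)
        order.append(current)
        queue.extend(derived_from.get(current, []))
    return order


print(upstream("report.csv", DERIVED_FROM))
```

The `seen` set keeps the traversal finite even if the recorded derivations contain a cycle or if several paths lead to the same source.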