Page 1
Enterprise Data Governance
Leveraging Knowledge Graph & AI in support of a
data-driven organization
Aftab Iqbal, PhD
Information Architect
Page 2
Actively manage our data
and make it a first class function
Data Strategy – Our Mission
Generating Business Value
Transparency and access are necessary to deliver value
Well described data environments are our “data mission” and are linked to our
business strategy and core operations
Reducing Barriers to Access
Regulatory Compliance
Financial and privacy regulations are increasing in complexity
Regulators expect dependable, consistent reporting
We must safeguard our client and firm data
Democratize data analytics capabilities
Data Lake is our platform for easily accessible data
Page 3
Key themes to our
Data Strategy
Records
Management
Regional
Compliance
Landscape
Documentation
Data
Protection
• Centralized tooling to automate
data management activities
• Robust scanning, identification,
and masking capabilities
• Metadata management through
the entire lifecycle
Data Management Approach
Data Lake
Governance
Archive
Service
Page 4
Why Data Management?
Data Management is
for everyone!
Page 5
Hard to find data … when you
have a lot of it
Why Data Management?
UNSTRUCTURED
JPMC TECHNOLOGY LOCATIONS
Storage Tiers
File
(NAS)
Block
(SAN)
Object
(S3)
Mainframe
Storage
Public Cloud
Content Types
STRUCTURED
End User
Device
(e.g. Laptop)
Relational
Hadoop
Other Non Relational
Tech Locations
Branch
Tapes
Data Center
Other
Other
SaaS
Time Series
Page 6
Vision
Make Data a first
class function
To precisely
understand
what data we
have and
where it goes
HOW
WHY
Better data > better information
Better information > better decisions
Better decisions > business value
AI/ML
Data Standards
Business Glossary
Processes (DC-SDLC)
Platform (data catalog)
Page 7
Data Management Drivers
Data Landscape
What data
do I need?
What data do
we have?
Where is my
data from?
Where should my
data come from?
What data should
be shared most?
Data
Requirements
Data In
Place
Data In
Motion (Lineage)
Authority
(ADS, SoR)
Reference
Data
Reducing Barriers to Access
Page 8
Strategic components in the
Data Lifecycle
Ideally, consume conformed data
from Authoritative Data Source
Re-Use / Use shared services,
build only when needed
Approach activities in lowest
risk manner possible
Minimize duplicative and / or
redundant data transformation
Present data once through a
single mechanism
Only duplicate data if
absolutely necessary
Data Management Foundation
Page 9
How Do We Do It?
Technology
Processes
Metadata
APIs
Application Landscape
Knowledge Graph
Page 10
Mapping our Data Landscape
Page 11
Mapping our Data Landscape
Page 12
Mapping our Data Landscape
Page 13
Mapping our Data Landscape
Page 14
Mapping our Data Landscape
Page 15
Insights – Application Complexity
Application comparison by the number of logical
attributes and physical columns
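The chart for this slide is not reproduced here. As a rough illustration of the comparison it describes, the sketch below counts logical attributes and physical columns per application. The catalog rows, application names, and the `complexity` helper are hypothetical and are not taken from the presentation.

```python
from collections import Counter

# Hypothetical catalog rows: (application, kind, name).
# "logical" rows stand for business attributes, "physical" rows for columns.
CATALOG = [
    ("app_a", "logical", "Customer Name"),
    ("app_a", "logical", "Account Balance"),
    ("app_a", "physical", "CUST_NM"),
    ("app_a", "physical", "ACCT_BAL"),
    ("app_a", "physical", "ACCT_CCY"),
    ("app_b", "logical", "Trade Date"),
    ("app_b", "physical", "TRD_DT"),
]


def complexity(catalog):
    """Return {application: {"logical": n, "physical": m}} for the catalog rows."""
    counts = Counter((app, kind) for app, kind, _name in catalog)
    apps = sorted({app for app, _kind, _name in catalog})
    return {
        app: {
            "logical": counts[(app, "logical")],
            "physical": counts[(app, "physical")],
        }
        for app in apps
    }


if __name__ == "__main__":
    for app, c in complexity(CATALOG).items():
        print(app, "logical:", c["logical"], "physical:", c["physical"])
```

Ranking applications by either count gives one simple, assumed notion of "complexity"; the presentation does not say how its comparison was computed.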
Page 16
Insights – Upstream Dependency for an Application
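The graph for this slide is not reproduced here. A minimal sketch of an upstream-dependency query, assuming data flows are stored as directed (source, target) edges between applications: walk the edges backwards from the application of interest and collect everything that feeds it, directly or indirectly. The edge list and the `upstream` helper are hypothetical.

```python
from collections import deque

# Hypothetical data flows: (source application, target application).
FLOWS = [
    ("app_a", "app_b"),
    ("app_b", "app_c"),
    ("app_d", "app_c"),
    ("app_c", "app_e"),
]


def upstream(flows, target):
    """Return every application that feeds `target`, directly or transitively."""
    # Index the edges by target so they can be followed backwards.
    feeders = {}
    for src, dst in flows:
        feeders.setdefault(dst, set()).add(src)

    seen = set()
    queue = deque([target])
    while queue:
        node = queue.popleft()
        for src in feeders.get(node, ()):
            # Skip the target itself so cycles do not report it as its own upstream.
            if src not in seen and src != target:
                seen.add(src)
                queue.append(src)
    return seen


if __name__ == "__main__":
    print(sorted(upstream(FLOWS, "app_c")))
```

A breadth-first walk is used so that cycles in the flow graph cannot cause an endless loop; a graph database would typically express the same question as a variable-length path query.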
Page 17
Insights – Data Flows between 2 Applications
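The graph for this slide is not reproduced here. As an illustrative sketch, assuming each data flow is recorded with a source application, a target application, and the dataset it carries, the flows between two applications can be selected in either direction as follows. The records, field names, and the `flows_between` helper are hypothetical.

```python
# Hypothetical flow records; field names are assumptions for illustration.
FLOWS = [
    {"source": "app_a", "target": "app_b", "dataset": "trades"},
    {"source": "app_b", "target": "app_a", "dataset": "confirmations"},
    {"source": "app_a", "target": "app_c", "dataset": "positions"},
]


def flows_between(flows, app_x, app_y):
    """Return the flows that connect app_x and app_y, in either direction."""
    pair = {app_x, app_y}
    return [f for f in flows if {f["source"], f["target"]} == pair]


if __name__ == "__main__":
    for flow in flows_between(FLOWS, "app_a", "app_b"):
        print(flow["source"], "->", flow["target"], ":", flow["dataset"])
```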
Page 18
Knowledge Graph
Data in Place
Data in Motion
Data in Situation
Regulations
Key Takeaways & Future Directions
Data Profiles
ML/AI
• Identify and protect
sensitive data
• Reduce digital footprint
by archiving and
destroying data
• …
Page 19
Q & A
aftab.iqbal@jpmchase.com
https://www.linkedin.com/in/aftabiqbal
