Who changed my data?
Need for data governance and
provenance in a streaming world
Digital capability requires granular control of all data assets.
Dinesh Chandrasekhar
Director, Product Marketing
Paige Bartley
Senior Analyst, Data and
Enterprise Intelligence
Ovum | TMT intelligence | informa | Copyright © Informa PLC
Digital Capability Depends on Full Control of Data
Business challenges in achieving digital capability include:
• Reproducibility of analytics results
• Debugging of models and algorithms
• Ensuring correct access rights to data
• Consistent application of data policies
• Meeting regulatory compliance requirements
• Unifying data across repositories and silos
• Finding the right data at the right time
Addressing these challenges requires understanding how data changes over time.
Governance and Transparency of Data Assets Are More Important than Ever
Factors Within the Enterprise
More Data:
• Economics of storage have made keeping data cheap.
• New data types (sensor data, etc.) need to be combined with historical data.
More Users:
• Self-service era means more data consumers and more frequent data access.
• Varying users have varying access rights and privileges.
• More users means more proliferation of data versions.
More Complexity:
• Data repositories have become more distributed, and data sources more varied.
• Data resides in more locations than ever before, in the cloud and on-prem.
Factors External to the Enterprise
More Regulatory Pressure:
• Regulations such as GDPR have indirect requirements for tracking lineage.
• Article 30 requirements for record keeping necessitate knowledge of provenance.
More Competitive Pressure:
• Leverage of data is increasingly a competitive differentiator.
• Pace of change is accelerating, and comprehensive understanding of data is critical.
• Disruptors are emerging from unlikely industries, using data to their advantage.
GDPR’s Specific Requirements for Data
• Article 4: Definition of Personal Data
- A person can be identified indirectly or directly
- Data sources can be combined to make personal data
• Article 9: Processing of Special Categories of Personal Data
- Processing of biometric data is highly restricted
- Many types of sensors produce biometric data
• Article 30: Records of Processing Activities
- “Who, what, when, where, and why” of processing
- Need deep understanding of metadata and data lineage.
GDPR doesn’t differentiate between data-in-motion and data-at-rest! “Who changed what” is critical.
Lineage and provenance, while not directly required by GDPR, are critical to meeting requirements.
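The “who, what, when, where, and why” of processing can be pictured as an append-only event log. The sketch below is only an illustration of that idea: the class names, field names, and action labels (`ProvenanceEvent`, `ProvenanceLog`, `"MODIFY"`) are assumptions made for this example, not part of GDPR or of any product mentioned in this deck.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass(frozen=True)
class ProvenanceEvent:
    """One immutable record of a processing step (field names are illustrative)."""
    who: str      # actor or system that touched the data
    what: str     # identifier of the record or dataset
    action: str   # hypothetical labels such as "CREATE" or "MODIFY"
    where: str    # system or location where the processing happened
    why: str      # stated purpose of the processing
    when: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


class ProvenanceLog:
    """Append-only log: events are added, never updated or deleted."""

    def __init__(self) -> None:
        self._events: List[ProvenanceEvent] = []

    def record(self, event: ProvenanceEvent) -> None:
        self._events.append(event)

    def history(self, what: str) -> List[ProvenanceEvent]:
        """Return every event for one data item, oldest first."""
        return sorted((e for e in self._events if e.what == what),
                      key=lambda e: e.when)

    def who_changed(self, what: str) -> List[str]:
        """Answer "who changed my data?" for a single data item."""
        return [e.who for e in self.history(what) if e.action == "MODIFY"]
```

Because the same record shape is used whether an event was produced by a stream processor or a batch job, one query can cover both data-in-motion and data-at-rest.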
“From an analytics standpoint, reaping the benefits of big data means investing in data management and governance. Without the correct people, processes, and infrastructure, more casual business users will likely struggle to see the benefits of big data technologies.”
Laurent-Olivier Lioté
Analyst, Data and Enterprise Intelligence, Ovum
A Holistic View of Data Requires Both Data-in-Motion and Data-at-Rest
Diagram labels: Data at Rest; Data in Motion; Contextual Understanding of Data
How Do We Do This? Metadata Management is Necessary for Governance
Having a common enterprise metadata framework allows data of different types and from different sources to be managed consistently.
A common metadata framework allows for:
• Common search and lineage for datasets
• Lifecycle management from ingestion to disposition
• Metadata exchange with other metadata tools
• Analysis of data usage and access trends
• Consistent application of access rights
• Analysis of behavior and anomalies
Diagram labels: Metadata Creation; Metadata Enrichment; Metadata Analysis
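To make “common search and lineage for datasets” concrete, here is a minimal sketch of a single catalog that holds entries for both streams and stored tables and can walk lineage across them. Everything in it (`DatasetMetadata`, `MetadataCatalog`, the `kind` and `derived_from` fields) is a hypothetical structure invented for illustration, not the schema of any particular metadata tool.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class DatasetMetadata:
    """Catalog entry; the fields are assumptions chosen for this example."""
    name: str
    kind: str                                  # "in-motion" or "at-rest"
    tags: Set[str] = field(default_factory=set)
    derived_from: List[str] = field(default_factory=list)  # direct inputs


class MetadataCatalog:
    """One registry for every dataset, regardless of type or location."""

    def __init__(self) -> None:
        self._entries: Dict[str, DatasetMetadata] = {}

    def register(self, entry: DatasetMetadata) -> None:
        self._entries[entry.name] = entry

    def search(self, tag: str) -> List[str]:
        """Names of all datasets carrying the tag, in alphabetical order."""
        return sorted(n for n, e in self._entries.items() if tag in e.tags)

    def lineage(self, name: str) -> List[str]:
        """All upstream datasets of `name`, nearest first (breadth-first walk)."""
        seen: List[str] = []
        queue = list(self._entries[name].derived_from)
        while queue:
            current = queue.pop(0)
            if current in seen:
                continue  # already visited; also guards against cycles
            seen.append(current)
            if current in self._entries:
                queue.extend(self._entries[current].derived_from)
        return seen
```

The design choice worth noting is that lineage is stored as plain references between entries, so a report at rest can be traced back through a stream to its original sources with one traversal.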
The Managed Data Lake Can Support a Common Metadata Framework
The data lake, if properly managed, can support a common metadata framework which underpins enterprise data.
• Data-in-motion
• Data-at-rest
• Structured data
• Unstructured data
Common management of metadata allows for streamlined control and visibility into data. Better control of data results in better business outcomes.
All metadata, managed together.
Governance Standards Need to be Equal
The enterprise increasingly wants to analyze all data, both in-motion and at-rest, in context with each other.
Governance and lineage for data-in-motion allow for:
• Audit and regulatory compliance
• Insight into data history and provenance
• Comprehensive lifecycle management
• Security and access controls
• Better quality data = better analytics
Governance standards for data-in-motion need to match those for data-at-rest.
Diagram labels: Common Metadata Framework; Data-in-Motion; Data-at-Rest; Data Management Platform
© Hortonworks Inc. 2011–2018. All rights reserved
Changing face of data
Challenges and Solutions
The New Way of Business Is Fueled By Connected Data
DEVELOPMENT
• Connected Customers, Vehicles, Devices
• Socially crowd-sourced requirements
• Digital design and analysis
• Digital prototypes and tests (simulations)
MANUFACTURING
• Connected Factories, Sensors, Devices
• Human-robotic interaction
• 3D-printing on demand
DISTRIBUTION
• Connected Trucks, Inventory
• Location, traffic, weather-aware distribution
• Real-time inventory visibility
• Dynamic rerouting
MARKETING/SALES
• Connected Customers, Devices
• Omni-channel demand sensing
• Real-Time Recommendations
SERVICE
• Connected Assets
• Remote service monitoring & delivery
• Predictive maintenance
• OTA Updates
Today’s Digital Enterprises
• RFID TRACKERS AND NANO-DEVICES to give you visibility into movement of your goods
• MOBILE NOTIFICATIONS to inform you of shipment delay from a supplier
• BLOCKCHAINS to give complete trust and provenance in your supply chain
• VIRTUAL ASSISTANTS to enhance your customer experience
• AI-POWERED CHATBOTS to improve your customer support functions
• ELECTRONIC B2B EXCHANGES to streamline order processing with partners
© Hortonworks, Inc. 2011-2018. All rights reserved. | Hortonworks confidential and proprietary information.
Modern Data Architecture
Diagram labels: DATA CENTER; CLOUD; IoT; Machine Learning/Artificial Intelligence; Telemetry – Connected Devices; Time Series Databases; Stream Analytics; Deep Historical Analysis; Exception Monitoring; Legacy/Operational Data; Sensors, Control Systems; Cyber Security; Edge Analytics; Social; Mobile; Geo Location
Data Challenges
• Cannot get a 360 VIEW of your customer?
• DROWNING in data lakes?
• TOO MUCH DATA coming in from TOO MANY SOURCES and devices?
• New business initiatives leading to EXCESSIVE IT COSTS?
MOST IMPORTANTLY…
Don’t have the right data at the right time to make the right decision?
Global Data Management Enables Modern Data Architecture
GLOBAL DATA MANAGEMENT
Diagram labels: DATA SOURCES; DATA CENTER; CLOUD; EDGE; Exception Monitoring; 360 View of Operations; Cyber Security; Telemetry – Connected Devices; Time Series; Sensors, Control Systems; Legacy/Operational Data
Data Management Challenges
• Dealing with multi-clouds
• Avoiding cloud/vendor lock-in
• Future proofing your architecture
• Common view of security, governance
• Manage all data, regardless of type or location
• Maximize data re-use for multiple workloads
Diagram labels: DATA SOURCES; DATA CENTER; CLOUD; EDGE; Exception Monitoring; 360 View of Operations; Cyber Security; Telemetry – Connected Devices; Time Series; Sensors, Control Systems
Global Data Management Platform
DATA-IN-MOTION, DATA-AT-REST
MANAGE, SECURE, GOVERN, CONSUME
Diagram labels: DATA SOURCES; DATA CENTER; CLOUD; EDGE; Exception Monitoring; 360 View of Operations; Cyber Security; Telemetry – Connected Devices; Time Series; Sensors, Control Systems
Global Data Management - Powering Innovation
MODERN DATA USE CASES
• EDW OPTIMIZATION
• CYBERSECURITY
• DATA SCIENCE
• ADVANCED ANALYTICS
• IOT/STREAMING ANALYTICS
DATA-IN-MOTION, DATA-AT-REST
MANAGE, SECURE, GOVERN, CONSUME
Diagram labels: DATA SOURCES; DATA CENTER; CLOUD; EDGE; Exception Monitoring; 360 View of Operations; Cyber Security; Telemetry – Connected Devices; Time Series; Sensors, Control Systems
Apache NiFi Overview
• Guaranteed delivery
• Data buffering
- Backpressure
- Pressure release
• Prioritized queuing
• Flow specific QoS
- Latency vs. throughput
- Loss tolerance
• Data provenance
• Supports push and pull models
• Recovery/recording a rolling log of fine-grained history
• Visual command and control
• Flow templates
• Pluggable/multi-role security
• Designed for extension
• Clustering
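Two of the terms above, backpressure and prioritized queuing, can be shown with a generic sketch. This is not how Apache NiFi implements either feature; `BoundedPriorityQueue`, `offer`, `poll`, and `backpressure_threshold` are hypothetical names used only to show the general technique of a buffer that pushes back on producers and releases the most urgent item first.

```python
import heapq
import itertools
from typing import Any, List, Tuple


class BoundedPriorityQueue:
    """Buffer that refuses new items once a threshold is reached."""

    def __init__(self, backpressure_threshold: int) -> None:
        self._threshold = backpressure_threshold
        self._heap: List[Tuple[int, int, Any]] = []
        # Insertion counter breaks ties, keeping FIFO order within a priority.
        self._counter = itertools.count()

    def offer(self, item: Any, priority: int = 0) -> bool:
        """Try to enqueue; False signals the producer to slow down."""
        if len(self._heap) >= self._threshold:
            return False
        heapq.heappush(self._heap, (priority, next(self._counter), item))
        return True

    def poll(self) -> Any:
        """Remove and return the item with the lowest priority number."""
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)
```

A producer that receives `False` from `offer` can wait, retry, or drop the item, which is where a latency versus throughput or loss tolerance policy would be decided.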
Watch the real-time flow of data: Data Provenance in Apache NiFi
Select Data Provenance
Easily access and trace changes to dataflow in Apache NiFi
Apache Atlas
• Enterprise data governance
• Integration with Apache NiFi
• Integration with Apache Ranger
Diagram labels (Apache Atlas): Knowledge Store; Audit Store; Models; Type-System; Policy Rules; Taxonomies; Tag Based Policies; Data Lifecycle Management; Real Time Tag Based Access Control; REST API; Services: Search, Lineage, Exchange
Industry models shown:
• Healthcare: HIPAA, HL7
• Financial: SOX, Dodd-Frank
• Energy: PPDM
• Retail: PCI, PII
• Other: CWM
SERVICE: DATA STEWARD STUDIO (DSS)
• Discover & Fingerprint Data
• Smart Enterprise Search
• Data & Metadata Security
• Data Lineage & Impact Analysis
• Enterprise Data Catalog
• Organize & Curate Data
Thank you

More Related Content

What's hot

Hilton's enterprise data journey
Hilton's enterprise data journeyHilton's enterprise data journey
Hilton's enterprise data journeyDataWorks Summit
 
Harnessing the Power of Big Data at Freddie Mac
Harnessing the Power of Big Data at Freddie MacHarnessing the Power of Big Data at Freddie Mac
Harnessing the Power of Big Data at Freddie MacDataWorks Summit
 
Driving Digital Transformation Through Global Data Management
Driving Digital Transformation Through Global Data ManagementDriving Digital Transformation Through Global Data Management
Driving Digital Transformation Through Global Data ManagementHortonworks
 
Klarna Tech Talk - Mind the Data!
Klarna Tech Talk - Mind the Data!Klarna Tech Talk - Mind the Data!
Klarna Tech Talk - Mind the Data!Jeffrey T. Pollock
 
Fighting Financial Crime with Artificial Intelligence
Fighting Financial Crime with Artificial IntelligenceFighting Financial Crime with Artificial Intelligence
Fighting Financial Crime with Artificial IntelligenceDataWorks Summit
 
The Power of your Data Achieved - Next Gen Modernization
The Power of your Data Achieved - Next Gen ModernizationThe Power of your Data Achieved - Next Gen Modernization
The Power of your Data Achieved - Next Gen ModernizationHortonworks
 
Risk listening: monitoring for profitable growth
Risk listening: monitoring for profitable growthRisk listening: monitoring for profitable growth
Risk listening: monitoring for profitable growthDataWorks Summit
 
Strategyzing big data in telco industry
Strategyzing big data in telco industryStrategyzing big data in telco industry
Strategyzing big data in telco industryParviz Iskhakov
 
Connecting Home/Building, Life and Car..The Importance of Insurance Risk Moni...
Connecting Home/Building, Life and Car..The Importance of Insurance Risk Moni...Connecting Home/Building, Life and Car..The Importance of Insurance Risk Moni...
Connecting Home/Building, Life and Car..The Importance of Insurance Risk Moni...DataWorks Summit
 
How Universities Use Big Data to Transform Education
How Universities Use Big Data to Transform EducationHow Universities Use Big Data to Transform Education
How Universities Use Big Data to Transform EducationHortonworks
 
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT Strategy
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT StrategyIoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT Strategy
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT StrategyHortonworks
 
ING's Customer-Centric Data Journey from Community Idea to Private Cloud Depl...
ING's Customer-Centric Data Journey from Community Idea to Private Cloud Depl...ING's Customer-Centric Data Journey from Community Idea to Private Cloud Depl...
ING's Customer-Centric Data Journey from Community Idea to Private Cloud Depl...DataWorks Summit/Hadoop Summit
 
Active Governance Across the Delta Lake with Alation
Active Governance Across the Delta Lake with AlationActive Governance Across the Delta Lake with Alation
Active Governance Across the Delta Lake with AlationDatabricks
 
Monitizing Big Data at Telecom Service Providers
Monitizing Big Data at Telecom Service ProvidersMonitizing Big Data at Telecom Service Providers
Monitizing Big Data at Telecom Service ProvidersDataWorks Summit
 
Journey to Big Data: Main Issues, Solutions, Benefits
Journey to Big Data: Main Issues, Solutions, BenefitsJourney to Big Data: Main Issues, Solutions, Benefits
Journey to Big Data: Main Issues, Solutions, BenefitsDataWorks Summit
 
Adapting to the exponential development of technology
Adapting to the exponential development of technologyAdapting to the exponential development of technology
Adapting to the exponential development of technologyDataWorks Summit
 

What's hot (20)

Hilton's enterprise data journey
Hilton's enterprise data journeyHilton's enterprise data journey
Hilton's enterprise data journey
 
Harnessing the Power of Big Data at Freddie Mac
Harnessing the Power of Big Data at Freddie MacHarnessing the Power of Big Data at Freddie Mac
Harnessing the Power of Big Data at Freddie Mac
 
Driving Digital Transformation Through Global Data Management
Driving Digital Transformation Through Global Data ManagementDriving Digital Transformation Through Global Data Management
Driving Digital Transformation Through Global Data Management
 
Klarna Tech Talk - Mind the Data!
Klarna Tech Talk - Mind the Data!Klarna Tech Talk - Mind the Data!
Klarna Tech Talk - Mind the Data!
 
The Manulife Journey
The Manulife JourneyThe Manulife Journey
The Manulife Journey
 
Fighting Financial Crime with Artificial Intelligence
Fighting Financial Crime with Artificial IntelligenceFighting Financial Crime with Artificial Intelligence
Fighting Financial Crime with Artificial Intelligence
 
The Power of your Data Achieved - Next Gen Modernization
The Power of your Data Achieved - Next Gen ModernizationThe Power of your Data Achieved - Next Gen Modernization
The Power of your Data Achieved - Next Gen Modernization
 
Risk listening: monitoring for profitable growth
Risk listening: monitoring for profitable growthRisk listening: monitoring for profitable growth
Risk listening: monitoring for profitable growth
 
Hybrid Cloud Strategy for Big Data and Analytics
Hybrid Cloud Strategy for Big Data and Analytics Hybrid Cloud Strategy for Big Data and Analytics
Hybrid Cloud Strategy for Big Data and Analytics
 
Strategyzing big data in telco industry
Strategyzing big data in telco industryStrategyzing big data in telco industry
Strategyzing big data in telco industry
 
Connecting Home/Building, Life and Car..The Importance of Insurance Risk Moni...
Connecting Home/Building, Life and Car..The Importance of Insurance Risk Moni...Connecting Home/Building, Life and Car..The Importance of Insurance Risk Moni...
Connecting Home/Building, Life and Car..The Importance of Insurance Risk Moni...
 
How Universities Use Big Data to Transform Education
How Universities Use Big Data to Transform EducationHow Universities Use Big Data to Transform Education
How Universities Use Big Data to Transform Education
 
Data Science Crash Course
Data Science Crash CourseData Science Crash Course
Data Science Crash Course
 
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT Strategy
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT StrategyIoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT Strategy
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT Strategy
 
ING's Customer-Centric Data Journey from Community Idea to Private Cloud Depl...
ING's Customer-Centric Data Journey from Community Idea to Private Cloud Depl...ING's Customer-Centric Data Journey from Community Idea to Private Cloud Depl...
ING's Customer-Centric Data Journey from Community Idea to Private Cloud Depl...
 
Active Governance Across the Delta Lake with Alation
Active Governance Across the Delta Lake with AlationActive Governance Across the Delta Lake with Alation
Active Governance Across the Delta Lake with Alation
 
Monitizing Big Data at Telecom Service Providers
Monitizing Big Data at Telecom Service ProvidersMonitizing Big Data at Telecom Service Providers
Monitizing Big Data at Telecom Service Providers
 
Journey to Big Data: Main Issues, Solutions, Benefits
Journey to Big Data: Main Issues, Solutions, BenefitsJourney to Big Data: Main Issues, Solutions, Benefits
Journey to Big Data: Main Issues, Solutions, Benefits
 
Accelerate Return on Data
Accelerate Return on DataAccelerate Return on Data
Accelerate Return on Data
 
Adapting to the exponential development of technology
Adapting to the exponential development of technologyAdapting to the exponential development of technology
Adapting to the exponential development of technology
 

Similar to Who changed my data? Need for data governance and provenance in a streaming world

Reinvent Your Data Management Strategy for Successful Digital Transformation
Reinvent Your Data Management Strategy for Successful Digital TransformationReinvent Your Data Management Strategy for Successful Digital Transformation
Reinvent Your Data Management Strategy for Successful Digital TransformationDenodo
 
Michael Josephs
Michael JosephsMichael Josephs
Michael JosephsdaveGBE
 
Compliance in Motion: Aligning Data Governance Initiatives with Business Obje...
Compliance in Motion: Aligning Data Governance Initiatives with Business Obje...Compliance in Motion: Aligning Data Governance Initiatives with Business Obje...
Compliance in Motion: Aligning Data Governance Initiatives with Business Obje...confluent
 
GDPR Noncompliance: Avoid the Risk with Data Virtualization
GDPR Noncompliance: Avoid the Risk with Data VirtualizationGDPR Noncompliance: Avoid the Risk with Data Virtualization
GDPR Noncompliance: Avoid the Risk with Data VirtualizationDenodo
 
Big Data: Trends, Applications and Potentials
Big Data: Trends, Applications and PotentialsBig Data: Trends, Applications and Potentials
Big Data: Trends, Applications and PotentialsCharles Mok
 
Big Data & Analytics, Peter Jönsson
Big Data & Analytics, Peter JönssonBig Data & Analytics, Peter Jönsson
Big Data & Analytics, Peter JönssonIBM Danmark
 
Mastering Data Compliance in a Dynamic Business Landscape
Mastering Data Compliance in a Dynamic Business LandscapeMastering Data Compliance in a Dynamic Business Landscape
Mastering Data Compliance in a Dynamic Business LandscapeDenodo
 
How Cloudera SDX can aid GDPR compliance
How Cloudera SDX can aid GDPR complianceHow Cloudera SDX can aid GDPR compliance
How Cloudera SDX can aid GDPR complianceCloudera, Inc.
 
Internet of things ecosystem: The quest for value
Internet of things ecosystem: The quest for valueInternet of things ecosystem: The quest for value
Internet of things ecosystem: The quest for valueDeloitte United States
 
Fast Data and Architecting the Digital Enterprise Fast Data drivers, componen...
Fast Data and Architecting the Digital Enterprise Fast Data drivers, componen...Fast Data and Architecting the Digital Enterprise Fast Data drivers, componen...
Fast Data and Architecting the Digital Enterprise Fast Data drivers, componen...Stuart Blair
 
Webinar #2 - Transforming Challenges into Opportunities for Credit Unions
Webinar #2 - Transforming Challenges into Opportunities for Credit UnionsWebinar #2 - Transforming Challenges into Opportunities for Credit Unions
Webinar #2 - Transforming Challenges into Opportunities for Credit UnionsDenodo
 
Analyst Webinar: Best Practices In Enabling Data-Driven Decision Making
Analyst Webinar: Best Practices In Enabling Data-Driven Decision MakingAnalyst Webinar: Best Practices In Enabling Data-Driven Decision Making
Analyst Webinar: Best Practices In Enabling Data-Driven Decision MakingDenodo
 
Personal Data Receipts - Michele Nati - Lead Technologist Privacy and Trust -...
Personal Data Receipts - Michele Nati - Lead Technologist Privacy and Trust -...Personal Data Receipts - Michele Nati - Lead Technologist Privacy and Trust -...
Personal Data Receipts - Michele Nati - Lead Technologist Privacy and Trust -...MicheleNati
 
Unified Information Governance, Powered by Knowledge Graph
Unified Information Governance, Powered by Knowledge GraphUnified Information Governance, Powered by Knowledge Graph
Unified Information Governance, Powered by Knowledge GraphVaticle
 
-Enrichment - Unlocking the value of data for digital transformation - Big Da...
-Enrichment - Unlocking the value of data for digital transformation - Big Da...-Enrichment - Unlocking the value of data for digital transformation - Big Da...
-Enrichment - Unlocking the value of data for digital transformation - Big Da...webwinkelvakdag
 
Chapter 6 Technology and Data Platforms.pptx
Chapter 6 Technology and Data Platforms.pptxChapter 6 Technology and Data Platforms.pptx
Chapter 6 Technology and Data Platforms.pptxMinHtetAung5
 
Big Data, Analytics and Data Science
Big Data, Analytics and Data ScienceBig Data, Analytics and Data Science
Big Data, Analytics and Data Sciencedlamb3244
 
Big Data LDN 2017: Data Governance Reimagined
Big Data LDN 2017: Data Governance ReimaginedBig Data LDN 2017: Data Governance Reimagined
Big Data LDN 2017: Data Governance ReimaginedMatt Stubbs
 
A Successful Data Strategy for Insurers in Volatile Times (EMEA)
A Successful Data Strategy for Insurers in Volatile Times (EMEA)A Successful Data Strategy for Insurers in Volatile Times (EMEA)
A Successful Data Strategy for Insurers in Volatile Times (EMEA)Denodo
 

Similar to Who changed my data? Need for data governance and provenance in a streaming world (20)

Reinvent Your Data Management Strategy for Successful Digital Transformation
Reinvent Your Data Management Strategy for Successful Digital TransformationReinvent Your Data Management Strategy for Successful Digital Transformation
Reinvent Your Data Management Strategy for Successful Digital Transformation
 
Michael Josephs
Michael JosephsMichael Josephs
Michael Josephs
 
Compliance in Motion: Aligning Data Governance Initiatives with Business Obje...
Compliance in Motion: Aligning Data Governance Initiatives with Business Obje...Compliance in Motion: Aligning Data Governance Initiatives with Business Obje...
Compliance in Motion: Aligning Data Governance Initiatives with Business Obje...
 
GDPR Noncompliance: Avoid the Risk with Data Virtualization
GDPR Noncompliance: Avoid the Risk with Data VirtualizationGDPR Noncompliance: Avoid the Risk with Data Virtualization
GDPR Noncompliance: Avoid the Risk with Data Virtualization
 
Sgcp14dunlea
Sgcp14dunleaSgcp14dunlea
Sgcp14dunlea
 
Big Data: Trends, Applications and Potentials
Big Data: Trends, Applications and PotentialsBig Data: Trends, Applications and Potentials
Big Data: Trends, Applications and Potentials
 
Big Data & Analytics, Peter Jönsson
Big Data & Analytics, Peter JönssonBig Data & Analytics, Peter Jönsson
Big Data & Analytics, Peter Jönsson
 
Mastering Data Compliance in a Dynamic Business Landscape
Mastering Data Compliance in a Dynamic Business LandscapeMastering Data Compliance in a Dynamic Business Landscape
Mastering Data Compliance in a Dynamic Business Landscape
 
How Cloudera SDX can aid GDPR compliance
How Cloudera SDX can aid GDPR complianceHow Cloudera SDX can aid GDPR compliance
How Cloudera SDX can aid GDPR compliance
 
Internet of things ecosystem: The quest for value
Internet of things ecosystem: The quest for valueInternet of things ecosystem: The quest for value
Internet of things ecosystem: The quest for value
 
Fast Data and Architecting the Digital Enterprise Fast Data drivers, componen...
Fast Data and Architecting the Digital Enterprise Fast Data drivers, componen...Fast Data and Architecting the Digital Enterprise Fast Data drivers, componen...
Fast Data and Architecting the Digital Enterprise Fast Data drivers, componen...
 
Webinar #2 - Transforming Challenges into Opportunities for Credit Unions
Webinar #2 - Transforming Challenges into Opportunities for Credit UnionsWebinar #2 - Transforming Challenges into Opportunities for Credit Unions
Webinar #2 - Transforming Challenges into Opportunities for Credit Unions
 
Analyst Webinar: Best Practices In Enabling Data-Driven Decision Making
Analyst Webinar: Best Practices In Enabling Data-Driven Decision MakingAnalyst Webinar: Best Practices In Enabling Data-Driven Decision Making
Analyst Webinar: Best Practices In Enabling Data-Driven Decision Making
 
Personal Data Receipts - Michele Nati - Lead Technologist Privacy and Trust -...
Personal Data Receipts - Michele Nati - Lead Technologist Privacy and Trust -...Personal Data Receipts - Michele Nati - Lead Technologist Privacy and Trust -...
Personal Data Receipts - Michele Nati - Lead Technologist Privacy and Trust -...
 
Unified Information Governance, Powered by Knowledge Graph
Unified Information Governance, Powered by Knowledge GraphUnified Information Governance, Powered by Knowledge Graph
Unified Information Governance, Powered by Knowledge Graph
 
-Enrichment - Unlocking the value of data for digital transformation - Big Da...
-Enrichment - Unlocking the value of data for digital transformation - Big Da...-Enrichment - Unlocking the value of data for digital transformation - Big Da...
-Enrichment - Unlocking the value of data for digital transformation - Big Da...
 
Chapter 6 Technology and Data Platforms.pptx
Chapter 6 Technology and Data Platforms.pptxChapter 6 Technology and Data Platforms.pptx
Chapter 6 Technology and Data Platforms.pptx
 
Big Data, Analytics and Data Science
Big Data, Analytics and Data ScienceBig Data, Analytics and Data Science
Big Data, Analytics and Data Science
 
Big Data LDN 2017: Data Governance Reimagined
Big Data LDN 2017: Data Governance ReimaginedBig Data LDN 2017: Data Governance Reimagined
Big Data LDN 2017: Data Governance Reimagined
 
A Successful Data Strategy for Insurers in Volatile Times (EMEA)
A Successful Data Strategy for Insurers in Volatile Times (EMEA)A Successful Data Strategy for Insurers in Volatile Times (EMEA)
A Successful Data Strategy for Insurers in Volatile Times (EMEA)
 

More from DataWorks Summit

Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisDataWorks Summit
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiTracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiDataWorks Summit
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...DataWorks Summit
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...DataWorks Summit
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal SystemDataWorks Summit
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExampleDataWorks Summit
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberDataWorks Summit
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Who changed my data? Need for data governance and provenance in a streaming world

  • 1. Who changed my data? Need for data governance and provenance in a streaming world. Digital capability requires granular control of all data assets. Dinesh Chandrasekhar, Director, Product Marketing; Paige Bartley, Senior Analyst, Data and Enterprise Intelligence.
  • 2. Ovum | TMT intelligence | informa2 Copyright © Informa PLC
  • 3. Digital Capability Depends on Full Control of Data. Business challenges in achieving digital capability include:
      - Reproducibility of analytics results
      - Debugging of models and algorithms
      - Ensuring correct access rights to data
      - Consistent application of data policies
      - Meeting regulatory compliance requirements
      - Unifying data across repositories and silos
      - Finding the right data at the right time
    Addressing these challenges requires understanding how data changes over time.
  • 4. Governance and Transparency of Data Assets Are More Important than Ever
  • 5. Factors Within the Enterprise
    More Data:
      - Economics of storage have made keeping data cheap.
      - New data types – sensor data, etc. – need to be combined with historical data.
    More Users:
      - Self-service era means more data consumers and more frequent data access.
      - Varying users have varying access rights and privileges.
      - More users means more proliferation of data versions.
    More Complexity:
      - Data repositories have become more distributed, and data sources more varied.
      - Data resides in more locations than ever before, in the cloud and on-prem.
  • 6. Factors External to the Enterprise
    More Regulatory Pressure:
      - Regulations such as GDPR have indirect requirements for tracking lineage.
      - Article 30 requirements for record keeping necessitate knowledge of provenance.
    More Competitive Pressure:
      - Leverage of data is increasingly a competitive differentiator.
      - Pace of change is accelerating, and comprehensive understanding of data is critical.
      - Disruptors are emerging from unlikely industries, using data to their advantage.
  • 7. GDPR’s Specific Requirements for Data
      - Article 4: Definition of Personal Data. A person can be identified indirectly or directly. Data sources can be combined to make personal data.
      - Article 9: Processing of Special Categories of Personal Data. Processing of biometric data is highly restricted. Many types of sensors produce biometric data.
      - Article 30: Records of Processing Activities. “Who, what, when, where, and why” of processing. Need deep understanding of metadata and data lineage.
    GDPR doesn’t differentiate between data-in-motion and data-at-rest! “Who changed what” is critical. Lineage and provenance, while not directly required by GDPR, are critical to meeting requirements.
  • 8. “From an analytics standpoint, reaping the benefits of big data means investing in data management and governance. Without the correct people, processes, and infrastructure, more casual business users will likely struggle to see the benefits of big data technologies.” Laurent-Olivier Lioté, Analyst, Data and Enterprise Intelligence, Ovum
  • 9. A Holistic View of Data Requires Both Data-in-Motion and Data-at-Rest. [Diagram labels: Data at Rest; Data in Motion; Contextual Understanding of Data]
  • 10. How Do We Do This? Metadata Management is Necessary for Governance. Having a common enterprise metadata framework allows data of different types and from different sources to be managed consistently. A common metadata framework allows for:
      - Common search and lineage for datasets
      - Lifecycle management from ingestion to disposition
      - Metadata exchange with other metadata tools
      - Analysis of data usage and access trends
      - Consistent application of access rights
      - Analysis of behavior and anomalies
    [Diagram labels: Metadata Creation; Metadata Enrichment; Metadata Analysis]
  • 11. The Managed Data Lake Can Support a Common Metadata Framework. The data lake, if properly managed, can support a common metadata framework which underpins enterprise data:
      - Data-in-motion
      - Data-at-rest
      - Structured data
      - Unstructured data
    Common management of metadata allows for streamlined control and visibility into data. Better control of data results in better business outcomes. All metadata, managed together.
  • 12. Governance Standards Need to be Equal. The enterprise increasingly wants to analyze all data, both in-motion and at-rest, in context with each other. Governance and lineage for data-in-motion allows for:
      - Audit and regulatory compliance
      - Insight into data history and provenance
      - Comprehensive lifecycle management
      - Security and access controls
      - Better quality data = better analytics
    Governance standards for data-in-motion need to match those for data-at-rest. [Diagram labels: Common Metadata Framework; Data-in-Motion; Data-at-Rest; Data Management Platform]
  • 13. Changing face of data: Challenges and Solutions. © Hortonworks Inc. 2011–2018. All rights reserved.
  • 14. The New Way of Business Is Fueled By Connected Data
    DEVELOPMENT:
      - Connected Customers, Vehicles, Devices
      - Socially crowd-sourced requirements
      - Digital design and analysis
      - Digital prototypes and tests (simulations)
    MANUFACTURING:
      - Connected Factories, Sensors, Devices
      - Human-robotic interaction
      - 3D-printing on demand
    DISTRIBUTION:
      - Connected Trucks, Inventory
      - Location, traffic, weather-aware distribution
      - Real-time inventory visibility
      - Dynamic rerouting
    MARKETING/SALES:
      - Connected Customers, Devices
      - Omni-channel demand sensing
      - Real-Time Recommendations
    SERVICE:
      - Connected Assets
      - Remote service monitoring & delivery
      - Predictive maintenance
      - OTA Updates
  • 15. Today’s Digital Enterprises
      - RFID TRACKERS AND NANO-DEVICES to give you visibility into movement of your goods
      - MOBILE NOTIFICATIONS to inform you of shipment delay from a supplier
      - BLOCKCHAINS to give complete trust and provenance in your supply chain
      - VIRTUAL ASSISTANTS to enhance your customer experience
      - AI-POWERED CHATBOTS to improve your customer support functions
      - ELECTRONIC B2B EXCHANGES to streamline order processing with partners
  • 16. Modern Data Architecture. [Diagram labels: DATA CENTER; CLOUD; Machine Learning/Artificial Intelligence; Telemetry – Connected Devices; Time Series Databases; Stream Analytics; Deep Historical Analysis; Exception Monitoring; Legacy/Operational Data; Sensors, Control Systems; Cyber Security; Edge Analytics; Social; Mobile; IoT; Geo Location] © Hortonworks, Inc. 2011-2018. All rights reserved. | Hortonworks confidential and proprietary information.
  • 17. Data Challenges
      - Cannot get a 360 VIEW of your customer?
      - DROWNING in data lakes?
      - TOO MUCH DATA coming in from TOO MANY SOURCES and devices?
      - New business initiatives leading to EXCESSIVE IT COSTS?
    MOST IMPORTANTLY… Don’t have the right data at the right time to make the right decision?
  • 18. Global Data Management Enables Modern Data Architecture. [Diagram labels: GLOBAL DATA MANAGEMENT; DATA SOURCES; DATA CENTER; CLOUD; EDGE; Exception Monitoring; 360 View of Operations; Cyber Security; Telemetry – Connected Devices; Time Series; Sensors, Control Systems; Legacy/Operational Data]
  • 19. Data Management Challenges
      - Dealing with multi-clouds
      - Avoiding cloud/vendor lock-in
      - Future proofing your architecture
      - Common view of security, governance
      - Manage all data, regardless of type or location
      - Maximize data re-use for multiple workloads
    [Diagram labels: DATA SOURCES; DATA CENTER; CLOUD; EDGE; Exception Monitoring; 360 View of Operations; Cyber Security; Telemetry – Connected Devices; Time Series; Sensors, Control Systems]
  • 20. Global Data Management Platform. [Diagram labels: DATA SOURCES; DATA CENTER; CLOUD; EDGE; Exception Monitoring; 360 View of Operations; Cyber Security; Telemetry – Connected Devices; Time Series; Sensors, Control Systems; DATA-IN-MOTION; DATA-AT-REST; MANAGE, SECURE, GOVERN, CONSUME]
  • 21. Global Data Management - Powering Innovation. MODERN DATA USE CASES: EDW OPTIMIZATION; CYBERSECURITY; DATA SCIENCE; ADVANCED ANALYTICS; IOT/STREAMING ANALYTICS. [Diagram labels: DATA SOURCES; DATA CENTER; CLOUD; EDGE; Exception Monitoring; 360 View of Operations; Cyber Security; Telemetry – Connected Devices; Time Series; Sensors, Control Systems; DATA-IN-MOTION; DATA-AT-REST; MANAGE, SECURE, GOVERN, CONSUME]
  • 22. Apache NiFi Overview
      - Guaranteed delivery
      - Data buffering: backpressure, pressure release
      - Prioritized queuing
      - Flow specific QoS: latency vs. throughput, loss tolerance
      - Data provenance
      - Supports push and pull models
      - Recovery/recording a rolling log of fine-grained history
      - Visual command and control
      - Flow templates
      - Pluggable/multi-role security
      - Designed for extension
      - Clustering
  • 23. Watch real-time flow of data: Data Provenance in Apache NiFi. [Screenshot callout: Select Data Provenance]
  • 24. Easily access and trace changes to dataflow in Apache NiFi
  • 25. Apache Atlas
      - Enterprise data governance
      - Integration with Apache NiFi
      - Integration with Apache Ranger
    [Diagram labels: Apache Atlas; Knowledge Store; Audit Store; Models; Type-System; Policy Rules; Taxonomies; Tag Based Policies; Data Lifecycle Management; Real Time Tag Based Access Control; REST API; Services; Search; Lineage; Exchange; Healthcare: HIPAA, HL7; Financial: SOX, Dodd-Frank; Energy: PPDM; Retail: PCI, PII; Other: CWM]
    [Diagram labels: SERVICE: DATA STEWARD STUDIO (DSS); Discover & Fingerprint Data; Smart Enterprise Search; Data & Metadata Security; Data Lineage & Impact Analysis; Enterprise Data Catalog; Organize & Curate Data]
  • 26. Thank you
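
Slides 22 to 24 describe data provenance as a rolling log of fine-grained history that answers "who changed what." The idea can be illustrated with a minimal Python sketch. It is not Apache NiFi's API: the class names, the event type labels, and the component and actor names below are all hypothetical, chosen only to show how parent links between events allow a record to be traced back to its origin.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ProvenanceEvent:
    """One entry in the history of a piece of data (all fields are illustrative)."""
    record_id: str                    # id of the data item after this event
    event_type: str                   # "RECEIVE", "MODIFY", "SEND": labels chosen for this sketch
    component: str                    # processing step that handled the data
    actor: str                        # user or service responsible for the step
    parent_id: Optional[str] = None   # id of the item this one was derived from
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


class ProvenanceLog:
    """In-memory, append-only log; a real system would persist and age out entries."""

    def __init__(self) -> None:
        self._events: Dict[str, ProvenanceEvent] = {}

    def record(self, event: ProvenanceEvent) -> None:
        if event.record_id in self._events:
            raise ValueError(f"duplicate record id: {event.record_id}")
        self._events[event.record_id] = event

    def trace(self, record_id: str) -> List[ProvenanceEvent]:
        """Follow parent links back to the origin; returns events oldest first."""
        chain: List[ProvenanceEvent] = []
        seen = set()
        current: Optional[str] = record_id
        while current is not None and current not in seen:
            seen.add(current)  # guard against accidental cycles in parent links
            event = self._events[current]
            chain.append(event)
            current = event.parent_id
        return list(reversed(chain))

    def who_changed(self, record_id: str) -> List[str]:
        """Actors responsible for MODIFY events anywhere in the record's lineage."""
        return [e.actor for e in self.trace(record_id) if e.event_type == "MODIFY"]


# Hypothetical three-step flow: ingest, mask, export.
log = ProvenanceLog()
log.record(ProvenanceEvent("r1", "RECEIVE", "sensor-gateway", "svc-ingest"))
log.record(ProvenanceEvent("r2", "MODIFY", "mask-pii", "alice", parent_id="r1"))
log.record(ProvenanceEvent("r3", "SEND", "to-data-lake", "svc-export", parent_id="r2"))

print([e.component for e in log.trace("r3")])  # ['sensor-gateway', 'mask-pii', 'to-data-lake']
print(log.who_changed("r3"))                   # ['alice']
```

Storing a parent reference on every event is what makes the "who, what, when" questions answerable after the fact; without it, each step would only know about its own output.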

Editor's Notes

  1. Let’s step away from compliance, regulation, and requirements, and look at the major trends and drivers within the enterprise. Governance and provenance are often discussed as “checkbox” requirements, rather than as enablers. The ICT Enterprise Insights survey identified “create digital capability” and “manage security, identity, and privacy” as the top two IT trends in the enterprise. What do these trends have in common?
  2. There are three pillars to creating digital capability. The first pillar is the creation of the digital platform and infrastructure itself. The second pillar is the creation of the ability to effectively exploit and utilize data. The third pillar is the development of the enterprise's innovation process and methodology for the digital age. All three are underpinned by a clearly articulated digital strategy.
  3. Article 4: Personal data is any information relating to an identified or identifiable natural person. A natural person can be identified indirectly or directly, and the enterprise needs to be cautious when combining data sources to ensure that innocuous information doesn’t become personal information. Article 9: Processing of biometric data for the purpose of uniquely identifying a person is inherently prohibited unless certain conditions are met, and this applies to several types of data in motion: sensor data from wearables, medical devices, and fitness devices. Article 30: The enterprise must document the purposes of processing, transfers of data to non-EU countries, and the envisaged time limits for erasure of the data.
  4. Data policies are applied and encoded at the metadata level. Metadata, or data about data, is critical to providing a common foundation for understanding the qualities of data residing in different systems and to providing lineage and cataloging capabilities. A shared or common metadata framework, where all metadata is managed together, allows data to be centrally searched, tracked, and monitored regardless of its "home" repository.
  5. To make this a reality, the same governance standards need to be applied to all enterprise data equally. There needs to be a single platform environment where data-in-motion and data-at-rest can be managed together, with a common metadata framework. All data-in-motion sources need a way to be ingested into this platform, with provenance and lineage tracked as they flow in.
  6. Talk track: Hortonworks Powers the Future of Data: data-in-motion, data-at-rest, and Modern Data Applications.
  7. Data is often referred to as the fuel of today’s businesses. In reality, every business has data and can perhaps access the same types of data as most of its competitors. The real differentiator is not the data itself but who uses it more intelligently and to greater effect. That usage often relies on connecting the data dots across your organization. By connecting customers to products and to the channels through which they interact, or prefer to interact, we can drive better customer experiences, resulting in better loyalty and hopefully better revenues. Every industry is being transformed through these connected use cases.
  8. 1) Data is in multiple places (data centers that the company owns, the cloud, or locations owned by a third party). 2) Different data is in different places (data in your databases, such as numbers; data from sensors in a connected product that is not arranged in a database). 3) Data flows back and forth between the data center and the cloud. Talking points: An entirely new world is being created by combining lots of data with breakthrough tools. Data could be on-premises and in the cloud. Data is moving from sensors in real time across our data fabric, giving us precise instrumentation of what happened just before an event as well as after it. This is true for customers buying on the web as well as for products that might fail. We can run our machine learning and deep learning on these vast repositories of data, and we can push these models down to the edges to automate decisions. Note: For us as a community and as a company, we need to continue to innovate around the core technology, while thinking about how we enable 3 personas to be successful. This is the logical evolution and transformation that’s happening now.
  9. You need to holistically manage all the data in all places, then begin to move our platform into place
  10. You need to holistically manage all the data in all places, then begin to move our platform into place
  11. You need to holistically manage all the data in all places, then begin to move our platform into place
  12. HDF (Hortonworks DataFlow) provides very fine-grained, high-fidelity reporting about the origins of data, how it was used, who used it, and so on.
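
Notes 4 and 5 describe policies encoded at the metadata level and applied equally to data-in-motion and data-at-rest, regardless of the data's "home" repository. A minimal Python sketch of that idea follows. It does not use the Apache Atlas or Apache Ranger APIs; the catalog class, the dataset names, repository labels, tags, and roles are all hypothetical and serve only to show one tag-based policy governing both a stream and a stored table.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Set


@dataclass(frozen=True)
class DatasetMetadata:
    """Metadata entry for one dataset (fields are illustrative)."""
    name: str
    repository: str                    # "home" system, e.g. a stream or a table
    state: str                         # "in-motion" or "at-rest"
    tags: FrozenSet[str] = frozenset()


class MetadataCatalog:
    """One shared catalog for all datasets, so search and policy work the same everywhere."""

    def __init__(self) -> None:
        self._datasets: Dict[str, DatasetMetadata] = {}
        self._tag_policies: Dict[str, Set[str]] = {}  # tag -> roles allowed to read

    def register(self, meta: DatasetMetadata) -> None:
        self._datasets[meta.name] = meta

    def set_policy(self, tag: str, allowed_roles: Set[str]) -> None:
        self._tag_policies[tag] = set(allowed_roles)

    def search(self, tag: str) -> List[str]:
        """Names of all datasets carrying the tag, whatever system holds them."""
        return sorted(n for n, m in self._datasets.items() if tag in m.tags)

    def can_read(self, role: str, dataset: str) -> bool:
        """A role may read only if every policy-bearing tag on the dataset allows it.

        Tags without a policy are treated as unrestricted (an assumption of this sketch).
        """
        meta = self._datasets[dataset]
        return all(
            role in self._tag_policies[tag]
            for tag in meta.tags
            if tag in self._tag_policies
        )


# Hypothetical datasets: one stream and one stored table share the "biometric" tag.
catalog = MetadataCatalog()
catalog.register(DatasetMetadata("wearable-stream", "stream-topic", "in-motion", frozenset({"biometric"})))
catalog.register(DatasetMetadata("patient-history", "warehouse-table", "at-rest", frozenset({"biometric", "pii"})))
catalog.register(DatasetMetadata("weather-feed", "stream-topic", "in-motion"))
catalog.set_policy("biometric", {"clinician"})

print(catalog.search("biometric"))                      # ['patient-history', 'wearable-stream']
print(catalog.can_read("analyst", "wearable-stream"))   # False
print(catalog.can_read("clinician", "patient-history")) # True
print(catalog.can_read("analyst", "weather-feed"))      # True
```

Because the policy is attached to the tag rather than to a repository, adding a new biometric source only requires tagging it; the same access rule then applies without per-system configuration.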