New York City
9th June, 2016
Logical Data Warehouse,
Data Lakes, and Data
Services Marketplaces
Agenda
1. Introductions
2. Logical Data Warehouse and Data Lakes
3. Coffee Break
4. Data Services Marketplaces
5. Q&A
Data Services Marketplace
New York City
June 2016
Agenda
1. Data Services Marketplace
2. Data Services Demo
3. Addressing the Challenges
4. Customer Success Stories
5. Q&A
Data, Data, Everywhere…
• Organizations are awash with data, but…
• How do I know what data is available?
• What’s its structure?
• How do I know how good it is?
• How do I access the data?
• Data Services Marketplaces address these
questions
• Provide a mechanism for end users and
developers to find and access data
• For reports, applications, analytics, etc.
And not a drop of it to read!
What is a Data Services Marketplace?
A single place where consumers of data –
developers or end users – can search for, find,
and access data, that is available to them, as a
service.
Data Services Marketplace
[Diagram: Enterprise Apps access data over SQL (JDBC/ODBC), RESTful Web Services, SOAP, JMS, etc. Labels shown: Data Services Layer (Virtual Data Marts, Virtual ODS, Reusable Data Services); Enterprise Data Service Registry (Metadata, Scheduling & Delivery, Usage Stats); sources: Operational Systems, Analytical Systems, Big Data, External/SaaS Systems.]
Enterprise Data Service Registry
• Catalog of data available to consumers
• Metadata for data ‘services’
• Format and structure of data, description of data and attributes
• Data lineage information – where does the data come from?
• Access permissions for data services
• Enforcing privacy policies for PII
• Monitoring and auditing of data usage
• Monitoring and managing QoS/SLA
• Knowing who is accessing data, when, and how…
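
The registry responsibilities listed above (catalog, metadata, lineage, permissions, usage auditing) can be illustrated with a minimal in-memory sketch. This is not Denodo's API; `ServiceEntry`, `ServiceRegistry`, and every field name are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ServiceEntry:
    """One catalog entry. All field names are illustrative assumptions."""
    name: str
    description: str
    columns: dict[str, str]      # attribute name -> type
    lineage: list[str]           # upstream sources the data comes from
    allowed_roles: set[str]      # roles permitted to use the service
    contains_pii: bool = False


class ServiceRegistry:
    """Hypothetical registry: catalog, permission check, and usage audit."""

    def __init__(self) -> None:
        self._entries: dict[str, ServiceEntry] = {}
        self.access_log: list[tuple[str, str]] = []   # (user, service)

    def register(self, entry: ServiceEntry) -> None:
        self._entries[entry.name] = entry

    def search(self, keyword: str, roles: set[str]) -> list[str]:
        """Names of services matching the keyword that these roles may see."""
        kw = keyword.lower()
        return sorted(
            e.name
            for e in self._entries.values()
            if (kw in e.name.lower() or kw in e.description.lower())
            and e.allowed_roles & roles
        )

    def describe(self, user: str, name: str, roles: set[str]) -> ServiceEntry:
        """Return metadata for a service and record who looked at it."""
        entry = self._entries[name]
        if not entry.allowed_roles & roles:
            raise PermissionError(f"{user} may not access {name}")
        self.access_log.append((user, name))
        return entry
```

A consumer would call `search` to discover services and `describe` to read structure and lineage; the `access_log` list stands in for the monitoring and auditing bullet.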
Virtual Data Services Layer
A data access layer that abstracts underlying data sources and
exposes them as discrete services to form a ‘data API’
• Different users and developers across the enterprise can access data in a
secure and managed fashion and share a common data ‘model’
• Provides secure and managed access to data across the enterprise
• Provides consistency of data
• Hides complexity, format, and location of actual data sources
• Supports many consumption protocols and patterns
Example: Single data access layer for all development teams to avoid
‘hunting down and interpreting data differently by project’
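
The idea of a single access layer that hides source format and location can be sketched as follows. This is a toy illustration under stated assumptions, not how any data virtualization product is implemented: the adapters, the `cust` table, the stubbed SaaS records, and the `DataServicesLayer` class are all hypothetical.

```python
import sqlite3
from typing import Callable, Iterable

Row = dict[str, object]
Adapter = Callable[[], Iterable[Row]]


def sqlite_customers(conn: sqlite3.Connection) -> Adapter:
    """Adapter for a relational source; maps its columns to the common model."""
    def fetch() -> Iterable[Row]:
        for cust_id, full_name in conn.execute("SELECT cust_id, full_name FROM cust"):
            yield {"customer_id": cust_id, "name": full_name}
    return fetch


def saas_customers(records: list[dict]) -> Adapter:
    """Adapter for a stubbed SaaS source whose records have a different shape."""
    def fetch() -> Iterable[Row]:
        for record in records:
            yield {"customer_id": record["id"], "name": record["displayName"]}
    return fetch


class DataServicesLayer:
    """Consumers ask for a named service and never see the sources behind it."""

    def __init__(self) -> None:
        self._services: dict[str, list[Adapter]] = {}

    def publish(self, service: str, *adapters: Adapter) -> None:
        self._services.setdefault(service, []).extend(adapters)

    def query(self, service: str) -> list[Row]:
        rows: list[Row] = []
        for adapter in self._services[service]:
            rows.extend(adapter())
        return rows


# Usage: two differently shaped sources exposed as one 'customers' service.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cust (cust_id INTEGER, full_name TEXT)")
conn.execute("INSERT INTO cust VALUES (1, 'Ada')")

layer = DataServicesLayer()
layer.publish(
    "customers",
    sqlite_customers(conn),
    saas_customers([{"id": 2, "displayName": "Grace"}]),
)
customers = layer.query("customers")
```

Every project that calls `layer.query("customers")` gets the same column names and meaning, which is the point of the shared data 'model'.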
Data Services Layer
[Diagram: Enterprise Apps access the Data Services Layer over SQL (JDBC/ODBC), RESTful Web Services, SOAP, JMS, etc.; underlying sources: Operational Systems, Analytical Systems, Big Data, External/SaaS Systems.]
Benefits of Data Services
• Agility
• Rapid development, service reuse, quicker time-to-value
• Data Integration
• Combine data to provide data ‘as needed’ not ‘as stored’
• Aligned with logical data models
• Data Quality
• Data consistency, common ‘model’
• Single Point of Interaction
• Users don’t need direct access to data sources, better management and
security
Challenges of Data Services
• Security
• How secure is the data? How is access controlled?
• Privacy
• How is PII protected? How can you audit access compliance?
• Performance/QoS
• Does the data services layer ‘get in the way’? How does it impact
performance? And QoS/SLAs?
• Data Governance and Veracity
• How do you know that the data is ‘good’?
Implementing Data Services
• Data services can be implemented using a
number of different technologies:
1. ESB/SOA
2. ETL
3. MDM
4. Data Virtualization
• Typically it will be one or more of the above
Different Technologies
Data Services with Data Virtualization
• Optimized for data services
• Configuration and not coding
• Rapid development and time-to-value
• Supports multiple delivery styles
• Real-time/right-time, batch/file, etc.
• Multiple protocols – SQL (JDBC/ODBC), Web Services (REST/SOAP), …
• Complements other technologies
• MDM exposed as services through data virtualization
• Combined with an ESB for process flows
The Foundation for the Data Services Marketplace
Data Services Demo
Addressing the Challenges
Challenges of Data Services
• Security & Privacy
• How secure is the data? How is access controlled?
• How is PII protected? How can you audit access compliance?
• Performance & QoS
• Does the data services layer ‘get in the way’? How does it impact
performance?
• How can we control the resources to comply with SLAs?
• Data Governance & Veracity
• How do we know that the data is ‘good’?
Security & Privacy
Challenges of Data Services
Security in Denodo
Overview
Authentication
• Pass-through authentication
• Kerberos and Windows SSO
• OAuth, SPNEGO
Authentication
• Standard JDBC/ODBC security
• Kerberos and Windows SSO
• Web Service security
LDAP
Active Directory
Role based Authentication
Guest, employee, corporate
Schema-wide Permissions
Data Specific Permissions
(Row, Column level, Masking)
Policy Based Security
Data in motion
• SSL/TLS
Data in motion
• SSL/TLS
Encrypted
data at rest
• Cache
• Swap
Security in Denodo
Data in Motion – secure channels
• Using SSL/TLS
• Client-to-Denodo and Denodo-to-source
• Available for all protocols (JDBC, ODBC, ADO.NET and WS)
• WS security: Basic, Digest, SPNEGO (Kerberos), integration with LDAP
Data at Rest – secure storage
• Cache: third-party database, which can leverage its own encryption mechanism
• Swapping to disk: serialized data is temporarily stored in a configurable folder that can be
encrypted by the OS
Encryption/Decryption
• Support for custom decryption for files and web services
• Transparent integration with RDBMS encryption
Securing data
Security in Denodo
Authentication
• Native and LDAP/Active Directory based
• Support for Kerberos and Windows SSO
Authorization
• Virtual Database
• View
• Row and Column level authorization
• Masking
• Custom policies for specific security constraints and integration with external policy servers
Roles
• Integration with LDAP/AD groups
• Role hierarchies supported
Pass-through session credentials
• Leverage existing source privileges
Authentication and Authorization
Role-Based Granular Privileges
Security In Denodo
Advanced Selective Data Masking
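
Row-level filtering, column-level visibility, and masking can be combined per role. The sketch below is a generic illustration, not Denodo's configuration model: the `PRIVILEGES` table, the role names, the column names, and the masking rule (keep the last four characters) are all assumptions.

```python
Row = dict[str, object]


def mask_value(value: object, visible: int = 4) -> str:
    """Replace all but the last `visible` characters with '*'."""
    text = str(value)
    keep = min(visible, len(text))
    return "*" * (len(text) - keep) + text[len(text) - keep:]


# Hypothetical per-role privileges: visible columns, masked columns, row filter.
PRIVILEGES = {
    "employee": {
        "columns": {"name", "office", "ssn"},
        "masked": {"ssn"},
        "row_filter": lambda row: row["office"] == "New York",
    },
    "corporate": {
        "columns": {"name", "office", "ssn", "salary"},
        "masked": set(),
        "row_filter": lambda row: True,
    },
}


def apply_privileges(rows: list[Row], role: str) -> list[Row]:
    """Drop rows the role may not see, hide columns, and mask sensitive values."""
    priv = PRIVILEGES[role]
    result: list[Row] = []
    for row in rows:
        if not priv["row_filter"](row):
            continue
        visible_row: Row = {}
        for column, value in row.items():
            if column not in priv["columns"]:
                continue
            visible_row[column] = mask_value(value) if column in priv["masked"] else value
        result.append(visible_row)
    return result
```

Applying the privileges in the access layer, rather than in each consuming application, keeps the masking rule in one place.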
Security In Denodo
Custom
Policy
Conditions satisfied
Security: applies custom security
policies
• If person accessing data has role of
'Supervisor' and location is 'New
York', then show compensation
information for employees in the
New York office only.
Enforcement: rejects/filters
queries by specified criteria like
user priority, cost, time of day etc.
• If the production batch window runs
from 3 am - 6 am, there is
increased load on production
servers at this time. So, all queries
on these servers can be blocked
during this time to prevent failure of
a process.
Data consuming users, Apps
Query
Accept / add filters
Reject
Security - Custom Policies
Interception of queries before they are executed
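
The two examples on this slide (filtering compensation data by the supervisor's office, and rejecting production queries during the 3 am to 6 am batch window) can be expressed as a small policy chain that runs before a query executes. This is a sketch of the general pattern only; `QueryContext`, `Decision`, the policy functions, and the choice to reject non-supervisors are assumptions, not Denodo's custom policy API.

```python
from dataclasses import dataclass
from datetime import time
from typing import Callable, Optional


@dataclass
class QueryContext:
    user_roles: set[str]
    user_location: str
    target: str            # hypothetical name of the view or server queried
    submitted_at: time


@dataclass
class Decision:
    accepted: bool
    extra_filter: Optional[tuple[str, str]] = None   # (column, required value)
    reason: str = ""


Policy = Callable[[QueryContext], Optional[Decision]]

BATCH_START, BATCH_END = time(3, 0), time(6, 0)   # batch window from the slide


def batch_window_policy(ctx: QueryContext) -> Optional[Decision]:
    """Reject queries on production servers during the batch window."""
    if ctx.target == "production" and BATCH_START <= ctx.submitted_at < BATCH_END:
        return Decision(False, reason="production batch window")
    return None


def compensation_policy(ctx: QueryContext) -> Optional[Decision]:
    """Supervisors see compensation rows for their own office only."""
    if ctx.target != "compensation":
        return None
    if "Supervisor" in ctx.user_roles:
        return Decision(True, extra_filter=("office", ctx.user_location))
    # Assumption: anyone without the Supervisor role is rejected.
    return Decision(False, reason="compensation requires the Supervisor role")


def evaluate(ctx: QueryContext, policies: list[Policy]) -> Decision:
    """The first policy that returns a decision wins; otherwise accept as-is."""
    for policy in policies:
        decision = policy(ctx)
        if decision is not None:
            return decision
    return Decision(accepted=True)
```

Returning the filter as a (column, value) pair rather than a SQL string leaves it to the engine to bind the value safely.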
Performance & QoS
Challenges of Data Services
Resource Manager
Apply resource restrictions based on a set of rules
• Rules classify sessions into groups
• By user, role, application, IP, time of the day, etc.
• E.g. Connections from application ‘app1’ coming from users with role
‘reporting’ are assigned to a group
• Apply restrictions for each group.
• Change priority, change concurrency settings, change max timeouts, etc.
Controlled Resource Allocation
Resource Manager
Controlled Resource Allocation
1. Define a rule that is triggered for “app1” and users
with the role “reporting”
2. For requests that match the rule, if CPU usage is
greater than 85%, apply the following:
• Reduce thread priority
• Reduce the number of concurrent requests
• Limit the number of queued queries
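
The rule above (sessions from "app1" with role "reporting", restricted when CPU usage exceeds 85%) can be sketched as a rule lookup. The class names, the numeric defaults, and the specific restricted values are assumptions made for illustration; they are not the Resource Manager's actual settings.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Session:
    application: str
    roles: set[str]


@dataclass(frozen=True)
class Restrictions:
    thread_priority: int = 5            # hypothetical default priority
    max_concurrent: int = 20            # hypothetical default concurrency
    max_queued: Optional[int] = None    # None means no queue limit


@dataclass
class Rule:
    application: str
    role: str
    cpu_threshold: float                # percent CPU above which the rule applies
    restricted: Restrictions

    def matches(self, session: Session) -> bool:
        return session.application == self.application and self.role in session.roles


DEFAULT = Restrictions()


def plan_for(session: Session, cpu_usage: float, rules: list[Rule]) -> Restrictions:
    """Return the restrictions of the first matching rule whose threshold is exceeded."""
    for rule in rules:
        if rule.matches(session) and cpu_usage > rule.cpu_threshold:
            return rule.restricted
    return DEFAULT


# The slide's example: lower priority, fewer concurrent requests, bounded queue.
reporting_rule = Rule(
    application="app1",
    role="reporting",
    cpu_threshold=85.0,
    restricted=Restrictions(thread_priority=1, max_concurrent=5, max_queued=10),
)
```

Sessions that do not match any rule, or that match while CPU usage is at or below the threshold, keep the default plan.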
Performance Features
Data Provisioning Layer
Selective Materialization
Intelligent Caching of only the most relevant and often used
information
Streaming & pagination
Operate on data in streaming mode for a low memory
footprint. Paginate responses to control the size of datasets
Parallelism
Parallel access to disparate sources to minimize latency
NESTED JOINs for concurrent access to sources with
restricted query capabilities
Optimized Resource Management
Smart allocation of resources to handle high concurrency
Throttling to control and mitigate source impact
Resource plans based on rules
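
Two of the features above, streaming with pagination and parallel access to sources, can be illustrated with standard-library Python. This is a generic sketch of the techniques, not the platform's engine; `paginate`, `fetch_parallel`, and the stub sources are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice
from typing import Callable, Iterable, Iterator

Row = dict[str, object]


def paginate(rows: Iterable[Row], page_size: int) -> Iterator[list[Row]]:
    """Yield fixed-size pages lazily so only one page is held in memory."""
    iterator = iter(rows)
    while page := list(islice(iterator, page_size)):
        yield page


def fetch_parallel(sources: dict[str, Callable[[], list[Row]]]) -> dict[str, list[Row]]:
    """Query every source concurrently instead of one after another."""
    with ThreadPoolExecutor(max_workers=max(1, len(sources))) as pool:
        futures = {name: pool.submit(fetch) for name, fetch in sources.items()}
        return {name: future.result() for name, future in futures.items()}


# Usage with stub sources standing in for real systems.
results = fetch_parallel({
    "crm": lambda: [{"customer_id": 1}],
    "billing": lambda: [{"invoice_id": 10}, {"invoice_id": 11}],
})
pages = list(paginate(({"n": i} for i in range(5)), page_size=2))
```

With concurrent fetches the overall wait tends toward the slowest source rather than the sum of all sources, and the generator keeps memory use bounded by the page size.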
Quality of Service in Real Scenarios
• Multinational insurance & reinsurance company
• Average response time of 80-100ms
• 200+ concurrent queries
• 2 nodes – 4 cores each
• Global semiconductor chip manufacturer
• Enterprise-wide data access layer
• 200+ developers trained in Denodo
• ~50 data sources, 90+ data services published
• Response times under 120ms, well in compliance with their internal SLAs
(200-300ms)
• 128+ cores in production
Data Provisioning Layer
Data Governance & Veracity
Challenges of Data Services
Enterprise Data Governance
Understand the “source of truth” and transformations of every piece of data in the
model
Data lineage
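
Tracing a derived view back to its "source of truth" amounts to walking a dependency graph. The sketch below assumes lineage is available as a mapping from each view to its direct upstream inputs; the view names and the `trace_sources` helper are hypothetical.

```python
# Hypothetical lineage metadata: view -> direct upstream views or sources.
LINEAGE: dict[str, list[str]] = {
    "customer_360": ["crm_customers", "billing_summary"],
    "billing_summary": ["erp_invoices"],
    "crm_customers": [],
    "erp_invoices": [],
}


def trace_sources(view: str, lineage: dict[str, list[str]]) -> set[str]:
    """Return the base sources (nodes with no upstream) that feed `view`."""
    seen: set[str] = set()
    sources: set[str] = set()
    stack = [view]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        upstream = lineage.get(node, [])
        if not upstream:
            sources.add(node)
        stack.extend(upstream)
    return sources
```

Here `trace_sources("customer_360", LINEAGE)` walks through `billing_summary` to reach `erp_invoices`, so both base sources are reported.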
Customer Success Stories
DrillingInfo
• SaaS-based platform that provides business intelligence and
decision support technology
• Facilitates faster, smarter decisions for the oil and gas upstream
E&P industry
• Headquartered in Austin, Texas. More than 400 employees on 5 continents
• Services 3,000+ companies globally
Overview
DrillingInfo
Architecture
“As a data and business intelligence provider, one of our biggest
challenges is the need to rapidly sell the data that we acquire. The
Denodo Platform enables us to build and deliver data services to our
internal and external consumers within 3–4 hours instead of the 1–2
weeks that it would take with ETL.”
- Jay Heydt, Manager, Drillinginfo
Guardian Life
• Large mutual life insurer with $7.3 billion in capital and $1.5 billion in operating
income in 2015.
• Founded in 1860, the company has paid dividends to policyholders every year
since 1868.
• ~8,000 employees and over 3,000 financial representatives in 70+ agencies
nationwide.
• Offerings:
• Life insurance
• Disability income insurance
• Annuities
• Investments to dental, vision, and 401(k) plans.
Overview
Enterprise Data Marketplace
Q&A
Thanks!
www.denodo.com info@denodo.com
© Copyright Denodo Technologies. All rights reserved
Unless otherwise specified, no part of this PDF file may be reproduced or utilized in any form or by any means, electronic or mechanical,
including photocopying and microfilm, without the prior written authorization from Denodo Technologies.
