Watch full webinar here: https://bit.ly/33B3Zy0
Your business stakeholders need more data, faster data, and more relevant and timely data. They are pushing for a data and analytics self-service model to speed up access to key insights. The CEO has made it clear that this is critical to the continued growth and health of the business, so this is your top priority.
You've done your research and you really like the concept of using Data Virtualization as a data access layer, supporting the type of self-service wanted by the business. The fact that you have data spread across different departments, as well as in the Cloud, makes this idea even more attractive. This will allow you to deliver more data and more timely data to your users - both for analytics and for operational purposes.
However, you have some doubts; you want to deliver more data, but you’re worried about data security and data privacy. You don't want everyone to access just any data...there has to be control.
A self-service model in which you unleash your 'power users' seems worrying, as does the potential impact on backend systems. You don't think that Data Virtualization will give you both the self-service capabilities that your users want and the control that you need.
But is this true? Do you lose control when you use Data Virtualization? Does your data access become a free-for-all?
Join us for our webinar as we explore the conflicting challenges of openness and control with Data Virtualization and bust the myths surrounding whether data virtualization can provide the control you need.
Myth Busters VI: Data Virtualization makes access easier – but what control do I have?
1. WEBINAR SERIES
Data Virtualization makes access easier – but what control do I have?
Paul Moxon
SVP Data Architectures & Chief Evangelist
Denodo
17th February 2022
5. Previous Myths in the Series
I. Data Virtualization can’t perform with large data sets and complex queries.
II. BI Tools and Data Virtualization are Interchangeable.
III. I’m building a Data Lake. I don’t need Data Virtualization.
IV. I Access My Data Through APIs. Data Virtualization Can’t Do This.
V. My ETL Tools Do Everything That I Need. Why Do I Need Data Virtualization?
https://www.denodo.com/en/webinar-serie/myth-busters-webinar-series
6. Agenda
1. Today’s Myth
2. Origins of the Myth
3. Just the Facts, Ma’am
4. The Proof is in the Pudding
5. Conclusions
6. Q&A
7. Next Steps
9. Business Self-Service is the ‘holy grail’ of BI and Analytics: Give the users the tools to perform their own analysis without waiting for IT to prepare data sets for them.
There have been many attempts…and many failures…
10. Challenges of Self-Service Initiatives
NO STANDARDS OR GOVERNANCE
A TOWER OF BABEL
• Too many reports
• Duplicate reports
• Conflicting data
• Users don’t trust reports
• Data extract hell
11. Myth: Self-Service = Losing Control of Data
• Losing control of the data
• Who is using the data?
• What data are they using?
• How are they using it?
• Losing control of the source systems
• How can we control the impact on backend systems?
• The $4,500 query
• Controlling *how* the data is queried
13. Four ‘Losing Control’ Issues
1. Losing control of the data
• Who is accessing the data?
• What data are they accessing?
2. Losing control of how the data is used
• How are they using the data?
3. Losing control of backend systems
• Runaway queries, undue load affecting backend
system’s SLAs, etc.
4. Losing control of how users query the data
• The $4,500 query
To Maintain Control...
• Data access controls in Data Virtualization Platform
• Standardized (curated) data available through Data Virtualization Platform
• Control over processing pushed down to underlying systems
• Control over scope of queries, i.e. avoid unfiltered ‘SELECT *’ queries
14. 1. Maintain Control of the Data
• Built-in Role-Based Access Controls (RBAC)
• Control what data can be accessed based on user roles
• Fine-grained privileges – View, Column, Row, Value
• Global Security Policies
• Tag-based policies to control data access
• More flexible than RBAC alone – apply easily to views
and columns
• Reject queries; mask, redact, or hide values
• Pass-thru credentials
• Leverages existing access controls in underlying systems
• User credentials are passed down so that source-level privileges can be enforced
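The tag-based policy idea on this slide can be illustrated with a small, platform-independent sketch. It is not the Denodo API: the `MaskingPolicy` class, the `pii` tag, the role names, and the masking rule (keep the last four characters) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MaskingPolicy:
    """Hypothetical tag-based policy: mask tagged columns unless the user holds an exempt role."""
    tag: str
    exempt_roles: frozenset = frozenset()

    def applies_to(self, column_tags, user_roles) -> bool:
        # The policy fires when the column carries the tag and no exempt role is held.
        return self.tag in column_tags and not (self.exempt_roles & frozenset(user_roles))


def mask(value, visible: int = 4) -> str:
    """Replace all but the last `visible` characters with '*'."""
    text = str(value)
    if visible <= 0:
        return "*" * len(text)
    return "*" * max(len(text) - visible, 0) + text[-visible:]


def apply_policies(row, column_tags, policies, user_roles):
    """Return a copy of `row` with every policy-covered column masked."""
    result = {}
    for column, value in row.items():
        tags = column_tags.get(column, frozenset())
        covered = any(p.applies_to(tags, user_roles) for p in policies)
        result[column] = mask(value) if covered else value
    return result


# Illustrative data only: the column tags, roles, and row are made up.
COLUMN_TAGS = {"ssn": frozenset({"pii"})}
POLICIES = [MaskingPolicy(tag="pii", exempt_roles=frozenset({"compliance_officer"}))]
ROW = {"name": "Ada Example", "ssn": "123-45-6789"}

agent_view = apply_policies(ROW, COLUMN_TAGS, POLICIES, {"cs_agent"})
officer_view = apply_policies(ROW, COLUMN_TAGS, POLICIES, {"compliance_officer"})
```

Because the policy is attached to a tag rather than to a specific view, tagging a new column as `pii` is enough to bring it under the same rule.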
15. 2. Maintain Control of How the Data is Used
• Data Virtualization as Semantic Layer
• Common taxonomy
• Precalculated data values (e.g. asset valuation)
• Expose ‘Approved’ virtual data sets
• Provide definitive data sets for all to use
• Avoid conflicting reports because of different data
definitions
• Monitor who is using data sets
• Recommendation Engines in Data Virtualization
Platform will suggest best data sets to users
16. 3. Maintain Control of the Backend Systems
• Control impact on source systems using Resource
Manager
• Create resource restrictions to control interactions
with source systems
• Limit query concurrency, duration, priorities, rows
in result set
• Create rules to determine when restrictions are
active
• Based on user role, access method, time of day, etc.
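As a rough illustration of restrictions that are only active under certain rules, the sketch below caps the number of rows returned for a given role during a time window. The `Restriction` class, the `self_service` role, the business-hours window, and the 1,000-row cap are hypothetical; a real Resource Manager is configured in the platform rather than coded like this.

```python
from dataclasses import dataclass
from datetime import time
from itertools import islice
from typing import Iterable, Optional


@dataclass(frozen=True)
class Restriction:
    """Hypothetical restriction: caps rows returned for some roles during a time window."""
    roles: frozenset
    start: time
    end: time
    max_rows: int

    def is_active(self, user_roles, now: time) -> bool:
        # Active only if the user holds a targeted role and `now` is inside the window.
        return bool(self.roles & frozenset(user_roles)) and self.start <= now < self.end


def row_limit(restrictions, user_roles, now: time) -> Optional[int]:
    """Return the tightest row cap among active restrictions, or None if unrestricted."""
    caps = [r.max_rows for r in restrictions if r.is_active(user_roles, now)]
    return min(caps) if caps else None


def run_query(rows: Iterable, restrictions, user_roles, now: time) -> list:
    """Materialize the result set, truncated to the active row cap if there is one."""
    limit = row_limit(restrictions, user_roles, now)
    return list(rows if limit is None else islice(rows, limit))


# Illustrative rule: self-service users get at most 1,000 rows during business hours.
BUSINESS_HOURS = Restriction(frozenset({"self_service"}), time(8), time(18), max_rows=1000)

daytime = run_query(range(5000), [BUSINESS_HOURS], {"self_service"}, time(10, 30))
overnight = run_query(range(5000), [BUSINESS_HOURS], {"self_service"}, time(22, 0))
```

The same pattern extends to the other limits named on the slide (concurrency, duration, priority) by adding fields to the restriction and checking them where the query is dispatched.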
17. 4. Maintain Control of How the Data is Queried
• Force query filters to restrict scope of query
• Stop the unconstrained ‘SELECT *’ query
• View Parameters can be used to require the user to
specify a filter value
• Force a restricted scope of query
• Use for self-service views to control usage
• Unrestricted views for ‘trusted’ users
• No forced filtering
• For users who understand implications of their queries
on cost and performance in underlying system
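The difference between a parameterized self-service view and an unrestricted view can be sketched as two functions. This is an analogy, not how a view parameter is declared in any specific product; the sample rows, the `state` parameter, and the `trusted` role name are assumptions.

```python
# Illustrative rows only.
CUSTOMERS = [
    {"id": 1, "state": "CA"},
    {"id": 2, "state": "NY"},
    {"id": 3, "state": "CA"},
]


class MissingFilterError(ValueError):
    """Raised when a self-service query omits the mandatory filter value."""


def customers_by_state(rows, state=None):
    """Self-service view: 'state' is mandatory, so an unfiltered scan is impossible."""
    if not state or not state.strip():
        raise MissingFilterError("a 'state' value is required to query this view")
    wanted = state.strip().upper()
    return [row for row in rows if row["state"] == wanted]


def customers_unrestricted(rows, user_roles):
    """Unrestricted view, reserved here for a hypothetical 'trusted' role."""
    if "trusted" not in user_roles:
        raise PermissionError("unrestricted view is limited to trusted users")
    return list(rows)


california = customers_by_state(CUSTOMERS, "ca")
everything = customers_unrestricted(CUSTOMERS, {"trusted"})
```

Calling `customers_by_state(CUSTOMERS)` without a state raises `MissingFilterError`, which is the behavior that stops the unconstrained query before it reaches the underlying system.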
19. Scenario
• Users in a global insurance company
• Customer data held in US (Redshift/AWS) and Germany
(CosmosDB/Azure)
• US Customer Service agents (acme_cs_us) should only see US
customer data
• PII data (e.g. SSN) must be masked or redacted
• German Customer Service agents (acme_cs_de) should only see
German customer data
• PII data (e.g. Steuer-ID) must be masked or redacted
• Data analysts (acme_analyst) at Corporate HQ can see all
customer data
• Customer data must be normalized
• PII data must be masked or redacted
• Queries must be constrained in scope (to ‘state’)
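The row-level part of this scenario can be expressed as a role-to-country mapping. The role names come from the slide; the country codes, the sample customers, and the helper functions are assumptions, and PII masking and the state constraint are deliberately left out of this sketch.

```python
# Roles from the scenario mapped to the customer countries they may see (codes assumed).
ROLE_COUNTRIES = {
    "acme_cs_us": {"US"},
    "acme_cs_de": {"DE"},
    "acme_analyst": {"US", "DE"},
}

# Made-up customers standing in for the US and German sources.
CUSTOMERS = [
    {"id": 101, "country": "US", "state": "TX"},
    {"id": 102, "country": "DE", "state": "Bayern"},
    {"id": 103, "country": "US", "state": "CA"},
]


def visible_countries(user_roles):
    """Union of the countries granted by each role; unknown roles grant nothing."""
    allowed = set()
    for role in user_roles:
        allowed |= ROLE_COUNTRIES.get(role, set())
    return allowed


def filter_customers(rows, user_roles):
    """Return only the rows whose country the user's roles allow."""
    allowed = visible_countries(user_roles)
    return [row for row in rows if row["country"] in allowed]


us_agent_rows = filter_customers(CUSTOMERS, {"acme_cs_us"})
de_agent_rows = filter_customers(CUSTOMERS, {"acme_cs_de"})
analyst_rows = filter_customers(CUSTOMERS, {"acme_analyst"})
```

Defaulting unknown roles to an empty set means a user with no recognized role sees no customer rows at all.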
20. Demo Flow
• From Customer Service agent perspective, ensure that RBAC only allows
them to see appropriate customer data (US or German customers,
respectively)
• Use Global Security Policies to perform dynamic data masking and redaction
of customer PII data
• Use Resource Manager to demonstrate how we can control impact on
backend systems
• e.g. restrict number of rows in results returned
• Use View Parameter to force restrictions on queries on combined customer
data
• e.g. constrain view to customers in user-specified state
23. Self-Service Doesn’t Mean Loss of Control of Data
• Data Virtualization Platforms do make data more available to users
• They are an essential component of any self-service initiative
• Data Virtualization Platforms contain many features that allow you to keep control of
your data
• Role-Based Access Control and tag-based Global Security Policies
• Resource Manager to limit pressure on underlying source systems
• View Parameters for constrained ‘self-service’ views
• Data Virtualization delivers Self-Service with Guardrails