
In:Confidence 2019 - Balancing the conflicting objectives of data access and data privacy

Shane Lamont, Chief Technology Officer - Big Data and Cloud at HSBC Data Services, talks about how to balance conflicting objectives of data access and data privacy on the In:Confidence 2019 main stage (April 4th at Printworks, London).

Published in: Data & Analytics

  1. Balancing Data Access and Data Privacy in Analytics Environments. In:Confidence, April 2019. PUBLIC
  2. What environments will we cover today?
     Environment types, with their contents and use:
     • Production environments: real data, real risks, very valuable, needs protection. Normally one, plus appropriate contingency measures.
     • Test environments: real or pre-built synthetic data. Used to verify that the system works as intended. Normally one per project / activity.
     • Development environments: synthetic or anonymised data used by developers to test software. May have many of these for short periods of time.
  3. We want to provide customers with great services through analytics. What’s the challenge?
     Diagram: Customers, Services, Data Scientists, Access Controls, Data.
     • “I want great services and data privacy”
     • “I want data access to provide great services”
     • “I want your data!”
     • “I want to know how much my [boss, neighbour, father-in-law] earns and what they spend it on”
  4. We add controls. What’s the challenge?
     Controls = Identity + Approved Limited Access to sensitive fields

     Customers:
     | first | last | e-mail | nid | dob | address | occupation |
     | Bob | Smith | bob@smith.com | UK-151 | 23-Sep-67 | 999 Letsby Avenue | Policeman |
     | Iva | House | iva@house.com | UK-23B | 07-Nov-74 | 23b Maddup Avenue | Homeowner |
     | … | … | .. | … | … | … | … |
     | S | Holmes | sherlock@d.com | UK-221B | 06-Jan-54 | 221B Baker Street | Detective |

     Accounts / Transactions:
     | DR/CR | Amount | Counterparty | Type | Country | Account |
     | DR | 100 | Greengrocer | CARD | UK | 12348943 |
     | CR | 3000 | HSBC | SAL | UK | 23954804 |
     | DR | 500 | Airplane Company | CARD | ES | 23452345 |
     | DR | 500 | Political Party | DD | EG | 33445566 |

     • Does a Marketing Analyst need National ID?
     • Does a Product Placement Analyst need Salary? Account?
     • Does a Credit Card Analyst need Counterparty?
     • Does a Financial Crime Analyst need everything?
  5. We add a few views. What’s the challenge?
     Views for each role (simple example with 4 views: Marketing, Product, Credit, Fin Crime).
     Grid of roles (Marketing, Product, Credit, Fin Crime, Others ……) against fields (first, last, e-mail, nid, dob, address, occupation).
     However, there is complexity from many dimensions: Roles, Technology, Geography, Business, Regulatory, Data Privacy.
  6. We need a lot of views. How do we do that?
     Types of views:
     • Files (redacted or not)
     • Database views
     • Context aware
     Cloud helps with all of these, but context aware helps with complexity.
     Figure: columns C1 to C11.
  7. If ‘context aware’ views can help, how could they work in theory (and in practice)?
     Approach (MVP1.0….):
     1. Identify your critical data elements (CDEs)
     2. Catalogue / map your data to your CDEs
     3. Identify your roles and map them to the CDEs
     4. Add context to Roles and Data
     5. Create an access control layer (ACL)
     6. Access the data only through the ACL
     Context:
     • “What you are allowed to access”
     • Organisational level
     • Identifies data sets and roles
     • E.g. UK Analysts see UK data
     Content:
     • “What you are allowed to see”
     • Data set level
     • Your role can see these fields
     • E.g. UK Analysts see name, DoB, ….
  8. Practical steps towards context aware views (1)
     Steps: 1. Identify CDEs 2. Map data to CDEs 3. Identify roles and privileges 4. Add context to Roles and Data 5. Create access control layer (ACL) 6. Call the ACL 7. Return the data

     Step 1, critical data elements: Name, e-mail, National ID, Date of Birth, Address, Occupation, …..
     Non-critical data elements: Shoe size, Comments, Country of residence, everything else …

     Step 2, map data to CDEs:
     | Data Identifier | Is CDE | CDE Type |
     | DB.TABLE.COLUMN | Y/N | List |
     | UK.CUSTDB.FNAME | Y | Name |
     | UK.CUSTDB.EMAIL1 | Y | e-mail |
     | UK.CUSTDB.EMAIL2 | Y | e-mail |
     | UK.CUSTDB.DOB | Y | Date of Birth |
     | UK.CUSTDB.PADDR | Y | Address |
     | UK.CUSTDB.OCC | Y | Occupation |
     | UK.CUSTDB.Preference | N | N/A |
     | UK.CUSTDB.INTERESTS | N | N/A |
     | UK.CUSTDB.FOOD | N | N/A |
     | ……. | | |

     Step 3, roles and privileges:
     | Role | Allowed to see |
     | Marketing | Name |
     | Marketing | Address |
     | Financial Crime | Name |
     | Financial Crime | e-mail |
     | Financial Crime | National ID |
     | Financial Crime | Date of Birth |
     | Financial Crime | Address |
     | Financial Crime | Occupation |
     | Credit Analyst | Address |
     | Credit Analyst | Occupation |
     | Product | e-mail |
  9. Practical steps towards context aware views (2)
     Step 4: add context to Roles and Data. This context helps you with decisions for access / no access.

     | Entity Key | Entity Value | Element Key | Element Value |
     | Role | Marketing | Allowed Countries | UK |
     | Role | Marketing | Allowed Countries | US |
     | Role | Marketing_HK | Allowed Countries | HK |
     | Role | Marketing | Member | Bob Smith |
     | Person | Bob Smith | Home Country | UK |
     | DataSet | UK Customers | Data Owner | UK |
     | DataSet | US Customers | Data Owner | US |
     | DataSet | HK Customers | Data Owner | HK |

     A. Marketing role can access UK & US. HK Customers is ‘HK’ owned. No deal.
     B. Bob Smith is in Marketing. Marketing is allowed to see UK data. UK Customers? Deal.
  10. A simple access control layer
     Steps 5 to 7: create the access control layer (ACL), call the ACL, return the data.
     Flow:
     • Request data.
     • Access context and data dictionary (context, then content).
     • CDE? If N, return field.
     • If Y: allowed? If Y, return field. If N, redact field and return redacted field.
     • Data returned.
  11. What would the data returned look like to different roles?
     | Role | first | last | e-mail | nid | dob | address | occupation |
     | Marketing | Bob | Smith | XXX@XXX.XXX | REDACT | 01-Jan-00 | 999 Letsby Avenue | REDACT |
     | Marketing | Iva | House | XXX@XXX.XXX | REDACT | 01-Jan-00 | 23b Maddup Avenue | REDACT |
     | Marketing | S | Holmes | XXX@XXX.XXX | REDACT | 01-Jan-00 | 221B Baker Street | REDACT |
     | Product | REDACT | REDACT | bob@smith.com | REDACT | 01-Jan-00 | REDACT | REDACT |
     | Product | REDACT | REDACT | iva@house.com | REDACT | 01-Jan-00 | REDACT | REDACT |
     | Product | REDACT | REDACT | sherlock@d.com | REDACT | 01-Jan-00 | REDACT | REDACT |
     | Credit | REDACT | REDACT | XXX@XXX.XXX | REDACT | 01-Jan-00 | 999 Letsby Avenue | Policeman |
     | Credit | REDACT | REDACT | XXX@XXX.XXX | REDACT | 01-Jan-00 | 23b Maddup Avenue | Homeowner |
     | Credit | REDACT | REDACT | XXX@XXX.XXX | REDACT | 01-Jan-00 | 221B Baker Street | Detective |
     | Fin Crime | Bob | Smith | bob@smith.com | UK-151 | 23-Sep-67 | 999 Letsby Avenue | Policeman |
     | Fin Crime | Iva | House | iva@house.com | UK-23B | 07-Nov-74 | 23b Maddup Avenue | Homeowner |
     | Fin Crime | S | Holmes | sherlock@d.com | UK-221B | 06-Jan-54 | 221B Baker Street | Detective |
  12. Wrap up and thoughts for the audience
     Summary:
     • We want to improve customer service through great analytics
     • But we need to ensure we have appropriate controls
     • Views help with this, but traditional approaches create lots of complexity and admin
     • 'Context aware views' are one way of managing this complexity and reducing admin
     • Considering data context & data content enables granular roles
     • An access layer that brokers data request & response provides gateway control
     • Solutions can be simplified (greatly) with consideration of different environments
     Takeaway thoughts:
     • Do you need this?
     • How would you implement this?
     • What contexts are important to you?
     • How would this apply to streaming in Prod?
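
The access control layer described on slides 8 to 11 can be sketched in a few lines of Python. This is a minimal illustration, not the implementation presented in the talk: the names `DATA_DICTIONARY`, `ROLE_CDES`, `ROLE_COUNTRIES`, `DATASET_OWNER`, `context_allows`, `redact_row` and `request_data` are all hypothetical, the tables are copied from the slide examples, and the choice to redact any column missing from the data dictionary is an assumption made here.

```python
# Steps 1-2: data dictionary mapping each column to its CDE type.
# None marks a non-critical element. Column names follow the slide tables.
DATA_DICTIONARY = {
    "first": "Name", "last": "Name", "e-mail": "e-mail",
    "nid": "National ID", "dob": "Date of Birth",
    "address": "Address", "occupation": "Occupation",
    "interests": None,
}

# Step 3: which CDE types each role is allowed to see (slide 8).
ROLE_CDES = {
    "Marketing": {"Name", "Address"},
    "Product": {"e-mail"},
    "Credit Analyst": {"Address", "Occupation"},
    "Financial Crime": {"Name", "e-mail", "National ID",
                        "Date of Birth", "Address", "Occupation"},
}

# Step 4: context for roles and data sets (slide 9).
ROLE_COUNTRIES = {"Marketing": {"UK", "US"}, "Marketing_HK": {"HK"}}
DATASET_OWNER = {"UK Customers": "UK", "US Customers": "US",
                 "HK Customers": "HK"}

# Redaction placeholders as they appear on slide 11.
PLACEHOLDERS = {"e-mail": "XXX@XXX.XXX", "Date of Birth": "01-Jan-00"}
DEFAULT_PLACEHOLDER = "REDACT"


def context_allows(role, dataset):
    """Context check: may this role access this data set at all?"""
    return DATASET_OWNER.get(dataset) in ROLE_COUNTRIES.get(role, set())


def redact_row(role, row):
    """Content check: return non-CDEs and allowed CDEs, redact the rest."""
    allowed = ROLE_CDES.get(role, set())
    result = {}
    for column, value in row.items():
        # Assumption: a column missing from the dictionary is redacted.
        cde_type = DATA_DICTIONARY.get(column, "UNMAPPED")
        if cde_type is None or cde_type in allowed:
            result[column] = value
        else:
            result[column] = PLACEHOLDERS.get(cde_type, DEFAULT_PLACEHOLDER)
    return result


def request_data(role, dataset, rows):
    """Steps 5-7: broker a request through the context and content checks."""
    if not context_allows(role, dataset):
        raise PermissionError(f"{role} may not access {dataset}")
    return [redact_row(role, row) for row in rows]


SAMPLE_ROW = {
    "first": "Bob", "last": "Smith", "e-mail": "bob@smith.com",
    "nid": "UK-151", "dob": "23-Sep-67",
    "address": "999 Letsby Avenue", "occupation": "Policeman",
}

# Example B from slide 9: Marketing requesting UK Customers is allowed.
print(request_data("Marketing", "UK Customers", [SAMPLE_ROW]))
```

Keeping the context check (may this role touch this data set) separate from the content check (which fields the role sees) mirrors the Context / Content split on slide 7. Requesting `"HK Customers"` as `"Marketing"` raises `PermissionError`, which corresponds to example A on slide 9.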
