Copyright © 2017, Oracle and/or its affiliates. All rights reserved.
MySQL Data Masking
Georgi “Joro” Kodinov
MySQL SrvGen Team Lead
In MySQL Enterprise
Safe Harbor Statement
The following is intended to outline our general product direction. It is intended for
information purposes only, and may not be incorporated into any contract. It is not a
commitment to deliver any material, code, or functionality, and should not be relied upon
in making purchasing decisions. The development, release, and timing of any features or
functionality described for Oracle’s products remains at the sole discretion of Oracle.
Agenda
• What is Data Masking and Why Should I Care?
• MySQL Enterprise Masking
• Questions? Suggestions?
What is Data Masking?
"Data masking is the process of hiding original data with random characters
or data" (Wikipedia)
Why Should I Care?
Because of This Guy!
Why Should I Care Again?
Regulatory Compliance
• Regulations
– PCI DSS: Payment Card Data
– HIPAA: Privacy of Health Data
– Sarbanes-Oxley, GLBA, the USA PATRIOT Act:
Financial Data, NPI ("personally identifiable financial information")
– FERPA: Student Data
– EU General Data Protection Regulation (GDPR): Protection of Personal Data
– Data Protection Act (UK): Protection of Personal Data
• Requirements
– Continuous Monitoring (Users, Schema, Backups, etc.)
– Data Protection (Encryption, Privilege Management, etc.)
– Data Retention (Backups, User Activity, etc.)
– Data Auditing (User activity, etc.)
Cost of Data Breaches
Source: Ponemon Institute, 2018
• Small to Medium Breaches (average total cost by number of records)
– Less than 10,000: $1.9M
– 10,000 to 25,000: $2.8M
– 25,001 to 50,000: $4.6M
– Greater than 50,000: $6.3M
• Mega Breaches (average total cost by number of records)
– 20 Million: $199M
– 30 Million: $279M
– 40 Million: $325M
– 50 Million: $350M
Agenda
• What is Data Masking and Why Should I Care?
• MySQL Enterprise Masking
• Questions? Suggestions?
MySQL Enterprise Masking in a Nutshell
• Data Masking
– String masking
– Dictionary based replacement
– Specific masking
  • SSN
  • Payment card: Strict/Relaxed
• Random Data Generators
– Random number within a range
– Email
– Payment card (Luhn check compliant)
– SSN
– Dictionary based generation
• Keep the first symbol, "X" the others
• Keep the last 4 symbols, "*" the others
• Replace anything but the last 12 symbols with '-'
• Replace the first five symbols with '?'
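The four string-masking cases above can be approximated in a few lines. The Python sketch below is not the MySQL implementation; it only models the behavior the callouts describe. The helper names echo MySQL's mask_inner() and mask_outer(), but which calls the original screenshot showed is an assumption, as are the argument order, the handling of margins longer than the string, and the made-up sample values.

```python
def mask_inner(text: str, keep_left: int, keep_right: int, mask_char: str = "X") -> str:
    """Mask the interior of text, leaving keep_left/keep_right characters visible."""
    if keep_left < 0 or keep_right < 0:
        raise ValueError("margins must be non-negative")
    # Assumption: if the margins cover the whole string, nothing is masked.
    if keep_left + keep_right >= len(text):
        return text
    masked = len(text) - keep_left - keep_right
    return text[:keep_left] + mask_char * masked + text[len(text) - keep_right:]


def mask_outer(text: str, mask_left: int, mask_right: int, mask_char: str = "X") -> str:
    """Mask mask_left leading and mask_right trailing characters, keeping the interior."""
    if mask_left < 0 or mask_right < 0:
        raise ValueError("margins must be non-negative")
    # Assumption: if the margins cover the whole string, everything is masked.
    if mask_left + mask_right >= len(text):
        return mask_char * len(text)
    return mask_char * mask_left + text[mask_left:len(text) - mask_right] + mask_char * mask_right


# Keep the first symbol, "X" the others
print(mask_inner("ArthurDent", 1, 0))             # AXXXXXXXXX
# Keep the last 4 symbols, "*" the others
print(mask_inner("4111111111111111", 0, 4, "*"))  # ************1111
# Replace anything but the last 12 symbols with '-'
print(mask_inner("ABCDEFGHIJKLMNOP", 0, 12, "-")) # ----EFGHIJKLMNOP
# Replace the first five symbols with '?'
print(mask_outer("ArthurDent", 5, 0, "?"))        # ?????rDent
```

Both helpers preserve the length of the input, so a masked column still fits its original column definition.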
• Mask a credit card number
• Same, but leave the issuer ID too
• Mask a Social Security Number
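As a rough illustration of these three cases, the Python sketch below models strict card masking (only the last four digits stay visible), relaxed card masking (the six-digit issuer ID stays visible too), and SSN masking. It is a stand-in, not the server code: the function names, the 12 to 19 digit length check, the NNN-NN-NNNN input format, and the sample numbers (which are not real) are all assumptions.

```python
import re


def _check_pan(pan: str) -> None:
    # Assumption: a card number is 12 to 19 ASCII digits with no separators.
    if not (pan.isascii() and pan.isdigit() and 12 <= len(pan) <= 19):
        raise ValueError("expected 12 to 19 digits")


def mask_pan(pan: str) -> str:
    """Strict masking: hide everything except the last four digits."""
    _check_pan(pan)
    return "X" * (len(pan) - 4) + pan[-4:]


def mask_pan_relaxed(pan: str) -> str:
    """Relaxed masking: keep the six-digit issuer ID and the last four digits."""
    _check_pan(pan)
    return pan[:6] + "X" * (len(pan) - 10) + pan[-4:]


def mask_ssn(ssn: str) -> str:
    """Hide all but the last four digits of an SSN written as NNN-NN-NNNN."""
    if not re.fullmatch(r"\d{3}-\d{2}-\d{4}", ssn):
        raise ValueError("expected NNN-NN-NNNN")
    return "XXX-XX-" + ssn[-4:]


card = "4938121234567897395"      # made-up 19-digit number
print(mask_pan(card))             # XXXXXXXXXXXXXXX7395
print(mask_pan_relaxed(card))     # 493812XXXXXXXXX7395
print(mask_ssn("909-63-1234"))    # XXX-XX-1234
```

Relaxed masking trades some privacy for usability: the issuer ID identifies the card network and issuing bank, which support staff often need, while the account-specific middle digits stay hidden.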
MySQL Enterprise Masking
The Recap
MySQL Enterprise Masking and De-Identification
Data Masking
• String data masking
– Mask a substring within a string: ArthXXXXnt
– Mask substrings at the beginning and at the end: XXthurDeXX
• SSN masking: XXX-XX-1234
• Payment Card masking
– Strict: XXXXXXXXXXXXXXX7395, Relaxed: 493812XXXXXXXXX7395
• Dictionary based masking
– gen_blacklist("007", "00designations", "Cover_identity") => Universal Exports
MySQL Enterprise Masking and De-Identification
Random Data Generation
• Random data within range
– gen_range(10000, 20000) => 12503
• Email: kajsm.hamskdk@example.com
• Payment card: 7389026626032990
– Configurable length: 12 to 19 digits
• SSN: 915-63-3858
• US Phone number: 1-555-345-6332
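To make the generators more concrete, here is a minimal Python sketch of three of them: a number within a range, a payment card number that passes the Luhn check, and a US phone number. It illustrates the idea and is not how the server produces its values. The helper names, the inclusive range bounds, and the 1-555-NNN-NNNN phone layout are assumptions.

```python
import random
import string


def gen_range(lower: int, upper: int, rng=random) -> int:
    """Random integer between lower and upper (assumed inclusive on both ends)."""
    return rng.randint(lower, upper)


def luhn_check_digit(body: str) -> str:
    """Return the digit that makes body + digit pass the Luhn check."""
    total = 0
    # Walk right to left; the digit next to the future check digit is doubled first.
    for i, ch in enumerate(reversed(body)):
        d = int(ch)
        if i % 2 == 0:
            d = d * 2 - 9 if d * 2 > 9 else d * 2
        total += d
    return str((10 - total % 10) % 10)


def luhn_valid(number: str) -> bool:
    """Luhn check on a complete number, check digit included."""
    return luhn_check_digit(number[:-1]) == number[-1]


def random_pan(size: int = 16, rng=random) -> str:
    """Random digit string of the given length whose last digit is a valid Luhn check digit."""
    if not 12 <= size <= 19:
        raise ValueError("size must be between 12 and 19")
    body = "".join(rng.choice(string.digits) for _ in range(size - 1))
    return body + luhn_check_digit(body)


def random_us_phone(rng=random) -> str:
    """Random number in the fictional 555 area code (layout is an assumption)."""
    return "1-555-{:03d}-{:04d}".format(rng.randrange(1000), rng.randrange(10000))


print(gen_range(10000, 20000))
print(random_pan(16), random_us_phone())
print(luhn_valid("7389026626032990"))  # True: the sample card number above passes the check
```

Passing a seeded `random.Random` instance as `rng` makes the output reproducible, which is convenient when building test fixtures.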
MySQL Enterprise Masking and De-Identification
Dictionary based data generation, data blacklists
• Load multiple dictionaries
– Maps dictionary file => dictionary name
– In-memory data for faster retrieval
• Generation based on dictionary data
– gen_dictionary("periodictable") => Oxygen
– If the value is on the blacklist, substitute a random value from a second dictionary; otherwise pass the value through unchanged
  • Blacklisted (007), thus randomly substituted from the Jobs dictionary:
    gen_blacklist("007", "Job_mask", "Jobs") => "Accountant"
  • Not blacklisted (Administrator), thus passes through:
    gen_blacklist("Administrator", "Job_mask", "Jobs") => "Administrator"
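The dictionary and blacklist behavior can be modeled with a small in-memory registry. The Python sketch below mirrors the gen_dictionary() and gen_blacklist() semantics described in this deck. It is an approximation under stated assumptions: `load_dictionary` is a hypothetical helper that takes a Python list instead of reading a dictionary file, and the dictionary contents are invented for the example.

```python
import random

# Maps dictionary name => list of terms, held in memory.
_dictionaries = {}


def load_dictionary(name, terms):
    """Hypothetical loader: registers terms under name (the real feature reads a file)."""
    _dictionaries[name] = list(terms)


def gen_dictionary(name, rng=random):
    """Return a random term from the named dictionary."""
    return rng.choice(_dictionaries[name])


def gen_blacklist(value, blacklist_name, replacement_name, rng=random):
    """Replace value with a random term from the replacement dictionary
    if it appears in the blacklist dictionary; otherwise return it unchanged."""
    if value in _dictionaries[blacklist_name]:
        return gen_dictionary(replacement_name, rng)
    return value


load_dictionary("Job_mask", ["007", "Spy"])                        # invented blacklist
load_dictionary("Jobs", ["Accountant", "Engineer", "Librarian"])   # invented replacements

print(gen_blacklist("007", "Job_mask", "Jobs"))            # some term from "Jobs"
print(gen_blacklist("Administrator", "Job_mask", "Jobs"))  # Administrator
```

Keeping the blacklist and the replacement terms in separate dictionaries means one replacement list can serve several blacklists.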
Enterprise Security Architecture
• Workbench
– Model
– Data
– Audit Data
– User Management
• Enterprise Monitor
– Identifies Vulnerabilities
– Security hardening policies
– Monitoring & Alerting
– User Monitoring
– Password Monitoring
– Schema Change Monitoring
– Backup Monitoring
• Data Encryption
– TDE
– Encryption
– PKI
• Firewall
• Enterprise Authentication
– SSO: LDAP, AD, PAM
• Network Encryption
• Enterprise Audit
– Powerful Rules Engine
• Audit Vault
• Strong Authentication
• Access Controls
• Assess, Prevent, Detect, Recover
• Enterprise Backup
– Encrypted
• HA
– InnoDB Cluster
• Thread Pool
– Attack minimization
• Key Vault
– Protect Keys
• Enterprise Masking & De-Identification
– Masking
– Substitute/Subset
– Random Formatted Data
– Blacklisted Data
Agenda
• What is Data Masking and Why Should I Care?
• MySQL Enterprise Masking
• Questions? Suggestions?

Editor's Notes

  • #9 Mega breaches involving millions of compromised records continue to make headlines. For example:
    – The Equifax breach revealed the names, Social Security numbers, birth dates, and addresses of almost half of the total U.S. population. Around 400,000 U.K. customers were also reportedly affected. Final findings revealed a total of 145.5 million exposed records.
    – At SingHealth, Singapore's largest healthcare group, the nonmedical personal data of 1.5 million patients was reportedly accessed, including their national identification numbers, addresses, and dates of birth. The stolen data also included the outpatient medical data of 160,000 patients.
    – In March of this year, the athletic wear company Under Armour disclosed that data tied to its fitness app had been breached, affecting 150 million user accounts. Usernames, email addresses, and passwords were affected.
    – In August of this year, British Airways said that names, addresses, email addresses, and sensitive payment card details from 380,000 transactions were all compromised.
    Though people seem to have become desensitized to news of data breaches, protecting user data has become increasingly important as regulations grow stricter. Companies are no longer only required to announce that their systems have been breached. Under the General Data Protection Regulation (GDPR), those that handle data belonging to European Union (EU) citizens can also face fines of up to 4 percent of their annual turnover.
    Source: https://www.trendmicro.com/vinfo/us/security/news/cyber-attacks/data-breach-101
  • #11 So how many in the room are dealing with regulations and guidelines? How many are dealing with more than one? This is just a subset of the regulations that your company may need to comply with. The new kid on the block is GDPR. If you deal with the EU, no matter where your company resides, you need to comply with it.
  • #12 Data breaches continue to be costlier and result in more consumer records being lost or stolen, year after year. In 2017 there were over 1,500 data breaches in the United States alone and over 170 million records exposed. A data breach involving more than one million compromised records is referred to as a mega breach.
    – A mega breach of 1 million records yields an average total cost of $40 million.
    – A mega breach of 50 million records yields an average total cost of $350 million.
    While we continue to hear about mega breaches, the cost of smaller breaches is also in the millions of dollars. What contributes to these costs:
    – Detection activities such as forensics and auditing services
    – Notification costs, including communicating with regulators
    – Legal costs and regulatory fines
    – Lost business and company reputation
    Sources:
    https://databreachcalculator.mybluemix.net/assets/2018_Global_Cost_of_a_Data_Breach_Report.pdf
    https://www.statista.com/statistics/273550/data-breaches-recorded-in-the-united-states-by-number-of-breaches-and-records-exposed/
  • #17 The credit card numbers shown are not real!
  • #20 gen_blacklist() – searches for the first argument in the first dictionary and, if it is found, returns a random element from the second dictionary; otherwise it returns the original argument. gen_dictionary() – returns a random element from a dictionary.