Dogs and Masks:
The Challenges of Deidentifying
and Masking data
Sandy Dunn, CISO, Blue Cross of Idaho
August 2, 2018
*** Disclaimer ***
The views and opinions in this presentation are my own and do not represent the views or endorsement of my
employer, Blue Cross of Idaho. All the information is publicly available.
https://www.pbs.org/newshour/show/lifestyle-choices-could-raise-your-health-insurance-rates
Last Presentation Summary
My job as CISO
Data is the New Oil
Leverage similar historical problems
Don’t Do Security Stuff without looking at the problem holistically
Data Governance Roles and Responsibilities
CISO
Topics
Capturing requirements
Example methodology
Definitions and terminology
Open discussion
Expand on Data Governance Roles and Responsibilities
Resources for deidentification and masking
Capturing Requirements
HIPAA PHI: List of 18 Identifiers
1. Names
2. All geographical subdivisions smaller than a State
3. All elements of dates (except year) for dates directly related to an individual, including birth date, admission date, discharge date, date of death
4. Phone numbers
5. Fax numbers
6. Electronic mail addresses
7. Social Security numbers
8. Medical record numbers
9. Health plan beneficiary numbers
10. Account numbers
11. Certificate/license numbers
12. Vehicle identifiers and serial numbers, including license plate numbers
13. Device identifiers and serial numbers
14. Web Universal Resource Locators (URLs)
15. Internet Protocol (IP) address numbers
16. Biometric identifiers, including finger and voice prints
17. Full face photographic images and any comparable images
18. Any other unique identifying number, characteristic, or code (note: this does not mean the unique code assigned by the investigator to code the data)
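One way to turn the identifier list into a working requirement is a field-level rule set. The sketch below is only an illustration: the field names are hypothetical, it covers a handful of the 18 categories, and it is not a complete de-identification procedure. It drops the listed identifier fields and keeps only the year of any listed date field.

```python
from datetime import date

# Hypothetical field names; a real schema needs its own mapping
# onto the 18 identifier categories.
DIRECT_IDENTIFIER_FIELDS = {
    "name", "phone", "fax", "email", "ssn",
    "medical_record_number", "health_plan_beneficiary_number",
    "account_number", "ip_address", "url",
}
DATE_FIELDS = {"birth_date", "admission_date", "discharge_date", "date_of_death"}


def strip_identifiers(record: dict) -> dict:
    """Return a copy with identifier fields removed and dates cut down to the year."""
    cleaned = {}
    for field, value in record.items():
        if field in DIRECT_IDENTIFIER_FIELDS:
            continue  # drop the identifier entirely
        if field in DATE_FIELDS and isinstance(value, date):
            cleaned[field] = value.year  # keep only the year element
        else:
            cleaned[field] = value
    return cleaned
```

The function returns a new dictionary, so the original record is left untouched for systems that are authorized to see it.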
State Data Breach
Federal laws related to cybersecurity are sector-specific, meaning
they apply only to a particular industry such as financial or healthcare.
Idaho Data Breach Laws:
Notification Requirements and Penalties
Idaho state law requires businesses to notify affected individuals of a breach as soon as possible, unless a
“good-faith, reasonable, and prompt” investigation reveals that the personal information has not and
will not be misused.
This law also applies to businesses that maintain personal data for another entity.
Businesses that fail to notify can be fined up to $25,000 per breach.
Definition of Protected Information: Combination of (1) name or other identifying info, PLUS (2) one or
more of these "data" elements: SSN; driver's license number; or account number, credit card number,
debit card number if accompanied by PIN, password, or access codes
Notification required only if breaches “materially compromise the security, confidentiality, or integrity
of” PI.
Notification can be written, phone, or electronic
https://hitrustalliance.net/documents/hitrust2017/presentations/May-11-1130am-HITRUST-DeID-Framework_FINAL.pdf
Terms
Data masking, or data obfuscation, is the process of hiding original data behind random or altered characters so that
the resulting data cannot be traced back to the original.
• Static: data tables are loaded to a separate environment, and masking rules are applied to stable (inactive) data. Used for dev / test.
• On-the-fly: data is transferred from environment to environment without touching a disk on its way. The same technique is applied as in
"Dynamic Data Masking", but one record at a time. Most useful for CI/CD environments: it sends small subsets of masked test data from
production to development / test.
• Dynamic: masking happens at runtime, on demand. It is attribute-based and policy-driven.
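As a rough illustration of the dynamic case, the sketch below masks attributes at read time according to a policy, leaving the stored record unchanged. The policy table, role names, and field names are all hypothetical, and a real deployment would normally enforce this in the database or a proxy layer rather than in application code.

```python
# Hypothetical policy: which roles may see each sensitive attribute in the clear.
POLICY = {
    "ssn": {"claims_processor"},
    "email": {"claims_processor", "support"},
}


def mask_value(value: str) -> str:
    """Replace all but the last four characters with asterisks."""
    return "*" * max(len(value) - 4, 0) + value[-4:]


def read_record(record: dict, role: str) -> dict:
    """Build the view of a record that the given role is allowed to see."""
    view = {}
    for field, value in record.items():
        allowed_roles = POLICY.get(field)
        if allowed_roles is None or role in allowed_roles:
            view[field] = value  # not sensitive, or role is authorized
        else:
            view[field] = mask_value(str(value))  # masked on demand
    return view
```

Because masking is applied per request, the same stored row yields different views for different roles, which is what separates this from static masking of a copied data set.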
Techniques
• Substitution: another authentic-looking value is substituted for the existing value.
• Shuffling: similar to substitution, but the substitution set is derived from the same column of data that is being masked. In very
simple terms, the data is randomly shuffled within the column.
• Number and date variance: if the overall data set needs to retain demographic and actuarial data integrity, applying a random numeric
variance of +/- 120 days to date fields would preserve the date distribution but still prevent traceability back to a known entity based on their
known actual date of birth or a known date value of whatever record is being masked.
• Encryption: a key is required to grant visibility to the data.
• Masking out: character scrambling or masking out of certain fields.
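Four of the techniques above (substitution, shuffling, date variance, and masking out) can be sketched with the Python standard library. The function names and the pool of replacement names are assumptions made for illustration. Each function takes an explicit random generator so that runs can be repeated during testing.

```python
import random
from datetime import date, timedelta

# Hypothetical pool of realistic-looking replacement names.
FAKE_NAMES = ["Alex Morgan", "Sam Rivera", "Pat Chen", "Jo Patel"]


def substitute(values, pool, rng):
    """Substitution: replace each value with an authentic-looking one from a pool."""
    return [rng.choice(pool) for _ in values]


def shuffle_column(values, rng):
    """Shuffling: reorder the column's own values so rows no longer line up.

    Note that a random shuffle can leave some values in their original row.
    """
    shuffled = list(values)
    rng.shuffle(shuffled)
    return shuffled


def vary_date(d, rng, max_days=120):
    """Date variance: shift a date by a random offset within +/- max_days."""
    return d + timedelta(days=rng.randint(-max_days, max_days))


def mask_out(value, visible=4, mask_char="X"):
    """Masking out: hide all but the last `visible` characters."""
    hidden = max(len(value) - visible, 0)
    return mask_char * hidden + value[hidden:]
```

For example, `mask_out("123456789")` keeps only the last four digits, while `shuffle_column` keeps the column's overall distribution intact because every original value still appears exactly once.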
Synthetic or hypothetical data: completely made-up data.
https://en.wikipedia.org/wiki/Data_masking
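Synthetic data, as defined above, is never derived from production records. A minimal sketch of a generator follows; the schema, name pools, and value ranges are invented for illustration and are not taken from any real system.

```python
import random

# Invented value pools; nothing here comes from real records.
FIRST_NAMES = ["Alex", "Sam", "Pat", "Jo"]
LAST_NAMES = ["Morgan", "Rivera", "Chen", "Patel"]
PLANS = ["bronze", "silver", "gold"]


def synthetic_member(rng):
    """Build one made-up member record."""
    return {
        "member_id": f"M{rng.randint(0, 999999):06d}",
        "name": f"{rng.choice(FIRST_NAMES)} {rng.choice(LAST_NAMES)}",
        "birth_year": rng.randint(1930, 2010),
        "plan": rng.choice(PLANS),
    }


def synthetic_members(count, seed=0):
    """Build `count` made-up records; the same seed gives the same records."""
    rng = random.Random(seed)
    return [synthetic_member(rng) for _ in range(count)]
```

Seeding the generator makes test data reproducible across environments, which helps when a failing test needs to be rerun against identical input.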
Discussion Topics
How do we get started in driving the importance of Data Security throughout the company?
What does leadership need to do to drive Data Security effectiveness and ensure that Data Security is moving forward?
What is the most important Data Security item we should focus on today?
How do you recommend setting up and managing system access?
What is your process to identify, track and classify data?
How do you work around “Shadow IT” when it comes to Data Security?
Network Segmentation
License issues
Structured vs Unstructured
Information Classification
Data Governance
Business Owner
Legal / Compliance / Enterprise Risk
Data Governance
Cybersecurity
Data Stewardship: Identify data roles & responsibility
Define Requirements SME Audit / Enforce
Structured / Unstructured
Own process / workflow
Requirements How Find / Enforce
Data Classification: Public, Restricted, Confidential
Do Define Monitor use Enforce
Implement Controls
Data Quality Only Good Data Enforce Requirements How
Data Management: Building the full data lifecycle
Do Requirements How Protect
Links to Tools and Papers
NISTIR 8053 De-Identification of Personal Information
https://nvlpubs.nist.gov/nistpubs/ir/2015/nist.ir.8053.pdf
HiTrust De-Identification Framework
https://ecfsapi.fcc.gov/file/60001569792.pdf
A Beginners Guide to Data Masking - Imperva
HTTP://www.poer.ro/wp-content/uploads/2018/01/Camouflage_Data_Masking_Beginners.pdf
Practical Implications of Sharing Data: A Primer on Data Privacy, Anonymization, and De-Identification
https://support.sas.com/resources/papers/proceedings15/1884-2015.pdf
Securing Sensitive Data in Databases & Datalakes Using Cirro Data Puppy
https://s3.amazonaws.com/cirro.com/downloads/cirro-data-migrator-whitepaper.pdf
