ISACA New Delhi, India: Privacy and Big Data

  1. 1. Bridging the Gap Between Privacy and Big Data Ulf Mattsson, CTO Protegrity ulf.mattsson AT protegrity.com
  2. 2. Ulf Mattsson, CTO Protegrity • 20 years with IBM: Research & Development & Global Services • Inventor: Encryption, Tokenization & Intrusion Prevention • Involvement: PCI Security Standards Council (PCI SSC); American National Standards Institute (ANSI) X9, Encryption & Tokenization; International Federation for Information Processing, IFIP WG 11.3 Data and Application Security; ISACA New York Metro chapter
  3. 3. 3
  4. 4. Agenda 1. What is Big Data & Cloud? 2. Risk & Drivers for Data Security 3. The Evolution of Data Security Methods 4. Data De-Identification 5. Off-Shoring & Outsourcing 6. Use Cases & Case Studies
  5. 5. Who is Protegrity? Proven enterprise data protection software leader since the 1990s. Business driven by compliance: • PCI (Payment Card Industry) • PII (Personally Identifiable Information) • PHI (Protected Health Information), HIPAA • State and Industry Privacy Laws. Serving many industries: • Retail, Hospitality, Travel and Transportation • Financial Services, Insurance, Banking • Healthcare • Telecommunications, Media and Entertainment • Manufacturing and Government
  6. 6. Big Data
  7. 7. What is Big Data? Hadoop • Designed to handle the emerging “4 V’s” • Massively Parallel Processing (MPP) • Elastic scale • Usually Read-Only • Allows for data insights on massive, heterogeneous data sets • Includes an ecosystem of components: application layers (Hive, Pig, other), MapReduce, and storage layers (HDFS, physical storage)
  8. 8. Has Your Organization Already Invested in Big Data? Source: Gartner
  9. 9. Cloud
  10. 10. Cloud Services Services usually provided by a third party • Can be virtual, public, private, or hybrid Increasing adoption – up 12% from 2012* Often an outsourced solution, sometimes cross-border Allows for greater accessibility of data and low overhead *Source: GigaOM
  11. 11. Cloud Services and Models Source: NIST, CSA
  12. 12. Drivers for Data Security
  13. 13. Drivers for Data Security • Regulations & Laws: Payment Card Industry Data Security Standard (PCI DSS); National Privacy Laws; Cross-Border & Outsourcing Privacy Laws • Expanding Threat Landscape: Hackers & APT; Internal Threats & Rogue Privileged Users; Excessive Privilege or Security Negligence • Sensitive Data Insight & Usability: Unprotected Sensitive or Restricted Data is Unusable for Marketing, Monetization, Outsourcing, etc. • Vulnerabilities in Emerging Technologies
  14. 14. Regulations & Laws: PCI DSS
  15. 15. PCI Security Standards Council • Founded in 2006 by five major payment card brands (American Express, Discover, JCB, MasterCard, and Visa) • Each card brand's enforcement program issues fines, fees and schedule deadlines • Visa's Cardholder Information Security Program (CISP) http://www.visa.com/cisp • MasterCard's Site Data Protection (SDP) program http://www.mastercard.com/us/sdp/index.html • Discover's Discover Information Security and Compliance (DISC) program http://www.discovernetwork.com/fraudsecurity/disc.html • American Express Data Security Operating Policy (DSOP) http://www.americanexpress.com/datasecurity
  16. 16. PCI DSS • Build and maintain a secure network: 1. Install and maintain a firewall configuration to protect data 2. Do not use vendor-supplied defaults for system passwords and other security parameters • Protect cardholder data: 3. Protect stored data 4. Encrypt transmission of cardholder data and sensitive information across public networks • Maintain a vulnerability management program: 5. Use and regularly update anti-virus software 6. Develop and maintain secure systems and applications • Implement strong access control measures: 7. Restrict access to data by business need-to-know 8. Assign a unique ID to each person with computer access 9. Restrict physical access to cardholder data • Regularly monitor and test networks: 10. Track and monitor all access to network resources and cardholder data 11. Regularly test security systems and processes • Maintain an information security policy: 12. Maintain a policy that addresses information security
  17. 17. PCI DSS 3.0 • Protection of cardholder data in memory • Clarification of key management dual control and split knowledge • Recommendations on making PCI DSS business-as-usual and best practices • Security policy and operational procedures added • Increased password strength • New requirements for point-of-sale terminal security • More robust requirements for penetration testing
  18. 18. PCI DSS Cloud Guidelines. Relevant to all sensitive data that is outsourced to cloud: 1. Clients retain responsibility for the data they put in the cloud 2. Public-cloud providers often have multiple data centers, which may often be in multiple countries or regions 3. The client may not know the location of their data, or the data may exist in one or more of several locations at any particular time 4. A client may have little or no visibility into the controls 5. In a public-cloud environment, one client’s data is typically stored with data belonging to multiple other clients. This makes a public cloud an attractive target for attackers
  19. 19. Regulations & Laws: National Privacy Laws
  20. 20. National Privacy Laws - USA • Health Insurance Portability and Accountability Act (HIPAA) identifiers: 1. Names 2. All geographical subdivisions smaller than a State 3. All elements of dates (except year) related to individual 4. Phone numbers 5. Fax numbers 6. Electronic mail addresses 7. Social Security numbers 8. Medical record numbers 9. Health plan beneficiary numbers 10. Account numbers 11. Certificate/license numbers 12. Vehicle identifiers and serial numbers 13. Device identifiers and serial numbers 14. Web Universal Resource Locators (URLs) 15. Internet Protocol (IP) address numbers 16. Biometric identifiers, including finger prints 17. Full face photographic images 18. Any other unique identifying number
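As a rough illustration of how the identifier categories on this slide might drive de-identification, the Python sketch below drops identifier fields from a record and reduces dates to the year. The field names, `IDENTIFIER_FIELDS`, `DATE_FIELDS`, and `deidentify` are all hypothetical names chosen for this example; they do not come from the presentation, and a real implementation would need to cover every category, including identifiers buried in free text.

```python
from datetime import date

# Hypothetical field names standing in for some of the 18 identifier categories.
IDENTIFIER_FIELDS = {
    "name", "street_address", "phone", "fax", "email", "ssn",
    "medical_record_number", "health_plan_id", "account_number",
    "license_number", "vehicle_id", "device_id", "url", "ip_address",
}
# Hypothetical date fields: only the year may be kept.
DATE_FIELDS = {"birth_date", "admission_date"}


def deidentify(record: dict) -> dict:
    """Return a copy of record without identifier fields and with dates cut to the year."""
    result = {}
    for field, value in record.items():
        if field in IDENTIFIER_FIELDS:
            continue  # redact: drop the identifier entirely
        if field in DATE_FIELDS and isinstance(value, date):
            result[field + "_year"] = value.year  # keep the year only
        else:
            result[field] = value
    return result


if __name__ == "__main__":
    patient = {
        "name": "Joe Smith",
        "ssn": "076-39-2778",
        "state": "CA",
        "birth_date": date(1966, 12, 25),
        "diagnosis": "J45",
    }
    print(deidentify(patient))
```

The state is kept because only geographical subdivisions smaller than a State appear in the list; everything else in the sample record that identifies the person is removed.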
  21. 21. Privacy Laws • 54 International Privacy Laws • 30 United States Privacy Laws
  22. 22. National Privacy Laws - India • Information Technology Act – 2000 (IT Act): requires that the corporate body and Data Processor implement reasonable security practices and standards; IS/ISO/IEC 27001 requirements recognized • Information Technology Act – 2008 (Amended IT Act): damages for negligence and wrongful gain or loss; criminal punishment for disclosing Sensitive Personal Information (SPI) • India Privacy Law – 2011: expanded definition of SPI to passwords, financial data, health data, medical treatment records, and more • Right to Privacy Bill – 2013 (Proposed): increased jail terms & fines for disclosure of SPI; addresses data handled for foreign clients
  23. 23. Regulations & Laws: Cross-Border & Outsourcing Laws
  24. 24. Cross-Border & Outsourcing Laws • The laws of the sending country (i.e., its national privacy laws) apply to data sent across international borders, including outsourced operations • APEC Cross-Border Privacy Laws: non-binding privacy enforcement in Asia-Pacific region
  25. 25. Expanding Threat Landscape
  26. 26. 26
  27. 27. Cyber Criminals Cost India USD 4 Billion. Source: Symantec 2013
  28. 28. 28
  29. 29. http://www.ey.com/Publication/vwLUAssets/EY_-_2013_Global_Information_Security_Survey/$FILE/EY-GISS-Under-cyber-attack.pdf
  30. 30. Sensitive Data Insight & Usability
  31. 31. Vulnerabilities in Emerging Technologies
  32. 32. Holes in Big Data… Source: Gartner
  33. 33. Many Ways to Hack Big Data [Diagram: attack paths into the Hadoop stack] • Threat sources: Hackers; Unvetted Applications or Ad Hoc Processes; Privileged Users • Components: BI Reporting, RDBMS, ETL Tools, Pig (Data Flow), Hive (SQL), Sqoop, MapReduce (Job Scheduling/Execution System), HBase (Column DB), HDFS (Hadoop Distributed File System), Avro (Serialization), Zookeeper (Coordination) • Source: http://nosql.mypopescu.com/post/1473423255/apache-hadoop-and-hbase
  34. 34. The Insider Threat
  35. 35. Sensitive Data Insight & Usability • Big Data and Cloud environments are designed for access and deep insight into vast data pools • Data can be monetized not only by marketing analytics, but through sale or use by a third party • The more accessible and usable the data is, the greater this ROI benefit can be • Security concerns and regulations are often viewed as opponents to data insight
  36. 36. Big Data Vulnerabilities and Concerns • Big Data (Hadoop) was designed for data access, not security • Security in a read-only environment introduces new challenges • Massive scalability and performance requirements • Sensitive data regulations create a barrier to usability, as data cannot be stored or transferred in the clear • Transparency and data insight are required for ROI on Big Data
  37. 37. Cloud Vulnerabilities and Concerns • Public cloud security is often not visible to the client, but the client is still responsible for security • Greater access to shared data sets by more users creates additional points of vulnerability • Data redundancy for high availability, often across multiple data centers, increases vulnerability • Virtualization can create numerous security issues • Transparency and data insight are required for ROI • How do you lock this?
  38. 38. Security Improving but We Are Losing Ground
  39. 39. Breach Discovery Methods. Source: Verizon 2013 Data Breach Investigations Report
  40. 40. The Evolution of Data Security Methods
  41. 41. Evolution of Data Security Methods (over time) • Coarse Grained Security: Access Controls; Volume Encryption; File Encryption • Fine Grained Security: Access Controls; Field Encryption (AES & ); Masking; Tokenization; Vaultless Tokenization
  42. 42. Use of Enabling Technologies [Chart: percentage of respondents using and evaluating each technology: Access controls, Database activity monitoring, Database encryption, Backup / Archive encryption, Data masking, Application-level encryption, Tokenization]
  43. 43. Access Control [Chart: Risk (Low to High) versus Access Privilege Level (Low to High)] Old and flawed: minimal access levels so people can only carry out their jobs
  45. 45. Applying the protection profile to the content of data fields allows for a wider range of authority options
  46. 46. How the New Approach is Different [Chart: Risk (Low to High) versus Access Privilege Level (Low to High)] • Old: minimal access levels, Least Privilege to avoid high risks • New: much greater flexibility and lower risk in data accessibility
  47. 47. Reduction of Pain with New Protection Techniques [Chart: Pain & TCO (High to Low) over time, 1970 to 2010] Input value: 3872 3789 1620 3675 • Strong Encryption (AES, 3DES), output: !@#$%a^.,mhu7///&*B()_+!@ • Format Preserving Encryption (DTP, FPE), output: 8278 2789 2990 2789 • Vault-based Tokenization (format preserving), output: 8278 2789 2990 2789 • Vaultless Tokenization (no vault, greatly reduced key management), output: 8278 2789 2990 2789
  48. 48. Fine Grained Security: Encryption of Fields (Production Systems, Non-Production Systems) • Reversible • Policy Control (Authorized / Unauthorized Access) • Lacks Integration Transparency • Complex Key Management • Example: !@#$%a^.,mhu7///&*B()_+!@
  49. 49. Fine Grained Security: Masking of Fields (Production Systems, Non-Production Systems) • Not reversible • No Policy, Everyone can access the data • Integrates Transparently • No Complex Key Management • Example: 0389 3778 3652 0038
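A minimal sketch of the masking properties listed on this slide, in Python: digits are overwritten with random digits, no key is kept, and nothing is stored that would allow reversal. The function name `mask_digits` and its `keep_last` parameter are assumptions made for this example, not part of any product described in the deck.

```python
import random
from typing import Optional

DIGITS = "0123456789"


def mask_digits(value: str, keep_last: int = 0,
                rng: Optional[random.Random] = None) -> str:
    """Replace each digit with a random digit, optionally keeping the last few.

    Separators (spaces, dashes) stay in place, so the masked value has the
    same format as the input. No mapping is stored, so masking is one-way.
    """
    rng = rng or random.Random()
    total_digits = sum(ch in DIGITS for ch in value)
    seen = 0
    out = []
    for ch in value:
        if ch in DIGITS:
            seen += 1
            if seen > total_digits - keep_last:
                out.append(ch)  # one of the trailing digits to keep
            else:
                out.append(rng.choice(DIGITS))
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    print(mask_digits("3872 3789 1620 3675", keep_last=4))
```

Because the output is random and unkeyed, two runs over the same input generally give different results, which is why masked data suits test systems better than analytics that must join on the protected field.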
  50. 50. Fine Grained Security: Tokenization of Fields (Pseudonymization) • No Complex Key Management • Business Intelligence • Example: 0389 3778 3652 0038 • Production Systems: Reversible; Policy Control (Authorized / Unauthorized Access) • Non-Production Systems: Not Reversible; Integrates Transparently
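To make the tokenization properties on this slide concrete, here is a toy vault-based tokenizer in Python: a random, format-preserving token is issued per value, the mapping is kept in a lookup table (the vault), and reversal is gated by a policy check. The class `TokenVault` and the `authorized` flag are hypothetical simplifications for illustration; a production system would persist the vault securely and enforce policy per user and per field.

```python
import secrets

DIGITS = "0123456789"


class TokenVault:
    """Toy in-memory token vault: value <-> random format-preserving token."""

    def __init__(self) -> None:
        self._token_by_value = {}
        self._value_by_token = {}

    def tokenize(self, value: str) -> str:
        if value in self._token_by_value:
            return self._token_by_value[value]  # same value, same token
        if not any(ch in DIGITS for ch in value):
            raise ValueError("value must contain at least one digit")
        while True:
            # Random digits in place of digits; separators are preserved.
            token = "".join(
                secrets.choice(DIGITS) if ch in DIGITS else ch for ch in value
            )
            # Retry on the (rare) collision with an existing token.
            if token != value and token not in self._value_by_token:
                break
        self._token_by_value[value] = token
        self._value_by_token[token] = value
        return token

    def detokenize(self, token: str, authorized: bool) -> str:
        if not authorized:
            raise PermissionError("caller is not allowed to see the real value")
        return self._value_by_token[token]


if __name__ == "__main__":
    vault = TokenVault()
    token = vault.tokenize("3872 3789 1620 3675")
    print(token, "->", vault.detokenize(token, authorized=True))
```

The token carries no mathematical relationship to the original value; the only way back is through the vault, which is what makes the policy check meaningful.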
  51. 51. Fine Grained Data Security Methods: Tokenization and Encryption are Different • Encryption: approach used is a cipher system, based on cryptographic algorithms and cryptographic keys • Tokenization: approach used is a code system, based on code books and index tokens • Source: McGraw-Hill Encyclopedia of Science & Technology
  52. 52. Fine Grained Data Security Methods: Vault-based vs. Vaultless Tokenization • Footprint: vault-based is large and expanding; vaultless is small and static • High Availability, Disaster Recovery: vault-based requires complex, expensive replication; vaultless requires no replication • Distribution: vault-based is practically impossible to distribute geographically; vaultless is easy to deploy at different geographically distributed locations • Reliability: vault-based is prone to collisions; vaultless has no collisions • Performance, Latency, and Scalability: vault-based will adversely impact performance & scalability; vaultless has little or no latency (fastest industry tokenization)
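The contrast on this slide can be sketched with a toy "vaultless" scheme in Python: instead of a growing value-to-token table, a small set of static substitution tables is derived once from a secret and reused for every value. This is only an illustration of the footprint and no-collision properties; it is not Protegrity's algorithm, it is not cryptographically secure (it uses Python's `random` module and simple chained digit substitution), and the names `build_tables`, `tokenize`, and `detokenize` are invented for the example.

```python
import random
from typing import List

DIGITS = "0123456789"


def build_tables(secret_seed: int, count: int = 16) -> List[List[int]]:
    """Derive a small, static set of digit permutations from a secret seed."""
    rng = random.Random(secret_seed)  # NOT a secure generator; illustration only
    tables = []
    for _ in range(count):
        perm = list(range(10))
        rng.shuffle(perm)
        tables.append(perm)
    return tables


def tokenize(value: str, tables: List[List[int]]) -> str:
    out, prev, pos = [], 0, 0
    for ch in value:
        if ch in DIGITS:
            # The table depends on the position and the previous token digit.
            table = tables[(pos + prev) % len(tables)]
            prev = table[int(ch)]
            out.append(str(prev))
            pos += 1
        else:
            out.append(ch)  # keep separators so the format is preserved
    return "".join(out)


def detokenize(token: str, tables: List[List[int]]) -> str:
    out, prev, pos = [], 0, 0
    for ch in token:
        if ch in DIGITS:
            table = tables[(pos + prev) % len(tables)]
            out.append(str(table.index(int(ch))))  # invert the permutation
            prev = int(ch)
            pos += 1
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    site_a = build_tables(secret_seed=2013)
    site_b = build_tables(secret_seed=2013)  # second site, no replication needed
    token = tokenize("3872 3789 1620 3675", site_a)
    print(token, "->", detokenize(token, site_b))
```

Each step is a permutation of the ten digits, so the whole mapping is one-to-one and distinct inputs can never collide. Two sites that build the tables from the same secret agree on every token without exchanging any per-value state, which is the property the slide attributes to the vaultless approach.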
  53. 53. The Future of Tokenization • PCI DSS 3.0: split knowledge and dual control • PCI SSC Tokenization Task Force: tokenization and use of HSM • Card Brands (Visa, MC, AMEX …): tokens with control vectors • ANSI X9: tokenization and use of HSM
  54. 54. Security of Different Protection Methods [Chart: Security Level (Low to High) for Basic Data Tokenization, Format Preserving Encryption, AES CBC Encryption Standard, and Vaultless Data Tokenization]
  55. 55. Speed of Different Protection Methods [Chart: transactions per second*, on a scale from 100 to 10,000,000, for Vault-based Data Tokenization, Format Preserving Encryption, AES CBC Encryption Standard, and Vaultless Data Tokenization] *Speed will depend on the configuration
  56. 56. Risk Adjusted Data Protection. There is always a trade-off between security and usability. [Matrix: each data security method rated from Worst to Best on Performance, Storage, Security, and Transparency] Methods: System without data protection; Monitoring + Blocking + Obfuscation; Data Type Preservation Encryption; Strong Encryption; Vaultless Tokenization; Hashing; Anonymisation
  57. 57. Data De-Identification
  58. 58. What is de-identification of identifiable data? The solution to protecting identifiable data is to properly de-identify it. [Diagram: a record combining Personally Identifiable Information with Health Information / Financial Information; the identifiable part is redacted, that is, removed] The identifiable portion of the record is de-identified with any number of protection methods such as masking, tokenization, encryption, redaction (removal), etc. The method used will depend on your use case and the reason that you are de-identifying the data.
  59. 59. Identifiable Sensitive Information • Field | Real Data | Tokenized / Pseudonymized • Name | Joe Smith | csu wusoj • Address | 100 Main Street, Pleasantville, CA | 476 srta coetse, cysieondusbak, CA • Date of Birth | 12/25/1966 | 01/02/1966 • Telephone | 760-278-3389 | 760-389-2289 • E-Mail Address | joe.smith@surferdude.org | eoe.nwuer@beusorpdqo.org • SSN | 076-39-2778 | 937-28-3390 • CC Number | 3678 2289 3907 3378 | 3846 2290 3371 3378 • Business URL | www.surferdude.com | www.sheyinctao.com • Fingerprint | Encrypted • Photo | Encrypted • X-Ray | Encrypted • Healthcare / Financial Services data: Dr. visits, prescriptions, hospital stays and discharges, clinical, billing, etc.; financial services consumer products and activities • Protection methods can be equally applied to the actual healthcare data, but are not needed with de-identification
  60. 60. De-Identified Sensitive Data • Field | Real Data | Tokenized / Pseudonymized • Name | Joe Smith | csu wusoj • Address | 100 Main Street, Pleasantville, CA | 476 srta coetse, cysieondusbak, CA • Date of Birth | 12/25/1966 | 01/02/1966 • Telephone | 760-278-3389 | 760-389-2289 • E-Mail Address | joe.smith@surferdude.org | eoe.nwuer@beusorpdqo.org • SSN | 076-39-2778 | 076-28-3390 • CC Number | 3678 2289 3907 3378 | 3846 2290 3371 3378 • Business URL | www.surferdude.com | www.sheyinctao.com • Fingerprint | Encrypted • Photo | Encrypted • X-Ray | Encrypted • Healthcare / Financial Services data: Dr. visits, prescriptions, hospital stays and discharges, clinical, billing, etc.; financial services consumer products and activities • Protection methods can be equally applied to the actual data, but are not needed with de-identification
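The table on this slide shows pseudonymized values that keep the shape of the originals (letters stay letters, digits stay digits, punctuation stays put). One way to get that effect, sketched below in Python, is a keyed, deterministic character substitution driven by HMAC-SHA256. This is an assumption-laden illustration: the function `pseudonymize`, the helper `_keystream`, and the demo key are invented here, the output will not match the sample values in the table, and unlike the reversible tokenization discussed in the deck this particular sketch is one-way.

```python
import hashlib
import hmac
import string


def _keystream(key: bytes, message: bytes, length: int) -> bytes:
    """Expand HMAC-SHA256 output to at least `length` pseudo-random bytes."""
    stream = b""
    counter = 0
    while len(stream) < length:
        block = message + counter.to_bytes(4, "big")
        stream += hmac.new(key, block, hashlib.sha256).digest()
        counter += 1
    return stream[:length]


def pseudonymize(value: str, field: str, key: bytes) -> str:
    """Deterministically replace letters and digits while keeping the format.

    The same (key, field, value) always yields the same output, so joins and
    counts on the protected column still work. The modulo step has a slight
    bias, which is acceptable for a sketch but not for production use.
    """
    stream = _keystream(key, f"{field}:{value}".encode("utf-8"), len(value))
    out = []
    for ch, b in zip(value, stream):
        if ch in string.digits:
            out.append(string.digits[b % 10])
        elif ch in string.ascii_lowercase:
            out.append(string.ascii_lowercase[b % 26])
        elif ch in string.ascii_uppercase:
            out.append(string.ascii_uppercase[b % 26])
        else:
            out.append(ch)  # keep spaces, dashes, dots, "@"
    return "".join(out)


if __name__ == "__main__":
    key = b"demo-key-not-for-production"  # hypothetical key for the example
    record = {"name": "Joe Smith", "ssn": "076-39-2778",
              "email": "joe.smith@surferdude.org"}
    print({f: pseudonymize(v, f, key) for f, v in record.items()})
```

Including the field name in the HMAC input means the same digits in two different columns (for example a phone number and an account number) do not map to the same output.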
  61. 61. How Should I Secure Different Data? [Chart: Use Case, from Simple (PCI: Card Holder Data) through PII (Personally Identifiable Information) to Complex (PHI: Protected Health Information), plotted against Type of Data, from Un-structured to Structured; regions labeled Tokenization of Fields and Encryption of Files]
  62. 62. Research Brief: Tokenization Gets Traction • Aberdeen has seen a steady increase in enterprise use of tokenization for protecting sensitive data over encryption • Nearly half of the respondents (47%) are currently using tokenization for something other than cardholder data • Over the last 12 months, tokenization users had 50% fewer security-related incidents than tokenization non-users • Author: Derek Brink, VP and Research Fellow, IT Security and IT GRC
  63. 63. Vaultless Tokenization & Data Insight • The business intelligence exposed through Vaultless Tokenization can allow many users and processes to perform job functions on protected data • Extreme flexibility in data de-identification can allow responsible data monetization • Data remains secure throughout data flows, and can maintain a one-to-one relationship with the original data for analytic processes
  64. 64. Use Cases for Coarse & Fine Grained Security
  65. 65. Off-shoring & Outsourcing
  66. 66. Privacy Impacts BPO & Offshore Business Solutions • Business Process Outsourcing (BPO): business processes (e.g., loans, mortgages, call centre, claims processing, ERP); application development (need to de-identify data for testing and development) • Off-Shoring: same as outsourcing, but data is sent off-shore for business functions (like call center, etc.) • Laws governing your ability to send real data to 3rd parties are already restrictive, and becoming more so • Penalties for infringement are growing more severe • Risk of data breaches and data theft is increased
  67. 67. Examples • Major Bank in EU wants to centralise EDW operations in a single country and therefore send customer data from country A to country B. Privacy Laws in country A prohibit this. • Private Bank in Europe wants to offshore Finance Operations. Privacy Law prohibits transfer of citizen data to India. • Retail Bank in Scandinavia wants to offshore Customer Services. Privacy law prevents transfer of citizen data to the Far East.
  68. 68. Case Studies
  69. 69. Protegrity Use Case: UniCredit CHALLENGES The primary challenge was to protect PII – names and addresses, phone and email, policy and account numbers, birth dates, etc. – to the satisfaction of EU Cross Border Data Security requirements. This included incoming source data from various European banking entities, and existing data within those systems, which would be consolidated at the Italian HQ.
  70. 70. Case Study - Large US Chain Store • Reduced cost: 50% shorter PCI audit • Quick deployment: minimal application changes; 98% application transparent • Top performance: performance better than encryption • Stronger security
  71. 71. Case Study: Large Chain Store. Why? Reduce compliance cost by 50% • 50 million Credit Cards, 700 million daily transactions • Performance Challenge: 30 days with Basic to 90 minutes with Vaultless Tokenization • End-to-End Tokens: Started with the D/W and expanding to stores • Lower maintenance cost – don’t have to apply all 12 requirements • Better security – able to eliminate several business and daily reports • Quick deployment • Minimal application changes • 98% application transparent
  72. 72. Aadhaar/UID Big Data Use Case
  73. 73. Aadhaar Data Stores • Mongo cluster (all enrolment records/documents: demographics + photo): low latency indexed read (documents per sec), high latency random search (seconds per read) • Solr cluster (all enrolment records/documents: selected demographics only): low latency indexed read (documents per sec), low latency random search (documents per sec) • MySQL, UID master (sharded) and Enrolment DB (all UID generated records: demographics only, track & trace, enrolment status): low latency indexed read (milliseconds per read), high latency random search (seconds per read) • HBase (all enrolment biometric templates): high read throughput (MB per sec), low-to-medium latency read (milliseconds per read) • HDFS (all raw packets): high read throughput (MB per sec), high latency read (seconds per read) • NFS (all archived raw packets): moderate read throughput, high latency read (seconds per read) [Diagram also labels shards, region servers 1 to 20, data nodes 1 to 20, and LUNs 1 to 4]
  74. 74. Protegrity Summary • Proven enterprise data security software and innovation leader: sole focus on the protection of data; patented technology, continuing to drive innovation • Cross-industry applicability: Retail, Hospitality, Travel and Transportation; Financial Services, Insurance, Banking; Healthcare; Telecommunications, Media and Entertainment; Manufacturing and Government
  75. 75. Please contact us for more information Ulf.Mattsson@protegrity.com Info@protegrity.com Elaine.Evans@protegrity.com www.protegrity.com
