Securing Big Data: Lock it Down or Liberate
  • Mark: Cardinal Health is a multi-billion dollar healthcare services company. We like to say we’re the business behind healthcare, because we focus on making it more cost-effective so our customers can focus on their patients. We work with pharmacies, hospitals, doctors’ offices, surgery centers and clinical labs: basically anywhere healthcare services are offered. As a leading provider of products and services in the healthcare supply chain, we have the broadest view of healthcare in the industry:
    • We have more than 33,000 employees with direct operations around the world.
    • We deliver products and services to 100,000 locations daily.
    • 85 percent of hospitals in the U.S. use Cardinal Health products and services.
    • We supply pharmaceuticals to fill 25 percent of branded prescriptions in the U.S.
    • A third of all distributed pharmaceutical, laboratory and medical products in the U.S. and Puerto Rico flow through the Cardinal Health supply chain.
    • We are proud to be #19 on the Fortune 500 list.
  • Mark: How we use the data specific to Big Data
  • Mark
  • Jeff: Some view locking down areas of functionality as a bad thing. We should embrace lockdown much as we do the brakes on a car: the brakes actually allow us to take more risks and improve agility.
  • Jeff
  • Jeff
  • Jeff
  • Mark
  • Mark
  • Jeff: Data is ingested under a source-specific account. The data ingestion process is loosely coupled with the transformation processes. This afforded us finer-grained control over which users and processes have permission to access raw data. It also required us to develop atomic data patterns to avoid partial data products.
  • Jeff
  • Jeff: This gave us the flexibility to segment ingestion privileges independently of any transformation.
  • Jeff: This gave us the flexibility to segment ingestion privileges independently of any transformation.
  • Mark
  • Mark
  • Jeff
  • Jeff
  • Mark
  • Mark

Securing Big Data: Lock it Down or Liberate Presentation Transcript

  • Securing Big Data
    Jeff Graham, Sr. Advisor, Data Analytics
    Mark Tomallo, Director, Information Security & Risk
    Enterprise Architecture Enterprise Services Department
    June 4th, 2014
    © Copyright 2013, Cardinal Health. All rights reserved. CARDINAL HEALTH, the Cardinal Health LOGO and ESSENTIAL TO CARE are trademarks or registered trademarks of Cardinal Health.
  • Cardinal Health
    Leading provider of products and services across the healthcare supply chain, with an extensive footprint across multiple channels.
    • 33,000-plus employees with direct operations in 10 countries
    • 100,000 locations delivered to daily
    • $101B FY13 revenue
    • #19 on the Fortune 500 list
    • 85% of hospitals in the U.S. use our products and services
  • What types of data do we use?
    • Market
    • Public Data (Medicare.gov)
    • Clinical
    • Product & Supplier
    • Employee
    • Logistics
  • The Challenge
    • We knew the benefits of going to a Big Data platform, but we had huge concerns over securing those assets.
    • The technology was immature from a security standpoint.
    • The goals of an analytics group were sometimes at odds with the responsibility of Governance & Security.
  • The Opportunity
    We needed to strike a balance between protecting our data and liberating our analytics community. This led to two guiding principles that are still evolving in our organization:
    • Lock down the platform
    • Liberate the data to authorized users
  • The Journey Begins...
    We needed many disciplines to come together:
    • Platform Security
    • Identity Management
    • Network Security
    • Data Segmentation
    • Data Tokenization
    • Governance
  • Lockdown: Platform Security
    • Host-based firewalls on control & data nodes
      – Locked down using iptables
      – Block connections from unauthorized hosts
    • Gold-image boot for data nodes
      – No persistent OS / config data: a continuously fresh, secure image
      – Ease of security patching
  • Lockdown: Hadoop Architecture
  • Lockdown: Identity Management
    • Segmented access control to access / control / data nodes
    • Secure Active Directory groups for data segmentation where data is sensitive
    • Vintela authentication using Kerberos
    • Access Nodes can talk to Control Nodes, Control Nodes can talk to Data Nodes, and users are restricted to the Access Layer
    Diagram labels: Access Nodes, Control Nodes, Data Nodes; Users, Power Users, Developers, Data Owners, Admin; Datameer, AD, MySQL, Sqoop, Hive, Flume
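The layering rule on this slide (users reach only the access layer, access nodes reach control nodes, control nodes reach data nodes) can be pictured as an allow-list of hops. The sketch below is only an illustration under assumptions: the tier names and function names are hypothetical, and the deck itself enforces the rule with firewalls and directory groups rather than application code.

```python
from typing import Sequence

# Hypothetical tier names; the permitted hops mirror the slide's rule.
ALLOWED_HOPS = {
    ("user", "access"),
    ("access", "control"),
    ("control", "data"),
}


def may_connect(source_tier: str, target_tier: str) -> bool:
    """Return True only for a hop the layered model explicitly permits."""
    return (source_tier, target_tier) in ALLOWED_HOPS


def path_allowed(tiers: Sequence[str]) -> bool:
    """Check every consecutive hop along a path of tiers."""
    return all(may_connect(a, b) for a, b in zip(tiers, tiers[1:]))
```

A request that travels user, access, control, data passes the check, while a user connecting straight to a data node does not.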
  • Lockdown: Network Security
    • Host-based firewalls on control & data nodes
    • Segregated VLAN on dedicated network switches
    • Segregated Prod, Integration, Backup environments
    • Transaction, security and event logging
    • Host-based file integrity monitoring
  • Liberate: Data Segmentation
    • Data is ingested under source-specific accounts.
    • Data ingestion is loosely coupled with transformations.
    • Atomic data patterns avoid partial data products.
    • Finer-grained control over data access.
    Diagram labels: Ingestion, Transform
  • Liberate: Data Segmentation
    Ingestion
    • We had to ensure that our landed data was “all or nothing”.
    • Each load is atomic in nature.
    • If a load fails, we don’t want to see partially streamed results.
    Diagram (HDFS, Merge & Rename):
    • Step #1: Sqoop, from the source RDBMS to staging part files
    • Step #2: copyMerge API
    • Step #3: hadoop fs -mv, into the target area
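The stage, merge, and rename pattern on this slide can be sketched on a local filesystem. This is an analogy under assumptions, not the presenters' code: `merge_and_publish` and its parameters are hypothetical names, and `os.replace` stands in for the final `hadoop fs -mv`. The point is that readers of the target area see either the complete file or nothing.

```python
import os
import shutil
from pathlib import Path


def merge_and_publish(part_files, staging_dir, target_path):
    """Merge part files in staging, then rename the result into the target.

    The target name appears only after the merge has fully succeeded.
    os.replace is atomic when staging and target share a filesystem,
    which this sketch assumes.
    """
    staging_dir = Path(staging_dir)
    target_path = Path(target_path)
    merged = staging_dir / (target_path.name + ".merging")
    try:
        with open(merged, "wb") as out:
            for part in sorted(part_files):
                with open(part, "rb") as src:
                    shutil.copyfileobj(src, out)
    except Exception:
        # A failed load leaves nothing behind in staging or in the target.
        if merged.exists():
            merged.unlink()
        raise
    os.replace(merged, target_path)
    return target_path
```

If any part file is missing or unreadable, the exception propagates and the target area is left untouched.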
  • Liberate: Data Segmentation
    This gave us the flexibility to segment ingestion privileges independently of any transformation.
    Diagram labels: Sales, Market, Employee, Logistics, Clinical, Public Data
  • Liberate: Data Segmentation
    This gave us the flexibility to segment ingestion privileges independently of any transformation.
    Diagram labels: Sales, Market, Employee, Logistics, Clinical, Public Data; Customer Insights, Warehouse Optimization, Outcome Based Medicine
  • Liberate: Data Tokenization
    Private Data without Identity is no longer Private*
    Segregation Model:
    1. Private Identity Data: identity data which is itself private (e.g. Social Security Number)
    2. Identity Data: data to identify the subject of the associated data (e.g. Name, Passport ID)
    3. Private Attributes: data only sensitive when associated with an identity (e.g. blood type)
    *Except in rare cases where the Law decides it’s private without Identity.
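One way to read the segregation model is as a per-column classification that decides which fields must be tokenized. The sketch below assumes that reading. The column names and the `columns_to_tokenize` helper are hypothetical, and the rule that only the two identity classes are tokenized is an interpretation of "Private Data without Identity is no longer Private", not something the slides state outright.

```python
from enum import Enum


class Sensitivity(Enum):
    PRIVATE_IDENTITY = "private identity data"  # e.g. Social Security Number
    IDENTITY = "identity data"                  # e.g. name, passport ID
    PRIVATE_ATTRIBUTE = "private attribute"     # e.g. blood type


# Hypothetical column names, classified with the slide's own examples.
COLUMN_CLASSES = {
    "ssn": Sensitivity.PRIVATE_IDENTITY,
    "name": Sensitivity.IDENTITY,
    "passport_id": Sensitivity.IDENTITY,
    "blood_type": Sensitivity.PRIVATE_ATTRIBUTE,
}


def columns_to_tokenize(columns):
    """Return the identity-bearing columns, in their original order.

    Private attributes pass through untouched under this reading, because
    they stop being sensitive once the identity columns are tokenized.
    An unclassified column raises KeyError rather than being guessed at.
    """
    identity_classes = {Sensitivity.PRIVATE_IDENTITY, Sensitivity.IDENTITY}
    return [c for c in columns if COLUMN_CLASSES[c] in identity_classes]
```

The footnoted exception (data the law treats as private even without identity) would need its own class and is not modeled here.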
  • Liberate: Data Tokenization
    A tokenization gateway gives us a centralized, reusable framework for transforming private data into non-sensitive data.
    Address                   Tokenized Address
    1313 Mockingbird Ln       A76a39daf6e83363372d326
    1700 Pennsylvania Ave     9eeb8dc55d37388b18c12b4
    1411 N. Park Ave          0f2ef91d336d38b4db3be54
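The slides do not say how the gateway derives a token from a value, so the following is only a stand-in to make the address table concrete. It assumes a keyed hash (HMAC-SHA256) truncated to 23 hex characters to match the length of the tokens shown. The function name, the key handling, and the normalization step are all hypothetical.

```python
import hashlib
import hmac


def tokenize(value: str, secret_key: bytes, length: int = 23) -> str:
    """Derive a deterministic, non-reversible token from a private value.

    Assumption: a keyed hash stands in for whatever the real gateway does.
    Whitespace and case are normalized so the same address always maps to
    the same token.
    """
    normalized = " ".join(value.split()).lower()
    digest = hmac.new(secret_key, normalized.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:length]
```

Because the same input always yields the same token, tokenized columns can still be joined and grouped. The key would have to stay inside the protected gateway, since anyone holding it could recompute tokens for guessed values.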
  • Liberate: Data Tokenization
    The gateway is a highly protected service outside of the cluster.
  • Liberate: Data Tokenization
    The gateway is composed of three regions:
    PRIVATE
    • Data that needs to be tokenized.
    • At a minimum, must comprise a primary key and token values.
    • Multi-tenant store with role-based security.
    VAULT
    • Stores the private data in a SHA2/128-bit AES encrypted binary string.
    • Generates a token by
    • Tokens are sharded and referenced by name (and can be shared).
    • Access is extremely limited (administrator only).
    PUBLIC
    • Once tokens are generated in the vault, private data is joined to those tokens and landed in the Public region.
    • Multi-tenant store with role-based security.
    • Private may read public, but public may only read public.
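The read rules between the three regions can be written down as a small role-to-region table. This is a sketch under assumptions: the role names are hypothetical, and the slides do not say whether a vault administrator can also read the other regions, so this example grants administrators the vault only.

```python
# Hypothetical role names mapped to the regions each role may read.
# "Private may read public, but public may only read public."
READABLE_BY_ROLE = {
    "private_user": {"private", "public"},
    "public_user": {"public"},
    "vault_admin": {"vault"},  # assumption: administrators read the vault only
}


def can_read(role: str, region: str) -> bool:
    """Return True if the role may read the region; unknown roles get nothing."""
    return region in READABLE_BY_ROLE.get(role, set())
```

Defaulting unknown roles to an empty set keeps the check deny-by-default, in line with the lockdown principle.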
  • In Summary
    We needed many disciplines to come together:
    • Platform Security (Lockdown)
    • Hadoop Architecture (Lockdown)
    • Identity Management (Lockdown)
    • Network Security (Lockdown)
    • Data Segmentation (Liberate)
    • Data Tokenization (Liberate)
  • Lessons Learned
    • Our original focus was technology. Data privacy, governance, and declassification were our largest hurdles.
    • Accountability across the Enterprise is important.
    • For Big Data, we haven’t achieved pure statistical anonymization, as this isn’t our core competency.
    • Security classification of legacy source metadata is a challenge.
    • Initial tokenization was a success. However:
      o A mature tokenization solution is orders of magnitude more complex than anticipated. The margin of error and the penalty for error are both very high.
      o The metadata needed for full token lifecycle management is unknown and complex.
      o Implementing without the right metadata would likely result in duplication of tokens.
  • Q&A