Automated Data Governance 101 - A Guide to Proactively Addressing Your Privacy, Security, and Compliance Needs


“Data privacy,” “data security,” “data protection” –
whatever we call the way we control our data, it isn’t working. Data is as
vulnerable as ever. And this is true for both consumers hoping to keep their
data safe, and for enterprises seeking to govern their corporate and customer data.

We’re at a crossroads: Governing data and putting data to
use are two dueling objectives, and businesses are stuck in the middle.

Can this problem be solved? In a word: yes.

The answer is through what we call automated Data Governance, which introduces speed, agility, and precision into the process of applying rules on data. Join Immuta for a webinar as we explore these Data Governance challenges and discuss how you can proactively address them with automated Data Governance.

Published in: Data & Analytics

  1. Automated Data Governance 101. Andrew Burt, Chief Privacy Officer, Immuta (Twitter: @AndBurt); Matt Vogt, Head of Global Solution Architecture, Immuta (Twitter: @mattvogt)
  2. © 2019 IMMUTA. Automated Data Governance 101. What is data governance? Privacy, Compliance, Security
  3. Why is data governance so difficult?
  4. Privacy, Compliance, Security
  5. The Privacy Problem: Too Much Data
  6. Judd & Leslie Are Living Happy Lives
  7. A New York City Researcher Gets Curious...
     ▪ Chris Wong perusing Twitter in March 2014
     ▪ Sees a Taxi & Limousine Commission chart on traffic patterns
     ▪ Makes a freedom of information request for 12 months of data
     ▪ Receives 50 gigabytes of data
  8. The New York Taxi & Limousine Commission
     ▪ Data was released containing taxi pickups, dropoffs, location, time, amount, and tip amount, among others
     ▪ This seems pretty harmless, right?
  9. Well, Judd and Leslie May Not Think It’s Harmless...
     ▪ This photo was geotagged (with time), so by simply querying by medallion and time, we know how much Judd and Leslie tip!
  10. This is an Example of a “Link Attack”
      Diagram labels: Medallion & Photo Time; Medallion & Pickup Time; New York Taxi Data
  11. Same Attack on Other Celebrities
  12. Same Attack on Other Celebrities
  13. Same Attack on Other Celebrities
  14. In Fact...
      “... the dates and locations of four purchases are enough to identify 90 percent of the people in a data set recording three months’ worth of credit card transactions by 1.1 million users ... someone with copies of just three of your recent receipts - or one receipt, one Instagram photo, and one tweet about the phone you just bought - would have a 94 percent chance of extracting your credit card records from those of a million other people.”
  15. The End of Privacy (As We Know It)
      ▪ The volume of data we generate has undermined privacy as we know it
      ▪ Instead of focusing on how and when our data is gathered…
      ▪ Privacy is best served in limiting how our data is being used - or how the data consumers within our organizations are using this data
      ▪ The privacy problem begets another challenge for enterprises: how do you balance data privacy with utility?
  16. “Data can either be useful or perfectly anonymous, but never both.” Paul Ohm, “Broken Promises of Privacy,” 57 UCLA Law Review 1701 (2010)
  17. In Practice, Privacy is a Continuum
      ▪ To preserve privacy, organizations have to make the data less closely resemble the raw data (or full data).
      ▪ Moving along this curve, data becomes more robust against certain types of privacy risks.
      ▪ The actual trade-off is highly coupled with analytical context.
  18. Privacy, Compliance, Security
  19. The Security Problem: Too Much Complexity
  20. Information Security
      Traditionally defined as a “triad”:
      ● Confidentiality: only the right people can view the right data . . .
      ● Integrity: . . . in the right form . . .
      ● Availability: . . . at the right time.
  21. Today’s IT Landscape
      The Data We Generate (And Collect)
      ▪ 2.5 quintillion bytes of data created each day
      ▪ 90 percent of the data in the world was generated in the last 2 years
      ▪ Estimated 50 billion connected devices by next year - over six per person on the planet
      ▪ Average of 40,000 searches conducted on Google per second
      ▪ Web browsing, email, cell tower pings, image and video, audio, and more
      The Software We Use
      ▪ Average business uses ~500 custom software applications, only 40 percent of which are known to IT
      ▪ Number of known vulnerabilities is increasing (significantly) over time
      ▪ Complexity of software systems and IT environments also appears to be increasing
      ▪ Adoption of AI tools and techniques is exacerbating these trends
  22. Complexity Is the Enemy of Security
  23. “The only computer that’s completely secure is a computer that no one can use.” Willis Ware, Cybersecurity Pioneer
  24. Privacy, Compliance, Security
  25. The Compliance Problem: Not Enough Time
  26. The number and complexity of regulations on data is increasing drastically.
  27. In 2019 in the U.S. alone…
      150+ Privacy Laws Proposed in 25 States
      250+ Information Security Laws Proposed in 45 States
      Could Cost Organizations Up to $122B Per Year
  28. A Few Examples
      GDPR
      ▪ EU’s General Data Protection Regulation
      ▪ Came into force May 2018 as the first and most stringent law in a new wave of global privacy regulations
      ▪ Fines up to four percent of global revenue
      ▪ Driven many global companies to rethink how they collect and reuse their data
      CCPA
      ▪ California Consumer Privacy Act
      ▪ Passed in 2018 and goes into effect January 2020
      ▪ State legislators implemented some of the strictest standards on consumer data in the nation
      ▪ Potentially affects any business that collects the data of California residents
      Cybersecurity Law
      ▪ Enacted by the Chinese government in 2017
      ▪ Increased penalties on the misuse of data collected or stored in the world’s second largest economy
  29. Noncompliance Has Serious Consequences
  30. Compliance Takes Time
      ▪ Involves lots of analytical work
      ▪ “Meetings and memos” approach
      ▪ Increasing number of stakeholders and regulatory environments are now involved
      ▪ Not simple, and not fast!
  31. How are we addressing these problems today?
  32. Passively.
  33. Traditional Signs of Passive Data Governance
      ❏ Time-Consuming Meetings
      ❏ Long Policy Memos
      ❏ Custom Permissions
      ❏ Varying Policies Per Database
      ❏ Creation of New Copies of Data to Satisfy Compliance or Privacy Concerns
  34. Is your approach to data governance passive? Ask yourself...
      How long does it take between 1) when your organization collects data, and 2) when that data can be accessed and used?
      A. Days
      B. Weeks
      C. Months
  35. How can we move away from a passive approach to data governance?
  36. Automated Data Governance.
  37. What Is Automated Data Governance?
      Automated data governance is the process of proactively applying rules on data to ensure compliance and drive data analytics.
  38. 5 Pillars of Automated Data Governance
  39. 5 Pillars of Effective Automated Data Governance
      1. Any Tool
      2. Any Data
      3. No Copies
      4. Any Level of Expertise
      5. One Policy, In One Place
  40. Pillar 1: Any Tool
  41. Pillar 1: Any Tool
      ▪ Automated data governance must support any tool a data scientist or analyst uses, now or in the future
      ▪ Enables data science and analytics teams to use their tool of choice to access the data they need
      ▪ Avoids tool “lock in” for governance reasons
      ▪ Incentivizes governance for the long-term
  42. Pillar 2: Any Data
  43. Pillar 2: Any Data
      ▪ Must enable the use of ALL data, regardless of where it’s stored or the underlying storage technology
      ▪ Otherwise, leaves insights undiscovered or incentivizes non-compliance
      ▪ Flexibility is key to long-term governance efforts
  44. Pillar 3: No Copies
  45. Pillar 3: No Copies
      ▪ A passive approach frequently relies on creating new copies of data, usually with sensitive identifiers removed or obscured - this can’t scale!
      ▪ Automated data governance requires direct access to the same live data across the organization
      ▪ Data must never be copied for governance purposes
  46. Pillar 4: Any Level of Expertise
  47. Pillar 4: Any Level of Expertise
      ▪ Requires that anyone, with any level of expertise, can understand what rules are being applied to enterprise data
      ▪ Must empower both those with technical skill sets and those with privacy and compliance knowledge, so all teams can play a meaningful role controlling how data is used
  48. Pillar 5: One Policy, In One Place
  49. Pillar 5: One Policy, In One Place
      ▪ Requires that data policies live in one central location, so they can be easily tracked and monitored
      ▪ Allows for standardization - and updates over time
      ▪ Key to long-term governance efforts
  50. Your Roadmap for Automating Data Governance
  51. How You Can Automate Data Governance
      1. What process governs how an analyst receives new data?
      2. Where do your policies come from? What rules do you most care about adhering to?
      3. Where is your data, and who’s responsible for it?
      4. How is your data used, and how is it catalogued and tagged?
      5. What technology stack do you rely on to share data faster, and control data more effectively?
  52. Download White Paper: Automated Data Governance 101
  53. Automated Data Governance In Practice
  54. Cognoa: Digital Behavioral Health Company
      Data Access and Governance
      ▪ AI-based digital diagnostics and personalized therapies.
      ▪ Self-service data for data scientists for exploration, experimentation, and analytics.
      ▪ Run any queries they want without taking additional steps to ensure HIPAA compliance.
      ▪ Data privacy and security concerns are paramount.
      ▪ Data is stored in Amazon Aurora and analyzed in Databricks and Tableau.
      ▪ Create different views of the same data for different parties with varying functional responsibilities.
      ▪ Removing ePHI and HIPAA sensitive information for model building was extremely time and labor intensive.
  55. Multinational Automobile Corporation
      GDPR Compliant Analytics
      ▪ Data collection from vehicles requires complex controls.
      ▪ Different use cases require different levels of anonymization.
      ▪ Differential privacy is a key enabler.
      ▪ A model to allow for individual-related insights and/or use cases without violating privacy protection.
      ▪ Re-identification required: predictive maintenance on a vehicle; they need to unmask the owner in order to provide maintenance (purpose-based views).
      ▪ Analyzing the most-listened-to radio stations. This does not require identifying an individual, and thus only requires aggregate questions.
  56. LMI: Consultancy Dedicated to Improving the Business of Government
      Accelerated Time to Insight from Highly Sensitive Government Data
      ▪ Built an integrated data analytics platform for the Office of the Secretary of Defense.
      ▪ Maintenance and Availability Data Warehouse (MADW) contains availability, cost, inventory, and transactional data on nearly every Department of Defense weapons system and readiness reportable piece of equipment.
      ▪ More than one billion maintenance records from 46 authoritative data systems.
      ▪ Integration of availability, cost, inventory, maintenance, and supply data makes numerous analyses available to leaders across the DoD enterprise.
  57. Global Financial Institution
      Built a Self-Service Environment for Easy Access to Operational Data
      ▪ Scalable, no-code self-service data access for business intelligence operations.
      ▪ Provide a single interface for legal teams to implement global policy enforcement on controlled metadata vs one-by-one policy creation.
      ▪ Automate reporting for credit and loan decisions.
      ▪ Set project-, purpose-, and role-based restrictions that ensure users can only see the data they are entitled to see.
      ▪ Controlling access to all data in Data Lake, providing automated reports on the purpose of all data usage (a core GDPR requirement)
      ▪ Exposed over 8000 data sources, abstracted policy enforcement and instantaneously allowed non-technical users to gain access to the data for the first time, e.g. HR and Marketing.
  58. Download White Paper: Automated Data Governance 101
  59. Q&A. Andrew Burt, Chief Privacy Officer, Immuta (Twitter: @AndBurt); Matt Vogt, Head of Global Solution Architecture, Immuta (Twitter: @mattvogt)
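
The "link attack" in the taxi slides (querying released trip data by medallion and time) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the record layout, the field names, the five-minute matching window, and the sample values are hypothetical and are not taken from the actual Taxi & Limousine Commission release or from the presenters' materials.

```python
from datetime import datetime, timedelta

# Hypothetical released trip records: no rider names, but medallion and
# pickup time are kept, along with fare and tip.
trips = [
    {"medallion": "7A11", "pickup": datetime(2014, 3, 1, 19, 5), "fare": 12.50, "tip": 0.00},
    {"medallion": "7A11", "pickup": datetime(2014, 3, 1, 22, 40), "fare": 9.00, "tip": 2.00},
    {"medallion": "3C27", "pickup": datetime(2014, 3, 1, 19, 6), "fare": 20.00, "tip": 4.00},
]

# Hypothetical outside knowledge: a timestamped photo in which a named
# person is seen entering a cab whose medallion is visible.
sightings = [
    {"person": "Rider A", "medallion": "7A11", "seen_at": datetime(2014, 3, 1, 19, 3)},
]


def link_attack(trips, sightings, window=timedelta(minutes=5)):
    """Join the two data sets on medallion and approximate time."""
    matches = []
    for sighting in sightings:
        for trip in trips:
            same_cab = trip["medallion"] == sighting["medallion"]
            close_in_time = abs(trip["pickup"] - sighting["seen_at"]) <= window
            if same_cab and close_in_time:
                matches.append(
                    {"person": sighting["person"], "fare": trip["fare"], "tip": trip["tip"]}
                )
    return matches


linked = link_attack(trips, sightings)
for row in linked:
    print(row)
```

Neither data set names a rider's fare or tip on its own. The join on two shared attributes is what re-identifies the record, which is why removing names alone does not make a release anonymous.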