SlideShare a Scribd company logo
Protecting the Data Lake
A bit about Ash Narkar !
@ashtalk
● Data Lake Overview
● Open Policy Agent
○ Community
○ Features
○ Use Cases
● Use case deep dive
○ Ceph Data Protection
Agenda
Data is King !
Data is King !
● Pervasive
● Abundant
● Customer Experience
● Revenue Growth
Data is King !
● Pervasive
● Abundant
● Customer Experience
● Revenue Growth
● Cyber Attacks
● Breaches
● Fines
● Loss of Customer Trust
What Is A Data Lake?
Data Lake Features
● Centralized Content
● Scalability
● Multiple data type support
● Resource optimization
Data Lake Platform
Sources Consumers
Data Lake Platform: Kafka
Features
● Distributed streaming platform
● Building real-time streaming data pipelines and applications
Security Challenges
● Authorization using Access Control Lists(ACLs)
● How to authorize requests based on context, like user, IP, common name in
certificate
Security Policies
● Consumers of topics containing PII must be whitelisted
● Producers to topics with high fanout must be whitelisted
Data Lake Platform: Ceph
Features
● Unified distributed storage system
● Delivers object, block, and file storage
Security Challenges
● Security protocol handles only Ceph clients and servers. NO human
users or applications 😔
Security Policies
● Users can access only those buckets belonging to the same
geographical region as them
● Access based on a user’s Business Unit, Department etc.
Data Lake Platform: Elasticsearch
Features
● Full-text search and analytics engine
● Store, search and analyze
Security Challenges
● Authorization is not considered as part of job
● User responsible for implementing access control
Security Policies
● Access control policies for a patient’s PHI
Security Challenge Overview
● Distinct systems
● Changing security requirements
❌ Hardcoding policy
❌ Tight coupling
✅ Expressiveness
✅ Speed and performance
✅ Unified Solution
Who can solve the Security Challenge ?
What Is OPA?
OPA: Community
Inception
Project started in 2016 at
Styra.
Goal
Unify policy enforcement
across the stack.
Use Cases
Admission control
Authorization
ACLs
RBAC
IAM
ABAC
Risk management
Data Protection
Data Filtering
Users
Netflix
Medallia
Chef
Cloudflare
State Street
Pinterest
Intuit
Capital One
...and many more.
Today
CNCF project (Incubating)
59 contributors
800+ slack members
2000+ stars
20+ integrations
Service
OPA
Policy
(Rego)
Data
(JSON)
Request
DecisionQuery
OPA: General-purpose policy engine
● Declarative Policy Language (Rego)
○ Can user X do operation Y on resource Z?
○ What invariants does workload W violate?
○ Which records should bob be allowed to see?
● Library, sidecar, host-level daemon
○ Policy and data are kept in-memory
○ Zero decision-time dependencies
● Management APIs for control & observability
○ Bundle service API for sending policy & data to OPA
○ Status service API for receiving status from OPA
○ Log service API for receiving audit log from OPA
● Tooling to build, test, and debug policy
○ opa run, opa test, opa fmt, opa deps, opa check, etc.
○ VS Code plugin, Tracing, Profiling, etc.
OPA: Features
Service
OPA
Policy
(Rego)
Data
(JSON)
Request
DecisionQuery
How does OPA work?
How does OPA work?
Salary Service V1
OPA
Policy
(Rego)
Data
(JSON)
Request
DecisionQuery
Example policy
"Employees can read their own salary
and the salary of anyone they
manage."
How does OPA work?
Example policy
Employees can read their own salary and the salary of
anyone they manage.
Input Data
method: "GET"
path: ["salary", "bob"]
user: "bob"
3 Steps to OPA
Step 1: Clone OPA Repo
3 Steps to OPA
Step 1: Clone OPA Repo
Step 2: Build OPA binary
3 Steps to OPA
Step 1: Clone OPA Repo
Step 2: Build OPA binary
Step 3: Execute OPA binary
Use Cases
CLOUD
Host
DB
Host
sshd
App
Container
HTTP API
Microservice APIs
Orchestrator
Admission Control
Container Execution,
SSH, sudo
Linux
Risk
Management
Data Protection and
Data Filtering
OPA Use Case:
Ceph Data Protection
Ceph Architecture
CEPH STORAGE CLUSTER (RADOS)
LIBRADOS
RBD CEPH FSRADOSGW
Ceph Data Protection: Setup
OPA
Node Port
Service
Incoming
Request
HTTP
S3 Api
RADOSGW RADOS
user, method,
bucket name
Allow / Deny
Policy
(Rego)
Data
(JSON)
Example policy
"Users can access only those buckets belonging
to the same geographical region as them."
Demo: Ceph Data Protection
https://katacoda.com/styra
Data is King !
● Pervasive
● Abundant
● Customer Experience
● Revenue Growth
● Cyber Attacks
● Breaches
● Fines
● Loss of Customer Trust
Thank You!
github.com/open-policy-agent/opa
openpolicyagent.org
slack.openpolicyagent.org
Booth S20

More Related Content

What's hot

Tracing Micro Services with OpenTracing
Tracing Micro Services with OpenTracingTracing Micro Services with OpenTracing
Tracing Micro Services with OpenTracing
Hemant Kumar
 
Nagios Conference 2014 - Scott Wilkerson - Getting Started with Nagios Networ...
Nagios Conference 2014 - Scott Wilkerson - Getting Started with Nagios Networ...Nagios Conference 2014 - Scott Wilkerson - Getting Started with Nagios Networ...
Nagios Conference 2014 - Scott Wilkerson - Getting Started with Nagios Networ...
Nagios
 
APIs: Intelligent Routing, Security, & Management
APIs: Intelligent Routing, Security, & ManagementAPIs: Intelligent Routing, Security, & Management
APIs: Intelligent Routing, Security, & Management
NGINX, Inc.
 
Elastic{ON} 2017 Recap
Elastic{ON} 2017 RecapElastic{ON} 2017 Recap
Elastic{ON} 2017 Recap
Matias Cascallares
 
Next-Gen DDoS Detection
Next-Gen DDoS DetectionNext-Gen DDoS Detection
Next-Gen DDoS Detection
Alex Henthorn-Iwane
 
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
StreamNative
 
NGINX Plus R19 : EMEA
NGINX Plus R19 : EMEANGINX Plus R19 : EMEA
NGINX Plus R19 : EMEA
NGINX, Inc.
 
Select Star: Flink SQL for Pulsar Folks - Pulsar Summit NA 2021
Select Star: Flink SQL for Pulsar Folks - Pulsar Summit NA 2021Select Star: Flink SQL for Pulsar Folks - Pulsar Summit NA 2021
Select Star: Flink SQL for Pulsar Folks - Pulsar Summit NA 2021
StreamNative
 
Digital Forensics and Incident Response in The Cloud
Digital Forensics and Incident Response in The CloudDigital Forensics and Incident Response in The Cloud
Digital Forensics and Incident Response in The Cloud
Velocidex Enterprises
 
Open Tracing, to order and understand your mess. - ApiConf 2017
Open Tracing, to order and understand your mess. - ApiConf 2017Open Tracing, to order and understand your mess. - ApiConf 2017
Open Tracing, to order and understand your mess. - ApiConf 2017
Gianluca Arbezzano
 
Scale your application to new heights with NGINX and AWS
Scale your application to new heights with NGINX and AWSScale your application to new heights with NGINX and AWS
Scale your application to new heights with NGINX and AWS
NGINX, Inc.
 
Five years of operating a large scale globally replicated Pulsar installation...
Five years of operating a large scale globally replicated Pulsar installation...Five years of operating a large scale globally replicated Pulsar installation...
Five years of operating a large scale globally replicated Pulsar installation...
StreamNative
 
What's new in NGINX Plus R19
What's new in NGINX Plus R19What's new in NGINX Plus R19
What's new in NGINX Plus R19
NGINX, Inc.
 
Elastic Stack roadmap deep dive
Elastic Stack roadmap deep diveElastic Stack roadmap deep dive
Elastic Stack roadmap deep dive
Elasticsearch
 
An approach for migrating enterprise apps into open stack
An approach for migrating enterprise apps into open stackAn approach for migrating enterprise apps into open stack
An approach for migrating enterprise apps into open stack
Arthur Berezin
 
Netflix Open Source Meetup Season 4 Episode 2
Netflix Open Source Meetup Season 4 Episode 2Netflix Open Source Meetup Season 4 Episode 2
Netflix Open Source Meetup Season 4 Episode 2
aspyker
 
Analyzing NGINX Logs with Datadog
Analyzing NGINX Logs with DatadogAnalyzing NGINX Logs with Datadog
Analyzing NGINX Logs with Datadog
NGINX, Inc.
 
6 Months Sailing with Docker in Production
6 Months Sailing with Docker in Production 6 Months Sailing with Docker in Production
6 Months Sailing with Docker in Production
Hung Lin
 
Vault Agent and Vault 0.11 features
Vault Agent and Vault 0.11 featuresVault Agent and Vault 0.11 features
Vault Agent and Vault 0.11 features
Mitchell Pronschinske
 
Distributed Tracing with OpenTracing, ZipKin and Kubernetes
Distributed Tracing with OpenTracing, ZipKin and KubernetesDistributed Tracing with OpenTracing, ZipKin and Kubernetes
Distributed Tracing with OpenTracing, ZipKin and Kubernetes
Container Solutions
 

What's hot (20)

Tracing Micro Services with OpenTracing
Tracing Micro Services with OpenTracingTracing Micro Services with OpenTracing
Tracing Micro Services with OpenTracing
 
Nagios Conference 2014 - Scott Wilkerson - Getting Started with Nagios Networ...
Nagios Conference 2014 - Scott Wilkerson - Getting Started with Nagios Networ...Nagios Conference 2014 - Scott Wilkerson - Getting Started with Nagios Networ...
Nagios Conference 2014 - Scott Wilkerson - Getting Started with Nagios Networ...
 
APIs: Intelligent Routing, Security, & Management
APIs: Intelligent Routing, Security, & ManagementAPIs: Intelligent Routing, Security, & Management
APIs: Intelligent Routing, Security, & Management
 
Elastic{ON} 2017 Recap
Elastic{ON} 2017 RecapElastic{ON} 2017 Recap
Elastic{ON} 2017 Recap
 
Next-Gen DDoS Detection
Next-Gen DDoS DetectionNext-Gen DDoS Detection
Next-Gen DDoS Detection
 
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
 
NGINX Plus R19 : EMEA
NGINX Plus R19 : EMEANGINX Plus R19 : EMEA
NGINX Plus R19 : EMEA
 
Select Star: Flink SQL for Pulsar Folks - Pulsar Summit NA 2021
Select Star: Flink SQL for Pulsar Folks - Pulsar Summit NA 2021Select Star: Flink SQL for Pulsar Folks - Pulsar Summit NA 2021
Select Star: Flink SQL for Pulsar Folks - Pulsar Summit NA 2021
 
Digital Forensics and Incident Response in The Cloud
Digital Forensics and Incident Response in The CloudDigital Forensics and Incident Response in The Cloud
Digital Forensics and Incident Response in The Cloud
 
Open Tracing, to order and understand your mess. - ApiConf 2017
Open Tracing, to order and understand your mess. - ApiConf 2017Open Tracing, to order and understand your mess. - ApiConf 2017
Open Tracing, to order and understand your mess. - ApiConf 2017
 
Scale your application to new heights with NGINX and AWS
Scale your application to new heights with NGINX and AWSScale your application to new heights with NGINX and AWS
Scale your application to new heights with NGINX and AWS
 
Five years of operating a large scale globally replicated Pulsar installation...
Five years of operating a large scale globally replicated Pulsar installation...Five years of operating a large scale globally replicated Pulsar installation...
Five years of operating a large scale globally replicated Pulsar installation...
 
What's new in NGINX Plus R19
What's new in NGINX Plus R19What's new in NGINX Plus R19
What's new in NGINX Plus R19
 
Elastic Stack roadmap deep dive
Elastic Stack roadmap deep diveElastic Stack roadmap deep dive
Elastic Stack roadmap deep dive
 
An approach for migrating enterprise apps into open stack
An approach for migrating enterprise apps into open stackAn approach for migrating enterprise apps into open stack
An approach for migrating enterprise apps into open stack
 
Netflix Open Source Meetup Season 4 Episode 2
Netflix Open Source Meetup Season 4 Episode 2Netflix Open Source Meetup Season 4 Episode 2
Netflix Open Source Meetup Season 4 Episode 2
 
Analyzing NGINX Logs with Datadog
Analyzing NGINX Logs with DatadogAnalyzing NGINX Logs with Datadog
Analyzing NGINX Logs with Datadog
 
6 Months Sailing with Docker in Production
6 Months Sailing with Docker in Production 6 Months Sailing with Docker in Production
6 Months Sailing with Docker in Production
 
Vault Agent and Vault 0.11 features
Vault Agent and Vault 0.11 featuresVault Agent and Vault 0.11 features
Vault Agent and Vault 0.11 features
 
Distributed Tracing with OpenTracing, ZipKin and Kubernetes
Distributed Tracing with OpenTracing, ZipKin and KubernetesDistributed Tracing with OpenTracing, ZipKin and Kubernetes
Distributed Tracing with OpenTracing, ZipKin and Kubernetes
 

Similar to Protecting the Data Lake

Open Policy Agent
Open Policy AgentOpen Policy Agent
Open Policy Agent
Torin Sandall
 
Dynamic Authorization & Policy Control for Docker Environments
Dynamic Authorization & Policy Control for Docker EnvironmentsDynamic Authorization & Policy Control for Docker Environments
Dynamic Authorization & Policy Control for Docker Environments
Torin Sandall
 
Dynamic Policy Enforcement for Microservice Environments
Dynamic Policy Enforcement for Microservice EnvironmentsDynamic Policy Enforcement for Microservice Environments
Dynamic Policy Enforcement for Microservice Environments
Nebulaworks
 
OPA APIs and Use Case Survey
OPA APIs and Use Case SurveyOPA APIs and Use Case Survey
OPA APIs and Use Case Survey
Torin Sandall
 
Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...
Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...
Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...
StampedeCon
 
Lessons learned from embedding Cassandra in xPatterns
Lessons learned from embedding Cassandra in xPatternsLessons learned from embedding Cassandra in xPatterns
Lessons learned from embedding Cassandra in xPatternsClaudiu Barbura
 
Hive on Spark, production experience @Uber
 Hive on Spark, production experience @Uber Hive on Spark, production experience @Uber
Hive on Spark, production experience @Uber
Future of Data Meetup
 
Big Data Pipeline for Analytics at Scale @ FIT CVUT 2014
Big Data Pipeline for Analytics at Scale @ FIT CVUT 2014Big Data Pipeline for Analytics at Scale @ FIT CVUT 2014
Big Data Pipeline for Analytics at Scale @ FIT CVUT 2014
Jaroslav Gergic
 
Data Platform in the Cloud
Data Platform in the CloudData Platform in the Cloud
Data Platform in the Cloud
Amihay Zer-Kavod
 
#RADC4L16: An API-First Archives Approach at NPR
#RADC4L16: An API-First Archives Approach at NPR#RADC4L16: An API-First Archives Approach at NPR
#RADC4L16: An API-First Archives Approach at NPR
Camille Salas
 
Charles sonigo - Demuxed 2018 - How to be data-driven when you aren't Netflix...
Charles sonigo - Demuxed 2018 - How to be data-driven when you aren't Netflix...Charles sonigo - Demuxed 2018 - How to be data-driven when you aren't Netflix...
Charles sonigo - Demuxed 2018 - How to be data-driven when you aren't Netflix...
Charles Sonigo
 
Extracting Insights from Data at Twitter
Extracting Insights from Data at TwitterExtracting Insights from Data at Twitter
Extracting Insights from Data at Twitter
Prasad Wagle
 
Data Con LA 2018 - Enabling real-time exploration and analytics at scale at H...
Data Con LA 2018 - Enabling real-time exploration and analytics at scale at H...Data Con LA 2018 - Enabling real-time exploration and analytics at scale at H...
Data Con LA 2018 - Enabling real-time exploration and analytics at scale at H...
Data Con LA
 
How Netflix Monitors Applications in Near Real-time w Amazon Kinesis - ABD401...
How Netflix Monitors Applications in Near Real-time w Amazon Kinesis - ABD401...How Netflix Monitors Applications in Near Real-time w Amazon Kinesis - ABD401...
How Netflix Monitors Applications in Near Real-time w Amazon Kinesis - ABD401...
Amazon Web Services
 
Database automation guide - Oracle Community Tour LATAM 2023
Database automation guide - Oracle Community Tour LATAM 2023Database automation guide - Oracle Community Tour LATAM 2023
Database automation guide - Oracle Community Tour LATAM 2023
Nelson Calero
 
Dynomite @ RedisConf 2017
Dynomite @ RedisConf 2017Dynomite @ RedisConf 2017
Dynomite @ RedisConf 2017
Ioannis Papapanagiotou
 
Query and audit logging in cassandra
Query and audit logging in cassandraQuery and audit logging in cassandra
Query and audit logging in cassandra
Vinay Kumar Chella
 
Designing for operability and managability
Designing for operability and managabilityDesigning for operability and managability
Designing for operability and managability
Gaurav Bahrani
 
Scaling up uber's real time data analytics
Scaling up uber's real time data analyticsScaling up uber's real time data analytics
Scaling up uber's real time data analytics
Xiang Fu
 
Cloud-Scale BGP and NetFlow Analysis
Cloud-Scale BGP and NetFlow AnalysisCloud-Scale BGP and NetFlow Analysis
Cloud-Scale BGP and NetFlow Analysis
Alex Henthorn-Iwane
 

Similar to Protecting the Data Lake (20)

Open Policy Agent
Open Policy AgentOpen Policy Agent
Open Policy Agent
 
Dynamic Authorization & Policy Control for Docker Environments
Dynamic Authorization & Policy Control for Docker EnvironmentsDynamic Authorization & Policy Control for Docker Environments
Dynamic Authorization & Policy Control for Docker Environments
 
Dynamic Policy Enforcement for Microservice Environments
Dynamic Policy Enforcement for Microservice EnvironmentsDynamic Policy Enforcement for Microservice Environments
Dynamic Policy Enforcement for Microservice Environments
 
OPA APIs and Use Case Survey
OPA APIs and Use Case SurveyOPA APIs and Use Case Survey
OPA APIs and Use Case Survey
 
Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...
Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...
Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...
 
Lessons learned from embedding Cassandra in xPatterns
Lessons learned from embedding Cassandra in xPatternsLessons learned from embedding Cassandra in xPatterns
Lessons learned from embedding Cassandra in xPatterns
 
Hive on Spark, production experience @Uber
 Hive on Spark, production experience @Uber Hive on Spark, production experience @Uber
Hive on Spark, production experience @Uber
 
Big Data Pipeline for Analytics at Scale @ FIT CVUT 2014
Big Data Pipeline for Analytics at Scale @ FIT CVUT 2014Big Data Pipeline for Analytics at Scale @ FIT CVUT 2014
Big Data Pipeline for Analytics at Scale @ FIT CVUT 2014
 
Data Platform in the Cloud
Data Platform in the CloudData Platform in the Cloud
Data Platform in the Cloud
 
#RADC4L16: An API-First Archives Approach at NPR
#RADC4L16: An API-First Archives Approach at NPR#RADC4L16: An API-First Archives Approach at NPR
#RADC4L16: An API-First Archives Approach at NPR
 
Charles sonigo - Demuxed 2018 - How to be data-driven when you aren't Netflix...
Charles sonigo - Demuxed 2018 - How to be data-driven when you aren't Netflix...Charles sonigo - Demuxed 2018 - How to be data-driven when you aren't Netflix...
Charles sonigo - Demuxed 2018 - How to be data-driven when you aren't Netflix...
 
Extracting Insights from Data at Twitter
Extracting Insights from Data at TwitterExtracting Insights from Data at Twitter
Extracting Insights from Data at Twitter
 
Data Con LA 2018 - Enabling real-time exploration and analytics at scale at H...
Data Con LA 2018 - Enabling real-time exploration and analytics at scale at H...Data Con LA 2018 - Enabling real-time exploration and analytics at scale at H...
Data Con LA 2018 - Enabling real-time exploration and analytics at scale at H...
 
How Netflix Monitors Applications in Near Real-time w Amazon Kinesis - ABD401...
How Netflix Monitors Applications in Near Real-time w Amazon Kinesis - ABD401...How Netflix Monitors Applications in Near Real-time w Amazon Kinesis - ABD401...
How Netflix Monitors Applications in Near Real-time w Amazon Kinesis - ABD401...
 
Database automation guide - Oracle Community Tour LATAM 2023
Database automation guide - Oracle Community Tour LATAM 2023Database automation guide - Oracle Community Tour LATAM 2023
Database automation guide - Oracle Community Tour LATAM 2023
 
Dynomite @ RedisConf 2017
Dynomite @ RedisConf 2017Dynomite @ RedisConf 2017
Dynomite @ RedisConf 2017
 
Query and audit logging in cassandra
Query and audit logging in cassandraQuery and audit logging in cassandra
Query and audit logging in cassandra
 
Designing for operability and managability
Designing for operability and managabilityDesigning for operability and managability
Designing for operability and managability
 
Scaling up uber's real time data analytics
Scaling up uber's real time data analyticsScaling up uber's real time data analytics
Scaling up uber's real time data analytics
 
Cloud-Scale BGP and NetFlow Analysis
Cloud-Scale BGP and NetFlow AnalysisCloud-Scale BGP and NetFlow Analysis
Cloud-Scale BGP and NetFlow Analysis
 

Recently uploaded

Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Thierry Lestable
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
Frank van Harmelen
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
Product School
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Product School
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
Product School
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Jeffrey Haguewood
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
RTTS
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Ramesh Iyer
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
Product School
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Product School
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
KatiaHIMEUR1
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
DanBrown980551
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
Alison B. Lowndes
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
91mobiles
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Tobias Schneck
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
BookNet Canada
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance
 

Recently uploaded (20)

Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 

Protecting the Data Lake

  • 2. A bit about Ash Narkar ! @ashtalk
  • 3. ● Data Lake Overview ● Open Policy Agent ○ Community ○ Features ○ Use Cases ● Use case deep dive ○ Ceph Data Protection Agenda
  • 5. Data is King ! ● Pervasive ● Abundant ● Customer Experience ● Revenue Growth
  • 6. Data is King ! ● Pervasive ● Abundant ● Customer Experience ● Revenue Growth ● Cyber Attacks ● Breaches ● Fines ● Loss of Customer Trust
  • 7. What Is A Data Lake?
  • 8. Data Lake Features ● Centralized Content ● Scalability ● Multiple data type support ● Resource optimization
  • 10. Data Lake Platform: Kafka Features ● Distributed streaming platform ● Building real-time streaming data pipelines and applications Security Challenges ● Authorization using Access Control Lists(ACLs) ● How to authorize requests based on context, like user, IP, common name in certificate Security Policies ● Consumers of topics containing PII must be whitelisted ● Producers to topics with high fanout must be whitelisted
  • 11. Data Lake Platform: Ceph Features ● Unified distributed storage system ● Delivers object, block, and file storage Security Challenges ● Security protocol handles only Ceph clients and servers. NO human users or applications 😔 Security Policies ● Users can access only those buckets belonging to the same geographical region as them ● Access based on a user’s Business Unit, Department etc.
  • 12. Data Lake Platform: Elasticsearch Features ● Full-text search and analytics engine ● Store, search and analyze Security Challenges ● Authorization is not considered as part of job ● User responsible for implementing access control Security Policies ● Access control policies for a patient’s PHI
  • 13. Security Challenge Overview ● Distinct systems ● Changing security requirements ❌ Hardcoding policy ❌ Tight coupling ✅ Expressiveness ✅ Speed and performance ✅ Unified Solution
  • 14. Who can solve the Security Challenge ?
  • 16. OPA: Community Inception Project started in 2016 at Styra. Goal Unify policy enforcement across the stack. Use Cases Admission control Authorization ACLs RBAC IAM ABAC Risk management Data Protection Data Filtering Users Netflix Medallia Chef Cloudflare State Street Pinterest Intuit Capital One ...and many more. Today CNCF project (Incubating) 59 contributors 800+ slack members 2000+ stars 20+ integrations
  • 18. ● Declarative Policy Language (Rego) ○ Can user X do operation Y on resource Z? ○ What invariants does workload W violate? ○ Which records should bob be allowed to see? ● Library, sidecar, host-level daemon ○ Policy and data are kept in-memory ○ Zero decision-time dependencies ● Management APIs for control & observability ○ Bundle service API for sending policy & data to OPA ○ Status service API for receiving status from OPA ○ Log service API for receiving audit log from OPA ● Tooling to build, test, and debug policy ○ opa run, opa test, opa fmt, opa deps, opa check, etc. ○ VS Code plugin, Tracing, Profiling, etc. OPA: Features Service OPA Policy (Rego) Data (JSON) Request DecisionQuery
  • 19. How does OPA work?
  • 20. How does OPA work? Salary Service V1 OPA Policy (Rego) Data (JSON) Request DecisionQuery Example policy "Employees can read their own salary and the salary of anyone they manage."
  • 21. How does OPA work? Example policy Employees can read their own salary and the salary of anyone they manage. Input Data method: "GET" path: ["salary", "bob"] user: "bob"
  • 22. 3 Steps to OPA Step 1: Clone OPA Repo
  • 23. 3 Steps to OPA Step 1: Clone OPA Repo Step 2: Build OPA binary
  • 24. 3 Steps to OPA Step 1: Clone OPA Repo Step 2: Build OPA binary Step 3: Execute OPA binary
  • 25.
  • 26. Use Cases CLOUD Host DB Host sshd App Container HTTP API Microservice APIs Orchestrator Admission Control Container Execution, SSH, sudo Linux Risk Management Data Protection and Data Filtering
  • 27. OPA Use Case: Ceph Data Protection
  • 28. Ceph Architecture CEPH STORAGE CLUSTER (RADOS) LIBRADOS RBD CEPH FSRADOSGW
  • 29. Ceph Data Protection: Setup OPA Node Port Service Incoming Request HTTP S3 Api RADOSGW RADOS user, method, bucket name Allow / Deny Policy (Rego) Data (JSON)
  • 30. Example policy "Users can access only those buckets belonging to the same geographical region as them."
  • 31. Demo: Ceph Data Protection https://katacoda.com/styra
  • 32. Data is King ! ● Pervasive ● Abundant ● Customer Experience ● Revenue Growth ● Cyber Attacks ● Breaches ● Fines ● Loss of Customer Trust