SlideShare a Scribd company logo
1 of 21
1 © Hortonworks Inc. 2011–2018. All rights reserved
Integration of Apache Accumulo and
Apache Ranger
Marcus Waineo, Principal Solutions Engineer
John Highcock, Systems Architect
2 © Hortonworks Inc. 2011–2018. All rights reserved
Introduction
2
• Attribute Based Access Control (ABAC)
• Apache Ranger
• Apache Accumulo security
• Accumulo-Ranger integration Demo
• Future Work
3 © Hortonworks Inc. 2011–2018. All rights reserved
What Is Attribute Based Access Control?
3
• An access control method where subject requests to perform operations on objects are
granted or denied based on assigned attributes of the subject, assigned attributes of the
object, environment conditions, and a set of policies that are specified in terms of those
attributes and conditions.
• Subject (person entity / non-person entity)
• Object (Resource)
• Attributes (of subject and object
• Policy
• Environment
4 © Hortonworks Inc. 2011–2018. All rights reserved
Why Use ABAC?
4
• Consistent implementation of centralized policies, centralized object and attribute
management
• More flexible than legacy RBAC / ACL based systems and technologies
• Examples
5 © Hortonworks Inc. 2011–2018. All rights reserved
What Is Apache Ranger?
5
• Created by XA Secure, acquired by Hortonworks and donated to Apache in 2014
• Distributed framework for management and monitoring of comprehensive data security
of the Hadoop ecosystem
• Provides centralized policy management and audit
• Provides distributed policy decision points and policy enforcement
• Already supports 10+ ecosystem components including
• HDFS
• YARN
• Nifi
• HiveServer 2
• etc.
6 © Hortonworks Inc. 2011–2018. All rights reserved
HDFS
Ranger Administration Portal
HBase
Hive Server2
Ranger Audit Server
Ranger Plugin
HadoopComponentsEnterprise
Users
Ranger Plugin
Ranger Plugin
Legacy Tools and Data Governance
HDFS
Knox
NifI
Ranger Plugin
Ranger Plugin
SolrRanger Plugin
Ranger Policy Server
Integration API
KafkaRanger Plugin
YARNRanger Plugin
Ranger PluginStorm Ranger Plugin Accumulo
Solr
How Apache Ranger works
7 © Hortonworks Inc. 2011–2018. All rights reserved
Why Use Apache Ranger with Accumulo?
7
• The only Hadoop ecosystem security project capable of supporting ABAC
• Provides security framework to allow users to more easily leverage Accumulo’s superior
security features
• Distributed plugin architecture maps well to Accumulo tabletserver architecture
• Centralized policy management and audit for Accumulo and other Hadoop ecosystem
services (e.g. HDFS)
8 © Hortonworks Inc. 2011–2018. All rights reserved
How Accumulo Security Works
8
• Authentication
• Supports Kerberos
• Permissions
• Pertains to System, Table, and Namespace
• Authorizations
• Pertains to data within the system, implemented via Accumulo column visibilities
9 © Hortonworks Inc. 2011–2018. All rights reserved
Accumulo-Ranger Integration Status
9
• Ranger Authorization and Auditing via User and/or Group Assignment for
• Table
• Namespace
• System
• Authorizations
• Created a few things
• Ranger Plugin for Accumulo
• Ambari blueprint for automated Kerberos/Ranger deployment
• Docker containers
• https://github.com/trucemop/rangerAccumulo
10 © Hortonworks Inc. 2011–2018. All rights reserved
Demo
11 © Hortonworks Inc. 2011–2018. All rights reserved
Demo Overview
11
• Simple example of a doctor’s office
• One table within a namespace (office.patient)
• Demo will use 2 users
• doctorbob
• tom, member of reception group
• Below able contains entries with labels for pii, phi, maskedpii, and maskedphi
Accumulo Authorizations
Accumulo Permissions
Namespace Table All
pii phi maskedpii maskedphi office office.patient
Users
doctorbob X X X
accumulo X
Groups
reception X X X
billing X X
12 © Hortonworks Inc. 2011–2018. All rights reserved
Accumulo Permissions and Authorizations as Ranger Policies
12
13 © Hortonworks Inc. 2011–2018. All rights reserved
Demo Script
13
• doctorbob scan order:
1. scan office.patient table with no auth
2. scan office.patient table with pii (explicitly granted to user doctorbob)
3. scan office.patient table with pii,phi (explicitly granted to user doctorbob)
4. scan office.patient table with pii,phi,nope (undefined accumulo authorization)
5. scan office.patient table with pii,phi,maskedpii (defined, but unauthorized)
• tom scan order
1. scan office.patient table with maskedpii,maskedphi (granted by tom's group membership)
2. scan office.patient table with maskedpii,maskedphi,pii (ungranted accumulo authornization)
3. scan trace table (namespace or table he does not have access to)
4. insert into office.patient (write on table)
5. scan office.patient table with maskedpii (scan to show new entries)
14 © Hortonworks Inc. 2011–2018. All rights reserved
User “doctorbob” scans, pt1
14
15 © Hortonworks Inc. 2011–2018. All rights reserved
User “doctorbob” scans, pt2
15
16 © Hortonworks Inc. 2011–2018. All rights reserved
Audit events of “doctorbob” scans
16
17 © Hortonworks Inc. 2011–2018. All rights reserved
User “tom” scans, pt1
17
18 © Hortonworks Inc. 2011–2018. All rights reserved
User “tom” scans, pt2
18
19 © Hortonworks Inc. 2011–2018. All rights reserved
Audit events of “tom” scans
19
20 © Hortonworks Inc. 2011–2018. All rights reserved
Future Work
20
• Enable editing Ranger policies via Accumulo API
• Consolidation of audit events​
• RANGER-919
• Resource lookup for table/namespace
• LDAP attribute sync
• Apache Atlas Tag integration
21 © Hortonworks Inc. 2011–2018. All rights reserved
Thank you

More Related Content

What's hot

A Deep Dive into Kafka Controller
A Deep Dive into Kafka ControllerA Deep Dive into Kafka Controller
A Deep Dive into Kafka Controllerconfluent
 
Intelligently Collecting Data at the Edge - Intro to Apache MiNiFi
Intelligently Collecting Data at the Edge - Intro to Apache MiNiFiIntelligently Collecting Data at the Edge - Intro to Apache MiNiFi
Intelligently Collecting Data at the Edge - Intro to Apache MiNiFiDataWorks Summit
 
Introduction to Kafka Streams
Introduction to Kafka StreamsIntroduction to Kafka Streams
Introduction to Kafka StreamsGuozhang Wang
 
Real-Time Data Flows with Apache NiFi
Real-Time Data Flows with Apache NiFiReal-Time Data Flows with Apache NiFi
Real-Time Data Flows with Apache NiFiManish Gupta
 
Apache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals ExplainedApache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals Explainedconfluent
 
Apache Spark Streaming in K8s with ArgoCD & Spark Operator
Apache Spark Streaming in K8s with ArgoCD & Spark OperatorApache Spark Streaming in K8s with ArgoCD & Spark Operator
Apache Spark Streaming in K8s with ArgoCD & Spark OperatorDatabricks
 
Prometheus
PrometheusPrometheus
Prometheuswyukawa
 
Git ops & Continuous Infrastructure with terra*
Git ops  & Continuous Infrastructure with terra*Git ops  & Continuous Infrastructure with terra*
Git ops & Continuous Infrastructure with terra*Haggai Philip Zagury
 
Introduction to OPA
Introduction to OPAIntroduction to OPA
Introduction to OPAKnoldus Inc.
 
Log analysis using elk
Log analysis using elkLog analysis using elk
Log analysis using elkRushika Shah
 
Fluentd and Kafka
Fluentd and KafkaFluentd and Kafka
Fluentd and KafkaN Masahiro
 
[WSO2 API Manager Community Call] Mastering JWTs with WSO2 API Manager
[WSO2 API Manager Community Call] Mastering JWTs with WSO2 API Manager[WSO2 API Manager Community Call] Mastering JWTs with WSO2 API Manager
[WSO2 API Manager Community Call] Mastering JWTs with WSO2 API ManagerWSO2
 
Apache NiFi SDLC Improvements
Apache NiFi SDLC ImprovementsApache NiFi SDLC Improvements
Apache NiFi SDLC ImprovementsBryan Bende
 
Hadoop REST API Security with Apache Knox Gateway
Hadoop REST API Security with Apache Knox GatewayHadoop REST API Security with Apache Knox Gateway
Hadoop REST API Security with Apache Knox GatewayDataWorks Summit
 

What's hot (20)

A Deep Dive into Kafka Controller
A Deep Dive into Kafka ControllerA Deep Dive into Kafka Controller
A Deep Dive into Kafka Controller
 
Apache Kafka Security
Apache Kafka Security Apache Kafka Security
Apache Kafka Security
 
Apache Kafka Best Practices
Apache Kafka Best PracticesApache Kafka Best Practices
Apache Kafka Best Practices
 
Intelligently Collecting Data at the Edge - Intro to Apache MiNiFi
Intelligently Collecting Data at the Edge - Intro to Apache MiNiFiIntelligently Collecting Data at the Edge - Intro to Apache MiNiFi
Intelligently Collecting Data at the Edge - Intro to Apache MiNiFi
 
Introduction to Kafka Streams
Introduction to Kafka StreamsIntroduction to Kafka Streams
Introduction to Kafka Streams
 
Real-Time Data Flows with Apache NiFi
Real-Time Data Flows with Apache NiFiReal-Time Data Flows with Apache NiFi
Real-Time Data Flows with Apache NiFi
 
Apache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals ExplainedApache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals Explained
 
Deep dive into AWS fargate
Deep dive into AWS fargateDeep dive into AWS fargate
Deep dive into AWS fargate
 
Apache Spark Streaming in K8s with ArgoCD & Spark Operator
Apache Spark Streaming in K8s with ArgoCD & Spark OperatorApache Spark Streaming in K8s with ArgoCD & Spark Operator
Apache Spark Streaming in K8s with ArgoCD & Spark Operator
 
Prometheus
PrometheusPrometheus
Prometheus
 
Git ops & Continuous Infrastructure with terra*
Git ops  & Continuous Infrastructure with terra*Git ops  & Continuous Infrastructure with terra*
Git ops & Continuous Infrastructure with terra*
 
Introduction to OPA
Introduction to OPAIntroduction to OPA
Introduction to OPA
 
Log analysis using elk
Log analysis using elkLog analysis using elk
Log analysis using elk
 
Fluentd and Kafka
Fluentd and KafkaFluentd and Kafka
Fluentd and Kafka
 
Rego Deep Dive
Rego Deep DiveRego Deep Dive
Rego Deep Dive
 
ELK Stack
ELK StackELK Stack
ELK Stack
 
[WSO2 API Manager Community Call] Mastering JWTs with WSO2 API Manager
[WSO2 API Manager Community Call] Mastering JWTs with WSO2 API Manager[WSO2 API Manager Community Call] Mastering JWTs with WSO2 API Manager
[WSO2 API Manager Community Call] Mastering JWTs with WSO2 API Manager
 
Apache NiFi in the Hadoop Ecosystem
Apache NiFi in the Hadoop Ecosystem Apache NiFi in the Hadoop Ecosystem
Apache NiFi in the Hadoop Ecosystem
 
Apache NiFi SDLC Improvements
Apache NiFi SDLC ImprovementsApache NiFi SDLC Improvements
Apache NiFi SDLC Improvements
 
Hadoop REST API Security with Apache Knox Gateway
Hadoop REST API Security with Apache Knox GatewayHadoop REST API Security with Apache Knox Gateway
Hadoop REST API Security with Apache Knox Gateway
 

Similar to Enabling ABAC with Accumulo and Ranger integration

Accumulo and Ranger Integration
Accumulo and Ranger IntegrationAccumulo and Ranger Integration
Accumulo and Ranger IntegrationAccumulo Summit
 
Saving the elephant—now, not later
Saving the elephant—now, not laterSaving the elephant—now, not later
Saving the elephant—now, not laterDataWorks Summit
 
Is your Enterprise Data lake Metadata Driven AND Secure?
Is your Enterprise Data lake Metadata Driven AND Secure?Is your Enterprise Data lake Metadata Driven AND Secure?
Is your Enterprise Data lake Metadata Driven AND Secure?DataWorks Summit/Hadoop Summit
 
Classification based security in Hadoop
Classification based security in HadoopClassification based security in Hadoop
Classification based security in HadoopMadhan Neethiraj
 
Overview of new features in Apache Ranger
Overview of new features in Apache RangerOverview of new features in Apache Ranger
Overview of new features in Apache RangerDataWorks Summit
 
Securing Hadoop with Apache Ranger
Securing Hadoop with Apache RangerSecuring Hadoop with Apache Ranger
Securing Hadoop with Apache RangerDataWorks Summit
 
Atlas and ranger epam meetup
Atlas and ranger epam meetupAtlas and ranger epam meetup
Atlas and ranger epam meetupAlex Zeltov
 
What the #$* is a Business Catalog and why you need it
What the #$* is a Business Catalog and why you need it What the #$* is a Business Catalog and why you need it
What the #$* is a Business Catalog and why you need it DataWorks Summit/Hadoop Summit
 
Security and Data Governance using Apache Ranger and Apache Atlas
Security and Data Governance using Apache Ranger and Apache AtlasSecurity and Data Governance using Apache Ranger and Apache Atlas
Security and Data Governance using Apache Ranger and Apache AtlasDataWorks Summit/Hadoop Summit
 
Apache Atlas: Why Big Data Management Requires Hierarchical Taxonomies
Apache Atlas: Why Big Data Management Requires Hierarchical Taxonomies Apache Atlas: Why Big Data Management Requires Hierarchical Taxonomies
Apache Atlas: Why Big Data Management Requires Hierarchical Taxonomies DataWorks Summit/Hadoop Summit
 
Top Three Big Data Governance Issues and How Apache ATLAS resolves it for the...
Top Three Big Data Governance Issues and How Apache ATLAS resolves it for the...Top Three Big Data Governance Issues and How Apache ATLAS resolves it for the...
Top Three Big Data Governance Issues and How Apache ATLAS resolves it for the...DataWorks Summit/Hadoop Summit
 
Dynamic Column Masking and Row-Level Filtering in HDP
Dynamic Column Masking and Row-Level Filtering in HDPDynamic Column Masking and Row-Level Filtering in HDP
Dynamic Column Masking and Row-Level Filtering in HDPHortonworks
 
HDF 3.1 : An Introduction to New Features
HDF 3.1 : An Introduction to New FeaturesHDF 3.1 : An Introduction to New Features
HDF 3.1 : An Introduction to New FeaturesTimothy Spann
 
Managing enterprise users in Hadoop ecosystem
Managing enterprise users in Hadoop ecosystemManaging enterprise users in Hadoop ecosystem
Managing enterprise users in Hadoop ecosystemDataWorks Summit
 
Apache Argus - How do I secure my entire Hadoop cluster? Olivier Renault @ Ho...
Apache Argus - How do I secure my entire Hadoop cluster? Olivier Renault @ Ho...Apache Argus - How do I secure my entire Hadoop cluster? Olivier Renault @ Ho...
Apache Argus - How do I secure my entire Hadoop cluster? Olivier Renault @ Ho...huguk
 
Driving Enterprise Data Governance for Big Data Systems through Apache Falcon
Driving Enterprise Data Governance for Big Data Systems through Apache FalconDriving Enterprise Data Governance for Big Data Systems through Apache Falcon
Driving Enterprise Data Governance for Big Data Systems through Apache FalconDataWorks Summit
 
Data Governance in Apache Falcon - Hadoop Summit Brussels 2015
Data Governance in Apache Falcon - Hadoop Summit Brussels 2015 Data Governance in Apache Falcon - Hadoop Summit Brussels 2015
Data Governance in Apache Falcon - Hadoop Summit Brussels 2015 Seetharam Venkatesh
 
Fortifying Multi-Cluster Hybrid Cloud Data Lakes using Apache Knox
Fortifying Multi-Cluster Hybrid Cloud Data Lakes using Apache KnoxFortifying Multi-Cluster Hybrid Cloud Data Lakes using Apache Knox
Fortifying Multi-Cluster Hybrid Cloud Data Lakes using Apache KnoxDataWorks Summit
 

Similar to Enabling ABAC with Accumulo and Ranger integration (20)

Accumulo and Ranger Integration
Accumulo and Ranger IntegrationAccumulo and Ranger Integration
Accumulo and Ranger Integration
 
Saving the elephant—now, not later
Saving the elephant—now, not laterSaving the elephant—now, not later
Saving the elephant—now, not later
 
Is your Enterprise Data lake Metadata Driven AND Secure?
Is your Enterprise Data lake Metadata Driven AND Secure?Is your Enterprise Data lake Metadata Driven AND Secure?
Is your Enterprise Data lake Metadata Driven AND Secure?
 
Classification based security in Hadoop
Classification based security in HadoopClassification based security in Hadoop
Classification based security in Hadoop
 
Overview of new features in Apache Ranger
Overview of new features in Apache RangerOverview of new features in Apache Ranger
Overview of new features in Apache Ranger
 
Enterprise Data Classification and Provenance
Enterprise Data Classification and ProvenanceEnterprise Data Classification and Provenance
Enterprise Data Classification and Provenance
 
Securing Hadoop with Apache Ranger
Securing Hadoop with Apache RangerSecuring Hadoop with Apache Ranger
Securing Hadoop with Apache Ranger
 
Atlas and ranger epam meetup
Atlas and ranger epam meetupAtlas and ranger epam meetup
Atlas and ranger epam meetup
 
What the #$* is a Business Catalog and why you need it
What the #$* is a Business Catalog and why you need it What the #$* is a Business Catalog and why you need it
What the #$* is a Business Catalog and why you need it
 
Security and Data Governance using Apache Ranger and Apache Atlas
Security and Data Governance using Apache Ranger and Apache AtlasSecurity and Data Governance using Apache Ranger and Apache Atlas
Security and Data Governance using Apache Ranger and Apache Atlas
 
Apache Atlas: Governance for your Data
Apache Atlas: Governance for your DataApache Atlas: Governance for your Data
Apache Atlas: Governance for your Data
 
Apache Atlas: Why Big Data Management Requires Hierarchical Taxonomies
Apache Atlas: Why Big Data Management Requires Hierarchical Taxonomies Apache Atlas: Why Big Data Management Requires Hierarchical Taxonomies
Apache Atlas: Why Big Data Management Requires Hierarchical Taxonomies
 
Top Three Big Data Governance Issues and How Apache ATLAS resolves it for the...
Top Three Big Data Governance Issues and How Apache ATLAS resolves it for the...Top Three Big Data Governance Issues and How Apache ATLAS resolves it for the...
Top Three Big Data Governance Issues and How Apache ATLAS resolves it for the...
 
Dynamic Column Masking and Row-Level Filtering in HDP
Dynamic Column Masking and Row-Level Filtering in HDPDynamic Column Masking and Row-Level Filtering in HDP
Dynamic Column Masking and Row-Level Filtering in HDP
 
HDF 3.1 : An Introduction to New Features
HDF 3.1 : An Introduction to New FeaturesHDF 3.1 : An Introduction to New Features
HDF 3.1 : An Introduction to New Features
 
Managing enterprise users in Hadoop ecosystem
Managing enterprise users in Hadoop ecosystemManaging enterprise users in Hadoop ecosystem
Managing enterprise users in Hadoop ecosystem
 
Apache Argus - How do I secure my entire Hadoop cluster? Olivier Renault @ Ho...
Apache Argus - How do I secure my entire Hadoop cluster? Olivier Renault @ Ho...Apache Argus - How do I secure my entire Hadoop cluster? Olivier Renault @ Ho...
Apache Argus - How do I secure my entire Hadoop cluster? Olivier Renault @ Ho...
 
Driving Enterprise Data Governance for Big Data Systems through Apache Falcon
Driving Enterprise Data Governance for Big Data Systems through Apache FalconDriving Enterprise Data Governance for Big Data Systems through Apache Falcon
Driving Enterprise Data Governance for Big Data Systems through Apache Falcon
 
Data Governance in Apache Falcon - Hadoop Summit Brussels 2015
Data Governance in Apache Falcon - Hadoop Summit Brussels 2015 Data Governance in Apache Falcon - Hadoop Summit Brussels 2015
Data Governance in Apache Falcon - Hadoop Summit Brussels 2015
 
Fortifying Multi-Cluster Hybrid Cloud Data Lakes using Apache Knox
Fortifying Multi-Cluster Hybrid Cloud Data Lakes using Apache KnoxFortifying Multi-Cluster Hybrid Cloud Data Lakes using Apache Knox
Fortifying Multi-Cluster Hybrid Cloud Data Lakes using Apache Knox
 

More from DataWorks Summit

Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisDataWorks Summit
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiTracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiDataWorks Summit
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...DataWorks Summit
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...DataWorks Summit
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal SystemDataWorks Summit
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExampleDataWorks Summit
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberDataWorks Summit
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixDataWorks Summit
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiDataWorks Summit
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsDataWorks Summit
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureDataWorks Summit
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EngineDataWorks Summit
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...DataWorks Summit
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudDataWorks Summit
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiEvent-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiDataWorks Summit
 
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerDataWorks Summit
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...DataWorks Summit
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouDataWorks Summit
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkDataWorks Summit
 

More from DataWorks Summit (20)

Data Science Crash Course
Data Science Crash CourseData Science Crash Course
Data Science Crash Course
 
Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache Ratis
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiTracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal System
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist Example
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at Uber
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability Improvements
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant Architecture
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything Engine
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google Cloud
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiEvent-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
 
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near You
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
 

Recently uploaded

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 

Recently uploaded (20)

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 

Enabling ABAC with Accumulo and Ranger integration

  • 1. 1 © Hortonworks Inc. 2011–2018. All rights reserved Integration of Apache Accumulo and Apache Ranger Marcus Waineo, Principal Solutions Engineer John Highcock, Systems Architect
  • 2. 2 © Hortonworks Inc. 2011–2018. All rights reserved Introduction 2 • Attribute Based Access Control (ABAC) • Apache Ranger • Apache Accumulo security • Accumulo-Ranger integration Demo • Future Work
  • 3. 3 © Hortonworks Inc. 2011–2018. All rights reserved What Is Attribute Based Access Control? 3 • An access control method where subject requests to perform operations on objects are granted or denied based on assigned attributes of the subject, assigned attributes of the object, environment conditions, and a set of policies that are specified in terms of those attributes and conditions. • Subject (person entity / non-person entity) • Object (Resource) • Attributes (of subject and object • Policy • Environment
  • 4. 4 © Hortonworks Inc. 2011–2018. All rights reserved Why Use ABAC? 4 • Consistent implementation of centralized policies, centralized object and attribute management • More flexible than legacy RBAC / ACL based systems and technologies • Examples
  • 5. 5 © Hortonworks Inc. 2011–2018. All rights reserved What Is Apache Ranger? 5 • Created by XA Secure, acquired by Hortonworks and donated to Apache in 2014 • Distributed framework for management and monitoring of comprehensive data security of the Hadoop ecosystem • Provides centralized policy management and audit • Provides distributed policy decision points and policy enforcement • Already supports 10+ ecosystem components including • HDFS • YARN • Nifi • HiveServer 2 • etc.
  • 6. 6 © Hortonworks Inc. 2011–2018. All rights reserved HDFS Ranger Administration Portal HBase Hive Server2 Ranger Audit Server Ranger Plugin HadoopComponentsEnterprise Users Ranger Plugin Ranger Plugin Legacy Tools and Data Governance HDFS Knox NifI Ranger Plugin Ranger Plugin SolrRanger Plugin Ranger Policy Server Integration API KafkaRanger Plugin YARNRanger Plugin Ranger PluginStorm Ranger Plugin Accumulo Solr How Apache Ranger works
  • 7. 7 © Hortonworks Inc. 2011–2018. All rights reserved Why Use Apache Ranger with Accumulo? 7 • The only Hadoop ecosystem security project capable of supporting ABAC • Provides security framework to allow users to more easily leverage Accumulo’s superior security features • Distributed plugin architecture maps well to Accumulo tabletserver architecture • Centralized policy management and audit for Accumulo and other Hadoop ecosystem services (e.g. HDFS)
  • 8. 8 © Hortonworks Inc. 2011–2018. All rights reserved How Accumulo Security Works 8 • Authentication • Supports Kerberos • Permissions • Pertains to System, Table, and Namespace • Authorizations • Pertains to data within the system, implemented via Accumulo column visibilities
  • 9. 9 © Hortonworks Inc. 2011–2018. All rights reserved Accumulo-Ranger Integration Status 9 • Ranger Authorization and Auditing via User and/or Group Assignment for • Table • Namespace • System • Authorizations • Created a few things • Ranger Plugin for Accumulo • Ambari blueprint for automated Kerberos/Ranger deployment • Docker containers • https://github.com/trucemop/rangerAccumulo
  • 10. 10 © Hortonworks Inc. 2011–2018. All rights reserved Demo
  • 11. 11 © Hortonworks Inc. 2011–2018. All rights reserved Demo Overview 11 • Simple example of a doctor’s office • One table within a namespace (office.patient) • Demo will use 2 users • doctorbob • tom, member of reception group • Below able contains entries with labels for pii, phi, maskedpii, and maskedphi Accumulo Authorizations Accumulo Permissions Namespace Table All pii phi maskedpii maskedphi office office.patient Users doctorbob X X X accumulo X Groups reception X X X billing X X
  • 12. 12 © Hortonworks Inc. 2011–2018. All rights reserved Accumulo Permissions and Authorizations as Ranger Policies 12
  • 13. 13 © Hortonworks Inc. 2011–2018. All rights reserved Demo Script 13 • doctorbob scan order: 1. scan office.patient table with no auth 2. scan office.patient table with pii (explicitly granted to user doctorbob) 3. scan office.patient table with pii,phi (explicitly granted to user doctorbob) 4. scan office.patient table with pii,phi,nope (undefined accumulo authorization) 5. scan office.patient table with pii,phi,maskedpii (defined, but unauthorized) • tom scan order 1. scan office.patient table with maskedpii,maskedphi (granted by tom's group membership) 2. scan office.patient table with maskedpii,maskedphi,pii (ungranted accumulo authornization) 3. scan trace table (namespace or table he does not have access to) 4. insert into office.patient (write on table) 5. scan office.patient table with maskedpii (scan to show new entries)
  • 14. 14 © Hortonworks Inc. 2011–2018. All rights reserved User “doctorbob” scans, pt1 14
  • 15. 15 © Hortonworks Inc. 2011–2018. All rights reserved User “doctorbob” scans, pt2 15
  • 16. 16 © Hortonworks Inc. 2011–2018. All rights reserved Audit events of “doctorbob” scans 16
  • 17. 17 © Hortonworks Inc. 2011–2018. All rights reserved User “tom” scans, pt1 17
  • 18. 18 © Hortonworks Inc. 2011–2018. All rights reserved User “tom” scans, pt2 18
  • 19. 19 © Hortonworks Inc. 2011–2018. All rights reserved Audit events of “tom” scans 19
  • 20. 20 © Hortonworks Inc. 2011–2018. All rights reserved Future Work 20 • Enable editing Ranger policies via Accumulo API • Consolidation of audit events​ • RANGER-919 • Resource lookup for table/namespace • LDAP attribute sync • Apache Atlas Tag integration
  • 21. 21 © Hortonworks Inc. 2011–2018. All rights reserved Thank you