Securing the Hadoop Ecosystem
Hadoop Security and Compliance Challenges
• History
  • Security was not a priority for early Hadoop adopters such as Yahoo! and Facebook; it is now
• Data concentration
  • Quantity and diversity of data create compliance challenges
• Flexibility of the Hadoop architecture
  • Many paths for data in, out, and processing
  • Access to data at different granularities, from fields to files
  • ELT: sensitive data “discovery” occurs after data arrives
Cloudera has led investment in security
Authentication
• First Hadoop distribution to offer strong authentication throughout
Encryption
• First Hadoop distribution to support encryption on the wire
Audit
• Only Hadoop distribution to support audit histories for all data objects and access paths
• Single point for log capture and audit
Authorization
• Founded the Apache Sentry project along with Oracle and Lab41 to manage fine-grained permissions
Automation
• Cloudera Manager automates security configurations and LDAP/AD integration
Case Study: Finance and Banking
Fraud and Purchasing Behavior Analysis
• Identify patterns in financially sensitive, PCI, and PII data
• Before: Unable to build applications on Hadoop; forced to use other systems, to greatly limit Hadoop access, or to forgo analysis due to privacy concerns
• Now: Provide broad analysis capabilities with Impala to a large user population, secured by Sentry
Enterprise Security in Hadoop overview
Four Functional Areas
[Diagram: a Hadoop cluster used by users, applications, and operators, with the four functional areas: Perimeter, Data, Access, Visibility]
Defining the Functional Areas
Perimeter
• Guarding access to the cluster itself
• Technical Concepts: Authentication, Network isolation
Data
• Protecting data in the cluster from unauthorized visibility
• Technical Concepts: Encryption, Tokenization, Data masking
Access
• Defining what users and applications can do with data
• Technical Concepts: Permissions, Authorization
Visibility
• Reporting on where data came from and how it’s being used
• Technical Concepts: Auditing, Lineage
Enabling Enterprise Security
• Perimeter: Kerberos | AD/LDAP
• Data: Native | Certified Partners
• Access: Sentry
• Visibility: Cloudera Navigator
Perimeter: Authentication in Hadoop
Kerberos
• Provably strong authentication between all Hadoop services and (optionally) to end-points
• Cloudera Manager hides complexity
LDAP/AD
• Username / password
• Option for Hue, Hive Metastore, Impala connectors, Cloudera Manager admin logins
SAML
• Single Sign-On (SSO) for the options listed above
• Kerberos clients no longer required on most user end-points
Authentication Options and Coverage
[Diagram]
• Cluster services: HDFS (DN, NN), YARN (RM, AM), Impala (ID, SS), MapReduce (JT, TT), other services (Oozie, Search, etc.)
• Client side: clients, applications (Pig, Hive, Hue, etc.), 3rd-party gateway
• Coverage options: “End-to-End” Kerberos, or “Core” Kerberos with “Edge” AD/LDAP/SAML
IT Integration: Kerberos
• Users don’t want Yet Another Credential
• Corp IT doesn’t want to provision and maintain thousands of service principals and keytabs
• Solution: local KDC + one-way trust
  • Run an MIT Kerberos KDC in the cluster
  • Put all service principals in the local KDC
  • Set up one-way trust of the central corporate realm by the local KDC
  • Normal user credentials can be used to access Hadoop
• Recommended: Use Cloudera Manager
  • To properly tune inter-related configuration knobs
  • To manage principal/keytab creation and distribution
  • To preserve service monitoring with Kerberos security enabled
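IT Integration: Kerberos + LDAP
[Diagram]
• Hadoop cluster: local KDC (MIT Kerberos) holding service principals such as hdfs/host1@HADOOP.EXAMPLE.COM, yarn/host2@HADOOP.EXAMPLE.COM, …
• Central Active Directory: user principals such as user@EXAMPLE.COM, …
• Cross-realm trust between the local KDC and the central Active Directory
• LDAP group mapping for cluster services (NN, JT)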
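To make the realm split concrete, here is a minimal Python sketch of how a principal name can be taken apart and matched against the two realms in the example. The realm names come from the example principals; the classify function, its return labels, and the OTHER.ORG realm are hypothetical and only illustrate the idea of a one-way trust, not how a KDC actually evaluates it.

```python
from typing import NamedTuple, Optional

# Realm names taken from the example principals; adjust for a real deployment.
LOCAL_REALM = "HADOOP.EXAMPLE.COM"   # realm of the cluster-local MIT KDC
TRUSTED_REALMS = {"EXAMPLE.COM"}     # central corporate realm (Active Directory)


class Principal(NamedTuple):
    primary: str
    instance: Optional[str]
    realm: str


def parse_principal(text: str) -> Principal:
    """Split 'primary[/instance]@REALM' into its parts."""
    name, sep, realm = text.rpartition("@")
    if not sep or not name or not realm:
        raise ValueError(f"not a fully qualified principal: {text!r}")
    primary, _, instance = name.partition("/")
    return Principal(primary, instance or None, realm)


def classify(principal: Principal) -> str:
    """Hypothetical helper: say which realm vouches for a principal."""
    if principal.realm == LOCAL_REALM:
        return "local"      # issued by the cluster KDC (service principals live here)
    if principal.realm in TRUSTED_REALMS:
        return "trusted"    # corporate user accepted through the cross-realm trust
    return "rejected"       # unknown realm, no trust path


for raw in ("hdfs/host1@HADOOP.EXAMPLE.COM", "user@EXAMPLE.COM", "eve@OTHER.ORG"):
    print(raw, "->", classify(parse_principal(raw)))
```

Because the trust is one-way, the sketch never treats the local realm as trusted by the corporate realm; only corporate users flow into the cluster.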
Network Access Management
• Use Hue as a front end to both Hadoop and Oozie to control access through a web browser
• HTTP proxy servers:
  • Oozie: MR jobs, Pig jobs, Hive jobs
  • HttpFS: hadoop fs operations exposed over HTTP
  • HBase REST server: HBase reads
Secure configuration: co-locate the Oozie, Hue, and HttpFS front ends so that they act as a network bridge
Hue supports AD/LDAP-based authentication instead of Kerberos for client simplicity
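As an illustration of file system access over HTTP, the sketch below builds the kind of URL an HttpFS gateway serves. It assumes the WebHDFS-style REST path (/webhdfs/v1) and the usual HttpFS default port 14000; the host name, user, and path are hypothetical. The user.name parameter applies to pseudo authentication only; a Kerberized gateway would negotiate SPNEGO instead.

```python
from urllib.parse import quote, urlencode


def httpfs_url(host: str, path: str, op: str, user: str, port: int = 14000) -> str:
    """Build a WebHDFS-style REST URL for an HttpFS gateway (no request is sent)."""
    if not path.startswith("/"):
        raise ValueError("HDFS path must be absolute")
    query = urlencode({"op": op, "user.name": user})
    return f"http://{host}:{port}/webhdfs/v1{quote(path)}?{query}"


# Hypothetical gateway host and HDFS directory.
print(httpfs_url("httpfs.example.com", "/user/alice/reports", "LISTSTATUS", "alice"))
```

Only the gateway host needs to be reachable from client networks, which is what lets it act as the bridge described above.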
Data: Protection in Hadoop
Data in Motion: “Network Encryption”
• SASL: Network RPC
• SSL: MapReduce shuffle
• SSL: Web-based user and administration tools
• SSL: JDBC
• HDFS data transfer protocol
Data at Rest: “Data Encryption”
• Certified partner solutions
• Field-level encryption
• Data masking or tokenization
• OS-level file system encryption
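For the data-in-motion items, the following sketch renders a few wire-encryption settings into Hadoop's site-file XML layout. The property names are the ones used by Apache Hadoop 2.x as far as I know (RPC protection, HDFS data transfer encryption, encrypted shuffle); the grouping into a Python dictionary and the render_site_xml helper are illustrative assumptions. A real cluster also needs keystores and truststores, and Cloudera Manager may manage these values for you.

```python
import xml.etree.ElementTree as ET

# Assumed Apache Hadoop 2.x property names, grouped by the file they belong to.
WIRE_ENCRYPTION = {
    "core-site.xml": {"hadoop.rpc.protection": "privacy"},          # SASL on RPC
    "hdfs-site.xml": {"dfs.encrypt.data.transfer": "true"},         # HDFS data transfer
    "mapred-site.xml": {"mapreduce.shuffle.ssl.enabled": "true"},   # SSL shuffle
}


def render_site_xml(properties: dict) -> str:
    """Render name/value pairs as a Hadoop <configuration> document."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


for filename, props in WIRE_ENCRYPTION.items():
    print(filename)
    print(render_site_xml(props))
```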
Prior State of Authorization
Two Sub-Optimal Choices for SQL on Hadoop
• Insecure Advisory Authorization
  • Users could grant themselves permissions
  • Intended to prevent accidental deletion of data
  • Problem: Did not guard against malicious users
  • Problem: Only worked with Hive
• HDFS Impersonation
  • Data was only protected at the file level by HDFS permissions
  • Problem: File-level not granular enough
  • Problem: Lacked flexibility; not role-based
Sentry: Key Capabilities
Fine-Grained Authorization
• Specify security for SERVERS, DATABASES, TABLES, VIEWS, and search indices
Role-Based Authorization
• SELECT privilege on views & tables
• INSERT privilege on tables
• TRANSFORM privilege on servers
• ALL privilege on the server, databases, tables & views
• ALL privilege is needed to create/modify schema
Multitenant Administration
• Separate policies for each database/schema
• Can be maintained by separate admins
Sentry Architecture
[Diagram]
• Clients: Impala, HiveServer2, Search (SQL, MR, and search queries)
• Binding layer: Impala, Hive, Search
• Authorization provider: policy engine (evaluation, validation) and policy provider (parsing), each behind an interface
• Policy storage: file (local FS/HDFS) or database
Query Execution Flow
• Parse: Validate SQL grammar
• Build: Construct statement tree
• Check: Validate statement objects
  • First check: Authorization (Sentry)
• Plan: Forward to execution planner
Multitenant Security
Global policy file
[groups]
admin_group = admin_role
dep1_admin = uri_role
[roles]
admin_role = server=server1
uri_role = server=server1->uri=hdfs://ha-nn-uri/data
[databases]
db1 = hdfs://ha-nn-uri/user/hive/sentry/db1.ini
Per-database policy file (db1.ini)
[groups]
dep1_admin = db1_admin_role
dep1_analyst = db1_read_role
[roles]
db1_admin_role = server=server1->db=db1
db1_read_role = server=server1->db=db1->table=*->action=select
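The group-to-role-to-privilege chain in these policy files can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not Sentry's actual policy engine: the parser, the implies rule (a privilege on a parent object covers its children, a missing action means ALL), and the table name orders are all hypothetical, and URI privileges are not handled.

```python
PER_DATABASE_POLICY = """
[groups]
dep1_admin = db1_admin_role
dep1_analyst = db1_read_role
[roles]
db1_admin_role = server=server1->db=db1
db1_read_role = server=server1->db=db1->table=*->action=select
"""


def parse_policy(text: str) -> dict:
    """Parse '[section]' headers and 'key = v1, v2' lines into nested dicts."""
    sections, current = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("[") and line.endswith("]"):
            current = sections.setdefault(line[1:-1], {})
            continue
        if current is None:
            raise ValueError(f"entry outside of a section: {line!r}")
        key, _, value = line.partition("=")   # split on the first '=' only
        current[key.strip()] = [v.strip() for v in value.split(",")]
    return sections


def split_privilege(text: str):
    """Return (object path, action); a missing action is treated as 'all'."""
    parts = [tuple(p.strip().lower().split("=", 1)) for p in text.split("->")]
    action = "all"
    if parts and parts[-1][0] == "action":
        action = parts.pop()[1]
    return parts, action


def implies(granted: str, requested: str) -> bool:
    """Simplified rule: a grant covers equal or deeper objects; '*' matches any name."""
    g_path, g_action = split_privilege(granted)
    r_path, r_action = split_privilege(requested)
    if len(g_path) > len(r_path):
        return False
    for (g_key, g_val), (r_key, r_val) in zip(g_path, r_path):
        if g_key != r_key or g_val not in ("*", r_val):
            return False
    return g_action in ("all", "*", r_action)


def authorize(policy: dict, groups: set, requested: str) -> bool:
    """Check whether any role of any of the user's groups implies the request."""
    for group in groups:
        for role in policy.get("groups", {}).get(group, []):
            for granted in policy.get("roles", {}).get(role, []):
                if implies(granted, requested):
                    return True
    return False


policy = parse_policy(PER_DATABASE_POLICY)
read_orders = "server=server1->db=db1->table=orders->action=select"
write_orders = "server=server1->db=db1->table=orders->action=insert"
print(authorize(policy, {"dep1_analyst"}, read_orders))   # analyst may read
print(authorize(policy, {"dep1_analyst"}, write_orders))  # analyst may not write
print(authorize(policy, {"dep1_admin"}, write_orders))    # admin has ALL on db1
```

Keeping each database's rules in its own file is what allows separate administrators to maintain them independently.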
Apache Ecosystem and Sentry
Inline support in Cloudera Impala
Extensibility plug-in for Apache HiveServer2
Inline support in Cloudera Search
Complementary security with HDFS ACLs
Access: Authorization in Hadoop
File ACL
• Permission at file-level granularity
• HDFS POSIX-style permissions: u/g/o
• Access Control Lists (ACL)
Admin RBAC
• HBase, Oozie, MapReduce
Data RBAC
• Permissions on tables, views, indices
• Sentry for HiveServer2, Impala, Search
App and Workflow
• Cloudera Manager, Hue
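The u/g/o model at the file level can be illustrated with a short Python sketch. It assumes the usual POSIX-style rule that exactly one class applies (owner, then group, then other) and ignores ACL entries and the superuser; the FileStatus type and the user and group names are hypothetical.

```python
from dataclasses import dataclass

READ, WRITE, EXECUTE = 4, 2, 1


@dataclass(frozen=True)
class FileStatus:
    owner: str
    group: str
    mode: int  # octal permission bits, e.g. 0o640


def is_allowed(status: FileStatus, user: str, groups: set, wanted: int) -> bool:
    """Pick the owner, group, or other bits and test the requested access."""
    if user == status.owner:
        bits = (status.mode >> 6) & 7
    elif status.group in groups:
        bits = (status.mode >> 3) & 7
    else:
        bits = status.mode & 7
    return (bits & wanted) == wanted


report = FileStatus(owner="alice", group="analysts", mode=0o640)
print(is_allowed(report, "alice", {"analysts"}, READ | WRITE))  # owner: rw-
print(is_allowed(report, "bob", {"analysts"}, READ))            # group: r--
print(is_allowed(report, "carol", {"marketing"}, READ))         # other: ---
```

The check works on whole files only, which is the granularity limit that table-level and view-level permissions address.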
Visibility: Cloudera Navigator
Audit & Access Control
• Maintain full audit history
• Ensuring appropriate permissions and reporting on data access for compliance
Discovery & Exploration
• Finding out what data is available and what it looks like
Lineage
• Tracing data back to its original source
Lifecycle Management
• Migration of data based on policies
[Diagram: Cloudera’s Enterprise Data Hub. Workloads: batch processing (MapReduce), analytic SQL (Impala), search engine (Solr), machine learning (Spark), stream processing (Spark Streaming), 3rd-party apps. Workload management: YARN. Storage for any type of data (unified, elastic, resilient, secure): filesystem (HDFS), online NoSQL (HBase). Data management: Cloudera Navigator. System management: Cloudera Manager. Security: Sentry.]
Why Navigator?
1. Lots of Data Landing in Cloudera Enterprise
• Huge quantities
• Many different sources, structured and unstructured
• Varying levels of sensitivity
2. Many Users Working with the Data
• Administrators and compliance officers
• Analysts and data scientists
• Business users
3. Need to Effectively Control and Consume Data
• Get visibility and control over the environment
• Discover and explore data
Leading Investment to Address the Challenges
• Authentication: First Hadoop distribution to offer strong authentication throughout
• Encryption: First Hadoop distribution to support encryption on the wire
• Audit: Only Hadoop distribution to support audit histories for all data objects and access paths; single point for log capture and audit
• Authorization: Founded the Apache Sentry project along with Oracle and Lab41 to manage fine-grained permissions
• Automation: Cloudera Manager automates security configurations and LDAP/AD integration
Cloudera 5: Enabling the Enterprise Data Hub
[Figure: checklist of enterprise data hub requirements, marked ✔ or ✖: Open Source, Scalable, Flexible, Cost-Effective; Managed; Open Architecture; Secure and Governed]
Hadoop and Data Access Security

More Related Content

What's hot

NF102: Nutanix AHV Basics
NF102: Nutanix AHV BasicsNF102: Nutanix AHV Basics
NF102: Nutanix AHV BasicsNEXTtour
 
Hashicorp Vault - OPEN Public Sector
Hashicorp Vault - OPEN Public SectorHashicorp Vault - OPEN Public Sector
Hashicorp Vault - OPEN Public SectorKangaroot
 
Azure Security Overview
Azure Security OverviewAzure Security Overview
Azure Security OverviewAllen Brokken
 
Hashicorp Vault Open Source vs Enterprise
Hashicorp Vault Open Source vs EnterpriseHashicorp Vault Open Source vs Enterprise
Hashicorp Vault Open Source vs EnterpriseStenio Ferreira
 
Best Practices of Infrastructure as Code with Terraform
Best Practices of Infrastructure as Code with TerraformBest Practices of Infrastructure as Code with Terraform
Best Practices of Infrastructure as Code with TerraformDevOps.com
 
Hadoop Security Architecture
Hadoop Security ArchitectureHadoop Security Architecture
Hadoop Security ArchitectureOwen O'Malley
 
Docker Hub: Past, Present and Future by Ken Cochrane & BC Wong
Docker Hub: Past, Present and Future by Ken Cochrane & BC WongDocker Hub: Past, Present and Future by Ken Cochrane & BC Wong
Docker Hub: Past, Present and Future by Ken Cochrane & BC WongDocker, Inc.
 
TriHUG October: Apache Ranger
TriHUG October: Apache RangerTriHUG October: Apache Ranger
TriHUG October: Apache Rangertrihug
 
Hadoop Security Today & Tomorrow with Apache Knox
Hadoop Security Today & Tomorrow with Apache KnoxHadoop Security Today & Tomorrow with Apache Knox
Hadoop Security Today & Tomorrow with Apache KnoxVinay Shukla
 
Infrastructure-as-Code (IaC) using Terraform
Infrastructure-as-Code (IaC) using TerraformInfrastructure-as-Code (IaC) using Terraform
Infrastructure-as-Code (IaC) using TerraformAdin Ermie
 
Secure your applications with Azure AD and Key Vault
Secure your applications with Azure AD and Key VaultSecure your applications with Azure AD and Key Vault
Secure your applications with Azure AD and Key VaultDavide Benvegnù
 
Policy as Code: IT Governance With HashiCorp Sentinel
Policy as Code: IT Governance With HashiCorp SentinelPolicy as Code: IT Governance With HashiCorp Sentinel
Policy as Code: IT Governance With HashiCorp SentinelMitchell Pronschinske
 
HashiCorp Brand Guide
HashiCorp Brand GuideHashiCorp Brand Guide
HashiCorp Brand GuideHashiCorp
 
Azure Security Fundamentals
Azure Security FundamentalsAzure Security Fundamentals
Azure Security FundamentalsLorenzo Barbieri
 
Microsoft Azure Active Directory
Microsoft Azure Active DirectoryMicrosoft Azure Active Directory
Microsoft Azure Active DirectoryDavid J Rosenthal
 
Cncf checkov and bridgecrew
Cncf checkov and bridgecrewCncf checkov and bridgecrew
Cncf checkov and bridgecrewLibbySchulze
 

What's hot (20)

Terraform on Azure
Terraform on AzureTerraform on Azure
Terraform on Azure
 
NF102: Nutanix AHV Basics
NF102: Nutanix AHV BasicsNF102: Nutanix AHV Basics
NF102: Nutanix AHV Basics
 
Hashicorp Vault - OPEN Public Sector
Hashicorp Vault - OPEN Public SectorHashicorp Vault - OPEN Public Sector
Hashicorp Vault - OPEN Public Sector
 
Azure Security Overview
Azure Security OverviewAzure Security Overview
Azure Security Overview
 
Hashicorp Vault Open Source vs Enterprise
Hashicorp Vault Open Source vs EnterpriseHashicorp Vault Open Source vs Enterprise
Hashicorp Vault Open Source vs Enterprise
 
Best Practices of Infrastructure as Code with Terraform
Best Practices of Infrastructure as Code with TerraformBest Practices of Infrastructure as Code with Terraform
Best Practices of Infrastructure as Code with Terraform
 
Hadoop Security Architecture
Hadoop Security ArchitectureHadoop Security Architecture
Hadoop Security Architecture
 
Hadoop Security
Hadoop SecurityHadoop Security
Hadoop Security
 
Docker Hub: Past, Present and Future by Ken Cochrane & BC Wong
Docker Hub: Past, Present and Future by Ken Cochrane & BC WongDocker Hub: Past, Present and Future by Ken Cochrane & BC Wong
Docker Hub: Past, Present and Future by Ken Cochrane & BC Wong
 
Adopting HashiCorp Vault
Adopting HashiCorp VaultAdopting HashiCorp Vault
Adopting HashiCorp Vault
 
TriHUG October: Apache Ranger
TriHUG October: Apache RangerTriHUG October: Apache Ranger
TriHUG October: Apache Ranger
 
Hadoop Security Today & Tomorrow with Apache Knox
Hadoop Security Today & Tomorrow with Apache KnoxHadoop Security Today & Tomorrow with Apache Knox
Hadoop Security Today & Tomorrow with Apache Knox
 
Infrastructure-as-Code (IaC) using Terraform
Infrastructure-as-Code (IaC) using TerraformInfrastructure-as-Code (IaC) using Terraform
Infrastructure-as-Code (IaC) using Terraform
 
Secure your applications with Azure AD and Key Vault
Secure your applications with Azure AD and Key VaultSecure your applications with Azure AD and Key Vault
Secure your applications with Azure AD and Key Vault
 
Policy as Code: IT Governance With HashiCorp Sentinel
Policy as Code: IT Governance With HashiCorp SentinelPolicy as Code: IT Governance With HashiCorp Sentinel
Policy as Code: IT Governance With HashiCorp Sentinel
 
HashiCorp Brand Guide
HashiCorp Brand GuideHashiCorp Brand Guide
HashiCorp Brand Guide
 
Azure Security Fundamentals
Azure Security FundamentalsAzure Security Fundamentals
Azure Security Fundamentals
 
Microsoft Azure Active Directory
Microsoft Azure Active DirectoryMicrosoft Azure Active Directory
Microsoft Azure Active Directory
 
Cncf checkov and bridgecrew
Cncf checkov and bridgecrewCncf checkov and bridgecrew
Cncf checkov and bridgecrew
 
Infrastructure as Code
Infrastructure as CodeInfrastructure as Code
Infrastructure as Code
 

Viewers also liked

Simplify and Secure your Hadoop Environment with Hortonworks and Centrify
Simplify and Secure your Hadoop Environment with Hortonworks and CentrifySimplify and Secure your Hadoop Environment with Hortonworks and Centrify
Simplify and Secure your Hadoop Environment with Hortonworks and CentrifyHortonworks
 
Performance Update: When Apache ORC Met Apache Spark
Performance Update: When Apache ORC Met Apache SparkPerformance Update: When Apache ORC Met Apache Spark
Performance Update: When Apache ORC Met Apache SparkDataWorks Summit
 
Apache Knox Gateway "Single Sign On" expands the reach of the Enterprise Users
Apache Knox Gateway "Single Sign On" expands the reach of the Enterprise UsersApache Knox Gateway "Single Sign On" expands the reach of the Enterprise Users
Apache Knox Gateway "Single Sign On" expands the reach of the Enterprise UsersDataWorks Summit
 
Big Data and Security - Where are we now? (2015)
Big Data and Security - Where are we now? (2015)Big Data and Security - Where are we now? (2015)
Big Data and Security - Where are we now? (2015)Peter Wood
 
Securing Hadoop's REST APIs with Apache Knox Gateway Hadoop Summit June 6th, ...
Securing Hadoop's REST APIs with Apache Knox Gateway Hadoop Summit June 6th, ...Securing Hadoop's REST APIs with Apache Knox Gateway Hadoop Summit June 6th, ...
Securing Hadoop's REST APIs with Apache Knox Gateway Hadoop Summit June 6th, ...Kevin Minder
 
Built-In Security for the Cloud
Built-In Security for the CloudBuilt-In Security for the Cloud
Built-In Security for the CloudDataWorks Summit
 
Information security in big data -privacy and data mining
Information security in big data -privacy and data miningInformation security in big data -privacy and data mining
Information security in big data -privacy and data miningharithavijay94
 
Improvements in Hadoop Security
Improvements in Hadoop SecurityImprovements in Hadoop Security
Improvements in Hadoop SecurityDataWorks Summit
 
Apache Knox setup and hive and hdfs Access using KNOX
Apache Knox setup and hive and hdfs Access using KNOXApache Knox setup and hive and hdfs Access using KNOX
Apache Knox setup and hive and hdfs Access using KNOXAbhishek Mallick
 
Hadoop & Security - Past, Present, Future
Hadoop & Security - Past, Present, FutureHadoop & Security - Past, Present, Future
Hadoop & Security - Past, Present, FutureUwe Printz
 
Hdp security overview
Hdp security overview Hdp security overview
Hdp security overview Hortonworks
 
Treat your enterprise data lake indigestion: Enterprise ready security and go...
Treat your enterprise data lake indigestion: Enterprise ready security and go...Treat your enterprise data lake indigestion: Enterprise ready security and go...
Treat your enterprise data lake indigestion: Enterprise ready security and go...DataWorks Summit
 
Hadoop Internals (2.3.0 or later)
Hadoop Internals (2.3.0 or later)Hadoop Internals (2.3.0 or later)
Hadoop Internals (2.3.0 or later)Emilio Coppa
 
OAuth - Open API Authentication
OAuth - Open API AuthenticationOAuth - Open API Authentication
OAuth - Open API Authenticationleahculver
 
HADOOP TECHNOLOGY ppt
HADOOP  TECHNOLOGY pptHADOOP  TECHNOLOGY ppt
HADOOP TECHNOLOGY pptsravya raju
 
Cours Big Data Chap1
Cours Big Data Chap1Cours Big Data Chap1
Cours Big Data Chap1Amal Abid
 
Hadoop Overview & Architecture
Hadoop Overview & Architecture  Hadoop Overview & Architecture
Hadoop Overview & Architecture EMC
 
Hadoop et son écosystème
Hadoop et son écosystèmeHadoop et son écosystème
Hadoop et son écosystèmeKhanh Maudoux
 

Viewers also liked (20)

Simplify and Secure your Hadoop Environment with Hortonworks and Centrify
Simplify and Secure your Hadoop Environment with Hortonworks and CentrifySimplify and Secure your Hadoop Environment with Hortonworks and Centrify
Simplify and Secure your Hadoop Environment with Hortonworks and Centrify
 
Performance Update: When Apache ORC Met Apache Spark
Performance Update: When Apache ORC Met Apache SparkPerformance Update: When Apache ORC Met Apache Spark
Performance Update: When Apache ORC Met Apache Spark
 
Apache Knox Gateway "Single Sign On" expands the reach of the Enterprise Users
Apache Knox Gateway "Single Sign On" expands the reach of the Enterprise UsersApache Knox Gateway "Single Sign On" expands the reach of the Enterprise Users
Apache Knox Gateway "Single Sign On" expands the reach of the Enterprise Users
 
Big Data and Security - Where are we now? (2015)
Big Data and Security - Where are we now? (2015)Big Data and Security - Where are we now? (2015)
Big Data and Security - Where are we now? (2015)
 
Securing Hadoop's REST APIs with Apache Knox Gateway Hadoop Summit June 6th, ...
Securing Hadoop's REST APIs with Apache Knox Gateway Hadoop Summit June 6th, ...Securing Hadoop's REST APIs with Apache Knox Gateway Hadoop Summit June 6th, ...
Securing Hadoop's REST APIs with Apache Knox Gateway Hadoop Summit June 6th, ...
 
Built-In Security for the Cloud
Built-In Security for the CloudBuilt-In Security for the Cloud
Built-In Security for the Cloud
 
Information security in big data -privacy and data mining
Information security in big data -privacy and data miningInformation security in big data -privacy and data mining
Information security in big data -privacy and data mining
 
Improvements in Hadoop Security
Improvements in Hadoop SecurityImprovements in Hadoop Security
Improvements in Hadoop Security
 
Hadoop
HadoopHadoop
Hadoop
 
An Approach for Multi-Tenancy Through Apache Knox
An Approach for Multi-Tenancy Through Apache KnoxAn Approach for Multi-Tenancy Through Apache Knox
An Approach for Multi-Tenancy Through Apache Knox
 
Apache Knox setup and hive and hdfs Access using KNOX
Apache Knox setup and hive and hdfs Access using KNOXApache Knox setup and hive and hdfs Access using KNOX
Apache Knox setup and hive and hdfs Access using KNOX
 
Hadoop & Security - Past, Present, Future
Hadoop & Security - Past, Present, FutureHadoop & Security - Past, Present, Future
Hadoop & Security - Past, Present, Future
 
Hdp security overview
Hdp security overview Hdp security overview
Hdp security overview
 
Treat your enterprise data lake indigestion: Enterprise ready security and go...
Treat your enterprise data lake indigestion: Enterprise ready security and go...Treat your enterprise data lake indigestion: Enterprise ready security and go...
Treat your enterprise data lake indigestion: Enterprise ready security and go...
 
Hadoop Internals (2.3.0 or later)
Hadoop Internals (2.3.0 or later)Hadoop Internals (2.3.0 or later)
Hadoop Internals (2.3.0 or later)
 
OAuth - Open API Authentication
OAuth - Open API AuthenticationOAuth - Open API Authentication
OAuth - Open API Authentication
 
HADOOP TECHNOLOGY ppt
HADOOP  TECHNOLOGY pptHADOOP  TECHNOLOGY ppt
HADOOP TECHNOLOGY ppt
 
Cours Big Data Chap1
Cours Big Data Chap1Cours Big Data Chap1
Cours Big Data Chap1
 
Hadoop Overview & Architecture
Hadoop Overview & Architecture  Hadoop Overview & Architecture
Hadoop Overview & Architecture
 
Hadoop et son écosystème
Hadoop et son écosystèmeHadoop et son écosystème
Hadoop et son écosystème
 

Similar to Hadoop and Data Access Security

Comprehensive Security for the Enterprise II: Guarding the Perimeter and Cont...
Comprehensive Security for the Enterprise II: Guarding the Perimeter and Cont...Comprehensive Security for the Enterprise II: Guarding the Perimeter and Cont...
Comprehensive Security for the Enterprise II: Guarding the Perimeter and Cont...Cloudera, Inc.
 
Hadoop security @ Philly Hadoop Meetup May 2015
Hadoop security @ Philly Hadoop Meetup May 2015Hadoop security @ Philly Hadoop Meetup May 2015
Hadoop security @ Philly Hadoop Meetup May 2015Shravan (Sean) Pabba
 
Fighting cyber fraud with hadoop
Fighting cyber fraud with hadoopFighting cyber fraud with hadoop
Fighting cyber fraud with hadoopNiel Dunnage
 
大数据数据治理及数据安全
大数据数据治理及数据安全大数据数据治理及数据安全
大数据数据治理及数据安全Jianwei Li
 
Securing the Hadoop Ecosystem
Securing the Hadoop EcosystemSecuring the Hadoop Ecosystem
Securing the Hadoop EcosystemDataWorks Summit
 
Combat Cyber Threats with Cloudera Impala & Apache Hadoop
Combat Cyber Threats with Cloudera Impala & Apache HadoopCombat Cyber Threats with Cloudera Impala & Apache Hadoop
Combat Cyber Threats with Cloudera Impala & Apache HadoopCloudera, Inc.
 
Cloudera GoDataFest Security and Governance
Cloudera GoDataFest Security and GovernanceCloudera GoDataFest Security and Governance
Cloudera GoDataFest Security and GovernanceGoDataDriven
 
Hadoop Operations: How to Secure and Control Cluster Access
Hadoop Operations: How to Secure and Control Cluster AccessHadoop Operations: How to Secure and Control Cluster Access
Hadoop Operations: How to Secure and Control Cluster AccessCloudera, Inc.
 
The Future of Data Management - the Enterprise Data Hub
The Future of Data Management - the Enterprise Data HubThe Future of Data Management - the Enterprise Data Hub
The Future of Data Management - the Enterprise Data HubDataWorks Summit
 
The Future of Hadoop Security - Hadoop Summit 2014
The Future of Hadoop Security - Hadoop Summit 2014The Future of Hadoop Security - Hadoop Summit 2014
The Future of Hadoop Security - Hadoop Summit 2014Cloudera, Inc.
 
Project Rhino: Enhancing Data Protection for Hadoop
Project Rhino: Enhancing Data Protection for HadoopProject Rhino: Enhancing Data Protection for Hadoop
Project Rhino: Enhancing Data Protection for HadoopCloudera, Inc.
 
2014 sept 4_hadoop_security
2014 sept 4_hadoop_security2014 sept 4_hadoop_security
2014 sept 4_hadoop_securityAdam Muise
 
Securing Hadoop in an Enterprise Context
Securing Hadoop in an Enterprise ContextSecuring Hadoop in an Enterprise Context
Securing Hadoop in an Enterprise ContextHellmar Becker
 
Bringing Trus and Visibility to Apache Hadoop
Bringing Trus and Visibility to Apache HadoopBringing Trus and Visibility to Apache Hadoop
Bringing Trus and Visibility to Apache HadoopDataWorks Summit
 
Big Data Warehousing Meetup: Securing the Hadoop Ecosystem by Cloudera
Big Data Warehousing Meetup: Securing the Hadoop Ecosystem by ClouderaBig Data Warehousing Meetup: Securing the Hadoop Ecosystem by Cloudera
Big Data Warehousing Meetup: Securing the Hadoop Ecosystem by ClouderaCaserta
 
Intel boubker el mouttahid
Intel boubker el mouttahidIntel boubker el mouttahid
Intel boubker el mouttahidBigDataExpo
 
BigData Security - A Point of View
BigData Security - A Point of ViewBigData Security - A Point of View
BigData Security - A Point of ViewKaran Alang
 
Improvements in Hadoop Security
Improvements in Hadoop SecurityImprovements in Hadoop Security
Improvements in Hadoop SecurityChris Nauroth
 
大数据数据安全
大数据数据安全大数据数据安全
大数据数据安全Jianwei Li
 
Hive contributors meetup apache sentry
Hive contributors meetup   apache sentryHive contributors meetup   apache sentry
Hive contributors meetup apache sentryBrock Noland
 

Similar to Hadoop and Data Access Security (20)

Comprehensive Security for the Enterprise II: Guarding the Perimeter and Cont...
Comprehensive Security for the Enterprise II: Guarding the Perimeter and Cont...Comprehensive Security for the Enterprise II: Guarding the Perimeter and Cont...
Comprehensive Security for the Enterprise II: Guarding the Perimeter and Cont...
 
Hadoop security @ Philly Hadoop Meetup May 2015
Hadoop security @ Philly Hadoop Meetup May 2015Hadoop security @ Philly Hadoop Meetup May 2015
Hadoop security @ Philly Hadoop Meetup May 2015
 
Fighting cyber fraud with hadoop
Fighting cyber fraud with hadoopFighting cyber fraud with hadoop
Fighting cyber fraud with hadoop
 
大数据数据治理及数据安全
大数据数据治理及数据安全大数据数据治理及数据安全
大数据数据治理及数据安全
 
Securing the Hadoop Ecosystem
Securing the Hadoop EcosystemSecuring the Hadoop Ecosystem
Securing the Hadoop Ecosystem
 
Combat Cyber Threats with Cloudera Impala & Apache Hadoop
Combat Cyber Threats with Cloudera Impala & Apache HadoopCombat Cyber Threats with Cloudera Impala & Apache Hadoop
Combat Cyber Threats with Cloudera Impala & Apache Hadoop
 
Cloudera GoDataFest Security and Governance
Cloudera GoDataFest Security and GovernanceCloudera GoDataFest Security and Governance
Cloudera GoDataFest Security and Governance
 
Hadoop Operations: How to Secure and Control Cluster Access
Hadoop Operations: How to Secure and Control Cluster AccessHadoop Operations: How to Secure and Control Cluster Access
Hadoop Operations: How to Secure and Control Cluster Access
 
The Future of Data Management - the Enterprise Data Hub
The Future of Data Management - the Enterprise Data HubThe Future of Data Management - the Enterprise Data Hub
The Future of Data Management - the Enterprise Data Hub
 
The Future of Hadoop Security - Hadoop Summit 2014
The Future of Hadoop Security - Hadoop Summit 2014The Future of Hadoop Security - Hadoop Summit 2014
The Future of Hadoop Security - Hadoop Summit 2014
 
Project Rhino: Enhancing Data Protection for Hadoop
Project Rhino: Enhancing Data Protection for HadoopProject Rhino: Enhancing Data Protection for Hadoop
Project Rhino: Enhancing Data Protection for Hadoop
 
2014 sept 4_hadoop_security
2014 sept 4_hadoop_security2014 sept 4_hadoop_security
2014 sept 4_hadoop_security
 
Securing Hadoop in an Enterprise Context
Securing Hadoop in an Enterprise ContextSecuring Hadoop in an Enterprise Context
Securing Hadoop in an Enterprise Context
 
Bringing Trus and Visibility to Apache Hadoop
Bringing Trus and Visibility to Apache HadoopBringing Trus and Visibility to Apache Hadoop
Bringing Trus and Visibility to Apache Hadoop
 
Big Data Warehousing Meetup: Securing the Hadoop Ecosystem by Cloudera
Big Data Warehousing Meetup: Securing the Hadoop Ecosystem by ClouderaBig Data Warehousing Meetup: Securing the Hadoop Ecosystem by Cloudera
Big Data Warehousing Meetup: Securing the Hadoop Ecosystem by Cloudera
 
Intel boubker el mouttahid
Intel boubker el mouttahidIntel boubker el mouttahid
Intel boubker el mouttahid
 
BigData Security - A Point of View
BigData Security - A Point of ViewBigData Security - A Point of View
BigData Security - A Point of View
 
Improvements in Hadoop Security
Improvements in Hadoop SecurityImprovements in Hadoop Security
Improvements in Hadoop Security
 
大数据数据安全
大数据数据安全大数据数据安全
大数据数据安全
 
Hive contributors meetup apache sentry
Hive contributors meetup   apache sentryHive contributors meetup   apache sentry
Hive contributors meetup apache sentry
 

Hadoop and Data Access Security

  • 2. Hadoop Security and Compliance Challenges
      • History
          • Security was not a priority for early Hadoop adopters like Yahoo! and Facebook; it is now
      • Data concentration
          • The quantity and diversity of data create compliance challenges
      • Flexibility of the Hadoop architecture
          • Many paths for data in, out, and processing
          • Access to data at different granularities, from fields to files
          • ELT: sensitive data “discovery” occurs after data arrives
  • 3. Cloudera has led in investments in security
      • Authentication
          • First Hadoop distribution to offer strong authentication throughout
      • Encryption
          • First Hadoop distribution to support encryption on the wire
      • Audit
          • Only Hadoop distribution to support audit histories for all data objects and access paths
          • Single point for log capture and audit
      • Authorization
          • Founded the Apache Sentry project along with Oracle and Lab41 to manage fine-grained permissions
      • Automation
          • Cloudera Manager automates security configurations and LDAP/AD integration
  • 4. Case Study: Finance and Banking
      • Fraud and Purchasing Behavior Analysis
      • Identify patterns in financially sensitive, PCI, and PII data
      • Before: Unable to build applications on Hadoop; forced to use other systems, to greatly limit Hadoop access, or to forgo analysis due to privacy concerns
      • Now: Provide broad analysis capabilities with Impala to a large population, secured by Sentry
  • 5. Enterprise Security in Hadoop overview
      • Four Functional Areas: Perimeter, Data, Access, Visibility
      • Diagram labels: Hadoop Cluster; Users, Applications, Operators
  • 6. Defining the Functional Areas
      • Perimeter: Guarding access to the cluster itself
          • Technical concepts: authentication, network isolation
      • Data: Protecting data in the cluster from unauthorized visibility
          • Technical concepts: encryption, tokenization, data masking
      • Access: Defining what users and applications can do with data
          • Technical concepts: permissions, authorization
      • Visibility: Reporting on where data came from and how it’s being used
          • Technical concepts: auditing, lineage
  • 7. Enabling Enterprise Security
      • Perimeter: Guarding access to the cluster itself
          • Technical concepts: authentication, network isolation
          • Enabled by: Kerberos | AD/LDAP
      • Data: Protecting data in the cluster from unauthorized visibility
          • Technical concepts: encryption, tokenization, data masking
          • Enabled by: Native | Certified Partners
      • Access: Defining what users and applications can do with data
          • Technical concepts: permissions, authorization
          • Enabled by: Sentry
      • Visibility: Reporting on where data came from and how it’s being used
          • Technical concepts: auditing, lineage
          • Enabled by: Cloudera Navigator
  • 9. Perimeter: Authentication in Hadoop
      • Kerberos
          • Provably strong authentication between all Hadoop services and (optionally) to end-points
          • Cloudera Manager hides complexity
      • LDAP/AD
          • Username / password
          • Option for Hue, Hive Metastore, Impala connectors, and Cloudera Manager admin logins
      • SAML
          • Single Sign-On (SSO) for the listed options
          • Kerberos clients no longer required on most user end-points
  • 10. Authentication Options and Coverage
      • Diagram components:
          • HDFS (DN, NN), YARN (RM, AM), Impala (ID, SS), MapReduce (JT, TT), other services (Oozie, Search, etc.)
          • 3rd-party gateway, clients, and applications (Pig, Hive, Hue, etc.)
      • Coverage levels: “Core” Kerberos, “End-to-End” Kerberos, “Edge” AD/LDAP/SAML
  • 11. IT Integration: Kerberos
      • Users don’t want Yet Another Credential
      • Corporate IT doesn’t want to provision and maintain thousands of service principals and keytabs
      • Solution: local KDC + one-way trust
          • Run an MIT Kerberos KDC in the cluster
          • Put all service principals there
          • Set up a one-way trust so that the local KDC trusts the central corporate realm
          • Normal user credentials can then be used to access Hadoop
      • Recommended: use Cloudera Manager
          • To properly tune inter-related configuration knobs
          • To manage principal/keytab creation and distribution
          • To preserve service monitoring with Kerberos security enabled
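  • 12. IT Integration: Kerberos + LDAP
      • Hadoop cluster: local KDC (MIT Kerberos) with service principals
          • hdfs/host1@HADOOP.EXAMPLE.COM
          • yarn/host2@HADOOP.EXAMPLE.COM
          • …
      • Central Active Directory with user principals
          • user@EXAMPLE.COM
          • …
      • Cross-realm trust between the local KDC and the central Active Directory
      • LDAP group mapping for the NN and JT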
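The split between service principals in the cluster realm and user principals in the corporate realm can be illustrated with a small sketch. It only parses principal names of the form primary[/instance]@REALM and labels them by realm; the realm names come from the slide, while `Principal`, `parse_principal`, and `classify` are hypothetical helpers invented for this illustration, not part of Hadoop, Kerberos, or Cloudera Manager.

```python
from typing import NamedTuple, Optional

# Realm names taken from the slide. Which realm plays which role is the
# only configuration this sketch assumes.
CLUSTER_REALM = "HADOOP.EXAMPLE.COM"   # local MIT KDC: service principals
CORPORATE_REALM = "EXAMPLE.COM"        # central Active Directory: users


class Principal(NamedTuple):
    primary: str             # service or user name, e.g. "hdfs" or "user"
    instance: Optional[str]  # host part of a service principal, else None
    realm: str


def parse_principal(text: str) -> Principal:
    """Split 'primary[/instance]@REALM' into its parts."""
    name, sep, realm = text.rpartition("@")
    if not sep or not name or not realm:
        raise ValueError(f"not a Kerberos principal: {text!r}")
    primary, _, instance = name.partition("/")
    return Principal(primary, instance or None, realm)


def classify(principal: Principal) -> str:
    """Hypothetical helper: label a principal by the realm it belongs to."""
    if principal.realm == CLUSTER_REALM and principal.instance:
        return "cluster service"
    if principal.realm == CORPORATE_REALM and principal.instance is None:
        return "corporate user"
    return "unknown"
```

For example, `classify(parse_principal("hdfs/host1@HADOOP.EXAMPLE.COM"))` yields the cluster-service label, while `user@EXAMPLE.COM` is labeled as a corporate user. The actual trust relationship is configured in the KDCs themselves and is not modeled here.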
  • 13. Network Access Management
      • Use Hue as a front end to both Hadoop and Oozie to control access through a web browser
      • HTTP proxy servers:
          • Oozie: MR jobs, Pig jobs, Hive jobs
          • HttpFS: hadoop fs operations are front-ended over HTTP
          • HBase REST server: HBase reads
      • Secure configuration: Oozie, Hue, and HttpFS front ends are co-located to act as a network bridge
      • Hue supports AD/LDAP-based authentication instead of Kerberos for client simplicity
  • 15. Data: Protection in Hadoop
      • Data in Motion: “Network Encryption”
          • SASL: network RPC
          • SSL: MapReduce shuffle
          • SSL: web-based user and administration tools
          • SSL: JDBC
          • HDFS data transfer protocol
      • Data at Rest: “Data Encryption”
          • Certified partner solutions
          • Field-level encryption
          • Data masking or tokenization
          • OS-level file system encryption
  • 17. Prior State of Authorization: Two Sub-Optimal Choices for SQL on Hadoop
      • Insecure advisory authorization
          • Users could grant themselves permissions
          • Intended to prevent accidental deletion of data
          • Problem: did not guard against malicious users
          • Problem: only worked with Hive
      • HDFS impersonation
          • Data was only protected at the file level by HDFS permissions
          • Problem: file-level protection is not granular enough
          • Problem: lacked flexibility; not role-based
  • 18. Sentry: Key Capabilities
      • Fine-Grained Authorization
          • Specify security for SERVERS, DATABASES, TABLES, VIEWS, and search indices
      • Role-Based Authorization
          • SELECT privilege on views and tables
          • INSERT privilege on tables
          • TRANSFORM privilege on servers
          • ALL privilege on the server, databases, tables, and views
          • ALL privilege is needed to create or modify schema
      • Multitenant Administration
          • Separate policies for each database/schema
          • Can be maintained by separate admins
  • 19. Sentry Architecture
      • Binding layer: Impala, Hive (HiveServer2), Search
      • Authorization provider
          • Policy engine: evaluation, validation
          • Policy provider: parsing; backed by a file (local FS/HDFS) or a database
  • 20. Query Execution Flow
      • Stages: Parse, Build, Check, Plan (diagram labels: SQL, Query, MR)
      • Parse: validate SQL grammar
      • Build: construct statement tree
      • Check (Sentry): validate statement objects
          • First check: authorization
      • Plan: forward to execution planner
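The ordering of the stages on this slide (parse, build, check, plan) can be sketched as a toy pipeline in which the authorization check runs before anything reaches the planner. Everything here is an assumption made for illustration: the regular expression accepts only a trivial SELECT ... FROM db.table form, and `Statement`, `run_query`, and the `is_authorized` callback are hypothetical names, not Sentry, Hive, or Impala APIs.

```python
import re
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Statement:
    action: str
    database: str
    table: str


class AuthorizationError(Exception):
    """Raised when the check stage rejects a statement."""


# Toy grammar: only "SELECT <anything> FROM <db>.<table>" is accepted.
_SELECT = re.compile(
    r"^\s*select\s+.+?\s+from\s+(\w+)\.(\w+)\s*;?\s*$", re.IGNORECASE
)


def parse(sql: str) -> "re.Match[str]":
    """Parse stage: validate the SQL grammar."""
    match = _SELECT.match(sql)
    if match is None:
        raise ValueError(f"unsupported or malformed SQL: {sql!r}")
    return match


def build(match: "re.Match[str]") -> Statement:
    """Build stage: construct the statement object."""
    return Statement("select", match.group(1), match.group(2))


def check(statement: Statement, user: str,
          is_authorized: Callable[[str, Statement], bool]) -> None:
    """Check stage: authorization is validated before any planning."""
    if not is_authorized(user, statement):
        raise AuthorizationError(f"{user} may not {statement.action} "
                                 f"{statement.database}.{statement.table}")


def plan(statement: Statement) -> str:
    """Plan stage: stand-in for handing off to the execution planner."""
    return f"scan {statement.database}.{statement.table}"


def run_query(sql: str, user: str,
              is_authorized: Callable[[str, Statement], bool]) -> str:
    statement = build(parse(sql))
    check(statement, user, is_authorized)
    return plan(statement)
```

Because `check` raises before `plan` is called, an unauthorized statement never produces a plan, which is the property the slide's ordering is meant to convey.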
  • 21. Multitenant Security
      • Global policy file
          [groups]
          admin_group = admin_role
          dep1_admin = uri_role
          [roles]
          admin_role = server=server1
          uri_role = hdfs:///ha-nn-uri/data
          [databases]
          db1 = hdfs://ha-nn-uri/user/hive/sentry/db1.ini
      • Per-database policy file
          [groups]
          dep1_admin = db1_admin_role
          dep1_analyst = db1_read_role
          [roles]
          db1_admin_role = server=server1->db=db1
          db1_read_role = server=server1->db=db1->table=*->action=select
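To make the group-to-role-to-privilege chain in the per-database policy concrete, here is a minimal sketch that loads that policy text and answers "may this group do that?". It is a simplified model written for this illustration, not Sentry's actual evaluation logic: the rule that a shorter granted privilege covers everything beneath it, the `*` wildcard handling, and the function names `load_policy`, `implies`, and `is_allowed` are all assumptions.

```python
import configparser

# Per-database policy text copied from the slide.
DB1_POLICY = """
[groups]
dep1_admin = db1_admin_role
dep1_analyst = db1_read_role

[roles]
db1_admin_role = server=server1->db=db1
db1_read_role = server=server1->db=db1->table=*->action=select
"""


def load_policy(text):
    """Return (group -> role names, role -> privilege strings)."""
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep group and role names case-sensitive
    parser.read_string(text)

    def split(value):
        return [item.strip() for item in value.split(",")]

    groups = {name: split(value) for name, value in parser["groups"].items()}
    roles = {name: split(value) for name, value in parser["roles"].items()}
    return groups, roles


def parse_privilege(text):
    """'server=s->db=d' -> [('server', 's'), ('db', 'd')]"""
    parts = []
    for part in text.split("->"):
        key, _, value = part.partition("=")
        parts.append((key.strip().lower(), value.strip()))
    return parts


def implies(granted, requested):
    """Simplified rule: a grant covers a request if it is a matching prefix."""
    g, r = parse_privilege(granted), parse_privilege(requested)
    if len(g) > len(r):
        return False
    for (g_key, g_value), (r_key, r_value) in zip(g, r):
        if g_key != r_key:
            return False
        is_all = g_key == "action" and g_value.lower() == "all"
        if g_value not in ("*", r_value) and not is_all:
            return False
    return True


def is_allowed(group, requested, groups, roles):
    """True if any privilege of any role of the group covers the request."""
    return any(
        implies(privilege, requested)
        for role in groups.get(group, [])
        for privilege in roles.get(role, [])
    )
```

Under this model, `dep1_analyst` can select from any table in `db1` but cannot insert, while `dep1_admin` holds a database-level grant that covers every table and action in `db1` and nothing in other databases.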
  • 22. Apache Ecosystem and Sentry
      • Inline support in Cloudera Impala
      • Extensibility plug-in for Apache HiveServer2
      • Inline support in Cloudera Search
      • Complementary security with HDFS ACLs
  • 23. Access: Authorization in Hadoop
      • File ACL
          • Permission at file-level granularity
          • HDFS POSIX-style permissions: u/g/o
          • Access Control Lists (ACL): HBase, Oozie, MapReduce
      • Data RBAC
          • Permissions on tables, views, indices
          • Sentry for HiveServer2, Impala, Search
      • Admin RBAC
          • App and workflow
          • Cloudera Manager, Hue
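The POSIX-style u/g/o model mentioned for HDFS can be sketched in a few lines: exactly one permission class (owner, group, or other) is selected for the caller, and only that class's bits are tested. This is a simplified illustration that ignores the HDFS superuser and extended ACL entries; `can_access` and its parameters are hypothetical names for this sketch.

```python
# Permission bits within one class (owner, group, or other).
_BITS = {"r": 4, "w": 2, "x": 1}


def can_access(mode, owner, group, user, user_groups, want):
    """Check one of 'r', 'w', 'x' against an octal mode such as 0o640.

    The owner class is tested if the user is the owner, otherwise the
    group class if the user belongs to the file's group, otherwise the
    other class. Only the selected class is consulted.
    """
    if user == owner:
        shift = 6
    elif group in user_groups:
        shift = 3
    else:
        shift = 0
    return bool((mode >> shift) & _BITS[want])
```

With mode `0o640`, the owner can read and write, members of the file's group can only read, and everyone else is denied. This file-level granularity is what the earlier slides describe as too coarse for SQL workloads, which is the gap role-based authorization on tables and views is meant to fill.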
  • 25. Visibility: Cloudera Navigator
      • Audit & Access Control
          • Maintain full audit history
          • Ensure appropriate permissions and report on data access for compliance
      • Discovery & Exploration
          • Find out what data is available and what it looks like
      • Lineage
          • Trace data back to its original source
      • Lifecycle Management
          • Migrate data based on policies
      • Diagram: Cloudera’s Enterprise Data Hub
          • 3rd-party apps
          • Batch processing (MapReduce), analytic SQL (Impala), search engine (Solr), machine learning (Spark), stream processing (Spark Streaming)
          • Workload management (YARN)
          • Storage for any type of data (unified, elastic, resilient, secure): filesystem (HDFS), online NoSQL (HBase)
          • Data management (Cloudera Navigator), system management (Cloudera Manager), Sentry
  • 26. Why Navigator?
      • 1. Lots of data landing in Cloudera Enterprise
          • Huge quantities
          • Many different sources, structured and unstructured
          • Varying levels of sensitivity
      • 2. Many users working with the data
          • Administrators and compliance officers
          • Analysts and data scientists
          • Business users
      • 3. Need to effectively control and consume data
          • Get visibility and control over the environment
          • Discover and explore data
  • 30. Leading Investment to Address the Challenges
      • Authentication: First Hadoop distribution to offer strong authentication throughout
      • Encryption: First Hadoop distribution to support encryption on the wire
      • Audit: Only Hadoop distribution to support audit histories for all data objects and access paths; single point for log capture and audit
      • Authorization: Founded the Apache Sentry project along with Oracle and Lab41 to manage fine-grained permissions
      • Automation: Cloudera Manager automates security configurations and LDAP/AD integration
  • 31. Cloudera 5: Enabling the Enterprise Data Hub
      • Open Source: Scalable, Flexible, Cost-Effective ✔
      • Managed ✖ ✔
      • Open Architecture ✖ ✔
      • Secure and Governed ✖ ✔
      • Diagram: Cloudera’s Enterprise Data Hub
          • 3rd-party apps
          • Batch processing (MapReduce), analytic SQL (Impala), search engine (Solr), machine learning (Spark), stream processing (Spark Streaming)
          • Workload management (YARN)
          • Storage for any type of data (unified, elastic, resilient, secure): filesystem (HDFS), online NoSQL (HBase)
          • Data management (Cloudera Navigator), system management (Cloudera Manager), Sentry