Open Source Security
Tools For Big Data
Rommel Garcia
@rommelgarcia
Hortonworks
# whoami
• Global Security SME Lead @hortonworks
• Senior Solutions Engineer @hortonworks
• Book Author – Virtualizing Hadoop
• Co-organizer of Atlanta Hadoop User Group
• Regular Speaker at Big Data Conferences
Big Data Landscape
DATA – More Volume and More Types
INCREASING DATA VARIETY AND COMPLEXITY
Scale: GIGABYTES, TERABYTES, PETABYTES, EXABYTES
Tiers: ERP, CRM, WEB, BIG DATA
Data types shown: user generated content, mobile web, SMS/MMS, sentiment, external demographics, HD video, speech to text, product/service logs, social network, business data feeds, user click stream, web logs, offer history, dynamic pricing, A/B testing, affiliate networks, search marketing, behavioral targeting, dynamic funnels, payment record, support contacts, customer touches, purchase detail, purchase record, segmentation, offer details
Big Data Ecosystem
Big Data Platform
DATA REPOSITORIES
Risk modeling
Fraud detection
Compliance (AML, KYC)
Bank 3.0
Information security
Single view of customer
Trading applications
Market data management
ANALYSIS & VISUALIZATION
Security
Operations
Governance & Integration
YARN : Data Operating System
Script SQL NoSQL Stream Search Others
HDFS
(Hadoop Distributed File System)
In-Mem
TRADITIONAL SOURCES
EDW
OLAP Datamarts
Column
Databases
CRM
RDBMS
LENDING MARKETS TRADES COMPLIANCE DATA
CREDIT CARD CASH & EQUITY FINANCE & GL RISK DATA
EMERGING & NON-TRADITIONAL SOURCES
SERVER LOGS CALL CENTER EMAILS
WORD
DOCUMENTS
LOCATION DATA SENSOR DATA
CUSTOMER
SENTIMENT
RESEARCH
REPORTS
Compliance Adherences
• HIPAA - Health Insurance Portability and Accountability Act of 1996
• HITECH - The Health Information Technology for Economic and Clinical Health Act
• PCI DSS - Payment Card Industry Data Security Standard
• SOX - The Sarbanes-Oxley Act of 2002
• ISO - International Organization for Standardization
• COBIT - Control Objectives for Information and Related Technology
• Corporate Security Policies
Big Data Security
5 Pillars of Security
• Authentication
• Authorization
• Audit
• Data at rest/in-motion Encryption
• Centralized Administration
Big Data Ecosystem
Same platform diagram as the earlier Big Data Ecosystem slide, with numbered callouts for the security components:
1 Knox
2 Kerberos
3 Ranger
4 HDFS Encryption
5 LDAP
5 Pillars of Security
• Authentication ->
• Authorization ->
• Audit ->
• Data Protection ->
• Centralized Administration ->
Knox
Why Knox?
Simplified Access
• Kerberos encapsulation
• Extends API reach
• Single access point
• Multi-cluster support
• Single SSL certificate
Centralized Control
• Central REST API auditing
• Service-level authorization
• Alternative to SSH “edge node”
Enterprise Integration
• LDAP integration
• Active Directory integration
• SSO integration
• Apache Shiro extensibility
• Custom extensibility
Enhanced Security
• Protect network details
• Partial SSL for non-SSL services
• WebApp vulnerability filter
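The "single access point" idea can be sketched as a small URL builder. This is a minimal sketch that assumes the commonly documented Knox URL layout `https://{host}:{port}/{gateway-path}/{topology}/webhdfs/v1/{path}`. The host name, port 8443, gateway path `gateway`, and topology name `default` are placeholder assumptions, not values taken from this deck.

```python
from urllib.parse import quote, urlencode


def knox_webhdfs_url(host, hdfs_path, op, port=8443,
                     gateway_path="gateway", topology="default"):
    """Build a WebHDFS URL that is routed through a Knox gateway.

    All defaults are illustrative assumptions; the real values come
    from the gateway's own configuration.
    """
    if not hdfs_path.startswith("/"):
        raise ValueError("hdfs_path must be absolute")
    base = f"https://{host}:{port}/{gateway_path}/{topology}/webhdfs/v1"
    return f"{base}{quote(hdfs_path)}?{urlencode({'op': op})}"


# Every client talks to the same host and port, whichever service sits behind it.
url = knox_webhdfs_url("knox.example.com", "/tmp/secure", "LISTSTATUS")
print(url)
```

Because clients only ever see the gateway address, the cluster's internal host names and ports stay hidden, which is the "protect network details" point above.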
Knox Deployment with Hadoop Cluster
Components shown: Web Tier, Application Tier, DMZ, LB, Knox, Hadoop CLIs, Switches
Racks: Rack 1 with Master Nodes (NN, SNN); Rack 2 through Rack N with Slave Nodes (DN)
What does Perimeter Security really mean?
Diagram elements: User, Firewall, Gateway (REST API), Firewall, Hadoop Services (Hive Host, HBase Host, WebHDFS)
• Firewall required at perimeter (today)
• Knox Gateway controls all Hadoop REST API access through firewall
• Hadoop cluster mostly unaffected
• Firewall only allows connections through specific ports from Knox host
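The rule that the inner firewall only admits connections from the Knox host on specific ports can be sketched as an allowlist. This is a toy model, not firewall configuration: the host names and port numbers below are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    source_host: str
    dest_port: int


class PerimeterFirewall:
    """Toy allowlist: a connection passes only if it matches a rule exactly."""

    def __init__(self, rules):
        self._rules = set(rules)

    def allows(self, source_host, dest_port):
        return Rule(source_host, dest_port) in self._rules


# Hypothetical host name and ports, for illustration only.
KNOX_HOST = "knox.example.com"
firewall = PerimeterFirewall([
    Rule(KNOX_HOST, 50070),  # assumed WebHDFS port
    Rule(KNOX_HOST, 10001),  # assumed HiveServer2 HTTP port
])

print(firewall.allows(KNOX_HOST, 50070))             # Knox host, open port
print(firewall.allows("laptop.example.com", 50070))  # any other host
```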
Kerberos
Why Kerberos?
Provides Strong Authentication
Establishes identity for users, services and hosts
Prevents unauthorized users from impersonating other accounts
Supports token delegation model
Works with existing directory services
Basis for Authorization
Don’t be afraid of Kerberos…..
Security Implications
$ whoami
baduser
$ hadoop fs -ls /tmp
Found 2 items
drwx-wx-wx - ambari-qa hdfs 0 2015-07-14 18:38 /tmp/hive
drwx------ - hdfs hdfs 0 2015-07-14 20:33 /tmp/secure
$ hadoop fs -ls /tmp/secure
ls: Permission denied: user=baduser, access=READ_EXECUTE,
inode="/tmp/secure":hdfs:hdfs:drwx------
Good right?
Look again!
$ HADOOP_USER_NAME=hdfs hadoop fs -ls /tmp/secure
Found 1 items
drwxr-xr-x - hdfs hdfs 0 2015-07-14 20:35 /tmp/secure/blah
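Why the override works can be shown with a toy model. This is not Hadoop code: it assumes, as the session above suggests, that without Kerberos the client simply reports its own user name and the permission check trusts it. The `INODES` table and function names are hypothetical, and group permissions are left out for brevity.

```python
# Toy stand-in for the listing on the slide: path -> (owner, mode bits).
INODES = {"/tmp/secure": ("hdfs", 0o700)}


def effective_user(os_user, env):
    """Simple (non-Kerberos) mode: trust whatever name the client reports."""
    return env.get("HADOOP_USER_NAME", os_user)


def can_list(path, os_user, env):
    """Listing a directory needs read + execute for the effective user."""
    owner, mode = INODES[path]
    user = effective_user(os_user, env)
    if user == owner:
        return (mode & 0o500) == 0o500  # owner r-x bits
    return (mode & 0o005) == 0o005      # "other" r-x bits


print(can_list("/tmp/secure", "baduser", {}))
print(can_list("/tmp/secure", "baduser", {"HADOOP_USER_NAME": "hdfs"}))
```

The permission bits are enforced correctly in both calls; what is missing is any proof that the caller really is `hdfs`. That proof of identity is what Kerberos adds.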
Kerberos Primer
Actors: Client, KDC, NN, DN, and the Client's Kerberos Ticket Cache
1. kinit - Login and get Ticket Granting Ticket (TGT)
2. Client stores TGT in Ticket Cache
3. Get NameNode Service Ticket (NN-ST)
4. Client stores NN-ST in Ticket Cache
5. Read/write file given NN-ST and file name; returns block locations, block IDs and Block Access Tokens if access permitted
6. Read/write block given Block Access Token and block ID
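Steps 5 and 6 hinge on the DataNode being able to check a Block Access Token without calling back to the NameNode. A minimal sketch of that idea follows. It assumes the token is an HMAC over the user, block ID and access mode, computed with a secret shared by NameNode and DataNodes. The message layout and function names are hypothetical, and real tokens also carry details such as an expiry time that are omitted here.

```python
import hashlib
import hmac
import os

# Secret shared by NameNode and DataNodes; the client never sees it.
SHARED_KEY = os.urandom(32)


def issue_block_token(user, block_id, mode, key=SHARED_KEY):
    """NameNode side (step 5): sign what the client is allowed to do."""
    message = f"{user}|{block_id}|{mode}".encode()
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def datanode_accepts(user, block_id, mode, token, key=SHARED_KEY):
    """DataNode side (step 6): recompute the signature and compare."""
    expected = issue_block_token(user, block_id, mode, key)
    return hmac.compare_digest(expected, token)


token = issue_block_token("alice", "blk_1001", "READ")
print(datanode_accepts("alice", "blk_1001", "READ", token))   # request as issued
print(datanode_accepts("alice", "blk_1001", "WRITE", token))  # mode changed
print(datanode_accepts("alice", "blk_2002", "READ", token))   # block changed
```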
Ranger
Apache Ranger authZ Architecture
Components with a Ranger plugin: HDFS, Hive, HBase, YARN, Knox, Storm, Solr, Kafka
Ranger services: Administration Portal, Policy Server, Audit Server, KMS, REST APIs
Back ends shown: DB, Solr, HDFS, Log4j
LDAP/AD user/group sync
Sample Simplified Workflow - HDFS
Components: Policy Manager, Plugin, Name Node, Audit Database, User Application
1. Admin sets policies for HDFS files/folders
2. Users reach HDFS in one of three ways:
   • Data scientist runs a MapReduce job
   • Users access HDFS data through an application
   • IT users access HDFS through CLI
3. NameNode uses Plugin for authorization
4. Audit logs pushed to DB
5. NameNode provides resource access to user/client
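The authorization and audit steps of this workflow can be sketched as a toy in-process plugin. This is an illustration of the idea only, not Ranger's policy model or API: the `Policy` fields, the path-prefix matching rule, and the in-memory audit list are all simplifying assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Policy:
    path: str          # policy applies to this path and everything below it
    users: set
    access_types: set


def covers(policy_path, path):
    """True if path is the policy path itself or lies underneath it."""
    root = policy_path.rstrip("/")
    return path == root or path.startswith(root + "/")


@dataclass
class ToyPlugin:
    policies: list                       # set by the admin (step 1)
    audit_log: list = field(default_factory=list)

    def is_access_allowed(self, user, path, access_type):
        allowed = any(
            covers(p.path, path)
            and user in p.users
            and access_type in p.access_types
            for p in self.policies
        )
        # Every decision is recorded, allowed or not (step 4).
        self.audit_log.append((user, path, access_type, allowed))
        return allowed


plugin = ToyPlugin([Policy("/data/sales", {"analyst"}, {"read"})])
print(plugin.is_access_allowed("analyst", "/data/sales/q1.csv", "read"))
print(plugin.is_access_allowed("analyst", "/data/sales/q1.csv", "write"))
print(plugin.is_access_allowed("intern", "/data/sales/q1.csv", "read"))
print(len(plugin.audit_log))
```

Denied requests are audited as well as granted ones, which is what makes the audit trail useful for compliance reviews.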
Ranger Stacks
• Apache Ranger v0.5 supports a stack model to enable easier onboarding of new components, without requiring code changes in Apache Ranger.
Ranger Side Changes: Define Service-type
• Create a JSON file with the following details:
  - Resources
  - Access types
  - Config to connect
• Load the JSON into Ranger.
Secured Components Side Changes: Develop Ranger Authorization Plugin
• Include plugin library in the secure component.
• During initialization of the service: Init RangerBasePlugin & RangerDefaultAuditHandler class.
• To authorize access to a resource: Use RangerBasePlugin.isAccessAllowed() with a RangerAccessRequest
• To support resource lookup: Implement RangerBaseService.lookupResource() & RangerBaseService.validateConfig()
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=53741207
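The three parts of the service-type JSON can be sketched as follows. This is a hypothetical, heavily trimmed definition: the service name `myservice` and every value are made up, and the field names (`resources`, `accessTypes`, `configs`) are assumptions that should be checked against the wiki page above before use.

```python
import json

# Hypothetical service-type definition for an imaginary component.
service_def = {
    "name": "myservice",
    "resources": [
        {"name": "path", "level": 1, "mandatory": True},
    ],
    "accessTypes": [
        {"name": "read", "label": "Read"},
        {"name": "write", "label": "Write"},
    ],
    "configs": [
        {"name": "url", "type": "string", "mandatory": True},
        {"name": "username", "type": "string", "mandatory": True},
    ],
}


def check_service_def(definition):
    """Return a list of problems; an empty list means the sketch is well formed."""
    problems = []
    for section in ("resources", "accessTypes", "configs"):
        entries = definition.get(section, [])
        if not entries:
            problems.append(f"missing section: {section}")
        names = [entry["name"] for entry in entries]
        if len(names) != len(set(names)):
            problems.append(f"duplicate names in: {section}")
    return problems


print(check_service_def(service_def))
print(json.dumps(service_def, indent=2))
```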
HDFS Encryption
Data Protection
Hadoop allows you to apply data protection policy at two different layers across the Hadoop stack.
Layer: Storage
  What? Encrypt data on disk
  How?
  • Volume level: LUKS (Linux), BitLocker (Windows)
  • Native in Hadoop: HDFS Encryption
  • Partners: Voltage, Protegrity, DataGuise, Vormetric
  • OS-level encryption
Layer: Transmission
  What? Encrypt data as it moves
  How?
  • Native in Hadoop: SSL & SASL
  • AES 256 for SSL & DTP with SASL
Data at rest Encryption Protection
Volume Level Encryption (Open Source - LUKS, DMCrypt)
OS File Level Encryption (Open Source - eCryptfs)
Hadoop Level Encryption (HDFS TDE*, Hive CLE**, HBase** )
HDFS Encryption – How it works
Diagram layers: DATA ACCESS, DATA MANAGEMENT, SECURITY, YARN, HDFS (Hadoop Distributed File System)
Components: HDFS Client, Name Node, KeyProvider API (shown twice), Key Management System (KMS)
• Encryption Zone (attributes - EZKey ID, version), HDFS-6134
• Encrypted File (attributes - EDEK, IV)
• Key Management System (KMS), HADOOP-10433
• KeyProvider API, HADOOP-10141
• Keys shown: EDEK, DEK, DEKs, EZKs
• Crypto Stream (r/w with DEK)
Acronym  Description
EZ       Encryption Zone (an HDFS directory)
EZK      Encryption Zone Key; master key associated with all files in an EZ
DEK      Data Encryption Key; unique key associated with each file. The EZK is used to encrypt the DEK
EDEK     Encrypted DEK; the Name Node only has access to the encrypted DEK
IV       Initialization Vector
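The relationship between EZK, DEK and EDEK is a form of envelope encryption, which can be sketched with a toy model. Everything here is an assumption made for illustration: `ToyKMS` and its methods are hypothetical, and `toy_cipher` is an insecure XOR keystream standing in for a real cipher so the example needs only the standard library.

```python
import hashlib
import hmac
import os


def toy_cipher(key, iv, data):
    """XOR data with an HMAC-SHA256 keystream. NOT secure; a stand-in for a real cipher."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hmac.new(key, iv + counter.to_bytes(8, "big"), hashlib.sha256).digest()
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))


class ToyKMS:
    """Holds encryption zone keys (EZKs); they never leave this object."""

    def __init__(self):
        self._ezks = {}

    def create_key(self, name):
        self._ezks[name] = os.urandom(16)

    def generate_edek(self, name):
        """Make a fresh DEK and return it only in encrypted form (EDEK)."""
        dek, iv = os.urandom(16), os.urandom(16)
        return toy_cipher(self._ezks[name], iv, dek), iv

    def decrypt_edek(self, name, edek, iv):
        return toy_cipher(self._ezks[name], iv, edek)


kms = ToyKMS()
kms.create_key("key1")

# Name Node: stores only the EDEK and IVs as file attributes, never the DEK.
edek, edek_iv = kms.generate_edek("key1")
file_iv = os.urandom(16)
namenode_metadata = {"ez_key": "key1", "edek": edek, "edek_iv": edek_iv, "iv": file_iv}

# Writing client: has the KMS unwrap the EDEK, then encrypts with the DEK.
dek = kms.decrypt_edek("key1", edek, edek_iv)
ciphertext = toy_cipher(dek, file_iv, b"salary data")

# Reading client: same unwrap, same stream, recovers the plaintext.
plaintext = toy_cipher(kms.decrypt_edek("key1", edek, edek_iv), file_iv, ciphertext)
print(plaintext)
```

The design point is separation of duties: the storage layer holds ciphertext and EDEKs, the KMS holds EZKs, and only an authorized client ever holds a usable DEK.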
HDFS Encryption – Common Commands
• Run KMS Server
– ./kms.sh run
• Create Encryption Key
– hadoop key create key1 -size 128
– # Key size can be 128, 192 or 256. 256 requires unlimited strength JCE file.
• List all Encryption Keys
– hadoop key list -metadata
• As an Admin (hdfs user), create an Encryption Zone
– hdfs crypto -createZone -keyName key1 -path /secure1
– Point to an existing & empty directory
• List all Encryption Zones
– hdfs crypto -listZones
• Read/Write to HDFS unchanged (as HDFS end-user)
– hdfs dfs -copyFromLocal /tmp/vinay.txt /secure1
– hdfs dfs -cat /securehive/sal.txt
– Run this as a user not in the HDFS admin role
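The admin-side steps can be collected into a small helper that builds the command lines in order. This sketch only constructs argument lists and does not run anything. The helper name is hypothetical, and the `hdfs dfs -mkdir -p` step is an addition of this sketch to satisfy the "existing & empty directory" requirement; it does not appear on the slide.

```python
VALID_KEY_SIZES = (128, 192, 256)


def encryption_zone_commands(key_name, zone_path, key_size=128):
    """Return the admin commands, in order, as argv lists."""
    if key_size not in VALID_KEY_SIZES:
        raise ValueError(f"key size must be one of {VALID_KEY_SIZES}")
    if not zone_path.startswith("/"):
        raise ValueError("zone_path must be an absolute HDFS path")
    return [
        ["hadoop", "key", "create", key_name, "-size", str(key_size)],
        ["hdfs", "dfs", "-mkdir", "-p", zone_path],  # added by this sketch
        ["hdfs", "crypto", "-createZone", "-keyName", key_name, "-path", zone_path],
        ["hdfs", "crypto", "-listZones"],
    ]


for argv in encryption_zone_commands("key1", "/secure1"):
    print(" ".join(argv))
```

Keeping each command as a list of arguments means it can later be handed to `subprocess.run` without shell quoting issues.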
Encrypting Data In-Motion
Protocol: REST
  Communication point: WebHDFS (Client to Cluster); Client to Knox
  Encryption mechanism: REST over SSL; Knox Gateway SSL; SPNEGO, which provides a mechanism for extending Kerberos to Web applications through the standard HTTP protocol
Protocol: HTTP
  Communication point: NameNode/JobTracker UI; MapReduce Shuffle
  Encryption mechanism: HTTPS; Encrypted MapReduce Shuffle (MAPREDUCE-4117)
Protocol: RPC
  Communication point: Hadoop Client (Client to Cluster, Intra-Cluster)
  Encryption mechanism: SASL; the Hadoop RPC system implements SASL, which provides different QoP including encryption
Protocol: JDBC/ODBC
  Communication point: HiveServer2
  Encryption mechanism: SSL
Protocol: TCP/IP
  Communication point: Data Transfer (Client to Cluster, Intra-Cluster)
  Encryption mechanism: Encrypted DataTransfer Protocol available in Hadoop; adding SASL support to the DataTransferProtocol
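Several of these mechanisms are switched on through Hadoop configuration properties. The sketch below renders a few of them as `<property>` entries. The property names are standard Hadoop settings as best I know them, but which file each belongs in (core-site.xml, hdfs-site.xml, mapred-site.xml) and the exact values for a given release should be confirmed against that release's documentation. The helper name is hypothetical.

```python
from xml.sax.saxutils import escape

# Protocol from the table above -> (property, value) that turns encryption on.
WIRE_ENCRYPTION = {
    "RPC": ("hadoop.rpc.protection", "privacy"),
    "Data Transfer": ("dfs.encrypt.data.transfer", "true"),
    "HTTP UIs": ("dfs.http.policy", "HTTPS_ONLY"),
    "MapReduce Shuffle": ("mapreduce.shuffle.ssl.enabled", "true"),
}


def to_property_xml(name, value):
    """Render one Hadoop-style <property> element."""
    return (
        "<property>\n"
        f"  <name>{escape(name)}</name>\n"
        f"  <value>{escape(value)}</value>\n"
        "</property>"
    )


rendered = [to_property_xml(name, value) for name, value in WIRE_ENCRYPTION.values()]
print("\n".join(rendered))
```

For RPC, the `privacy` value corresponds to the SASL quality of protection that adds encryption on top of authentication and integrity checking.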
Real-world Implementation
Data Sources
Thank You!

Neo4j
 
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems S.M.S.A.
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
DanBrown980551
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
Neo4j
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
Aftab Hussain
 
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AIEnchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Vladimir Iglovikov, Ph.D.
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
KatiaHIMEUR1
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
Alan Dix
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
Quotidiano Piemontese
 
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex ProofszkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
Alex Pruden
 
How to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptxHow to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptx
danishmna97
 
Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
Octavian Nadolu
 

Recently uploaded (20)

20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?
 
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
 
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
 
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
 
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
 
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AIEnchancing adoption of Open Source Libraries. A case study on Albumentations.AI
Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
 
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex ProofszkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
 
How to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptxHow to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptx
 
Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
 

Open Source Security Tools for Big Data

  • 1. Open Source Security Tools For Big Data. Rommel Garcia (@rommelgarcia), Hortonworks
  • 2. # whoami
      • Global Security SME Lead @hortonworks
      • Senior Solutions Engineer @hortonworks
      • Book Author: Virtualizing Hadoop
      • Co-organizer of Atlanta Hadoop User Group
      • Regular Speaker at Big Data Conferences
  • 4. DATA: More Volume and More Types
      [Figure: data volume plotted against increasing data variety and complexity. Scale labels: gigabytes, terabytes, petabytes, exabytes. Tiers: ERP, CRM, WEB, BIG DATA. Example data types shown: user-generated content, mobile web, SMS/MMS, sentiment, external demographics, HD video, speech to text, product/service logs, social network, business data feeds, user click stream, web logs, offer history, dynamic pricing, A/B testing, affiliate networks, search marketing, behavioral targeting, dynamic funnels, payment record, support contacts, customer touches, purchase detail, purchase record, segmentation, offer details.]
  • 5. Big Data Ecosystem
      [Figure: Big Data Platform diagram. Labels shown:
      Platform: HDFS (Hadoop Distributed File System); YARN: Data Operating System; engines for Script, SQL, NoSQL, Stream, Search, In-Mem, Others; Security, Operations, Governance & Integration.
      Traditional sources: EDW, OLAP, datamarts, column databases, CRM, RDBMS; lending, markets, trades, compliance data, credit card, cash & equity, finance & GL, risk data.
      Emerging & non-traditional sources: server logs, call center, emails, Word documents, location data, sensor data, customer sentiment, research reports.
      Data repositories and analysis & visualization: risk modeling, fraud detection, compliance (AML, KYC), Bank 3.0, information security, single view of customer, trading applications, market data management.]
  • 6. Compliance Adherences
      • HIPAA - Health Insurance Portability and Accountability Act of 1996
      • HITECH - Health Information Technology for Economic and Clinical Health Act
      • PCI DSS - Payment Card Industry Data Security Standard
      • SOX - Sarbanes-Oxley Act of 2002
      • ISO - International Organization for Standardization
      • COBIT - Control Objectives for Information and Related Technology
      • Corporate Security Policies
  • 8. 5 Pillars of Security
      • Authentication
      • Authorization
      • Audit
      • Data at rest/in-motion Encryption
      • Centralized Administration
  • 9. Big Data Ecosystem
      [Figure: the Big Data Platform diagram from slide 5, with numbered callouts marking the security components.]
      1. Knox
      2. Kerberos
      3. Ranger
      4. HDFS Enc.
      5. LDAP
  • 10. 5 Pillars of Security
      • Authentication ->
      • Authorization ->
      • Audit ->
      • Data Protection ->
      • Centralized Administration ->
  • 12. Why Knox?
      Simplified Access
        • Kerberos encapsulation
        • Extends API reach
        • Single access point
        • Multi-cluster support
        • Single SSL certificate
      Centralized Control
        • Central REST API auditing
        • Service-level authorization
        • Alternative to SSH "edge node"
      Enterprise Integration
        • LDAP integration
        • Active Directory integration
        • SSO integration
        • Apache Shiro extensibility
        • Custom extensibility
      Enhanced Security
        • Protect network details
        • Partial SSL for non-SSL services
        • WebApp vulnerability filter
  • 13. Knox Deployment with Hadoop Cluster
      [Figure: network diagram. Labels shown: Application Tier, Web Tier, DMZ, LB, Knox, Hadoop CLIs, switches, Master Nodes in Rack 1 (NN, SNN), Slave Nodes in Rack 2 through Rack N (DN).]
  • 14. What does Perimeter Security really mean?
      [Figure labels: User, Firewall, Gateway, REST API, Hadoop Services (Hive host, HBase hosts, WebHDFS).]
      • Firewall required at perimeter (today)
      • Knox Gateway controls all Hadoop REST API access through firewall
      • Hadoop cluster mostly unaffected
      • Firewall only allows connections through specific ports from Knox host
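As a rough illustration of the "single access point" idea, the sketch below builds the URL a client would call when WebHDFS is reached through a Knox gateway rather than on the NameNode directly. The host name is hypothetical, and the port `8443`, gateway path `gateway` and topology name `default` are assumed defaults that a real deployment may change.

```python
from urllib.parse import quote, urlencode


def knox_webhdfs_url(gateway_host, hdfs_path, op, port=8443,
                     gateway_path="gateway", topology="default", **params):
    """Build a WebHDFS URL routed through a Knox gateway.

    port, gateway_path and topology are assumptions about the deployment;
    only the Knox host and port need to be reachable through the firewall.
    """
    if not hdfs_path.startswith("/"):
        raise ValueError("hdfs_path must be absolute")
    query = urlencode({"op": op, **params})
    return (f"https://{gateway_host}:{port}/{gateway_path}/{topology}"
            f"/webhdfs/v1{quote(hdfs_path)}?{query}")


# Hypothetical host name, for illustration only.
print(knox_webhdfs_url("knox.example.com", "/tmp", "LISTSTATUS"))
```

The client never learns the NameNode or DataNode host names, which is what "protect network details" on the Knox slide refers to.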
  • 16. Why Kerberos?
      • Provides strong authentication
      • Establishes identity for users, services and hosts
      • Prevents unauthorized users from impersonating other accounts
      • Supports token delegation model
      • Works with existing directory services
      • Basis for authorization
  • 17. Don't be afraid of Kerberos...
  • 18. Security Implications
      $ whoami
      baduser
      $ hadoop fs -ls /tmp
      Found 2 items
      drwx-wx-wx   - ambari-qa hdfs          0 2015-07-14 18:38 /tmp/hive
      drwx------   - hdfs      hdfs          0 2015-07-14 20:33 /tmp/secure
      $ hadoop fs -ls /tmp/secure
      ls: Permission denied: user=baduser, access=READ_EXECUTE, inode="/tmp/secure":hdfs:hdfs:drwx------
      Good, right?
  • 19. Security Implications
      $ whoami
      baduser
      $ hadoop fs -ls /tmp
      Found 2 items
      drwx-wx-wx   - ambari-qa hdfs          0 2015-07-14 18:38 /tmp/hive
      drwx------   - hdfs      hdfs          0 2015-07-14 20:33 /tmp/secure
      $ hadoop fs -ls /tmp/secure
      ls: Permission denied: user=baduser, access=READ_EXECUTE, inode="/tmp/secure":hdfs:hdfs:drwx------
      Good, right? Look again!
      $ HADOOP_USER_NAME=hdfs hadoop fs -ls /tmp/secure
      Found 1 items
      drwxr-xr-x   - hdfs      hdfs          0 2015-07-14 20:35 /tmp/secure/blah
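The session above works because, without Kerberos, the cluster trusts whatever user name the client asserts. The following is a minimal toy model of that behaviour, not Hadoop code: `Inode`, `client_asserted_user` and `can_list` are hypothetical names, and the permission check is a simplified POSIX-style check that ignores ACLs and superuser handling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Inode:
    path: str
    owner: str
    group: str
    mode: int  # POSIX-style permission bits, e.g. 0o700 for drwx------


def client_asserted_user(env, os_user):
    """Simple (non-Kerberos) auth: HADOOP_USER_NAME, if set, replaces the
    OS login name, and the server has no way to verify the claim."""
    return env.get("HADOOP_USER_NAME", os_user)


def can_list(inode, user, groups=()):
    """Listing a directory needs read + execute (READ_EXECUTE)."""
    needed = 0o5
    if user == inode.owner:
        bits = (inode.mode >> 6) & 0o7
    elif inode.group in groups:
        bits = (inode.mode >> 3) & 0o7
    else:
        bits = inode.mode & 0o7
    return bits & needed == needed


secure = Inode("/tmp/secure", owner="hdfs", group="hdfs", mode=0o700)

honest = client_asserted_user({}, os_user="baduser")
spoofed = client_asserted_user({"HADOOP_USER_NAME": "hdfs"}, os_user="baduser")
print(honest, can_list(secure, honest))
print(spoofed, can_list(secure, spoofed))
```

The permission bits did their job in both calls; the weak point is the identity fed into them, which is the gap Kerberos closes.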
  • 20. Kerberos Primer
      [Figure: Client, KDC, NameNode (NN) and DataNode (DN), with the client's Kerberos ticket cache.]
      1. kinit - Login and get Ticket Granting Ticket (TGT)
      2. Client stores TGT in ticket cache
      3. Get NameNode Service Ticket (NN-ST)
      4. Client stores NN-ST in ticket cache
      5. Read/write file given NN-ST and file name; returns block locations, block IDs and Block Access Tokens if access permitted
      6. Read/write block given Block Access Token and block ID
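Steps 5 and 6 of the primer can be pictured with a toy model in which the NameNode signs a small token and the DataNode checks it. This sketch assumes the token is a message authentication code computed with a key shared by NameNode and DataNodes; the field names, the JSON encoding and the use of SHA-256 are illustrative choices, not Hadoop's wire format.

```python
import hashlib
import hmac
import json
import time


def issue_block_token(shared_key, user, block_id, modes, ttl_s=600, now=None):
    """NameNode side (step 5): sign who may do what to which block."""
    now = time.time() if now is None else now
    ident = {"user": user, "block_id": block_id,
             "modes": sorted(modes), "expires": now + ttl_s}
    payload = json.dumps(ident, sort_keys=True)
    mac = hmac.new(shared_key, payload.encode(), hashlib.sha256).hexdigest()
    return {"payload": payload, "mac": mac}


def datanode_accepts(shared_key, token, block_id, mode, now=None):
    """DataNode side (step 6): verify the signature, then the claims."""
    now = time.time() if now is None else now
    expected = hmac.new(shared_key, token["payload"].encode(),
                        hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, token["mac"]):
        return False  # forged or tampered token
    ident = json.loads(token["payload"])
    return (ident["block_id"] == block_id
            and mode in ident["modes"]
            and now < ident["expires"])


key = b"secret shared by NameNode and DataNodes"  # hypothetical key
token = issue_block_token(key, "alice", block_id=42, modes={"READ"})
print(datanode_accepts(key, token, block_id=42, mode="READ"))
print(datanode_accepts(key, token, block_id=42, mode="WRITE"))
```

In this model the DataNode needs no Kerberos ticket from the client for block access; it only checks that the token was issued by something holding the shared key.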
  • 22. Apache Ranger authZ Architecture
      [Figure: Ranger plugins attached to HDFS, Hive, HBase, YARN, Knox, Storm, Solr and Kafka. Central components: Administration Portal, Policy Server, Audit Server, REST APIs. Other labels: DB, SOLR, HDFS, KMS, Log4j, LDAP/AD user/group sync.]
  • 23. Sample Simplified Workflow - HDFS
      [Figure labels: Policy Manager, Plugin, Name Node, User Application, Audit Database. Steps carry callouts 1 to 5 in the figure.]
      • Admin sets policies for HDFS files/folders
      • Data scientist runs a MapReduce job
      • Users access HDFS data through application
      • IT users access HDFS through CLI
      • NameNode uses plugin for authorization
      • Audit logs pushed to DB
      • NameNode provides resource access to user/client
  • 24. Ranger Stacks
      • Apache Ranger v0.5 supports a stack model to enable easier onboarding of new components, without requiring code changes in Apache Ranger.
      Ranger Side Changes: Define Service-type
        • Create a JSON file with the following details: resources, access types, config to connect
        • Load the JSON into Ranger.
      Secured Components Side Changes: Develop Ranger Authorization Plugin
        • Include plugin library in the secure component.
        • During initialization of the service: Init RangerBasePlugIn & RangerDefaultAuditHandler class.
        • To authorize access to a resource: Use RangerAccessRequest.isAccessAllowed()
        • To support resource lookup: Implement RangerBaseService.lookupResource() & RangerBaseService.validateConfig()
      https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=53741207
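The "create a JSON file" step might look roughly like the sketch below, which assembles a service-type definition with the three parts the slide lists (resources, access types, connection config) and serializes it. The service name, implementation class and every item are hypothetical, and the field names (`implClass`, `itemId`, `accessTypes`, `configs`) are an assumption about Ranger's service-definition layout that should be checked against the wiki page linked on the slide.

```python
import json


def build_service_def(name, impl_class, resources, access_types, config_keys):
    """Assemble a service-type definition as a plain dict.

    Field names are assumed, not taken from the slide.
    """
    return {
        "name": name,
        "implClass": impl_class,
        "resources": [
            {"itemId": i, "name": r, "level": i * 10, "mandatory": True}
            for i, r in enumerate(resources, start=1)
        ],
        "accessTypes": [
            {"itemId": i, "name": a, "label": a.capitalize()}
            for i, a in enumerate(access_types, start=1)
        ],
        "configs": [
            {"itemId": i, "name": c, "type": "string", "mandatory": True}
            for i, c in enumerate(config_keys, start=1)
        ],
    }


# Hypothetical component with a two-level resource hierarchy.
service_def = build_service_def(
    name="myservice",
    impl_class="com.example.ranger.MyService",
    resources=["database", "table"],
    access_types=["read", "write"],
    config_keys=["username", "password", "service.url"],
)
print(json.dumps(service_def, indent=2))
```

The resulting JSON is what would be loaded into Ranger; the plugin half of the work lives in the secured component itself and is not modelled here.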
  • 26. Data Protection
      Hadoop allows you to apply data protection policy at two different layers across the Hadoop stack.
      Layer: Storage
        • What? Encrypt data on disk
        • How? Volume level: LUKS (Linux), BitLocker (Windows); native in Hadoop: HDFS Encryption; partners: Voltage, Protegrity, DataGuise, Vormetric; OS-level encryption
      Layer: Transmission
        • What? Encrypt data as it moves
        • How? Native in Hadoop: SSL & SASL; AES 256 for SSL & DTP with SASL
  • 27. Data at rest Encryption Protection
      • Volume Level Encryption (Open Source - LUKS, DMCrypt)
      • OS File Level Encryption (Open Source - eCryptfs)
      • Hadoop Level Encryption (HDFS TDE*, Hive CLE**, HBase**)
  • 28. HDFS Encryption: How it works
      [Figure: HDFS client, Name Node and Key Management System (KMS, HADOOP-10433) connected through the KeyProvider API (HADOOP-10141). Encryption Zone (attributes: EZKey ID, version), HDFS-6134. Encrypted File (attributes: EDEK, IV). The client uses a crypto stream (read/write with DEK). Other labels: EZKs, DEKs, EDEK.]
      Acronyms:
        • EZ: Encryption Zone (an HDFS directory)
        • EZK: Encryption Zone Key; master key associated with all files in an EZ
        • DEK: Data Encryption Key; unique key associated with each file. The EZ key is used to encrypt the DEK.
        • EDEK: Encrypted DEK; the Name Node only has access to the encrypted DEK
        • IV: Initialization Vector
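The key hierarchy in the acronym list is a form of envelope encryption, which the toy model below walks through end to end. Everything here is a sketch under assumptions: `ToyKMS`, `ToyNameNode` and the helper functions are hypothetical names, and `toy_cipher` is a SHA-256-based XOR stream used only because the standard library has no AES; it stands in for the real cipher and must not be used to protect data.

```python
import hashlib
import os


def toy_cipher(key, iv, data):
    """Toy XOR stream cipher (stand-in for AES-CTR, NOT secure).
    Applying it twice with the same key and IV restores the input."""
    out = bytearray()
    for counter, start in enumerate(range(0, len(data), 32)):
        pad = hashlib.sha256(key + iv + counter.to_bytes(8, "big")).digest()
        out.extend(b ^ p for b, p in zip(data[start:start + 32], pad))
    return bytes(out)


class ToyKMS:
    """Holds the EZKs; they never leave this object."""

    def __init__(self):
        self._ezks = {}

    def create_key(self, name):
        self._ezks[name] = os.urandom(16)

    def generate_edek(self, key_name):
        dek, iv = os.urandom(16), os.urandom(16)  # fresh DEK per file
        return {"key_name": key_name, "edek_iv": iv,
                "edek": toy_cipher(self._ezks[key_name], iv, dek)}

    def decrypt_edek(self, meta):
        return toy_cipher(self._ezks[meta["key_name"]],
                          meta["edek_iv"], meta["edek"])


class ToyNameNode:
    """Stores zone -> EZK name and file -> (EDEK, IV); never sees a DEK."""

    def __init__(self, kms):
        self.kms, self.zones, self.files = kms, {}, {}

    def create_zone(self, path, key_name):
        self.zones[path.rstrip("/")] = key_name

    def create_file(self, path):
        zone = next(z for z in self.zones if path.startswith(z + "/"))
        meta = self.kms.generate_edek(self.zones[zone])
        meta["iv"] = os.urandom(16)  # IV for the file contents
        self.files[path] = meta
        return meta


def client_write(nn, kms, blocks, path, plaintext):
    meta = nn.create_file(path)
    dek = kms.decrypt_edek(meta)          # client asks the KMS for the DEK
    blocks[path] = toy_cipher(dek, meta["iv"], plaintext)


def client_read(nn, kms, blocks, path):
    meta = nn.files[path]
    return toy_cipher(kms.decrypt_edek(meta), meta["iv"], blocks[path])


kms = ToyKMS()
kms.create_key("key1")
nn = ToyNameNode(kms)
nn.create_zone("/secure1", "key1")
blocks = {}                               # stands in for DataNode storage
client_write(nn, kms, blocks, "/secure1/sal.txt", b"salary data")
print(client_read(nn, kms, blocks, "/secure1/sal.txt"))
```

The separation of duties is the point: the storage layer holds only ciphertext, the NameNode holds only EDEKs, and the KMS holds only EZKs, so compromising any one of them alone does not expose file contents.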
  • 29. HDFS Encryption: Common Commands
      As HDFS admin:
        • Run KMS server
            ./kms.sh run
        • Create encryption key (key size can be 128, 192 or 256; 256 requires the unlimited strength JCE file)
            hadoop key create key1 -size 128
        • List all encryption keys
            hadoop key list -metadata
        • As an admin (hdfs user), create an encryption zone (point to an existing, empty directory)
            hdfs crypto -createZone -keyName key1 -path /secure1
        • List all encryption zones
            hdfs crypto -listZones
      As HDFS end user (run this as a user not in the HDFS admin role):
        • Read/write to HDFS unchanged
            hdfs dfs -copyFromLocal /tmp/vinay.txt /secure1
            hdfs dfs -cat /securehive/sal.txt
  • 30. Encrypting Data In-Motion
      Protocol: REST
        • Communication point: WebHDFS (client to cluster); client to Knox
        • Encryption mechanism: REST over SSL; Knox Gateway SSL; SPNEGO, which provides a mechanism for extending Kerberos to web applications through the standard HTTP protocol
      Protocol: HTTP
        • Communication point: NameNode/JobTracker UI; MapReduce shuffle
        • Encryption mechanism: HTTPS; Encrypted MapReduce Shuffle (MAPREDUCE-4117)
      Protocol: RPC
        • Communication point: Hadoop client (client to cluster, intra-cluster)
        • Encryption mechanism: SASL; the Hadoop RPC system implements SASL, which provides different QoP including encryption
      Protocol: JDBC/ODBC
        • Communication point: HiveServer2
        • Encryption mechanism: SSL
      Protocol: TCP/IP
        • Communication point: Data transfer (client to cluster, intra-cluster)
        • Encryption mechanism: Encrypted DataTransferProtocol available in Hadoop; adding SASL support to the DataTransferProtocol
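For orientation, the sketch below maps four of the rows above to the configuration properties that, to the best of my recollection of the Hadoop 2.x and Hive documentation, switch on the corresponding encryption, and renders them as `<property>` elements. The property names, values and file assignments are assumptions to verify against your distribution; the slide itself does not name them.

```python
from xml.sax.saxutils import escape

# Assumed property names per config file; verify before use.
WIRE_ENCRYPTION = {
    "core-site.xml": {
        "hadoop.rpc.protection": "privacy",       # RPC: SASL QoP with encryption
    },
    "hdfs-site.xml": {
        "dfs.encrypt.data.transfer": "true",      # TCP/IP: DataTransferProtocol
        "dfs.http.policy": "HTTPS_ONLY",          # HTTP: web UIs and WebHDFS
    },
    "mapred-site.xml": {
        "mapreduce.shuffle.ssl.enabled": "true",  # HTTP: encrypted shuffle
    },
    "hive-site.xml": {
        "hive.server2.use.SSL": "true",           # JDBC/ODBC: HiveServer2
    },
}


def to_property_xml(props):
    """Render a dict as Hadoop-style <property> elements."""
    lines = []
    for name, value in props.items():
        lines.append("<property>")
        lines.append(f"  <name>{escape(name)}</name>")
        lines.append(f"  <value>{escape(value)}</value>")
        lines.append("</property>")
    return "\n".join(lines)


for filename, props in WIRE_ENCRYPTION.items():
    print(f"<!-- {filename} -->")
    print(to_property_xml(props))
```

Each setting covers a different channel, so enabling one (for example HTTPS for the web UIs) leaves the others in clear text until they are configured as well.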