Building secure NoSQL applications nosqlnow_conf_2014

Tips on building secure NoSQL applications with HBase and Accumulo

Published in: Technology

  1. Building Secure Applications With HBase / Accumulo
     Sujee Maniyam, sujee@elephantscale.com
     NoSQL Now! 2014 Conference, Aug 2014, San Jose, CA
  2. About This Talk…
     • Some practical tips & design patterns on building secure applications using HBase and Accumulo
     • A quick demo (fingers crossed!)
     • Audience: technical
  3. Who Invited This Guy?
     • Hi, I am Sujee Maniyam
     • Founder / Principal @ Elephant Scale: consulting & training in Big Data, NoSQL
     • Co-author of the open source Hadoop book: http://hadoopilluminated.com
     • Founder / Organizer of the ‘Big Data Guru’ meetup: http://www.meetup.com/BigDataGurus/
     • Open source: http://github.com/sujee
     • http://sujee.net | http://www.linkedin.com/in/sujeemaniyam
  4. NoSQL eco-system (too many!)
  5. HBase: Quick Intro
     • Modeled after Google Bigtable
     • Distributed NoSQL store built on Hadoop / HDFS
     • Apache project
     • http://hbase.apache.org/
     • [Diagram: HBase on top of HDFS]
  6. Accumulo: Quick Intro
     • Developed by the National Security Agency (NSA)!
     • Google Bigtable implementation
     • NoSQL store on top of HDFS
     • Security is a first-class concept
     • [Diagram: Accumulo on top of HDFS]
  7. HBase & Accumulo
     • Both are Bigtable implementations
     • Based on HDFS
     • Written in Java
     • Apache open source projects
     • [Diagram: HBase and Accumulo on top of HDFS]
  8. Approach to Security in Hadoop Until Recently…
  9. But Security Picture Has Improved Rapidly…
     • A lot of work is going on in the eco-system
     • Hadoop vendors (Cloudera / Hortonworks ..) have been working very actively on security features
     • ‘The core’ features are in
     • Ease of use is improving as well
  10. Next: Building Secure Applications
  11. What Does It Mean to be ‘Secure’?
     • 1) Control who can get in
     • 2) Verify the person’s identity
     • 3) Safeguard communications with the user
     • 4) What is allowed for this user
     • 5) And finally… protect data at rest
  12. 1) Who can get in
     • Control which machines can connect to the NoSQL cluster
     • Don’t expose the cluster to the public
        • Too many open ports
        • Too vulnerable
     • Solutions:
        • Run the cluster behind a firewall
        • Restrict which machines can connect to the cluster
     • Linux / network level security
        • Outside the actual NoSQL store
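The "restrict which machines can connect" rule can be illustrated with a minimal sketch. As the slide notes, real enforcement belongs in the firewall or network layer, outside HBase and Accumulo; the network ranges and the `is_allowed` helper below are hypothetical and only show the shape of an allowlist check.

```python
import ipaddress

# Hypothetical trusted ranges; a real deployment would define these
# in firewall or security-group rules, not in application code.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("192.168.10.0/24"),
]


def is_allowed(client_ip: str) -> bool:
    """Return True if client_ip falls inside any trusted network."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in ALLOWED_NETWORKS)


if __name__ == "__main__":
    for ip in ("10.1.2.3", "192.168.10.77", "8.8.8.8"):
        print(ip, "allowed" if is_allowed(ip) else "blocked")
```

Anything outside the listed ranges is rejected by default, which mirrors the "don't expose the cluster to the public" advice.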
  13. Trusted Environment
  14. 2) User Authentication
     • Wolf: Knock… Knock…
     • Pig: Who is there?
     • Wolf: It is me… little pig
     • How can we verify the user?
        • Username / password (Gmail)
        • Or use a third person (referee)
        • Kerberos
     • Image source: http://1.bp.blogspot.com/
  15. Kerberos: Quick Primer
     • Kerberos is an authentication protocol for networked machines
     • Validates client to server and vice versa
     • Strong crypto algorithms (AES, 3DES…)
  16. Kerberos Protocol for Getting a Beer at a Carnival / Fair :)
  17. Kerberos Protocol Explained: Getting Beer @ Fair / Party
     • Prove your age (identity) to the wristband issuer
        • Ticket Granting Ticket
     • Get a wristband → qualifies you to get beer
        • Service Ticket
     • Go to the bartender and ask for beer using your wristband
        • Service Request
     • Get beer! :)
     • For a technically correct explanation see: http://www.roguelynn.com/words/explain-like-im-5-kerberos/
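The three steps of the analogy (ticket-granting ticket, service ticket, service request) can be sketched as a toy simulation. This is not the Kerberos protocol: real Kerberos encrypts tickets and uses session keys, while this sketch only signs claims with HMAC to show who issues and who checks each ticket. `ToyKDC`, `serve`, and all keys and names are hypothetical.

```python
import hashlib
import hmac
import json
import time


def sign(key: bytes, claims: dict) -> dict:
    """Attach an HMAC so the holder of `key` can later check the claims."""
    body = json.dumps(claims, sort_keys=True).encode()
    return {"claims": claims, "mac": hmac.new(key, body, hashlib.sha256).hexdigest()}


def verify(key: bytes, ticket: dict) -> bool:
    """Check the HMAC first, then the expiry time."""
    body = json.dumps(ticket["claims"], sort_keys=True).encode()
    expected = hmac.new(key, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, ticket["mac"]) and ticket["claims"]["expires"] > time.time()


class ToyKDC:
    """Plays the wristband issuer: knows users and shares a key with each service."""

    def __init__(self):
        self.user_secrets = {"alice": "correct horse"}   # hypothetical user
        self.tgs_key = b"tgs-secret"                     # known only to the KDC
        self.service_keys = {"hbase": b"hbase-secret"}   # shared with the service

    def issue_tgt(self, user: str, secret: str) -> dict:
        # Step 1: prove identity, receive a ticket-granting ticket.
        known = self.user_secrets.get(user)
        if known is None or not hmac.compare_digest(known, secret):
            raise PermissionError("bad credentials")
        return sign(self.tgs_key, {"user": user, "expires": time.time() + 3600})

    def issue_service_ticket(self, tgt: dict, service: str) -> dict:
        # Step 2: present the TGT, receive a ticket for one service.
        if not verify(self.tgs_key, tgt):
            raise PermissionError("invalid TGT")
        claims = {"user": tgt["claims"]["user"], "service": service,
                  "expires": time.time() + 300}
        return sign(self.service_keys[service], claims)


def serve(service_name: str, service_key: bytes, ticket: dict) -> str:
    # Step 3: the service checks the ticket itself, without asking the KDC.
    if not verify(service_key, ticket) or ticket["claims"]["service"] != service_name:
        raise PermissionError("invalid service ticket")
    return "hello " + ticket["claims"]["user"]


if __name__ == "__main__":
    kdc = ToyKDC()
    tgt = kdc.issue_tgt("alice", "correct horse")
    ticket = kdc.issue_service_ticket(tgt, "hbase")
    print(serve("hbase", kdc.service_keys["hbase"], ticket))
```

The point of the design is that the bartender (the service) never sees the user's password; it only trusts tickets signed with the key it shares with the issuer.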
  18. Kerberos Integration
     • HBase: yes
     • Accumulo: yes (simple authentication is also built in)
  19. 3) Secure Client Communication
     • Guard client / server communication (‘on the wire’)
     • Done by using SASL (certificates)
     • Prevents snooping by third parties
     • Secure client communications: HBase yes, Accumulo yes
  20. 4) What Is Allowed For This User?
     • In an unsecured environment users can read / write to any table
        • → not very secure!
     • Control which data users can see
  21. Quick Primer on HBase Storage
     • Tables have many rows
     • A row has multiple columns (or qualifiers)
     • They are grouped into column families
     • Each cell also has a timestamp (not shown here)
     • [Diagram: a row keyed by Customer_id with two column families, info and secure, holding the columns name, email, phone, Last 4 social, Full SSN]
  22. HBase Allows Access Control At Family Level
     • [Diagram: the same row, with families info and secure]
     • A first-level CSR can access only the info family
     • Only supervisors can access the secure family
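Family-level access control can be pictured with a small in-memory model. This sketch does not use the HBase API; the role names, the sample values, and the split of columns between the `info` and `secure` families are assumptions made for illustration (the slide does not spell out which columns sit in which family).

```python
# One row, grouped by column family (values are made up for illustration).
ROW = {
    "info": {"name": "Joe", "email": "joe@example.com", "phone": "555-0100"},
    "secure": {"last4_social": "6789", "full_ssn": "123-45-6789"},
}

# Which families each (hypothetical) role may read.
FAMILY_GRANTS = {
    "csr": {"info"},
    "supervisor": {"info", "secure"},
}


def read_row(role: str, row: dict) -> dict:
    """Return only the column families that `role` has been granted."""
    granted = FAMILY_GRANTS.get(role, set())
    return {family: dict(cols) for family, cols in row.items() if family in granted}


if __name__ == "__main__":
    print(read_row("csr", ROW))         # only the info family
    print(read_row("supervisor", ROW))  # both families
```

Note the granularity: a role sees a whole family or nothing from it, which is the limitation the next slides address with cell-level controls.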
  23. Need More Fine-Grained Access
     • We would like to provide ‘cell level’ access controls
     • Greater flexibility in application development
     • More fine-grained access controls
     • Meet Accumulo’s data model
  24. Accumulo Data Model
     • Everything in the HBase data model
     • Plus each cell has a ‘visibility token’
     • Example, family info (column → visibility token):
        • name → Level 1
        • email → Level 1
        • Last 4 SSN → Level 1
        • SSN → Level 2 OR Top clearance
        • Gmail password → Top clearance
  25. Users Are Assigned ‘Visibility Tokens’
     • User 1: Level 1
     • User 2: Level 1 + Level 2
     • Edward Snowden: Level 1 + Level 2 + Top Clearance
  26. Accumulo only returns cells visible to the user
     • Row person1 (column = value, visibility token):
        • name = Joe, Level 1
        • email = joe@gmail.com, Level 1
        • Last 4 SSN = 6789, Level 1
        • Full SSN = 123-45-6789, Level 2 OR Top clearance
        • Gmail password = JoeSuperMan!, Top clearance
  27. What Users Can See…
     • User 1 (Level 1): name, email, Last 4 SSN
     • User 2 (Level 1 + Level 2): name, email, Last 4 SSN, Full SSN
     • Edward Snowden (Level 1 + Level 2 + Top Clearance): name, email, Last 4 SSN, Full SSN, Gmail password
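The filtering shown on these slides can be reproduced with a toy model. This is not the Accumulo API: each cell's visibility is modelled as a set of acceptable tokens (an OR), and a cell is returned if the user holds at least one of them. Accumulo's real visibility expressions are richer (they also support AND and nesting). The token names `L1`, `L2`, `TOP` and the helper `visible_cells` are hypothetical.

```python
# Row person1 from the slides: column -> (value, tokens that may see it).
CELLS = {
    "name": ("Joe", {"L1"}),
    "email": ("joe@gmail.com", {"L1"}),
    "last4_ssn": ("6789", {"L1"}),
    "full_ssn": ("123-45-6789", {"L2", "TOP"}),  # Level 2 OR Top clearance
    "gmail_password": ("JoeSuperMan!", {"TOP"}),
}

# Tokens assigned to each user, as on the slides.
USERS = {
    "user1": {"L1"},
    "user2": {"L1", "L2"},
    "esnowden": {"L1", "L2", "TOP"},
}


def visible_cells(auths: set) -> dict:
    """Return the cells whose visibility shares at least one token with auths."""
    return {col: value for col, (value, vis) in CELLS.items() if vis & auths}


if __name__ == "__main__":
    for user, auths in USERS.items():
        print(user, sorted(visible_cells(auths)))
```

The filtering happens on the read path, so a user without the right token never receives the cell at all, rather than receiving it and being trusted to ignore it.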
  28. Good News For HBase
     • With release 0.98, HBase also allows cell-based access controls
     • Called ‘tags’
     • Need to upgrade to the HFile v3 (version 3) format
  29. Visibility / Access Controls
     • Both HBase and Accumulo allow access control for the data
     • Cell-level visibility: HBase yes (starting with v0.98), Accumulo yes
  30. 5) Final Step: Encrypt Data At Rest
     • Eventually data ends up on disk
     • We need to protect the ‘raw data’ on disk
     • To prevent
        • Users going to disk directly
        • Theft of hardware
  31. Solution: Encrypt Data Transparently
     • Encryption is done via keys
     • Uses Java Cryptography Extension (JCE)
     • Data is encrypted before writing to HDFS
     • Does not rely on HDFS or Linux level encryption
     • Per-family encryption is supported
     • Encryption at rest: HBase yes, Accumulo yes
  32. HBase & Accumulo: Transparent Encryption
  33. Encryption: Key Management
     • The keys have to be managed carefully…
     • Don’t lose them!
     • Don’t compromise them!!
     • Possible storage mechanisms
        • Database
        • Remote file server
        • Key management server
        • Local file system
  34. Summary
     • Runs in a trusted environment: HBase yes (outside configuration), Accumulo yes (outside configuration)
     • User authentication: HBase Kerberos, Accumulo Kerberos + built-in
     • Secure client communications (via SSL): HBase yes, Accumulo yes
     • Visibility at cell level: HBase yes (starting from v0.98), Accumulo yes
     • Encrypt data at rest: HBase yes, Accumulo yes
  35. Useful Resources
     • Accumulo
        • http://www.slideshare.net/DonaldMiner/accumulo-oct2013bofpresentation
     • HBase
        • http://hbase.apache.org/book/hbase.encryption.server.html
  36. DEMO
  37. Demo Explained
     • Demonstrates the cell-level visibility feature of Accumulo
     • Here is what the data looks like, for row Person1 (column = value, visibility level):
        • Name = Joe Smith, Level 1
        • email = joe@gmail.com, Level 1
        • ssn = 123-45-6789, Level 2
        • Gmail_password = ‘JoeDaMan!’, Top
  38. Demo: Accumulo Users + Visibility
     • root: Table1 access yes; access level all; visible columns all
     • user1: Table1 access yes; access level Level 1; visible columns Name, email
     • user2: Table1 access yes; access level Level 1 + Level 2; visible columns Name, email + SSN
     • esnowden: Table1 access yes; access level Level 1 + Level 2 + Top; visible columns Name, email + SSN + Gmail password :)
     • user3: Table1 access no; access level N/A; visible columns N/A
  39. Thanks & Questions!
     • sujee@ElephantScale.com
     • http://ElephantScale.com
     • Expert consulting & training in Big Data (Hadoop, NoSQL, Spark)
     • Free, online Hadoop book ‘Hadoop illuminated’