New Security Features in Apache HBase 0.98: An Operator's Guide

Speakers: Andrew Purtell and Ramkrishna Vasudevan (Intel)

HBase 0.98 introduces several new security features: visibility labels, cell ACLs, transparent encryption, and coprocessor framework changes. This talk will cover the new capabilities available in HBase 0.98+, the threat models and use cases they cover, how these features stack up against other data stores in the Apache big data ecosystem, and how operators and security architects can take advantage of them.

  1. 1. Security Features in Apache HBase – An Operator’s Guide Anoop Sam John, Andrew Purtell, Ramkrishna S. Vasudevan Committers and PMC Members, Apache HBase, Apache Software Foundation Big Data US Research And Development, Intel v5
  2. 2. Outline • New Security Features in Apache HBase 0.98 • Controlling Access To Data – Role-Based Access Control Using Groups and ACLs – Role-Based Access Control Using Labels – Attribute-Based Access Control Using Labels • Preventing Data Leaks – Transparent Encryption • Performance Considerations
  3. 3. New Security Features in Apache HBase 0.98
  4. 4. Cell Tags • All values written to HBase are stored in cells • Cells can now also carry an arbitrary number of tags – Metadata, considered distinct from the key and the value – Compressed when persisted to HFiles – Server side only • Clients cannot get or send cells with tags directly • Tags will be correctly replicated if cross-cluster replication is enabled
  5. 5. Cell ACLs (HBASE-7662) • Extends the existing HBase ACL model with support for persisting and checking per-cell ACL data in tags – (R)ead, (W)rite, E(X)ecute, (A)dmin, (C)reate – Namespace → Table → Column Family → Cell • Backwards compatible with existing installs and code • Uses existing facilities (operation attributes) to carry cell ACLs to supporting servers
  6. 6. Cell ACLs (HBASE-7662) • Cell ACLs are scoped to the same point in time as the cell itself – Simple and straightforward evolution of security policy over time without expensive updates • We require that mutations have covering permission – The union of the user’s table permissions, CF permissions, and permissions in the most recent visible[1] version, if the value already exists, must allow the pending mutation in order for it to be applied – For Deletes, in addition, all visible prior versions covered by the Delete must allow the Delete – Delete semantics are being refined • Complex Deletes may be rejected; just resubmit as simpler ops • Improved in 0.98.2, likely fully resolved in 0.98.3 [1] “Visible” is defined here as not already covered by a committed delete marker
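
The covering-permission rule on this slide can be sketched as a small model. This is an illustration only, not the AccessController's code: the function names are hypothetical, permissions are modelled as sets of the letters R/W/X/C/A, and the reading that each covered version is checked against the union of table, CF, and that version's cell permissions is an assumption.

```python
# Illustrative model of "covering permission"; not HBase's implementation.
# Permissions are sets of the letters used on the slide: R, W, X, C, A.

def allows(action, table_perms, cf_perms, cell_perms=frozenset()):
    """True if the union of table, CF, and cell permissions contains action."""
    return action in (set(table_perms) | set(cf_perms) | set(cell_perms))


def put_allowed(table_perms, cf_perms, visible_versions):
    """Check a pending write against the most recent visible version.

    visible_versions lists the cell permission sets of versions that are not
    covered by a committed delete marker, newest first (hypothetical layout).
    """
    newest = visible_versions[0] if visible_versions else frozenset()
    return allows("W", table_perms, cf_perms, newest)


def delete_allowed(table_perms, cf_perms, visible_versions):
    """A Delete must be allowed by every visible version it covers."""
    if not visible_versions:
        return allows("W", table_perms, cf_perms)
    return all(allows("W", table_perms, cf_perms, v) for v in visible_versions)
```

Under this model, a user holding only a cell-level W grant on the newest version can overwrite the value but cannot delete older versions that lack the grant, which matches the slide's remark that complex Deletes may be rejected.
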
  7. 7. Cell Labels (HBASE-7663) • Visibility expression support via a new security coprocessor – Labels: arbitrary strings – Expressions: labels joined in boolean expressions – Operators: &, |, !, ( ) – Examples: secret; secret | topsecret; ( secret | topsecret ) & !probationary
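
To make the expression semantics concrete, here is a minimal sketch of an evaluator for the operators listed on this slide. It is not the VisibilityController's parser. The function name `is_visible`, the precedence (`!` binds tightest, then `&`, then `|`), and the treatment of an empty expression as visible to everyone are all assumptions made for illustration.

```python
import re

# Illustrative evaluator for visibility expressions; not HBase's parser.
# Assumed precedence: "!" binds tightest, then "&", then "|".

def is_visible(expression, auths):
    """Return True if a cell labelled with expression is visible to auths."""
    tokens = re.findall(r"[&|!()]|[^\s&|!()]+", expression)
    if not tokens:
        return True  # assumption: an unlabelled cell is visible to everyone
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def parse_or():
        value = parse_and()
        while peek() == "|":
            take()
            rhs = parse_and()  # always parse the right side to consume tokens
            value = value or rhs
        return value

    def parse_and():
        value = parse_not()
        while peek() == "&":
            take()
            rhs = parse_not()
            value = value and rhs
        return value

    def parse_not():
        if peek() == "!":
            take()
            return not parse_not()
        if peek() == "(":
            take()
            value = parse_or()
            if take() != ")":
                raise ValueError("missing closing parenthesis")
            return value
        label = take()
        if label is None or label in ("&", "|", ")"):
            raise ValueError("expected a label")
        return label in auths

    result = parse_or()
    if pos != len(tokens):
        raise ValueError("unexpected token: %r" % peek())
    return result
```

For example, `is_visible("( secret | topsecret ) & !probationary", {"secret"})` evaluates the third expression from the slide for a user whose effective authorizations contain only `secret`.
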
  8. 8. Cell Labels (HBASE-7663) • New admin APIs and new shell commands for label management • The universe of labels and the maximal set of labels for a user are defined up front • Users label cells using visibility expressions • Other users ask for authorizations on Gets and Scans • We build a user’s effective set of authorizations per request in a pluggable way on the server • Scan results are filtered according to the user’s effective authorizations • VisibilityController and AccessController can be used together
  9. 9. Transparent Encryption (HBASE-7544) • Transparent encryption of HBase on disk data – HFile blocks are encrypted as written and decrypted as read – Write ahead log (WAL) serialization is pluggable; we provide new secure writers and readers that encrypt and decrypt edits • Built on a new extensible cryptographic codec and key management framework in HBase • Simple key management – Default provider integrates with the Java Keystore • Per column family configuration – Supports schema design that places sensitive information in only a subset of column families
  10. 10. Transparent Encryption (HBASE-7544)
  11. 11. Endpoint EXEC Grants (HBASE-6104) • HBase ACLs grant a familiar set of privileges to users and groups: – (R)ead, (W)rite, E(X)ecute, (C)reate, (A)dmin • Versions prior to 0.98.0 ignored X • Now access to coprocessor Endpoint invocations can be controlled on a global, per-table, or per-column family basis
  12. 12. Controlling Access To Data
  13. 13. Our Example Schema • A simple user information table – Row key: uid – Column family i: i:fullname, i:nick – Column family pii: pii:address, pii:phone, pii:cc, pii:cvv2, pii:expdate > create 'user', { NAME => 'i', COMPRESSION => 'snappy', VERSIONS => 10 }, { NAME => 'pii', COMPRESSION => 'snappy', VERSIONS => 10 }
  14. 14. Our Example Security Policy • Column family: i
  15. 15. Our Example Security Policy • Column family: pii
  16. 16. Getting Started • Enable HFile V3 – hfile.format.version=3 • Enable SASL+Kerberos authentication – RPC: Follow the steps in section 8.1 of the online manual: https://hbase.apache.org/book/security.html – ZooKeeper: Follow the steps in section 17.2 of the online manual: https://hbase.apache.org/book/zk.sasl.auth.html • Install security coprocessors – hbase.coprocessor.region.classes= org.apache.hadoop.hbase.security.access.AccessController, org.apache.hadoop.hbase.security.visibility.VisibilityController, org.apache.hadoop.hbase.security.token.TokenProvider
  17. 17. Getting Started – hbase.coprocessor.master.classes= org.apache.hadoop.hbase.security.access.AccessController, org.apache.hadoop.hbase.security.visibility.VisibilityController – hbase.coprocessor.regionserver.classes= org.apache.hadoop.hbase.security.access.AccessController • Enable Endpoint exec permission checks – hbase.security.exec.permission.checks=true • [Optional] Enable transport security – hbase.rpc.protection=auth-conf
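
The settings from the two "Getting Started" slides are easier to review as a single site-file fragment. The sketch below renders them in the `<configuration>`/`<property>` layout that Hadoop-style site files use. The helper name `render_site_xml` is hypothetical, the property names and values are copied from the slides, and the SASL/Kerberos settings are left out because the slides defer to the online manual for them.

```python
import xml.etree.ElementTree as ET

# Property names and values are copied from the "Getting Started" slides.
SECURITY_PROPERTIES = {
    "hfile.format.version": "3",
    "hbase.coprocessor.region.classes": ",".join([
        "org.apache.hadoop.hbase.security.access.AccessController",
        "org.apache.hadoop.hbase.security.visibility.VisibilityController",
        "org.apache.hadoop.hbase.security.token.TokenProvider",
    ]),
    "hbase.coprocessor.master.classes": ",".join([
        "org.apache.hadoop.hbase.security.access.AccessController",
        "org.apache.hadoop.hbase.security.visibility.VisibilityController",
    ]),
    "hbase.coprocessor.regionserver.classes":
        "org.apache.hadoop.hbase.security.access.AccessController",
    "hbase.security.exec.permission.checks": "true",
    "hbase.rpc.protection": "auth-conf",  # optional transport security
}


def render_site_xml(properties):
    """Render a dict of settings as a <configuration> XML string."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(render_site_xml(SECURITY_PROPERTIES))
```

The output is meant to be merged by hand into an existing hbase-site.xml rather than used to overwrite it.
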
  18. 18. Role-Based Access Control Using the Hadoop Group Mapping Service and ACLs • Map each role in the organization to an LDAP entity – Employee -> • cn=user, member: ou=users,dc=groups,dc=example,dc=org – Developer -> • cn=developer, member: ou=developers,dc=groups,dc=example,dc=org – Test User Account -> • cn=testuser, member: ou=users,dc=example,dc=org – Service Account -> • cn=service, member: ou=services,dc=example,dc=org – Admin -> • cn=manager,dc=example,dc=org
  19. 19. Role-Based Access Control Using the Hadoop Group Mapping Service and ACLs • Set up the Hadoop group mapper (core-site.xml) – hadoop.security.group.mapping= org.apache.hadoop.security.LdapGroupsMapping – hadoop.security.group.mapping.ldap.url=… – hadoop.security.group.mapping.ldap.bind.user=… – hadoop.security.group.mapping.ldap.search.filter.user= (& (|(objectclass=person)(objectclass=applicationProcess))(cn={0})) – hadoop.security.group.mapping.ldap.search.filter.group= (objectclass=groupofnames) – hadoop.security.group.mapping.ldap.search.attr.member=member – hadoop.security.group.mapping.ldap.search.attr.group.name=cn
  20. 20. Role-Based Access Control Using the Hadoop Group Mapping Service and ACLs • Confirm the configuration is working correctly hbase> whoami service (auth:KERBEROS) groups: services
  21. 21. Role-Based Access Control Using the Hadoop Group Mapping Service and ACLs • Grant permissions to groups and service and test accounts hbase> grant '@admins', 'RWXCA' hbase> grant 'service', 'RWXCA', 'user' hbase> grant '@developers', 'RW', 'user', 'i' hbase> grant 'testuser', 'RW', 'user', 'i' hbase> grant 'user', { '@developers' => 'RW', 'testuser' => 'R' }, { COLUMNS => 'pii', FILTER => "(PrefixFilter ('test'))" } Note: Cell grants done by the shell apply to existing cells only. This is useful for testing. In practice applications must add the desired cell ACL to the operation when submitting writes.
  22. 22. Role-Based Access Control Using Labels • Define labels corresponding to roles in the security policy: admin, service, test, developer
  23. 23. Role-Based Access Control Using Labels • Express access rules as visibility expressions – admin | service – admin | service | test – admin | service | developer – admin | service | developer | test • Define labels hbase> add_labels [ 'admin', 'service', 'developer', 'test' ]
  24. 24. Role-Based Access Control Using Labels • Assign one or more roles to each user by associating their principal with a label set hbase> set_auths 'service', [ 'service' ] hbase> set_auths 'testuser', [ 'test' ] hbase> set_auths 'manager', [ 'admin' ] hbase> set_auths 'dev', [ 'developer' ] hbase> set_auths 'qa', [ 'test', 'developer' ] hbase> …
  25. 25. Role-Based Access Control Using Labels • Apply appropriate visibility expressions to cells hbase> set_visibility 'user', 'admin|service|developer', { COLUMNS => 'i' } hbase> set_visibility 'user', 'admin|service', { COLUMNS => 'pii' } hbase> set_visibility 'user', 'admin|service|developer|test', { COLUMNS => [ 'i', 'pii' ], FILTER => "(PrefixFilter ('test'))" } Note: Visibility expressions added to cells by the shell apply to existing cells only. This is useful for testing. In practice applications must add the desired visibility expression to the operation when submitting writes.
  26. 26. Attribute-Based Access Control • We can construct the effective authorization set for a user in a pluggable and stackable way [Diagram: stacked stages, in order: retrieves principal for user; maps principal to group names; imports auths from request; enforces minimum auths (Auths table); maps identity attributes to auths (Directory)]
  27. 27. Attribute-Based Access Control • LDAP plugin can mix in auths corresponding to attributes of the subject’s identity – Expected soon in 0.98 (maybe 0.98.4) • Directory query: (&(objectClass=person) (userPrincipalName={0})) • Attribute mapping, in the form <attribute>: <regex> -> <auth> – memberOf: .+ -> $1 – division: .+ -> $1 – department: .+ -> $1 – employeeID: P[0-9]+ -> probationary
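
The attribute mapping on this slide can be sketched as follows. The plugin it describes was not yet released at the time of the talk, so this is a hypothetical model rather than its API. The function name and the rule-table layout are invented for illustration, and `$1` is interpreted as "the matched text" because the slide's patterns contain no capture group.

```python
import re

# Hypothetical model of the slide's "<attribute>: <regex> -> <auth>" rules.
# A fixed auth of None means "use the matched text" (our reading of "$1").
ATTRIBUTE_RULES = [
    ("memberOf", r".+", None),
    ("division", r".+", None),
    ("department", r".+", None),
    ("employeeID", r"P[0-9]+", "probationary"),
]


def auths_from_attributes(attributes, rules=ATTRIBUTE_RULES):
    """Map directory attributes (name -> list of values) to a set of auths."""
    auths = set()
    for attr, pattern, fixed_auth in rules:
        for value in attributes.get(attr, []):
            match = re.fullmatch(pattern, value)
            if match:
                auths.add(fixed_auth if fixed_auth is not None else match.group(0))
    return auths
```

With this model, a directory entry whose `employeeID` is `P1234` picks up the `probationary` auth, which a visibility expression such as `developer & !probationary` can then exclude.
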
  28. 28. Attribute-Based Access Control Using Labels • Apply appropriate visibility expressions to cells hbase> set_visibility 'user', 'admin|service|(developer&(!probationary))', { COLUMNS => 'i' } hbase> set_visibility 'user', 'admin|service', { COLUMNS => 'pii' } hbase> set_visibility 'user', 'admin|service|((developer|test)&(!probationary))', { COLUMNS => [ 'i', 'pii' ], FILTER => "(PrefixFilter ('test'))" }
  29. 29. Attribute-Based Access Control Using ACLs • An area of future work – We could consider an HBase-provided replacement for the Hadoop Group Mapper that also supports mapping object attributes to strings – For the VisibilityController, the mapped strings would be interpreted as auths (see slide #27) – For the AccessController, the mapped strings could be interpreted as group names – See HBASE-10919[1] or raise a discussion on user@hbase.apache.org [1] https://issues.apache.org/jira/browse/HBASE-10919
  30. 30. Preventing Data Leaks
  31. 31. Protecting Data At Rest • HBase is deployed into a layered system • Incorrect handling of permissions or storage volumes at the HDFS layer or below could expose sensitive information [Diagram labels: Apache HBase (Master, Master (Standby), RegionServers); Apache ZooKeeper; Apache Hadoop Distributed File System (HDFS) DataNodes]
  32. 32. Getting Started • Create the cluster master key in a KeyStore file $ keytool -keystore hbase.jks -storetype jceks -genseckey -keyalg AES -keysize 128 -storepass secret -alias hbase-master-default • Deploy the KeyStore file to all site configuration directories and restrict local access to it $ chown hbase:hbase hbase.jks $ chmod 0600 hbase.jks (-rw-------) • Enable HFile V3 – hfile.format.version=3
  33. 33. Getting Started • Set up key provider configuration for KeyStore files – hbase.crypto.keyprovider= org.apache.hadoop.hbase.io.crypto.KeyStoreKeyProvider – hbase.crypto.keyprovider.parameters= jceks:///path/to/hbase/conf/hbase.jks?password=secret – hbase.crypto.master.key.name=hbase-master-default • Restrict local access to the site file $ chown hbase:hbase hbase-site.xml $ chmod 0600 hbase-site.xml (-rw-------) • The KeyStore password need not be embedded in the site file – Use ?passwordFile=/path/to/password/file and protect that instead
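
Because both the KeyStore file and the site file must be readable only by their owner, a quick check after deployment can catch a missed `chmod`. The helper below is a minimal sketch; the function name is hypothetical and it checks only the mode bits, not ownership.

```python
import os
import stat


def is_owner_only(path):
    """True if the file grants no permissions to group or others (e.g. 0600)."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & 0o077 == 0
```

It could be run against `hbase.jks`, `hbase-site.xml`, and any password file on each node as part of a deployment check.
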
  34. 34. Getting Started • Enable WAL encryption – hbase.crypto.wal.key.name=hbase-master-default – hbase.regionserver.hlog.reader.impl= org.apache.hadoop.hbase.regionserver.wal.SecureProtobufLogReader – hbase.regionserver.hlog.writer.impl= org.apache.hadoop.hbase.regionserver.wal.SecureProtobufLogWriter – hbase.regionserver.wal.encryption=true • WAL encryption is configured separately from HFile encryption to enable storage management with tiered sensitivity • (JRE 8+) Enable AES-NI acceleration features – Add to hbase-env.sh: -XX:+UseAES -XX:+UseAESIntrinsics
  35. 35. Transparent Encryption • Segregate sensitive information into one or a few column families with HFile encryption enabled – We are storing sensitive personally identifiable customer information in the “pii” family – Enable encryption on “pii” only to mitigate performance impact – After changing schema, run a major compaction to ensure all files are (eventually) transformed hbase> disable 'user' hbase> alter 'user', { NAME => 'pii', COMPRESSION => 'snappy', ENCRYPTION => 'aes' } hbase> enable 'user' hbase> major_compact 'user' [Table: example schema with row key uid, family i (i:fullname, i:nick), and family pii (pii:address, pii:phone, pii:cc, pii:cvv2, pii:expdate)]
  36. 36. Transparent Encryption • Data key management – RegionServers retrieve and unwrap CF keys from descriptors as needed to encrypt HFiles – The data key for a CF can be modified at any time by the admin • Or, encryption can be enabled and disabled entirely • CF encryption is completely reversible! – HFiles contain the data key used for encryption, wrapped (encrypted) by the master key • Supports incremental rekeying without expensive IO or downtime – Simply trigger major compaction to normalize encryption and data keying state over the entire CF • Can be done on a region by region basis with an HBase shell script
  37. 37. Transparent Encryption • Master key rotation – Should be an infrequent operation; an attacker able to observe even all schema and HFiles gains very little information about it over time – Store a copy of the current master key with an alternate alias e.g. “hbase-master-alt” – Replace the master key with a new one – Update site file • hbase.crypto.master.alternate.key.name=hbase-master-alt – Do a rolling restart of all HBase server processes – Trigger a major compaction and wait for completion – Remove the old master key from the KMS and remove alt alias from site – Do another rolling restart of all HBase server processes
  38. 38. Key Providers • Any Key Management System with a Java KeyStore provider can be supported by the KeyStoreKeyProvider • Or natively, via custom HBase KeyProviders • Update site configuration: hbase.crypto.keyprovider, hbase.crypto.keyprovider.parameters [Diagram labels: HBase KeyStoreKeyProvider; HBase YourKeyProvider; JDK KeyStore provider framework; Thales; Luna; CloudHSM; ...]
  39. 39. Cipher Providers • We support alternate or accelerated ciphers with either: 1. Java Cryptography (JCE) algorithm provider • Install a signed JCE provider (supporting “AES/CTR/NoPadding” mode with 128 bit keys) • Add it with highest preference to the JCE site configuration file $JAVA_HOME/lib/security/java.security • Update site configuration: hbase.crypto.algorithm.aes.provider, hbase.crypto.algorithm.rng.provider 2. Custom HBase Cipher implementation • Start at org.apache.hadoop.hbase.io.crypto.CipherProvider • Make it available on the server classpath • Update site configuration: hbase.crypto.cipherprovider
  40. 40. Performance Considerations
  41. 41. WAL Encryption • Performance implications of WAL encryption – As measured by HLogPerformanceEvaluation microbenchmark – Relative differences are what is interesting – WAL throughput ceiling ~10% lower with 7u45 – ~8% lower with 8u20 • Future mitigation: When HDFS storage tiering capability is in production, configure separate storage tiers for WAL and HFile data
      Test | Throughput (ops/sec) | Total cycles | Insns per cycle
      Oracle Java 1.7.0_45-b18, no encryption | 52658.302 | 8878179986750 | 0.47
      Oracle Java 1.7.0_45-b18, AES WAL encryption | 48045.834 | 9911748458387 | 0.57
      OpenJDK 1.8.0_20-b09, no encryption | 54874.125 | 8662634367005 | 0.46
      OpenJDK 1.8.0_20-b09, AES WAL encryption | 50659.507 | 9668111259270 | 0.61
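
Since the slide stresses relative differences, the short calculation below derives the throughput reduction directly from the benchmark figures. The throughput values are copied from the slide's table; the dictionary layout and function name are only illustrative.

```python
# Throughput in ops/sec from the slide: (no encryption, AES WAL encryption).
RESULTS = {
    "Oracle Java 1.7.0_45-b18": (52658.302, 48045.834),
    "OpenJDK 1.8.0_20-b09": (54874.125, 50659.507),
}


def percent_drop(baseline, encrypted):
    """Relative throughput reduction, as a percentage of the baseline."""
    return (baseline - encrypted) / baseline * 100.0


for jvm, (plain, encrypted) in RESULTS.items():
    drop = percent_drop(plain, encrypted)
    print(f"{jvm}: {drop:.1f}% lower with AES WAL encryption")
```

The same function can be applied to a site's own HLogPerformanceEvaluation runs to compare JVM versions or cipher providers.
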
  42. 42. Promoting Common ACLs • When designing security policy for a table, consider that table and column family level grants are inexpensive compared to cell level grants – Table and CF level grants are cached in memory – Cell level grants require region scanning • We consider permissions as the union of grants at all levels; a table or CF grant allows us to early out • If a user will always be granted permissions at the cell level, promote their access to a column family or table level grant
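
The cost difference behind this advice can be sketched as a check order: consult the cached table and CF grants first, and fall back to the expensive cell-level lookup only when they do not already allow the action. This is an illustrative model, not the AccessController's code, and every name in it is hypothetical.

```python
def check_access(user, action, table_grants, cf_grants, cell_grants_for):
    """Return (allowed, cell_lookup_needed) for one user and action.

    table_grants and cf_grants model the in-memory caches as dicts of
    user -> set of permission letters. cell_grants_for is a callable that
    stands in for the region scan needed to read cell-level grants.
    """
    cached = set(table_grants.get(user, ())) | set(cf_grants.get(user, ()))
    if action in cached:
        return True, False  # early out: no cell-level lookup required
    return action in cell_grants_for(user), True
```

A user who is always granted access at the cell level takes the slow path on every request; promoting that grant to the CF or table level moves the user onto the early-out path.
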
  43. 43. End Questions?
