Security needs in Hadoop’s Current and Future – How Apache Ranger can help?

Balaji Ganesan
Don Bosco Durai
Hortonworks

Published in: Technology

  1. Hadoop Summit, Brussels, April 2015. Security needs in Hadoop’s Current and Future – How Apache Ranger can help? Balaji Ganesan, Don Bosco Durai (@Hortonworks). April 16, 2015
  2. Hadoop exacerbates the security challenge. New security requirements:
     • Hadoop as data lake: data being centralized
     • Different methods for accessing the same data
     • Data security for multi-tenant use cases
     • Need for a centralized and consistent approach
     [Diagram labels: Sources (existing systems: ERP, CRM, SCM; clickstream, web & social, geolocation, sensor & machine, server logs, unstructured); HDFS (Hadoop Distributed File System); YARN: Data Operating System (batch, interactive, real-time, partner ISV); MPP; EDW; Analytics (data marts, applications, business analytics, visualization & dashboards)]
  3. Current State of Hadoop Security
  4. Security in Hadoop today: first level of security requirements built in
     • Administration (central management & consistent security): Apache Ranger
     • Authentication (authenticate users and systems): Apache Knox, native Kerberos
     • Authorization (provision access to data): Apache Ranger
     • Audit (maintain a record of data access): Apache Ranger, Hadoop native audit
     • Data Protection (protect data at rest and in motion): HDFS transparent encryption, HBase encryption, vendor solutions
  5. Central Security Administration, Authorization & Audit: Apache Ranger (fka XA Secure)
     • Delivers a ‘single pane of glass’ for the security administrator
     • Centralizes administration of security policy
     • Ensures consistent coverage across HDFS, Hive, HBase, Storm and Knox
  6. Authentication – Kerberos. What does Kerberos do?
     • Establishes identity for clients, hosts and services
     • Prevents impersonation; passwords are never sent over the wire
     • Integrates with enterprise identity management tools such as LDAP & Active Directory
     • More granular auditing of data access and job execution
     Ambari 2.0 automates Kerberos deployment.
  7. Authentication - API Security with Knox. Apache Knox extends the reach of the Hadoop REST API without Kerberos complexities.
     Integrated with existing systems to simplify identity maintenance. Single, simple point of access for a cluster. Central controls ensure consistency across one or more clusters.
     • Eliminates SSH “edge node”
     • Central API management
     • Central audit control
     • Service level authorization
     • SSO integration – Siteminder and OAM*
     • LDAP & AD integration
     • Kerberos encapsulation
     • Single Hadoop access point
     • REST API hierarchy
     • Consolidated API calls
     • Multi-cluster support
  8. Data Protection: Hadoop permits you to apply data protection policy at different layers across the Hadoop stack
     Layer | What? | How?
     Storage | Encrypt data while it is at rest | HDFS file encryption, HBase encryption
     Transmission | Encrypt data as it moves | Supported in Hadoop
  9. Demo (Don Bosco Durai)
  10. Future of Hadoop Security: How Apache Ranger can help?
  11. Security Requirements: beyond basic security
      • Administration (central management & consistent security): tag based policies; extend beyond Hadoop
      • Authentication (authenticate users and systems): single sign-on
      • Authorization (provision access to data): dynamic, attribute based access control (ABAC)
      • Audit (maintain a record of data access): activity monitoring, intrusion detection
      • Data Protection (protect data at rest and in motion): encryption as first class citizen, masking and anonymization
  12. Future of Security – Data Classification w/ Apache Atlas
      Knowledge store categorized with appropriate business-oriented taxonomy:
      • Data sets & objects
      • Tables / Columns
      • Logical context
      • Source, destination
      Supports exchange of metadata between foundation components and third-party applications/governance tools. Leverages existing Hadoop metastores.
      [Diagram labels: Apache Atlas; Knowledge Store (Models, Type-System, Policy Rules, Taxonomies); Audit Store; Policy Engine; Data Lifecycle Management; Security; REST API; Services (Search, Lineage, Exchange); Healthcare (HIPAA, HL7); Financial (SOX, Dodd-Frank); Custom (CWM); Retail (PCI, PII); Other]
  13. Current Ranger Setup
      Hive policy:
      Table1, Col A | Marketing | Select
      Table 2, All | IT Admin | Create
      [Diagram labels: Source Data; ETL, Data Ingest (Sqoop, Flume); HDFS; HiveServer 2; A, B, C; Beeline Client; Ranger]
  14. Future of Security – Tag based Policies
      Data classification:
      Table1, Col A | “Campaign”
      Table 2 | “Logs”
      Tag policy:
      Campaign | Marketing | Select
      Logs | IT Admin | Create
      [Diagram labels: Source Data; ETL, Data Ingest (Flume, Sqoop); HDFS; HiveServer 2; A, B, C; Beeline Client; Ranger; Metadata Server]
  15. Future of Security - Administration: centralized administration across big data applications
      • Ranger provides a pluggable architecture for policy administration and enforcement
      Future needs:
      • Custom plugins can be created for any data store and hooked up to Ranger admin
      • Build plugins to manage ACLs for big data BI applications, EDW
      • Provides a “single pane of glass” for end users managing security for the entire big data environment
  16. Future of Security – Centralized Administration: Ranger Stacks
      • Easily add a new “service” to Ranger
      • Enable customers and partners to add new component support easily
      [Diagram labels: Ranger Administration Portal; Ranger Policy Server; Ranger Audit Server; HDFS, HiveServer2 and HBase, each with a Ranger Plugin; New Service with Ranger Plugin*]
  17. Future of Security – Adding new service to Ranger: adding a new service using JSON
  18. Future of Security – Adding new plugins
      [Diagram labels: Component Process (e.g. HiveServer2); Create/Insert, Edit/Update, View/Select, Other Actions; Check Permission; Permission Interface; Ranger Implementation; Ranger Policy Cache; Ranger Policy Admin DB; Ranger Centralized Audit Store]
  19. Future of Security - Authorization: dynamic, attribute based access control (ABAC)
      • Ranger currently provides hooks to embed dynamic rules in the policies
      Future security needs:
      • Extend Ranger to support data or user attributes in policy decisions
      • Examples: use the geolocation of users to determine access; make access available only between 9 a.m. and 5 p.m. local time
  20. Ranger – Dynamic Policy Conditions
  21. Future of Security - Auditing: monitoring and intrusion detection through audit data
      • Ranger currently captures detailed audit data and stores it in HDFS or an RDBMS
      Future work:
      • Ranger can stream audit data through Kafka and Storm into multiple datastores
      • Add support for correlation and processing in Storm
      • Alerts based on rules
      • Add support for feeding in audit data from external sources (network events, syslogs, etc.)
      • Ranger UI can provide a dashboard to monitor audit events
  22. Future of Security - Auditing
      [Diagram labels: Ranger Audit; Hive; Storm; Kafka; Solr; Other Audit Logs (Network, SNMP); Add context, Enrich, Alerts; Long term store, Query; Interactive Audit Query; Analytical Applications]
  23. Future of Security – Data Protection: encryption as first class citizen
      • Encryption introduced in HDFS and HBase
      Future roadmap:
      • Build native encryption support in HDFS, Hive and HBase
      • Ranger based key management to support encryption
      • Authorization policies for KMS in Ranger
      • Column level masking supported in Hive, Phoenix
  24. Ranger Community: How to contribute?
  25. Apache Ranger Resources (ranger.incubator.apache.org)
  26. Ranger Resources - Wiki
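
The tag based policy slide (slide 14) separates data classification from policy: resources are tagged once, and policies are written against tags instead of individual tables. A minimal Python sketch of that lookup follows. It reuses the example rows from the slide; the class and function names, the table-level fallback, and the deny-by-default behavior are assumptions made for illustration and are not Ranger's or Atlas's actual API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TagPolicy:
    """One row of the slide's tag policy: tag | group | permission."""
    tag: str
    group: str
    permission: str


# Data classification rows from slide 14: (table, column) -> tag.
# A column of None means the tag applies to the whole table (assumption).
CLASSIFICATION = {
    ("Table1", "Col A"): "Campaign",
    ("Table 2", None): "Logs",
}

# Tag policy rows from slide 14.
TAG_POLICIES = [
    TagPolicy("Campaign", "Marketing", "Select"),
    TagPolicy("Logs", "IT Admin", "Create"),
]


def tag_for(table: str, column: Optional[str]) -> Optional[str]:
    """Look up a column-level tag first, then fall back to a table-level tag."""
    return CLASSIFICATION.get((table, column)) or CLASSIFICATION.get((table, None))


def is_allowed(group: str, permission: str, table: str,
               column: Optional[str] = None) -> bool:
    """Allow only if the resource carries a tag with a matching policy.

    Untagged resources are denied here; that default is an assumption.
    """
    tag = tag_for(table, column)
    if tag is None:
        return False
    return any(
        p.tag == tag and p.group == group and p.permission == permission
        for p in TAG_POLICIES
    )
```

With this shape, newly ingested data is covered as soon as the metadata server tags it, without anyone editing the policy list.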

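The ABAC examples on slide 19 (geolocation of the user, access only between 9 a.m. and 5 p.m. local time) can be read as conditions evaluated per request in addition to the static policy. The sketch below shows one way such conditions might look in Python. The request fields, the country code as a stand-in for geolocation, and the fixed UTC offset used for "local time" are all hypothetical; this is not Ranger's policy condition interface.

```python
from dataclasses import dataclass
from datetime import datetime, time, timedelta, timezone


@dataclass(frozen=True)
class AccessRequest:
    """Hypothetical request attributes used by the dynamic conditions."""
    user: str
    country: str             # stand-in for a geolocation attribute
    when_utc: datetime       # timezone-aware UTC timestamp of the request
    utc_offset_hours: float  # user's local offset from UTC (assumed known)


def within_business_hours(req: AccessRequest,
                          start: time = time(9, 0),
                          end: time = time(17, 0)) -> bool:
    """True if the request falls in [start, end) in the user's local time."""
    local_tz = timezone(timedelta(hours=req.utc_offset_hours))
    local = req.when_utc.astimezone(local_tz)
    return start <= local.time() < end


def in_allowed_country(req: AccessRequest, allowed: set) -> bool:
    """True if the user's (hypothetical) geolocation is on the allow list."""
    return req.country in allowed


def evaluate(req: AccessRequest, allowed_countries: set) -> bool:
    """Grant access only when every dynamic condition holds."""
    return within_business_hours(req) and in_allowed_country(req, allowed_countries)
```

A production version would resolve the user's time zone by name so that daylight saving changes are handled; the fixed offset keeps the sketch free of external time zone data.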
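Slide 21 lists "alerts based on rules" over streamed audit data as future work. As a rough illustration of what one such rule could look like, the Python sketch below flags users who accumulate several denied requests inside a short time window. The event fields (`user`, `ts`, `allowed`), the threshold, and the window length are invented for this example and do not reflect Ranger's audit schema or any Storm topology.

```python
from collections import defaultdict, deque


def denied_access_alerts(events, threshold=3, window_seconds=60):
    """Return (user, timestamp, denied_count) for each alert raised.

    `events` is an iterable of dicts with hypothetical keys:
    'user' (str), 'ts' (seconds, ascending order), 'allowed' (bool).
    An alert fires whenever a user has at least `threshold` denied
    requests within the trailing `window_seconds`.
    """
    recent = defaultdict(deque)  # user -> timestamps of recent denials
    alerts = []
    for ev in events:
        if ev["allowed"]:
            continue  # only denied requests count toward the rule
        q = recent[ev["user"]]
        q.append(ev["ts"])
        # Drop denials that have fallen out of the sliding window.
        while q and ev["ts"] - q[0] > window_seconds:
            q.popleft()
        if len(q) >= threshold:
            alerts.append((ev["user"], ev["ts"], len(q)))
    return alerts
```

In a streaming setup the same per-user state would live inside the stream processor rather than in a local dictionary.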