Hadoop Operations: How to Secure and Control Cluster Access
 



Learn about the different aspects of securing a multi-tenant cluster, how to deploy a secure cluster, and about data asset security and control.

Presentation Transcript

    • 1 Hadoop Operations: How to Secure and Control Cluster Access
      Eric Sammer, Engineering Manager, Cloudera – Author, Hadoop Operations
    • 2 We’re here to talk about…
      • How common security constructs map onto services
      • How these constructs work in Hadoop
      • Security model and options for a few critical components
      • A few DOs and DON’Ts
    • 3 Warning
      • Security in distributed systems is complicated
      • This is just a whirlwind tour – do your homework
      • Assumptions
        • You’re familiar with Hadoop’s architecture and functionality
        • You have a basic understanding of Kerberos
    • 4 The Three Questions
      • Identity: Who are you?
      • Authentication: Can you prove it?
      • Authorization: Are you allowed to do that?
    • 5 Hadoop’s “Simple” Mode
      • Identity: Usually the OS user of the client application
      • Authentication: Trust
      • Easy to impersonate other users
      • Only stops good users from doing silly things
      • The default
    • 6 Hadoop’s “Simple” Mode
      • Use simple mode when:
        • No regulatory or compliance concerns
        • All users are trusted
        • Single-purpose cluster (single-tenancy)
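
To make "Authentication: Trust" concrete, here is a minimal Python sketch of how a simple-mode client identity can be modeled. It is a toy model, not Hadoop code: the function names are hypothetical, and the `HADOOP_USER_NAME` environment variable is how the client-side override commonly works in simple mode, which you should verify against your own release.

```python
def claimed_identity(env, os_user):
    """Return the user name a simple-mode client would claim.

    Assumption: an explicit HADOOP_USER_NAME override wins over the
    OS user. Verify this against your Hadoop release.
    """
    return env.get("HADOOP_USER_NAME") or os_user


def server_accepts(claimed_user):
    """Model of simple-mode authentication: the claim is trusted as-is."""
    return claimed_user  # no proof is requested or checked


# An ordinary client is identified by its OS user.
honest = server_accepts(claimed_identity({}, "alice"))

# Any client can claim to be someone else, including a service user.
spoofed = server_accepts(claimed_identity({"HADOOP_USER_NAME": "hdfs"}, "alice"))
```

Because nothing is verified, this mode only guards against accidents, which is why the slides restrict it to trusted, single-tenant clusters.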
    • 7 Hadoop’s “Secure” Mode
      • Identity: Local part of the Kerberos principal
      • Authentication: Kerberos
      • User impersonation not possible except in specific (admin-configured) situations
    • 8 Hadoop’s “Secure” Mode
      • Use secure mode when:
        • Real regulatory concerns
        • Untrusted users
        • Running on untrusted infrastructure or in an untrusted environment
        • Multi-purpose cluster (multi-tenancy)
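
The "local part of the Kerberos principal" can be illustrated with a small Python sketch. It assumes the usual `primary/instance@REALM` principal layout and a single default realm. The function names are hypothetical, and a real cluster maps principals to short names through configurable rules, so treat this as the simplest case only.

```python
def parse_principal(principal, default_realm):
    """Split 'primary/instance@REALM' into its three components."""
    name, sep, realm = principal.rpartition("@")
    if not sep:  # no '@' present: the whole string is the name
        name, realm = principal, default_realm
    primary, _, instance = name.partition("/")
    return primary, instance, realm


def short_name(principal, default_realm):
    """Return the local part, rejecting principals from other realms.

    Assumption: only the default realm is trusted. Cross-realm setups
    need explicit mapping rules, which this sketch does not model.
    """
    primary, _instance, realm = parse_principal(principal, default_realm)
    if realm != default_realm:
        raise ValueError(f"no mapping for realm {realm!r}")
    return primary


user = short_name("alice@EXAMPLE.COM", "EXAMPLE.COM")
service = short_name("hdfs/nn01.example.com@EXAMPLE.COM", "EXAMPLE.COM")
```

Here `user` is `alice` and `service` is `hdfs`; the host name and realm in the example are placeholders.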
    • 9 Identity Management
      • Always
        • Use a central user database/directory service for OS users
        • Wire up the Kerberos KDC to use the central directory
      • Never
        • Use service users (e.g. hdfs, mapred) for anything other than running services
        • Share accounts, even for admin purposes
    • 10 Authentication
      • Simple mode: Trust what the client provides
      • Secure mode: Kerberos
        • Keytabs for services
        • Many options for users: passphrase, M/TFA, X.509
        • Depends on Kerberos implementation
    • 11 Authorization
      • Inherently service-specific
      • Granularity of control varies by platform component
      • Examples
        • Filesystem object-level, POSIX-style
        • Role-based access control (RBAC)
        • Access control lists (ACLs)
        • Deferral to underlying components
    • 12 HDFS Security Model
      • POSIX-style users and groups
      • Traditional Unix-style octal permissions
        • Files: no execute, sticky, setuid, setgid
        • Directories: no setuid, always behave as if setgid is set
      • Authorization checks performed by NameNode
    • 13 HDFS User Levels
      • Cluster super user
        • Privileges: All
        • Description and notes: User who started the daemons. Default: hdfs
      • Administrators
        • Privileges: All
        • Description and notes: Configuration property dfs.permissions.supergroup specifies the name of the group of admins. Default: supergroup
      • Normal user
        • Privileges: Object-level
        • Description and notes: All other users are beholden to the file and directory permissions, as specified.
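
The HDFS model on the two slides above can be sketched in Python as a POSIX-style check with a super user and admin group that bypass it. This is an illustrative model rather than NameNode code: the `Inode` class and `is_allowed` function are hypothetical names, and the sketch ignores directory traversal checks along the path.

```python
from dataclasses import dataclass

READ, WRITE, EXECUTE = 4, 2, 1  # standard octal permission bits


@dataclass
class Inode:
    owner: str
    group: str
    mode: int  # e.g. 0o640


def is_allowed(user, groups, inode, wanted,
               superuser="hdfs", supergroup="supergroup"):
    """Return True if `user` may perform `wanted` on `inode`."""
    # The super user and members of the admin group skip all checks.
    if user == superuser or supergroup in groups:
        return True
    # Exactly one permission class applies: owner, then group, then other.
    if user == inode.owner:
        bits = (inode.mode >> 6) & 7
    elif inode.group in groups:
        bits = (inode.mode >> 3) & 7
    else:
        bits = inode.mode & 7
    return (bits & wanted) == wanted


report = Inode(owner="alice", group="analysts", mode=0o640)
owner_can_write = is_allowed("alice", ["analysts"], report, WRITE)
group_can_write = is_allowed("bob", ["analysts"], report, WRITE)
other_can_read = is_allowed("carol", ["sales"], report, READ)
```

With mode 640, the owner can write, group members can only read, and everyone else is denied, unless they are the super user or belong to the admin group.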
    • 14 MapReduce Security Model
      • Configurable job queues
      • Queues have associated ACLs
      • ACLs control job submission and administrative ops
      • Authorization checks performed by JobTracker
    • 15 MapReduce User Levels
      • Cluster super user
        • Privileges: All
        • Queue: All
        • Description and notes: User who started the daemons. Default: mapred
      • Cluster admins
        • Privileges: All
        • Queue: All
        • Description and notes: Configuration property mapred.cluster.administrators specifies the admin ACL.
      • Queue admins
        • Privileges: All
        • Queue: Single
        • Description and notes: Configuration property mapred.queue.queue-name.acl-administer-jobs specifies the admin ACL.
      • Job owner
        • Privileges: Submit, Admin on own jobs
        • Queue: Queue containing job
        • Description and notes: Configuration property mapred.queue.queue-name.acl-submit-job specifies the submission ACL.
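
A queue ACL check along these lines can be sketched in Python. The sketch assumes the common Hadoop ACL string layout (comma-separated users, a space, then comma-separated groups, with `*` meaning everyone). The helper names, queue names, and the plain dict standing in for the configuration are hypothetical.

```python
def acl_allows(acl, user, groups):
    """Check a 'user1,user2 group1,group2' style ACL string."""
    if acl.strip() == "*":
        return True  # wildcard: everyone is allowed
    users_part, _, groups_part = acl.partition(" ")
    allowed_users = {u.strip() for u in users_part.split(",") if u.strip()}
    allowed_groups = {g.strip() for g in groups_part.split(",") if g.strip()}
    return user in allowed_users or bool(allowed_groups & set(groups))


def can_submit(conf, queue, user, groups):
    """Look up the queue's submission ACL and evaluate it."""
    key = f"mapred.queue.{queue}.acl-submit-job"
    return acl_allows(conf[key], user, groups)


# Hypothetical configuration for three queues.
conf = {
    "mapred.queue.etl.acl-submit-job": "alice,bob analysts",
    "mapred.queue.adhoc.acl-submit-job": "*",
    "mapred.queue.ops.acl-submit-job": " admins",  # groups only
}

etl_ok = can_submit(conf, "etl", "carol", ["analysts"])
etl_denied = can_submit(conf, "etl", "dave", ["sales"])
```

The same `acl_allows` helper would serve for the administrative ACLs; only the property name changes.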
    • 16 Systems on top of MapReduce
      • Hive/Impala are the most featureful today
        • Without Sentry: Defers to HDFS object permissions
        • With Sentry: Fine-grained RBAC on logical constructs (New!)
          • Scope: Server, database, table, view
          • Privileges: ALL, SELECT, INSERT, TRANSFORM
          • Removes direct access to files
          • Supports traditional techniques for controlling column-level access (i.e. views without sensitive columns)
      • Everything else: HDFS object permissions
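
The role-based model described for Sentry can be sketched in Python as groups that map to roles, and roles that hold privileges on a scope. This is a toy model under stated assumptions, not Sentry's API or policy format: every name below is hypothetical, and it assumes a grant on a wider scope (server or database) covers everything beneath it.

```python
def covers(granted_scope, requested_scope):
    """A grant on ('server', 'db') covers ('server', 'db', 'table')."""
    return requested_scope[:len(granted_scope)] == granted_scope


def is_authorized(group_roles, role_grants, groups, scope, action):
    """Return True if any role held via `groups` permits `action` on `scope`."""
    for group in groups:
        for role in group_roles.get(group, []):
            for granted_scope, granted_action in role_grants.get(role, []):
                if covers(granted_scope, scope) and granted_action in ("ALL", action):
                    return True
    return False


# Hypothetical policy: analysts read one table, ETL owns a database.
group_roles = {"analysts": ["analyst_role"], "etl_users": ["etl_role"]}
role_grants = {
    "analyst_role": [(("server1", "sales", "orders"), "SELECT")],
    "etl_role": [(("server1", "sales"), "ALL")],
}

orders = ("server1", "sales", "orders")
analyst_can_select = is_authorized(group_roles, role_grants, ["analysts"], orders, "SELECT")
analyst_can_insert = is_authorized(group_roles, role_grants, ["analysts"], orders, "INSERT")
```

Column-level control fits the same model: grant SELECT on a view that omits the sensitive columns instead of on the underlying table.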
    • 17 A note on auditing...
      • Winds up being service-specific
      • Cloudera Navigator handles this (and more)
    • 18 What we didn’t talk about
      • Configuration and deployment
        • Lots of options, lots of moving parts
        • Integration with existing infrastructure
        • Cloudera Manager turns days or weeks of work into minutes or hours; built to handle exactly these challenges
      • The other 80%: YARN applications, ZooKeeper, Flume, Sqoop, Oozie, Hue, Cloudera Search (Solr), multi-tenant gateway services, all of the administrative web interfaces, encryption of data at rest and on the wire, network footprint and exposure, ...
    • 19 Further reading and references
      • Hadoop Operations, Chapter 6: Identity, Authentication, and Authorization (E. Sammer, O’Reilly)
      • Kerberos: The Definitive Guide (J. Garman, O’Reilly)
      • CDH4 Security Guide
      • CDH4 Sentry Guide
      • Cloudera Manager
      • Cloudera Navigator
      Submit questions in the Q&A panel
      Watch on-demand video of this webinar and many more at http://cloudera.com
      Follow Eric @esammer
      Follow Cloudera @ClouderaU
      Learn more at Strata + Hadoop World: http://tinyurl.com/hadoopworld
      Thank you for attending!