LDAP AND SAML IN HUE
Abraham Elmahrek
WHAT IS HUE?
WEB INTERFACE FOR MAKING HADOOP EASIER TO USE

Suite of apps for each Hadoop component, like Hive, Pig, Impala, Oozie, Solr, Sqoop2, HBase...
VIEW FROM 30K FEET
Hadoop
Web Server
You, and even that friend that uses IE9 ;)
ECOSYSTEM
[Figure: the Hue ecosystem.
Apps: Pig, Job Browser, Job Designer, Oozie, Hive, Impala, Metastore Browser, Search, HBase Browser, Sqoop, ZooKeeper, User Admin, DB Query, Spark, Home, File Browser, ...
Hue Plugins
Services: YARN, JobTracker, Oozie, Pig, HDFS, HiveServer2, Hive Metastore, Cloudera Impala, Solr, HBase, Sqoop2, Zookeeper, LDAP, SAML]
TARGET OF HUE
GETTING STARTED WITH HADOOP

BEING PRODUCTIVE EXPLORING DIFFERENT ANGLES OF THE PLATFORM

LET ANY USER FOCUS ON BIG DATA PROCESSING

THE CORE TEAM PLAYERS
team.gethue.com
ABRAHAM ELMAHREK
ROMAIN RIGAUX
ENRICO BERTI
CHANG BEER
HISTORY

HUE 1
Desktop-like UI in a browser. It did its job, but it was
pretty slow, leaked memory, and was not very IE friendly.
Still, definitely advanced for its time (2009-2010).
HISTORY

HUE 2
The first flat structure port,
with Twitter Bootstrap all
over the place.
HISTORY

HUE 2.5
New apps and an improved UX,
with nice new features like
autocomplete and drag & drop.
HISTORY

HUE 3 ALPHA
Proposed design, didn’t
make it.
HISTORY

HUE 3
Transition to the new UI,
major improvements and
new apps.
HISTORY

HUE 3.5+
Where we are now: new UI,
several new apps, and the most
user-friendly features to date.
LDAP
INTRO

1. Hierarchical entries
2. Entries contain attributes
3. Attributes available are defined by object classes
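
The three points above can be sketched in a few lines of Python. This is only an illustration of the data model, not how Hue or any LDAP library represents entries: the `Entry` class, the helper names, and the `example.com` DNs are all made up for the example, and the DN parsing ignores escaped commas and multi-valued RDNs.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


def parse_dn(dn: str) -> List[Tuple[str, str]]:
    """Split a simple DN into (attribute, value) pairs, most specific first.

    Simplification: escaped commas and multi-valued RDNs are not handled.
    """
    pairs = []
    for rdn in dn.split(","):
        attr, _, value = rdn.strip().partition("=")
        pairs.append((attr.lower(), value))
    return pairs


def parent_dn(dn: str) -> str:
    """Return the DN of the parent entry: everything after the first RDN."""
    _, _, rest = dn.partition(",")
    return rest.strip()


@dataclass
class Entry:
    """Hypothetical in-memory entry: a DN plus multi-valued attributes."""
    dn: str
    attributes: Dict[str, List[str]] = field(default_factory=dict)

    def object_classes(self) -> set:
        # The objectClass attribute determines which other attributes are allowed.
        return {oc.lower() for oc in self.attributes.get("objectClass", [])}


alice = Entry(
    dn="uid=alice,ou=People,dc=example,dc=com",
    attributes={"objectClass": ["person", "posixAccount"], "uid": ["alice"]},
)
```

Here `parent_dn(alice.dn)` gives the `ou=People` container, which is the hierarchy the first bullet refers to.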
TWO KINDS OF PROBLEMS

DIRECT BIND
Authenticate against a
directory service using
simple direct bind.
SEARCH
Authenticate, import,
synchronize, etc. against an
LDAP service by searching
for a particular entry
EXISTING FEATURES

LOGIN
Authenticate against a
directory service using
simple direct bind or search
for the distinguished name
to bind with.
USERADMIN
Add new users and groups;
synchronize existing users
and groups; support POSIX
accounts, POSIX groups, DN
import, general LDAP search,
etc.
CLI
Command line interface for
synchronizing LDAP users
and groups.
SUBGROUPS
Import subgroups and
members of subgroups when
synchronizing a group.
Subgroup defined as a
subordinate group.
LOWERCASE
Force usernames to lower
case.
CONFIGURABLE
User filter, user name
attribute, group filter, group
name attribute
NEW FEATURES
MULTIDOMAIN
Be able to choose which
domain to authenticate
against.
NESTED GROUPS
Be able to import nested
groups and members of
nested groups.
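
As a rough picture of what importing members of nested groups involves, here is a minimal sketch that walks a group whose members may themselves be groups. It runs against an in-memory dictionary rather than a real directory, and the function name and the `example.com` DNs are invented for the illustration; Hue's actual import code is not shown in these slides.

```python
from typing import Dict, List, Optional, Set


def resolve_members(group_dn: str,
                    groups: Dict[str, List[str]],
                    seen: Optional[Set[str]] = None) -> Set[str]:
    """Return all user DNs reachable from group_dn, following nested groups.

    `groups` maps a group DN to its member DNs. A member that is itself a key
    of `groups` is treated as a nested group; `seen` guards against cycles.
    """
    seen = set() if seen is None else seen
    if group_dn in seen:
        return set()
    seen.add(group_dn)

    users: Set[str] = set()
    for member in groups.get(group_dn, []):
        if member in groups:
            users |= resolve_members(member, groups, seen)
        else:
            users.add(member)
    return users


# Hypothetical directory: "eng" contains "data", and "data" points back at "eng".
DIRECTORY = {
    "cn=eng,ou=Groups,dc=example,dc=com": [
        "uid=alice,ou=People,dc=example,dc=com",
        "cn=data,ou=Groups,dc=example,dc=com",
    ],
    "cn=data,ou=Groups,dc=example,dc=com": [
        "uid=bob,ou=People,dc=example,dc=com",
        "cn=eng,ou=Groups,dc=example,dc=com",
    ],
}
```

The cycle guard matters in practice because nothing in a directory prevents two groups from listing each other as members.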
EXAMPLE CONFIGURATIONS - BASIC
[[ldap]]
[[[ldap_servers]]]
[[[[mycompany]]]]
base_dn="DC=hue-search,DC=ent,DC=cloudera,DC=com"
ldap_url=ldap://hue-search.ent.cloudera.com
bind_dn="CN=Directory Manager"
bind_password=cloudera

[[[[[users]]]]]
user_filter="objectclass=Person"
user_name_attr=uid

[[[[[groups]]]]]
group_filter="objectclass=groupOfNames"
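
To make the role of `user_filter` and `user_name_attr` concrete, the sketch below shows one plausible way a search filter for a single user could be assembled from those two settings. The function names are hypothetical and the exact filter Hue builds may differ; the escaping follows the LDAP filter rules in RFC 4515.

```python
def escape_filter_value(value: str) -> str:
    """Escape characters that are special inside an LDAP search filter (RFC 4515)."""
    replacements = {
        "\\": r"\5c",
        "*": r"\2a",
        "(": r"\28",
        ")": r"\29",
        "\x00": r"\00",
    }
    return "".join(replacements.get(ch, ch) for ch in value)


def build_user_search_filter(user_filter: str, user_name_attr: str, username: str) -> str:
    """Combine the configured filter with an equality match on the username attribute."""
    # Accept the filter with or without surrounding parentheses.
    clause = user_filter if user_filter.startswith("(") else "(%s)" % user_filter
    return "(&%s(%s=%s))" % (clause, user_name_attr, escape_filter_value(username))
```

With the BASIC settings, `build_user_search_filter("objectclass=Person", "uid", "alice")` yields `(&(objectclass=Person)(uid=alice))`. Escaping the username keeps a login like `a*` from turning into a wildcard search.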
EXAMPLE CONFIGURATIONS - LDAPS
[[ldap]]
[[[ldap_servers]]]
[[[[mycompany]]]]
base_dn="DC=hue-search,DC=ent,DC=cloudera,DC=com"
ldap_url=ldaps://hue-search.ent.cloudera.com
bind_dn="CN=Directory Manager"
bind_password=cloudera
ldap_cert=/etc/certs/root-ca-cert.pem

[[[[[users]]]]]
user_filter="objectclass=Person"
user_name_attr=uid

[[[[[groups]]]]]
group_filter="objectclass=groupOfNames"
EXAMPLE CONFIGURATIONS - NESTED GROUPS
[[ldap]]
[[[ldap_servers]]]
[[[[mycompany]]]]
base_dn="DC=hue-search,DC=ent,DC=cloudera,DC=com"
ldap_url=ldap://hue-search.ent.cloudera.com
bind_dn="CN=Directory Manager"
bind_password=cloudera
subgroups=nested

[[[[[users]]]]]
user_filter="objectclass=Person"
user_name_attr=uid

[[[[[groups]]]]]
group_filter="objectclass=groupOfNames"
EXAMPLE CONFIGURATIONS - DIRECT BIND
[[ldap]]
[[[ldap_servers]]]
[[[[mycompany]]]]
base_dn="DC=hue-search,DC=ent,DC=cloudera,DC=com"
ldap_url=ldap://hue-search.ent.cloudera.com
bind_dn="CN=Directory Manager"
bind_password=cloudera
search_bind_authentication=false
ldap_username_pattern="uid=<username>,ou=People,dc=hue-search,dc=ent,dc=cloudera,dc=com"

[[[[[users]]]]]
user_filter="objectclass=Person"
user_name_attr=uid

[[[[[groups]]]]]
group_filter="objectclass=groupOfNames"
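
With direct bind there is no search step: the DN to bind with is derived from the login name by filling in `ldap_username_pattern`. A minimal sketch of that substitution follows. The function names are hypothetical, and the RFC 4514 escaping of the username is an addition for safety, not something the slides say Hue does.

```python
def escape_dn_value(value: str) -> str:
    """Escape a value for use inside a distinguished name (RFC 4514)."""
    special = ',+"\\<>;='
    last = len(value) - 1
    escaped = []
    for i, ch in enumerate(value):
        if ch == "\x00":
            escaped.append("\\00")
        elif ch in special or (i == 0 and ch in " #") or (i == last and ch == " "):
            escaped.append("\\" + ch)
        else:
            escaped.append(ch)
    return "".join(escaped)


def direct_bind_dn(username: str, username_pattern: str) -> str:
    """Fill the <username> placeholder of the pattern with the escaped login name."""
    return username_pattern.replace("<username>", escape_dn_value(username))


PATTERN = "uid=<username>,ou=People,dc=hue-search,dc=ent,dc=cloudera,dc=com"
```

For example, `direct_bind_dn("alice", PATTERN)` produces `uid=alice,ou=People,dc=hue-search,dc=ent,dc=cloudera,dc=com`, which is then used together with the user's password in a simple bind.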
EXAMPLE CONFIGURATIONS - ACTIVE DIRECTORY
[[ldap]]
[[[ldap_servers]]]
[[[[mycompany]]]]
base_dn="DC=hue-search,DC=ent,DC=cloudera,DC=com"
ldap_url=ldap://hue-search.ent.cloudera.com
bind_dn="CN=Directory Manager"
bind_password=cloudera
search_bind_authentication=false
nt_domain=cloudera.com

[[[[[users]]]]]
user_filter="objectclass=Person"
user_name_attr=uid

[[[[[groups]]]]]
group_filter="objectclass=groupOfNames"
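
For Active Directory, the bind name is usually not a DN at all. As far as I understand the `nt_domain` setting, the login name is combined with the domain into a `user@domain` name and that is what gets bound. The helper below is a hypothetical illustration of that form, not Hue's code.

```python
def ad_bind_name(username: str, nt_domain: str) -> str:
    """Build the user@domain name used for a direct bind against Active Directory."""
    return "%s@%s" % (username, nt_domain)
```

With the settings above, a user logging in as `alice` would bind as `alice@cloudera.com`.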
EXAMPLE CONFIGURATIONS - ADVANCED
[[ldap]]
subgroups=nested
ignore_username_case=true
force_username_lowercase=true

[[[ldap_servers]]]
[[[[mycompany]]]]
base_dn="DC=hue-search,DC=ent,DC=cloudera,DC=com"
ldap_url=ldap://hue-search.ent.cloudera.com
bind_dn="CN=Directory Manager"
bind_password=cloudera

[[[[[users]]]]]
user_filter="objectclass=Person"
user_name_attr=samaccountname

[[[[[groups]]]]]
group_filter="objectclass=groupOfNames"
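
The two case-related flags in the ADVANCED example are easy to confuse. The sketch below shows one reading of them: `force_username_lowercase` changes the name that gets stored, while `ignore_username_case` only changes how an incoming name is matched against existing users. The function names and the exact semantics are assumptions made for illustration; check the Hue documentation for the authoritative behavior.

```python
from typing import Iterable, Optional


def normalize_username(username: str, force_username_lowercase: bool) -> str:
    """Return the name to store: lowercased when the flag is on, unchanged otherwise."""
    return username.lower() if force_username_lowercase else username


def find_existing(username: str,
                  existing: Iterable[str],
                  ignore_username_case: bool) -> Optional[str]:
    """Return the stored username matching the login name, or None if there is none."""
    for candidate in existing:
        if candidate == username:
            return candidate
        if ignore_username_case and candidate.lower() == username.lower():
            return candidate
    return None
```

Under this reading, a directory that returns `JDoe` ends up as the Hue user `jdoe`, and a later login as `JDOE` still maps to that same account instead of creating a duplicate.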
SAML
INTRO

1. Service provider (SP)
2. Identity provider (IdP)
3. Signed/encrypted requests and responses
4. IdP identity source can be LDAP
5. Secure SSO as defined by the OASIS group standards
THE CHALLENGES
LIBRARIES
Python libraries have bad
licenses, are poorly written,
and rely on system libraries
not found in primary
repositories.
COMPLEX CONFIGURATION
Service Provider and Identity
Provider definition is
obscure. Protocol is
configurable. Every IdP is
slightly different.
TESTABILITY
Open source IdPs are
incomplete. We use
Shibboleth.
THE BREAK DOWN
HACKING
https://github.com/abec/djangosaml2
https://github.com/abec/pysaml2
PACKAGING/CONFIGURATION
Do not package SAML
libraries. Instead, require
users to install manually.
Configure via Hue.
TESTABILITY
We need help!
[libsaml]
xmlsec_binary=/opt/local/bin/xmlsec1
entity_id="http://192.168.92.1:8080/saml2/metadata/"
metadata_file=/Users/abe/Desktop/idp-metadata.xml
key_file=/Users/abe/Desktop/idp.key
cert_file=/Users/abe/Desktop/idp.crt
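
Since most SAML setup problems come down to a missing or mistyped setting, a quick sanity check of the `[libsaml]` section can save time. The sketch below uses Python's standard `configparser`, which is enough for this flat section (Hue has its own configuration loader, which is not used here). Treating the five keys from the slide as required is an assumption, and the helper names are made up.

```python
import configparser
from typing import Dict, List

# Assumption: the five settings shown on the slide are all needed.
EXPECTED_KEYS = ("xmlsec_binary", "entity_id", "metadata_file", "key_file", "cert_file")


def parse_libsaml(text: str) -> Dict[str, str]:
    """Read the [libsaml] section and strip optional surrounding double quotes."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return {key: value.strip('"') for key, value in parser["libsaml"].items()}


def missing_keys(settings: Dict[str, str]) -> List[str]:
    """List the expected settings that are absent or empty."""
    return [key for key in EXPECTED_KEYS if not settings.get(key)]


EXAMPLE = """\
[libsaml]
xmlsec_binary=/opt/local/bin/xmlsec1
entity_id="http://192.168.92.1:8080/saml2/metadata/"
metadata_file=/Users/abe/Desktop/idp-metadata.xml
key_file=/Users/abe/Desktop/idp.key
cert_file=/Users/abe/Desktop/idp.crt
"""
```

Calling `missing_keys(parse_libsaml(EXAMPLE))` returns an empty list for the configuration above. A natural extension would be to also check that the referenced files exist on disk.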
DEPENDENCIES
XMLSEC1
Requires xmlsec1 (a
nonstandard system package)
DJANGOSAML2
Django application for
pysaml2
PYSAML2
Python binding with two
implementations: the 0.4.x line
and the 1.x line. The 1.x line has had
major updates, and there is a
2.x line now.
diff --git a/src/saml2/client_base.py b/src/saml2/client_base.py
index f1aadf3..9206a95 100644
--- a/src/saml2/client_base.py
+++ b/src/saml2/client_base.py
@@ -124,11 +124,7 @@ class Base(Entity):
             else:
                 setattr(self, foo, False)
 
-        # extra randomness
-        self.allow_unsolicited = self.config.getattr("allow_unsolicited", "sp")
-
         self.artifact2response = {}
-        self.logout_requests_signed = False
 
     #
     # Private methods
@@ -533,8 +529,8 @@ class Base(Entity):
         if resp is None:
             return None
         elif isinstance(resp, AuthnResponse):
-            #self.users.add_information_about_person(resp.session_info())
-            #logger.info("--- ADDED person info ----")
+            self.users.add_information_about_person(resp.session_info())
+            logger.info("--- ADDED person info ----")
             pass
         else:
             logger.error("Response type not supported: %s" % (
INSTALLATION
yum install xmlsec1

build/env/bin/pip install -e git+https://github.com/abec/pysaml2@HEAD#egg=pysaml2

build/env/bin/pip install -e git+https://github.com/abec/djangosaml2@HEAD#egg=djangosaml2
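
Because the SAML libraries are installed by hand, it helps to confirm that all three pieces are actually visible before debugging anything else. The sketch below is one way to do that. It should be run with the interpreter from Hue's virtual environment (`build/env/bin/python`), it assumes `xmlsec1` is on the `PATH` (the `xmlsec_binary` setting can point elsewhere), and the function name is made up.

```python
import importlib.util
import shutil
from typing import Dict


def check_saml_dependencies() -> Dict[str, bool]:
    """Report whether xmlsec1 and the two Python packages can be found."""
    return {
        # System binary used for XML signatures.
        "xmlsec1": shutil.which("xmlsec1") is not None,
        # pysaml2 installs the importable package "saml2".
        "pysaml2": importlib.util.find_spec("saml2") is not None,
        "djangosaml2": importlib.util.find_spec("djangosaml2") is not None,
    }


if __name__ == "__main__":
    for name, found in check_saml_dependencies().items():
        print("%-12s %s" % (name, "found" if found else "MISSING"))
```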
USERNAME SOURCE
ATTRIBUTES
Fetch username for SAML
from attributes returned by
the IdP
NAMEID
Use transient or persistent
Name ID to be username for
SAML
[libsaml]
…
username_source=nameid
…
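
The difference between the two username sources can be sketched as a single decision. This is an illustration only: the function name is hypothetical, `uid` as the attribute to read is an assumption (the slides do not say which attribute Hue uses), and real responses would come from pysaml2 rather than plain dictionaries.

```python
from typing import Dict, List, Optional


def pick_username(username_source: str,
                  name_id: Optional[str],
                  attributes: Dict[str, List[str]],
                  attribute_name: str = "uid") -> Optional[str]:
    """Choose the Hue username from a SAML response.

    username_source is "nameid" or "attributes", mirroring the setting above.
    attribute_name is an assumption made for this sketch.
    """
    if username_source == "nameid":
        return name_id
    if username_source == "attributes":
        values = attributes.get(attribute_name) or []
        return values[0] if values else None
    raise ValueError("unknown username_source: %r" % username_source)
```

One practical consequence: a transient Name ID changes between sessions, so it only works as a username if creating a fresh identity on each login is acceptable. A persistent Name ID or an attribute gives a stable account.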
IT’S COMPLICATED
https://wiki.cloudera.com/display/engineering/Hue+and+SAML
FRESH IDEAS
REPLACE XMLSEC1
Python libraries have bad
licenses, are poorly written,
and rely on system libraries
not found in primary
repositories.
REPLACE PYSAML2
Pysaml2 doesn’t use
intelligent libraries, uses
xmlsec1, code base is messy.
SINGLE LOGOUT
Some IdPs provide single
logout. Needs to be tested.
DOCUMENTATION
More documentation
around all the various
IdPs and how to support
them is necessary.
TEST ON SITEMINDER
Many customers seem to be
using SiteMinder and every
IdP is slightly different.
SYSTEM LEVEL TESTS
More system level
testing as customers
start to use SAML.
DEMO TIME

LINKS

DEMO
http://demo.gethue.com
TWITTER
@gethue
USER GROUP
hue-user@
WEBSITE
http://gethue.com
LEARN
http://learn.gethue.com
THANK YOU



www.gethue.com
