Open up interactive big data analysis for your enterprise

Hadoop brings many data crunching possibilities but also comes with a lot of complexity: the ecosystem is large and continuously changing, interactions happen on the command line, interfaces are built for engineers…
This talk describes how Hue can be integrated with existing Hadoop deployments with minimal changes/disturbances. Enrico covers details on how Hue can leverage the existing authentication system and security model of your company.
Through an interactive demo and dialog based on open source Hue, this talk describes how users can get started with Hadoop. We will detail how one can start a new Hadoop cluster, or use an existing one, to set up Hue. Best practices for integrating your company directory and security will be shared, along with the underlying technical details of how Hue interacts with the ecosystem.
The presentation will continue with real-life analytics business use cases. It will show how data can be imported and loaded into the cluster and then queried interactively with SQL or a search dashboard. All through your Web Browser!
To sum up, attendees of this talk will learn how Hadoop can be made more accessible and why Hue is the ideal gateway for quickly getting started or using the platform more efficiently.

Presentation Transcript

  • OPEN UP INTERACTIVE BIG DATA ANALYSIS FOR YOUR ENTERPRISE
    Enrico Berti, Budapest DW Forum, Jun 4, 2014
  • GOAL OF HUE
    WEB INTERFACE FOR ANALYZING DATA WITH APACHE HADOOP
    SIMPLIFY AND INTEGRATE
    FREE AND OPEN SOURCE -> OPEN UP BIG DATA
  • VIEW FROM 30K FEET
    Hadoop
    Web Server
    You, your colleagues and even that friend that uses IE9 ;)
  • OPEN SOURCE
    ~3350 COMMITS
    38 CONTRIBUTORS
    698 STARS
    245 FORKS
    github.com/cloudera/hue
  • THE CORE TEAM PLAYERS
    Join us at team.gethue.com
    Romain Rigaux, Enrico Berti, Chang, Abraham Elmahrek, Amstel
  • AROUND THE WORLD
    TALKS: Meetups and events in NYC, Paris, LA, Tokyo, SF, Stockholm, Vienna, San Jose, Singapore, Budapest… Coming up in London, West coast
    RETREATS: Nov 13 Koh Chang, Thailand; May 14 Curaçao, Netherlands Antilles; Nov 14 Goa, India
  • TREND: GROWTH gethue.com
  • HISTORY: HUE 1
    Desktop-like in a browser, did its job but pretty slow, memory leaks and not very IE friendly but definitely advanced for its time (2009-2010).
  • HISTORY: HUE 2
    The first flat structure port, with Twitter Bootstrap all over the place.
    HUE 2.5: New apps, improved the UX adding new nice functionalities like autocomplete and drag & drop.
  • HISTORY: HUE 3 ALPHA
    Proposed design, didn’t make it.
  • HISTORY: HUE 3.5
    New UI, several new apps, the most user friendly features to date.
  • HISTORY: HUE 3.6+
    Where we are now, a brand new way to search and explore your data.
  • WHICH VERSION TO USE?
    HUE 2.X: 1-2 years old, use only if you depend on Hive < 0.12
    HUE 3.X: 2500+ commits later, new UI, interactive search / SQL, dashboards...
  • WHICH DISTRIBUTION?
    GITHUB (HACKER): Very latest
    TARBALL (ADVANCED USER): Advanced preview
    CDH / CM (NORMAL USER): The most stable and cross component checked
  • WHERE TO PUT HUE? IN ONE MACHINE
  • WHERE TO PUT HUE? OUTSIDE THE CLUSTER
  • WHERE TO PUT HUE? INSIDE THE CLUSTER
  • WHAT DO YOU NEED?
    SERVER: Python 2.4 2.6. That’s it if using a packaged version. If building from the source, here are the extra packages
    CLIENT: Web Browser (IE 9+, FF 10+, Chrome, Safari)
    Hi there, I’m “just” a web server.
  • WHAT DOES THE HUE SERVICE LOOK LIKE?
    1 SERVER: Process serving pages and also static content
    1 DB: For cookies, saved queries, workflows, …
    Hi there, I’m “just” a web server.
  • HOW TO CONFIGURE HUE: HUE.INI
    Similar to core-site.xml but with .INI syntax
    Where? /etc/hue/conf/hue.ini or $HUE_HOME/desktop/conf/pseudo-distributed.ini
    [desktop]
    [[database]]
    # Database engine is typically one of:
    # postgresql_psycopg2, mysql, or sqlite3
    engine=sqlite3
    ## host=
    ## port=
    ## user=
    ## password=
    name=desktop/desktop.db
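The nested `[[database]]` section in the hue.ini excerpt is not something Python's standard `configparser` models as nesting. As a rough illustration of how such a file can be read, here is a minimal sketch of a nested-section reader. It is not Hue's own loader: `parse_nested_ini` and `SAMPLE` are hypothetical names, and the sketch assumes that the number of leading `[` characters gives the nesting depth and that `#` starts a comment.

```python
# Minimal sketch: read INI text where [[name]] nests under the preceding [name].
# Assumption: bracket count == nesting depth; lines starting with '#' are comments.

SAMPLE = """
[desktop]
  [[database]]
  # Database engine is typically one of:
  # postgresql_psycopg2, mysql, or sqlite3
  engine=sqlite3
  ## host=
  ## port=
  name=desktop/desktop.db
"""


def parse_nested_ini(text):
    """Return a dict of dicts mirroring the section nesting of `text`."""
    root = {}
    stack = [root]  # stack[d] is the dict currently open at depth d
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments (including '## key=' placeholders)
        if line.startswith("["):
            depth = len(line) - len(line.lstrip("["))
            name = line.strip("[]").strip()
            if depth > len(stack):
                raise ValueError("section %r skips a nesting level" % name)
            del stack[depth:]  # close any deeper or sibling sections
            section = stack[depth - 1].setdefault(name, {})
            stack.append(section)
        else:
            key, _, value = line.partition("=")
            stack[-1][key.strip()] = value.strip()
    return root


config = parse_nested_ini(SAMPLE)
print(config["desktop"]["database"]["engine"])
```

Keeping a stack of open sections means a later top-level header such as `[hbase]` closes `[[database]]` and `[desktop]` in one step.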
  • AUTHENTICATION
    SIMPLE: Login/Password in a Database (SQLite, MySQL, …)
    ENTERPRISE: LDAP (most used), OAuth, OpenID, SAML
  • DB BACKEND
  • LDAP BACKEND
    Integrate your employees: LDAP How to guide
  • USERS
    ADMIN: Can give and revoke permissions to single users or groups of users
    USER: Regular user + permissions
  • CONFIGURE APPS AND PERMISSIONS: LIST OF GROUPS AND PERMISSIONS
    A list of permissions. A permission can:
    - allow access to one app (e.g. Hive Editor)
    - modify data from the app (e.g. drop Hive Tables or edit cells in HBase Browser)
  • CONFIGURE APPS AND PERMISSIONS: PERMISSIONS IN ACTION
    User ‘test’ belonging to the group ‘hiveonly’ that has just the ‘hive’ permissions
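The group-based permission idea from the slides (user 'test' in group 'hiveonly', which only has the 'hive' permission) can be pictured with a small sketch. This is an illustration only, not Hue's implementation: the tables `GROUP_PERMISSIONS` and `USER_GROUPS`, the extra group 'analysts', and the functions `allowed_apps` and `can_access` are all hypothetical.

```python
# Hypothetical in-memory tables, for illustration only.
GROUP_PERMISSIONS = {
    "hiveonly": {"hive"},
    "analysts": {"hive", "hbase"},  # assumed extra group, not from the slides
}
USER_GROUPS = {
    "test": {"hiveonly"},
}


def allowed_apps(user):
    """Union of the permissions granted by every group the user belongs to."""
    apps = set()
    for group in USER_GROUPS.get(user, set()):
        apps |= GROUP_PERMISSIONS.get(group, set())
    return apps


def can_access(user, app):
    """True if at least one of the user's groups grants access to the app."""
    return app in allowed_apps(user)


print(can_access("test", "hive"))
print(can_access("test", "hbase"))
```

Granting permissions to groups rather than to individual users keeps the admin's work proportional to the number of roles, not the number of employees.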
  • HOW HUE INTERACTS WITH HADOOP
    YARN, JobTracker, Oozie, Hue Plugins, LDAP, SAML, Pig, HDFS, HiveServer2, Hive Metastore, Cloudera Impala, Solr, HBase, Sqoop2, Zookeeper
  • RPC CALLS TO ALL THE HADOOP COMPONENTS: HDFS EXAMPLE
    WebHDFS REST
    NN, DN DN DN … DN
    http://localhost:50070/webhdfs/v1/<PATH>?op=LISTSTATUS
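To make the WebHDFS call concrete, here is a minimal sketch of building the LISTSTATUS URL shown on the slide and reading the file names out of the JSON reply. It is not how Hue itself is implemented; the function names and the sample reply are hypothetical, and the host and port defaults simply mirror the slide's `localhost:50070`. The network call in `list_directory` assumes an unsecured cluster that is reachable from where the script runs.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def liststatus_url(path, host="localhost", port=50070):
    """Build the WebHDFS URL that lists the directory at `path`."""
    query = urlencode({"op": "LISTSTATUS"})
    return "http://%s:%d/webhdfs/v1%s?%s" % (host, port, quote(path), query)


def parse_liststatus(body):
    """Extract entry names from a LISTSTATUS JSON reply."""
    statuses = json.loads(body)["FileStatuses"]["FileStatus"]
    return [status["pathSuffix"] for status in statuses]


def list_directory(path, host="localhost", port=50070):
    """Call the NameNode over HTTP (needs a running, reachable cluster)."""
    with urlopen(liststatus_url(path, host, port)) as response:
        return parse_liststatus(response.read().decode("utf-8"))


# Hypothetical reply, trimmed to the fields used above.
SAMPLE_REPLY = json.dumps({
    "FileStatuses": {"FileStatus": [
        {"pathSuffix": "data.csv", "type": "FILE"},
        {"pathSuffix": "logs", "type": "DIRECTORY"},
    ]}
})

print(liststatus_url("/user/test"))
print(parse_liststatus(SAMPLE_REPLY))
```

Separating URL construction and reply parsing from the actual HTTP request lets both be checked without a cluster.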
  • RPC CALLS TO ALL THE HADOOP COMPONENTS: HOW
    List all the host/port of Hadoop APIs in the hue.ini
    For example here HBase and Hive. Full list
    [hbase]
    # Comma-separated list of HBase Thrift servers for
    # clusters in the format of '(name|host:port)'.
    hbase_clusters=(Cluster|localhost:9090)
    [beeswax]
    hive_server_host=host-abc
    hive_server_port=10000
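The `hbase_clusters` value packs several fields into one string, in the format `(name|host:port)` described by the config comment. A small sketch of how such a value could be split into its parts follows; the function name `parse_hbase_clusters` and the second cluster in the usage line are hypothetical, and this is not necessarily how Hue parses the setting.

```python
import re

# One '(name|host:port)' entry; entries are separated by commas.
_CLUSTER_RE = re.compile(
    r"\(\s*([^|()]+?)\s*\|\s*([^:()]+?)\s*:\s*(\d+)\s*\)"
)


def parse_hbase_clusters(value):
    """Return a list of (name, host, port) tuples from an hbase_clusters value."""
    return [
        (name, host, int(port))
        for name, host, port in _CLUSTER_RE.findall(value)
    ]


print(parse_hbase_clusters("(Cluster|localhost:9090)"))
# Hypothetical second cluster to show the comma-separated form.
print(parse_hbase_clusters("(Cluster|localhost:9090),(Backup|host-abc:9091)"))
```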
  • SECURITY FEATURES
    HTTPS
    SSL DB
    SSL WITH HIVESERVER2
    KERBEROS
    AUDITING
    READ MORE …
  • HIGH AVAILABILITY: HOW
    2 Hue instances
    HA proxy
    Multi DB
    Performances: like a website, mostly RPC calls
  • SQL: WHAT
    Impala, Hive integration, Spark (Shark too)
    Interactive SQL editor
    Integration with MapReduce, Metastore, HDFS
  • SEARCH: WHAT
    Solr & Cloud integration
    Custom interactive dashboards
    Drag & drop widgets (charts, timeline…)
  • HBASE BROWSER: WHAT
    Simple custom query language
    Supports HBase filter language
    Supports selection & Copy + Paste, gracefully degrades in IE
    Autocomplete Help Menu
    Searchbar Syntax Breakdown: Row Key, Scan Length, Prefix Scan, Column/Family Filters, Thrift Filterstring
  • DEMO TIME

  • SUM-UP
    INSTALL: Install Hue on one machine
    ENABLE: Enable Hadoop Service APIs for Hue as a proxy user
    CONFIGURE: Configure hue.ini to point to each Service API
    LDAP: Use an LDAP backend
    HELP: Get help on @gethue or hue-user
  • ROADMAP NEXT 6 MONTHS: WHAT
    Sentry
    Search, Spark, SQL
    More dashboards!
    Oozie v2
    Inter component integrations (HBase <-> Search, create index wizards, document permissions), Hadoop Web apps SDK
    Your idea here.
  • CONFIGURATIONS ARE HARD… …GIVE CLOUDERA MANAGER A TRY! vimeo.com/91805055
  • MISSED SOMETHING?
    learn.gethue.com
  • THANK YOU!
    TWITTER: @gethue
    USER GROUP: hue-user@
    WEBSITE: http://gethue.com
    LEARN: http://learn.gethue.com