INTEGRATE HUE
WITH YOUR
HADOOP CLUSTER
Romain Rigaux

Y! HUG Apr 16, 2014
WHAT

IS HUE?
WEB INTERFACE FOR MAKING
HADOOP EASIER TO USE


Suite of apps for each Hadoop component,

like Hive, Pig, Impala, Oozie, Solr, Sqoop2,
HBase...
VIEW FROM

30K FEET
Hadoop Web Server You and even

that friend 

that uses IE9 ;)
YARN JobTracker Oozie
Pig
HDFS
HiveServer2
Hive	

Metastore
Cloudera	

Impala
Solr
HBase
Sqoop2
Zookeeper
LDAP	

SAML
Hue Plugins
ECOSYSTEM

AND APPS
TARGET

OF HUE
GETTING STARTED WITH HADOOP


BEING PRODUCTIVE EXPLORING
DIFFERENT ANGLES OF THE PLATFORM
LET ANY USER FOCUS ON BIG DATA
PROCESSING



BEING COMPATIBLE WITH ANY HADOOP
VERSION (0.20/1.2.0/2.3.0)
OPEN SOURCE

~3000 COMMITS


33 CONTRIBUTORS



648 STARS



212 FORKS


github.com/cloudera/hue
THE CORE

TEAM PLAYERS
team.gethue.com
ABRAHAM ELMAHREK
ROMAIN RIGAUX
ENRICO BERTI
CHANG BEER
TALKS
Meetups and events in NYC,
Paris, LA, Tokyo, SF,
Stockholm, Vienna, San Jose,
Singapore…

Coming up in London, West
coast
AROUND

THE WORLD
RETREATS
Nov 13 Koh Chang, Thailand
May 14 Curaçao, Netherlands
Antilles
FAST PACE
LAST 30 DAYS
41 issues created and 38
resolved.
Core team + Community
TREND: GROWTH
gethue.com
HISTORY

HUE 1
Desktop-like in a browser,
did its job but pretty slow,
memory leaks and not very
IE friendly but definitely
advanced for its time
(2009-2010).
HISTORY

HUE 2
The first flat structure port,
with Twitter Bootstrap all
over the place.
HISTORY

HUE 2.5
New apps, improved the UX
adding new nice
functionalities like
autocomplete and drag &
drop.
HISTORY

HUE 3 ALPHA
Proposed design, didn’t
make it.
HISTORY

HUE 3.5+
Where we are now, new UI,
several new apps, the most
user friendly features to
date.
WHICH VERSION TO USE?
6 months
1k commits later
1-2 years old
HUE 2.X HUE 3.X HUE 3.5 + 1/2 3.6
WHICH DISTRIBUTION?
GITHUB: Very latest (HACKER)
TARBALL: Advanced preview (ADVANCED USER)
CDH / CM: The most stable and cross component checked (NORMAL USER)
WHERE TO PUT HUE? ON ONE MACHINE
WHERE TO PUT HUE? INSIDE THE CLUSTER
WHERE TO PUT HUE? OUTSIDE THE CLUSTER
WHAT DO YOU NEED?
SERVER
Python 2.4 2.6
That’s it if using a packaged version. If
building from the source, here are the extra
packages
CLIENT
Web Browser
IE 9+, FF 10+, Chrome, Safari
WHAT DOES THE HUE SERVICE LOOK LIKE?
1 SERVER
Process serving pages
and also static content
1 DB
For cookies, saved
queries, workflows, …
HOW TO CONFIGURE HUE
HUE.INI
Similar to core-site.xml but
with .INI syntax
Where?
/etc/hue/conf/hue.ini
or
$HUE_HOME/desktop/conf/pseudo-distributed.ini
[desktop]
  [[database]]
    # Database engine is typically one of:
    # postgresql_psycopg2, mysql, or sqlite3
    engine=sqlite3
    ## host=
    ## port=
    ## user=
    ## password=
    name=desktop/desktop.db
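To make the nested `[section]` / `[[subsection]]` syntax concrete, here is a minimal sketch of how such a file can be read into nested dictionaries. The function `parse_nested_ini` is a hypothetical, hand-rolled helper written for this illustration. It is not Hue's own configuration loader, and it assumes only the conventions visible on the slide: bracket count gives the depth, `#` starts a comment, and settings are `key=value`.

```python
import re

# One or more '[' then a name then one or more ']' (e.g. "[[database]]").
SECTION_RE = re.compile(r"^(\[+)\s*([^\[\]]+?)\s*(\]+)$")


def parse_nested_ini(text):
    """Parse INI text whose section depth is given by the number of brackets."""
    root = {}
    stack = [root]  # stack[d] is the dict of the currently open section at depth d
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments, including '## key=' defaults
        match = SECTION_RE.match(line)
        if match:
            depth = len(match.group(1))
            if depth != len(match.group(3)) or depth > len(stack):
                raise ValueError("malformed section header: " + line)
            del stack[depth:]  # close deeper and sibling sections
            stack.append(stack[-1].setdefault(match.group(2), {}))
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError("expected key=value: " + line)
        stack[-1][key.strip()] = value.strip()
    return root


SAMPLE = """
[desktop]
[[database]]
# Database engine is typically one of:
# postgresql_psycopg2, mysql, or sqlite3
engine=sqlite3
## host=
name=desktop/desktop.db
"""

config = parse_nested_ini(SAMPLE)
print(config["desktop"]["database"])
```

Commented-out defaults such as `## host=` are ignored, so only the settings that are actually active end up in the result.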
AUTHENTICATE / LOGIN
[desktop]
  [[auth]]
    # - django.contrib.auth.backends.ModelBackend (entirely Django backend)
    # - desktop.auth.backend.AllowAllBackend (allows everyone)
    # - desktop.auth.backend.AllowFirstUserDjangoBackend
    # - desktop.auth.backend.LdapBackend
    # - desktop.auth.backend.OAuthBackend
    # ...
    ## backend=desktop.auth.backend.AllowFirstUserDjangoBackend
USERS
Can give and revoke
permissions to single
users or groups of users
ADMIN USER
Regular user +
permissions
DB BACKEND
LDAP BACKEND
Integrate your employees: LDAP How to guide
LIST OF GROUPS AND PERMISSIONS
A permission can:
- allow access to one app
(e.g. Hive Editor)
- modify data from the app
(e.g. drop Hive tables or
edit cells in HBase Browser)
CONFIGURE APPS

AND PERMISSIONS
A list of permissions
PERMISSIONS IN ACTION
User ‘test’ belonging to the
group ‘hiveonly’ that has just
the ‘hive’ permissions
CONFIGURE APPS

AND PERMISSIONS
HOW HUE INTERACTS

WITH HADOOP
YARN
JobTracker
Oozie
Hue Plugins
LDAP	

SAML
Pig
HDFS HiveServer2
Hive	

Metastore
Cloudera	

Impala
Solr
HBase
Sqoop2
Zookeeper
RPC CALLS TO ALL

THE HADOOP COMPONENTS
HDFS EXAMPLE
WebHDFS
REST
DN
DN
DN
…
DN
NN
http://localhost:50070/webhdfs/v1/<PATH>?op=LISTSTATUS
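As an illustration of the REST call above, the sketch below builds the LISTSTATUS URL and decodes a response body. The helper names `webhdfs_url` and `list_names` are hypothetical, and `SAMPLE_BODY` is a hand-written stand-in trimmed to two fields per entry, not output captured from a real NameNode.

```python
import json
from urllib.parse import quote, urlencode


def webhdfs_url(base, path, op):
    """Build a WebHDFS REST URL of the form <base>/webhdfs/v1<PATH>?op=<OP>."""
    return "{}/webhdfs/v1{}?{}".format(base.rstrip("/"), quote(path), urlencode({"op": op}))


def list_names(liststatus_body):
    """Return the entry names found in a LISTSTATUS JSON response body."""
    statuses = json.loads(liststatus_body)["FileStatuses"]["FileStatus"]
    return [entry["pathSuffix"] for entry in statuses]


# Hand-written sample for illustration only; real responses carry more fields.
SAMPLE_BODY = (
    '{"FileStatuses": {"FileStatus": ['
    '{"pathSuffix": "logs", "type": "DIRECTORY"}, '
    '{"pathSuffix": "a.txt", "type": "FILE"}]}}'
)

print(webhdfs_url("http://localhost:50070", "/tmp", "LISTSTATUS"))
print(list_names(SAMPLE_BODY))
```

Against a live cluster, the URL would be fetched with any HTTP client and the body passed to `list_names`.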
HOW
The host/port of each service API
(Oozie, YARN, HDFS, HBase…) is
specified in hue.ini, in one section
per major service (e.g. [hbase]),
for Hue core ([desktop]) or
for a Hue lib ([liboozie])
[hbase]
# Comma-separated list of HBase Thrift servers for
# clusters in the format of '(name|host:port)'.
hbase_clusters=(Cluster|localhost:9090)
[liboozie]
# The URL where the Oozie service runs on.
# oozie_url=http://hue.ent.cloudera.com:11000/oozie
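The `hbase_clusters` value follows the '(name|host:port)' format described in the comment above. A minimal sketch of splitting such a value, assuming names and hosts contain no commas, parentheses or pipes; `parse_hbase_clusters` is a hypothetical helper for illustration, not a function from Hue:

```python
import re

# Matches one '(name|host:port)' entry.
CLUSTER_RE = re.compile(r"^\((?P<name>[^|(),]+)\|(?P<host>[^:(),]+):(?P<port>\d+)\)$")


def parse_hbase_clusters(value):
    """Split a comma-separated '(name|host:port)' list into (name, host, port) tuples."""
    clusters = []
    for item in value.split(","):
        match = CLUSTER_RE.match(item.strip())
        if match is None:
            raise ValueError("expected '(name|host:port)', got: " + item)
        clusters.append((match.group("name"), match.group("host"), int(match.group("port"))))
    return clusters


print(parse_hbase_clusters("(Cluster|localhost:9090)"))
```

Rejecting malformed entries early makes a typo in hue.ini show up as a clear error rather than as a failed Thrift connection later.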
RPC CALLS TO ALL

THE HADOOP COMPONENTS
Full list
KERBEROS
1 Hue ticket / principal, no user ticket
Hue uses its ticket for authenticating to every other service
(HDFS, Oozie, …)

Read more in the Hue Security Guide
HUE KERBEROS TICKET
Add Hue user principal to Kerberos:
kadmin: addprinc -randkey hue/hue.server.fully.qualified.domain.name@YOUR-REALM.COM
Test:
$ kinit -k -t /etc/hue/hue.keytab hue/hue.server.fully.qualified.domain.name@YOUR-REALM.COM
Ticket should be renewable (krb5.conf and kdc.conf)
hue.ini:
[desktop]
  [[kerberos]]
    # Path to Hue's Kerberos keytab file
    hue_keytab=/etc/hue/hue.keytab
    # Kerberos principal name for Hue
    hue_principal=hue/FQDN@REALM
    # add kinit path for non root users
    kinit_path=/usr/kerberos/bin/kinit
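The three hue.ini settings map directly onto the `kinit -k -t` test command. As a rough sketch of that correspondence, the hypothetical helper below (`kinit_command`, not part of Hue) assembles the argument list from the keytab, principal and kinit path:

```python
import shlex


def kinit_command(keytab, principal, kinit_path="kinit"):
    """Return the argv for obtaining a ticket from a keytab: kinit -k -t <keytab> <principal>."""
    return [kinit_path, "-k", "-t", keytab, principal]


cmd = kinit_command(
    keytab="/etc/hue/hue.keytab",
    principal="hue/hue.server.fully.qualified.domain.name@YOUR-REALM.COM",
    kinit_path="/usr/kerberos/bin/kinit",
)
# Print a shell-safe rendering; running it requires a real keytab and KDC.
print(" ".join(shlex.quote(part) for part in cmd))
```

On a host that has the keytab, the list could be passed to `subprocess.run(cmd, check=True)` to perform the same check as the shell command.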
HOW
Hue is a “super proxy”


Client could be on a
Windows machine, phone…
and interact with all the
Hadoop services
IMPERSONATION
Call for getting the information about an HDFS file:
http://localhost:50070/webhdfs/v1/tmp?op=GETFILESTATUS&user.name=hue&doas=bob
WebHDFS, add to core-site.xml:
<!-- Hue WebHDFS proxy user setting -->
<property>
  <name>hadoop.proxyuser.hue.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.hue.groups</name>
  <value>*</value>
</property>
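The impersonation call differs from a plain WebHDFS request only in its query string: `user.name` names the proxy user (Hue) and `doas` names the end user on whose behalf the call is made. A small sketch that reproduces the URL from the slide; `impersonated_url` is a hypothetical helper name:

```python
from urllib.parse import quote, urlencode


def impersonated_url(base, path, op, proxy_user, end_user):
    """Build a WebHDFS URL where proxy_user acts on behalf of end_user."""
    query = urlencode([("op", op), ("user.name", proxy_user), ("doas", end_user)])
    return "{}/webhdfs/v1{}?{}".format(base.rstrip("/"), quote(path), query)


url = impersonated_url("http://localhost:50070", "/tmp", "GETFILESTATUS", "hue", "bob")
print(url)
```

The request is only honoured if the proxy-user properties in core-site.xml allow `hue` to impersonate users from the calling host and group.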
OTHER SECURITY FEATURES
HTTPS
SSL DB
SSL WITH HIVESERVER2
AUDITING
READ MORE …
2 Hue instances
HA proxy
Multi DB
Performance: like a website,
mostly RPC calls
HIGH AVAILABILITY
HOW
DEMO
TIME

SUM-UP
INSTALL: Install Hue on one machine + Hue Kerberos ticket
CONFIGURE: Configure hue.ini to point to each Service API
ENABLE: Enable Hadoop Service APIs for Hue as a proxy user
LDAP: Use an LDAP backend
HELP: Get help on @gethue or hue-user
CONFIGURATIONS ARE HARD…
…GIVE CLOUDERA MANAGER A TRY!
vimeo.com/91805055
MISSED

SOMETHING?
learn.gethue.com
LINKS

TWITTER
@gethue
USER GROUP
hue-user@
WEBSITE
http://gethue.com
LEARN
http://learn.gethue.com
GET HUE

CLOUDERA’S CDH
Stable and highly tested releases perfectly integrated with the Hadoop ecosystem, automagically configured by Cloudera Manager.
TARBALL
Try in advance the latest and greatest but you’ll have to configure everything on your own.
CLOUDERA’S DEMO VM
Get to play with Hue and various Hadoop components in 5 minutes. It’s a self contained CDH environment ready to use.
HORTONWORKS*
In HDP there’s an old forked version of Hue 2.3.
MAPR*
Newer version than HDP, close to the original 2.5 minus apps like HBase, Impala, Sqoop, Search.
HP CLOUD*
The newest addition, ships Hue 3.0 through the GreenButton products.
BIGTOP
EMBEDDED/DEMO IN IND. COMPANIES
* YOUR MILEAGE MAY VARY.
WHAT ARE YOUR USE
CASES?
WHICH COMPONENTS DO
YOU USE?
WHAT WOULD YOU LIKE TO
SEE IN HUE?
INTERESTED IN
CONTRIBUTING?
WANNA SAY HELLO?
DO YOU WANT A TAILOR
MADE TEAM RETREAT?
QUESTIONS?
TEAM@GETHUE.COM
THANK YOU! 

gethue.com
APPENDIX

HOW
Add Hue as a WebHDFS proxy user, as in the
IMPERSONATION setting for core-site.xml
Add the property below to hdfs-site.xml to
enable WebHDFS in the NameNode and DataNodes
hdfs-site.xml:
<property>
  <name>dfs.webhdfs.enabled</name>
  <value>true</value>
</property>
hue.ini:
[hadoop]
  [[hdfs_clusters]]
    # HA support by using HttpFs
    [[[default]]]
      # Enter the filesystem uri
      ##fs_defaultfs=hdfs://localhost:8020
      # Use WebHdfs/HttpFs as the communication mechanism.
      ##webhdfs_url=http://localhost:50070/webhdfs/v1
HDFS FILE BROWSER
HOW
Example of config for having
Hue interact with Yarn
[hadoop]
  [[yarn_clusters]]
    [[[default]]]
      # Enter the host on which you are running the ResourceManager
      resourcemanager_host=localhost
      # The port where the ResourceManager IPC listens on
      ## resourcemanager_port=8032
      # Whether to submit jobs to this cluster
      submit_to=True
      # Change this if your YARN cluster is Kerberos-secured
      ## security_enabled=false
      # URL of the ResourceManager API
      ## resourcemanager_api_url=http://localhost:8088
      # URL of the ProxyServer API
      ## proxy_api_url=http://localhost:8088
      # URL of the HistoryServer API
      # history_server_api_url=http://localhost:19888
    [[[ha]]]
      # Enter the host on which you are running the failover Resource Manager
      resourcemanager_api_url=http://localhost:8088
      ## logical_name=
      submit_to=True
YARN / MR2
HOW
Based on HiveServer2
interface
Note for Hive:
<property>
  <name>hive.server2.enable.doAs</name>
  <value>true</value>
</property>
Video demo
Setup tutorial
[beeswax]
  # Host where Hive server Thrift daemon is running.
  # If Kerberos security is enabled, use fully-qualified domain name (FQDN).
  ## hive_server_host=localhost
  ## hive_server_port=10000
  # Hive configuration directory, where hive-site.xml is located
  ## hive_conf_dir=/etc/hive/conf
HIVE (IMPALA / SHARK)
HOW
Make sure share lib is
installed
Alternative Dashboard and
Editors
[liboozie]
#oozie_url=http://localhost.com:11000/oozie
OOZIE
HOW
Comes with Oozie, no PigServer yet
Oozie sharelib
Oozie credentials for security
PIG

More Related Content

What's hot

DSpace Manual for BALID Trainee
DSpace Manual for BALID Trainee DSpace Manual for BALID Trainee
DSpace Manual for BALID Trainee Nur Ahammad
 
Hue: The Hadoop UI - Stockholm HUG
Hue: The Hadoop UI - Stockholm HUGHue: The Hadoop UI - Stockholm HUG
Hue: The Hadoop UI - Stockholm HUGgethue
 
Drupal camp South Florida 2011 - Introduction to the Aegir hosting platform
Drupal camp South Florida 2011 - Introduction to the Aegir hosting platformDrupal camp South Florida 2011 - Introduction to the Aegir hosting platform
Drupal camp South Florida 2011 - Introduction to the Aegir hosting platformHector Iribarne
 
LDAP, SAML and Hue
LDAP, SAML and HueLDAP, SAML and Hue
LDAP, SAML and Huegethue
 
Hue: The Hadoop UI - HUG France
Hue: The Hadoop UI - HUG FranceHue: The Hadoop UI - HUG France
Hue: The Hadoop UI - HUG Francegethue
 
ironing out Docker
ironing out Dockerironing out Docker
ironing out Dockernindustries
 
httpd — Apache Web Server
httpd — Apache Web Serverhttpd — Apache Web Server
httpd — Apache Web Serverwebhostingguy
 
Utosc2007_Apache_Configuration.ppt
Utosc2007_Apache_Configuration.pptUtosc2007_Apache_Configuration.ppt
Utosc2007_Apache_Configuration.pptwebhostingguy
 
HBase + Hue - LA HBase User Group
HBase + Hue - LA HBase User GroupHBase + Hue - LA HBase User Group
HBase + Hue - LA HBase User Groupgethue
 
Hadoop Israel - HBase Browser in Hue
Hadoop Israel - HBase Browser in HueHadoop Israel - HBase Browser in Hue
Hadoop Israel - HBase Browser in Huegethue
 
Drupal cambs ansible for drupal april 2015
Drupal cambs ansible for drupal april 2015Drupal cambs ansible for drupal april 2015
Drupal cambs ansible for drupal april 2015Ryan Brown
 
Linux apache installation
Linux apache installationLinux apache installation
Linux apache installationDima Gomaa
 

What's hot (16)

DSpace Manual for BALID Trainee
DSpace Manual for BALID Trainee DSpace Manual for BALID Trainee
DSpace Manual for BALID Trainee
 
Hue: The Hadoop UI - Stockholm HUG
Hue: The Hadoop UI - Stockholm HUGHue: The Hadoop UI - Stockholm HUG
Hue: The Hadoop UI - Stockholm HUG
 
Drupal camp South Florida 2011 - Introduction to the Aegir hosting platform
Drupal camp South Florida 2011 - Introduction to the Aegir hosting platformDrupal camp South Florida 2011 - Introduction to the Aegir hosting platform
Drupal camp South Florida 2011 - Introduction to the Aegir hosting platform
 
LDAP, SAML and Hue
LDAP, SAML and HueLDAP, SAML and Hue
LDAP, SAML and Hue
 
Hue: The Hadoop UI - HUG France
Hue: The Hadoop UI - HUG FranceHue: The Hadoop UI - HUG France
Hue: The Hadoop UI - HUG France
 
ironing out Docker
ironing out Dockerironing out Docker
ironing out Docker
 
httpd — Apache Web Server
httpd — Apache Web Serverhttpd — Apache Web Server
httpd — Apache Web Server
 
Sahu
SahuSahu
Sahu
 
Utosc2007_Apache_Configuration.ppt
Utosc2007_Apache_Configuration.pptUtosc2007_Apache_Configuration.ppt
Utosc2007_Apache_Configuration.ppt
 
Drupal from scratch
Drupal from scratchDrupal from scratch
Drupal from scratch
 
HBase + Hue - LA HBase User Group
HBase + Hue - LA HBase User GroupHBase + Hue - LA HBase User Group
HBase + Hue - LA HBase User Group
 
Hadoop Israel - HBase Browser in Hue
Hadoop Israel - HBase Browser in HueHadoop Israel - HBase Browser in Hue
Hadoop Israel - HBase Browser in Hue
 
Drupal cambs ansible for drupal april 2015
Drupal cambs ansible for drupal april 2015Drupal cambs ansible for drupal april 2015
Drupal cambs ansible for drupal april 2015
 
Anatomy of a reusable module
Anatomy of a reusable moduleAnatomy of a reusable module
Anatomy of a reusable module
 
Linux apache installation
Linux apache installationLinux apache installation
Linux apache installation
 
Linux
LinuxLinux
Linux
 

Similar to April 2014 HUG : Integrating HUE with Multi-tenant cluster

Hue: Big Data Web applications for Interactive Hadoop at Big Data Spain 2014
Hue: Big Data Web applications for Interactive Hadoop at Big Data Spain 2014Hue: Big Data Web applications for Interactive Hadoop at Big Data Spain 2014
Hue: Big Data Web applications for Interactive Hadoop at Big Data Spain 2014gethue
 
Hadoop 2.0 cluster setup on ubuntu 14.04 (64 bit)
Hadoop 2.0 cluster setup on ubuntu 14.04 (64 bit)Hadoop 2.0 cluster setup on ubuntu 14.04 (64 bit)
Hadoop 2.0 cluster setup on ubuntu 14.04 (64 bit)Nag Arvind Gudiseva
 
Hadoop ecosystem
Hadoop ecosystemHadoop ecosystem
Hadoop ecosystemtfmailru
 
Running hadoop on ubuntu linux
Running hadoop on ubuntu linuxRunning hadoop on ubuntu linux
Running hadoop on ubuntu linuxTRCK
 
Configuring Your First Hadoop Cluster On EC2
Configuring Your First Hadoop Cluster On EC2Configuring Your First Hadoop Cluster On EC2
Configuring Your First Hadoop Cluster On EC2benjaminwootton
 
Building a Cloud Native Stack with EMR Spark, Alluxio, and S3
Building a Cloud Native Stack with EMR Spark, Alluxio, and S3Building a Cloud Native Stack with EMR Spark, Alluxio, and S3
Building a Cloud Native Stack with EMR Spark, Alluxio, and S3Alluxio, Inc.
 
Big Data Web applications for Interactive Hadoop by ENRICO BERTI at Big Data...
 Big Data Web applications for Interactive Hadoop by ENRICO BERTI at Big Data... Big Data Web applications for Interactive Hadoop by ENRICO BERTI at Big Data...
Big Data Web applications for Interactive Hadoop by ENRICO BERTI at Big Data...Big Data Spain
 
Configure h base hadoop and hbase client
Configure h base hadoop and hbase clientConfigure h base hadoop and hbase client
Configure h base hadoop and hbase clientShashwat Shriparv
 
Hadoop installation on windows
Hadoop installation on windows Hadoop installation on windows
Hadoop installation on windows habeebulla g
 
July 2010 Triangle Hadoop Users Group - Chad Vawter Slides
July 2010 Triangle Hadoop Users Group - Chad Vawter SlidesJuly 2010 Triangle Hadoop Users Group - Chad Vawter Slides
July 2010 Triangle Hadoop Users Group - Chad Vawter Slidesryancox
 
Охота на уязвимости Hadoop
Охота на уязвимости HadoopОхота на уязвимости Hadoop
Охота на уязвимости HadoopPositive Hack Days
 
Tame Your Build And Deployment Process With Hudson, PHPUnit, and SSH
Tame Your Build And Deployment Process With Hudson, PHPUnit, and SSHTame Your Build And Deployment Process With Hudson, PHPUnit, and SSH
Tame Your Build And Deployment Process With Hudson, PHPUnit, and SSHDavid Stockton
 
Hive Quick Start Tutorial
Hive Quick Start TutorialHive Quick Start Tutorial
Hive Quick Start TutorialCarl Steinbach
 
Harness the power of Spark and Solr in Hue: Big Data Amsterdam v.2.0
Harness the power of Spark and Solr in Hue: Big Data Amsterdam v.2.0Harness the power of Spark and Solr in Hue: Big Data Amsterdam v.2.0
Harness the power of Spark and Solr in Hue: Big Data Amsterdam v.2.0gethue
 
Hadoop installation
Hadoop installationHadoop installation
Hadoop installationAnkit Desai
 
How to? Drupal developer toolkit. Dennis Povshedny.
How to? Drupal developer toolkit. Dennis Povshedny.How to? Drupal developer toolkit. Dennis Povshedny.
How to? Drupal developer toolkit. Dennis Povshedny.DrupalCampDN
 
R hive tutorial supplement 1 - Installing Hadoop
R hive tutorial supplement 1 - Installing HadoopR hive tutorial supplement 1 - Installing Hadoop
R hive tutorial supplement 1 - Installing HadoopAiden Seonghak Hong
 
KNOX-HTTPFS-ONEFS-WP
KNOX-HTTPFS-ONEFS-WPKNOX-HTTPFS-ONEFS-WP
KNOX-HTTPFS-ONEFS-WPBoni Bruno
 

Similar to April 2014 HUG : Integrating HUE with Multi-tenant cluster (20)

Hue: Big Data Web applications for Interactive Hadoop at Big Data Spain 2014
Hue: Big Data Web applications for Interactive Hadoop at Big Data Spain 2014Hue: Big Data Web applications for Interactive Hadoop at Big Data Spain 2014
Hue: Big Data Web applications for Interactive Hadoop at Big Data Spain 2014
 
Hadoop 2.0 cluster setup on ubuntu 14.04 (64 bit)
Hadoop 2.0 cluster setup on ubuntu 14.04 (64 bit)Hadoop 2.0 cluster setup on ubuntu 14.04 (64 bit)
Hadoop 2.0 cluster setup on ubuntu 14.04 (64 bit)
 
Hadoop ecosystem
Hadoop ecosystemHadoop ecosystem
Hadoop ecosystem
 
Running hadoop on ubuntu linux
Running hadoop on ubuntu linuxRunning hadoop on ubuntu linux
Running hadoop on ubuntu linux
 
Configuring Your First Hadoop Cluster On EC2
Configuring Your First Hadoop Cluster On EC2Configuring Your First Hadoop Cluster On EC2
Configuring Your First Hadoop Cluster On EC2
 
Building a Cloud Native Stack with EMR Spark, Alluxio, and S3
Building a Cloud Native Stack with EMR Spark, Alluxio, and S3Building a Cloud Native Stack with EMR Spark, Alluxio, and S3
Building a Cloud Native Stack with EMR Spark, Alluxio, and S3
 
Lumen
LumenLumen
Lumen
 
Big Data Web applications for Interactive Hadoop by ENRICO BERTI at Big Data...
 Big Data Web applications for Interactive Hadoop by ENRICO BERTI at Big Data... Big Data Web applications for Interactive Hadoop by ENRICO BERTI at Big Data...
Big Data Web applications for Interactive Hadoop by ENRICO BERTI at Big Data...
 
Configure h base hadoop and hbase client
Configure h base hadoop and hbase clientConfigure h base hadoop and hbase client
Configure h base hadoop and hbase client
 
Hadoop 2.4 installing on ubuntu 14.04
Hadoop 2.4 installing on ubuntu 14.04Hadoop 2.4 installing on ubuntu 14.04
Hadoop 2.4 installing on ubuntu 14.04
 
Hadoop installation on windows
Hadoop installation on windows Hadoop installation on windows
Hadoop installation on windows
 
July 2010 Triangle Hadoop Users Group - Chad Vawter Slides
July 2010 Triangle Hadoop Users Group - Chad Vawter SlidesJuly 2010 Triangle Hadoop Users Group - Chad Vawter Slides
July 2010 Triangle Hadoop Users Group - Chad Vawter Slides
 
Охота на уязвимости Hadoop
Охота на уязвимости HadoopОхота на уязвимости Hadoop
Охота на уязвимости Hadoop
 
Tame Your Build And Deployment Process With Hudson, PHPUnit, and SSH
Tame Your Build And Deployment Process With Hudson, PHPUnit, and SSHTame Your Build And Deployment Process With Hudson, PHPUnit, and SSH
Tame Your Build And Deployment Process With Hudson, PHPUnit, and SSH
 
Hive Quick Start Tutorial
Hive Quick Start TutorialHive Quick Start Tutorial
Hive Quick Start Tutorial
 
Harness the power of Spark and Solr in Hue: Big Data Amsterdam v.2.0
Harness the power of Spark and Solr in Hue: Big Data Amsterdam v.2.0Harness the power of Spark and Solr in Hue: Big Data Amsterdam v.2.0
Harness the power of Spark and Solr in Hue: Big Data Amsterdam v.2.0
 
Hadoop installation
Hadoop installationHadoop installation
Hadoop installation
 
How to? Drupal developer toolkit. Dennis Povshedny.
How to? Drupal developer toolkit. Dennis Povshedny.How to? Drupal developer toolkit. Dennis Povshedny.
How to? Drupal developer toolkit. Dennis Povshedny.
 
R hive tutorial supplement 1 - Installing Hadoop
R hive tutorial supplement 1 - Installing HadoopR hive tutorial supplement 1 - Installing Hadoop
R hive tutorial supplement 1 - Installing Hadoop
 
KNOX-HTTPFS-ONEFS-WP
KNOX-HTTPFS-ONEFS-WPKNOX-HTTPFS-ONEFS-WP
KNOX-HTTPFS-ONEFS-WP
 

More from Yahoo Developer Network

Developing Mobile Apps for Performance - Swapnil Patel, Verizon Media
Developing Mobile Apps for Performance - Swapnil Patel, Verizon MediaDeveloping Mobile Apps for Performance - Swapnil Patel, Verizon Media
Developing Mobile Apps for Performance - Swapnil Patel, Verizon MediaYahoo Developer Network
 
Athenz - The Open-Source Solution to Provide Access Control in Dynamic Infras...
Athenz - The Open-Source Solution to Provide Access Control in Dynamic Infras...Athenz - The Open-Source Solution to Provide Access Control in Dynamic Infras...
Athenz - The Open-Source Solution to Provide Access Control in Dynamic Infras...Yahoo Developer Network
 
Athenz & SPIFFE, Tatsuya Yano, Yahoo Japan
Athenz & SPIFFE, Tatsuya Yano, Yahoo JapanAthenz & SPIFFE, Tatsuya Yano, Yahoo Japan
Athenz & SPIFFE, Tatsuya Yano, Yahoo JapanYahoo Developer Network
 
Athenz with Istio - Single Access Control Model in Cloud Infrastructures, Tat...
Athenz with Istio - Single Access Control Model in Cloud Infrastructures, Tat...Athenz with Istio - Single Access Control Model in Cloud Infrastructures, Tat...
Athenz with Istio - Single Access Control Model in Cloud Infrastructures, Tat...Yahoo Developer Network
 
Big Data Serving with Vespa - Jon Bratseth, Distinguished Architect, Oath
Big Data Serving with Vespa - Jon Bratseth, Distinguished Architect, OathBig Data Serving with Vespa - Jon Bratseth, Distinguished Architect, Oath
Big Data Serving with Vespa - Jon Bratseth, Distinguished Architect, OathYahoo Developer Network
 
How @TwitterHadoop Chose Google Cloud, Joep Rottinghuis, Lohit VijayaRenu
How @TwitterHadoop Chose Google Cloud, Joep Rottinghuis, Lohit VijayaRenuHow @TwitterHadoop Chose Google Cloud, Joep Rottinghuis, Lohit VijayaRenu
How @TwitterHadoop Chose Google Cloud, Joep Rottinghuis, Lohit VijayaRenuYahoo Developer Network
 
The Future of Hadoop in an AI World, Milind Bhandarkar, CEO, Ampool
The Future of Hadoop in an AI World, Milind Bhandarkar, CEO, AmpoolThe Future of Hadoop in an AI World, Milind Bhandarkar, CEO, Ampool
The Future of Hadoop in an AI World, Milind Bhandarkar, CEO, AmpoolYahoo Developer Network
 
Apache YARN Federation and Tez at Microsoft, Anupam Upadhyay, Adrian Nicoara,...
Apache YARN Federation and Tez at Microsoft, Anupam Upadhyay, Adrian Nicoara,...Apache YARN Federation and Tez at Microsoft, Anupam Upadhyay, Adrian Nicoara,...
Apache YARN Federation and Tez at Microsoft, Anupam Upadhyay, Adrian Nicoara,...Yahoo Developer Network
 
Containerized Services on Apache Hadoop YARN: Past, Present, and Future, Shan...
Containerized Services on Apache Hadoop YARN: Past, Present, and Future, Shan...Containerized Services on Apache Hadoop YARN: Past, Present, and Future, Shan...
Containerized Services on Apache Hadoop YARN: Past, Present, and Future, Shan...Yahoo Developer Network
 
HDFS Scalability and Security, Daryn Sharp, Senior Engineer, Oath
HDFS Scalability and Security, Daryn Sharp, Senior Engineer, OathHDFS Scalability and Security, Daryn Sharp, Senior Engineer, Oath
HDFS Scalability and Security, Daryn Sharp, Senior Engineer, OathYahoo Developer Network
 
Hadoop {Submarine} Project: Running deep learning workloads on YARN, Wangda T...
Hadoop {Submarine} Project: Running deep learning workloads on YARN, Wangda T...Hadoop {Submarine} Project: Running deep learning workloads on YARN, Wangda T...
Hadoop {Submarine} Project: Running deep learning workloads on YARN, Wangda T...Yahoo Developer Network
 
Moving the Oath Grid to Docker, Eric Badger, Oath
Moving the Oath Grid to Docker, Eric Badger, OathMoving the Oath Grid to Docker, Eric Badger, Oath
Moving the Oath Grid to Docker, Eric Badger, OathYahoo Developer Network
 
Architecting Petabyte Scale AI Applications
Architecting Petabyte Scale AI ApplicationsArchitecting Petabyte Scale AI Applications
Architecting Petabyte Scale AI ApplicationsYahoo Developer Network
 
Introduction to Vespa – The Open Source Big Data Serving Engine, Jon Bratseth...
Introduction to Vespa – The Open Source Big Data Serving Engine, Jon Bratseth...Introduction to Vespa – The Open Source Big Data Serving Engine, Jon Bratseth...
Introduction to Vespa – The Open Source Big Data Serving Engine, Jon Bratseth...Yahoo Developer Network
 
Jun 2017 HUG: YARN Scheduling – A Step Beyond
Jun 2017 HUG: YARN Scheduling – A Step BeyondJun 2017 HUG: YARN Scheduling – A Step Beyond
Jun 2017 HUG: YARN Scheduling – A Step BeyondYahoo Developer Network
 
Jun 2017 HUG: Large-Scale Machine Learning: Use Cases and Technologies
Jun 2017 HUG: Large-Scale Machine Learning: Use Cases and Technologies Jun 2017 HUG: Large-Scale Machine Learning: Use Cases and Technologies
Jun 2017 HUG: Large-Scale Machine Learning: Use Cases and Technologies Yahoo Developer Network
 
February 2017 HUG: Slow, Stuck, or Runaway Apps? Learn How to Quickly Fix Pro...
February 2017 HUG: Slow, Stuck, or Runaway Apps? Learn How to Quickly Fix Pro...February 2017 HUG: Slow, Stuck, or Runaway Apps? Learn How to Quickly Fix Pro...
February 2017 HUG: Slow, Stuck, or Runaway Apps? Learn How to Quickly Fix Pro...Yahoo Developer Network
 
February 2017 HUG: Exactly-once end-to-end processing with Apache Apex
February 2017 HUG: Exactly-once end-to-end processing with Apache ApexFebruary 2017 HUG: Exactly-once end-to-end processing with Apache Apex
February 2017 HUG: Exactly-once end-to-end processing with Apache ApexYahoo Developer Network
 
February 2017 HUG: Data Sketches: A required toolkit for Big Data Analytics
February 2017 HUG: Data Sketches: A required toolkit for Big Data AnalyticsFebruary 2017 HUG: Data Sketches: A required toolkit for Big Data Analytics
February 2017 HUG: Data Sketches: A required toolkit for Big Data AnalyticsYahoo Developer Network
 

More from Yahoo Developer Network (20)

Developing Mobile Apps for Performance - Swapnil Patel, Verizon Media
Developing Mobile Apps for Performance - Swapnil Patel, Verizon MediaDeveloping Mobile Apps for Performance - Swapnil Patel, Verizon Media
Developing Mobile Apps for Performance - Swapnil Patel, Verizon Media
 
Athenz - The Open-Source Solution to Provide Access Control in Dynamic Infras...
Athenz - The Open-Source Solution to Provide Access Control in Dynamic Infras...Athenz - The Open-Source Solution to Provide Access Control in Dynamic Infras...
Athenz - The Open-Source Solution to Provide Access Control in Dynamic Infras...
 
Athenz & SPIFFE, Tatsuya Yano, Yahoo Japan
Athenz & SPIFFE, Tatsuya Yano, Yahoo JapanAthenz & SPIFFE, Tatsuya Yano, Yahoo Japan
Athenz & SPIFFE, Tatsuya Yano, Yahoo Japan
 
Athenz with Istio - Single Access Control Model in Cloud Infrastructures, Tat...
Athenz with Istio - Single Access Control Model in Cloud Infrastructures, Tat...Athenz with Istio - Single Access Control Model in Cloud Infrastructures, Tat...
Athenz with Istio - Single Access Control Model in Cloud Infrastructures, Tat...
 
CICD at Oath using Screwdriver
CICD at Oath using ScrewdriverCICD at Oath using Screwdriver
CICD at Oath using Screwdriver
 
Big Data Serving with Vespa - Jon Bratseth, Distinguished Architect, Oath
Big Data Serving with Vespa - Jon Bratseth, Distinguished Architect, OathBig Data Serving with Vespa - Jon Bratseth, Distinguished Architect, Oath
Big Data Serving with Vespa - Jon Bratseth, Distinguished Architect, Oath
 
How @TwitterHadoop Chose Google Cloud, Joep Rottinghuis, Lohit VijayaRenu
How @TwitterHadoop Chose Google Cloud, Joep Rottinghuis, Lohit VijayaRenuHow @TwitterHadoop Chose Google Cloud, Joep Rottinghuis, Lohit VijayaRenu
How @TwitterHadoop Chose Google Cloud, Joep Rottinghuis, Lohit VijayaRenu
 
The Future of Hadoop in an AI World, Milind Bhandarkar, CEO, Ampool
The Future of Hadoop in an AI World, Milind Bhandarkar, CEO, AmpoolThe Future of Hadoop in an AI World, Milind Bhandarkar, CEO, Ampool
The Future of Hadoop in an AI World, Milind Bhandarkar, CEO, Ampool
 
Apache YARN Federation and Tez at Microsoft, Anupam Upadhyay, Adrian Nicoara,...
Apache YARN Federation and Tez at Microsoft, Anupam Upadhyay, Adrian Nicoara,...Apache YARN Federation and Tez at Microsoft, Anupam Upadhyay, Adrian Nicoara,...
Apache YARN Federation and Tez at Microsoft, Anupam Upadhyay, Adrian Nicoara,...
 
Containerized Services on Apache Hadoop YARN: Past, Present, and Future, Shan...
Containerized Services on Apache Hadoop YARN: Past, Present, and Future, Shan...Containerized Services on Apache Hadoop YARN: Past, Present, and Future, Shan...
Containerized Services on Apache Hadoop YARN: Past, Present, and Future, Shan...
 
HDFS Scalability and Security, Daryn Sharp, Senior Engineer, Oath
HDFS Scalability and Security, Daryn Sharp, Senior Engineer, OathHDFS Scalability and Security, Daryn Sharp, Senior Engineer, Oath
HDFS Scalability and Security, Daryn Sharp, Senior Engineer, Oath
 
Hadoop {Submarine} Project: Running deep learning workloads on YARN, Wangda T...
Hadoop {Submarine} Project: Running deep learning workloads on YARN, Wangda T...Hadoop {Submarine} Project: Running deep learning workloads on YARN, Wangda T...
Hadoop {Submarine} Project: Running deep learning workloads on YARN, Wangda T...
 
Moving the Oath Grid to Docker, Eric Badger, Oath
Moving the Oath Grid to Docker, Eric Badger, OathMoving the Oath Grid to Docker, Eric Badger, Oath
Moving the Oath Grid to Docker, Eric Badger, Oath
 
Architecting Petabyte Scale AI Applications
Architecting Petabyte Scale AI ApplicationsArchitecting Petabyte Scale AI Applications
Architecting Petabyte Scale AI Applications
 
Introduction to Vespa – The Open Source Big Data Serving Engine, Jon Bratseth...
Introduction to Vespa – The Open Source Big Data Serving Engine, Jon Bratseth...Introduction to Vespa – The Open Source Big Data Serving Engine, Jon Bratseth...
Introduction to Vespa – The Open Source Big Data Serving Engine, Jon Bratseth...
 
Jun 2017 HUG: YARN Scheduling – A Step Beyond
Jun 2017 HUG: YARN Scheduling – A Step BeyondJun 2017 HUG: YARN Scheduling – A Step Beyond
Jun 2017 HUG: YARN Scheduling – A Step Beyond
 
Jun 2017 HUG: Large-Scale Machine Learning: Use Cases and Technologies
Jun 2017 HUG: Large-Scale Machine Learning: Use Cases and Technologies Jun 2017 HUG: Large-Scale Machine Learning: Use Cases and Technologies
Jun 2017 HUG: Large-Scale Machine Learning: Use Cases and Technologies
 
February 2017 HUG: Slow, Stuck, or Runaway Apps? Learn How to Quickly Fix Pro...
February 2017 HUG: Slow, Stuck, or Runaway Apps? Learn How to Quickly Fix Pro...February 2017 HUG: Slow, Stuck, or Runaway Apps? Learn How to Quickly Fix Pro...
February 2017 HUG: Slow, Stuck, or Runaway Apps? Learn How to Quickly Fix Pro...
 
February 2017 HUG: Exactly-once end-to-end processing with Apache Apex
February 2017 HUG: Exactly-once end-to-end processing with Apache ApexFebruary 2017 HUG: Exactly-once end-to-end processing with Apache Apex
February 2017 HUG: Exactly-once end-to-end processing with Apache Apex
 
February 2017 HUG: Data Sketches: A required toolkit for Big Data Analytics
February 2017 HUG: Data Sketches: A required toolkit for Big Data AnalyticsFebruary 2017 HUG: Data Sketches: A required toolkit for Big Data Analytics
February 2017 HUG: Data Sketches: A required toolkit for Big Data Analytics
 

April 2014 HUG: Integrating HUE with Multi-tenant cluster

  • 1. INTEGRATE HUE WITH YOUR HADOOP CLUSTER Romain Rigaux Y! HUG Apr 16, 2014
  • 2. WHAT IS HUE? WEB INTERFACE FOR MAKING HADOOP EASIER TO USE
 Suite of apps for each Hadoop component, like Hive, Pig, Impala, Oozie, Solr, Sqoop2, HBase...
  • 3. VIEW FROM 30K FEET Hadoop, Web Server, You and even that friend that uses IE9 ;)
  • 5. TARGET OF HUE
 - GETTING STARTED WITH HADOOP
 - BEING PRODUCTIVE EXPLORING DIFFERENT ANGLES OF THE PLATFORM
 - LET ANY USER FOCUS ON BIG DATA PROCESSING
 - BEING COMPATIBLE WITH ANY HADOOP VERSION (0.20/1.2.0/2.3.0)
  • 6. OPEN SOURCE ~3000 COMMITS, 33 CONTRIBUTORS, 648 STARS, 212 FORKS
 github.com/cloudera/hue
  • 7. THE CORE TEAM PLAYERS team.gethue.com ABRAHAM ELMAHREK, ROMAIN RIGAUX, ENRICO BERTI, CHANG BEER
  • 8. AROUND THE WORLD
 TALKS Meetups and events in NYC, Paris, LA, Tokyo, SF, Stockholm, Vienna, San Jose, Singapore… Coming up in London, West coast
 RETREATS Nov 13 Koh Chang, Thailand; May 14 Curaçao, Netherlands Antilles
  • 9. FAST PACE LAST 30 DAYS 41 issues created and 38 resolved. Core team + Community
  • 11. HISTORY HUE 1 Desktop-like in a browser; did its job, but pretty slow, with memory leaks and not very IE friendly. Definitely advanced for its time (2009-2010).
  • 12. HISTORY HUE 2 The first flat structure port, with Twitter Bootstrap all over the place.
  • 13. HISTORY HUE 2.5 New apps and an improved UX, adding nice new functionalities like autocomplete and drag & drop.
  • 14. HISTORY HUE 3 ALPHA Proposed design, didn’t make it.
  • 15. HISTORY HUE 3.5+ Where we are now: new UI, several new apps, the most user friendly features to date.
  • 16. WHICH VERSION TO USE? 6 months; 1k commits later; 1-2 years old. HUE 2.X, HUE 3.X, HUE 3.5 + 1/2 3.6
  • 17. WHICH DISTRIBUTION?
 - GITHUB: Very latest (HACKER)
 - TARBALL: Advanced preview (ADVANCED USER)
 - CDH / CM: The most stable and cross component checked (NORMAL USER)
  • 18. WHERE TO PUT HUE? IN ONE MACHINE
  • 19. WHERE TO PUT HUE? INSIDE THE CLUSTER
  • 20. WHERE TO PUT HUE? OUTSIDE THE CLUSTER
  • 21. WHAT DO YOU NEED?
 SERVER Python 2.4 2.6. That’s it if using a packaged version. If building from the source, here are the extra packages
 CLIENT Web Browser: IE 9+, FF 10+, Chrome, Safari
  • 22. WHAT DOES THE HUE SERVICE LOOK LIKE?
 1 SERVER Process serving pages and also static content
 1 DB For cookies, saved queries, workflows, …
  • 23. HOW TO CONFIGURE HUE HUE.INI Similar to core-site.xml but with .INI syntax. Where? /etc/hue/conf/hue.ini or $HUE_HOME/desktop/conf/pseudo-distributed.ini
 [desktop]
   [[database]]
     # Database engine is typically one of:
     # postgresql_psycopg2, mysql, or sqlite3
     engine=sqlite3
     ## host=
     ## port=
     ## user=
     ## password=
     name=desktop/desktop.db
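The nested `[section]` / `[[subsection]]` layout shown on this slide can be read with a few lines of Python. What follows is a minimal sketch, not how Hue itself loads its configuration: the function name `parse_nested_ini` and the sample text are illustrative assumptions, and lines starting with `#` are simply treated as comments.

```python
def parse_nested_ini(text):
    """Parse bracket-nested INI text into nested dicts (illustrative sketch)."""
    root = {}
    stack = [root]  # stack[d] is the dict that sits at nesting depth d
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment (including '## key=' defaults)
        if line.startswith("["):
            depth = len(line) - len(line.lstrip("["))  # number of '[' = depth
            name = line.strip("[]").strip()
            if depth > len(stack):
                raise ValueError("section %r skips a nesting level" % name)
            del stack[depth:]  # climb back up to the parent of this section
            section = stack[-1].setdefault(name, {})
            stack.append(section)
        elif "=" in line:
            key, value = line.split("=", 1)
            stack[-1][key.strip()] = value.strip()
    return root


# Sample text copied from the slide (hypothetical file content).
SAMPLE = """
[desktop]
  [[database]]
    # Database engine is typically one of:
    # postgresql_psycopg2, mysql, or sqlite3
    engine=sqlite3
    ## host=
    name=desktop/desktop.db
"""

config = parse_nested_ini(SAMPLE)
print(config["desktop"]["database"]["engine"])
```

Commented-out defaults such as `## host=` are skipped, so only the values that are actually set appear in the result.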
  • 24. AUTHENTICATE / LOGIN
 [desktop]
   [[auth]]
     # - django.contrib.auth.backends.ModelBackend (entirely Django backend)
     # - desktop.auth.backend.AllowAllBackend (allows everyone)
     # - desktop.auth.backend.AllowFirstUserDjangoBackend
     # - desktop.auth.backend.LdapBackend
     # - desktop.auth.backend.OAuthBackend
     # ...
     ## backend=desktop.auth.backend.AllowFirstUserDjangoBackend
  • 25. USERS
 ADMIN Can give and revoke permissions to single users or groups of users
 USER Regular user + permissions
  • 27. LDAP BACKEND Integrate your employees: LDAP How to guide
  • 28. CONFIGURE APPS AND PERMISSIONS: LIST OF GROUPS AND PERMISSIONS A list of permissions. A permission can:
 - allow access to one app (e.g. Hive Editor)
 - modify data from the app (e.g. drop Hive Tables or edit cells in HBase Browser)
  • 29. CONFIGURE APPS AND PERMISSIONS: PERMISSIONS IN ACTION User ‘test’ belongs to the group ‘hiveonly’, which has just the ‘hive’ permissions.
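The group-based model on these two slides can be pictured as a lookup from user to groups to permissions. The sketch below is only an illustration of that idea: the dictionaries, the `can_access` helper and the permission strings are hypothetical and do not reflect Hue's actual data model.

```python
# Hypothetical data: which permissions each group grants,
# and which groups each user belongs to.
GROUP_PERMISSIONS = {"hiveonly": {"hive"}}
USER_GROUPS = {"test": {"hiveonly"}}


def can_access(user, app):
    """Return True if any of the user's groups grants access to the app."""
    groups = USER_GROUPS.get(user, set())
    return any(app in GROUP_PERMISSIONS.get(group, set()) for group in groups)


print(can_access("test", "hive"))   # user 'test' is in 'hiveonly'
print(can_access("test", "hbase"))  # no group of 'test' grants 'hbase'
```

Attaching permissions to groups rather than to individual users keeps the mapping small: adding an employee to `hiveonly` is enough to give them the Hive app and nothing else.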
  • 30. HOW HUE INTERACTS WITH HADOOP YARN, JobTracker, Oozie, Hue Plugins, LDAP, SAML, Pig, HDFS, HiveServer2, Hive Metastore, Cloudera Impala, Solr, HBase, Sqoop2, Zookeeper
  • 31. RPC CALLS TO ALL THE HADOOP COMPONENTS HDFS EXAMPLE: WebHDFS REST (NN, DN DN DN … DN)
 http://localhost:50070/webhdfs/v1/<PATH>?op=LISTSTATUS
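As a rough illustration of the WebHDFS call on this slide, the sketch below builds a `LISTSTATUS` URL and extracts entry names from a response body. The helper names and the sample JSON are assumptions made for the example; the `FileStatuses` / `FileStatus` / `pathSuffix` layout follows the WebHDFS REST format, and no request is actually sent.

```python
import json
from urllib.parse import quote


def liststatus_url(base, path):
    """Build the WebHDFS URL that lists the directory at `path`."""
    return "%s/webhdfs/v1%s?op=LISTSTATUS" % (base.rstrip("/"), quote(path))


def entry_names(response_text):
    """Return (name, type) pairs from a LISTSTATUS JSON response body."""
    statuses = json.loads(response_text)["FileStatuses"]["FileStatus"]
    return [(s["pathSuffix"], s["type"]) for s in statuses]


# Hypothetical response body, trimmed to the fields used here.
SAMPLE_RESPONSE = """
{"FileStatuses": {"FileStatus": [
  {"pathSuffix": "logs", "type": "DIRECTORY"},
  {"pathSuffix": "data.csv", "type": "FILE"}
]}}
"""

print(liststatus_url("http://localhost:50070", "/tmp"))
print(entry_names(SAMPLE_RESPONSE))
```

In a real setup the URL would be fetched over HTTP from the NameNode, which is what lets a web server talk to HDFS without a native Hadoop client.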
  • 32. RPC CALLS TO ALL THE HADOOP COMPONENTS HOW The host/port of each service API (Oozie, YARN, HDFS, HBase…) is specified in hue.ini, in one section per major service (e.g. [hbase]), Hue core [desktop] or Hue lib [liboozie]. Full list
 [hbase]
   # Comma-separated list of HBase Thrift servers for
   # clusters in the format of '(name|host:port)'.
   hbase_clusters=(Cluster|localhost:9090)
 [liboozie]
   # The URL where the Oozie service runs on.
   # oozie_url=http://hue.ent.cloudera.com:11000/oozie
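The `hbase_clusters` value uses a comma-separated `(name|host:port)` format. A small sketch of how such a value could be split into its parts is shown below; `parse_hbase_clusters` and its regular expression are illustrative assumptions, not Hue's own parser, and the pattern does not handle IPv6 literals.

```python
import re

# One '(name|host:port)' entry; name, host and port become capture groups.
_CLUSTER_RE = re.compile(r"\(\s*([^|()]+?)\s*\|\s*([^:()]+?)\s*:\s*(\d+)\s*\)")


def parse_hbase_clusters(value):
    """Split an hbase_clusters value into (name, host, port) tuples."""
    return [(name, host, int(port))
            for name, host, port in _CLUSTER_RE.findall(value)]


print(parse_hbase_clusters("(Cluster|localhost:9090)"))
```

Several clusters would be written one after another, for example `(A|h1:9090),(B|h2:9091)` with hypothetical hosts, and each entry yields one tuple.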
  • 33. KERBEROS 1 Hue ticket/principal, no user ticket. Hue uses its ticket for authenticating to every other service (HDFS, Oozie, …)
 read more on the Hue Security Guide
  • 34. HUE KERBEROS TICKET
 Add Hue user principal to Kerberos
   kadmin: addprinc -randkey hue/hue.server.fully.qualified.domain.name@YOUR-REALM.COM
 Test
   $ kinit -k -t /etc/hue/hue.keytab hue/hue.server.fully.qualified.domain.name@YOUR-REALM.COM
 Ticket should be renewable (krb5.conf and kdc.conf)
 hue.ini
 [desktop]
   [[kerberos]]
     # Path to Hue's Kerberos keytab file
     hue_keytab=/etc/hue/hue.keytab
     # Kerberos principal name for Hue
     hue_principal=hue/FQDN@REALM
     # add kinit path for non root users
     kinit_path=/usr/kerberos/bin/kinit
  • 35. IMPERSONATION HOW Hue is a “super proxy”
 Client could be on a Windows machine, phone… and interact with all the Hadoop services
 Call for getting the information about an HDFS file
   http://localhost:50070/webhdfs/v1/tmp?op=GETFILESTATUS&user.name=hue&doas=bob
 WebHDFS, add to core-site.xml
 <!-- Hue WebHDFS proxy user setting -->
 <property>
 <name>hadoop.proxyuser.hue.hosts</name>
 <value>*</value>
 </property>
 <property>
 <name>hadoop.proxyuser.hue.groups</name>
 <value>*</value>
 </property>
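The impersonated call on this slide combines two identities: `user.name` is the proxy user (`hue`) and `doas` is the end user being impersonated (`bob`). The following sketch shows how such a URL could be assembled; the function name `impersonated_url` and its parameters are assumptions for illustration, and nothing is sent over the network.

```python
from urllib.parse import quote, urlencode


def impersonated_url(base, path, op, proxy_user, end_user):
    """Build a WebHDFS URL where proxy_user acts on behalf of end_user."""
    query = urlencode({"op": op, "user.name": proxy_user, "doas": end_user})
    return "%s/webhdfs/v1%s?%s" % (base.rstrip("/"), quote(path), query)


print(impersonated_url("http://localhost:50070", "/tmp",
                       "GETFILESTATUS", "hue", "bob"))
```

The call is only accepted when the cluster trusts `hue` as a proxy user, which is what the `hadoop.proxyuser.hue.hosts` and `hadoop.proxyuser.hue.groups` properties in core-site.xml configure.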
  • 36. OTHER SECURITY FEATURES HTTPS, SSL DB, SSL WITH HIVESERVER2, AUDITING, … READ MORE
  • 37. HIGH AVAILABILITY HOW 2 Hue instances, HA proxy, Multi DB. Performances: like a website, mostly RPC calls
  • 39. SUM-UP
 - INSTALL: Install Hue on one machine
 - LDAP: Use an LDAP backend
 - CONFIGURE: Configure hue.ini to point to each Service API
 - ENABLE: Enable Hadoop Service APIs for Hue as a proxy user + Hue Kerberos ticket
 - HELP: Get help on @gethue or hue-user
  • 40. CONFIGURATIONS ARE HARD… …GIVE CLOUDERA MANAGER A TRY! vimeo.com/91805055
  • 43. GET HUE
 - CLOUDERA’S CDH: Stable and highly tested releases perfectly integrated with the Hadoop ecosystem, automagically configured by Cloudera Manager.
 - TARBALL: Try in advance the latest and greatest but you’ll have to configure everything on your own.
 - CLOUDERA’S DEMO VM: Get to play with Hue and various Hadoop components in 5 minutes. It’s a self contained CDH environment ready to use.
 - HORTONWORKS*: In HDP there’s an old forked version of Hue 2.3.
 - MAPR*: Newer version than HDP, close to the original 2.5 minus apps like HBase, Impala, Sqoop, Search.
 - HP CLOUD*: The newest addition, ships Hue 3.0 through the GreenButton products.
 - BIGTOP, EMBEDDED/DEMO IN IND. COMPANIES
 * YOUR MILEAGE MAY VARY.
  • 44. QUESTIONS? WHAT ARE YOUR USE CASES? WHICH COMPONENTS DO YOU USE? WHAT WOULD YOU LIKE TO SEE IN HUE? INTERESTED IN CONTRIBUTING? WANNA SAY HELLO? DO YOU WANT A TAILOR MADE TEAM RETREAT? TEAM@GETHUE.COM
  • 47. HDFS FILE BROWSER HOW Add Hue as a WebHDFS proxy user (see the IMPERSONATION slide)
 Add the property below in hdfs-site.xml to enable WebHDFS in the NameNode and DataNodes
 hdfs-site.xml
 <property>
 <name>dfs.webhdfs.enabled</name>
 <value>true</value>
 </property>
 hue.ini
 [hadoop]
   [[hdfs_clusters]]
     # HA support by using HttpFs
     [[[default]]]
       # Enter the filesystem uri
       ##fs_defaultfs=hdfs://localhost:8020
       # Use WebHdfs/HttpFs as the communication mechanism.
       ##webhdfs_url=http://localhost:50070/webhdfs/v1
  • 48. YARN / MR2 HOW Example of config for having Hue interact with Yarn
 [hadoop]
   [[yarn_clusters]]
     [[[default]]]
       # Enter the host on which you are running the ResourceManager
       resourcemanager_host=localhost
       # The port where the ResourceManager IPC listens on
       ## resourcemanager_port=8032
       # Whether to submit jobs to this cluster
       submit_to=True
       # Change this if your YARN cluster is Kerberos-secured
       ## security_enabled=false
       # URL of the ResourceManager API
       ## resourcemanager_api_url=http://localhost:8088
       # URL of the ProxyServer API
       ## proxy_api_url=http://localhost:8088
       # URL of the HistoryServer API
       # history_server_api_url=http://localhost:19888
     [[[ha]]]
       # Enter the host on which you are running the failover Resource Manager
       resourcemanager_api_url=http://localhost:8088
       ## logical_name=
       submit_to=True
  • 49. HIVE (IMPALA / SHARK) HOW Based on HiveServer2 interface. Video demo. Setup tutorial
 Note for Hive:
 <property>
 <name>hive.server2.enable.doAs</name>
 <value>true</value>
 </property>
 [beeswax]
   # Host where Hive server Thrift daemon is running.
   # If Kerberos security is enabled, use fully-qualified domain name (FQDN).
   ## hive_server_host=localhost
   ## hive_server_port=10000
   # Hive configuration directory, where hive-site.xml is located
   ## hive_conf_dir=/etc/hive/conf
  • 50. OOZIE HOW Make sure share lib is installed. Alternative Dashboard and Editors
 [liboozie]
   #oozie_url=http://localhost.com:11000/oozie
 PIG HOW Comes with Oozie, no PigServer yet. Oozie sharelib. Oozie credentials for security