SPARK AND SOLR IN HUE
Enrico Berti

INDEX AND VISUALIZE DATA IN A FEW CLICKS!
GOAL OF HUE
WEB INTERFACE FOR ANALYZING DATA WITH APACHE HADOOP
SIMPLIFY AND INTEGRATE
FREE AND OPEN SOURCE
-> OPEN UP BIG DATA
VIEW FROM 30K FEET
Hadoop Web Server
You, your colleagues and even that
friend that uses IE9 ;)
OPEN SOURCE
~5230 COMMITS
65 CONTRIBUTORS
1137 STARS
443 FORKS
github.com/cloudera/hue
TREND: GROWTH
gethue.com
WHERE TO PUT HUE? IN ONE MACHINE
WHERE TO PUT HUE? OUTSIDE THE CLUSTER
WHERE TO PUT HUE? INSIDE THE CLUSTER
FULL SUITE OF APPS
LIST OF GROUPS AND PERMISSIONS
A permission can:
- allow access to one app (e.g. Hive Editor)
- modify data from the app (e.g. drop Hive Tables or edit cells in HBase Browser)
CONFIGURE APPS AND PERMISSIONS
A list of permissions
PERMISSIONS IN ACTION
User ‘test’ belonging to the group ‘hiveonly’ that has just the ‘hive’ permissions
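The ‘hiveonly’ example can be modeled in a few lines. This is a minimal sketch and not Hue's actual implementation: the permission strings (such as `hive.access` and `hbase.write`), the `analysts` group, and the helper names are assumptions made for illustration.

```python
# Hypothetical model of group-based app permissions (not Hue's real code).
GROUP_PERMISSIONS = {
    "hiveonly": {"hive.access"},
    "analysts": {"hive.access", "hbase.access", "hbase.write"},
}
USER_GROUPS = {"test": {"hiveonly"}}


def user_permissions(user):
    """Return the union of the permissions of every group the user is in."""
    perms = set()
    for group in USER_GROUPS.get(user, set()):
        perms |= GROUP_PERMISSIONS.get(group, set())
    return perms


def can(user, app, action="access"):
    """True if one of the user's groups grants '<app>.<action>'."""
    return f"{app}.{action}" in user_permissions(user)
```

Under these assumptions, `can("test", "hive")` holds while `can("test", "hbase")` does not, which matches a user who only sees the Hive app.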
CONFIGURE APPS AND PERMISSIONS
• Impala, Hive integration
• Interactive SQL editor
• Integration with MapReduce, Metastore, HDFS
SQL
WHAT
• Authorization for SQL, HDFS URI
• Who can see and do what on the database/tables
• Edit security role/privileges
• Next: Solr
SENTRY APP
WHAT
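The "who can see and do what" question can be illustrated with a simplified role/privilege model. This sketch is not Sentry's code or API: the class, the role and group names, and the tables are hypothetical. It only assumes that groups hold roles, roles hold privileges, a database-level privilege covers the tables in that database, and ALL covers SELECT and INSERT.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Privilege:
    """Hypothetical privilege: an action on a database or on one table."""
    action: str                  # "SELECT", "INSERT" or "ALL"
    db: str
    table: Optional[str] = None  # None means the whole database

    def implies(self, action, db, table):
        if self.db != db:
            return False
        # A table-level privilege only covers that exact table.
        if self.table is not None and self.table != table:
            return False
        return self.action in ("ALL", action)


# Illustrative data: roles hold privileges, groups hold roles.
ROLE_PRIVILEGES = {
    "analyst": [Privilege("SELECT", "default", "tweets")],
    "admin": [Privilege("ALL", "default")],
}
GROUP_ROLES = {"hiveonly": ["analyst"], "ops": ["admin"]}


def is_allowed(groups, action, db, table=None):
    """True if any role of any group carries a privilege covering the request."""
    return any(
        priv.implies(action, db, table)
        for group in groups
        for role in GROUP_ROLES.get(group, ())
        for priv in ROLE_PRIVILEGES.get(role, ())
    )
```

Editing a role in this model means changing one entry of `ROLE_PRIVILEGES`; every group that holds the role picks up the change.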
• Solr & Cloud integration
• Custom interactive dashboards
• Drag & drop widgets (charts, timeline…)
SEARCH
WHAT
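Dashboard widgets such as charts are typically fed by faceted queries. As a rough sketch of the kind of request involved, the helper below builds a URL for Solr's standard `/select` handler using the `q`, `rows`, `wt`, `facet` and `facet.field` parameters. The helper name, the host and the `tweets` collection are assumptions, and this is not how Hue itself issues queries.

```python
from urllib.parse import urlencode


def solr_select_url(base_url, collection, query="*:*", facet_fields=(), rows=10):
    """Build a Solr /select URL, optionally asking for field facets."""
    params = [("q", query), ("rows", rows), ("wt", "json")]
    if facet_fields:
        params.append(("facet", "true"))
        # One facet.field parameter per field to count values for.
        params.extend(("facet.field", field) for field in facet_fields)
    return f"{base_url.rstrip('/')}/{collection}/select?{urlencode(params)}"


# Hypothetical host and collection, for illustration only.
url = solr_select_url("http://localhost:8983/solr", "tweets", "hue", ["lang"], rows=5)
```

The JSON response to such a request carries per-value counts for each facet field, which is the shape of data a bar or pie widget needs.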
• Open source REST for Spark Shell
• Runs locally or inside YARN
• Spark Scala, PySpark and jar/py submission
LIVY SERVER
WHAT
https://github.com/cloudera/hue/tree/master/apps/spark/java
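The three usage styles (interactive Scala, interactive PySpark, jar/py submission) map onto a handful of REST calls. The sketch below only builds the requests and does not send them. It follows the endpoint names and JSON fields documented for the Livy REST API (`/sessions`, `/sessions/{id}/statements`, `/batches`), which may differ from the version bundled with Hue at the time of this talk. The server URL and the helper names are assumptions.

```python
import json
from urllib.request import Request

LIVY_URL = "http://localhost:8998"  # assumed server address


def _post(path, payload):
    """Build (but do not send) a JSON POST request to the Livy server."""
    return Request(
        LIVY_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def create_session(kind="pyspark"):
    """Request an interactive session, e.g. kind 'spark' (Scala) or 'pyspark'."""
    return _post("/sessions", {"kind": kind})


def run_statement(session_id, code):
    """Request execution of a code snippet inside an existing session."""
    return _post(f"/sessions/{session_id}/statements", {"code": code})


def submit_batch(file_path):
    """Request a non-interactive run of a jar or .py file."""
    return _post("/batches", {"file": file_path})
```

Passing one of these objects to `urllib.request.urlopen` would send it to a running server; results of a statement are then fetched by polling the statement's URL.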
LIVY ARCH
[Architecture diagram: LOCAL and YARN modes]
LOCAL: Livy Server, Livy REPL, Spark Contexts, Spark Worker
YARN: Livy Server, YARN Master, YARN Node (Livy REPL, Spark Context / PySpark), YARN Node (Spark Worker), YARN Node (Spark Worker)
[Steps 1 to 4 are numbered in the diagram]
LIVY ARCH
Livy Server + =
• Scala
• Python
• Java
• SQL
• Text
SPARK NOTEBOOKS
WHAT
DEMO TIME
TWITTER
@gethue
USER GROUP
hue-user@
WEBSITE
http://gethue.com
LEARN
http://learn.gethue.com
THANKS!


Harness the power of Spark and Solr in Hue: Big Data Amsterdam v.2.0