BIG DATA WEB APPLICATIONS FOR INTERACTIVE HADOOP 
ENRICO BERTI 
UI ENGINEER CLOUDERA'S HUE
BIG DATA WEB APPS 
FOR INTERACTIVE 
HADOOP 
Enrico Berti 
Big Data Spain, Nov 17, 2014
GOAL OF HUE
WEB INTERFACE FOR ANALYZING DATA WITH APACHE HADOOP
SIMPLIFY AND INTEGRATE
FREE AND OPEN SOURCE
-> OPEN UP BIG DATA
VIEW FROM 30K FEET
Hadoop | Web Server | You, your colleagues and even that friend who uses IE9 ;)
OPEN SOURCE 
~4000 COMMITS 
56 CONTRIBUTORS 
911 STARS 
337 FORKS 
github.com/cloudera/hue
THE CORE TEAM PLAYERS
Romain Rigaux, Enrico Berti
Chang, Amstel, Longboard Lager, Dorada, San Miguel, …
Join us at team.gethue.com
AROUND THE WORLD
TALKS
Meetups and events in NYC, Paris, LA, Tokyo, SF, Stockholm, Vienna, San Jose, Singapore, Budapest, DC, Madrid…
RETREATS
Nov 13: Koh Chang, Thailand
May 14: Curaçao, Netherlands Antilles
Aug 14: Big Island, Hawaii
Nov 14: Tenerife, Spain
Nov 14: Nicaragua and Belize
Jan 15: Philippines
TREND: GROWTH 
gethue.com
HISTORY
HUE 1
Desktop-like in a browser. It did its job but was pretty slow, had memory leaks and was not very IE friendly, yet definitely advanced for its time (2009-2010).
HISTORY
HUE 2
The first flat structure port, with Twitter Bootstrap all over the place.
HUE 2.5
New apps and an improved UX, adding nice new functionality like autocomplete and drag & drop.
HISTORY
HUE 3 ALPHA
Proposed design, didn’t make it.
HISTORY
HUE 3.6+
Where we are now: a brand new way to search and explore your data.
WHICH DISTRIBUTION?
HACKER: GITHUB (very latest)
ADVANCED USER: TARBALL (advanced preview)
NORMAL USER: CDH / CM (the most stable and cross-component checked)
WHERE TO PUT HUE? IN ONE MACHINE
WHERE TO PUT HUE? OUTSIDE THE CLUSTER
WHERE TO PUT HUE? INSIDE THE CLUSTER
WHAT DO YOU NEED?
SERVER
Python 2.4, 2.6
That’s it if using a packaged version. If building from the source, here are the extra packages.
CLIENT
Web browser: IE 9+, FF 10+, Chrome, Safari
Hi there, I’m “just” a web server.
WHAT DOES THE HUE SERVICE LOOK LIKE?
1 SERVER: process serving pages and also static content
1 DB: for cookies, saved queries, workflows, …
Hi there, I’m “just” a web server.
HOW TO CONFIGURE HUE
HUE.INI
Similar to core-site.xml but with .INI syntax
Where?
/etc/hue/conf/hue.ini
or
$HUE_HOME/desktop/conf/pseudo-distributed.ini
[desktop] 
[[database]] 
# Database engine is typically one of: 
# postgresql_psycopg2, mysql, or sqlite3 
engine=sqlite3 
## host= 
## port= 
## user= 
## password= 
name=desktop/desktop.db
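A minimal sketch of how the nested [section] / [[subsection]] layout above can be read. Python's standard configparser has no notion of nested subsections, so this is a small hand-rolled reader for illustration only; parse_hue_ini is a hypothetical helper, not part of Hue, and it ignores everything except sections, comments and key=value lines.

```python
def parse_hue_ini(text):
    """Parse [section] / [[subsection]] INI text into nested dicts."""
    root = {}
    stack = [root]  # stack[d] is the dict at nesting depth d
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments (including '## key=' defaults)
        if line.startswith("["):
            depth = len(line) - len(line.lstrip("["))  # number of '[' = depth
            name = line.strip("[]").strip()
            del stack[depth:]  # leave any deeper or sibling section
            stack.append(stack[-1].setdefault(name, {}))
        elif "=" in line:
            key, value = line.split("=", 1)
            stack[-1][key.strip()] = value.strip()
    return root


SAMPLE = """
[desktop]
  [[database]]
    # Database engine is typically one of:
    # postgresql_psycopg2, mysql, or sqlite3
    engine=sqlite3
    ## host=
    name=desktop/desktop.db
"""

config = parse_hue_ini(SAMPLE)
print(config["desktop"]["database"]["engine"])
```

Commented-out keys such as "## host=" are skipped, so only the values that are actually set end up in the result.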
AUTHENTICATION
SIMPLE: Login/Password in a Database (SQLite, MySQL, …)
ENTERPRISE: LDAP (most used), OAuth, OpenID, SAML
DB BACKEND
LDAP BACKEND
Integrate your employees: LDAP How-to guide
USERS
ADMIN USER
Can give and revoke permissions to single users or groups of users
Regular user + permissions
CONFIGURE APPS AND PERMISSIONS
LIST OF GROUPS AND PERMISSIONS
A permission can:
- allow access to one app (e.g. Hive Editor)
- modify data from the app (e.g. drop Hive tables or edit cells in HBase Browser)
A list of permissions
CONFIGURE APPS AND PERMISSIONS
PERMISSIONS IN ACTION
User ‘test’ belongs to the group ‘hiveonly’, which has just the ‘hive’ permissions
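The group-based model described above can be sketched in a few lines. This is an illustrative model only, not Hue's implementation: the dictionaries, the 'analysts' group and the function names are all hypothetical, and only the user 'test', the group 'hiveonly' and the 'hive' permission come from the slide.

```python
# Hypothetical data: group -> set of app permissions, user -> list of groups.
GROUP_PERMISSIONS = {
    "hiveonly": {"hive"},
    "analysts": {"hive", "hbase", "search"},  # made-up second group
}
USER_GROUPS = {"test": ["hiveonly"]}


def allowed_apps(user):
    """Union of the permissions of every group the user belongs to."""
    apps = set()
    for group in USER_GROUPS.get(user, []):
        apps |= GROUP_PERMISSIONS.get(group, set())
    return apps


def can_access(user, app):
    """True if at least one of the user's groups grants access to the app."""
    return app in allowed_apps(user)


print(can_access("test", "hive"))
print(can_access("test", "hbase"))
```

Because a user's rights are the union over groups, revoking a permission from a group affects every member at once.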
HOW HUE INTERACTS WITH HADOOP
YARN, JobTracker, Oozie, LDAP, SAML, Hue Plugins, Pig, HDFS, HiveServer2, Hive Metastore, ZooKeeper, Cloudera Impala, Sqoop2, HBase, Solr
RPC CALLS TO ALL THE HADOOP COMPONENTS
HDFS EXAMPLE
WebHDFS REST
NN | DN, DN, DN, …, DN
http://localhost:50070/webhdfs/v1/<PATH>?op=LISTSTATUS
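As a rough sketch, the LISTSTATUS call above can be built and its JSON reply unpacked as follows. The helper names are hypothetical and the sample reply is a trimmed, made-up example; only the FileStatuses / FileStatus / pathSuffix / type layout follows the WebHDFS REST API. Nothing is sent over the network here.

```python
import json
from urllib.parse import quote


def liststatus_url(path, host="localhost", port=50070):
    """Build the WebHDFS URL that lists the directory at `path`."""
    return "http://%s:%d/webhdfs/v1%s?op=LISTSTATUS" % (host, port, quote(path))


def parse_liststatus(body):
    """Return (name, type) pairs from a LISTSTATUS JSON reply."""
    statuses = json.loads(body)["FileStatuses"]["FileStatus"]
    return [(status["pathSuffix"], status["type"]) for status in statuses]


# Trimmed, made-up reply; a real one carries more fields per entry.
SAMPLE_RESPONSE = json.dumps({
    "FileStatuses": {
        "FileStatus": [
            {"pathSuffix": "logs", "type": "DIRECTORY"},
            {"pathSuffix": "a.txt", "type": "FILE"},
        ]
    }
})

print(liststatus_url("/user/hue"))
print(parse_liststatus(SAMPLE_RESPONSE))
```

Against a live NameNode the URL could be fetched with urllib.request.urlopen and the body passed to parse_liststatus.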
RPC CALLS TO ALL THE HADOOP COMPONENTS
HOW
List the host/port of every Hadoop API in the hue.ini
For example, here are HBase and Hive.
Full list
[hbase]
# Comma-separated list of HBase Thrift servers for
# clusters in the format of '(name|host:port)'.
hbase_clusters=(Cluster|localhost:9090)

[beeswax]
hive_server_host=host-abc
hive_server_port=10000
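A small sketch of how the '(name|host:port)' format of hbase_clusters could be split into usable pieces. parse_hbase_clusters is a hypothetical helper written for this example, not Hue's own parsing code, and the second cluster in the usage line is made up.

```python
def parse_hbase_clusters(value):
    """Turn '(name|host:port),(name|host:port)' into {name: (host, port)}."""
    clusters = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate trailing commas
        name, address = item.strip("()").split("|", 1)
        host, port = address.rsplit(":", 1)
        clusters[name] = (host, int(port))
    return clusters


print(parse_hbase_clusters("(Cluster|localhost:9090)"))
print(parse_hbase_clusters("(Prod|hbase-1:9090),(Test|hbase-2:9091)"))
```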
SECURITY FEATURES
HTTPS, SSL DB, SSL WITH HIVESERVER2, SENTRY, KERBEROS
READ MORE …
HIGH AVAILABILITY
HOW
2 Hue instances
HA proxy
Multi DB
Performance: like a website, mostly RPC calls
FULL SUITE OF APPS
HBASE BROWSER
WHAT
Simple custom query language
Supports HBase filter language
Supports selection & Copy + Paste, gracefully degrades in IE
Autocomplete
Help Menu
Searchbar Syntax Breakdown: Row Key, Prefix Scan, Scan Length, Thrift Filterstring, Column/Family Filters
SQL
WHAT
Impala, Hive integration, Spark
Interactive SQL editor
Integration with MapReduce, Metastore, HDFS
SENTRY APP
SEARCH
WHAT
Solr & Cloud integration
Custom interactive dashboards
Drag & drop widgets (charts, timeline…)
JUST A VIEW 
ON TOP OF SOLR API 
REST
HISTORY 
V1 USER
HISTORY 
V1 ADMIN
HISTORY 
V2 USER
HISTORY 
V2 ADMIN
ARCHITECTURE
REST: /select, /admin/collections, /get, /luke...
AJAX: /add_widget, /zoom_in, /select_facet, /select_range...
www…: Templates + JS Model
ARCHITECTURE
UI FOR FACETS
LAYOUT: All the 2D positioning (cell ids), visual, drag&drop
COLLECTION: Dashboard, fields, template, widgets (ids)
QUERY: Search terms, selected facets (q, fqs)
ADDING A WIDGET
LIFECYCLE
Load the initial page
Edit mode and Drag&Drop
REST: /solr/zookeeper/clusterstate.json, /solr/admin/luke…
AJAX: /get_collection
ADDING A WIDGET
LIFECYCLE
Select the field
REST: /solr/select?stats=true
AJAX: /new_facet
Guess ranges (number or dates)
Rounding (number or dates)
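One possible way to "guess" and round a numeric range from the min/max that a stats query returns is sketched below. This is an assumption about the general idea, not Hue's actual algorithm: guess_range, its slots parameter and the rounding rule are all hypothetical, and dates are not handled.

```python
import math


def guess_range(min_value, max_value, slots=10):
    """Pick a rounded (start, end, gap) covering [min_value, max_value]."""
    span = max_value - min_value
    if span <= 0:
        return min_value, min_value + slots, 1  # degenerate field: unit buckets
    raw_gap = span / slots
    # Round the gap up to one significant digit, e.g. 876542 -> 900000.
    magnitude = 10 ** math.floor(math.log10(raw_gap))
    gap = math.ceil(raw_gap / magnitude) * magnitude
    # Snap both ends outwards to multiples of the gap.
    start = math.floor(min_value / gap) * gap
    end = math.ceil(max_value / gap) * gap
    return start, end, gap


print(guess_range(12, 8765432))
```

With these made-up inputs the sketch yields start, end and gap values of the same shape as the facet.range parameters used for the bytes field in this deck.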
ADDING A WIDGET
LIFECYCLE
Query part 1
facet.range={!ex=bytes}bytes&f.bytes.facet.range.start=0&f.bytes.facet.range.end=9000000&f.bytes.facet.range.gap=900000&f.bytes.facet.mincount=0&f.bytes.facet.limit=10
Query part 2
q=Chrome&fq={!tag=bytes}bytes:[900000+TO+1800000]
Augment Solr response
{
  'facet_counts':{
    'facet_ranges':{
      'bytes':{
        'start':10000,
        'counts':[
          '900000',
          3423,
          '1800000',
          339,
          ...
        ]
      }
    }
  }
}
{
  ...,
  'normalized_facets':[
    {
      'extraSeries':[],
      'label':'bytes',
      'field':'bytes',
      'counts':[
        {
          'from':'900000',
          'to':'1800000',
          'selected':True,
          'value':3423,
          'field':'bytes',
          'exclude':False
        }
      ], ...
    }
  ]
}
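The augmentation step above, turning Solr's flat [start, count, start, count, ...] list into one object per bucket, can be sketched like this. It is a minimal illustration under assumptions: normalize_range_counts and its parameters are hypothetical names, and the real code may compute the fields differently.

```python
def normalize_range_counts(field, flat_counts, gap, selected_from=None):
    """Pair up Solr's flat range counts and add from/to/selected metadata."""
    starts = flat_counts[0::2]  # bucket lower bounds, as strings
    values = flat_counts[1::2]  # document counts per bucket
    normalized = []
    for start, value in zip(starts, values):
        normalized.append({
            "from": start,
            "to": str(int(start) + gap),  # upper bound = lower bound + gap
            "selected": start == selected_from,
            "value": value,
            "field": field,
            "exclude": False,
        })
    return normalized


buckets = normalize_range_counts(
    "bytes", ["900000", 3423, "1800000", 339], gap=900000, selected_from="900000"
)
print(buckets[0])
```

The selected flag marks the bucket that matches the active filter query, which is what lets the widget highlight it.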
JSON TO WIDGET
{
  "field":"rate_code",
  "counts":[
    {
      "count":97797,
      "exclude":true,
      "selected":false,
      "value":"1",
      "cat":"rate_code"
    } ...
{
  "field":"medallion",
  "counts":[
    {
      "count":159,
      "exclude":true,
      "selected":false,
      "value":"6CA28FC49A4C49A9A96",
      "cat":"medallion"
    } ...
{
  "extraSeries":[],
  "label":"trip_time_in_secs",
  "field":"trip_time_in_secs",
  "counts":[
    {
      "from":"0",
      "to":"10",
      "selected":false,
      "value":527,
      "field":"trip_time_in_secs",
      "exclude":true
    } ...
{
  "field":"passenger_count",
  "counts":[
    {
      "count":74766,
      "exclude":true,
      "selected":false,
      "value":"1",
      "cat":"passenger_count"
    } ...
REPEAT UNTIL…
ENTERPRISE FEATURES
- Access to Search App configurable, LDAP/SAML auths
- Share by link
- Solr Cloud (or non Cloud)
- Proxy user
  /solr/jobs_demo/select?user.name=hue&doAs=romain&q=
- Security: Kerberos
- Sentry: collection level, Solr calls like /admin, /query, Solr UI, ZooKeeper
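The proxy-user URL shown above combines the service account (user.name) with the end user being impersonated (doAs). A minimal sketch of assembling it, where solr_select_url and its parameter names are hypothetical:

```python
from urllib.parse import urlencode


def solr_select_url(collection, query, end_user, service_user="hue"):
    """Build a Solr /select path where service_user acts on behalf of end_user."""
    params = urlencode({"user.name": service_user, "doAs": end_user, "q": query})
    return "/solr/%s/select?%s" % (collection, params)


print(solr_select_url("jobs_demo", "", "romain"))
```

Using urlencode keeps search terms with spaces or special characters safely escaped.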
SPARK IGNITER
HISTORY
OCT 2013
Submit through Oozie
Shell-like for Java, Scala, Python
HISTORY
JAN 2014: V2 Spark Igniter, Spark 0.8, Java, Scala with Spark Job Server
APR 2014: Spark 0.9
JUN 2014: Ironing + How to deploy
“JUST A VIEW” ON TOP OF SPARK
Hue | Job Server: submit, list apps, list jobs, list contexts
Saved script metadata, e.g. name, args, classname, jar name…
HOW TO TALK TO SPARK?
Hue -> Spark Job Server -> Spark
APP LIFE CYCLE
Hue -> Spark Job Server -> Spark
APP LIFE CYCLE
… extend SparkJob (.scala) -> sbt package -> JAR -> Upload
Context: create context, auto or manual
SPARK JOB SERVER
WHAT
REST job server for Spark
WHERE
https://github.com/ooyala/spark-jobserver
WHEN
Spark Summit talk Monday 5:45pm: Spark Job Server: Easy Spark Job Management by Ooyala
curl -d "input.string = a b c a b see" 'localhost:8090/jobs?appName=test&classPath=spark.jobserver.WordCountExample'
{
  "status": "STARTED",
  "result": {
    "jobId": "5453779a-f004-45fc-a11d-a39dae0f9bf4",
    "context": "b7ea0eb5-spark.jobserver.WordCountExample"
  }
}
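The curl call on this slide can be approximated in Python as follows. This is a sketch only: build_job_request and job_id_from_response are hypothetical helpers, the request is built but not sent, and the sample reply simply repeats the JSON shown on the slide.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_job_request(app_name, class_path, job_config,
                      base_url="http://localhost:8090"):
    """Build the POST to /jobs that submits a job to the Spark Job Server."""
    query = urlencode({"appName": app_name, "classPath": class_path})
    return Request("%s/jobs?%s" % (base_url, query),
                   data=job_config.encode("utf-8"), method="POST")


def job_id_from_response(body):
    """Extract the job id from a 'STARTED' reply, or fail loudly."""
    reply = json.loads(body)
    if reply.get("status") != "STARTED":
        raise RuntimeError("job not started: %r" % reply.get("status"))
    return reply["result"]["jobId"]


request = build_job_request("test", "spark.jobserver.WordCountExample",
                            "input.string = a b c a b see")

# Reply copied from the slide, used here instead of a live server.
SAMPLE_REPLY = json.dumps({
    "status": "STARTED",
    "result": {
        "jobId": "5453779a-f004-45fc-a11d-a39dae0f9bf4",
        "context": "b7ea0eb5-spark.jobserver.WordCountExample",
    },
})

print(request.full_url)
print(job_id_from_response(SAMPLE_REPLY))
```

With a job server running, urllib.request.urlopen(request) would send the request and return the reply body to parse.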
FOCUS ON UX 
curl -d "input.string = a b c a b see" 'localhost:8090/jobs?appName=test&classPath=spark.jobserver.WordCountExample'
{ 
"status": "STARTED", 
"result": { 
"jobId": "5453779a-f004-45fc-a11d-a39dae0f9bf4", 
"context": "b7ea0eb5-spark.jobserver.WordCountExample" 
} 
} 
VS
TRAIT SPARKJOB
/**
 * This trait is the main API for Spark jobs submitted to the Job Server.
 */
trait SparkJob {
  /**
   * This is the entry point for a Spark Job Server to execute Spark jobs.
   */
  def runJob(sc: SparkContext, jobConfig: Config): Any

  /**
   * This method is called by the job server to allow jobs to validate their input and reject
   * invalid job requests.
   */
  def validate(sc: SparkContext, config: Config): SparkJobValidation
}
DEMO 
TIME
SUM-UP
INSTALL: Install Hue on one machine
ENABLE: Enable Hadoop Service APIs for Hue as a proxy user
CONFIGURE: Configure hue.ini to point to each Service API
LDAP: Use an LDAP backend
HELP: Get help on @gethue or hue-user
ROADMAP
NEXT 6 MONTHS
WHAT
Oozie v2
Spark v2
SQL v2
More dashboards!
Inter-component integrations (HBase <-> Search, create index wizards, document permissions), Hadoop Web apps SDK
Your idea here.
CONFIGURATIONS ARE HARD… 
…GIVE CLOUDERA MANAGER A TRY! 
vimeo.com/91805055
MISSED 
SOMETHING? 
learn.gethue.com
GRACIAS!
WEBSITE http://gethue.com
LEARN http://learn.gethue.com
TWITTER @gethue
USER GROUP hue-user@
17TH ~ 18TH NOV 2014
MADRID (SPAIN)

More Related Content

What's hot

Terraform Introduction
Terraform IntroductionTerraform Introduction
Terraform Introductionsoniasnowfrog
 
Terraform 0.9 + good practices
Terraform 0.9 + good practicesTerraform 0.9 + good practices
Terraform 0.9 + good practicesRadek Simko
 
Beeswax Hive editor in Hue
Beeswax Hive editor in HueBeeswax Hive editor in Hue
Beeswax Hive editor in HueRomain Rigaux
 
Ground Control to Nomad Job Dispatch
Ground Control to Nomad Job DispatchGround Control to Nomad Job Dispatch
Ground Control to Nomad Job DispatchMichael Lange
 
AnsibleFest 2014 - Role Tips and Tricks
AnsibleFest 2014 - Role Tips and TricksAnsibleFest 2014 - Role Tips and Tricks
AnsibleFest 2014 - Role Tips and Tricksjimi-c
 
"Continuously delivering infrastructure using Terraform and Packer" training ...
"Continuously delivering infrastructure using Terraform and Packer" training ..."Continuously delivering infrastructure using Terraform and Packer" training ...
"Continuously delivering infrastructure using Terraform and Packer" training ...Anton Babenko
 
Infrastructure as Code: Introduction to Terraform
Infrastructure as Code: Introduction to TerraformInfrastructure as Code: Introduction to Terraform
Infrastructure as Code: Introduction to TerraformAlexander Popov
 
Creating and Deploying Static Sites with Hugo
Creating and Deploying Static Sites with HugoCreating and Deploying Static Sites with Hugo
Creating and Deploying Static Sites with HugoBrian Hogan
 
A Hands-on Introduction on Terraform Best Concepts and Best Practices
A Hands-on Introduction on Terraform Best Concepts and Best Practices A Hands-on Introduction on Terraform Best Concepts and Best Practices
A Hands-on Introduction on Terraform Best Concepts and Best Practices Nebulaworks
 
Terraform in deployment pipeline
Terraform in deployment pipelineTerraform in deployment pipeline
Terraform in deployment pipelineAnton Babenko
 
Everything as Code with Terraform
Everything as Code with TerraformEverything as Code with Terraform
Everything as Code with TerraformAll Things Open
 
Create Development and Production Environments with Vagrant
Create Development and Production Environments with VagrantCreate Development and Production Environments with Vagrant
Create Development and Production Environments with VagrantBrian Hogan
 
Using Terraform.io (Human Talks Montpellier, Epitech, 2014/09/09)
Using Terraform.io (Human Talks Montpellier, Epitech, 2014/09/09)Using Terraform.io (Human Talks Montpellier, Epitech, 2014/09/09)
Using Terraform.io (Human Talks Montpellier, Epitech, 2014/09/09)Stephane Jourdan
 
Ansible : what's ansible & use case by REX
Ansible :  what's ansible & use case by REXAnsible :  what's ansible & use case by REX
Ansible : what's ansible & use case by REXSaewoong Lee
 
Ansible leveraging 2.0
Ansible leveraging 2.0Ansible leveraging 2.0
Ansible leveraging 2.0bcoca
 
DevOps and Chef
DevOps and ChefDevOps and Chef
DevOps and ChefPiXeL16
 
Getting Started with Ansible
Getting Started with AnsibleGetting Started with Ansible
Getting Started with Ansibleahamilton55
 
V2 and beyond
V2 and beyondV2 and beyond
V2 and beyondjimi-c
 

What's hot (20)

Terraform Introduction
Terraform IntroductionTerraform Introduction
Terraform Introduction
 
Terraform 0.9 + good practices
Terraform 0.9 + good practicesTerraform 0.9 + good practices
Terraform 0.9 + good practices
 
Beeswax Hive editor in Hue
Beeswax Hive editor in HueBeeswax Hive editor in Hue
Beeswax Hive editor in Hue
 
Terraform at Scale
Terraform at ScaleTerraform at Scale
Terraform at Scale
 
Ground Control to Nomad Job Dispatch
Ground Control to Nomad Job DispatchGround Control to Nomad Job Dispatch
Ground Control to Nomad Job Dispatch
 
AnsibleFest 2014 - Role Tips and Tricks
AnsibleFest 2014 - Role Tips and TricksAnsibleFest 2014 - Role Tips and Tricks
AnsibleFest 2014 - Role Tips and Tricks
 
"Continuously delivering infrastructure using Terraform and Packer" training ...
"Continuously delivering infrastructure using Terraform and Packer" training ..."Continuously delivering infrastructure using Terraform and Packer" training ...
"Continuously delivering infrastructure using Terraform and Packer" training ...
 
Infrastructure as Code: Introduction to Terraform
Infrastructure as Code: Introduction to TerraformInfrastructure as Code: Introduction to Terraform
Infrastructure as Code: Introduction to Terraform
 
Creating and Deploying Static Sites with Hugo
Creating and Deploying Static Sites with HugoCreating and Deploying Static Sites with Hugo
Creating and Deploying Static Sites with Hugo
 
Everything as Code with Terraform
Everything as Code with TerraformEverything as Code with Terraform
Everything as Code with Terraform
 
A Hands-on Introduction on Terraform Best Concepts and Best Practices
A Hands-on Introduction on Terraform Best Concepts and Best Practices A Hands-on Introduction on Terraform Best Concepts and Best Practices
A Hands-on Introduction on Terraform Best Concepts and Best Practices
 
Terraform in deployment pipeline
Terraform in deployment pipelineTerraform in deployment pipeline
Terraform in deployment pipeline
 
Everything as Code with Terraform
Everything as Code with TerraformEverything as Code with Terraform
Everything as Code with Terraform
 
Create Development and Production Environments with Vagrant
Create Development and Production Environments with VagrantCreate Development and Production Environments with Vagrant
Create Development and Production Environments with Vagrant
 
Using Terraform.io (Human Talks Montpellier, Epitech, 2014/09/09)
Using Terraform.io (Human Talks Montpellier, Epitech, 2014/09/09)Using Terraform.io (Human Talks Montpellier, Epitech, 2014/09/09)
Using Terraform.io (Human Talks Montpellier, Epitech, 2014/09/09)
 
Ansible : what's ansible & use case by REX
Ansible :  what's ansible & use case by REXAnsible :  what's ansible & use case by REX
Ansible : what's ansible & use case by REX
 
Ansible leveraging 2.0
Ansible leveraging 2.0Ansible leveraging 2.0
Ansible leveraging 2.0
 
DevOps and Chef
DevOps and ChefDevOps and Chef
DevOps and Chef
 
Getting Started with Ansible
Getting Started with AnsibleGetting Started with Ansible
Getting Started with Ansible
 
V2 and beyond
V2 and beyondV2 and beyond
V2 and beyond
 

Viewers also liked

Getting the best insights from your data using Apache Metamodel by Alberto Ro...
Getting the best insights from your data using Apache Metamodel by Alberto Ro...Getting the best insights from your data using Apache Metamodel by Alberto Ro...
Getting the best insights from your data using Apache Metamodel by Alberto Ro...Big Data Spain
 
The top five questions to ask about NoSQL. JONATHAN ELLIS at Big Data Spain 2012
The top five questions to ask about NoSQL. JONATHAN ELLIS at Big Data Spain 2012The top five questions to ask about NoSQL. JONATHAN ELLIS at Big Data Spain 2012
The top five questions to ask about NoSQL. JONATHAN ELLIS at Big Data Spain 2012Big Data Spain
 
ToroDB: Scaling PostgreSQL like MongoDB by Álvaro Hernández at Big Data Spain...
ToroDB: Scaling PostgreSQL like MongoDB by Álvaro Hernández at Big Data Spain...ToroDB: Scaling PostgreSQL like MongoDB by Álvaro Hernández at Big Data Spain...
ToroDB: Scaling PostgreSQL like MongoDB by Álvaro Hernández at Big Data Spain...Big Data Spain
 
State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...
State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...
State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...Big Data Spain
 
Location analytics by Marc Planaguma at Big Data Spain 2014
 Location analytics by Marc Planaguma at Big Data Spain 2014 Location analytics by Marc Planaguma at Big Data Spain 2014
Location analytics by Marc Planaguma at Big Data Spain 2014Big Data Spain
 
Intro to the Big Data Spain 2014 conference
Intro to the Big Data Spain 2014 conferenceIntro to the Big Data Spain 2014 conference
Intro to the Big Data Spain 2014 conferenceBig Data Spain
 
Dataflows: The abstraction that powers Big Data by Raul Castro Fernandez at ...
 Dataflows: The abstraction that powers Big Data by Raul Castro Fernandez at ... Dataflows: The abstraction that powers Big Data by Raul Castro Fernandez at ...
Dataflows: The abstraction that powers Big Data by Raul Castro Fernandez at ...Big Data Spain
 
Data warehouse modernization programme by TOBY WOOLFE at Big Data Spain 2014
 Data warehouse modernization programme by TOBY WOOLFE at Big Data Spain 2014 Data warehouse modernization programme by TOBY WOOLFE at Big Data Spain 2014
Data warehouse modernization programme by TOBY WOOLFE at Big Data Spain 2014Big Data Spain
 
Big Data the potential for data to improve service and business management by...
Big Data the potential for data to improve service and business management by...Big Data the potential for data to improve service and business management by...
Big Data the potential for data to improve service and business management by...Big Data Spain
 
Convergent Replicated Data Types in Riak 2.0
Convergent Replicated Data Types in Riak 2.0Convergent Replicated Data Types in Riak 2.0
Convergent Replicated Data Types in Riak 2.0Big Data Spain
 
CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN ...
CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN ...CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN ...
CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN ...Big Data Spain
 
Geospatial and bitemporal search in C* with pluggable Lucene index by Andrés ...
Geospatial and bitemporal search in C* with pluggable Lucene index by Andrés ...Geospatial and bitemporal search in C* with pluggable Lucene index by Andrés ...
Geospatial and bitemporal search in C* with pluggable Lucene index by Andrés ...Big Data Spain
 
A new streaming computation engine for real-time analytics by Michael Barton ...
A new streaming computation engine for real-time analytics by Michael Barton ...A new streaming computation engine for real-time analytics by Michael Barton ...
A new streaming computation engine for real-time analytics by Michael Barton ...Big Data Spain
 
Essential ingredients for real time stream processing @Scale by Kartik pParam...
Essential ingredients for real time stream processing @Scale by Kartik pParam...Essential ingredients for real time stream processing @Scale by Kartik pParam...
Essential ingredients for real time stream processing @Scale by Kartik pParam...Big Data Spain
 
Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015
Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015
Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015Big Data Spain
 
How to integrate Big Data onto an analytical portal, Big Data benchmarking fo...
How to integrate Big Data onto an analytical portal, Big Data benchmarking fo...How to integrate Big Data onto an analytical portal, Big Data benchmarking fo...
How to integrate Big Data onto an analytical portal, Big Data benchmarking fo...Big Data Spain
 
IAd-learning: A new e-learning platform by José Antonio Omedes at Big Data Sp...
IAd-learning: A new e-learning platform by José Antonio Omedes at Big Data Sp...IAd-learning: A new e-learning platform by José Antonio Omedes at Big Data Sp...
IAd-learning: A new e-learning platform by José Antonio Omedes at Big Data Sp...Big Data Spain
 
Analyzing organization e-mails in near real time using hadoop ecosystem tools...
Analyzing organization e-mails in near real time using hadoop ecosystem tools...Analyzing organization e-mails in near real time using hadoop ecosystem tools...
Analyzing organization e-mails in near real time using hadoop ecosystem tools...Big Data Spain
 
Big Data, analytics and 4th generation data warehousing by Martyn Jones at Bi...
Big Data, analytics and 4th generation data warehousing by Martyn Jones at Bi...Big Data, analytics and 4th generation data warehousing by Martyn Jones at Bi...
Big Data, analytics and 4th generation data warehousing by Martyn Jones at Bi...Big Data Spain
 
Processing large-scale graphs with Google(TM) Pregel by MICHAEL HACKSTEIN at...
 Processing large-scale graphs with Google(TM) Pregel by MICHAEL HACKSTEIN at... Processing large-scale graphs with Google(TM) Pregel by MICHAEL HACKSTEIN at...
Processing large-scale graphs with Google(TM) Pregel by MICHAEL HACKSTEIN at...Big Data Spain
 

Viewers also liked (20)

Getting the best insights from your data using Apache Metamodel by Alberto Ro...
Getting the best insights from your data using Apache Metamodel by Alberto Ro...Getting the best insights from your data using Apache Metamodel by Alberto Ro...
Getting the best insights from your data using Apache Metamodel by Alberto Ro...
 
The top five questions to ask about NoSQL. JONATHAN ELLIS at Big Data Spain 2012
The top five questions to ask about NoSQL. JONATHAN ELLIS at Big Data Spain 2012The top five questions to ask about NoSQL. JONATHAN ELLIS at Big Data Spain 2012
The top five questions to ask about NoSQL. JONATHAN ELLIS at Big Data Spain 2012
 
ToroDB: Scaling PostgreSQL like MongoDB by Álvaro Hernández at Big Data Spain...
ToroDB: Scaling PostgreSQL like MongoDB by Álvaro Hernández at Big Data Spain...ToroDB: Scaling PostgreSQL like MongoDB by Álvaro Hernández at Big Data Spain...
ToroDB: Scaling PostgreSQL like MongoDB by Álvaro Hernández at Big Data Spain...
 
State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...
State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...
State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...
 
Location analytics by Marc Planaguma at Big Data Spain 2014
 Location analytics by Marc Planaguma at Big Data Spain 2014 Location analytics by Marc Planaguma at Big Data Spain 2014
Location analytics by Marc Planaguma at Big Data Spain 2014
 
Intro to the Big Data Spain 2014 conference
Intro to the Big Data Spain 2014 conferenceIntro to the Big Data Spain 2014 conference
Intro to the Big Data Spain 2014 conference
 
Dataflows: The abstraction that powers Big Data by Raul Castro Fernandez at ...
 Dataflows: The abstraction that powers Big Data by Raul Castro Fernandez at ... Dataflows: The abstraction that powers Big Data by Raul Castro Fernandez at ...
Dataflows: The abstraction that powers Big Data by Raul Castro Fernandez at ...
 
Data warehouse modernization programme by TOBY WOOLFE at Big Data Spain 2014
 Data warehouse modernization programme by TOBY WOOLFE at Big Data Spain 2014 Data warehouse modernization programme by TOBY WOOLFE at Big Data Spain 2014
Data warehouse modernization programme by TOBY WOOLFE at Big Data Spain 2014
 
Big Data the potential for data to improve service and business management by...
Big Data the potential for data to improve service and business management by...Big Data the potential for data to improve service and business management by...
Big Data the potential for data to improve service and business management by...
 
Convergent Replicated Data Types in Riak 2.0
Convergent Replicated Data Types in Riak 2.0Convergent Replicated Data Types in Riak 2.0
Convergent Replicated Data Types in Riak 2.0
 
CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN ...
CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN ...CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN ...
CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN ...
 
Geospatial and bitemporal search in C* with pluggable Lucene index by Andrés ...
Geospatial and bitemporal search in C* with pluggable Lucene index by Andrés ...Geospatial and bitemporal search in C* with pluggable Lucene index by Andrés ...
Geospatial and bitemporal search in C* with pluggable Lucene index by Andrés ...
 
A new streaming computation engine for real-time analytics by Michael Barton ...
A new streaming computation engine for real-time analytics by Michael Barton ...A new streaming computation engine for real-time analytics by Michael Barton ...
A new streaming computation engine for real-time analytics by Michael Barton ...
 
Essential ingredients for real time stream processing @Scale by Kartik pParam...
Essential ingredients for real time stream processing @Scale by Kartik pParam...Essential ingredients for real time stream processing @Scale by Kartik pParam...
Essential ingredients for real time stream processing @Scale by Kartik pParam...
 
Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015
Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015
Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015
 
How to integrate Big Data onto an analytical portal, Big Data benchmarking fo...
How to integrate Big Data onto an analytical portal, Big Data benchmarking fo...How to integrate Big Data onto an analytical portal, Big Data benchmarking fo...
How to integrate Big Data onto an analytical portal, Big Data benchmarking fo...
 
IAd-learning: A new e-learning platform by José Antonio Omedes at Big Data Sp...
IAd-learning: A new e-learning platform by José Antonio Omedes at Big Data Sp...IAd-learning: A new e-learning platform by José Antonio Omedes at Big Data Sp...
IAd-learning: A new e-learning platform by José Antonio Omedes at Big Data Sp...
 
Analyzing organization e-mails in near real time using hadoop ecosystem tools...
Analyzing organization e-mails in near real time using hadoop ecosystem tools...Analyzing organization e-mails in near real time using hadoop ecosystem tools...
Analyzing organization e-mails in near real time using hadoop ecosystem tools...
 
Big Data, analytics and 4th generation data warehousing by Martyn Jones at Bi...
Big Data, analytics and 4th generation data warehousing by Martyn Jones at Bi...Big Data, analytics and 4th generation data warehousing by Martyn Jones at Bi...
Big Data, analytics and 4th generation data warehousing by Martyn Jones at Bi...
 
Processing large-scale graphs with Google(TM) Pregel by MICHAEL HACKSTEIN at...
 Processing large-scale graphs with Google(TM) Pregel by MICHAEL HACKSTEIN at... Processing large-scale graphs with Google(TM) Pregel by MICHAEL HACKSTEIN at...
Processing large-scale graphs with Google(TM) Pregel by MICHAEL HACKSTEIN at...
 

Similar to Big Data Web applications for Interactive Hadoop by ENRICO BERTI at Big Data Spain 2014

Be a microservices hero
Be a microservices heroBe a microservices hero
Be a microservices heroOpenRestyCon
 
Real-time search in Drupal with Elasticsearch @Moldcamp
Real-time search in Drupal with Elasticsearch @MoldcampReal-time search in Drupal with Elasticsearch @Moldcamp
Real-time search in Drupal with Elasticsearch @MoldcampAlexei Gorobets
 
Diseño y Desarrollo de APIs
Diseño y Desarrollo de APIsDiseño y Desarrollo de APIs
Diseño y Desarrollo de APIsRaúl Neis
 
Big Data Processing with .NET and Spark (SQLBits 2020)
Big Data Processing with .NET and Spark (SQLBits 2020)Big Data Processing with .NET and Spark (SQLBits 2020)
Big Data Processing with .NET and Spark (SQLBits 2020)Michael Rys
 
2.28.17 Introducing DSpace 7 Webinar Slides
2.28.17 Introducing DSpace 7 Webinar Slides2.28.17 Introducing DSpace 7 Webinar Slides
2.28.17 Introducing DSpace 7 Webinar SlidesDuraSpace
 
Building a Cloud Native Stack with EMR Spark, Alluxio, and S3
Building a Cloud Native Stack with EMR Spark, Alluxio, and S3Building a Cloud Native Stack with EMR Spark, Alluxio, and S3
Big Data Web applications for Interactive Hadoop by ENRICO BERTI at Big Data Spain 2014

  • 1. BIG DATA WEB APPLICATIONS FOR INTERACTIVE HADOOP ENRICO BERTI UI ENGINEER CLOUDERA'S HUE
  • 2. BIG DATA WEB APPS FOR INTERACTIVE HADOOP Enrico Berti Big Data Spain, Nov 17, 2014
  • 3. GOAL OF HUE: WEB INTERFACE FOR ANALYZING DATA WITH APACHE HADOOP. SIMPLIFY AND INTEGRATE. FREE AND OPEN SOURCE. -> OPEN UP BIG DATA
  • 4. VIEW FROM 30K FEET: Hadoop | Web Server | You, your colleagues and even that friend that uses IE9 ;)
  • 5. OPEN SOURCE: ~4000 COMMITS, 56 CONTRIBUTORS, 911 STARS, 337 FORKS. github.com/cloudera/hue
  • 6. THE CORE TEAM PLAYERS Romain Rigaux Chang Enrico Berti Amstel Join us at team.gethue.com Longboard Lager Dorada San Miguel ….
  • 7. AROUND THE WORLD. TALKS: Meetups and events in NYC, Paris, LA, Tokyo, SF, Stockholm, Vienna, San Jose, Singapore, Budapest, DC, Madrid… RETREATS: Nov 13 Koh Chang, Thailand; May 14 Curaçao, Netherlands Antilles; Aug 14 Big Island, Hawaii; Nov 14 Tenerife, Spain; Nov 14 Nicaragua and Belize; Jan 15 Philippines
  • 9. HISTORY: HUE 1. Desktop-like in a browser, did its job but pretty slow, memory leaks and not very IE friendly but definitely advanced for its time (2009-2010).
  • 10. HISTORY: HUE 2. The first flat structure port, with Twitter Bootstrap all over the place. HUE 2.5: New apps, improved the UX adding new nice functionalities like autocomplete and drag & drop.
  • 11. HISTORY HUE 3 ALPHA Proposed design, didn’t make it.
  • 12. HISTORY HUE 3.6+ Where we are now, a brand new way to search and explore your data.
  • 13. WHICH DISTRIBUTION? HACKER: GITHUB (very latest). ADVANCED USER: TARBALL (advanced preview). NORMAL USER: CDH / CM (the most stable and cross-component checked).
  • 14. WHERE TO PUT HUE? IN ONE MACHINE
  • 15. WHERE TO PUT HUE? OUTSIDE THE CLUSTER
  • 16. WHERE TO PUT HUE? INSIDE THE CLUSTER
  • 17. WHAT DO YOU NEED? SERVER: Python 2.4 2.6. That’s it if using a packaged version. If building from the source, here are the extra packages. CLIENT: Web Browser (IE 9+, FF 10+, Chrome, Safari). Hi there, I’m “just” a web server.
  • 18. WHAT DOES THE HUE SERVICE LOOK LIKE? 1 SERVER: process serving pages and also static content. 1 DB: for cookies, saved queries, workflows, … Hi there, I’m “just” a web server.
  • 19. HOW TO CONFIGURE HUE: HUE.INI. Similar to core-site.xml but with .INI syntax. Where? /etc/hue/conf/hue.ini or $HUE_HOME/desktop/conf/pseudo-distributed.ini
        [desktop]
          [[database]]
            # Database engine is typically one of:
            # postgresql_psycopg2, mysql, or sqlite3
            engine=sqlite3
            ## host=
            ## port=
            ## user=
            ## password=
            name=desktop/desktop.db
  • 20. AUTHENTICATION. SIMPLE: Login/Password in a Database (SQLite, MySQL, …). ENTERPRISE: LDAP (most used), OAuth, OpenID, SAML
  • 22. LDAP BACKEND Integrate your employees: LDAP How to guide
  • 23. USERS. ADMIN: Can give and revoke permissions to single users or groups of users. USER: Regular user + permissions
  • 24. CONFIGURE APPS AND PERMISSIONS: LIST OF GROUPS AND PERMISSIONS. A list of permissions. A permission can: - allow access to one app (e.g. Hive Editor) - modify data from the app (e.g. drop Hive Tables or edit cells in HBase Browser)
  • 25. CONFIGURE APPS AND PERMISSIONS PERMISSIONS IN ACTION User ‘test’ belonging to the group ‘hiveonly’ that has just the ‘hive’ permissions
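The group-and-permission model on the two slides above can be sketched in a few lines of Python. This is a hypothetical illustration, not Hue's actual implementation: the two dictionaries, the `can_access` helper and the `analysts` group are assumptions made for the example; only the user `test`, the group `hiveonly` and the `hive` permission come from the slide.

```python
# Hypothetical in-memory model of groups and app permissions (not Hue's real code).
GROUP_PERMISSIONS = {
    "hiveonly": {"hive"},                       # group from the slide
    "analysts": {"hive", "impala", "search"},   # made-up second group
}
USER_GROUPS = {
    "test": {"hiveonly"},                       # user from the slide
}


def can_access(user, app):
    """Return True if any of the user's groups grants access to the app."""
    groups = USER_GROUPS.get(user, set())
    return any(app in GROUP_PERMISSIONS.get(group, set()) for group in groups)
```

With this model, `can_access("test", "hive")` is true, while `can_access("test", "hbase")` is false because `hiveonly` carries only the `hive` permission.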
  • 26. HOW HUE INTERACTS WITH HADOOP: YARN, JobTracker, Oozie, LDAP, SAML, Hue Plugins, Pig, HDFS, HiveServer2, Hive Metastore, Zookeeper, Cloudera Impala, Sqoop2, HBase, Solr
  • 27. RPC CALLS TO ALL THE HADOOP COMPONENTS: HDFS EXAMPLE. WebHDFS REST. DN DN DN … DN NN. http://localhost:50070/webhdfs/v1/<PATH>?op=LISTSTATUS
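As a minimal sketch of the WebHDFS call on the slide above, the following Python builds the LISTSTATUS URL and parses a response of the shape WebHDFS documents (`FileStatuses` / `FileStatus`). The helper names and the sample payload values are assumptions for illustration; no request is sent, so the block runs without a cluster.

```python
import json
from urllib.parse import quote


def liststatus_url(host, port, path):
    """Build the WebHDFS URL that lists the directory at an absolute HDFS path."""
    return "http://{}:{}/webhdfs/v1{}?op=LISTSTATUS".format(host, port, quote(path))


def parse_liststatus(body):
    """Extract (name, type, size) tuples from a LISTSTATUS JSON response."""
    statuses = json.loads(body)["FileStatuses"]["FileStatus"]
    return [(s["pathSuffix"], s["type"], s["length"]) for s in statuses]


# Made-up sample payload in the documented response shape.
SAMPLE_RESPONSE = json.dumps({
    "FileStatuses": {
        "FileStatus": [
            {"pathSuffix": "logs", "type": "DIRECTORY", "length": 0},
            {"pathSuffix": "a.csv", "type": "FILE", "length": 1024},
        ]
    }
})

url = liststatus_url("localhost", 50070, "/user/hue")
entries = parse_liststatus(SAMPLE_RESPONSE)
```

In a real deployment the URL would be fetched with an HTTP client and the response body passed to `parse_liststatus`.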
  • 28. RPC CALLS TO ALL THE HADOOP COMPONENTS: HOW. List all the host/port of Hadoop APIs in the hue.ini. For example here HBase and Hive. Full list
        [hbase]
          # Comma-separated list of HBase Thrift servers for
          # clusters in the format of '(name|host:port)'.
          hbase_clusters=(Cluster|localhost:9090)
        [beeswax]
          hive_server_host=host-abc
          hive_server_port=10000
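The `'(name|host:port)'` format described in the hue.ini comment above can be read with a short regular expression. This is a sketch of one way to parse it, not Hue's own parser; the function name and the second cluster in the usage example are hypothetical.

```python
import re

# One cluster entry looks like "(name|host:port)".
CLUSTER_RE = re.compile(r"\(([^|)]+)\|([^:)]+):(\d+)\)")


def parse_hbase_clusters(value):
    """Turn a comma-separated hbase_clusters value into a list of dicts."""
    return [
        {"name": name.strip(), "host": host.strip(), "port": int(port)}
        for name, host, port in CLUSTER_RE.findall(value)
    ]


single = parse_hbase_clusters("(Cluster|localhost:9090)")
several = parse_hbase_clusters("(Cluster|localhost:9090),(Backup|host-b:9091)")
```

Here `single` holds one entry for `Cluster` on `localhost:9090`, and `several` shows how a comma-separated list yields one dictionary per cluster.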
  • 29. SECURITY FEATURES: HTTPS, SSL WITH HIVESERVER2, SSL DB, SENTRY, KERBEROS. READ MORE …
  • 30. HIGH AVAILABILITY: HOW. 2 Hue instances, HA proxy, Multi DB. Performances: like a website, mostly RPC calls
  • 32. HBASE BROWSER: WHAT. Simple custom query language. Supports HBase filter language. Supports selection & Copy + Paste, gracefully degrades in IE. Autocomplete. Help Menu. Searchbar Syntax Breakdown: Row Key, Prefix Scan, Scan Length, Thrift Filterstring, Column/Family Filters
  • 33. SQL: WHAT. Impala, Hive integration, Spark. Interactive SQL editor. Integration with MapReduce, Metastore, HDFS
  • 35. SEARCH: WHAT. Solr & Cloud integration. Custom interactive dashboards. Drag & drop widgets (charts, timeline…)
  • 36. JUST A VIEW ON TOP OF SOLR API REST
  • 41. ARCHITECTURE REST AJAX /select /admin/collections /get /luke... /add_widget /zoom_in /select_facet /select_range... www…. Templates + JS Model
  • 42. ARCHITECTURE UI FOR FACETS All the 2D positioning (cell ids), visual, drag&drop Dashboard, fields, template, widgets (ids) Search terms, selected facets (q, fqs) LAYOUT COLLECTION QUERY
  • 43. ADDING A WIDGET LIFECYCLE REST AJAX /solr/zookeeper/clusterstate.json /solr/admin/luke… /get_collection Load the initial page Edit mode and Drag&Drop
  • 44. ADDING A WIDGET LIFECYCLE Guess ranges (number or dates) Rounding (number or dates) REST AJAX Select the field /solr/select?stats=true /new_facet
  • 45. ADDING A WIDGET LIFECYCLE Query part 1 facet.range={!ex=bytes}bytes&f.bytes.facet.range.start=0&f.bytes.facet.range.end=9000000&f.bytes.facet.range.gap=900000&f.bytes.facet.mincount=0&f.bytes.facet.limit=10 Query part 2 q=Chrome&fq={!tag=bytes}bytes:[900000+TO+1800000] Augment Solr response { 'facet_counts': { 'facet_ranges': { 'bytes': { 'start': 10000, 'counts': [ '900000', 3423, '1800000', 339, ... ] } } { ..., 'normalized_facets': [ { 'extraSeries': [], 'label': 'bytes', 'field': 'bytes', 'counts': [ { 'from': '900000', 'to': '1800000', 'selected': True, 'value': 3423, 'field': 'bytes', 'exclude': False } ], ... } } }
  • 46. JSON TO WIDGET { "field": "rate_code", "counts": [ { "count": 97797, "exclude": true, "selected": false, "value": "1", "cat": "rate_code" } ... { "field": "medallion", "counts": [ { "count": 159, "exclude": true, "selected": false, "value": "6CA28FC49A4C49A9A96", "cat": "medallion" } ... { "extraSeries": [], "label": "trip_time_in_secs", "field": "trip_time_in_secs", "counts": [ { "from": "0", "to": "10", "selected": false, "value": 527, "field": "trip_time_in_secs", "exclude": true } ... { "field": "passenger_count", "counts": [ { "count": 74766, "exclude": true, "selected": false, "value": "1", "cat": "passenger_count" } ...
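The "augment Solr response" step above can be sketched roughly as follows. Solr returns range facet counts as a flat list alternating bucket start and count; the sketch pairs them up and adds the `from`/`to`/`selected` keys shown on the slide. The function name `normalize_range_facet` and its parameters are assumptions for illustration, not Hue's actual code.

```python
def normalize_range_facet(field, counts, gap, selected_from=None):
    """Convert Solr's flat [start, count, start, count, ...] list into bucket dicts.

    `gap` is the facet.range.gap used in the query; `selected_from` is the
    start of the bucket currently filtered on, if any (hypothetical parameter).
    """
    buckets = []
    # Even positions hold bucket starts, odd positions hold the counts.
    for start, count in zip(counts[0::2], counts[1::2]):
        buckets.append({
            "from": start,
            "to": str(int(start) + gap),
            "selected": start == selected_from,
            "value": count,
            "field": field,
            "exclude": False,
        })
    return buckets


# Values taken from the slide: gap of 900000, filter on [900000 TO 1800000].
facets = normalize_range_facet(
    "bytes", ["900000", 3423, "1800000", 339], gap=900000, selected_from="900000"
)
```

Each resulting dict carries everything a chart widget needs to draw one bar and to toggle its filter, which is presumably why the response is reshaped on the server rather than in the browser.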
  • 48. ENTERPRISE FEATURES - Access to Search App configurable, LDAP/SAML auths - Share by link - Solr Cloud (or non Cloud) - Proxy user /solr/jobs_demo/select?user.name=hue&doAs=romain&q= - Security Kerberos - Sentry Collection level, Solr calls like /admin, /query, Solr UI, ZooKeeper
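As a small illustration of the proxy-user call listed above, this sketch assembles the same request path, with the service user in `user.name` and the end user in `doAs`. The helper name `impersonated_select_path` is hypothetical.

```python
from urllib.parse import urlencode


def impersonated_select_path(collection, service_user, end_user, query=""):
    """Build a Solr /select path where service_user acts on behalf of end_user."""
    params = urlencode({"user.name": service_user, "doAs": end_user, "q": query})
    return f"/solr/{collection}/select?{params}"


path = impersonated_select_path("jobs_demo", "hue", "romain")
```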
  • 50. HISTORY OCT 2013 Submit through Oozie Shell like for Java, Scala, Python
  • 51. HISTORY JAN 2014 V2 Spark Igniter Spark 0.8 Java, Scala with Spark Job Server APR 2014 Spark 0.9 JUN 2014 Ironing + How to deploy
  • 52. “JUST A VIEW” ON TOP OF SPARK submit list apps list jobs list contexts Saved script metadata Hue Job Server eg. name, args, classname, jar name…
  • 53. HOW TO TALK TO SPARK? Hue Spark Job Server Spark
  • 54. APP LIFE CYCLE Hue Spark Job Server Spark
  • 55. … extend SparkJob .scala sbt _/package JAR Upload APP LIFE CYCLE
  • 56. … extend SparkJob .scala sbt _/package JAR Upload APP LIFE CYCLE Context create context: auto or manual
  • 57. SPARK JOB SERVER WHERE curl -d "input.string = a b c a b see" 'localhost:8090/jobs?appName=test&classPath=spark.jobserver.WordCountExample' { "status": "STARTED", "result": { "jobId": "5453779a-f004-45fc-a11d-a39dae0f9bf4", "context": "b7ea0eb5-spark.jobserver.WordCountExample" } } https://github.com/ooyala/spark-jobserver WHAT REST job server for Spark WHEN Spark Summit talk Monday 5:45pm: Spark Job Server: Easy Spark Job Management by Ooyala
  • 58. FOCUS ON UX curl -d "input.string = a b c a b see" 'localhost:8090/jobs?appName=test&classPath=spark.jobserver.WordCountExample' { "status": "STARTED", "result": { "jobId": "5453779a-f004-45fc-a11d-a39dae0f9bf4", "context": "b7ea0eb5-spark.jobserver.WordCountExample" } } VS
  • 59. TRAIT SPARKJOB
      /**
       * This trait is the main API for Spark jobs submitted to the Job Server.
       */
      trait SparkJob {
        /**
         * This is the entry point for a Spark Job Server to execute Spark jobs.
         */
        def runJob(sc: SparkContext, jobConfig: Config): Any

        /**
         * This method is called by the job server to allow jobs to validate their input and reject
         * invalid job requests.
         */
        def validate(sc: SparkContext, config: Config): SparkJobValidation
      }
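The curl call above can be mirrored in Python as a sketch. It only builds the POST request and parses the response shown on the slide; it does not contact a server. The helper names `build_job_request` and `job_id` are hypothetical, and the base URL assumes the same `localhost:8090` job server as the slide.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_job_request(app_name, class_path, job_input, base="http://localhost:8090"):
    """Prepare the POST /jobs request that submits a job to the Spark Job Server."""
    query = urlencode({"appName": app_name, "classPath": class_path})
    return Request(f"{base}/jobs?{query}", data=job_input.encode("utf-8"), method="POST")


def job_id(response_text):
    """Pull the job id out of the JSON body returned on submission."""
    return json.loads(response_text)["result"]["jobId"]


request = build_job_request(
    "test", "spark.jobserver.WordCountExample", "input.string = a b c a b see"
)

# Response body copied from the slide.
sample_response = """{ "status": "STARTED", "result": {
  "jobId": "5453779a-f004-45fc-a11d-a39dae0f9bf4",
  "context": "b7ea0eb5-spark.jobserver.WordCountExample" } }"""
submitted_id = job_id(sample_response)
```

Passing `request` to `urllib.request.urlopen` would submit the job for real, assuming a job server is listening on that address.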
  • 61. SUM-UP INSTALL: Install Hue on one machine ENABLE: Enable Hadoop Service APIs for Hue as a proxy user CONFIGURE: Configure hue.ini to point to each Service API LDAP: Use an LDAP backend HELP: Get help on @gethue or hue-user
  • 62. ROADMAP NEXT 6 MONTHS WHAT Oozie v2 Spark v2 SQL v2 More dashboards! Inter component integrations (HBase <-> Search, create index wizards, document permissions), Hadoop Web apps SDK Your idea here.
  • 63. CONFIGURATIONS ARE HARD… …GIVE CLOUDERA MANAGER A TRY! vimeo.com/91805055
  • 65. GRACIAS! WEBSITE http://gethue.com LEARN http://learn.gethue.com TWITTER @gethue USER GROUP hue-user@
  • 66. 17TH ~ 18th NOV 2014 MADRID (SPAIN)