A Web Application for interactive data analysis with Spark
How to build and use a Web application for interactive data analysis with Spark
A Hue Spark application was recently created. It lets users execute and monitor Spark jobs directly from their browser and be more productive.
The Spark application is based on the Spark Job Server, contributed by Ooyala at the last Spark Summit (2013). This new server enables real interactivity with Spark and is closer to the community.
This talk will describe the architecture of the application and demo several business use cases now made easy with this application.

Published in: Data & Analytics, Technology

Transcript

  • 1. A WEB APPLICATION FOR INTERACTIVE DATA ANALYSIS WITH SPARK Romain Rigaux Spark Summit, Jul 1, 2014
  • 2. GOAL OF HUE Web interface for analyzing data with Apache Hadoop. Simplify and integrate. Free and open source -> open up big data
  • 3. VIEW FROM 30K FEET Hadoop, Web Server, You, your colleagues and even that friend that uses IE9 ;)
  • 4. LATEST HUE Hue 3.6+ Where we are now, a brand new way to search and explore your data.
  • 5. SPARK IGNITER
  • 6. HISTORY OCT 2013: Submit through Oozie. Shell-like for Java, Scala, Python
  • 7. HISTORY JAN 2014: V2 Spark Igniter, Spark 0.8, Java and Scala with Spark Job Server. APR 2014: Spark 0.9. JUN 2014: Ironing + how to deploy
  • 8. “JUST A VIEW” ON TOP OF SPARK Hue: saved script metadata (e.g. name, args, classname, jar name…). Job Server: submit, list apps, list jobs, list contexts
  • 9. HOW TO TALK TO SPARK? Hue -> Spark Job Server -> Spark
  • 10. APP LIFE CYCLE Hue -> Spark Job Server -> Spark
  • 11. APP LIFE CYCLE … extend SparkJob -> .scala -> sbt _/package -> JAR -> Upload
  • 12. APP LIFE CYCLE … extend SparkJob -> .scala -> sbt _/package -> JAR -> Upload -> Context (create context: auto or manual)
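The life cycle on slides 11 and 12 (package a JAR, upload it, create a context automatically or manually, then submit) can be read as a short sequence of REST requests. The following is a minimal sketch under stated assumptions, not Hue's implementation: the `/jars`, `/contexts` and `/jobs` route shapes are assumed from the spark-jobserver README, and `Step`, `LifeCyclePlan`, the app name `test` and the context name `demo-context` are hypothetical names used only for illustration. The code builds the request plan and sends nothing over the network.

```scala
// Sketch only: describes the requests, does not perform any HTTP call.
final case class Step(method: String, path: String, note: String)

object LifeCyclePlan {
  // Route shapes are an assumption based on the spark-jobserver README.
  def plan(appName: String, classPath: String, context: Option[String]): List[Step] = {
    val upload = Step("POST", s"/jars/$appName", "upload the packaged JAR")
    // Manual mode creates a named context first; auto mode skips this step.
    val createContext =
      context.map(c => Step("POST", s"/contexts/$c", "create the context manually")).toList
    val contextParam = context.map(c => s"&context=$c").getOrElse("")
    val submit = Step(
      "POST",
      s"/jobs?appName=$appName&classPath=$classPath$contextParam",
      "submit the job")
    upload :: createContext ::: List(submit)
  }
}
```

Calling `LifeCyclePlan.plan("test", "spark.jobserver.WordCountExample", None)` yields two steps (upload, submit), while passing `Some("demo-context")` adds the context-creation step in between and appends `&context=demo-context` to the submit path.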
  • 13. SPARK JOB SERVER
    WHAT: REST job server for Spark
    WHERE: https://github.com/ooyala/spark-jobserver
    WHEN: Spark Summit talk Monday 5:45pm: Spark Job Server: Easy Spark Job Management by Ooyala
    curl -d "input.string = a b c a b see" 'localhost:8090/jobs?appName=test&classPath=spark.jobserver.WordCountExample'
    { "status": "STARTED", "result": { "jobId": "5453779a-f004-45fc-a11d-a39dae0f9bf4", "context": "b7ea0eb5-spark.jobserver.WordCountExample" } }
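A client such as a web UI has to do two things with the request on slide 13: build the submit URL and read the job id out of the response. The sketch below shows one way to do that with the standard library only. `JobServerClientSketch`, `submitUrl` and `field` are hypothetical names, not part of Hue or spark-jobserver, and the regex-based field lookup is a shortcut that only suits the flat sample response on the slide; a real client would likely use a JSON library.

```scala
import java.net.URLEncoder
import java.util.regex.Pattern

object JobServerClientSketch {
  // Builds the submit URL from the curl example; `base` is wherever the server listens.
  def submitUrl(base: String, appName: String, classPath: String): String = {
    def enc(s: String): String = URLEncoder.encode(s, "UTF-8")
    s"$base/jobs?appName=${enc(appName)}&classPath=${enc(classPath)}"
  }

  // Extracts one string-valued field from a JSON text using a regex.
  // Assumption: the value is a plain string without escaped quotes.
  def field(json: String, name: String): Option[String] = {
    val pattern = ("\"" + Pattern.quote(name) + "\"\\s*:\\s*\"([^\"]*)\"").r
    pattern.findFirstMatchIn(json).map(_.group(1))
  }
}
```

With the response shown on the slide, `field(response, "status")` gives `Some("STARTED")` and `field(response, "jobId")` gives the job id that a UI could later use to look up the job.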
  • 14. FOCUS ON UX
    curl -d "input.string = a b c a b see" 'localhost:8090/jobs?appName=test&classPath=spark.jobserver.WordCountExample'
    { "status": "STARTED", "result": { "jobId": "5453779a-f004-45fc-a11d-a39dae0f9bf4", "context": "b7ea0eb5-spark.jobserver.WordCountExample" } }
    VS
  • 15. TRAIT SPARKJOB
    /**
     * This trait is the main API for Spark jobs submitted to the Job Server.
     */
    trait SparkJob {
      /**
       * This is the entry point for a Spark Job Server to execute Spark jobs.
       */
      def runJob(sc: SparkContext, jobConfig: Config): Any

      /**
       * This method is called by the job server to allow jobs to validate their input and reject
       * invalid job requests.
       */
      def validate(sc: SparkContext, config: Config): SparkJobValidation
    }
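To illustrate the contract on slide 15 (validate the input first, then run), here is a dependency-free sketch of a word-count job. It deliberately uses stand-in types so it runs without Spark or Typesafe Config on the classpath: `FakeContext`, `Validation`, `Valid`, `Invalid`, `Job` and `WordCountSketch` are hypothetical names, and `Config` is aliased to a plain `Map`. The real trait takes a `SparkContext` and a `Config`, and a real job would presumably distribute the work through the context instead of using local collections.

```scala
// Stand-ins only: these are NOT the spark-jobserver or Spark types.
object StandIns {
  type Config = Map[String, String]
  final class FakeContext // placeholder for org.apache.spark.SparkContext

  sealed trait Validation
  case object Valid extends Validation
  final case class Invalid(reason: String) extends Validation

  // Same two-method shape as the trait on the slide.
  trait Job {
    def runJob(sc: FakeContext, jobConfig: Config): Any
    def validate(sc: FakeContext, config: Config): Validation
  }
}

import StandIns._

object WordCountSketch extends Job {
  // Reject the request early when the expected parameter is missing.
  def validate(sc: FakeContext, config: Config): Validation =
    if (config.contains("input.string")) Valid
    else Invalid("No input.string config param")

  // Count words with local collections; the context is unused in this sketch.
  def runJob(sc: FakeContext, jobConfig: Config): Any =
    jobConfig("input.string").split(" ").toSeq
      .groupBy(identity)
      .map { case (word, hits) => word -> hits.size }
}
```

A caller would check `validate` before `runJob`, so that a request without `input.string` is rejected with a readable reason instead of failing inside the job.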
  • 16. DEMO TIME

  • 17. LIVE DEMO demo.gethue.com/spark
  • 18. CURRENT TECH SUM-UP Standalone app, Scala 2.10, Spark 0.9, Hue C5+
  • 19. ROADMAP
    WHAT
    - YARN
    - HUE-2134 [spark] App revamp and Job Server needs
        - Impersonation
        - Status report
        - Fetch N from result set
        - Python?
    - Full Hue integration with HDFS, JobBrowser, Hive, charts…
    - On the fly compile of Scala, Java?
    - ?
  • 20. THANK YOU!
    TWITTER @gethue
    USER GROUP hue-user@
    WEBSITE http://gethue.com
    LEARN http://gethue.com/category/spark/
    http://gethue.com/get-started-with-spark-deploy-spark-server-and-compute-pi-from-your-web-browser/