Apache Zeppelin and Helium @ApacheCon 2017 May, FL


  1. Helium makes Zeppelin Fly. Moon soo Lee (moon@zepl.com), Hoon Park (1ambda@zepl.com), Ahyoung Ryu (ahyoungryu@zepl.com) @ ZEPL
  2. Who are we?
  3. What is Apache Zeppelin? A web-based notebook that enables interactive data analytics. You can make beautiful data-driven, interactive and collaborative documents with SQL, Scala and more.
  4. When do you need Zeppelin? When multiple tools and different people are involved in your data pipeline (Engineers, Data Scientists, Business users; Visualizations, Access control, Report; JDBC).
  5. Notebook: Zeppelin allows the use of multiple interpreters (language backends) at the same time. Zeppelin supports multi-user environments. ● Real-time collaboration ● Authentication ● Notebook ACL ● Interpreter ACL ● Fine-grained interpreter session - notebook/user mapping ● Built-in scheduler ● Pluggable notebook storage ● Pluggable interpreter
  6. Built-in Visualization: Basic visualizations are built in (Table, Bar chart, Pie chart, Area chart, Scatter chart, Line chart), in addition to matplotlib and ggplot integration.
  7. Helium (He, 2, 4.0026): Hoon Park @ ZEPL, linkedin.com/in/1ambda
  8. Problems: Built-in Visualizations. “What if I want to display things differently?” Why isn’t it easy to add new visualizations? - Dependent on Zeppelin release cycle - Restricted license (e.g. commercial chart)
  9. Problems: Built-in Visualizations. Why isn’t it easy to add new visualizations? - Restricted license (e.g. Highcharts) - Dependent on Zeppelin release cycle. Solution? Helium (He, 2, 4.0026): Let’s add pluggable visualizations - external add-ons - can update frequently - highly customizable - shared via online registry
  10. Apache Zeppelin: Helium VISUALIZATION DEMO
  11. Helium Visualization Examples
  12. “What about extending Interpreters?” “Can we leverage the Helium framework for interpreters?”
  13. Background: Interpreter. Executes a paragraph (code) and returns output - separate JVM process - %spark, %jdbc, %python, ...
  14. Apache Zeppelin: Helium SPELL
  15. Problems: Backend Interpreter. “Can we write interpreters easily?” It is not easy to add a new interpreter - interpreters are written in Java - sometimes need to handle HTML dynamically. Can’t be combined with other interpreters - e.g. Spark + Markdown - Interpreter != Display System
  16. Background: Display System. Customizes interpreted outputs in the frontend - can be combined with interpreters - %html, %table, %angular, ...
  17. Problems: Backend Interpreter. “A frontend interpreter can be a display system”
  18. SPELL: Frontend Interpreter. Easy to create and handle HTML - written in JS - can utilize many existing JS libraries (flowchart, sigmajs, vega, papaparse, ...). Can be a display system like %html, %table - e.g. Spark Interpreter + Markdown Display - allows customizing output (e.g. %myGraph)
  19. Apache Zeppelin: Helium SPELL DEMO
  20. Helium Spell Examples
  21. Helium (He, 2, 4.0026) Online Registry: Ahyoung Ryu @ ZEPL, linkedin.com/in/AhyoungRyu
  22. How can we share VISUALIZATION or SPELL packages?
  23. ONLINE REGISTRIES
  24. ONLINE REGISTRIES
  25. ONLINE REGISTRIES
  26. Online Registry for Helium packages?
  27. THEN WE SHOULD CONSIDER 1. Who will build the infrastructure and operate the service, and how? - Need to set up the authentication system - Need a user/package DB - Versioning / building / packaging - … 2. External library licenses
  28. Too complicated... Is there any other SIMPLE way to solve this?
  29. A SOLUTION ALREADY EXISTED
  30. A SOLUTION ALREADY EXISTED: Helium VISUALIZATION & SPELL packages are npm packages
  31. A SOLUTION ALREADY EXISTED: Helium VISUALIZATION & SPELL packages are npm packages. The package information can be saved in the npm registry!
  32. Then, is there any way to fetch ONLY Helium package information from the registry?
  33. HOW? - Create a Helium package
  34. HOW? - Create a Helium package - Publish it to the registry (http://registry.npmjs.org/)
  35. HOW? - Create a Helium package - Publish it to the registry - Filter Helium packages & fetch only necessary metadata
  36. HOW? - Create a Helium package - Publish it to the registry - Filter Helium packages & fetch only necessary metadata: does the package have zeppelin-vis or zeppelin-spell as its dependency?
  37. HOW? - Create a Helium package - Publish it to the registry - Filter Helium packages & fetch only necessary metadata: if so, take only the necessary metadata (name, description, version, license, ...)
  38. HOW? - Create a Helium package - Publish it to the registry - Filter Helium packages & fetch only necessary metadata - Integrate the whole data and create helium.json
  39. HOW? - Create a Helium package - Publish it to the registry - Filter Helium packages & fetch only necessary metadata - Integrate the whole data and create helium.json - Save the file in [service logo, not captured]
  40. HOW? - Create a Helium package - Publish it to the registry - Filter Helium packages & fetch only necessary metadata - Integrate the whole data and create helium.json - Save the file in [service logo, not captured] - Trigger Lambda function every 1 hour using [service logo, not captured]
  41. HOW? - Create a Helium package - Publish it to the registry - Filter Helium packages & fetch only necessary metadata - Integrate the whole data and create helium.json - Save the file in [service logo, not captured] - Trigger Lambda function every 1 hour using [service logo, not captured] - Read helium.json
  42. JIRA ISSUES - ZEPPELIN-1973: List all available Helium packages on the Zeppelin website - ZEPPELIN-2004: List Helium packages in the Zeppelin GUI by reading file
  43. WHEN CAN I USE THIS? - Not included in the latest Zeppelin version, 0.7.X - Will be available in Zeppelin 0.8.0 - Release plan?
  44. Helium (He, 2, 4.0026) extends Zeppelin eco-system. Visualizations / Spell: Map, Heatmap, Range, Bubble, Spline, Sigma, D3, Markdown, Translator, Flowchart. Interpreters: Spark, Python, JDBC, Groovy, Geode, Flink, Cassandra, Kylin. Users / Developers 2833 3rd parties
  45. Useful services - Zeppelin notebook online viewer: https://www.zeppelinhub.com/viewer - Notebook sharing and collaboration: https://www.zeppelinhub.com
  46. Future Roadmap - 0.7.2 (2Q/2017): Maintenance release. - 0.8.0 (3Q/2017): Helium online registry; Interpreter cluster mode. - 1.0 (3-4Q/2017, hopefully): Finest, the most stable release
  47. Thanks Q&A
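
The registry flow described in the "HOW?" slides (filter packages that depend on zeppelin-vis or zeppelin-spell, keep only the necessary metadata, and merge everything into helium.json) can be sketched roughly as below. This is a minimal Python illustration, not the Lambda function from the talk: the input shape (a list of package.json-like dicts), the function names, the sample packages, and the flat-list layout of the output are all assumptions. Fetching from the registry and saving the file are left out because the slides do not specify them.

```python
import json

# Dependency names taken from the slides: a package is treated as a Helium
# package if it depends on either one.
HELIUM_DEPENDENCIES = ("zeppelin-vis", "zeppelin-spell")

# Metadata fields named on the slides (the slide list ends with "...",
# so the real set may be larger).
METADATA_FIELDS = ("name", "description", "version", "license")


def is_helium_package(pkg):
    """Return True if the package lists zeppelin-vis or zeppelin-spell as a dependency."""
    deps = pkg.get("dependencies") or {}
    return any(dep in deps for dep in HELIUM_DEPENDENCIES)


def extract_metadata(pkg):
    """Keep only the necessary metadata fields that are present in the package."""
    return {field: pkg[field] for field in METADATA_FIELDS if field in pkg}


def build_helium_json(packages):
    """Filter Helium packages and merge their metadata into one JSON document."""
    entries = [extract_metadata(p) for p in packages if is_helium_package(p)]
    # Assumed layout: a flat list sorted by name so the output is stable.
    entries.sort(key=lambda e: e.get("name", ""))
    return json.dumps(entries, indent=2)


# Hypothetical sample input, shaped like package.json documents.
sample_packages = [
    {"name": "my-bubble-chart", "description": "Bubble chart", "version": "0.1.0",
     "license": "Apache-2.0", "dependencies": {"zeppelin-vis": "*"}, "main": "index.js"},
    {"name": "unrelated-lib", "version": "1.0.0", "license": "MIT", "dependencies": {}},
    {"name": "my-echo-spell", "version": "0.0.2", "license": "Apache-2.0",
     "dependencies": {"zeppelin-spell": "*"}},
]

helium_json = build_helium_json(sample_packages)
print(helium_json)
```

In the flow from the slides, a scheduled job would run a step like this every hour and store the result where Zeppelin can read it; keeping the filter and merge as pure functions makes that job easy to test without touching the registry.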
