Toulouse Data Science meetup - Apache Zeppelin

The (very) short field trip
by Guillaume ALLEON & Gerard DUPONT

Published in: Technology

  1. Apache Zeppelin: The (very) short field trip, by G. Alléon & G. Dupont. TDS meetup, 2016.06.30
  2. Who are we?
     Guillaume Alleon, AIRBUS Group Innovation (corporate research center): research leader for more than 30 people from the UK to China, tackling problems in massive data processing and information extraction. Was already in “big data” when it was still called HPC…
     Gerard Dupont, AIRBUS Defence & Space (space systems): technical coordinator for R&T studies on distributed processing systems. Spent way too much time processing web data for intelligence, now looking to the sky (satellite data ;-)
  3. Zeppelin motto: “A web-based notebook that enables interactive data analytics.”
  4. Origins & history
     Missing piece in the Hadoop landscape: a modern analytic playground.
     2012.12 - Data analytics solution (NFLabs)
     2013.10 - Open-sourced
     2014.12 - ASF incubation
     2015 - 3 stable releases
     2016.05 - Maturing to Apache top-level project
  5. 3000-foot view
  6. What’s cool about Zeppelin
     ⊕ interactive
     ⊕ out-of-the-box Spark integration
     ⊕ out-of-the-box visualization options
     ⊕ direct access to the DOM for customized visualization
     ⊕ nice UI (Bootstrap & Angular)
     ⊕ notebook run scheduler
     ⊕ easy to configure
     ⊕ extensibility, extensibility and extensibility...
  7. What’s cool about Zeppelin
     ⊕ interactive
     ⊕ out-of-the-box Spark integration
     ⊕ out-of-the-box visualization options
     ⊕ direct access to the DOM for customized visualization
     ⊕ nice UI (Bootstrap & Angular)
     ⊕ notebook run scheduler
     ⊕ easy to configure
     ⊕ extensibility, extensibility and extensibility...
     … the dark side
     ⊝ hard to install
     ⊝ need to build from source (for a customized version)
     ⊝ not (yet) multi-user
  8. Overview / look & feel: interpreter text (aka your code), interpreter config, interactive results
  9. DEMO time (credits: https://www.weasyl.com/~uszatyarbuz)
  10. Under the hood
      ○ Interpreter isolation: each interpreter runs in its own JVM
      ○ Dynamic dependency loading
      ○ REST & WebSocket on the front end
      ○ Thrift on the back end (or whatever you add)
      ○ Process scheduler (cron-like)
  11. Roadmap
      Enterprise ready
      ○ Multi-tenancy
      ○ Job scheduler
      ○ HA
      Usability improvement
      ○ UX improvement
      ○ Table data support
      ○ Dynamic interpreter integration
      ○ Reusable analytic application catalog
  12. Thx
      Official website: https://zeppelin.apache.org/
      Notebook sample: https://www.zeppelinhub.com/viewer
      Source code: https://github.com/apache/incubator-zeppelin
      Mailing lists: http://zeppelin.apache.org/community.html
      This TDS notebook: http://tinyurl.com/zeppelin-tds
      Sources for this presentation:
      ○ http://www.slideshare.net/FlinkForward/moon-soo-lee-data-science-lifecycle-with-apache-flink-and-apache-zeppelin/23
      ○ http://www.slideshare.net/HadoopSummit/apache-zeppelin-helium-and-beyond
      ○ http://www.slideshare.net/felixcss/interactive-data-science-from-scratch-with-apache-zeppelin-and-apache-spark
      ○ http://www.slideshare.net/BrunoBonnin/explorez-vos-donnes-avec-apache-zeppelin
      credits: https://www.weasyl.com/~uszatyarbuz
  13. BACKUP
  14. Origins & history
      Active core teams
      Decent number of external contributors
      Plenty of interpreters (official and external)
      0.6.0-SNAPSHOT (pending stabilization)
  15. 3000-foot view
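
The "interpreter text" label on the look & feel slide and the interpreter isolation point under "Under the hood" both rest on one convention: a Zeppelin paragraph can start with a `%name` directive (for example `%spark`, `%md` or `%sh`) that selects which interpreter runs the rest of the text. The Python sketch below illustrates that routing idea only. It is not Zeppelin's actual parser: the function `split_paragraph`, the parameter `default_interpreter`, the regular expression, and the choice of `spark` as fallback are all assumptions made for illustration.

```python
import re
from typing import Tuple

# Matches an optional leading "%name" directive such as "%spark" or "%spark.sql".
# This pattern is an assumption made for illustration, not Zeppelin's own parser.
_DIRECTIVE = re.compile(r"^\s*%([\w.]+)[ \t]*\n?")


def split_paragraph(text: str, default_interpreter: str = "spark") -> Tuple[str, str]:
    """Return (interpreter_name, code) for one notebook paragraph.

    Hypothetical helper: both the name and the fallback behaviour are
    illustrative choices, not part of the Zeppelin code base.
    """
    match = _DIRECTIVE.match(text)
    if match is None:
        # No directive found: fall back to the assumed default interpreter.
        return default_interpreter, text.strip()
    # Everything after the directive is the code handed to that interpreter.
    return match.group(1), text[match.end():].strip()


if __name__ == "__main__":
    print(split_paragraph("%md\n# Hello TDS"))
    print(split_paragraph("%sh ls -l"))
    print(split_paragraph("sc.parallelize(1 to 10).sum()"))
```

In a real deployment the returned name would be looked up among the configured interpreters, each of which, per the slides, runs in its own JVM and is reached over Thrift.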
