Toulouse Data Science meetup - Apache Zeppelin

  1. Apache Zeppelin: the (very) short field trip, by G. Alléon & G. Dupont. TDS meetup - 2016.06.30
  2. Who are we? Guillaume Alleon - AIRBUS Group Innovation (corporate research center): research leader for more than 30 people from the UK to China, tackling problems in massive data processing and information extraction. Was already in “big data” when it was still called HPC… Gerard Dupont - AIRBUS Defence & Space (space systems): technical coordinator for R&T studies on distributed processing systems. Spent way too much time processing web data for intelligence, now looking to the sky (satellite data ;-)
  3. Zeppelin motto: “A web-based notebook that enables interactive data analytics.”
  4. Origins & history. Missing piece in the Hadoop landscape: a modern analytic playground. 2012.12 - Data analytics solution (NFLabs); 2013.10 - Open-sourced; 2014.12 - ASF incubation; 2015 - 3 stable releases; 2016.05 - Maturing to Apache top-level project
  5. 3000 feet view
  6. What’s cool about Zeppelin: ⊕ interactive ⊕ out-of-the-box Spark integration ⊕ out-of-the-box visualization options ⊕ direct access to the DOM for customized visualization ⊕ nice UI (Bootstrap & Angular) ⊕ notebook run scheduler ⊕ easy to configure ⊕ extensibility, extensibility and extensibility...
  7. What’s cool about Zeppelin … the dark side: ⊝ hard to install ⊝ need to build from source (for a customized version) ⊝ not (yet) multi-user
  8. Overview/look & feel: interpreter text (aka your code), interpreter config, interactive results
  9. DEMO time. Credits: https://www.weasyl.com/~uszatyarbuz
  10. Under the hood: ○ Interpreter isolation, each in its own JVM ○ Dynamic dependency loading ○ REST & WebSocket on the front end ○ Thrift on the back end (or whatever you add) ○ Process scheduler (cron-like)
  11. Roadmap. Enterprise ready: ○ Multi-tenancy ○ Job scheduler ○ HA. Usability improvement: ○ UX improvement ○ Table data support ○ Dynamic interpreter integration ○ Reusable analytic application catalog
  12. Thx. Official website: https://zeppelin.apache.org/ Notebook sample: https://www.zeppelinhub.com/viewer Source code: https://github.com/apache/incubator-zeppelin Mailing lists: http://zeppelin.apache.org/community.html This TDS notebook: http://tinyurl.com/zeppelin-tds Sources for this presentation: ○ http://www.slideshare.net/FlinkForward/moon-soo-lee-data-science-lifecycle-with-apache-flink-and-apache-zeppelin/23 ○ http://www.slideshare.net/HadoopSummit/apache-zeppelin-helium-and-beyond ○ http://www.slideshare.net/felixcss/interactive-data-science-from-scratch-with-apache-zeppelin-and-apache-spark ○ http://www.slideshare.net/BrunoBonnin/explorez-vos-donnes-avec-apache-zeppelin Credits: https://www.weasyl.com/~uszatyarbuz
  13. BACKUP
  14. Origins & history. Active core teams; decent number of external contributors; plenty of interpreters (official and external); 0.6.0-SNAPSHOT (pending stabilization)
  15. 3000 feet view
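
The "interpreter text (aka your code)" item on the overview slide refers to the notebook convention that a paragraph may start with a `%name` directive (for example `%md`, `%sql` or `%pyspark`) selecting the interpreter that runs the rest of the paragraph. The sketch below only illustrates that convention in Python; it is not Zeppelin's own parser. The function name `split_paragraph` and the fallback value `DEFAULT_INTERPRETER` are assumptions made for the example, and the real default depends on how interpreters are bound to the notebook.

```python
import re

# Assumption for this sketch: paragraphs without a %directive go to "spark".
# In a real installation the default depends on the interpreter binding order.
DEFAULT_INTERPRETER = "spark"

# A directive is "%name" at the very start of the paragraph, followed by the code.
_DIRECTIVE = re.compile(r"^\s*%([\w.]+)\s*(.*)$", re.DOTALL)


def split_paragraph(text, default=DEFAULT_INTERPRETER):
    """Return (interpreter_name, code) for one notebook paragraph."""
    match = _DIRECTIVE.match(text)
    if match is None:
        # No directive: the whole text is code for the default interpreter.
        return default, text.strip()
    return match.group(1), match.group(2).strip()


if __name__ == "__main__":
    print(split_paragraph("%md\n# Hello TDS"))
    print(split_paragraph("val answer = 21 * 2"))
```

Keeping the directive inside the paragraph text is what lets a single notebook mix Markdown, SQL and Spark code, one interpreter per paragraph.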

Editor's Notes

  • Interactive & extensible

    Ingestion, Discovery, Analytics, Visualization, Collaboration, Data product

    Toward better reuse of analytical applications (Helium)
  • ~4 years
    Top-level Apache project after less than 18 months of incubation
  • Scala & Spark integration
    Direct DOM access for super cool visualization
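
The visualization points above rely on Zeppelin's display system: when a paragraph's output starts with `%table`, followed by tab-separated columns and newline-separated rows, the notebook renders it as a table with the built-in chart options, and output starting with `%html` is rendered as HTML. The Python sketch below builds such a `%table` payload. The helper name `to_zeppelin_table` and the sample rows are made up for illustration; printing the result is assumed to happen inside a Python-capable paragraph such as `%pyspark`.

```python
def to_zeppelin_table(header, rows):
    """Format a header and rows as a Zeppelin %table payload.

    Columns are separated by tabs and rows by newlines; the first row
    is treated as the header by the table display.
    """
    lines = ["\t".join(str(cell) for cell in header)]
    lines += ["\t".join(str(cell) for cell in row) for row in rows]
    return "%table " + "\n".join(lines)


if __name__ == "__main__":
    # Made-up sample data, only here to show the output format.
    payload = to_zeppelin_table(["name", "size"], [["sun", 100], ["moon", 10]])
    print(payload)
```

Outside a notebook the script simply prints the raw text, which makes the format easy to inspect before pasting it into a paragraph.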