
Apache Zeppelin: the missing component for the big data ecosystem

Apache Zeppelin presentation @ Voxxed Days Vienna

Published in: Technology

  1. @doanduyhai #VoxxedVienna Apache Zeppelin, the missing GUI for your Big Data ecosystem. DuyHai DOAN, Apache Cassandra Evangelist
  2. Who Am I? Duy Hai DOAN, Cassandra technical advocate
     •  talks, meetups, confs
     •  open-source devs (Achilles, …)
     •  OSS Cassandra point of contact
     ☞ duy_hai.doan@datastax.com
     ☞ @doanduyhai
  3. Datastax
     •  Founded in April 2010
     •  We contribute a lot to Apache Cassandra™
     •  400+ customers (25 of the Fortune 100), 400+ employees
     •  Headquarters in the San Francisco Bay Area
     •  EU headquarters in London, offices in France and Germany
     •  Datastax Enterprise = OSS Cassandra + extra features
  4. What is Apache Zeppelin?
     •  Presentation
     •  Architecture
  5. Zeppelin Presentation
  6. Demo: https://github.com/doanduyhai/incubator-zeppelin/tree/ZeppelinPresentation
  7. Zeppelin Architecture (diagram). Labels: Zeppelin Server (REST, WebSocket); Zeppelin Engine; Zeppelin Interpreter Factory; Spark Interpreter Group (Spark, SparkSQL); Tajo Interpreter; Flink Interpreter; Cassandra Interpreter; five boxes marked JVM.
  8. What does Zeppelin provide?
     •  Front-end & display system for free
     •  Generic back-end with REST APIs & WebSocket
     •  Pluggable interpreter system
     •  Task scheduler (à la CRON)
  9. Zeppelin UI
     •  Layout
     •  Notebook
     •  Paragraph
     •  UI elements
  10. Demo: https://github.com/doanduyhai/incubator-zeppelin/tree/ZeppelinPresentation
  11. Zeppelin Display System
     •  Raw, Table, HTML, Angular with Scala
     •  Available graphs
     •  View modes
     •  Dynamic form
     •  Iframe export
  12. Demo: https://github.com/doanduyhai/incubator-zeppelin/tree/ZeppelinPresentation
  13. Interpreter to Front-End Streaming
  14. Interpreter to front-end streaming (diagram). Labels: Interpreter (JVM); Zeppelin Server (JVM); WebSocket.
  15. Demo: https://github.com/doanduyhai/incubator-zeppelin/tree/ZeppelinPresentation
  16. Interpreter system
     •  Core interpreters
     •  Third-party interpreters
     •  Interpreter conf & usage
  17. Interpreter processing lifecycle
     ①  Receive input commands/data
        •  as raw text
        •  from form data
     ②  Have the external back-end process the input commands/data
     ③  Format the response using the Zeppelin display system
     ④  Send the response back to the Zeppelin engine
  18. Core interpreters
     •  Spark (Spark core, SparkSQL/DataFrame, PySpark)
        •  Spark core = default (or %spark)
        •  SparkSQL = %sql
     •  Shell (%sh)
     •  Markdown (%md)
     •  AngularJS (%angular)
  19. Third-party interpreters: Hive, Phoenix, Tajo, Flink, Ignite, Lens, Cassandra, Geode, PostgreSQL, Kylin, ElasticSearch
  20. Interpreter conf & usage: https://github.com/doanduyhai/incubator-zeppelin/tree/ZeppelinPresentation
  21. Writing An Interpreter
     •  How To
     •  Simple interpreter example (AsciiDoc)
     •  Complex interpreter example (Cassandra)
  22. Steps to write your own interpreter
     •  Create a class that extends the Interpreter base class
     •  Register it in a static block
     •  Optionally define default config params

         // Register the interpreter under a name
         static {
             Interpreter.register("MyInterpreterName", MyClassName.class.getName());
         }

         // Same registration, with default config params
         static {
             Interpreter.register("MyInterpreterName", MyClassName.class.getName(),
                 new InterpreterPropertyBuilder()
                     .add("property1", "default value", "Description of property1")
                     .build());
         }

  23. To register your interpreter as default
     •  Edit the enum ZeppelinConfiguration.ConfVars
     •  Add your interpreter FQCN in the property ZEPPELIN_INTERPRETERS
  24. To register your interpreter in config files
     •  Create conf/zeppelin-site.xml from conf/zeppelin-site.xml.template
     •  Add your interpreter FQCN in the property zeppelin.interpreters

         <property>
           <name>zeppelin.interpreters</name>
           <value>org.apache.zeppelin.spark.SparkInterpreter,org.apache.zeppelin.spark.PySparkInterpreter,org.apache.zeppelin.spark.SparkSqlInterpreter,org.apache.zeppelin.spark.DepInterpreter,org.apache.zeppelin.markdown.Markdown,org.apache.zeppelin.shell.ShellInterpreter,org.apache.zeppelin.hive.HiveInterpreter,com.me.MyNewInterpreter</value>
         </property>

  25. Update interpreter pom.xml
  26. Update main pom.xml
  27. Simple AsciiDoc Interpreter (diagram). Labels: Zeppelin Server; Zeppelin Engine; AsciiDoc Interpreter; Raw Text Block; Raw Text Block Converted To HTML; HTML Output; steps ① to ④; two boxes marked JVM.
  28. Simple interpreter (AsciiDoc): https://github.com/doanduyhai/incubator-zeppelin/tree/ZeppelinPresentation
  29. Cassandra Interpreter Architecture (diagram). Labels: Zeppelin Server (JVM); Cassandra Interpreter (JVM); Cassandra Java Driver; Cassandra; Raw Text Block; Async CQL statements; Render HTML; Display Results as HTML; steps ① to ⑥.
  30. Cassandra Interpreter Commands
     •  Native CQL statements: SELECT * FROM …; INSERT INTO …; …
     •  Schema commands: DESCRIBE TABLE …; DESCRIBE KEYSPACE …; …
     •  Prepared statements commands: @prepare …; @bind …; @remove_prepared …;
     •  Help command: HELP;
     •  Options commands: @consistency …; @retryPolicy …; @fetchSize …;
  31. Complex interpreter (Cassandra): https://github.com/doanduyhai/incubator-zeppelin/tree/ZeppelinPresentation
  32. Cassandra Online Interpreter Docs
     •  http://zeppelin.incubator.apache.org/docs/interpreter/cassandra.html
  33. Zeppelin Future Roadmap
  34. Enterprise Ready
     •  Apache Shiro authentication (ZEPPELIN-548)
     •  Note authorization (PR #681)
     •  Multi-tenancy
  35. Usability
     •  UX improvement
     •  Better table data support
     •  Export data as CSV etc. (PR #725, PR #714, PR #6, PR #89)
     •  Table pagination …
  36. Pluggability
     •  Pluggable visualization
     •  Pluggable interpreter
     •  Repository and registry for pluggable components
  37. More interpreters
  38. Q & A
  39. Thank You. @doanduyhai, duy_hai.doan@datastax.com, http://zeppelin.incubator.apache.org/
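
Slide 17 describes the interpreter lifecycle in four steps: receive input (raw text and form data), let a back-end process it, format the response for the display system, and hand the result back to the engine. The sketch below walks through those steps in plain Java. Everything in it is an assumption made for illustration: `LifecycleSketch` and its methods are hypothetical and are not Zeppelin's `Interpreter` API, the `${name}` placeholder handling is a simplified stand-in for dynamic forms, and the "back-end" is a toy that upper-cases text. The one Zeppelin convention it relies on is that output starting with `%html` is rendered as HTML.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical walk-through of the four lifecycle steps; not Zeppelin's real Interpreter class. */
final class LifecycleSketch {

    private LifecycleSketch() {}

    /** Step 1: the input is raw text plus form data; substitute ${name} placeholders. */
    static String bindForms(String rawText, Map<String, String> formValues) {
        String bound = rawText;
        for (Map.Entry<String, String> entry : formValues.entrySet()) {
            bound = bound.replace("${" + entry.getKey() + "}", entry.getValue());
        }
        return bound;
    }

    /** Step 2: the external back-end does the work; this toy back-end upper-cases the text. */
    static String process(String command) {
        return command.toUpperCase(Locale.ROOT);
    }

    /** Step 3: format for the display system by escaping the text and prefixing %html. */
    static String format(String result) {
        String escaped = result
            .replace("&", "&amp;")
            .replace("<", "&lt;")
            .replace(">", "&gt;");
        return "%html <pre>" + escaped + "</pre>";
    }

    /** Step 4: the formatted string is what gets returned to the engine. */
    static String interpret(String rawText, Map<String, String> formValues) {
        return format(process(bindForms(rawText, formValues)));
    }

    public static void main(String[] args) {
        Map<String, String> forms = new LinkedHashMap<>();
        forms.put("who", "<zeppelin>");
        System.out.println(interpret("hello ${who}", forms));
    }
}
```

In a real interpreter the second step would call out to Spark, Cassandra, or another back-end, and the return value would be wrapped in whatever result type the Zeppelin version in use expects.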

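Slide 11 lists Table as one of the display system's output types. As far as the Zeppelin documentation describes it, an interpreter gets a table by emitting text that starts with `%table`, with columns separated by tabs and rows separated by newlines. The following is a minimal sketch of a helper that builds such a string; `TableDisplay` and its method names are hypothetical, and replacing tabs and newlines inside cells with spaces is an assumption of this sketch rather than documented Zeppelin behavior.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper that builds tab-separated text for the %table display type. */
final class TableDisplay {

    private TableDisplay() {}

    /** Tabs and newlines delimit columns and rows, so replace them with spaces inside a cell. */
    static String sanitize(String cell) {
        return cell == null ? "" : cell.replace('\t', ' ').replace('\n', ' ');
    }

    /** Renders a header row followed by data rows, prefixed with %table. */
    static String render(List<String> header, List<List<String>> rows) {
        StringBuilder out = new StringBuilder("%table ");
        appendRow(out, header);
        for (List<String> row : rows) {
            appendRow(out, row);
        }
        return out.toString();
    }

    /** Appends one row: cells joined by tabs, terminated by a newline. */
    private static void appendRow(StringBuilder out, List<String> cells) {
        for (int i = 0; i < cells.size(); i++) {
            if (i > 0) {
                out.append('\t');
            }
            out.append(sanitize(cells.get(i)));
        }
        out.append('\n');
    }

    public static void main(String[] args) {
        // Sample data taken from the core interpreter prefixes on slide 18.
        String output = render(
            Arrays.asList("interpreter", "prefix"),
            Arrays.asList(
                Arrays.asList("Shell", "%sh"),
                Arrays.asList("Markdown", "%md")));
        System.out.print(output);
    }
}
```

Keeping the formatting in one helper means an interpreter only has to produce rows; switching to another display type then touches a single place.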
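
Slide 30 groups the Cassandra interpreter's input into five families: native CQL statements, schema commands, prepared statement commands, the help command, and option commands. A rough way to picture the dispatch is a classifier that looks at how each statement starts. The sketch below is a hypothetical illustration only: `CommandClassifier` is not the interpreter's actual parser, and matching prefixes case-insensitively is an assumption of this sketch.

```java
import java.util.Locale;

/** Hypothetical classifier for the command families on slide 30; not the real parser. */
final class CommandClassifier {

    enum Kind { OPTION, PREPARED_STATEMENT, SCHEMA, HELP, NATIVE_CQL }

    private CommandClassifier() {}

    static Kind classify(String statement) {
        // Normalize once so the prefix checks below are case-insensitive (an assumption).
        String s = statement.trim().toLowerCase(Locale.ROOT);

        if (s.startsWith("@consistency") || s.startsWith("@retrypolicy") || s.startsWith("@fetchsize")) {
            return Kind.OPTION;
        }
        if (s.startsWith("@prepare") || s.startsWith("@bind") || s.startsWith("@remove_prepared")) {
            return Kind.PREPARED_STATEMENT;
        }
        if (s.startsWith("describe ")) {
            return Kind.SCHEMA;
        }
        if (s.equals("help") || s.equals("help;")) {
            return Kind.HELP;
        }
        // Anything else is passed through as a native CQL statement.
        return Kind.NATIVE_CQL;
    }

    public static void main(String[] args) {
        System.out.println(classify("DESCRIBE TABLE t;"));
        System.out.println(classify("SELECT * FROM t;"));
    }
}
```

Treating "everything else" as native CQL keeps the sketch short; the `@` prefix on the interpreter-specific commands is what keeps them from colliding with CQL keywords.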