@doanduyhai#VoxxedVienna
Apache Zeppelin
the missing GUI for your
BigData eco-system
DuyHai DOAN
Apache Cassandra Evangelist
@doanduyhai
Who Am I?
Duy Hai DOAN
Cassandra technical advocate
•  talks, meetups, confs
•  open-source devs (Achilles, …)
•  OSS Cassandra point of contact
☞ duy_hai.doan@datastax.com
☞ @doanduyhai
DataStax
•  Founded in April 2010
•  We contribute a lot to Apache Cassandra™
•  400+ customers (25 of the Fortune 100), 400+ employees
•  Headquartered in the San Francisco Bay Area
•  EU headquarters in London, offices in France and Germany
•  DataStax Enterprise = OSS Cassandra + extra features
What is Apache Zeppelin?
Presentation
Architecture
Zeppelin Presentation
Demo
https://github.com/doanduyhai/incubator-zeppelin/tree/ZeppelinPresentation
Zeppelin Architecture
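[Diagram: the Zeppelin Server contains the Zeppelin Engine and exposes REST and WebSocket endpoints. The Zeppelin Interpreter Factory links the engine to the interpreters: the Spark Interpreter Group (Spark, SparkSQL), the Tajo Interpreter, the Flink Interpreter and the Cassandra Interpreter. The server and each interpreter run in their own JVM.]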
What does Zeppelin provide?
Front-end & display system for free
Generic back-end with REST APIs & WebSocket
Pluggable interpreters system
Task scheduler (à la CRON)
Zeppelin UI Layout
Notebook
Paragraph
UI elements
Demo
https://github.com/doanduyhai/incubator-zeppelin/tree/ZeppelinPresentation
Zeppelin Display System
Raw, Table, HTML, Angular with Scala
Available graphs
View modes
Dynamic form
Iframe export
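As a rough illustration of the Table display type, the sketch below builds a %table payload: a header row followed by data rows, with tab-separated columns and newline-separated rows. The class and method names (TableDisplaySketch, toTable) are hypothetical; the code only shows the text format an interpreter would emit, not Zeppelin's own implementation.

```java
import java.util.Arrays;
import java.util.List;

final class TableDisplaySketch {

    // Build a display-system payload: "%table", then tab-separated columns,
    // one line per row, header first.
    static String toTable(List<String> header, List<List<String>> rows) {
        StringBuilder out = new StringBuilder("%table ");
        out.append(String.join("\t", header)).append('\n');
        for (List<String> row : rows) {
            out.append(String.join("\t", row)).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Illustrative data only.
        System.out.print(toTable(
            Arrays.asList("name", "size"),
            Arrays.asList(
                Arrays.asList("small", "100"),
                Arrays.asList("large", "1000"))));
    }
}
```

A paragraph whose output starts with %table is rendered as a table and can then be switched to one of the available graphs; output starting with %html is rendered as HTML.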
Demo
https://github.com/doanduyhai/incubator-zeppelin/tree/ZeppelinPresentation
Interpreter to Front-End Streaming
Interpreter to front-end streaming
[Diagram: the Interpreter and the Zeppelin Server each run in their own JVM; a WebSocket connection carries the streamed output to the front-end.]
Demo
https://github.com/doanduyhai/incubator-zeppelin/tree/ZeppelinPresentation
Interpreter system
Core interpreters
Third-party interpreters
Interpreters conf & usage
Interpreter processing lifecycle
①  Receive input commands/data
•  as raw text
•  from form data
②  Process the input commands/data by the external back-end
③  Format the response using Zeppelin display system
④  Send response back to the Zeppelin engine
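The four steps above can be sketched in plain Java. This is a minimal, self-contained illustration, not the Zeppelin Interpreter API: LifecycleSketch, callBackend and formatAsHtml are hypothetical names, and the "back-end" here just upper-cases its input. Only the %html prefix comes from the Zeppelin display system.

```java
import java.util.Locale;

final class LifecycleSketch {

    // Step 2 stand-in: a real interpreter would call its external back-end here.
    // This hypothetical back-end simply upper-cases the command.
    static String callBackend(String command) {
        return command.trim().toUpperCase(Locale.ROOT);
    }

    // Escape the characters that would otherwise be read as HTML markup.
    static String escapeHtml(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Step 3: format the response for the display system using the %html prefix.
    static String formatAsHtml(String backendResult) {
        return "%html <pre>" + escapeHtml(backendResult) + "</pre>";
    }

    // Steps 1 to 4: receive the raw paragraph text, process it, format it,
    // and hand the formatted response back to the caller (the engine).
    static String interpret(String rawText) {
        String backendResult = callBackend(rawText);
        return formatAsHtml(backendResult);
    }

    public static void main(String[] args) {
        System.out.println(interpret("select * from users"));
    }
}
```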
Core interpreters
•  Spark (Spark core, SparkSQL/DataFrame, PySpark)
•  Spark core = default (or %spark)
•  SparkSQL = %sql
•  Shell (%sh)
•  Markdown (%md)
•  AngularJS (%angular)
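To make the prefix convention concrete, here is a small sketch of how a paragraph could be split into an interpreter name and a body, falling back to Spark when no %prefix is given. ParagraphPrefixSketch and split are hypothetical names, and Zeppelin's actual parsing may differ; the sketch only mirrors the rule listed above.

```java
final class ParagraphPrefixSketch {

    // Per the list above, Spark core is used when no prefix is given.
    static final String DEFAULT_INTERPRETER = "spark";

    // Returns a two-element array: {interpreter name, paragraph body}.
    static String[] split(String paragraph) {
        String text = paragraph.trim();
        if (!text.startsWith("%")) {
            return new String[] {DEFAULT_INTERPRETER, text};
        }
        // The interpreter name runs from after '%' to the first whitespace.
        int end = 1;
        while (end < text.length() && !Character.isWhitespace(text.charAt(end))) {
            end++;
        }
        String name = text.substring(1, end);
        String body = text.substring(end).trim();
        return new String[] {name, body};
    }

    public static void main(String[] args) {
        String[] parts = split("%md ## Hello Zeppelin");
        System.out.println(parts[0] + " -> " + parts[1]);
    }
}
```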
Third-party interpreters
•  Hive
•  Phoenix
•  Tajo
•  Flink
•  Ignite
•  Lens
•  Cassandra
•  Geode
•  PostgreSQL
•  Kylin
•  ElasticSearch
Interpreter conf & usage
https://github.com/doanduyhai/incubator-zeppelin/tree/ZeppelinPresentation
Writing An Interpreter
How To
Simple interpreter example (AsciiDoc)
Complex interpreter example (Cassandra)
Steps to write your own interpreter
•  Create a class that extends Interpreter base class
•  Register it in a static block
•  Optionally define default config params
// Minimal registration: interpreter name and implementing class
static {
    Interpreter.register("MyInterpreterName", MyClassName.class.getName());
}
// Registration with a default config param and its description
static {
    Interpreter.register("MyInterpreterName", MyClassName.class.getName(),
        new InterpreterPropertyBuilder()
            .add("property1", "default value", "Description of property1")
            .build());
}
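One consequence of registering in a static block is that nothing is registered until the JVM initializes the class. The sketch below demonstrates that Java behaviour with a hypothetical stand-in registry (Registry and MyInterpreter are illustrative names, not Zeppelin classes). It is presumably the reason the interpreter's fully qualified class name also has to be listed in the configuration, so that the class gets loaded.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class StaticRegistrationSketch {

    // Hypothetical stand-in for a name-to-class registry.
    static final class Registry {
        private static final Map<String, String> REGISTERED = new LinkedHashMap<>();

        static void register(String name, String className) {
            REGISTERED.put(name, className);
        }

        static String lookup(String name) {
            return REGISTERED.get(name);
        }
    }

    // Hypothetical interpreter class that registers itself when initialized.
    static final class MyInterpreter {
        static {
            Registry.register("MyInterpreterName", MyInterpreter.class.getName());
        }
    }

    // Force class initialization by name, then look the interpreter up.
    static String loadAndLookup() {
        try {
            // Class.forName(String) initializes the class, which runs its static block.
            Class.forName(MyInterpreter.class.getName());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        return Registry.lookup("MyInterpreterName");
    }

    public static void main(String[] args) {
        System.out.println(loadAndLookup());
    }
}
```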
To register your interpreter as default
•  Edit the enum ZeppelinConfiguration.ConfVars
•  Add your interpreter FQCN in the property ZEPPELIN_INTERPRETERS
To register your interpreter in config files
•  Create conf/zeppelin-site.xml from conf/zeppelin-site.xml.template
•  Add your interpreter FQCN in the property zeppelin.interpreters
<property>
<name>zeppelin.interpreters</name>
<value>org.apache.zeppelin.spark.SparkInterpreter,org.apache.zeppelin.spark.PySparkInterpreter,
org.apache.zeppelin.spark.SparkSqlInterpreter,org.apache.zeppelin.spark.DepInterpreter,
org.apache.zeppelin.markdown.Markdown,org.apache.zeppelin.shell.ShellInterpreter,
org.apache.zeppelin.hive.HiveInterpreter,com.me.MyNewInterpreter
</value>
</property>
Update interpreter pom.xml
Update main pom.xml
Simple AsciiDoc Interpreter
[Diagram: the Zeppelin Server (Zeppelin Engine) and the AsciiDoc Interpreter each run in their own JVM. Flow labels for steps ① to ④: raw text block, raw text block, converted to HTML, HTML output.]
Simple interpreter (AsciiDoc)
https://github.com/doanduyhai/incubator-zeppelin/tree/ZeppelinPresentation
Cassandra Interpreter Architecture
[Diagram: the Zeppelin Server and the Cassandra Interpreter each run in their own JVM; the interpreter reaches Cassandra through the Cassandra Java Driver. Flow labels for steps ① to ⑥: raw text block (①, ②), async CQL statements (③), render HTML (④), display results as HTML (⑤, ⑥).]
Cassandra Interpreter Commands
Native CQL statements
•  SELECT * FROM …;
•  INSERT INTO …;
•  …
Schema commands
•  DESCRIBE TABLE …;
•  DESCRIBE KEYSPACE …;
•  …
Prepared statement commands
•  @prepare …;
•  @bind …;
•  @remove_prepared …;
Help command
•  HELP;
Option commands
•  @consistency …;
•  @retryPolicy …;
•  @fetchSize …;
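The command families above can be told apart by their leading keyword. The following is a rough sketch of such a classification, using only the keywords listed on this slide. CassandraCommandSketch, Kind and classify are hypothetical names, and this is not how the actual Cassandra interpreter parses its input.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

final class CassandraCommandSketch {

    enum Kind { NATIVE_CQL, SCHEMA, PREPARED_STATEMENT, HELP, OPTION }

    // Keywords taken from the command list above.
    static final List<String> PREPARED = Arrays.asList("@prepare", "@bind", "@remove_prepared");
    static final List<String> OPTIONS = Arrays.asList("@consistency", "@retryPolicy", "@fetchSize");

    static boolean startsWithAny(String text, List<String> prefixes) {
        for (String prefix : prefixes) {
            if (text.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    // Classify one statement by its leading keyword; anything unrecognized
    // is treated as a native CQL statement.
    static Kind classify(String statement) {
        String text = statement.trim();
        String upper = text.toUpperCase(Locale.ROOT);
        if (startsWithAny(text, PREPARED)) {
            return Kind.PREPARED_STATEMENT;
        }
        if (startsWithAny(text, OPTIONS)) {
            return Kind.OPTION;
        }
        if (upper.startsWith("DESCRIBE ")) {
            return Kind.SCHEMA;
        }
        if (upper.equals("HELP;")) {
            return Kind.HELP;
        }
        return Kind.NATIVE_CQL;
    }

    public static void main(String[] args) {
        System.out.println(classify("HELP;"));
        System.out.println(classify("@fetchSize …;"));
    }
}
```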
Complex interpreter (Cassandra)
https://github.com/doanduyhai/incubator-zeppelin/tree/ZeppelinPresentation
Cassandra Online Interpreter Docs
•  http://zeppelin.incubator.apache.org/docs/interpreter/cassandra.html
Zeppelin Future
Roadmap
Enterprise Ready
•  Apache Shiro authentication (ZEPPELIN-548)
•  Note authorization (PR #681)
•  Multi-tenancy
Usability
•  UX improvement
•  Better table data support
•  Export data as CSV, etc. (PR #725, PR #714, PR #6, PR #89)
•  Table pagination …
Pluggability
•  Pluggable visualization
•  Pluggable interpreter
•  Repository and registry for pluggable components
More interpreters
Q & A
Thank You
duy_hai.doan@datastax.com
http://zeppelin.incubator.apache.org/

Apache Zeppelin: the missing component for the big data ecosystem