1
Zeppelin Meetup
Moonsoo Lee / Creator of Zeppelin
moon@zepl.com
@apachezeppelin
2
Agenda
⬢ Demo: Real-time Streaming
⬢ Demo: Zeppelin on Kubernetes
⬢ Zeppelin Roadmap
⬢ Q&A
3
DEMO
Real-time
Streaming
4
5
DEMO
Zeppelin on Kubernetes
6
[Architecture diagram]
Ingress
Service (three shown)
Pod zeppelin-server: Zeppelin server, nginx, DNS resolver (http 80, rpc 12320)
Pod python-intp: Python Interpreter (rpc 12321)
Pod spark-intp: Spark Interpreter (rpc 12321, spark-driver 22321, spark-block manager 22322, spark-ui 4040)
Spark exec (two shown)
Kubernetes ApiServer, with two labelled requests: "Create interpreter pod" and "Create spark executor pod"
7
Benefits
MULTI-TENANCY
Each note and/or user has its own container for interpreters
SCALABILITY
A single host no longer runs all interpreters
SECURITY
Each container is isolated (filesystem, process, etc.)
8
Usage
Available in 0.9.0-SNAPSHOT
http://zeppelin.apache.org/docs/0.9.0-SNAPSHOT/quickstart/kubernetes.html
Run
$ kubectl apply -f ${ZEPPELIN_HOME}/k8s/zeppelin-server.yaml
* You need to build your own Zeppelin and Spark Docker images until 0.9.0 is released:
1. Build the Zeppelin distribution package: mvn package -Pbuild-distr …
2. Build the Zeppelin Docker image: cd scripts/docker/zeppelin/bin; docker build -t …
3. Build the Spark Docker image: <spark-distribution>/bin/docker-image-tool.sh -m -t 2.4.0 build
9
Zeppelin Roadmap
- Zeppelin on Kubernetes
  - Apply network policy to isolate the Interpreter Pod
  - Schedule notes in the background as a Job in Kubernetes
  - Run extra applications such as terminal and TensorBoard, the same way the Spark UI works
- Modernize front-end stack
  - Currently AngularJS
  - Dark theme?
- Visualization
  - Real-time data visualization
  - Pivot on the backend instead of in the front-end, which requires transferring all data to the front-end
- Sidebar
  - Sidebar with widgets, such as ToC (table of contents), list of data, etc.
  - Online widget registry (Helium)
- Collaboration
  - Multi-cursor edit
  - Comment!
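The roadmap item on isolating the Interpreter Pod with a network policy could look roughly like the sketch below. It builds a Kubernetes NetworkPolicy manifest as a Python dictionary and prints it as JSON. The policy name and the `app` labels are hypothetical, not taken from Zeppelin's manifests; only the rpc port 12321 comes from the architecture slide. A Spark interpreter would presumably need further rules for its driver and block manager ports, which this sketch leaves out.

```python
import json


def interpreter_network_policy(namespace="default"):
    """Build a NetworkPolicy that only lets the Zeppelin server reach interpreter pods."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        # Hypothetical policy name, chosen for this example.
        "metadata": {"name": "isolate-interpreter", "namespace": namespace},
        "spec": {
            # Hypothetical label: selects the interpreter pods to protect.
            "podSelector": {"matchLabels": {"app": "zeppelin-interpreter"}},
            "policyTypes": ["Ingress"],
            "ingress": [
                {
                    # Hypothetical label: only the Zeppelin server pod may connect.
                    "from": [{"podSelector": {"matchLabels": {"app": "zeppelin-server"}}}],
                    # Interpreter rpc port shown on the architecture slide.
                    "ports": [{"protocol": "TCP", "port": 12321}],
                }
            ],
        },
    }


policy = interpreter_network_policy()
print(json.dumps(policy, indent=2))
```

The printed JSON can be saved to a file and applied with kubectl, which accepts JSON manifests as well as YAML.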
10
Zeppelin Roadmap
Modernize front-end stack
• Currently AngularJS
• Dark theme
Zeppelin on Kubernetes
• Apply network policy to isolate the Interpreter Pod
• Schedule notes in the background as a Job in Kubernetes
• Run extra applications such as terminal and TensorBoard, the same way the Spark UI works
Collaboration
• Multi-cursor edit
• Comment!
Sidebar
• Sidebar with widgets, such as ToC (table of contents), list of data, etc.
• Online widget registry (Helium)
Visualization
• Real-time data visualization
• Pivot on the backend instead of in the front-end, which requires transferring all data to the front-end
11
Join the Apache Zeppelin community.
Mailing list: questions, suggestions, discussions, votes!
- Users: users@zeppelin.apache.org
- Dev: dev@zeppelin.apache.org
JIRA: bug reports, tracking development/release progress
- https://issues.apache.org/jira/projects/ZEPPELIN
GitHub: fixes, improvements, new features
- https://github.com/apache/zeppelin
12
Q&A
www.zepl.com
https://zeppelin.apache.org/
Moonsoo Lee / Creator of Zeppelin
moon@zepl.com
@issuefreaks
Send Mei Long your email for an Apache Zeppelin Slack invite: mlong@zepl.com
@meitrappist1
@ApacheZeppelin
13
Backup slides
14
Visualization
15
Transformation on browser (current)
[Diagram: Interpreter, Zeppelin Server, Browser; thrift between Interpreter and Zeppelin Server, http between Zeppelin Server and Browser]
Zeppelin Server note (fragment): { title: …. text: “select job, count(1) from data”, paragraphs: [ { results: { code: SUCCESS, msg: [ type: TABLE, data:
Full result dataset, shown at the Interpreter, at the Zeppelin Server, and at the Browser:
age balance job
21 1030 Student
34 20331 Engineer
50 30193 Engineer
33 12019 Teacher
23 23211 Engineer
29 92327 Student
... ... ...
Browser: Transform (pivot), then Render
job count
Student 2
Engineer 3
Teacher 1
16
Problem
- The entire result dataset needs to be transferred to the browser, even though not all of it is rendered.
- Browser CPU and memory limit how much data can be transformed and rendered.
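To make the problem concrete, here is a minimal sketch that pivots the sample rows from the slides by job and compares how many table cells would cross the wire in each case. The function name and the cell-count metric are illustrative assumptions, not part of Zeppelin.

```python
from collections import Counter

# Sample rows from the slide: (age, balance, job)
rows = [
    (21, 1030, "Student"),
    (34, 20331, "Engineer"),
    (50, 30193, "Engineer"),
    (33, 12019, "Teacher"),
    (23, 23211, "Engineer"),
    (29, 92327, "Student"),
]


def pivot_job_count(rows):
    """Count rows per job, keeping the order in which jobs first appear."""
    return list(Counter(job for _, _, job in rows).items())


pivoted = pivot_job_count(rows)

# Cells transferred when the browser pivots (full dataset) versus
# when the backend pivots (aggregated table only).
cells_full = sum(len(r) for r in rows)
cells_pivot = sum(len(r) for r in pivoted)

print(pivoted)
print(cells_full, cells_pivot)
```

With only six rows the difference is small, but the full dataset grows with the number of rows while the pivoted table grows only with the number of distinct jobs.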
17
Transformation on Server
[Diagram: Interpreter, Zeppelin Server, Browser; thrift between Interpreter and Zeppelin Server]
Zeppelin Server note (fragment): { title: …. text: “select job, count(1) from data”, paragraphs: [ { results: { code: SUCCESS, msg: [ type: TABLE, data:
Full result dataset, shown at the Interpreter and at the Zeppelin Server:
age balance job
21 1030 Student
34 20331 Engineer
50 30193 Engineer
33 12019 Teacher
23 23211 Engineer
29 92327 Student
... ... ...
Labels between Browser and Zeppelin Server: Transform request (pivot), Result dataset fetch, Note update
Zeppelin Server: Transform (pivot)
Browser: Render
Pivoted result, the only table shown at the Browser:
job count
Student 2
Engineer 3
Teacher 1
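The server-side variant can be sketched as a request handler that keeps the full result and returns only the pivoted table. Everything here (the `RESULTS` store, the paragraph id, and `handle_transform_request`) is a hypothetical stand-in and not Zeppelin's actual server API.

```python
from collections import Counter

# Hypothetical in-memory store: the server holds the full result per paragraph.
RESULTS = {
    "paragraph-1": {
        "columns": ["age", "balance", "job"],
        "rows": [
            [21, 1030, "Student"],
            [34, 20331, "Engineer"],
            [50, 30193, "Engineer"],
            [33, 12019, "Teacher"],
            [23, 23211, "Engineer"],
            [29, 92327, "Student"],
        ],
    }
}


def handle_transform_request(paragraph_id, group_by):
    """Pivot on the server and return only the aggregated table."""
    result = RESULTS[paragraph_id]
    idx = result["columns"].index(group_by)
    counts = Counter(row[idx] for row in result["rows"])
    return {
        "columns": [group_by, "count"],
        "rows": [[key, count] for key, count in counts.items()],
    }


# The browser would send the transform request and render the response.
response = handle_transform_request("paragraph-1", "job")
print(response)
```

The browser never receives the six source rows in this arrangement, only the three aggregated ones.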
18
Transformation on Interpreter
[Diagram: Interpreter, Zeppelin Server, Browser; thrift between Interpreter and Zeppelin Server]
Zeppelin Server note (fragment): { title: …. text: “select job, count(1) from data”, paragraphs: [ { results: { code: SUCCESS, msg: [ type: TABLE, data:
Full result dataset, shown only at the Interpreter:
age balance job
21 1030 Student
34 20331 Engineer
50 30193 Engineer
33 12019 Teacher
23 23211 Engineer
29 92327 Student
... ... ...
Labels: Transform request (pivot), shown twice; Result dataset fetch; Note update
Interpreter: Transform (pivot)
Browser: Render
Pivoted result, shown at the Interpreter, the Zeppelin Server, and the Browser:
job count
Student 2
Engineer 3
Teacher 1
19
Transformation on where data is
[Diagram: data source, Interpreter, Zeppelin Server, Browser; thrift between Interpreter and Zeppelin Server]
Zeppelin Server note (fragment): { title: …. text: “select job, count(1) from data”, paragraphs: [ { results: { code: SUCCESS, msg: [ type: TABLE, data:
Full result dataset, shown only where the data is:
age balance job
21 1030 Student
34 20331 Engineer
50 30193 Engineer
33 12019 Teacher
23 23211 Engineer
29 92327 Student
... ... ...
Labels: Transform request (pivot), shown twice; Transform pushdown; Result dataset fetch; Note update
Browser: Render
Pivoted result, shown at every other stage:
job count
Student 2
Engineer 3
Teacher 1
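Pushdown can be illustrated by letting the data store compute the pivot itself. In the sketch below an in-memory SQLite database stands in for wherever the data lives; the table name `data` follows the query on the slide, while the helper names and the column whitelist are assumptions made for the example.

```python
import sqlite3

# Sample rows from the slides: (age, balance, job)
ROWS = [
    (21, 1030, "Student"),
    (34, 20331, "Engineer"),
    (50, 30193, "Engineer"),
    (33, 12019, "Teacher"),
    (23, 23211, "Engineer"),
    (29, 92327, "Student"),
]

# Column names cannot be bound as SQL parameters, so they are whitelisted.
ALLOWED_COLUMNS = {"age", "balance", "job"}


def make_source(rows):
    """Create an in-memory table standing in for the real data source."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE data (age INTEGER, balance INTEGER, job TEXT)")
    conn.executemany("INSERT INTO data VALUES (?, ?, ?)", rows)
    return conn


def pushdown_pivot(conn, group_by):
    """Run the pivot as a GROUP BY inside the data source."""
    if group_by not in ALLOWED_COLUMNS:
        raise ValueError(f"unknown column: {group_by}")
    sql = (
        f"SELECT {group_by}, COUNT(1) FROM data "
        f"GROUP BY {group_by} ORDER BY {group_by}"
    )
    return conn.execute(sql).fetchall()


conn = make_source(ROWS)
result = pushdown_pivot(conn, "job")
print(result)
```

Only the aggregated rows leave the data source, so neither the interpreter nor the server has to hold the full dataset.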
20
Related work
- Streaming data update (without refreshing the notebook)
- Separate transfer of the result dataset and the note to the browser
- Partial data fetch for table display
- Extending the TableData API
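Partial data fetch for table display might work along these lines: the table view asks for one page of rows at a time instead of the whole result. `fetch_page` and its offset/limit parameters are hypothetical and do not describe the TableData API.

```python
from itertools import islice


def fetch_page(rows, offset, limit):
    """Return one page of rows and a flag saying whether more rows remain."""
    page = list(islice(rows, offset, offset + limit))
    has_more = offset + limit < len(rows)
    return page, has_more


# Ten stand-in rows; a table view would request them four at a time.
all_rows = list(range(10))
first_page, more_after_first = fetch_page(all_rows, 0, 4)
last_page, more_after_last = fetch_page(all_rows, 8, 4)
print(first_page, more_after_first)
print(last_page, more_after_last)
```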
21
Sidebar and plugin widget
22
>
Sidebar show button.
23
Sidebar widgets
[Mockup: tabs Group1 and Group2, Sidebar widget #1, Sidebar widget #2, and a "<" Sidebar hide button]
Sidebar widgets can be grouped
24
TOC (table of contents) widget
Contents
1. This is notebook
   a. First
   b. Second
2. Next
   a. Next
One of the most popular features in Jupyter.
Google Colab also supports it.
Zeppelin has a SPELL for it.
See https://www.npmjs.com/package/zeppelin-toc-spell
25
Table data widget
Displays the list of tables, the schema of each table, and a preview of the data recognized by the Interpreter
Tables
Name    Temporary
table1  no
bank    yes
Schema
Column  Type
age     INT
job     TEXT
Preview
26
Clipboard
Drag and drop a paragraph onto the clipboard.
Then, in the same or another notebook, drag and drop the paragraph from the clipboard.
Drop paragraph here
Paragraph a
Paragraph b
27
Widget on Helium registry
28
Thank you!
Please contact Mei Long mlong@zepl.com with your email
address for an invite to Apache Zeppelin Slack workspace

Apache Zeppelin on Kubernetes with Spark and Kafka - meetup @twitter

  • 1. 1 Zeppelin Meetup Moonsoo Lee / Creator of Zeppelin moon@zepl.com @apachezeppelin
  • 2. 2 Agenda ⬢ Demo: Real-time Streaming ⬢ Demo: Zeppelin on Kubernetes ⬢ Zeppelin Roadmap ⬢ Q&A
  • 6. 6 Zeppelin server nginx DNS resolver Pod Kubernetes ApiServer Pod Python Interpreter python-intp rpc 12321 Pod Spark Interpreter spark-intp rpc 12321 spark-driver 22321 spark-block manager 22322 spark-ui 4040 Service Spark exec Spark execzeppelin-server http 80 rpc 12320 Create interpreter pod Create spark executor pod Ingress Service Service
  • 7. 7 Benefits MULTI-TENANCY Each note and/or user has own container for interpreters SCALABILITY Single host does not run all interpreters anymore SECURITY Each container is isolated (filesystem, process etc.)
  • 8. 8 Usage $ kubectl apply -f ${ZEPPELIN_HOME}/k8s/zeppelin-server.yaml * Need to build your own Zeppelin and Spark docker image before 0.9.0 is released 1. Build Zeppelin distribution package mvn package -Pbuild-distr … 2. Build Zeppelin docker image cd scripts/docker/zeppelin/bin; docker build -t … 3. Build Spark docker image <spark-distribution>/bin/docker-image-tool.sh -m -t 2.4.0 build Available in 0.9.0-SNAPSHOT http://zeppelin.apache.org/docs/0.9.0-SNAPSHOT/quickstart/kubernetes.html Run
  • 9. 9 Zeppelin Roadmap - Zeppelin on Kubernetes - Apply network policy to isolate Interpreter Pod - Schedule note on background as a Job in Kubernetes - Run extra application such as terminal, tensorboard, the sameway SparkUI works - Modernize front-end stack - Currently AngularJS - Dark theme? - Visualization - Realtime data visualization - Pivot in the backend side, instead of doing it in a front-end that require transfer all data to front-end - Sidebar - Sidebar with widgets, such as ToC (Table of Contents, list of data, etc) - Online widget registry (Helium) - Collaboration - Multi-cursor edit - Comment!
  • 10. 10 Zeppelin Roadmap Modernize front-end stack • Currently AngularJS • Dark theme Zeppelin on Kubernetes • Apply network policy to isolate Interpreter Pod • Schedule note on background as a Job in Kubernetes • Run extra application such as terminal, tensorboard, the sameway SparkUI works Collaboration • Multi-cursor edit • Comment! Sidebar • Sidebar with widgets, such as ToC (Table of Contents, list of data, etc) • Online widget registry (Helium) Visualization • Realtime data visualization • Pivot in the backend side, instead of doing it in a front-end that require transfer all data to front-end
  • 11. 11 Mailing list - Users: users@zeppelin.apache.org - Dev: dev@zeppelin.apache.org JIRA - https://issues.apache.org/jira/projects/ZEPPELIN Github - https://github.com/apache/zeppelin Questions, Suggestions, Discussions, Votes! Bug report, Track development/release progress Fixes, improvements, new features Join Apache Zeppelin community.
  • 12. Q&A
      - https://zeppelin.apache.org/, www.zepl.com, @ApacheZeppelin
      - Moonsoo Lee / Creator of Zeppelin: moon@zepl.com, @issuefreaks
      - For an Apache Zeppelin Slack invite, send your email address to Mei Long: mlong@zepl.com, @meitrappist1
  • 15. Transformation on browser (current)
      Diagram: Interpreter → (thrift) → Zeppelin Server → (http) → Browser
      - The full result table is copied at every hop; the browser applies the transform (pivot) and renders the result.
      - Note on the server: { title: …. text: “select job, count(1) from data”, paragraphs: [ { results: { code: SUCCESS, msg: [ type: TABLE, data: …
      - Sample table (age, balance, job): (21, 1030, Student), (34, 20331, Engineer), (50, 30193, Engineer), (33, 12019, Teacher), (23, 23211, Engineer), (29, 92327, Student), ...
      - Pivoted result rendered in the browser (job, count): Student 2, Engineer 3, Teacher 1
  • 16. Problem
      - The entire result dataset needs to be transferred to the browser, even though not all of it is rendered.
      - Browser CPU and memory limit how much data can be transformed and rendered.
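To make the transfer cost concrete, here is a minimal Python sketch using the six sample rows shown on the "Transformation on browser" slide. It pivots them into per-job counts and compares the JSON size of the raw table with that of the pivoted table. The function names and the JSON encoding are assumptions for illustration only; they are not Zeppelin's actual wire format.

```python
import json
from collections import Counter

# Sample rows from the slides: (age, balance, job).
ROWS = [
    (21, 1030, "Student"),
    (34, 20331, "Engineer"),
    (50, 30193, "Engineer"),
    (33, 12019, "Teacher"),
    (23, 23211, "Engineer"),
    (29, 92327, "Student"),
]


def pivot_job_counts(rows):
    """Count rows per job, keeping the order in which jobs first appear."""
    counts = Counter(job for _age, _balance, job in rows)
    return list(counts.items())


def payload_bytes(table):
    """Size of the table when serialized as JSON (illustrative encoding only)."""
    return len(json.dumps(table).encode("utf-8"))


pivoted = pivot_job_counts(ROWS)
print(pivoted)
print(payload_bytes(ROWS), "bytes raw vs", payload_bytes(pivoted), "bytes pivoted")
```

With only six rows the difference is small, but the raw payload grows with the number of rows while the pivoted payload grows only with the number of distinct jobs.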
  • 17. Transformation on Server
      Diagram: Interpreter → (thrift, result dataset fetch) → Zeppelin Server → (note update) → Browser
      - The browser sends a transform request (pivot) to the Zeppelin Server.
      - The server fetches the full result dataset (columns age, balance, job) from the interpreter and applies the transform (pivot) itself.
      - Only the pivoted result (job, count: Student 2, Engineer 3, Teacher 1) reaches the browser in the note update, and the browser renders it.
      - Note on the server: { title: …. text: “select job, count(1) from data”, paragraphs: [ { results: { code: SUCCESS, msg: [ type: TABLE, data: …
  • 18. Transformation on Interpreter
      Diagram: Browser → (transform request) → Zeppelin Server → (transform request, thrift) → Interpreter
      - The transform request (pivot) is forwarded from the browser through the Zeppelin Server to the interpreter.
      - The interpreter holds the full result dataset (columns age, balance, job) and applies the transform (pivot).
      - Only the pivoted result (job, count: Student 2, Engineer 3, Teacher 1) comes back through the result dataset fetch and the note update; the browser renders it.
      - Note on the server: { title: …. text: “select job, count(1) from data”, paragraphs: [ { results: { code: SUCCESS, msg: [ type: TABLE, data: …
  • 19. Transformation where the data is
      Diagram: Browser → (transform request) → Zeppelin Server → (transform request, thrift) → Interpreter → (transform pushdown) → data
      - The transform request (pivot) is forwarded from the browser through the Zeppelin Server to the interpreter, which pushes the transform down to where the data (columns age, balance, job) lives.
      - Only the pivoted result (job, count: Student 2, Engineer 3, Teacher 1) comes back through the result dataset fetch and the note update; the browser renders it.
      - Note on the server: { title: …. text: “select job, count(1) from data”, paragraphs: [ { results: { code: SUCCESS, msg: [ type: TABLE, data: …
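As a rough illustration of transform pushdown, the Python sketch below loads the sample rows from the slides into an in-memory SQLite table and lets the database compute the pivot, so only the aggregated rows are ever fetched. SQLite stands in for "where the data is", and the function name `fetch_pivot_pushdown` is hypothetical; the slides do not say how Zeppelin would implement this.

```python
import sqlite3

# Sample rows from the slides: (age, balance, job).
ROWS = [
    (21, 1030, "Student"),
    (34, 20331, "Engineer"),
    (50, 30193, "Engineer"),
    (33, 12019, "Teacher"),
    (23, 23211, "Engineer"),
    (29, 92327, "Student"),
]

# In-memory database standing in for the remote data source (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (age INTEGER, balance INTEGER, job TEXT)")
conn.executemany("INSERT INTO data VALUES (?, ?, ?)", ROWS)


def fetch_pivot_pushdown(connection):
    """Run the aggregation inside the data source and fetch only the result."""
    cursor = connection.execute(
        "SELECT job, COUNT(1) FROM data GROUP BY job ORDER BY job"
    )
    return cursor.fetchall()


result = fetch_pivot_pushdown(conn)
print(result)
```

The caller receives one row per distinct job, however many rows the underlying table holds.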
  • 20. Related work
      - Streaming data update (without refreshing the notebook)
      - Separate transfer of the result dataset and the note to the browser
      - Partial data fetch for table display
      - Extending the TableData API
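"Partial data fetch for table display" can be pictured as paging: the table view asks for one window of rows at a time instead of the whole result. The Python sketch below is a generic illustration under that assumption; `fetch_page` and its parameters are hypothetical and are not part of the TableData API.

```python
from typing import List, Sequence, Tuple


def fetch_page(rows: Sequence[Tuple], offset: int, limit: int) -> List[Tuple]:
    """Return at most `limit` rows starting at `offset` (hypothetical helper)."""
    if offset < 0 or limit <= 0:
        raise ValueError("offset must be >= 0 and limit must be > 0")
    return list(rows[offset:offset + limit])


# Sample rows from the slides: (age, balance, job).
ROWS = [
    (21, 1030, "Student"),
    (34, 20331, "Engineer"),
    (50, 30193, "Engineer"),
    (33, 12019, "Teacher"),
    (23, 23211, "Engineer"),
    (29, 92327, "Student"),
]

first_page = fetch_page(ROWS, offset=0, limit=4)
second_page = fetch_page(ROWS, offset=4, limit=4)
print(first_page)
print(second_page)
```

A table display that shows four rows at a time would request the second page only when the user scrolls to it.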
  • 23. Sidebar widgets
      - Mockup: Sidebar widget #1 and Sidebar widget #2, arranged under Group1 and Group2, with a "<" sidebar hide button
      - Sidebar widgets can be grouped
  • 24. TOC (table of contents) widget
      - Mockup: Contents: 1. This is notebook (a. First, b. Second), 2. Next (a. Next)
      - One of the most popular features in Jupyter. Google Colab also supports it.
      - Zeppelin has a SPELL for it; see https://www.npmjs.com/package/zeppelin-toc-spell
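A table of contents like the one in the mockup could be derived from headings in the note's paragraphs. The Python sketch below assumes paragraphs contain Markdown-style `#` and `##` headings and numbers them the way the mockup does (1., a., b., ...). This is an assumption for illustration; it is not how zeppelin-toc-spell is implemented, and `build_toc` is a hypothetical name.

```python
import re
from string import ascii_lowercase

# Matches level-1 and level-2 Markdown headings only (assumed input format).
HEADING = re.compile(r"^(#{1,2})\s+(.*\S)\s*$")


def build_toc(paragraphs):
    """Return ToC lines: numbered top-level entries, lettered second-level entries."""
    lines = []
    top = 0  # counter for level-1 headings
    sub = 0  # counter for level-2 headings under the current level-1 heading
    for text in paragraphs:
        for line in text.splitlines():
            match = HEADING.match(line)
            if not match:
                continue
            level, title = len(match.group(1)), match.group(2)
            if level == 1:
                top += 1
                sub = 0
                lines.append(f"{top}. {title}")
            else:
                letter = ascii_lowercase[sub % 26]
                sub += 1
                lines.append(f"   {letter}. {title}")
    return lines


paragraphs = [
    "# This is notebook\nsome text",
    "## First",
    "## Second\nmore text",
    "# Next\n## Next",
]
toc = build_toc(paragraphs)
print("\n".join(toc))
```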
  • 25. Table data widget
      - Displays the list of tables, the schema of each table, and a preview of the data recognized by the interpreter.
      - Mockup: Tables (Name, Temporary): table1 no, bank yes; Schema (Column, Type): age INT, job TEXT; Preview
  • 26. Clipboard
      - Drag and drop a paragraph onto the clipboard; then, in the same or another notebook, drag and drop the paragraph from the clipboard.
      - Mockup: "Drop paragraph here" area holding Paragraph a and Paragraph b
  • 28. Thank you! Please contact Mei Long (mlong@zepl.com) with your email address for an invite to the Apache Zeppelin Slack workspace.