JStorm Introduction
-- zhongyan.feng@alibaba.com
Alibaba
Agenda
Difference with Storm
Plan
Current Stats
What is JStorm?
• A Java implementation of Storm
– More powerful features
– More stable
– Faster
Who are we?
• The JStorm team was among the earliest users of Storm in China.
– Storm 0.5.1/0.5.4/0.6.0/0.6.2/0.7.0/0.7.1
– JStorm 0.7.1/0.9.0/0.9.1/0.9.2/0.9.3/…
• Our duties
– Application development
– JStorm system development
– JStorm system operation
Why start JStorm?
• The Storm community is not as active as we expected
– Tailored for enterprise environments
– Fixed critical bugs in Storm
– Provided professional technical support and sped up application development
– Reduced operational cost
Evolution speed
• Heavy demand drives us to move faster
– Released 11 versions in 2014
– See https://github.com/alibaba/jstorm/releases
JStorm history
• Design started on 2012/02/07
• First version, 0.7.1, released on 2013/04/30
Who is using JStorm?
• Most of the leading Chinese companies
How big is JStorm at Alibaba?
• More than 3000 servers
• More than 3 trillion messages per day
• More than 300 topologies
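To put the daily volume in per-second terms, here is a small arithmetic sketch in Java. It treats the slide's "more than" figures (3 trillion messages per day, 3000 servers) as lower bounds and assumes, purely for illustration, that load is spread evenly over the day and over the servers. The class and method names are hypothetical.

```java
// Back-of-the-envelope conversion of the slide's figures into per-second rates.
// Assumption: load is uniform over time and across servers (real traffic is bursty).
final class ClusterScale {
    static final long MESSAGES_PER_DAY = 3_000_000_000_000L; // "more than 3 trillion"
    static final long SERVERS = 3_000L;                      // "more than 3000"
    static final long SECONDS_PER_DAY = 24L * 60 * 60;

    // Average messages per second across the whole cluster (integer division).
    static long messagesPerSecond() {
        return MESSAGES_PER_DAY / SECONDS_PER_DAY;
    }

    // Average messages per second handled by one server.
    static long messagesPerSecondPerServer() {
        return messagesPerSecond() / SERVERS;
    }

    public static void main(String[] args) {
        System.out.println("cluster:    " + messagesPerSecond() + " msg/s");
        System.out.println("per server: " + messagesPerSecondPerServer() + " msg/s");
    }
}
```

Under these assumptions the cluster averages about 34.7 million messages per second, or roughly 11,500 per server.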
User Scenario
• Alibaba 11.11 live room
– Trade amount/count
– PV/UV
– All kinds of KPIs
• The peak volume of messages that JStorm processes during the 11.11 and 12.12 shopping festivals is ten times the peak volume on a normal day.
User Scenario
• Real-time ad recommendation
– Analyze user actions, then recommend products
User Scenario
• Log analysis
– Compute all kinds of KPIs
– Monitoring
– Smart customer service
– Tlog/EagleEye
User Scenario
• Real-time data sync pipeline
– DB
– Log
– Message
Why stable?
• Three major tests every year
– 11/11
– 12/12
– Spring Festival, red envelope war
– Peak periods with ten times the normal throughput
Improve stability
• Nimbus HA
• Resource isolation with cgroups
• Fixed bugs when running on Hadoop YARN
• Monitoring of every phase of a tuple
• Tuned GC parameters
• Graceful worker shutdown
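The slides do not describe how JStorm shuts a worker down gracefully. As a generic illustration of the idea only, the following Java sketch stops accepting new work, lets queued tasks finish, and forces termination only after a timeout. `GracefulWorker` and its methods are hypothetical names, not JStorm classes.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Generic graceful-shutdown pattern; not JStorm's actual worker code.
final class GracefulWorker {
    private final ExecutorService executor = Executors.newFixedThreadPool(2);
    private final AtomicInteger processed = new AtomicInteger();

    // Queue one unit of work; "processing" here is just a counter increment.
    void submit(int tuple) {
        executor.execute(processed::incrementAndGet);
    }

    // Returns true if all queued work drained before the timeout.
    boolean shutdownGracefully(long timeoutMillis) {
        executor.shutdown(); // reject new tasks, keep running the queued ones
        try {
            if (executor.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS)) {
                return true;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for callers
        }
        executor.shutdownNow(); // timed out or interrupted: cancel what is left
        return false;
    }

    int processedCount() {
        return processed.get();
    }

    public static void main(String[] args) {
        GracefulWorker worker = new GracefulWorker();
        for (int i = 0; i < 100; i++) {
            worker.submit(i);
        }
        boolean drained = worker.shutdownGracefully(5_000);
        System.out.println("drained=" + drained + " processed=" + worker.processedCount());
    }
}
```

The point of the pattern is that work already accepted is not lost when the process is asked to stop; only a stuck task triggers the forced path.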
Faster
• 6 servers (24 cores/98 GB)
• 18 spouts/18 bolts/18 ackers
[Figure: Throughput vs workers. x-axis: workers (0 to 60); y-axis: poll tuples per 10 s (0 to 12,000,000); series: jstorm, storm. Data labels, in extraction order: 9280598, 10818815, 9065965, 6819139, 5610201, 6243680, 6830500, 5595900, 5474180, 3379800]
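The benchmark setup above dedicates 18 tasks to ackers. For readers unfamiliar with what an acker does, here is a minimal Java sketch of the XOR-based tracking described in Storm's documentation on guaranteed message processing: each tuple tree is tracked with a single 64-bit value that returns to zero once every tuple in the tree has been acked. The slides do not say how JStorm's tuned acking framework differs, so this is a sketch of the general technique only, and `XorAckTracker` is a hypothetical name.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of XOR-based tuple-tree tracking; not the Storm or JStorm acker itself.
final class XorAckTracker {
    // rootId -> XOR of all tuple ids that were created but not yet acked
    private final Map<Long, Long> pending = new HashMap<>();

    // A spout emitted a root tuple: start tracking its tree.
    void init(long rootId, long tupleId) {
        pending.merge(rootId, tupleId, (a, b) -> a ^ b);
    }

    // A bolt acked ackedTupleId after emitting childIds anchored to it.
    // Returns true when the whole tree rooted at rootId is fully processed.
    boolean ack(long rootId, long ackedTupleId, long... childIds) {
        Long current = pending.get(rootId);
        if (current == null) {
            return false; // unknown or already completed tree
        }
        long update = ackedTupleId;
        for (long child : childIds) {
            update ^= child; // each child enters the tree once, leaves once when acked
        }
        long next = current ^ update;
        if (next == 0L) {
            pending.remove(rootId);
            return true;
        }
        pending.put(rootId, next);
        return false;
    }

    int pendingCount() {
        return pending.size();
    }

    public static void main(String[] args) {
        XorAckTracker tracker = new XorAckTracker();
        tracker.init(1L, 10L);                               // spout emits tuple 10
        System.out.println(tracker.ack(1L, 10L, 11L, 12L));  // bolt acks 10, emits 11 and 12
        System.out.println(tracker.ack(1L, 11L));            // 11 acked, 12 still pending
        System.out.println(tracker.ack(1L, 12L));            // tree complete
    }
}
```

The small ids here are for readability; in Storm the ids are random 64-bit values, which makes an accidental zero very unlikely. The state per tree stays constant in size no matter how many tuples the tree contains.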
Why faster?
• Dedicated deserialization thread
• Dedicated ack/fail thread in spouts
• No CPU spin-waiting
• Better-tuned sampling logic
• Better-tuned acking framework
• Better-tuned GC
• Better Netty RPC framework
• Less memory copying with ZeroMQ
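Two of the points above, a dedicated deserialization thread and avoiding CPU spin-waiting, can be illustrated together. The Java sketch below is not JStorm's code: it assumes raw messages arrive as UTF-8 bytes, uses a plain `LinkedBlockingQueue` for the hand-off, and all class and method names are hypothetical. The deserializer thread blocks in `take()` when there is nothing to do instead of polling in a loop.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// One thread decodes raw bytes so that processing threads receive ready-made tuples.
final class DeserializingPipeline {
    private static final byte[] POISON = new byte[0]; // shutdown marker, compared by identity

    private final BlockingQueue<byte[]> rawQueue = new LinkedBlockingQueue<>();
    private final BlockingQueue<String> tupleQueue = new LinkedBlockingQueue<>();
    private final Thread deserializer = new Thread(this::deserializeLoop, "deserializer");

    void start() {
        deserializer.start();
    }

    // Called by the network layer with a serialized message.
    void receive(byte[] raw) {
        rawQueue.add(raw);
    }

    private void deserializeLoop() {
        try {
            while (true) {
                byte[] raw = rawQueue.take(); // parks the thread when empty: no spinning
                if (raw == POISON) {
                    return;
                }
                tupleQueue.put(new String(raw, StandardCharsets.UTF_8));
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Stops the deserializer after it has handled everything queued so far
    // and returns the decoded tuples in arrival order.
    List<String> stopAndDrain() {
        rawQueue.add(POISON);
        try {
            deserializer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        List<String> decoded = new ArrayList<>();
        tupleQueue.drainTo(decoded);
        return decoded;
    }

    public static void main(String[] args) {
        DeserializingPipeline pipeline = new DeserializingPipeline();
        pipeline.start();
        pipeline.receive("hello".getBytes(StandardCharsets.UTF_8));
        pipeline.receive("world".getBytes(StandardCharsets.UTF_8));
        System.out.println(pipeline.stopAndDrain());
    }
}
```

Blocking queues trade a little wake-up latency for idle CPUs; a spin-waiting consumer would keep a core busy even when no messages arrive.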
More features
• More powerful scheduler
• More powerful metrics system
• Classloader support
• More convenient Web UI/LogView
• Sync mode for the Netty RPC framework
• New transaction programming mode
• Self-adaptive speed
More details
• More than 100 improvements
– https://github.com/alibaba/JStorm/blob/master/history.md
What can we bring?
• Faster evolution
– Full-time developers
– Full-time testers
– Hundreds of applications that can test new features quickly
– A Java core will attract more developers
Import new plugin
• Provide a programming framework like Trident
SQL Engine
• In about a year, we may open-source our SQL engine
Port Spark's features
• We are going to port some Spark features to our system.
