1. (Cotinuous) Queryの登録
2.データを流す、あるいは既に流れている
3. 入力されたデータに対して処理を行う “on the fly”
4. 処理結果を(クライアントに対して)継続的に出力する
Data Stream Management System(DSMS)
10
Continuous Queries
Streaming Inputs Streaming Outputs
data stream
1
3
42
11.
DBMSとDSMSの比較
11
DBMS DSMS
Data persistentrelations streams, time windows
Data access random sequential, one-pass
Updates arbitrary append-only
Update rates relatively low high, bursty
Processing model query-driven (pull-based) data-driven (push-based)
Queries one-time continuous
Query plans fixed adaptive
Query optimization one query multi-query
Query answers exact exact or approximate
Latency relatively high low
[Golab et al., 2010] p3 “Summary of differences between a DBMS and a DSMS”
Continuous Query Operators:シンプルな例
selection
join
count
13
σa
S1 a a a a
f
pass or drop
⋈
S1
b d
c a d b a
insert
S2
b d g f e
probe
9S1 10 9 8 7
update (to “10”)
b a
f
generate result
b
SensorBee: Source
Source
–Topologyへの入力を表現するComponent
19
Continuous QueriesSources
data stream
S1
S2
S3
CREATE SOURCE S1 TYPE fluentd WITH ...;
CREATE SOURCE S2 TYPE mqtt WITH ...;
CREATE SOURCE S3 TYPE user_source WITH ...;
SensorBee: Sink
Sink
–Topologyからの出力を定義する
21
Streams
Sources
S1
S2
S3
B1
B3
B2
Sinks
D1
D2
D3
CREATE SINK D1 TYPE fluentd WITH ...;
INSERT INTO D1 FROM B1;
CREATE SINK D2 TYPE mqtt WITH ...;
CREATE SINK D3 TYPE user_sink WITH ...;
22.
SensorBee: User DefinedStream Function (UDSF)
UDSF
– 新たなSourceとして振る舞えるユーザ定義関数
22
Streams
Sources
S1
S2
S3
B1
B3
B2
Sinks
D1
D2
D3
CREATE SOURCE B2 AS SELECT RSTREAM
* FROM udsf1(“S2”) [RANGE 1 SECONDS];
23.
SensorBee: User DefinedState (UDS)
UDS
– ストリーム上の各Componentから共通でアクセスできるShared State
23
Streams
Sources
S1
S2
S3
B1
B3
B2
Sinks
D1
D2
D3
CREATE STATE G1 TYPE user_state WITH...;
CREATE SOURCE B3 AS SELECT ISTREAM
* udf3(“G1”, B2:*), S3:*
FROM B2 [RANGE 1 SECONDS], S3 [RANGE 1 SECONDS]
WHERE B2:foo = S3:foo;
G1
24.
Example: Twitterのつぶやきの分類
機械学習と組み合わせたデモ
–Tutorial収録 http://sensorbee.readthedocs.org/en/latest/tutorial.html#using-
machine-learning
– Elasticsearchと機械学習を実際に連携させる
http://www.slideshare.net/nobu_k/elasticsearch-59627321
24
Twitter
Gen
der
Ag
e
Form
atting
Form
atting
Form
atting Labeli
ng
fluentd
参考文献
A. Arasu,S. Badu, and J. Widom. The CQL continuous query language:
Semantic foundations and query execution, 2006.
N. Jain, S. Mishra, A. Srinivasan, J. Gehrke, J. Widom, H. Balakrishnam,
U. Cetintemel, M. Cheriniack, R. Tibbertts, and S. Zdonik. Towerds a
streaming SQL standard, 2008.
Lukasz Golab, M. Tamer Özsu. Data Stream Management, 2010.
26