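10. Data Stream Management System (DSMS)
1. Register a (continuous) query
2. Start streaming data in, or attach to data that is already flowing
3. Process the incoming data "on the fly"
4. Continuously emit the results (to the client)
[Figure: a data stream arrives as streaming inputs, the registered continuous queries process it, and the results leave as streaming outputs]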
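The four steps above can be sketched as a toy push-based engine. This is only an illustration under assumptions: `TinyDSMS`, `register`, and `push` are hypothetical names, not part of any real DSMS, and a query here is reduced to a filter predicate plus an output callback.

```python
from typing import Any, Callable, Dict, List, Tuple

Item = Dict[str, Any]


class TinyDSMS:
    """Toy engine (hypothetical): queries are registered first, data is pushed later."""

    def __init__(self) -> None:
        self._queries: List[Tuple[Callable[[Item], bool], Callable[[Item], None]]] = []

    def register(self, predicate: Callable[[Item], bool],
                 on_result: Callable[[Item], None]) -> None:
        # Step 1: register a continuous query (a filter plus an output callback).
        self._queries.append((predicate, on_result))

    def push(self, item: Item) -> None:
        # Steps 2-4: every arriving item is evaluated immediately ("on the fly")
        # against each registered query, and matches are emitted right away.
        for predicate, on_result in self._queries:
            if predicate(item):
                on_result(item)


engine = TinyDSMS()
results: List[Item] = []  # stands in for the client receiving the output
engine.register(lambda t: t["temp"] > 30, results.append)

for reading in [{"temp": 25}, {"temp": 31}, {"temp": 42}]:
    engine.push(reading)

print(results)
```

The query stays registered and keeps producing output for as long as data keeps arriving, which is the main difference from a one-time query against stored data.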
11. Comparison of DBMS and DSMS
                   | DBMS                      | DSMS
Data               | persistent relations      | streams, time windows
Data access        | random                    | sequential, one-pass
Updates            | arbitrary                 | append-only
Update rates       | relatively low            | high, bursty
Processing model   | query-driven (pull-based) | data-driven (push-based)
Queries            | one-time                  | continuous
Query plans        | fixed                     | adaptive
Query optimization | one query                 | multi-query
Query answers      | exact                     | exact or approximate
Latency            | relatively high           | low
[Golab et al., 2010] p3 “Summary of differences between a DBMS and a DSMS”
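Several DSMS-side entries in the table (time windows, sequential one-pass access, append-only updates) can be seen together in a small sliding-window aggregate. The sketch below is an assumption-laden illustration, not taken from the cited book: `SlidingWindowAverage` is a hypothetical name, timestamps are assumed to arrive in non-decreasing order, and the window length is assumed to be positive.

```python
from collections import deque
from typing import Deque, Tuple


class SlidingWindowAverage:
    """Average over the last `range_seconds` of an append-only stream (hypothetical)."""

    def __init__(self, range_seconds: float) -> None:
        self.range_seconds = range_seconds
        self._items: Deque[Tuple[float, float]] = deque()  # (timestamp, value), oldest first
        self._total = 0.0

    def append(self, timestamp: float, value: float) -> float:
        # Append-only: new tuples are only ever added at the tail.
        self._items.append((timestamp, value))
        self._total += value
        # One-pass: tuples that fall out of the time window are evicted and
        # never revisited.
        while self._items[0][0] <= timestamp - self.range_seconds:
            _, old_value = self._items.popleft()
            self._total -= old_value
        return self._total / len(self._items)


window = SlidingWindowAverage(range_seconds=10)
for ts, value in [(0, 2.0), (5, 4.0), (12, 6.0)]:
    print(ts, window.append(ts, value))
```

Only the tuples inside the current window are kept in memory, so the state stays bounded even though the stream itself is unbounded.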
13. Continuous Query Operators: simple examples
selection
join
count
[Figure: operator diagrams. Selection (σa) over S1: pass or drop. Join (⋈) of S1 and S2: insert, probe, generate result. Count over S1: update (to "10").]
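The three operators can be sketched as small push-based components. This is a minimal sketch under assumptions: the names `select`, `SymmetricHashJoin`, and `Count` are hypothetical, the join is written as a symmetric hash join (one common way to realise the insert/probe pattern), and its state is left unbounded here, whereas a practical system would typically bound it with windows.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, Hashable, List, Tuple


def select(predicate: Callable[[Any], bool], item: Any) -> List[Any]:
    """Selection: pass the tuple through, or drop it."""
    return [item] if predicate(item) else []


class SymmetricHashJoin:
    """Equi-join of two streams, "S1" and "S2", keeping one hash table per stream."""

    def __init__(self) -> None:
        self._tables: Dict[str, Dict[Hashable, List[Any]]] = {
            "S1": defaultdict(list),
            "S2": defaultdict(list),
        }

    def push(self, side: str, key: Hashable, item: Any) -> List[Tuple[Any, Any]]:
        other = "S2" if side == "S1" else "S1"
        # Insert: remember the new tuple in its own stream's table.
        self._tables[side][key].append(item)
        # Probe: look up matching tuples in the other stream's table.
        matches = self._tables[other].get(key, [])
        # Generate result: always emit pairs as (S1 tuple, S2 tuple).
        if side == "S1":
            return [(item, m) for m in matches]
        return [(m, item) for m in matches]


class Count:
    """Count: every arriving tuple updates the running total."""

    def __init__(self) -> None:
        self.value = 0

    def push(self, item: Any) -> int:
        self.value += 1
        return self.value


join = SymmetricHashJoin()
print(join.push("S1", "b", "b-from-S1"))
print(join.push("S2", "b", "b-from-S2"))
```

None of the operators wait for the end of the input; each one reacts to a single arriving tuple and updates only the state it needs.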
21. SensorBee: Sink
Sink
– Defines an output of the topology
[Figure: topology of Sources S1 to S3, Streams B1 to B3, and Sinks D1 to D3]
CREATE SINK D1 TYPE fluentd WITH ...;
INSERT INTO D1 FROM B1;
CREATE SINK D2 TYPE mqtt WITH ...;
CREATE SINK D3 TYPE user_sink WITH ...;
22. SensorBee: User Defined Stream Function (UDSF)
UDSF
– A user-defined function that can behave as a new Source
[Figure: topology of Sources S1 to S3, Streams B1 to B3, and Sinks D1 to D3]
CREATE SOURCE B2 AS SELECT RSTREAM
* FROM udsf1("S2") [RANGE 1 SECONDS];
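Conceptually, a stream function consumes an input stream and emits a new stream, possibly with a different number of tuples. The Python generator below only illustrates that idea. It is not SensorBee's UDSF API (SensorBee itself is written in Go), `split_words` and the `text`/`word` fields are hypothetical, and the windowing (`[RANGE 1 SECONDS]`) and `RSTREAM` semantics of the statement above are left out.

```python
from typing import Dict, Iterable, Iterator


def split_words(stream: Iterable[Dict[str, str]]) -> Iterator[Dict[str, str]]:
    """Hypothetical stream function: one input tuple yields zero or more output tuples."""
    for item in stream:
        for word in item["text"].split():
            # Each emitted tuple looks like it came from a source of its own.
            yield {"word": word}


upstream = [{"text": "hello stream world"}, {"text": ""}]
for out in split_words(upstream):
    print(out)
```

Because the function decides when and how many tuples to emit, downstream components can treat its output like any other source.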
23. SensorBee: User Defined State (UDS)
UDS
– Shared state that every component on the stream can access
[Figure: topology of Sources S1 to S3, Streams B1 to B3, and Sinks D1 to D3, with shared state G1]
CREATE STATE G1 TYPE user_state WITH ...;
CREATE SOURCE B3 AS SELECT ISTREAM
* udf3("G1", B2:*), S3:*
FROM B2 [RANGE 1 SECONDS], S3 [RANGE 1 SECONDS]
WHERE B2:foo = S3:foo;
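The idea of a named state shared by several components can be sketched as a small registry that functions look up by name. Everything here is an assumption for illustration: `SharedCounts`, `STATES`, `tag_with_count`, and `current_count` are hypothetical and do not reflect SensorBee's Go API. A lock is used on the assumption that components may run concurrently.

```python
import threading
from typing import Any, Dict


class SharedCounts:
    """Hypothetical shared state: per-key counters guarded by a lock."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._counts: Dict[Any, int] = {}

    def observe(self, key: Any) -> int:
        with self._lock:
            self._counts[key] = self._counts.get(key, 0) + 1
            return self._counts[key]

    def peek(self, key: Any) -> int:
        with self._lock:
            return self._counts.get(key, 0)


# Registry of named states; "G1" mirrors the name used in the statement above.
STATES: Dict[str, SharedCounts] = {"G1": SharedCounts()}


def tag_with_count(state_name: str, item: Dict[str, Any]) -> Dict[str, Any]:
    """Plays the role of a function such as udf3: updates the state it is given by name."""
    state = STATES[state_name]
    return {**item, "seen": state.observe(item["foo"])}


def current_count(state_name: str, key: Any) -> int:
    """A second component reading the same state."""
    return STATES[state_name].peek(key)
```

Calling `tag_with_count("G1", {"foo": "x"})` twice and then `current_count("G1", "x")` shows both functions observing the same counter, which is the point of passing the state by name.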
24. Example: Classifying tweets
A demo combined with machine learning
– Included in the tutorial: http://sensorbee.readthedocs.org/en/latest/tutorial.html#using-machine-learning
– Integrating Elasticsearch with machine learning in practice:
http://www.slideshare.net/nobu_k/elasticsearch-59627321
[Figure: data flow from Twitter to fluentd, with Gender, Age, Formatting, and Labeling components]
26. References
A. Arasu, S. Babu, and J. Widom. The CQL continuous query language:
Semantic foundations and query execution, 2006.
N. Jain, S. Mishra, A. Srinivasan, J. Gehrke, J. Widom, H. Balakrishnan,
U. Cetintemel, M. Cherniack, R. Tibbetts, and S. Zdonik. Towards a
streaming SQL standard, 2008.
L. Golab and M. T. Özsu. Data Stream Management, 2010.