Two years ago, Spotify introduced Scio, an open-source Scala framework for developing data pipelines and deploying them on Google Dataflow. In this talk, we will discuss the evolution of Scio and share the highlights of running it in production for two years. We will showcase several interesting data processing workflows run at Spotify, what we learned from running them in production, and how we used that knowledge to make Scio faster, safer, and easier to use.
22. Example query for extracting relationships (1)
match p = (:Character)-[*1]->(:PcSkill) return p limit 1
24ms
Fetch one link between a character and a skill
23. Example query for extracting relationships (2)
match p = (c:Character)-[]-(s:PcSkill)-[]-(e:PcSkillEffect)-[]-
(b:BattleEffect)-[]-(g:`.png`)
return p limit 1
6ms
Traverse character -> skill -> skill effect -> battle effect -> file
24. Example query for extracting relationships (3)
match p = shortestpath((c:Character)-[*]-(g:`.ogg`))
return p limit 1
8ms
Extract one shortest path between a character and an .ogg file
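To make the idea of a shortest path over undirected relationships concrete, here is a minimal Python sketch of a breadth-first search on a toy graph. It only illustrates the concept and does not describe how Neo4j evaluates shortestpath internally. The node names (including the `Voice` node and both file names) are hypothetical and do not come from the slides.

```python
from collections import deque


def shortest_path(adjacency, start, is_target):
    """Return the path with the fewest hops from start to any target node, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if is_target(node):
            return path
        for neighbour in adjacency.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None


# Hypothetical toy graph; every edge is listed in both directions
# because the pattern -[*]- ignores relationship direction.
adjacency = {
    "Character:1": ["PcSkill:10", "Voice:5"],
    "PcSkill:10": ["Character:1", "PcSkillEffect:20"],
    "PcSkillEffect:20": ["PcSkill:10", "se.ogg"],
    "Voice:5": ["Character:1", "voice.ogg"],
    "voice.ogg": ["Voice:5"],
    "se.ogg": ["PcSkillEffect:20"],
}

ogg_path = shortest_path(adjacency, "Character:1", lambda n: n.endswith(".ogg"))
print(ogg_path)
```

In this toy graph the two-hop route through the hypothetical `Voice` node is found before the three-hop route through the skill effect.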
26.
match
p = shortestpath((c:Character)-[r*]-(f {_nodeType: 'file'}))
with c, f, p,
reduce(cost = 0, n in nodes(p) | cost + n._cost) as cost
where
cost <= 60 and
not any (x in relationships(p) where x.name = 'AreaList.id') and
not any (x in relationships(p) where x.name = 'AreaObject.id') and
not any (x in relationships(p) where x.name = 'Location.id')
return c.id, f.realPath
order by c.id, f.realPath;
Aggregate the number of files linked to each Character, per ID
The actual aggregation query
Controlled with cost calculation and guard conditions
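The cost calculation and guard conditions can be mimicked outside the database. The sketch below is a Python analogue, assuming each path is already available as plain dictionaries. The sample paths, their costs, and the relationship names `PcSkill.id` and `BattleEffect.id` are invented for illustration. Only the threshold of 60 and the three blocked names are taken from the query.

```python
from collections import defaultdict
from functools import reduce

# Relationship names excluded by the guard conditions in the query.
BLOCKED_NAMES = {"AreaList.id", "AreaObject.id", "Location.id"}
MAX_COST = 60  # same threshold as "cost <= 60"


def path_cost(nodes):
    # Mirrors: reduce(cost = 0, n in nodes(p) | cost + n._cost)
    return reduce(lambda cost, n: cost + n["_cost"], nodes, 0)


def passes_guards(relationships):
    # Mirrors the three "not any (x in relationships(p) where x.name = ...)" clauses.
    return not any(r["name"] in BLOCKED_NAMES for r in relationships)


def files_per_character(paths):
    # Keep paths within the cost budget that avoid blocked relationships,
    # then count the distinct files reached from each character id.
    files = defaultdict(set)
    for p in paths:
        if path_cost(p["nodes"]) <= MAX_COST and passes_guards(p["relationships"]):
            files[p["character_id"]].add(p["real_path"])
    return {cid: len(found) for cid, found in sorted(files.items())}


# Hypothetical sample paths (ids, costs, and file names are made up).
sample_paths = [
    {"character_id": 1, "real_path": "a.png",
     "nodes": [{"_cost": 10}, {"_cost": 20}, {"_cost": 5}],
     "relationships": [{"name": "PcSkill.id"}, {"name": "BattleEffect.id"}]},
    {"character_id": 1, "real_path": "b.ogg",  # total cost 70, over budget
     "nodes": [{"_cost": 10}, {"_cost": 60}],
     "relationships": [{"name": "PcSkill.id"}]},
    {"character_id": 2, "real_path": "c.png",  # uses a blocked relationship
     "nodes": [{"_cost": 10}, {"_cost": 10}],
     "relationships": [{"name": "Location.id"}]},
    {"character_id": 2, "real_path": "d.png",
     "nodes": [{"_cost": 10}, {"_cost": 20}],
     "relationships": [{"name": "PcSkill.id"}]},
]

print(files_per_character(sample_paths))
```

Of the four sample paths, one is dropped for exceeding the cost budget and one for crossing a blocked relationship, so each character ends up with a single file.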