Automatic Feature Extraction and Real-Time Forecasting for Big Time-Series Data (9th STAIR Lab AI Seminar)

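  1. 1. Automatic Feature Extraction and Real-Time Forecasting for Big Time-Series Data. Yasuko Matsubara (Faculty of Advanced Science and Technology, Kumamoto University; JST PRESTO Researcher), Sakurai Lab. @ Kumamoto University. PRESTO research area: "Creation of information infrastructure technologies for the design of new social systems" (Research Supervisor: Prof. Sadao Kurohashi). © 2017 Yasuko Matsubara
  2. 2. Research vision: transform society by forecasting the future. Use large-scale data to forecast the time evolution of natural and social phenomena in real time, and optimize social activities.
  3. 3. Big time-series data: natural and social phenomena recorded as time-series event data. Domains: environment, economy, Web, healthcare, vehicles and traffic.
  4. 4. Big time-series data • IoT big data – Sensor data streams of many kinds: vehicle driving sensors, biometric sensors, mobile sensors. [Figure: example sensor streams; labels include velocity, longitudinal and lateral acceleration, Toyota.com, Apple.com, Fitbit.com]
  5. 5. Big time-series data • Web/online data – Online user activity: users' purchase histories and reviews, keyword search histories. [Figure: 2004-2013 time series for "Nexus" and "Kindle" in US, CA, AU, ZA, CN, JP]
  6. 6. Research scope • Dramatic growth of data volume in the information society – IoT big data (environment, traffic, biometrics) – Web, social media, medical information • Information engineering that transforms society – Transforming manufacturing and adding value through IoT big data analysis – Telematics (Toyota Motor) – Ubiquitousware (Fujitsu). Vision: transform society by forecasting the future, using large-scale data to forecast natural and social phenomena in real time and optimize social activities.
  7. 7. Big time-series data analysis • Big data (data stream) processing – Fast processing for large-scale data – Low memory consumption – Online processing for real-time information delivery • Support for diverse data – Multi-dimensional time series (sensor data) – Event time series (Web access logs) – Time-evolving graph data (SNS) • Advanced processing – Anomaly detection – Forecasting
  8. 8. Academic contributions: international research results and cutting-edge technology development • Societies – ACM (Association for Computing Machinery) – IEEE (The Institute of Electrical and Electronics Engineers, Inc.) – In Japan, also the Information Processing Society of Japan (IPSJ) and IEICE • International conferences – ACM SIGMOD, VLDB, IEEE ICDE (databases) – ACM KDD, IEEE ICDM, SIAM SDM (data mining) – WWW, ACM WSDM (Web). (Photo: at the KDD'16 presentation venue)
  9. 9. Joint research for industrial contribution • Toward better services – Toyota InfoTechnology Center [FY2014, FY2015, FY2016] – Fujitsu Laboratories [FY2016] – FY2017: contracts in progress (about eight companies planned this year). Topics: vehicle driving data analysis, IoT/smart factories, medical information analysis, Web service analysis.
  10. 10. Roadmap: Research issues, Current projects, Future work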
  11. 11. Research issues • Directions in time-series data mining research: 1. Large-scale tensor analysis 2. Non-linear modeling 3. Automatic feature extraction 4. Real-time processing • 3-hour tutorials – Mining and Forecasting of Big Time-series Data (SIGMOD 2015) – Mining Big Time-series Data on the Web (WWW 2016) – Smart Analytics for Big Time-series Data (KDD 2017). [Icons: tensor decomposition, X ≈ sum of components; non-linear dynamics, $\frac{dx}{dt} = f(x)$]
  12. 12. Research issues: conventional time-series data mining techniques. Indexing and similarity search (ED, DTW, correlation); feature extraction (DFT, DWT, SVD, PCA, ICA); data streams (StatStream, etc.); linear modeling (AR, ARIMA, LDS). [Diagram: Data (X) → Model (M)]
  13. 13. Research issues: challenges for "big data" analysis. On top of the conventional techniques (indexing and similarity search, feature extraction, data streams, linear modeling), four challenges arise: automatic parameter tuning, non-linear phenomena, complex events, and real-time processing.
  14. 14. Research issues: challenges for "big data" analysis. On top of the conventional techniques (indexing and similarity search, feature extraction, data streams, linear modeling), four challenges arise: automatic parameter tuning, non-linear phenomena, complex events, and real-time processing.
  15. 15. Research issues: tensor analysis, automatic feature extraction, non-linear modeling, real-time processing. [Diagram: a time-stamped event stream X observed up to time t; events at t+1, t+2, t+3, ... are to be forecast; X ≈ sum of components]
  16. 16. Research issues: tensor analysis, automatic feature extraction, non-linear modeling, real-time processing. [Diagram: a time-stamped event stream X observed up to time t; events at t+1, t+2, t+3, ... are to be forecast; X ≈ sum of components]
  17. 17. (R1) Large-scale tensor analysis • Time-stamped events – Example: Web access logs with columns (Time, URL, User): (08-01-12:00, CNN.com, Smith), (08-02-15:00, YouTube.com, Brown), (08-02-19:00, CNET.com, Smith), (08-03-11:00, CNN.com, Johnson), ... The log is represented as an Mth-order tensor (M = 3) over URL, user, and time, with dimension labels u, v, n; each element x is a number of events, e.g., ('Smith', 'CNN.com', 'Aug 1, 10pm') occurred 21 times.
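The step from an access log to a count tensor can be sketched as follows. This is a minimal illustration, not code from the talk: the sample rows, the day-level time granularity, and the function name `build_count_tensor` are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical access log as (time, url, user) triples, mimicking the slide's table.
events = [
    ("08-01", "CNN.com", "Smith"),
    ("08-01", "CNN.com", "Smith"),
    ("08-02", "YouTube.com", "Brown"),
    ("08-02", "CNET.com", "Smith"),
    ("08-03", "CNN.com", "Johnson"),
]


def build_count_tensor(events):
    """Return (tensor, urls, users, times); tensor[i][j][k] is the number of
    events for urls[i], users[j], times[k]."""
    urls = sorted({url for _, url, _ in events})
    users = sorted({user for _, _, user in events})
    times = sorted({time for time, _, _ in events})
    counts = Counter((url, user, time) for time, url, user in events)
    tensor = [
        [[counts[(url, user, time)] for time in times] for user in users]
        for url in urls
    ]
    return tensor, urls, users, times


tensor, urls, users, times = build_count_tensor(events)
i, j, k = urls.index("CNN.com"), users.index("Smith"), times.index("08-01")
print(tensor[i][j][k])  # count of (CNN.com, Smith, 08-01) events
```

Most cells of such a tensor are zero, which is the sparsity issue the following slides address.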
  18. 18. Large-scale tensor analysis • Fast Mining and Forecasting of Complex Time-Stamped Events (KDD 2012) [TriMine]. Let's analyze Web site access logs: who will open which page tomorrow? (Example log as on slide 17.)
  19. 19. Large-scale tensor analysis [TriMine, KDD 2012]. Time-stamped events: {time, URL, user ID, access device, HTTP referrer, ...}. Example with columns (Timestamp, URL, User, Device): (2012-08-01-12:00, CNN.com, Smith, iphone), (2012-08-02-15:00, YouTube.com, Brown, iphone), (2012-08-02-19:00, CNET.com, Smith, mac), (2012-08-03-11:00, CNN.com, Johnson, ipad), ... Forecast: (2012-08-05-12:00, CNN.com, Smith, iphone), (2012-08-05-19:00, CNET.com, Smith, iphone).
  20. 20. Large-scale tensor analysis [TriMine, KDD 2012]. Complex time-stamped events (Web clicks) form a tensor over object/URL, actor/user, and time, summarized by topics such as (business), (news), (media), ..., each described along the object, actor, and time modes.
  21. 21. Large-scale tensor analysis [TriMine, KDD 2012]. Example vectors for the business topic: object/URL (Money.com, CNN.com), actor/user (Smith, Johnson), time (Mon-Fri, Sat-Sun). A high value indicates a strong association with the topic.
  22. 22. Large-scale tensor analysis [TriMine, KDD 2012]. Complex time-stamped events: the original data are noisy and sparse; they are described by a URL matrix, a user matrix, and a time matrix.
  23. 23. TriMine-Forecasts [TriMine, KDD 2012]. Final goal: forecast future events, e.g., estimate the clicks by user "Smith" on URL "CNN.com" for the next 10 days. [Diagram: tensor over object/URL, actor/user, and time, with dimension labels u and v; the future slice is unknown]
  24. 24. Why not the naive approach? [TriMine, KDD 2012] • Individual-sequence forecasting: create a set of (u × v) sequences of length n and apply AR to each. – Scalability: time complexity is at least $O(uvn)$. – Accuracy: each sequence looks like noise (e.g., {0, 0, 0, 1, 0, 0, 2, 0, 0, ...}), so it is hard to forecast.
  25. 25. TriMine-F [TriMine, KDD 2012]. Approach: – Step 1: forecast the time-topic matrix Ĉ. – Step 2: generate future events using the three matrices O, A, and Ĉ. [Diagram: tensor X → O, A, C → Ĉ → future events]
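Step 2 can be sketched as below. The slide does not say how the three matrices are combined, so this assumes a plain multilinear (PARAFAC-style) reconstruction, which may differ from what TriMine-F actually does; the function name `generate_events` and all numeric values are hypothetical, and Step 1 (forecasting Ĉ) is taken as already done.

```python
def generate_events(O, A, C_hat):
    """Expected event counts x[i][j][t] = sum_r O[i][r] * A[j][r] * C_hat[t][r].

    O:     object-topic matrix   (objects x topics)
    A:     actor-topic matrix    (actors x topics)
    C_hat: forecast time-topic matrix (future time steps x topics)
    Assumption: a simple multilinear combination over topics.
    """
    k = len(O[0])
    return [
        [
            [sum(O[i][r] * A[j][r] * C_hat[t][r] for r in range(k))
             for t in range(len(C_hat))]
            for j in range(len(A))
        ]
        for i in range(len(O))
    ]


# Two objects, two actors, two topics, two future time steps (made-up numbers).
O = [[1.0, 0.0], [0.0, 1.0]]
A = [[2.0, 0.0], [0.0, 3.0]]
C_hat = [[1.0, 1.0], [0.5, 2.0]]
future = generate_events(O, A, C_hat)
print(future[0][0], future[1][1])
```

The point of the design is that only the small time-topic matrix has to be forecast, instead of u × v noisy individual sequences.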
  26. 26. Large-scale tensor analysis • FUNNEL: Automatic Mining of Spatially Coevolving Epidemics (KDD 2014) [FUNNEL]. Example with columns (Time, Disease, Location, Cases): (04-01, measles, PA, 4740), (04-01, measles, NY, 5310), (04-02, rubella, CA, 1923), ...
  27. 27. Research issues: tensor analysis, automatic feature extraction, non-linear modeling, real-time processing. [Diagram: a time-stamped event stream X observed up to time t; events at t+1, t+2, t+3, ... are to be forecast; X ≈ sum of components]
  28. 28. (R2) Non-linear modeling • Non-linear equations – Epidemiology – Biology – Physics – Economics • Analysis of non-linear social phenomena – Non-linear analysis of big data – Web, IoT, etc. (Images: FreeDigitalPhotos.net; Influenza @ Wikipedia)
  29. 29. Non-linear modeling • Rise and Fall Patterns of Information Diffusion: Model and Implications (KDD 2012) [SpikeM]. How do rumors and news spread through Twitter, blogs, and so on?
  30. 30. Non-linear modeling • Rise and Fall Patterns of Information Diffusion: Model and Implications (KDD 2012) [SpikeM]. [Figure: news spread, number of mentions per hour over one week; a breaking-news spike followed by power-law decay]
  31. 31. Non-linear modeling • Rise and Fall Patterns of Information Diffusion: Model and Implications (KDD 2012) [SpikeM]. Model: 1. Uninformed users (users not yet notified of the news) at time n = 0. 2. External shock at time n = n_b (e.g., breaking news). 3. Infection (word-of-mouth effect) with strength β from time n = n_b + 1.
  32. 32. Non-linear modeling • Rise and Fall Patterns of Information Diffusion: Model and Implications (KDD 2012) [SpikeM]. Model: 1. Uninformed users (users not yet notified of the news) at time n = 0. 2. External shock at time n = n_b (e.g., breaking news). 3. Infection (word-of-mouth effect) with strength β from time n = n_b + 1. Decay function: $f(n) = \beta \cdot n^{-1.5}$, plotted on linear and log scales.
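The decay function on this slide is simple enough to evaluate directly. The sketch below only illustrates $f(n) = \beta \cdot n^{-1.5}$ and why it appears as a straight line of slope -1.5 on log-log axes; the value of β and the function name `decay` are arbitrary choices for the example, and the rest of the SpikeM model is not reproduced.

```python
import math


def decay(n, beta):
    """Power-law decay f(n) = beta * n^(-1.5) for n >= 1."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return beta * n ** -1.5


beta = 0.5  # arbitrary infection strength for illustration
values = [decay(n, beta) for n in (1, 10, 100)]

# On log-log axes a power law is a straight line; its slope is the exponent.
slope = (math.log(decay(100, beta)) - math.log(decay(10, beta))) / (
    math.log(100) - math.log(10)
)
print(values, slope)
```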
  33. 33. Non-linear modeling [SpikeM, KDD 2012] • Forecast the dynamics before the peak • Forecast unseen diffusion patterns and the peak position. Figure annotations: (1) first spike, (2) release date, (3) two weeks before release.
  34. 34. Non-linear modeling • The Web as a Jungle: Non-linear Dynamical Systems for Co-evolving Online Activities (WWW 2015) [EcoWeb]. A battle on the Web: who are the rivals?
  35. 35. Non-linear modeling [EcoWeb, WWW 2015]. The Web as a Jungle: non-linear analysis of Web activity based on an ecosystem model. Learned result: a competition network among Android, Xbox, PlayStation, and Wii.
  36. 36. Non-linear modeling [EcoWeb, WWW 2015]. The Web as a Jungle: non-linear analysis of Web activity based on an ecosystem model. Ecosystem in the jungle vs. ecosystem on the Web.
  37. 37. Non-linear modeling [EcoWeb, WWW 2015]. Jungle-to-Web correspondence: species ↔ keywords; food resources ↔ user resources; population ↔ popularity; climate/season ↔ seasonal events (e.g., Xmas). (Image courtesy of xura, criminalatt, David Castillo Dominici, happykanppy at FreeDigitalPhotos.net.)
  38. 38. EcoWeb-individual [EcoWeb, WWW 2015]. Popularity size increases over time (t = 0, 1, 2). In the jungle, species eat food; on the Web, keywords attract users.
  39. 39. EcoWeb-individual [EcoWeb, WWW 2015]. Non-linear evolution of a single keyword, with parameters: p, the initial condition (i.e., P(0) = p); r, the growth rate (attractiveness); K, the carrying capacity (= available user resources). [Figure: popularity size over time]
  40. 40. EcoWeb-individual [EcoWeb, WWW 2015]. Non-linear evolution of a single keyword, with parameters: p, the initial condition (i.e., P(0) = p); r, the growth rate (attractiveness); K, the carrying capacity (= available user resources). [Figure: popularity size over time, annotated with p, r, and K]
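The equation itself did not survive in this transcript, only the parameter names p, r, and K. A minimal sketch, assuming the standard discrete logistic growth model that those three parameters usually belong to (the exact form used by EcoWeb may differ; `logistic_trajectory` and the numbers are illustrative):

```python
def logistic_trajectory(p, r, K, steps):
    """Discrete logistic growth, assumed form:
    P(t+1) = P(t) + r * P(t) * (1 - P(t) / K), with P(0) = p."""
    P = [p]
    for _ in range(steps):
        current = P[-1]
        P.append(current + r * current * (1.0 - current / K))
    return P


# A keyword that starts with little attention and saturates at the available
# user resources K (all numbers are made up for illustration).
traj = logistic_trajectory(p=0.01, r=0.5, K=1.0, steps=100)
print(traj[0], traj[10], traj[-1])
```

Growth is nearly exponential while P is small and flattens as P approaches K, which gives the S-shaped popularity curve.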
  41. 41. EcoWeb-interaction [EcoWeb, WWW 2015]. Interaction between multiple keywords: keywords share user resources, just as species share food resources.
  42. 42. EcoWeb-interaction [EcoWeb, WWW 2015]. Interaction between multiple keywords. Interaction coefficient $a_{ij}$: the effect rate of keyword j on keyword i, linking the popularity of j to the popularity of i.
  43. 43. EcoWeb-interaction [EcoWeb, WWW 2015]. Interaction between multiple keywords. Interaction coefficient $a_{ij}$: the effect rate of keyword j on keyword i, linking the popularity of j to the popularity of i. [Diagram: edge from j to i labeled $a_{ij}$, with $a_{ij} > 0$]
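The interaction slides name the coefficient $a_{ij}$ but the equation is not in the transcript. The sketch below assumes a competitive Lotka-Volterra form, the usual multi-species extension of logistic growth; whether EcoWeb uses exactly this form, and the sign convention (here $a_{ij} > 0$ means keyword j takes resources away from keyword i), are assumptions, as are the names `step` and `simulate` and all numbers.

```python
def step(P, r, K, a):
    """One discrete step of an assumed competitive Lotka-Volterra system:
    P_i <- P_i + r_i * P_i * (1 - sum_j a[i][j] * P_j / K_i)."""
    updated = []
    for i in range(len(P)):
        pressure = sum(a[i][j] * P[j] for j in range(len(P)))
        updated.append(P[i] + r[i] * P[i] * (1.0 - pressure / K[i]))
    return updated


def simulate(P0, r, K, a, steps):
    """Return the list of popularity vectors P(0), P(1), ..., P(steps)."""
    trajectory = [list(P0)]
    for _ in range(steps):
        trajectory.append(step(trajectory[-1], r, K, a))
    return trajectory


# Two keywords competing symmetrically for the same users (made-up values).
competing = simulate([0.1, 0.1], [0.3, 0.3], [1.0, 1.0],
                     [[1.0, 0.5], [0.5, 1.0]], steps=200)
# The same keywords without interaction, for comparison.
independent = simulate([0.1, 0.1], [0.3, 0.3], [1.0, 1.0],
                       [[1.0, 0.0], [0.0, 1.0]], steps=200)
print(competing[-1], independent[-1])
```

With competition, each keyword settles below its own carrying capacity because part of the shared user resource is taken by its rival.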
  44. 44. EcoWeb-seasonality [EcoWeb, WWW 2015]. "Hidden" seasonal activities: seasonal events on the Web correspond to season and climate in the jungle.
  45. 45. EcoWeb-seasonality [EcoWeb, WWW 2015]. "Hidden" seasonal activities: seasonal events on the Web correspond to season and climate in the jungle. Users change their behavior according to seasonal events!
  46. 46. Non-linear modeling [EcoWeb, WWW 2015]. The Web as a Jungle: non-linear analysis of Web activity based on an ecosystem model. Learned result: a competition network among Android, Xbox, PlayStation, and Wii.
  47. 47. Non-linear modeling [EcoWeb, WWW 2015]. The Web as a Jungle: non-linear analysis of Web activity based on an ecosystem model. Learned result: seasonal activity patterns for Android, Xbox, PlayStation, and Wii.
  48. 48. Non-linear modeling [EcoWeb, WWW 2015]. The Web as a Jungle: non-linear analysis of Web activity based on an ecosystem model. Learned result: seasonal activity patterns for Android, Xbox, PlayStation, and Wii, obtained from non-linear equations.
  49. 49. Research issues: tensor analysis, automatic feature extraction, non-linear modeling, real-time processing. [Diagram: a time-stamped event stream X observed up to time t; events at t+1, t+2, t+3, ... are to be forecast; X ≈ sum of components]
  50. 50. (R3) Automatic feature extraction • Why automatic feature extraction matters • Big data mining requires processing without human intervention. Manual: parameter tuning is sensitive, and the tuning work takes a long time (hours, days, ...). Automatic: no tuning by engineers or domain experts is needed.
  51. 51. Automatic feature extraction • AutoPlait: Automatic Mining of Co-evolving Time Sequences (SIGMOD 2014) [AutoPlait]. Dance motion over time: can you tell where the breaks are, and what kinds of steps there are?
  52. 52. Automatic feature extraction [AutoPlait, SIGMOD 2014]. "Automatic" mining algorithm. Given: data X (chicken dance motion). Find: a compact description of X, i.e., segments labeled beaks, wings, tail feathers, and claps.
  53. 53. Automatic feature extraction [AutoPlait, SIGMOD 2014]. "Automatic" mining algorithm. Idea (1): multi-level chain model – HMM-based probabilistic model – with "across-regime" transitions. [Diagram: sequences mapped to a model with regimes beaks, wings, claps]
  54. 54. Automatic feature extraction [AutoPlait, SIGMOD 2014]. "Automatic" mining algorithm. Idea (2): minimize the encoding cost, $\min_M \left( Cost_M(M) + Cost_C(X \mid M) \right)$, i.e., model cost plus coding cost. Good compression means good description. [Figure: $Cost_M$, $Cost_C$, and $Cost_T$ against the number of regimes r and states m, for iterations 1-10]
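The cost criterion on this slide is a two-part (MDL-style) code. The toy below illustrates the trade-off only: it replaces AutoPlait's HMM-based regimes with piecewise-constant segments, and the fixed noise level `sigma`, the `bits_per_param` constant, and the function names are all assumptions for the sketch.

```python
import math


def total_cost(xs, breaks, bits_per_param=32.0, sigma=1.0):
    """Two-part cost in bits: model cost (one mean per segment) plus coding
    cost (Gaussian negative log-likelihood around each segment mean).
    Assumes breaks are strictly increasing and leave no empty segment."""
    bounds = [0, *breaks, len(xs)]
    model_cost = bits_per_param * (len(bounds) - 1)
    coding_cost = 0.0
    for start, end in zip(bounds, bounds[1:]):
        segment = xs[start:end]
        mean = sum(segment) / len(segment)
        for x in segment:
            coding_cost += 0.5 * math.log2(2 * math.pi * sigma ** 2)
            coding_cost += (x - mean) ** 2 / (2 * sigma ** 2 * math.log(2))
    return model_cost + coding_cost


def best_segmentation(xs, candidates):
    """Pick the candidate break list with the smallest total cost."""
    return min(candidates, key=lambda breaks: total_cost(xs, breaks))


# A sequence with one true change point at index 20.
xs = [0.0] * 20 + [10.0] * 20
candidates = [[], [20], [10, 20, 30]]
print(best_segmentation(xs, candidates))
```

Too few segments inflate the coding cost, too many inflate the model cost, so the minimum lands on the true change point without any user-set threshold on the number of segments.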
  55. 55. Automatic feature extraction: idea [AutoPlait, SIGMOD 2014]. Regimes are split iteratively. Iteration 1: r = 2, m = 4 (θ1, θ2). Iteration 2: r = 3, m = 6 (θ1, θ2, θ3). Iteration 4: r = 4, m = 8 (θ1, θ2, θ3, θ4). [Figure: segment assignments f1, f2, ... of X at each iteration, and cost over iterations 1-7]
  56. 56. Automatic feature extraction: MoCap data [AutoPlait, SIGMOD 2014]. Methods compared: AutoPlait (no magic numbers), DynaMMo (Li et al., KDD'09), pHMM (Wang et al., SIGMOD'11).
  57. 57. Automatic feature extraction: MoCap data [AutoPlait, SIGMOD 2014].
  58. 58. Automatic feature extraction [AutoPlait, SIGMOD 2014]. Turning point detection (seasonal sweets): the trend suddenly changed in 2010 (release of the Android OS versions "Gingerbread" and "Ice Cream Sandwich").
  59. 59. Automatic feature extraction [AutoPlait, SIGMOD 2014]. Trend discovery (game-related topics): AutoPlait discovers 3 phases of the "game console war" (Xbox & PlayStation / Wii / mobile social games).
  60. 60. Roadmap: Research issues, Current projects, Future work
  61. 61. Research projects • Real-time forecasting of IoT big data 1. Non-linear tensor analysis: Non-Linear Mining of Competing Local Activities (WWW 2016) 2. Real-time forecasting: Regime Shifts in Streams: Real-time Forecasting of Co-evolving Time Sequences (KDD 2016). [Icons: X ≈ sum of components; $\frac{dx}{dt} = f(x)$]
  62. 62. Research issues: tensor analysis, automatic feature extraction, non-linear modeling, real-time processing. [Diagram: a time-stamped event stream X observed up to time t; events at t+1, t+2, t+3, ... are to be forecast; X ≈ sum of components]
  63. 63. Non-linear tensor analysis • Non-Linear Mining of Competing Local Activities (WWW 2016) [CompCube]. Product strategy: how well is each product selling in each region? [Figure: 2004-2013 time series for "Nexus" and "Kindle" in US, CA, AU, ZA, CN, JP]
  64. 64. Non-linear tensor analysis • Non-Linear Mining of Competing Local Activities (WWW 2016) [CompCube].
  65. 65. Non-linear tensor analysis • Non-Linear Mining of Competing Local Activities (WWW 2016) [CompCube]. Data: Google Trends, the number of keyword searches on Google. [Figure: tensor X with dimension labels m, d, n; axes include activity and time (weekly)]
  66. 66. Non-linear tensor analysis [CompCube, WWW 2016]. Given: tensor X (activity × location × time). Find: a compact description of X, namely CompCube = {B, C, S, D}.
  67. 67. Non-linear tensor analysis [CompCube, WWW 2016]. Given: tensor X (activity × location × time). Find: a compact description of X, namely CompCube = {B, C, S, D}, where B = basics, C = competition, S = seasonality, D = deltas.
  68. 68. Non-linear tensor analysis [CompCube, WWW 2016]. Non-linear modeling: the ecosystems of the jungle and of the Web. In the jungle, species share food; on the Web, activities share users. (Image courtesy of xura, criminalatt, David Castillo Dominici, happykanppy at FreeDigitalPhotos.net.)
  69. 69. Non-linear tensor analysis [CompCube, WWW 2016]. Non-linear modeling: the ecosystems of the jungle and of the Web. In the jungle, species share food; on the Web, activities share users. Both are modeled as a non-linear dynamical system. [Figure: example trajectory, values 0-1 over 500 time steps] (Image courtesy of xura, criminalatt, David Castillo Dominici, happykanppy at FreeDigitalPhotos.net.)
  70. 70. Non-linear tensor analysis [CompCube, WWW 2016]. Two variants: (a) CompCube-dense and (b) CompCube, with components B, C, S, D.
  71. 71. Non-linear tensor analysis [CompCube, WWW 2016]. Two variants: (a) CompCube-dense and (b) CompCube, with components B, C, S, D. Labels: global and local.
  72. 72. Non-linear tensor analysis [CompCube, WWW 2016]. Two variants: (a) CompCube-dense and (b) CompCube, with components B, C, S, D. Labels: global (G) and local (L); dense and sparse.
  73. 73. Non-linear tensor analysis [CompCube, WWW 2016]. Google search volumes for Kindle and Nexus in US, CA, JP, CN, BR, AU, ZA, IT. Local competition strength: weak / average / strong.
  74. 74. Non-linear tensor analysis [CompCube, WWW 2016]. Google search volumes for Kindle and Nexus in US, CA, JP, CN, BR, AU, ZA, IT. Local competition strength: weak / average / strong.
  75. 75. Non-linear tensor analysis [CompCube, WWW 2016]. Local seasonality for iPod: component #1 and component #2.
  76. 76. Non-linear tensor analysis [CompCube, WWW 2016]. Local seasonality for iPod: component #1 and component #2. Annotations: Xmas (Dec.), Chinese New Year (Feb.).
  77. 77. Non-linear tensor analysis [CompCube, WWW 2016]. Fitting result for news resources (RMSE = 0.056), weekly, 2004-2014, number of clicks scaled to 0-1: 1 CNN, 2 Fox News, 3 TIME, 4 Google News, 5 BBC, 6 Yahoo News, 7 AP, 8 Huffington Post, 9 MSN News, 10 Al Jazeera.
  78. 78. Non-linear tensor analysis [CompCube, WWW 2016]. Fitting result for news resources (RMSE = 0.056), weekly, 2004-2014, number of clicks scaled to 0-1: 1 CNN, 2 Fox News, 3 TIME, 4 Google News, 5 BBC, 6 Yahoo News, 7 AP, 8 Huffington Post, 9 MSN News, 10 Al Jazeera. Detected: the US election, Nov. 2008. (Image: Wikipedia)
  79. 79. Non-linear tensor analysis [CompCube, WWW 2016]. Fitting result for news resources: local attention to the US election of Nov. 2008 (weak / strong). (Image: Wikipedia)
  80. 80. Non-linear tensor analysis [CompCube, WWW 2016]. Forecasting future local activities. Train: 2/3 of the sequences. Forecast: the following years (the remaining 1/3). [Figure: tensor X, activity against weekly time, with the future part unknown]
  81. 81. Non-linear tensor analysis [CompCube, WWW 2016] • Forecasting (estimation of local patterns). 1. Products. [Figure: forecast of the future part of the time axis]
  82. 82. Putting non-linear tensor analysis into practice • Tensor analysis of vehicle sensor data: a geographic information tensor {trip, zone, object}. [Figure: original tensor with dimension labels w, d, n; trips 1-3 over zones]
  83. 83. Putting non-linear tensor analysis into practice • Analyze all factors in an integrated way • Extract summary information that represents the whole data set • Provide advanced road-map information based on driving data
  84. 84. Research issues: tensor analysis, automatic feature extraction, non-linear modeling, real-time processing. [Diagram: a time-stamped event stream X observed up to time t; events at t+1, t+2, t+3, ... are to be forecast; X ≈ sum of components]
  85. 85. Real-time forecasting • Regime Shifts in Streams: Real-time Forecasting of Co-evolving Time Sequences (KDD 2016) [RegimeCast]. How can we keep forecasting the future?
  86. 86. Real-time forecasting [RegimeCast, KDD 2016] • What real-time forecasting requires: (a) Long-term: forecast $l_s$ steps ahead (b) Continuous: detect patterns continuously (c) Adaptive: forecasting with the ability to adapt
  87. 87. Real-time forecasting [RegimeCast, KDD 2016]. [Figure: snapshot of the original stream, the current window, and the RegimeCast forecast 100 steps ahead]
  88. 88. Real-time forecasting [RegimeCast, KDD 2016]. [Figure: snapshot of the original stream, the current window, and the RegimeCast forecast 100 steps ahead; the future is unknown]
  89. 89. Real-time forecasting [RegimeCast, KDD 2016]. [Figure: snapshot of the original stream, the current window, and the RegimeCast forecast 100 steps ahead; arrived events vs. future events]
  90. 90. Real-time forecasting [RegimeCast, KDD 2016]. [Figure: snapshot of the original stream, the current window, and the RegimeCast forecast 100 steps ahead; arrived events vs. future events]
  91. 91. Real-time forecasting [RegimeCast, KDD 2016]. [Figure: original stream, current window, and snapshots of the 100-steps-ahead forecast, labeled 1, 2, 3]
  92. 92. Regime shifts [RegimeCast, KDD 2016]. A regime shift is an abrupt change in structure or properties in nature. Ecosystem examples: • Woodland vs. grassland • Coral reef vs. macroalgae • Desert vs. vegetation. Regime shifts in streams: time-series patterns are represented as regimes, e.g., a sensor data stream S(t) that switches between walking and wiping. (Image courtesy of dan at FreeDigitalPhotos.net.)
  93. 93. RegimeCast: main ideas. P1: latent non-linear dynamics. P2: regime shifts in streams. P3: nested structure.
  94. 94. Latent non-linear dynamics: various patterns ("regimes") in streams. [Figure: original data and forecasted variables over time, with segments labeled walking and stretching (left, right, both)]
  95. 95. Latent non-linear dynamics: various patterns ("regimes") in streams. Q. How can we effectively capture the dynamics of "regimes"? [Figure: original data and forecasted variables over time, with segments labeled walking and stretching (left, right, both)]
  96. 96. Latent non-linear dynamics. A. Latent NLDS, relating a potential activity s(t) (initial condition S(0) = s0) to an estimated event v(t). [Equation terms labeled: linear, exponential, non-linear, projection]
  97. 97. Regime shifts in streams: various patterns ("regimes") in streams. Regime #1 "Walk" changes to Regime #2 "Stretch". [Figure: original data and forecasted variables, segments labeled walking and stretching (left, both)]
  98. 98. Regime shifts in streams: various patterns ("regimes") in streams. Regime #1 "Walk" changes to Regime #2 "Stretch". Q. How can we identify sudden discontinuities?
  99. 99. Regime shifts in streams: various patterns ("regimes") in streams. Q. How can we identify sudden discontinuities? A. "Regime shifts in streams"!
  100. 100. Regime shifts in natural systems: abrupt changes in the structure of complex systems. Ecological system examples (woodlands, grasslands): • Woodland vs. grassland • Coral vs. macroalgae • Desert vs. vegetation. (Image courtesy of dan at FreeDigitalPhotos.net.)
  101. 101. Regime shifts in natural systems: abrupt changes in the structure of complex systems. S(t): time-evolving ecosystem property (nutrients/soils); a0: environmental factor; a1: growth/decay rate; a2: recovery rate. (Image courtesy of dan at FreeDigitalPhotos.net.)
  102. 102. Regime shifts in event streams: abrupt changes in the structure of complex systems. An ecological system shifts between woodlands and grasslands; motion sensor values shift between walking and wiping. (Image courtesy of dan at FreeDigitalPhotos.net.)
  103. 103. Regime shifts in event streams. L-NLDS + regime activity. R: regime shift dynamics; c: number of regimes.
  104. 104. Nested structure: nested, multi-scale dynamical activities (example: chicken dance).
  105. 105. Nested structure: nested, multi-scale dynamical activities (example: chicken dance). [Figure: original sequence $X_{org}$]
  106. 106. Nested structure: nested, multi-scale dynamical activities (example: chicken dance). The original events $X_{org}$ combine a long-term part $X^{(1)}$ and a short-term part $X^{(2)}$. [Figure: $X_{org}$, $X^{(1)}$, $X^{(2)}$]
  107. 107. Nested structure: nested, multi-scale dynamical activities (example: chicken dance). Tail feathers = bending knees, once + moving arms, quickly. $X_{org} = X^{(1)} + X^{(2)}$. [Figure: $X_{org}$, $X^{(1)}$, $X^{(2)}$]
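The additive split into a long-term and a short-term part can be illustrated with a very simple stand-in. The slides do not say how RegimeCast obtains the two levels, so this sketch assumes a centered moving average for the long-term part and the residual for the short-term part; the function names and the synthetic signal are hypothetical.

```python
import math


def moving_average(xs, half_width):
    """Centered moving average; the window is truncated at both ends."""
    n = len(xs)
    smoothed = []
    for i in range(n):
        lo, hi = max(0, i - half_width), min(n, i + half_width + 1)
        smoothed.append(sum(xs[lo:hi]) / (hi - lo))
    return smoothed


def split_scales(xs, half_width):
    """Return (long_term, short_term) such that xs[i] = long_term[i] + short_term[i]."""
    long_term = moving_average(xs, half_width)
    short_term = [x - m for x, m in zip(xs, long_term)]
    return long_term, short_term


# Synthetic signal: a slow movement plus a quick one (made-up frequencies).
signal = [math.sin(2 * math.pi * t / 200) + 0.3 * math.sin(2 * math.pi * t / 10)
          for t in range(400)]
long_term, short_term = split_scales(signal, half_width=10)
print(long_term[:3], short_term[:3])
```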
  108. 108. Nested structure: multi-level modeling structure. The full parameter set M holds regime parameters for level 1 (long-term), $\Theta^{(1)} = \{\theta_1^{(1)}, \theta_2^{(1)}, \dots\}$, and level 2 (short-term), $\Theta^{(2)} = \{\theta_1^{(2)}, \theta_2^{(2)}, \dots\}$. The estimated events are the sum of the local events: $V_E \approx V_E^{(1)} + V_E^{(2)} + \dots$
  109. 109. RegimeCast [RegimeCast, KDD 2016]. [Architecture: event stream X with current window $X_C$ at time $t_c$ → RegimeReader and RegimeEstimator, backed by a model DB holding $\Theta^{(1)}, \Theta^{(2)}$ → estimated events $V_E \approx V_E^{(1)} + V_E^{(2)} + \dots$ → report of the forecast window $V_F$]
  110. 110. Problem definition • RegimeSnap. [Figure: event stream X split at the current time $t_c$ into arrived events and future (unknown) events; current window $X_C$; estimated events $V_E$; forecast window $V_F$; time points $t_m$, $t_s$, $t_e$]
  111. 111. Problem definition • RegimeSnap. Given: the current window $X_C$ (original events). [Figure: as on slide 110]
  112. 112. Problem definition • RegimeSnap. Find: the estimated events $V_E$. [Figure: as on slide 110]
  113. 113. Problem definition • RegimeSnap. Report: the forecast window $V_F$, $l_s$ steps ahead. [Figure: as on slide 110]
  114. 114. Streaming algorithm • Proposed algorithms (A1-A3): RegimeCast reports $l_s$-steps-ahead future events; RegimeReader identifies the current regime dynamics; RegimeEstimator estimates the regime parameter set θ.
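The windowing in the problem definition can be sketched as a streaming loop. Everything here is a hypothetical reading of the figure labels: the index layout (current window from $t_m$ to $t_c$, forecast window from $t_s = t_c + l_s$ to $t_e$), the window lengths `lc` and `lp`, and above all the `persistence` forecaster, which is a naive stand-in and not the RegimeReader/RegimeEstimator pair.

```python
def window_indices(tc, lc, ls, lp):
    """Assumed layout: current window [tm, tc), forecast window [ts, te)."""
    tm = max(0, tc - lc)   # start of the current window
    ts = tc + ls           # forecast starts ls steps ahead of the current time
    te = ts + lp           # forecast window covers lp steps
    return tm, ts, te


def persistence(current_window, ls, lp):
    """Naive stand-in forecaster: repeat the last observed value."""
    return [current_window[-1]] * lp


def stream_forecasts(stream, lc, ls, lp, forecaster):
    """At every time tick tc, forecast from the current window only."""
    for tc in range(lc, len(stream) + 1):
        tm, ts, te = window_indices(tc, lc, ls, lp)
        current_window = stream[tm:tc]
        yield tc, (ts, te), forecaster(current_window, ls, lp)


reports = list(stream_forecasts(list(range(6)), lc=3, ls=2, lp=2,
                                forecaster=persistence))
print(reports[0])
```

Swapping `persistence` for a model-based forecaster leaves the loop unchanged, which is the sense in which the report step is separate from regime identification and estimation.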
  115. 115. RegimeCast [RegimeCast, KDD 2016]. [Architecture: event stream X with current window $X_C$ at time $t_c$ → RegimeReader and RegimeEstimator, backed by a model DB holding $\Theta^{(1)}, \Theta^{(2)}$ → estimated events $V_E \approx V_E^{(1)} + V_E^{(2)} + \dots$ → report of the forecast window $V_F$]
  116. 116. RegimeCast [RegimeCast, KDD 2016]. Step 1: extract the current window $X_C$ from the event stream X at time $t_c$.
  117. 117. RegimeCast [RegimeCast, KDD 2016]. Step 2: find the optimal regimes for $X_C$ (RegimeReader, using the model DB).
  118. 118. RegimeCast [RegimeCast, KDD 2016]. Step 2: find the optimal regimes for $X_C$. RegimeReader updates the model parameters θ and identifies the regime shift dynamics $r(t_c)$.
  119. 119. RegimeCast [RegimeCast, KDD 2016]. Step 3 (optional): estimate a new regime θ from $X_C$ with RegimeEstimator and insert it into the model DB.
  120. 120. RegimeCast [RegimeCast, KDD 2016]. Step 4: estimate the future events $V_E$ from the estimated local events $V_E^{(1)}, V_E^{(2)}, \dots$
  121. 121. RegimeCast [RegimeCast, KDD 2016]. Step 5: report the future events $V_F$ (the forecast window).
  122. 122. RegimeCast [RegimeCast, KDD 2016]. [Architecture: event stream X with current window $X_C$ at time $t_c$ → RegimeReader and RegimeEstimator, backed by a model DB holding $\Theta^{(1)}, \Theta^{(2)}$ → estimated events $V_E \approx V_E^{(1)} + V_E^{(2)} + \dots$ → report of the forecast window $V_F$]
  123. 123. Forecasting results: MoCap [RegimeCast, KDD 2016]. Forecasting 100-120 steps ahead.
  124. 124. Forecasting results: MoCap [RegimeCast, KDD 2016]. Forecasting 100-120 steps ahead.
  125. 125. Forecasting results: MoCap [RegimeCast, KDD 2016]. Forecasting 30-35 steps ahead.
  126. 126. Forecasting results: Web (Google Trends) [RegimeCast, KDD 2016]. Forecasting 3 months ahead.
  127. 127. Forecasting results: others [RegimeCast, KDD 2016]. Yen vs. dollar, and AU vs. PT; forecast horizons of 6 weeks and 3 months.
  128. 128. Forecasting results: others [RegimeCast, KDD 2016]. RegimeCast is effective, adaptive, anytime, and scalable. [Panel: MoCap]
  129. 129. Roadmap: Research issues, Current projects, Future work
  130. 130. Smart assistant service. Technical challenges: 1. Autonomous model learning for real-time forecasting 2. Causal analysis for assistant services. Applications • Construction, manufacturing, transportation services, Web, environment
  131. 131. Smart assistant service. Technical challenge 1: autonomous model learning for real-time forecasting – Generating and accumulating time-series models – Selecting the best model for forecasting – Efficient model updates over IoT data streams. [Diagram: stream X → automatic TSM-DB → forecast]
  132. 132. Smart assistant service. Technical challenge 2: causal analysis for assistant services – Estimating the strength of the links between models – Discovering cause/effect relationships by following the links – Monitoring for signs of accidents and trouble – Recommending information for social action. [Diagram: cause (C) and effect (E) in stream X]
  133. 133. Contributing to the future structure of society: advanced social services based on real-time forecasting. Transportation systems (congestion relief / accident prevention); manufacturing, logistics, and development (reduced work stress / accident prevention); healthcare (health maintenance); policy (market research / social analysis); disaster and crime prevention (support for disaster victims / emergency information display). (Images: Kumamoto earthquake relief @ Wikipedia; Wikipedia)
  134. 134. References • Conference papers – "Regime Shifts in Streams: Real-time Forecasting of Co-evolving Time Sequences", KDD'16. – "Non-Linear Mining of Competing Local Activities", WWW'16. – "The Web as a Jungle: Non-Linear Dynamical Systems for Co-evolving Online Activities", WWW'15. – "FUNNEL: Automatic Mining of Spatially Coevolving Epidemics", KDD'14. – "AutoPlait: Automatic Mining of Co-evolving Time Sequences", SIGMOD'14. – "Rise and Fall Patterns of Information Diffusion: Model and Implications", KDD'12. – "Fast Mining and Forecasting of Complex Time-Stamped Events", KDD'12. • Tutorials – "Smart Analytics for Big Time-series Data", 3-hour tutorial @ KDD'17 (to appear). – "Mining Big Time-series Data on the Web", 3-hour tutorial @ WWW'16. – "Mining and Forecasting of Big Time-series Data", 3-hour tutorial @ SIGMOD'15. • Software/Data/pdf/pptx/etc. – http://www.cs.kumamoto-u.ac.jp/~yasuko/software.html
