SAIS/DWS2018 Report Meeting #saisdws2018

These are the slides for the part of the SAIS/DWS2018 report meeting presented by Li.
https://connpass.com/event/94361/

Published in: Technology
SAIS/DWS2018 Report Meeting #saisdws2018

  1. Copyright (C) 2018 Yahoo Japan Corporation. All Rights Reserved. August 6, 2018. 李 燮鳴. Spark AI Summit 2018 Report Meeting
  2. Self-introduction. 李 燮鳴 (リ ショウメイ). Received a doctorate (Engineering) from the Graduate School of the University of Tsukuba in March 2017 • Researched schedulers for parallel file systems. Joined Yahoo in April 2017 • Has been responsible for DevOps of Hadoop clusters since joining
  3. Spark AI Summit 2018. Dates: 2018/06/04 to 2018/06/06. Venue: Moscone Center West, San Francisco. Attendees: about 6,000. Sessions: about 9 to 11 presented in parallel, 193 sessions in total
  4. Venue (exterior)
  5. Venue (interior)
  6. Meals
  7. Agenda • Frameworks that make ML easier (2 talks) • Spark SQL (2 talks) • Cluster architecture (2 talks)
  8. Frameworks that make ML easier
  9. MEET UP: Horovod. Alexander Sergeev, Uber • A framework that speeds up distributed training in TensorFlow • Uses MPI all-reduce to speed up the averaging of gradients
  10. MEET UP: Horovod (1). Alexander Sergeev, Uber. Distributed training in TensorFlow uses parameter servers to average the gradients computed by each worker. Drawback: choosing the parameter server configuration is difficult. Source: Uber. (2018, February 19). Meet Horovod: Uber's Open Source Distributed Deep Learning Framework for TensorFlow. Retrieved July 9, 2018, from https://eng.uber.com/horovod/
  11. MEET UP: Horovod (2). Alexander Sergeev, Uber • Horovod does not use parameter servers; it exchanges and averages gradients with the ring all-reduce of NCCL (NVIDIA Collective Communications Library), run together with MPI. Source: Uber. (2018, February 19). Meet Horovod: Uber's Open Source Distributed Deep Learning Framework for TensorFlow. Retrieved July 9, 2018, from https://eng.uber.com/horovod/
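The ring all-reduce mentioned on this slide can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions: it is plain Python with lists standing in for GPU buffers, the function name `ring_allreduce_average` is hypothetical, and it is not Horovod's or NCCL's actual code. Each simulated worker passes one chunk to its right-hand neighbour per step, first summing chunks (scatter-reduce) and then circulating the finished sums (all-gather).

```python
from typing import List


def ring_allreduce_average(worker_grads: List[List[float]]) -> List[List[float]]:
    """Simulate ring all-reduce and return the averaged gradient held by each worker."""
    n = len(worker_grads)
    length = len(worker_grads[0])
    # Split every gradient vector into n contiguous chunks (some may be empty).
    bounds = [(k * length) // n for k in range(n + 1)]
    bufs = [list(g) for g in worker_grads]

    def chunk(k: int) -> range:
        return range(bounds[k], bounds[k + 1])

    # Phase 1, scatter-reduce: after n-1 steps worker i holds the full sum
    # of chunk (i + 1) % n.
    for step in range(n - 1):
        # Snapshot outgoing data first so that all sends in a step are simultaneous.
        outgoing = []
        for i in range(n):
            k = (i - step) % n
            outgoing.append((k, [bufs[i][j] for j in chunk(k)]))
        for i, (k, data) in enumerate(outgoing):
            dst = (i + 1) % n
            for offset, j in enumerate(chunk(k)):
                bufs[dst][j] += data[offset]

    # Phase 2, all-gather: circulate the completed chunks around the ring.
    for step in range(n - 1):
        outgoing = []
        for i in range(n):
            k = (i + 1 - step) % n
            outgoing.append((k, [bufs[i][j] for j in chunk(k)]))
        for i, (k, data) in enumerate(outgoing):
            dst = (i + 1) % n
            for offset, j in enumerate(chunk(k)):
                bufs[dst][j] = data[offset]

    # Every worker now holds the same sums; divide by n to get the average.
    return [[value / n for value in buf] for buf in bufs]


if __name__ == "__main__":
    grads = [[1.0, 2.0, 3.0, 4.0], [3.0, 4.0, 5.0, 6.0], [5.0, 6.0, 7.0, 8.0]]
    print(ring_allreduce_average(grads))
```

In this scheme each worker only talks to its neighbour and no central parameter server is involved, which is the property the slide contrasts with the parameter server approach.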
  12. MEET UP: Horovod (3). Alexander Sergeev, Uber • A performance evaluation using the official TensorFlow benchmarks showed roughly a 2x performance improvement. Source: Uber. (2018, February 19). Meet Horovod: Uber's Open Source Distributed Deep Learning Framework for TensorFlow. Retrieved July 9, 2018, from https://eng.uber.com/horovod/
  13. MEET UP: Horovod (4). Alexander Sergeev, Uber • Using RDMA (Remote Direct Memory Access) over InfiniBand improved performance further. Source: Uber. (2018, February 19). Meet Horovod: Uber's Open Source Distributed Deep Learning Framework for TensorFlow. Retrieved July 9, 2018, from https://eng.uber.com/horovod/
  14. KEYNOTE: Hydrogen. Reynold Xin, Databricks. A proposal to let deep learning frameworks run efficiently on Spark • SPIP = Spark Project Improvement Proposal • At this point the design sketch is complete (design 15% done, SPARK-24374)
  15. KEYNOTE: Hydrogen (1). Reynold Xin, Databricks. Source: Databricks. Project Hydrogen: Unifying State-of-the-art AI and Big Data in Apache Spark. Retrieved July 9, 2018, from https://databricks.com/session/databricks-keynote-2
  16. KEYNOTE: Hydrogen (2). Reynold Xin, Databricks. Source: Databricks. Project Hydrogen: Unifying State-of-the-art AI and Big Data in Apache Spark. Retrieved July 9, 2018, from https://databricks.com/session/databricks-keynote-2
  17. Spark SQL
  18. Deep Dive into Spark SQL with Advanced Performance Tuning. Xiao Li, Wenchen Fan, Databricks. The talk introduced parameter-tuning techniques that can be applied at each stage between a query and its execution in Spark SQL. Source: Databricks. (2018, June 20). Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao L... Retrieved July 10, 2018, from https://www.slideshare.net/databricks/deep-dive-into-spark-sql-with-advanced-performance-tuning-with-xiao-li-wenchen-fan
  19. Deep Dive into Spark SQL with Advanced Performance Tuning. Xiao Li, Wenchen Fan, Databricks. Source: Databricks. (2018, June 20). Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao L... Retrieved July 10, 2018, from https://www.slideshare.net/databricks/deep-dive-into-spark-sql-with-advanced-performance-tuning-with-xiao-li-wenchen-fan
  20. Deep Dive into Spark SQL with Advanced Performance Tuning. Xiao Li, Wenchen Fan, Databricks. Source: Databricks. (2018, June 20). Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao L... Retrieved July 10, 2018, from https://www.slideshare.net/databricks/deep-dive-into-spark-sql-with-advanced-performance-tuning-with-xiao-li-wenchen-fan
  21. Deep Dive into Spark SQL with Advanced Performance Tuning. Xiao Li, Wenchen Fan, Databricks. Source: Databricks. (2018, June 20). Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao L... Retrieved July 10, 2018, from https://www.slideshare.net/databricks/deep-dive-into-spark-sql-with-advanced-performance-tuning-with-xiao-li-wenchen-fan
  22. Deep Dive into Spark SQL with Advanced Performance Tuning. Xiao Li, Wenchen Fan, Databricks. The talk introduced parameter-tuning techniques that can be applied at each stage between a query and its execution in Spark SQL. Source: Databricks. (2018, June 20). Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao L... Retrieved July 10, 2018, from https://www.slideshare.net/databricks/deep-dive-into-spark-sql-with-advanced-performance-tuning-with-xiao-li-wenchen-fan
  23. Spark SQL Adaptive Execution. Carson Wang, Intel; Yuanjian Li, Baidu. Makes Spark SQL execution more efficient by changing it at runtime • Decides the optimal number of reducers at runtime • Decides the appropriate join strategy at runtime • Used in the production environment at Baidu (SPARK-23128)
  24. Spark SQL Adaptive Execution (1). Carson Wang, Intel; Yuanjian Li, Baidu. Tuning the number of reducers • Too few: spill, OOM • Too many: scheduling overhead, more I/O requests, too many small output files • It is hard to specify a number that suits all stages
  25. Spark SQL Adaptive Execution (2). Carson Wang, Intel; Yuanjian Li, Baidu. Target size per reducer = 64MB, min-max shuffle partition number = 1 to 5. ShuffledRowRDD partitions: Partition 0 (70MB), Partition 1 (30MB), Partition 2 (20MB), Partition 3 (10MB), Partition 4 (50MB). Partitions 1 to 3 are handled together by one reducer because 30+20+10 < 64MB
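The partition example on this slide can be reproduced with a short greedy merge. This is a minimal sketch, assuming that only neighbouring partitions are merged and that a group is closed as soon as adding the next partition would exceed the target size; the function name `coalesce_partitions` is hypothetical, the min-max partition number bound from the slide is not enforced, and the code is not taken from Spark.

```python
from typing import List


def coalesce_partitions(sizes_mb: List[int], target_mb: int) -> List[List[int]]:
    """Group contiguous shuffle partitions so each group stays within target_mb when possible."""
    groups: List[List[int]] = []
    current: List[int] = []
    current_size = 0
    for index, size in enumerate(sizes_mb):
        # Close the current group if the next partition would push it past the target.
        if current and current_size + size > target_mb:
            groups.append(current)
            current, current_size = [], 0
        # A single partition larger than the target still forms its own group.
        current.append(index)
        current_size += size
    if current:
        groups.append(current)
    return groups


if __name__ == "__main__":
    # Sizes from the slide: partitions 0..4 with a 64MB target per reducer.
    print(coalesce_partitions([70, 30, 20, 10, 50], 64))  # [[0], [1, 2, 3], [4]]
```

With the slide's numbers the five shuffle partitions end up on three reducers instead of five.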
  26. Spark SQL Adaptive Execution (3). Carson Wang, Intel; Yuanjian Li, Baidu. Planning flow: SQL Query, Logical Plan, Optimized Logical Plan, Multiple Physical Plans, Selected Physical Plan (evaluated with a cost model). Plan: Join2, Join1, Exchange1 (T1), Exchange2 (T2), Exchange3 (T3). A suboptimal join may be selected. The planner's estimates can differ greatly from the actual values
  27. Spark SQL Adaptive Execution (4). Carson Wang, Intel; Yuanjian Li, Baidu. Original plan: Join2, Join1, Exchange1 (T1), Exchange2 (T2), Exchange3 (T3). Split into query stages: QueryStage1: Exchange1 (T1); QueryStage2: Exchange2 (T2); QueryStage3: Exchange3 (T3); QueryStage4: Join2, Join1 with QueryStage Input1, QueryStage Input2, QueryStage Input3. The actual values can be obtained
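The idea of choosing the join once the actual sizes of the input stages are known can be sketched as follows. This is an illustration under assumptions, not Spark's planner: the function name `choose_join_strategy`, the 10MB threshold, and the example sizes are all hypothetical, and the rule is reduced to "broadcast if the smaller side fits under the threshold".

```python
# Assumed broadcast threshold for this sketch (10MB); not a value from the talk.
BROADCAST_THRESHOLD_BYTES = 10 * 1024 * 1024


def choose_join_strategy(left_bytes: int, right_bytes: int,
                         threshold: int = BROADCAST_THRESHOLD_BYTES) -> str:
    """Pick a join strategy from the sizes of the two join inputs."""
    if min(left_bytes, right_bytes) <= threshold:
        # The smaller side is cheap to ship to every executor.
        return "BroadcastJoin"
    return "SortMergeJoin"


if __name__ == "__main__":
    mb = 1024 * 1024
    # Hypothetical planner estimate made before execution.
    print(choose_join_strategy(500 * mb, 300 * mb))
    # Hypothetical actual sizes measured after the input query stages finished.
    print(choose_join_strategy(500 * mb, 8 * mb))
```

Calling the same rule a second time with measured sizes, rather than estimates, is what allows a SortMergeJoin to be swapped for a BroadcastJoin at runtime.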
  28. Spark SQL Adaptive Execution (5). Carson Wang, Intel; Yuanjian Li, Baidu. Results of applying Adaptive Execution at Baidu • SortMergeJoin was changed to BroadcastJoin, and a performance improvement of 50% to 200% was confirmed • For jobs that run for more than one hour, an appropriate number of reducers was set, and a performance improvement of 50% to 100% was confirmed
  29. Cluster architecture
  30. Taking Advantage of a Disaggregated Storage and Compute Architecture. Brian Cho, Facebook • Introduction to an architecture that separates data and compute • Spark optimizations for an architecture that separates data and compute: 1. Defining a file interface 2. Optimizing access to Spark temporary files 3. Optimizing Spark shuffle
  31. Taking Advantage of a Disaggregated Storage and Compute Architecture (1). Brian Cho, Facebook. Source: Databricks. Taking Advantage of a Disaggregated Storage and Compute Architecture. Retrieved July 10, 2018, from https://databricks.com/session/taking-advantage-of-a-disaggregated-storage-and-compute-architecture
  32. Taking Advantage of a Disaggregated Storage and Compute Architecture (2). Brian Cho, Facebook. Source: Databricks. Taking Advantage of a Disaggregated Storage and Compute Architecture. Retrieved July 10, 2018, from https://databricks.com/session/taking-advantage-of-a-disaggregated-storage-and-compute-architecture
  33. Taking Advantage of a Disaggregated Storage and Compute Architecture (3). Brian Cho, Facebook. Benefits of an architecture that separates data and compute • Servers suited to data and to compute can each be procured • Capacity planning is simple • Each side can be maintained by its own team
  34. Taking Advantage of a Disaggregated Storage and Compute Architecture (4). Brian Cho, Facebook. Source: Databricks. Taking Advantage of a Disaggregated Storage and Compute Architecture. Retrieved July 10, 2018, from https://databricks.com/session/taking-advantage-of-a-disaggregated-storage-and-compute-architecture
  35. Taking Advantage of a Disaggregated Storage and Compute Architecture (5). Brian Cho, Facebook. Source: Databricks. Taking Advantage of a Disaggregated Storage and Compute Architecture. Retrieved July 10, 2018, from https://databricks.com/session/taking-advantage-of-a-disaggregated-storage-and-compute-architecture
  36. Taking Advantage of a Disaggregated Storage and Compute Architecture (5). Brian Cho, Facebook. Source: Databricks. Taking Advantage of a Disaggregated Storage and Compute Architecture. Retrieved July 10, 2018, from https://databricks.com/session/taking-advantage-of-a-disaggregated-storage-and-compute-architecture
  37. Taking Advantage of a Disaggregated Storage and Compute Architecture (6). Brian Cho, Facebook. Diagram: compute side with three Executors, three ESS instances, and three Local FS. Access here is local
  38. Taking Advantage of a Disaggregated Storage and Compute Architecture (7). Brian Cho, Facebook. Diagram: compute side with three Executors and three ESS instances; storage side with Warm Storage. Access here is remote (*Network Transfer)
  39. Taking Advantage of a Disaggregated Storage and Compute Architecture (8). Brian Cho, Facebook. Diagram labels: index, shuffle. Source: Databricks. Taking Advantage of a Disaggregated Storage and Compute Architecture. Retrieved July 10, 2018, from https://databricks.com/session/taking-advantage-of-a-disaggregated-storage-and-compute-architecture
  40. Apache Spark on Kubernetes Clusters. Sean Suchter, PepperData; Anirudh Ramanathan, Google • The talk gave an overview of Kubernetes, the implementation of Spark on Kubernetes, and future plans • The Spark driver is implemented as a Kubernetes custom controller • Features to be added in the future (selected): PySpark (SPARK-23984), Dynamic Allocation (SPARK-24432), Driver HA
  41. Apache Spark on Kubernetes Clusters (1). Sean Suchter, PepperData; Anirudh Ramanathan, Google. Source: Databricks. Apache Spark on Kubernetes Clusters. Retrieved July 11, 2018, from https://databricks.com/session/apache-spark-on-kubernetes-clusters
  42. Apache Spark on Kubernetes Clusters (2). Sean Suchter, PepperData; Anirudh Ramanathan, Google. Source: Databricks. Apache Spark on Kubernetes Clusters. Retrieved July 11, 2018, from https://databricks.com/session/apache-spark-on-kubernetes-clusters
  43. Apache Spark on Kubernetes Clusters (3). Sean Suchter, PepperData; Anirudh Ramanathan, Google • Users submit jobs in almost the same way as before:
      bin/spark-submit \
        --master k8s://<server:port> \
        --deploy-mode cluster \
        --name spark-pi \
        --class org.apache.spark.examples.SparkPi \
        --conf spark.executor.instances=5 \
        --conf spark.kubernetes.container.image=<spark-image> \
        local:///path/to/examples.jar
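When the same submission is scripted, the command can be assembled as an argument list. This is a small sketch using only the flags shown on the slide; the helper name `build_spark_submit_args` and the example server address and image name are hypothetical placeholders.

```python
from typing import List


def build_spark_submit_args(server: str, image: str, app_jar: str,
                            executors: int = 5,
                            name: str = "spark-pi",
                            main_class: str = "org.apache.spark.examples.SparkPi") -> List[str]:
    """Build the spark-submit argument list for a Kubernetes cluster-mode job."""
    return [
        "bin/spark-submit",
        "--master", f"k8s://{server}",
        "--deploy-mode", "cluster",
        "--name", name,
        "--class", main_class,
        "--conf", f"spark.executor.instances={executors}",
        "--conf", f"spark.kubernetes.container.image={image}",
        app_jar,
    ]


if __name__ == "__main__":
    # Placeholder values; replace them with a real API server and image.
    args = build_spark_submit_args("https://k8s.example.com:6443",
                                   "example/spark:latest",
                                   "local:///path/to/examples.jar")
    print(" ".join(args))
```

The resulting list could be handed to `subprocess.run` from a directory that contains `bin/spark-submit`.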
  44. Apache Spark on Kubernetes Clusters (4). Sean Suchter, PepperData; Anirudh Ramanathan, Google • Spark on Kubernetes Roadmap. Source: Databricks. Apache Spark on Kubernetes Clusters. Retrieved July 11, 2018, from https://databricks.com/session/apache-spark-on-kubernetes-clusters
  45. EOP
  46. Backup slides
  47. KEYNOTE: MLflow. Matei Zaharia, Databricks • A framework for managing the machine learning lifecycle on Spark • The talk named three points that make ML on Spark difficult and offered a solution for each
  48. KEYNOTE: MLflow (1). Matei Zaharia, Databricks. Source: Databricks. Project Hydrogen: Unifying State-of-the-art AI and Big Data in Apache Spark. Retrieved July 9, 2018, from https://databricks.com/session/unifying-data-and-ai-for-better-data-products
  49. KEYNOTE: MLflow (2). Matei Zaharia, Databricks. Source: Databricks. Project Hydrogen: Unifying State-of-the-art AI and Big Data in Apache Spark. Retrieved July 9, 2018, from https://databricks.com/session/unifying-data-and-ai-for-better-data-products
  50. KEYNOTE: MLflow (3). Matei Zaharia, Databricks. MLflow Tracking:
      # load_data and eval_metrics are helpers that the slide does not show.
      from sys import argv
      import mlflow
      from sklearn.linear_model import ElasticNet
      def main():
          # Hyperparameters come from the command line, defaulting to 0.
          alpha = float(argv[1]) if len(argv) > 1 else 0
          l1_ratio = float(argv[2]) if len(argv) > 2 else 0
          (x_train, y_train) = load_data("train.parquet")
          (x_test, y_test) = load_data("test.parquet")
          print("Using parameter alpha=%.1f l1_ratio=%.1f" % (alpha, l1_ratio))
          # Record the parameters of this run.
          mlflow.log_param("alpha", alpha)
          mlflow.log_param("l1", l1_ratio)
          model = ElasticNet(alpha=alpha, l1_ratio=l1_ratio, random_state=42)
          model.fit(x_train, y_train)
          y_pred = model.predict(x_test)
          (mae, rmse, r2) = eval_metrics(y_test, y_pred)
          # Record the evaluation metrics of this run.
          mlflow.log_metric("MAE", mae)
          print("MAE", mae)
          mlflow.log_metric("RMSE", rmse)
          print("RMSE", rmse)
          mlflow.log_metric("R2", r2)
          print("R2", r2)
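The slide's code calls an `eval_metrics` helper that it does not show. One possible version, written here in plain Python as an assumption about what such a helper computes (mean absolute error, root mean squared error, and the coefficient of determination), could look like this:

```python
import math
from typing import Sequence, Tuple


def eval_metrics(y_true: Sequence[float], y_pred: Sequence[float]) -> Tuple[float, float, float]:
    """Return (MAE, RMSE, R2) for paired true and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    # R2 is undefined when all true values are equal; this sketch returns 0.0 then.
    r2 = 1.0 - ss_res / ss_tot if ss_tot else 0.0
    return mae, rmse, r2


if __name__ == "__main__":
    print(eval_metrics([1.0, 2.0, 3.0], [2.0, 2.0, 2.0]))
```

The three returned values line up with the MAE, RMSE, and R2 metrics that the slide logs through `mlflow.log_metric`.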
  51. KEYNOTE: MLflow (4). Matei Zaharia, Databricks