USENIX NSDI 2016 (Session: Resource Sharing)

Reading-group notes on USENIX NSDI 2016.
1. USENIX NSDI 2016, Session: Resource Sharing. 2016-05-29 @oraccha
2. Co-located Events
• ACM Symposium on SDN Research 2016 (SOSR), March 13-17
• 2016 Open Networking Summit (ONS), March 14-17
• The 12th ACM/IEEE Symposium on Architectures for Networking and Communications Systems (ANCS'16), March 17-19
• The 13th USENIX Symposium on Networked Systems Design and Implementation (NSDI'16)
• The USENIX Workshop on Cool Topics in Sustainable Data Centers (CoolDC'16), March 19
3. Session: Resource Sharing
• "Ernest: Efficient Performance Prediction for Large-Scale Advanced Analytics," Shivaram Venkataraman, Zongheng Yang, Michael Franklin, Benjamin Recht, and Ion Stoica, University of California, Berkeley
• "Cliffhanger: Scaling Performance Cliffs in Web Memory Caches," Asaf Cidon and Assaf Eisenman, Stanford University; Mohammad Alizadeh, MIT CSAIL; Sachin Katti, Stanford University
• "FairRide: Near-Optimal, Fair Cache Sharing," Qifan Pu and Haoyuan Li, University of California, Berkeley; Matei Zaharia, Massachusetts Institute of Technology; Ali Ghodsi and Ion Stoica, University of California, Berkeley
• "HUG: Multi-Resource Fairness for Correlated and Elastic Demands," Mosharaf Chowdhury, University of Michigan; Zhenhua Liu, Stony Brook University; Ali Ghodsi and Ion Stoica, University of California, Berkeley, and Databricks Inc.
4. Ernest: Efficient Performance Prediction for Large-Scale Advanced Analytics
• Who?: A graduate student at UC Berkeley's AMPLab, the lab known for Spark and Mesos. He works on systems and algorithms for large-scale data analytics, with publications at SoCC'12, EuroSys'13, OSDI'14, SIGMOD'16, and elsewhere.
• What?: A framework that efficiently predicts the performance of data-analytics workloads (machine learning, genome analysis, and so on) in cloud environments.
[Slide figures: "Do choices matter?" Running times of a 400K x 1K matrix multiply and a 1M x 1K QR factorization across EC2 configurations of equal aggregate size (1 r3.8xlarge through 16 r3.large; e.g. 16 cores, 244 GB memory, $2.66/hr), spanning network-bound and memory-bandwidth-bound regimes; the Keystone-ML TIMIT pipeline (cosine transform, normalization, linear solver, ~100 iterations; long-running, hence expensive, and numerically intensive); actual vs. ideal scaling of the QR factorization on r3.4xlarge instances, showing that computation + communication leads to non-linear scaling.]
5. Ernest: Efficient Performance Prediction for Large-Scale Advanced Analytics
• How?: Predicts performance from the results of small training jobs (1-8% of the input on 1-8 machines, using only a few iterations), and uses optimal experiment design, solved with an off-the-shelf solver (CVX), to cut down the number of training jobs.
• Ernest's basic model combines terms for serial execution, linear computation, and tree and all-to-one communication DAGs:
time = x1 + x2 * (input / machines) + x3 * log(machines) + x4 * machines
• Workflow: given the job binary, candidate machine counts, and input sizes, pick training configurations by experiment design, collect training data, and fit the linear model.
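To make the basic model concrete, here is a minimal sketch of fitting it to a handful of small training runs, assuming NumPy/SciPy. The training tuples are invented, and the non-negative least-squares fit is my reading of the slide's "Fit Linear Regression" step (non-negativity keeps every term's contribution to running time non-negative); the experiment-design step that chooses which (input, machines) configurations to run is not shown.

```python
import numpy as np
from scipy.optimize import nnls

def features(input_frac, machines):
    # Ernest's basic model:
    #   time = x1 + x2*(input/machines) + x3*log(machines) + x4*machines
    return np.array([1.0, input_frac / machines, np.log(machines), float(machines)])

# Hypothetical training jobs: (fraction of input, #machines, measured time in s).
training = [
    (0.01, 1, 9.5), (0.02, 1, 13.0), (0.02, 2, 8.0), (0.04, 2, 11.0),
    (0.04, 4, 8.5), (0.08, 4, 12.0), (0.08, 8, 10.0),
]

A = np.array([features(f, m) for f, m, _ in training])
b = np.array([t for _, _, t in training])

# Fit with non-negative least squares so no term contributes negative time.
x, _ = nnls(A, b)

# Predict the full job (input_frac = 1.0) on candidate cluster sizes.
for machines in (16, 32, 64):
    print(machines, x @ features(1.0, machines))
```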
6. Ernest: Efficient Performance Prediction for Large-Scale Advanced Analytics
• Results:
[Slide figures: training the model for the Keystone-ML TIMIT pipeline on r3.xlarge instances (100 iterations) takes just 7 data points, using up to 16 machines and up to 10% of the data; training time vs. running time at 42 machines; "Is experiment design useful?" compares the prediction error (%) of experiment design against a cost-based baseline on Regression, Classification, KMeans, PCA, and TIMIT workloads.]
7. Cliffhanger: Scaling Performance Cliffs in Web Memory Caches
• Who?: A Stanford CS alum, now CEO and co-founder of the cloud-security company Sookasa. He works on cloud storage, with publications at SIGCOMM'12 and USENIX ATC'13 and '15.
• What?: An improvement to Memcached's dynamic cache allocation (the slab allocator) that handles performance cliffs. Hit rates matter here: the cache hit rate of Facebook's Memcached pool is 98.2% [SIGMETRICS12], and a +1% hit rate can translate into a +35% speedup.
[Slide figure: a hit-rate curve (hit rate vs. number of items in the LRU queue, application 19, slab 0) together with its concave hull; the gap between the two is a performance cliff, cf. Talus [HPCA15].]
8. Cliffhanger: Scaling Performance Cliffs in Web Memory Caches
• How?: shadow queues
– Hill-climbing algorithm: incrementally moves memory from queues (slabs) where the hit-rate curve's gradient is small to queues where it is large, both across slab classes and across applications. Each physical queue is extended by a shadow queue; shadow-queue hits earn a queue credits, which estimate the local gradient and drive the queue resizing.
– Cliff-scaling algorithm: finds the start and end of each performance cliff (the concave region of the hit-rate curve).
– Cliffhanger runs both algorithms in parallel: the original queue is partitioned, with one part tracking the left of the cliff pointer, one the right, and one the hill-climbing signal.
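Here is a minimal sketch of the shadow-queue bookkeeping and the hill-climbing step, under simplifying assumptions of my own: capacities are counted in items rather than bytes, and the cliff-scaling algorithm is omitted. Class and function names are hypothetical.

```python
from collections import OrderedDict

class QueueWithShadow:
    """One LRU queue (e.g., one slab class) plus a shadow queue of
    recently evicted keys. Capacities are counted in items for simplicity."""
    def __init__(self, capacity, shadow_capacity):
        self.capacity = capacity
        self.shadow_capacity = shadow_capacity
        self.items = OrderedDict()     # physical queue: key -> value
        self.shadow = OrderedDict()    # shadow queue: keys only, no data
        self.credits = 0               # shadow hits ~ local hit-rate gradient

    def lookup(self, key):
        if key in self.items:          # physical hit
            self.items.move_to_end(key)
            return True
        if key in self.shadow:         # shadow hit: a little more memory
            del self.shadow[key]       # would have made this a real hit
            self.credits += 1
        return False

    def insert(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)
        while len(self.items) > self.capacity:
            evicted, _ = self.items.popitem(last=False)
            self.shadow[evicted] = None    # evicted keys fall into the shadow
            while len(self.shadow) > self.shadow_capacity:
                self.shadow.popitem(last=False)

def hill_climb(queues, step=1):
    """Move memory from the queue with the smallest gradient estimate
    to the one with the largest, then restart the estimates."""
    loser = min(queues, key=lambda q: q.credits)
    winner = max(queues, key=lambda q: q.credits)
    if winner is not loser and loser.capacity > step:
        loser.capacity -= step         # the shrunk queue trims itself
        winner.capacity += step        # lazily on its next insert
    for q in queues:
        q.credits = 0
```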
9. Cliffhanger: Scaling Performance Cliffs in Web Memory Caches
• A technique that looks broadly applicable. Unlike FairRide in the next talk, it gives no consideration to fairness.
[Slide results: Cliffhanger reduces misses and can save memory (average misses reduced: 36.7%; average potential memory savings: 45%) and outperforms the default and optimized schemes (average Cliffhanger hit-rate increase: 1.2%).]
10. FairRide: Near-Optimal, Fair Cache Sharing
• Who?: A graduate student at UC Berkeley's AMPLab, with publications at MobiCom'13 and SIGCOMM'15.
• What?: A file-cache sharing policy that satisfies isolation guarantee and strategy proofness while coming near-optimally close to Pareto efficiency. The setting contrasts statically allocated per-user caches with a globally shared cache in front of the backend (storage/network): what we want is isolation, strategy-proofness, higher utilization, and data sharing at once.
• SIP theorem: in file sharing, the following three properties cannot all be satisfied simultaneously. Max-min fairness gives the isolation guarantee and Pareto efficiency but not strategy proofness; alternatives like priority allocation and max-min rate trade away a different property; static allocation gives isolation and strategy proofness but not Pareto efficiency. FairRide achieves isolation guarantee, strategy proofness, and near-optimal Pareto efficiency.
11. FairRide: Near-Optimal, Fair Cache Sharing
• How?
– Adds probabilistic blocking to the max-min policy, creating a disincentive to cheat: FairRide blocks a free-riding user with probability p(nj) = 1/(nj + 1), where nj is the number of other users caching file j (e.g., p(1) = 50%, p(4) = 20%). This is the best one can do in the general case; any less blocking fails to prevent cheating.
– Implemented on top of Alluxio (Tachyon) [SoCC14].
[Slide figure: an example with 2 users, 3 files, and a total cache size of 2, where numbers are access frequencies, walking from (a) max-min fairness to (b) the second user cheating to (c) blocking the free-riding access. A user's optimal strategy is to cache not its most frequently accessed files but those with the lowest cost per hit/sec, which is what makes cheating pay off in the absence of blocking.]
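The blocking rule itself is small enough to sketch. Below is a hypothetical Python rendering, assuming the cache tracks which users are charged for each cached file; the real implementation on Alluxio is of course more involved.

```python
import random

class FairRideCache:
    """Toy model: per file, the set of users that pay for caching it."""
    def __init__(self):
        self.cachers = {}   # filename -> set of users charged for the file

    def access(self, user, filename):
        owners = self.cachers.get(filename)
        if not owners:
            return "MISS"        # file is not cached at all
        if user in owners:
            return "HIT"         # the user pays its share: always served
        # Free-riding access: block with probability p(n) = 1/(n + 1),
        # where n is the number of *other* users caching the file,
        # e.g. p(1) = 50%, p(4) = 20%. A blocked access behaves as a miss.
        n = len(owners)
        if random.random() < 1.0 / (n + 1):
            return "BLOCKED"
        return "HIT"

cache = FairRideCache()
cache.cachers["f"] = {"alice"}          # only alice pays for caching "f"
print(cache.access("alice", "f"))       # always HIT
print(cache.access("bob", "f"))         # HIT or BLOCKED, 50/50
```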
12. FairRide: Near-Optimal, Fair Cache Sharing
[Slide results: miss ratio over time for two users as each starts cheating shows that FairRide dis-incentivizes users from cheating; in Facebook experiments, FairRide outperforms max-min fairness by 29% in average response time (ms), with reductions in median job time (%) across job-size bins from 1-10 up to 501+ tasks.]
13. HUG: Multi-Resource Fairness for Correlated and Elastic Demands
• Who?: An assistant professor at the University of Michigan, out of UC Berkeley's AMPLab. He works on networking (coflow-based networking, multi-resource allocation in datacenters, compute and storage for big data, network virtualization) and publishes at SIGCOMM nearly every year. This work builds on DRF [NSDI11] and FairCloud [SIGCOMM12].
• What?: An optimization problem for allocating network bandwidth: with machines M1..MN behind a congestion-less core and links L1..L2N, how should the links be shared between multiple tenants (tenant-A's VMs, tenant-B's VMs) so as to (1) provide optimal performance guarantees and (2) maximize utilization?
14. HUG: Multi-Resource Fairness for Correlated and Elastic Demands
• Highest Utilization with the Optimal Isolation Guarantee.
• In the cooperative setting, HUG provides (1) the optimal isolation guarantee and (2) work conservation; in the non-cooperative setting, (1) the optimal isolation guarantee, (2) the highest utilization, and (3) strategyproofness. Per-flow fairness and PS-P give low isolation guarantees; DRF gives the optimal guarantee but low utilization.
• Intuition from the paper: maximize the minimum progress over all tenants, i.e., maximize min_k M_k, where min_k M_k is the isolation guarantee of an allocation algorithm. Three observations. First, with a single link the model reduces to max-min fairness. Second, more aggregate bandwidth is not always better: for tenant-A in the example, ⟨50Mbps, 100Mbps⟩ is better than ⟨90Mbps, 90Mbps⟩ or ⟨25Mbps, 200Mbps⟩, even though the latter two have more bandwidth in total. Third, applying max-min fairness link by link is not enough: it allocates ⟨1/2, 1/2⟩ on both links, so M_A = M_B = 1/2, a suboptimal isolation guarantee (min{M_A, M_B} = 1/2). Dominant Resource Fairness (DRF) [33] extends max-min fairness to multiple resources and prevents such suboptimal allocations.
• Design space for cloud network sharing: flow-level dynamic sharing (per-flow fairness) and VM-level sharing (Seawall, GateKeeper) offer no isolation guarantee; reservations (SecondNet, Oktopus, Pulsar, Silo) rely on admission control; among tenant-/network-level schemes, DRF gives the optimal isolation guarantee at low utilization, PS-P/EyeQ/NetShare are work-conserving with a suboptimal guarantee, and HUG is work-conserving with the optimal isolation guarantee (and strategy-proof where non-cooperative environments require it).
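To pin down the progress metric, a tiny worked example, assuming NumPy. The demand vectors are invented (chosen to echo the ⟨50Mbps, 100Mbps⟩ example above), and the equal-progress closed form below covers only this symmetric case; in general HUG's first stage solves an optimization, and a second stage hands out the spare bandwidth.

```python
import numpy as np

def progress(alloc, demand):
    """A tenant's progress M_k: the fraction of its demand vector it gets,
    limited by its worst-served link."""
    return min(a / d for a, d in zip(alloc, demand) if d > 0)

# Two 100 Mbps links. Each demand vector is the bandwidth a tenant wants
# per link at full progress (hypothetical numbers).
capacity = np.array([100.0, 100.0])
demand = {"A": np.array([50.0, 100.0]), "B": np.array([100.0, 50.0])}

# Stage 1: the largest common progress M such that alloc_k = M * d_k
# still fits on every link. Here M = 100/150 = 2/3.
M = min(capacity / sum(demand.values()))

for name, d in demand.items():
    alloc = M * d
    print(name, alloc, round(progress(alloc, d), 3))   # both reach M = 0.667

# Stage 2 (omitted): distribute leftover bandwidth to maximize utilization
# without lowering any tenant's guarantee (and, in the non-cooperative
# setting, without breaking strategyproofness).
```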
15. HUG: Multi-Resource Fairness for Correlated and Elastic Demands
• Experiments on 100 EC2 instances.
• Three tenants:
– Tenants A and C: pairwise one-to-one communication
– Tenant B: all-to-all communication
[Figure 10: [EC2] Bandwidth consumption of three tenants arriving over time in a 100-machine EC2 cluster. Each tenant has 100 VMs, but each uses a different communication pattern (§5.1.1). (a) With per-flow fairness (TCP), tenant-B dominates the network by creating more flows; (b) HUG isolates tenants A and C from tenant B.]
16. Impressions
• This session was about resource management inside the datacenter.
• None of the papers turns on a radically new idea, but many are model examples of research that formalizes the problem properly and then builds a practical system on that formalization. NSDI at its best.
• Being able to hear every talk in a single-track session is nice, but 20 minutes per talk is short (some parts are hard to follow from the slides alone).
• UC Berkeley's AMPLab is strong.
• I want the Facebook trace data.
All figures in this material are taken from the proceedings and slides on the NSDI 2016 homepage.