Pldir 0630
Transcript

  • 1. Prefetch and Cache in PLDI'02 ● Dynamic Hot Data Stream Prefetching... ● Proposes a dynamic prefetching technique based on profiling and hot data stream analysis ● Efficient Discovery of Regular Stride... ● Discovers stride patterns in irregular load instructions ● Static Load Classification for... ● Classifies load instructions into 20 classes and decides at compile time whether to use load-value prediction
  • 2. CITED BY 40 Read by: Takefumi Miyoshi 2010.06.30
  • 3. Summary ● Prefetching is effective only in limited places ● Proposes dynamic prefetching ● Temporal data reference profile ● Extract hot data streams ● Run with the added prefetch instructions (no profiler, no analyzer) ● 5-19% speedup
  • 4. Overview
  • 5. Data Refs. Profiling and Analysis ● Bursty Tracing Framework for Low-overhead Temporal Profiling ● Captures not only the frequency but also the temporal relationships of references, e.g. cdeabcdeabfg and abcdefabcdeg ● Extensions for Online Optimization ● Fast Hot Data Stream Detection
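The two example traces on slide 5 can be checked directly. The sketch below is only an illustration, not the profiler from the paper: it shows that both traces have identical per-reference frequencies yet contain different repeated subsequences, which is the information a temporal profile adds. The helper name `longest_repeated_substring` is hypothetical.

```python
from collections import Counter


def longest_repeated_substring(trace: str) -> str:
    """Return the longest substring that occurs at least twice without overlap."""
    n = len(trace)
    for length in range(n // 2, 0, -1):  # try the longest candidates first
        for start in range(n - 2 * length + 1):
            candidate = trace[start:start + length]
            # Look for a second occurrence that starts after the first one ends.
            if trace.find(candidate, start + length) != -1:
                return candidate
    return ""


trace_a = "cdeabcdeabfg"
trace_b = "abcdefabcdeg"

# A frequency-only profile cannot tell the two traces apart ...
same_frequencies = Counter(trace_a) == Counter(trace_b)

# ... but the repeated subsequences differ.
repeat_a = longest_repeated_substring(trace_a)
repeat_b = longest_repeated_substring(trace_b)
print(same_frequencies, repeat_a, repeat_b)
```

Each character stands for one data reference. The brute-force search is quadratic and is meant only for toy traces like these.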
  • 6. [15] Bursty Tracing Framework for Low-overhead Temporal Profiling ● Two versions of the code are prepared ● nCheck and nInst select which version executes ● Vulcan rewrites the binary to insert the check code, etc.
  • 7. Extensions for Online Optimization
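A minimal sketch of how two counters could select between the two code versions on slide 6. The exact switching rule is not given on the slide, so the model here is an assumption: the program stays in the uninstrumented "checking" version for `n_check0` checks, then in the instrumented version for `n_inst0` checks, and repeats. The function name and both parameter names are hypothetical.

```python
def bursty_schedule(num_checks: int, n_check0: int, n_inst0: int) -> list[str]:
    """Return the code version selected after each check point.

    Assumed model (not taken from the slide): n_check0 checks are spent in the
    checking version, then n_inst0 checks in the instrumented version, repeating.
    """
    version = "checking"
    n_check, n_inst = n_check0, n_inst0
    schedule = []
    for _ in range(num_checks):
        if version == "checking":
            n_check -= 1
            if n_check == 0:  # burst starts: switch to the instrumented copy
                n_check = n_check0
                version = "instrumented"
        else:
            n_inst -= 1
            if n_inst == 0:  # burst ends: back to the cheap copy
                n_inst = n_inst0
                version = "checking"
        schedule.append(version)
    return schedule


schedule = bursty_schedule(num_checks=10, n_check0=3, n_inst0=2)
sampled_fraction = schedule.count("instrumented") / len(schedule)
print(schedule, sampled_fraction)
```

Under this model the fraction of check points followed by instrumented code approaches `n_inst0 / (n_check0 + n_inst0)`, which is why the two counters control the profiling overhead.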
  • 8. Fast Hot Data Stream Detection(1) = to compress the profile and infer its hierarchical structure. [23]
  • 9. Fast Hot Data Stream Detection(2) ● v.heat = v.length * v.frequency ● A.heat = w_A.length * A.coldUses
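The first formula on slide 9, v.heat = v.length * v.frequency, can be illustrated on a toy trace. This sketch counts non-overlapping occurrences in a raw string; it does not implement the grammar-based variant (A.heat), and the candidate list and threshold are made-up values for illustration.

```python
def heat(trace: str, stream: str) -> int:
    """heat = length * frequency, with frequency counted as non-overlapping occurrences."""
    return len(stream) * trace.count(stream)  # str.count ignores overlapping matches


trace = "abcdefabcdeg"
candidates = ["abcde", "abc", "de"]  # hypothetical candidate streams
heat_threshold = 8                   # hypothetical threshold

heats = {s: heat(trace, s) for s in candidates}
hot = [s for s in candidates if heats[s] >= heat_threshold]
print(heats, hot)
```

Long streams that repeat often score highest, so a threshold on heat favors sequences whose prefetching would cover many references.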
  • 10. Overhead of profiling and analysis
  • 11. Dynamic Prefetching ● Generating Detection and Prefetching Code ● Injecting Detection and Prefetching Code
  • 12. Generating Detection and Prefetching Code ● Split a hot data stream v = v1 v2 ... v{v.length} into a head v.head = v1 v2 ... v{headLen} and a tail v.tail = v{headLen+1} v{headLen+2} ... v{v.length}.
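The head/tail split on slide 12 suggests a simple scheme: once the head has been observed in the reference stream, the tail addresses can be prefetched. The sketch below assumes exactly that and uses a naive counter-based matcher for a single stream; it is not the detection code generated in the paper, and all function names are hypothetical.

```python
def split_stream(v: str, head_len: int) -> tuple[str, str]:
    """Split a hot data stream into (head, tail); assumes 0 < head_len < len(v)."""
    return v[:head_len], v[head_len:]


def run_with_prefetch(trace: str, v: str, head_len: int) -> list[tuple[int, list[str]]]:
    """Scan the trace; whenever the head is matched, record a prefetch of the tail."""
    head, tail = split_stream(v, head_len)
    matched = 0      # number of head elements matched so far
    prefetches = []  # (trace position, addresses that would be prefetched)
    for pos, ref in enumerate(trace):
        if ref == head[matched]:
            matched += 1
        elif ref == head[0]:
            matched = 1  # simplified restart, not a full prefix-matching automaton
        else:
            matched = 0
        if matched == len(head):
            prefetches.append((pos, list(tail)))
            matched = 0
    return prefetches


prefetches = run_with_prefetch("abcdefabcdeg", v="abcde", head_len=2)
print(prefetches)
```

A short head triggers prefetches early but risks false matches; a long head is more accurate but leaves less of the stream to prefetch.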
  • 13. Performance impact
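  • 14. CITED BY 18 Read by: Takefumi Miyoshi 2010.06.30
  • 15. Summary ● Prefetching irregular data references is difficult ● Important irregular load instructions (apparently) have stride access patterns ● A profiling method that discovers load instructions with strides ● Integrates stride-information profiling into an edge-frequency profiler ● 17% slowdown ● 181.mcf: 1.59x, 254.gap: 1.14x, and so on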
  • 16. CITED BY 2 Read by: Takefumi Miyoshi 2010.06.30
  • 17. Summary [20] ● Load-value prediction: predicts the result of a load ● For load-value prediction to pay off, speculation must be limited to loads that miss in the cache and are predicted correctly ● Previous work: hardware-/profile-based methods ● Decides on speculation at compile time ● A compiler-based load classification technique ● Evaluated on C and Java [20] M. H. Lipasti, C. B. Wilkerson, and J. P. Shen. Value Locality and Load Value Prediction. In Proceedings of the second international conference on architectural support for programming languages and operating systems, pages 138–147, 1996.
  • 18. A slightly more detailed summary ● Statically classifies load instructions into 20 classes ● Region: Stack, Heap, Global space ● Kind: object Field, Array element, Scalar variable ● Type: Pointer, Non-pointer ● 2-way set-associative caches of 16K, 64K, and 256K ● 5 load-value predictors, 2048/infinite entries: (i) lv, which predicts the last value for every load; (ii) l4v, which predicts one of the last four values for every load; (iii) st2d, which uses strides to predict loads; (iv) fcm, which uses a representation of the context of preceding loads to predict a load; (v) dfcm, which enhances fcm with strides.
  • 19. Cache miss rate by class
  • 20. Prediction success rate
