The document presents research on access strategies for network caching. It introduces the Data Store Selection (DSS) problem: deciding, based on indicators, which datastores to access so as to minimize the combined miss and access costs. The paper models this as a knapsack problem and provides three approximation algorithms: DSKnap, DSPot, and DSPP. An evaluation on a real Wikipedia trace and CDN topology shows that DSKnap outperforms existing heuristics in total access cost across different miss rates and numbers of accessed locations.
25. Access Strategies for Network Caching
Introduction
Model
Algorithms
Evaluation
Conclusion
A Naive Approach for the DSS Problem
Run a knapsack approximation algorithm for each budget B = 1, ..., β
For each suggested solution D, calculate φ(D)
Take the arg min of φ over the suggested solutions
This is very costly
Goal: emulate solution space efficiently
The Data Store Selection (DSS) problem:
Find a subset of datastores D which maximizes $\sum_{j \in D} -\log(\rho_j)$ s.t. $\sum_{j \in D} C_j \le B$
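The naive approach and the knapsack formulation above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the slide calls for a knapsack approximation algorithm, while the sketch swaps in an exact dynamic program over integer costs for brevity. The objective φ(D) is assumed here to be the access cost of D plus β times the product of the ρ_j (read as per-datastore miss probabilities in (0, 1]); that form is not stated on the slide. The function names `knapsack`, `phi`, and `naive_dss` are hypothetical.

```python
import math


def knapsack(costs, rhos, budget):
    """Exact 0/1 knapsack DP (stand-in for an approximation algorithm).

    Maximizes sum(-log(rho_j)) subject to sum(C_j) <= budget.
    Assumes integer costs and 0 < rho_j <= 1.
    """
    values = [-math.log(r) for r in rhos]
    # best[b] = (best total value, chosen indices) with total cost at most b
    best = [(0.0, ())] * (budget + 1)
    for j, (c, v) in enumerate(zip(costs, values)):
        # Iterate budgets downward so each datastore is used at most once.
        for b in range(budget, c - 1, -1):
            candidate = best[b - c][0] + v
            if candidate > best[b][0]:
                best[b] = (candidate, best[b - c][1] + (j,))
    return set(best[budget][1])


def phi(selection, costs, rhos, beta):
    """Assumed objective: access cost plus beta times the miss probability."""
    access = sum(costs[j] for j in selection)
    miss_prob = math.prod(rhos[j] for j in selection)
    return access + beta * miss_prob


def naive_dss(costs, rhos, beta):
    """Solve one knapsack per budget 1..beta and keep the arg min of phi."""
    candidates = [knapsack(costs, rhos, b) for b in range(1, beta + 1)]
    return min(candidates, key=lambda d: phi(d, costs, rhos, beta))


# Hypothetical instance: three datastores, integer access costs, beta = 20.
costs = [1, 2, 5]
rhos = [0.5, 0.2, 0.05]
print(naive_dss(costs, rhos, beta=20))
```

The sweep solves β separate knapsack instances, which is what makes the naive approach costly. Maximizing the sum of −log(ρ_j) is equivalent to minimizing the product of the ρ_j, which is why the selection step fits the knapsack form.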
Evaluation Settings
Trace: accesses to Wikipedia [UPvS’09]
Network layout based on a real-world CDN [OVH]
19 datastores, each storing 1k URLs
False positive ratio: 2%
Access costs based on topology and bandwidth
19 users requesting data items, all using one of the following access policies:
Cheapest positive indication access policy
All positive indications access policy
DSKnap
A missed item is fetched into 1, 3, or 5 datastores
Compare total access costs, normalized to Opt equipped with a perfect indicator
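To make the two baseline policies concrete, the following Python sketch shows one plausible reading of them for a single request. It is an illustration under assumptions, not the evaluation code: the cost model (sum of access costs, plus a miss penalty when no accessed datastore actually holds the item) and the names `cheapest_positive`, `all_positive`, `total_cost`, and `miss_penalty` are hypothetical.

```python
def cheapest_positive(indications, costs):
    """Access only the cheapest datastore with a positive indication."""
    positives = [j for j, ind in enumerate(indications) if ind]
    if not positives:
        return set()
    return {min(positives, key=lambda j: costs[j])}


def all_positive(indications, costs):
    """Access every datastore with a positive indication."""
    return {j for j, ind in enumerate(indications) if ind}


def total_cost(selection, costs, holds_item, miss_penalty):
    """Assumed cost model: access costs, plus a penalty if every access misses."""
    access = sum(costs[j] for j in selection)
    hit = any(holds_item[j] for j in selection)
    return access + (0 if hit else miss_penalty)


# Hypothetical request: datastore 0 gives a false positive indication,
# datastore 1 truly holds the item, datastore 2 indicates negative.
costs = [1, 3, 2]
indications = [True, True, False]
holds_item = [False, True, False]
miss_penalty = 10

# A perfect indicator would access only datastore 1.
opt = total_cost({1}, costs, holds_item, miss_penalty)
for policy in (cheapest_positive, all_positive):
    chosen = policy(indications, costs)
    cost = total_cost(chosen, costs, holds_item, miss_penalty)
    print(policy.__name__, sorted(chosen), cost / opt)
```

In this instance the cheapest-positive policy is misled by the false positive and pays the miss penalty, while the all-positive policy hits but pays for an unnecessary access. Normalizing both by the cost of a perfect indicator mirrors the comparison against Opt described above.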