FrugalML is a technique that uses machine learning to optimize the use of machine learning prediction APIs. It trains on data annotated by different APIs to learn a strategy that selects the best sequence of APIs to call within a given budget. In the reported evaluation, this yields up to 90% cost savings while matching the best commercial API's accuracy, or up to 5% better accuracy at the same cost. The strategy calls an initial "base" API and then may call an additional "add-on" API, depending on the predicted label and quality score returned by the base API. FrugalML comes with provable performance and efficiency guarantees for learning the strategy, and in the evaluation it outperforms individual commercial APIs on various tasks and datasets in both cost and accuracy.
FrugalML: Using ML APIs More Accurately and Cheaply
1. FrugalML: Using ML Prediction APIs More Accurately and Cheaply
Lingjiao Chen
Joint work with
James Zou
Matei Zaharia
2. Outline
Introduction to MLaaS
FrugalML: How to save up to 90% using cloud ML APIs?
The main idea
How to use it
Empirical evaluation on real world ML APIs
What is next?
Copyright@Lingjiao Chen,
https://lchen001.github.io/
3. Machine Learning as a Service (MLaaS)
- Goal: mitigate low-level overheads
- e.g., model training
- data labelling, etc.
- Participants:
- Value:
Previous: USD 1.0 billion in 2019
Expected: USD 8.48 billion by 2025
CAGR: 43% (chart labels: 2019, 2024)
Source: Mordor Intelligence
5. Problem: Which API to use?
- ML Prediction APIs: a data point -> a label (plus a cost)
e.g., Google API: images -> facial emotions, $0.0015/image
- Many commercial APIs with same functionality
- Heterogeneity in performance and cost
… …
6. Our Proposed Solution: FrugalML
- Optimize for best sequential strategy with a budget constraint
Up to 90% cost savings or 5% better accuracy with same cost
across all tasks and datasets evaluated
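The budget-constrained goal above can be written as a small optimization problem. This is a sketch in notation assumed here, not taken from the slides: S is the set of candidate sequential strategies, \hat{y}_s(x) is the label strategy s returns for input x, c_s(x) is the total price of the API calls it makes on x, and b is the budget.

```latex
\max_{s \in \mathcal{S}} \; \mathbb{E}_{(x,y)}\big[\mathbb{1}\{\hat{y}_s(x) = y\}\big]
\quad \text{subject to} \quad
\mathbb{E}_{x}\big[c_s(x)\big] \le b
```

Read this way, "cost savings" means reaching a given accuracy with a smaller b, and "better accuracy with same cost" means a larger objective value at a fixed b.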
7. FrugalML: How to use it?
- Call a base service first
- Take the predicted quality score (QS) and predicted label from the base service as features to decide
- i) if the prediction should be accepted
- ii) if and which additional API should be invoked.
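The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the released FrugalML code: the names Service, Rule, Strategy and run_strategy are hypothetical, and the decision rule (a per-label quality-score threshold plus an optional add-on service) is one assumed way to turn "predicted label and QS as features" into a decision.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

# Assumption: a prediction API maps a data point to (label, quality_score).
PredictFn = Callable[[object], Tuple[str, float]]


@dataclass
class Service:
    name: str
    cost: float          # price per call, in dollars
    predict: PredictFn


@dataclass
class Rule:
    threshold: float             # accept the base label if QS >= threshold
    addon: Optional[Service]     # which add-on to invoke otherwise (None: never)


@dataclass
class Strategy:
    base: Service
    rules: Dict[str, Rule]       # keyed by the label predicted by the base service


def run_strategy(strategy: Strategy, x: object) -> Tuple[str, float]:
    """Return (final_label, dollars_spent) for one data point."""
    label, qs = strategy.base.predict(x)       # always call the base service first
    spent = strategy.base.cost
    rule = strategy.rules.get(label)
    # i) accept the base prediction when no rule applies or the QS is high enough
    if rule is None or rule.addon is None or qs >= rule.threshold:
        return label, spent
    # ii) otherwise invoke the add-on chosen for this predicted label
    addon_label, _ = rule.addon.predict(x)
    return addon_label, spent + rule.addon.cost
```

With stub services in place of real APIs, a confident base prediction costs only the base price, while a low-QS prediction is escalated and pays for both calls.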
8. FrugalML: How to use it?
Figure panels: FrugalML Training, FrugalML Deploying, Google API Deploying
9. FrugalML: How to train it?
Goal: Pick the optimal base/add-on services, thresholds, etc.
Combinatorial optimization problem: provably efficient solver?
Statistically: How many samples are needed?
Computationally: How long does it take for training?
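To make the training goal concrete, here is a naive brute-force search over base service, add-on service, and a single quality-score threshold, scored on data already annotated by every API. It is only an illustration of the search space: it is not the provably efficient solver from the talk, and the record layout, the function names, and the threshold grid are all assumptions.

```python
def evaluate(records, costs, base, addon, threshold):
    """Average accuracy and average cost of one (base, addon, threshold) choice.

    records: list of {"truth": label, "preds": {service: (label, qs)}}
    costs:   {service: price per call}
    """
    correct, spent = 0, 0.0
    for rec in records:
        label, qs = rec["preds"][base]
        spent += costs[base]
        if addon is not None and qs < threshold:
            # Low quality score: fall back to the add-on's label.
            label, _ = rec["preds"][addon]
            spent += costs[addon]
        correct += (label == rec["truth"])
    n = len(records)
    return correct / n, spent / n


def fit_strategy(records, costs, budget, grid=(0.0, 0.25, 0.5, 0.75, 1.0)):
    """Return the most accurate feasible (acc, avg_cost, base, addon, threshold)."""
    best = None
    services = list(costs)
    for base in services:
        for addon in [None] + [s for s in services if s != base]:
            for t in (grid if addon is not None else (0.0,)):
                acc, avg_cost = evaluate(records, costs, base, addon, t)
                if avg_cost <= budget and (best is None or acc > best[0]):
                    best = (acc, avg_cost, base, addon, t)
    return best   # None if no choice fits the budget
```

The cost of this search grows with the number of service pairs times the grid size times the number of samples, which is the kind of blow-up that motivates asking for a provably efficient solver.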
10. FrugalML: A provably efficient solver
✔ Key lemma: base/add-on services from <3 services (sparsity)
✔ An approx. solver: O(1/N) accuracy loss guarantee
✔ Sample complexity: N samples annotated by APIs
✔ Computational cost: O(N)
11. Learned FrugalML Strategy
Case Study on a facial emotion dataset, FER+
Budget: $5 (=cheapest commercial API)
FrugalML works well in practice
Figure labels: $15, $10, $0.01
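A back-of-the-envelope check shows how a budget limits a two-service strategy. The numbers reuse the $5 budget and the $0.01 and $10 labels from this slide, but pairing the $0.01 service as base with the $10 service as add-on is an assumption made here for illustration; the slide text does not say which service plays which role, nor the unit the prices refer to. The function name is hypothetical.

```python
def max_escalation_fraction(budget, base_cost, addon_cost):
    """Largest fraction of inputs that may also be sent to the add-on
    while the average cost per input stays within the budget.

    All three amounts must be in the same unit.
    """
    if base_cost > budget:
        raise ValueError("the base service alone exceeds the budget")
    if addon_cost <= 0:
        return 1.0
    return min(1.0, (budget - base_cost) / addon_cost)


# Assumed pairing: $0.01 base, $10 add-on, $5 budget.
fraction = max_escalation_fraction(budget=5.0, base_cost=0.01, addon_cost=10.0)
print(f"{fraction:.1%}")  # prints 49.9%
```

Under these assumptions, roughly half of the inputs could be escalated to the more expensive service, so the learned thresholds decide which half benefits most.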
12. Accuracy and Cost Comparison
Figure axes: Cost (dollars), Accuracy (%)
Case Study on a facial emotion dataset, FER+
FrugalML works well in practice
13. Accuracy Budget Trade-offs
Case study on a facial emotion dataset, FER+
Figure axis: Accuracy (%); legend: Microsoft API, GitHub API, Face++ API, Google API
FrugalML works well in practice
14. FrugalML’s cost savings (%) while matching the best commercial API’s accuracy
Up to 90% cost savings or 5% better accuracy with same cost
across all tasks and datasets evaluated
FrugalML works well in practice
Figure panels: Vision, NLP, Speech
15. FrugalML’s accuracy improvement (%) while matching the best commercial API’s cost
Up to 90% cost savings or 5% better accuracy with same cost
across all tasks and datasets evaluated
FrugalML works well in practice
Figure panels: Vision, NLP, Speech
16. Conclusions and Open Problems
Question: How to best use the ML APIs on the market within a budget?
Our solution: FrugalML
Provable performance and efficiency guarantee
Up to 90% cost savings or 5% better accuracy with same cost
Dataset with 612,139 samples annotated by APIs and code released
Open problems: many exist in this under-explored area
More complicated tasks?
API performance shift?
Other requirements (fairness, latency, …)?
17. Code and Data:
github.com/lchen001/FrugalML
More on theoretical analysis, empirical results:
Please visit our project website and/or full paper!