How Fast is AI in Pharo? Benchmarking Linear Regression

  1. How Fast is AI in Pharo? Benchmarking Linear Regression. Oleksandr Zaitsev (1,2), Sebastian Jordan Montaño (2), Stéphane Ducasse (2). (1) Arolla, Paris; (2) Inria, Univ. Lille, CNRS, Centrale Lille, UMR 9189 - CRIStAL. IWST22 — International Workshop on Smalltalk Technologies, Novi Sad, Serbia. oleksandr.zaitsev@arolla.fr
  2. Training machine learning models: scikit-learn (Python) is fast, pharo-ai is very slow. Why?
  3. Training machine learning models: can we do the same in Pharo?
  4. (image-only slide)
  5. Library stacks compared. Python: scikit-learn (machine learning), SciPy (math), LAPACK (fast algebra). Pharo: pharo-ai (machine learning), PolyMath (math), LAPACK (fast algebra).
  6. We want to provide a proof of concept that math & AI algorithms in Pharo can be boosted by delegating time-consuming operations to highly optimised external libraries. To do that, we build a prototype implementation of linear regression based on the DGELSD routine of LAPACK.
  7. Linear regression and how it is implemented
  8. Linear Regression Problem (figure slide)
  9. Linear Regression Problem (figure slide)
  10. Linear Regression Problem (figure slide; the standard formulation is restated after the slide list)
  11. Linear Regression Implementations. Gradient Descent: iterative, approximate. Least Squares: one iteration, exact solution.
  12. Implementation 1: Gradient Descent
  13. Implementation 1: Gradient Descent. We need to minimise the errors $e_i = \hat{y}_i - y_i$. Cost function (mean squared error): $J(w) = \frac{1}{m} \sum_{i=1}^{m} e_i^2$
  14. Implementation 1: Gradient Descent. $J(w) = \frac{1}{m} \sum_{i=1}^{m} (\hat{y}_i - y_i)^2$
  15. Implementation 1: Gradient Descent. $J(w) = \frac{1}{m} \sum_{i=1}^{m} (\hat{y}_i - y_i)^2$
  16. Implementation 1: Gradient Descent. $J(w) = \frac{1}{m} \sum_{i=1}^{m} (\hat{y}_i - y_i)^2$, with the update rule $\theta^{(new)} = \theta^{(old)} - \alpha \frac{\partial}{\partial \theta_i} J(\theta)$ (a Pharo sketch of this loop follows the slide list)
  17. Implementation 2: Least Squares. $\hat{\theta} = \operatorname{argmin}_{\theta}\,\| y - X\theta \|^2$ (orthogonal projection; the closed-form solution is spelled out after the slide list)
  18. Implementation 2: Least Squares. Implemented as the DGELSD routine in LAPACK; we can call it through FFI (an illustrative uFFI binding sketch follows the slide list).
  19. The demo will come at the end. Wait for it!
  20. Experiment
  21. Datasets
  22. Research Questions. RQ.1 - Measuring LAPACK speedup: how much time improvement can we achieve by calling LAPACK from Pharo? RQ.2 - Comparing to scikit-learn: how does the Pharo & LAPACK implementation compare to the one provided by scikit-learn? RQ.3 - Comparing pure Pharo with Python: how does a pure Pharo implementation of linear regression compare to an equivalent pure Python implementation?
  23. RQ.1 - Measuring LAPACK speedup (a timing sketch follows the slide list)
  24. RQ.1 - Measuring LAPACK speedup. Training time, pure Pharo vs Pharo & LAPACK (less is better). Small dataset: 19min40sec vs 0.6sec, 1820 times faster. Medium dataset: 1h44min vs 2.9sec, 2103 times faster.
  25. RQ.2 - Comparing to scikit-learn (results figure)
  26. RQ.2 - Comparing to scikit-learn (results figure)
  27. RQ.3 - Comparing pure Pharo with Python (results figure)
  28. RQ.3 - Comparing pure Pharo with Python (results figure)
  29. Summary
  ‣ We propose a prototype implementation of linear regression based on LAPACK.
  ‣ We show that Pharo & LAPACK is up to 2103 times faster than pure Pharo.
  ‣ We also show that scikit-learn is between 5 and 8 times faster than our prototype, depending on the size of the data.
  ‣ Finally, we demonstrate that pure Pharo is up to 15 times faster than the equivalent implementation in pure Python.
  ‣ These findings can lay the foundation for future work on building fast numerical libraries for Pharo and using them in higher-level libraries such as pharo-ai.
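Slides 8-10 state the linear regression problem only as figures. For reference, the standard formulation that the cost function on slide 13 and the objective on slide 17 both assume is:

```latex
% Linear model: each prediction is a linear combination of the input features,
% written \hat{y} = X\theta in matrix form, with parameters \theta to be estimated.
\hat{y}_i = \theta_0 + \theta_1 x_{i1} + \dots + \theta_n x_{in},
\qquad \hat{y} = X\theta .
```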
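A minimal Pharo sketch of the gradient descent loop from slides 13-16, for a single feature. It uses only core collection protocols; all variable names are illustrative and none of this is pharo-ai code.

```smalltalk
"Gradient descent for simple linear regression: minimise the mean squared
 error J(w) by repeatedly stepping against its partial derivatives."
| x y slope intercept alpha m |
x := #(1 2 3 4 5).
y := #(2 4 6 8 10).
slope := 0.0.
intercept := 0.0.
alpha := 0.01.  "learning rate"
m := x size.
1000 timesRepeat: [
	| errors gradSlope gradIntercept |
	"e_i = predicted - actual"
	errors := x with: y collect: [ :xi :yi | (slope * xi + intercept) - yi ].
	"partial derivatives of J(w) with respect to slope and intercept"
	gradSlope := ((errors with: x collect: [ :ei :xi | ei * xi ]) sum) * 2 / m.
	gradIntercept := errors sum * 2 / m.
	"theta := theta - alpha * dJ/dtheta"
	slope := slope - (alpha * gradSlope).
	intercept := intercept - (alpha * gradIntercept) ].
Transcript show: 'slope: ', slope printString, ' intercept: ', intercept printString; cr
```

This is the iterative, approximate route from slide 11: the learning rate alpha and the iteration count are the knobs that trade running time for accuracy.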
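Slide 17 only names the least-squares objective. The reason it is a "one iteration, exact solution" (slide 11) is the closed-form normal equation, obtained by setting the gradient of the objective to zero; this is textbook material rather than anything specific to the talk:

```latex
% Normal equations: stationarity of ||y - X\theta||^2 gives an exact solution
% whenever X^T X is invertible.
X^\top X \hat{\theta} = X^\top y
\quad\Longrightarrow\quad
\hat{\theta} = \left( X^\top X \right)^{-1} X^\top y .
```

LAPACK's DGELSD does not form $X^\top X$ explicitly; it computes the minimum-norm least-squares solution through a divide-and-conquer SVD, which also handles rank-deficient $X$.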
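Slide 18 says DGELSD is reached through FFI. Below is an illustrative uFFI binding sketch: the selector, the receiver class it would live on, and the library name 'liblapack.so' are assumptions made for this example and are not the prototype's actual code; only the Fortran symbol dgelsd_ and its argument list come from LAPACK itself. The exact hook for naming the library differs slightly between Pharo versions.

```smalltalk
dgelsdM: m n: n nrhs: nrhs a: a lda: lda b: b ldb: ldb
	s: s rcond: rcond rank: rank work: work lwork: lwork iwork: iwork info: info
	"DGELSD computes the minimum-norm least-squares solution of b = A*x
	 using a divide-and-conquer SVD. The Fortran interface passes every
	 argument by reference, hence the pointer types and the trailing
	 underscore in the symbol name."
	^ self ffiCall: #( void dgelsd_ (
			int* m, int* n, int* nrhs,
			double* a, int* lda,
			double* b, int* ldb,
			double* s, double* rcond, int* rank,
			double* work, int* lwork, int* iwork, int* info) )
		library: 'liblapack.so'
```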
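For RQ.1 (slide 23), a training run can be timed directly in Pharo by wrapping it in a block. The sketch below assumes pharo-ai's AILinearRegression class and its fitX:y: selector; treat those names as an assumption and check the library's current API.

```smalltalk
"Time one training run of a linear regression model on a toy dataset.
 timeToRun answers a Duration; on the paper's datasets only the inputs change."
| x y model trainingTime |
x := #( #(1) #(2) #(3) #(4) ).   "one feature, four observations"
y := #( 2 4 6 8 ).
model := AILinearRegression new.   "assumed pharo-ai class name"
trainingTime := [ model fitX: x y: y ] timeToRun.
Transcript show: trainingTime asString; cr
```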
