
Lightning: large-scale machine learning in Python

Slides for the PyData Paris 2016 presentation.

  1. LIGHTNING, A LIBRARY FOR LARGE-SCALE MACHINE LEARNING IN PYTHON. Fabian Pedregosa (1), Mathieu Blondel (2). (1) Chaire Havas-Dauphine / INRIA, Paris, France. (2) NTT Communication Science Laboratories, Kyoto, Japan.
  2. SCIKIT-LEARN: WITH GREAT CODE COMES GREAT RESPONSIBILITY. [Figure: number of lines of code in scikit-learn over time.] Very selective about new algorithms/models.
  3. LIGHTNING. Incorporates recent progress in large-scale optimization. scikit-learn compatible. Scalable to large datasets. Support for dense and sparse input. Emphasis on structured sparsity penalties. Dependencies = Python + Cython + scikit-learn.
  4. SCIKIT-LEARN COMPATIBLE ⟹ mix lightning with scikit-learn's Pipeline, GridSearchCV, etc.
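  A minimal sketch of what that compatibility buys: a lightning estimator dropped into a scikit-learn pipeline and tuned by grid search (the dataset, grid values, and hyperparameters are illustrative assumptions, not from the slides):

      # Tune a lightning coordinate-descent classifier inside a
      # scikit-learn Pipeline using GridSearchCV.
      from sklearn.datasets import make_classification
      from sklearn.model_selection import GridSearchCV
      from sklearn.pipeline import make_pipeline
      from sklearn.preprocessing import StandardScaler
      from lightning.classification import CDClassifier

      X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

      pipe = make_pipeline(StandardScaler(), CDClassifier(penalty='l1'))
      # make_pipeline names each step after its lowercased class name.
      param_grid = {'cdclassifier__alpha': [1e-4, 1e-3, 1e-2]}
      grid = GridSearchCV(pipe, param_grid, cv=3)
      grid.fit(X, y)
      print(grid.best_params_)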
  5. FROM LARGE DATA TO LARGE OPTIMIZATION. Big data comes in different flavors: the data matrix has n rows (samples) and p columns (features). Large sample (large n): computer vision, advertising, etc. Large dimension (large p): biology, neuroscience, etc.
  6. LEARNING FROM LARGE SAMPLES. Usual methods (gradient descent, BFGS, etc.) pass through the whole dataset at each iteration, which is prohibitive for large datasets. Back to simple methods: stochastic gradient descent (Robbins and Monro, 1951).
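  To make the contrast concrete, here is a bare-bones sketch of the SGD idea for the logistic loss, written from the standard textbook update rather than from lightning's implementation:

      import numpy as np

      def sgd_logistic(X, y, n_epochs=10, eta0=1.0):
          # Plain SGD for logistic regression, labels y in {-1, +1}.
          # Each update touches one random sample, so a step costs O(p)
          # instead of the O(n * p) of a full-gradient method.
          n, p = X.shape
          w = np.zeros(p)
          t = 0
          for _ in range(n_epochs):
              for i in np.random.permutation(n):
                  t += 1
                  eta = eta0 / t  # decreasing step size (Robbins-Monro)
                  margin = y[i] * X[i].dot(w)
                  # gradient of log(1 + exp(-margin)) with respect to w
                  grad = -y[i] * X[i] / (1.0 + np.exp(margin))
                  w -= eta * grad
          return w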
  7. LEARNING FROM LARGE SAMPLES. [Figure: lightning benchmark example, n = 100,000.] In the last 5 years, a flurry of new stochastic methods: stochastic variance-reduced gradient (SVRG), stochastic dual coordinate ascent (SDCA), stochastic average gradient (SAG/SAGA). They are all in lightning!
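  Each of these solvers is exposed as a scikit-learn-style estimator. A sketch with the SAGA solver on a synthetic problem of that size (the loss and iteration settings are assumptions, not tuned values):

      from sklearn.datasets import make_classification
      from lightning.classification import SAGAClassifier

      X, y = make_classification(n_samples=100000, n_features=50,
                                 random_state=0)
      clf = SAGAClassifier(loss='log', max_iter=20)  # illustrative settings
      clf.fit(X, y)
      print(clf.score(X, y))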
  8. LEARNING FROM LARGE FEATURES. Iterate through the columns: coordinate-descent-like algorithms, very efficient for sparse models. Example: multiclass classification with a group-lasso penalty (Blondel et al., 2013).
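  A sketch of that use case: coordinate descent with an l1/l2 (group-lasso) penalty that ties each feature's weights together across classes, so entire features are zeroed out (the hyperparameter values are illustrative assumptions):

      from sklearn.datasets import fetch_20newsgroups_vectorized
      from lightning.classification import CDClassifier

      bunch = fetch_20newsgroups_vectorized(subset='train')
      X, y = bunch.data, bunch.target  # sparse input is supported

      clf = CDClassifier(loss='squared_hinge', penalty='l1/l2',
                         multiclass=True, max_iter=20,
                         alpha=1e-4, C=1.0 / X.shape[0])
      clf.fit(X, y)
      # count features with a nonzero weight in at least one class
      print((clf.coef_ != 0).any(axis=0).sum(), 'features kept')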
  9. STRUCTURED SPARSITY. There's so much more than the lasso: group-sparse penalty, total variation, trace norm (low rank).
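  The standard textbook definitions of these three penalties, for a weight vector w of length p (or a weight matrix W for the trace norm); the notation here is conventional, not taken from the slides:

      % group lasso: sum of l2 norms over a partition G of the features
      \Omega_{\mathrm{group}}(w) = \sum_{g \in \mathcal{G}} \|w_g\|_2
      % total variation: penalizes jumps between neighboring coefficients
      \Omega_{\mathrm{TV}}(w) = \sum_{i=1}^{p-1} |w_{i+1} - w_i|
      % trace (nuclear) norm: sum of singular values, encourages low rank
      \Omega_{\mathrm{trace}}(W) = \sum_{i} \sigma_i(W)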
  10. API. Similarities and differences with scikit-learn. In scikit-learn, LogisticRegression(penalty='l1', solver='liblinear'): the class names the loss function, and the solver parameter picks the algorithm. In lightning, CDClassifier(penalty='l1', loss='log'): the class names the algorithm, and the loss parameter picks the loss function. API based on algorithms, not models.
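  The two calls from the slide side by side; both spell out an l1-penalized logistic regression, but the class/parameter split is reversed:

      from sklearn.linear_model import LogisticRegression
      from lightning.classification import CDClassifier

      # scikit-learn: the class names the model (logistic regression);
      # the algorithm is selected through the solver parameter.
      sk_clf = LogisticRegression(penalty='l1', solver='liblinear')

      # lightning: the class names the algorithm (coordinate descent);
      # the model is selected through the loss parameter.
      lt_clf = CDClassifier(penalty='l1', loss='log')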
  11. EXTENSIBILITY. Typical losses and penalties are available. It is also possible to pass a custom loss or penalty function: clf = FistaClassifier(loss=my_loss, penalty=my_penalty) (available for the Fista* and SAGA* estimators).
  12. FUTURE CHALLENGES. Parallel stochastic methods (Leblond, Pedregosa, Lacoste-Julien, 2016). Out-of-core learning (scale beyond computer memory).
  13. SCIKIT-LEARN-CONTRIB. lightning is just the beginning. We welcome projects that are: scikit-learn compatible, documented, and with test coverage > 80%.
  14. THANKS FOR YOUR ATTENTION. http://contrib.scikit-learn.org/lightning/ (We're hiring!)
