6 grisel-scikit-learn-introduction-130228102221-phpapp02

6 grisel-scikit-learn-introduction-130228102221-phpapp02 Presentation Transcript

  • 1. scikit-learn: Machine Learning in Python. Data Tuesday - Feb. 26 2013 - Paris (Sunday, 24 February 2013)
  • 2.
      • Library of Machine Learning models
      • Simple fit / predict / transform API
      • Python / NumPy / SciPy / Cython & wrappers for libsvm / liblinear
      • Model Assessment, Selection & Ensembles
      • Some support for multi-core
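The fit / predict / transform API named on slide 2 can be illustrated with a minimal sketch. It is not taken from the talk: the iris dataset, the `StandardScaler` and `SVC` estimators, and all parameter values are assumptions chosen for illustration, and the imports follow current scikit-learn module names (`sklearn.model_selection`), which differ from the 0.13 release current at the time of the presentation.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Small bundled dataset (an assumption; the slides do not name one here).
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Transformers learn their parameters with fit() and apply them with transform().
scaler = StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Predictors learn with fit() and label new samples with predict().
# SVC is one of the estimators backed by the libsvm wrapper.
clf = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X_train_scaled, y_train)
y_pred = clf.predict(X_test_scaled)

accuracy = float(np.mean(y_pred == y_test))
print(f"test accuracy: {accuracy:.2f}")
```

The scaler is fitted on the training split only, so no statistics from the test split leak into the model.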
  • 3. Possible Applications
      • Text Classification / Sequence Tagging (NLP)
      • Computer Vision / Robotics
      • Learning to Rank: IR and advertising
      • Statistical Analysis of the Brain: fMRI / MEG
      • Astronomy, Biology, Social Sciences...
  • 7. Example: Training a Model for Face Recognition
  • 8. Total dataset size: n_samples: 1288, n_features: 1850, n_classes: 7
    Extracting the top 150 eigenfaces from 966 faces
    done in 0.466s
    Projecting the input data on the eigenfaces orthonormal basis
    done in 0.056s
    Fitting the SVM classifier to the training set
    done in 18.549s
    Predicting people's names on the test set
    done in 0.062s

                         precision    recall  f1-score   support

         Ariel Sharon         0.90      0.75      0.82        12
         Colin Powell         0.78      0.94      0.85        62
      Donald Rumsfeld         0.86      0.72      0.78        25
        George W Bush         0.89      0.96      0.92       141
    Gerhard Schroeder         0.92      0.74      0.82        31
          Hugo Chavez         0.90      0.53      0.67        17
           Tony Blair         0.81      0.74      0.77        34

          avg / total         0.86      0.86      0.86       322
  • 10. Learned Eigenfaces
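The steps logged in the face recognition example (extract principal components, project the data, fit an SVM, print a per-class report) can be sketched as follows. This is an illustration under stated assumptions, not the code shown in the talk: the bundled 8x8 digits dataset stands in for the face images so that nothing has to be downloaded, and the number of components (30) and the SVM settings are arbitrary choices rather than the values behind the figures above.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Stand-in data (assumption): 8x8 digit images instead of face images.
digits = load_digits()
X, y = digits.data, digits.target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)

# Step 1: learn the top principal components from the training images.
# On face images these components are what the slides call eigenfaces.
n_components = 30
pca = PCA(n_components=n_components, whiten=True, random_state=0).fit(X_train)
eigen_images = pca.components_.reshape((n_components, 8, 8))

# Step 2: project training and test data onto the component basis.
X_train_pca = pca.transform(X_train)
X_test_pca = pca.transform(X_test)

# Step 3: fit an SVM classifier on the projected training data.
clf = SVC(kernel="rbf", class_weight="balanced", gamma="scale")
clf.fit(X_train_pca, y_train)

# Step 4: predict on the test set and summarize precision / recall / F1.
y_pred = clf.predict(X_test_pca)
report = classification_report(y_test, y_pred)
print(report)
```

Each row of `eigen_images` can be plotted as an image, which is how a gallery of learned components like the one on slide 10 is typically produced.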
  • 11. Contributors
      • GitHub-centric contribution workflow
      • Each pull request needs 2 x [+1] reviews
      • Code + tests + doc + example
      • 92% test coverage / Continuous Integration
      • 4 major releases per year + 4 bugfix releases
      • 66 contributors for release 0.13
  • 12. Users
      • We support users on & ML
      • 200+ questions tagged with [scikit-learn]
      • Many competitors + benchmarks
      • 500+ answers on ongoing user survey
      • 60% academics / 40% from industry
      • Some data-driven startups use sklearn
  • 13. Thank you!
      • http://scikit-learn.org - Main Project + doc
      • @ogrisel on twitter
      • http://ogrisel.com - ML Consultancy (soon)
  • 14. Backup Slides
  • 15. Caveat Emptor
      • Domain-specific tooling kept to a minimum
      • Some feature extraction for Bag of Words text analysis
      • Some functions for extracting image patches
      • Domain integration is the responsibility of the user or third-party libraries
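The two domain-specific helpers mentioned on the last slide can be tried with a short sketch. The slide does not name specific functions; `CountVectorizer` and `extract_patches_2d` are assumed here to be the relevant scikit-learn utilities, and the toy documents and the 6x6 array are made up for illustration.

```python
import numpy as np
from sklearn.feature_extraction.image import extract_patches_2d
from sklearn.feature_extraction.text import CountVectorizer

# Bag of Words: each document becomes a row of token counts.
docs = ["the cat sat on the mat", "the dog sat on the log"]
vectorizer = CountVectorizer()
counts = vectorizer.fit_transform(docs)  # sparse matrix, one column per token
the_column = vectorizer.vocabulary_["the"]  # column index assigned to "the"
print(counts.shape, counts.toarray()[:, the_column])

# Image patches: slide a 3x3 window over a toy 6x6 "image".
image = np.arange(36, dtype=float).reshape(6, 6)
patches = extract_patches_2d(image, (3, 3))
print(patches.shape)
```

Anything beyond this, such as language-specific tokenization or image loading, is left to the user or to other libraries, as the slide says.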