
Build your own data challenge, or just organize team work


We have open sourced the toolkit behind www.ramp.studio and all the starting kits. You can use rampwf to build your own predictive workflows, organize workflow building and optimization within an internal data science team, and submit the kit to us if you want to run a code submission data challenge.

https://www.ramp.studio
https://github.com/paris-saclay-cds/ramp-workflow
https://github.com/ramp-kits
https://ramp-studio.slack.com






Published in: Data & Analytics

  1. RAMP-WORKFLOW & RAMP-KITS. Balázs Kégl, Université Paris-Saclay / CNRS, Center for Data Science Paris-Saclay
  2. Most classical data challenges are HR and publicity events
  3. We decided to turn them into a tool for (1) collaborative prototyping, (2) teaching, (3) data science team management
  4. We are open-sourcing it. Toolkit: https://github.com/paris-saclay-cds/ramp-workflow Examples: https://github.com/ramp-kits
  5. Funded by Université Paris-Saclay
  6. RAMP.STUDIO: DATA CHALLENGE WITH CODE SUBMISSION
  7. Center for Data Science Paris-Saclay, B. Kégl (CNRS). THE POWER OF THE (COLLABORATING) CROWD. Figure labels: competitive phase; collaborative phase; what you achieved with a well tuned deep net; the diversity gap; the human blender gap
  8. OPEN PHASE LETS PARTICIPANTS CATCH UP: THE GOAL OF TEACHING
  9. COMMUNICATION AND REUSE
  10. You can (1) participate in upcoming RAMPs, (2) use RAMP in teaching or training
  11. Setting up the RAMP was long and hard.
  12. Separate workflow building and workflow optimization
  13. Before solving the problem, set it up (even put it into production)
  14. RAMP-WORKFLOW & RAMP-KITS
      • toolkit: https://github.com/paris-saclay-cds/ramp-workflow
        • for designing workflows
        • set of ready-made metrics, workflows, CV schemes, data readers
        • a single command-line test script
      • examples: https://github.com/ramp-kits
        • a zoo of problems, experiments, workflows
        • (at least) one initial solution
  15. CLASSIFYING AND REGRESSING ON MOLECULAR SPECTRA. Diagram labels: chemotherapy drug in elastic pocket; laser spectrometer; molecular spectra; feature extractor 1; feature extractor 2; regressor (concentration); classifier (drug type)
  16. FORECASTING EL NINO SIX MONTHS AHEAD. Diagram: time series (… 300.14 299.83 298.76 299.87 299.82 300.15 300.10 299.50 …), feature extractor, x (a fixed-length feature vector), regressor
  17. A SINGLE SCRIPT TO DEFINE THE BUNDLE. Diagram labels: data connectors; X; workflow (FE, CLF); ypred; score type; score; cross-validation scheme
  18. A SINGLE EXECUTABLE TO TEST THE SUBMISSIONS
      • Keep your different submissions in a simple file structure
      • Communicate them on git
      • Execute them also from the notebook
  19. You can (1) use rampwf for your own workflows, (2) use rampwf to organize workflow building and optimization in an internal data science team, (3) submit it to us if you want to run a data challenge
  20. Links and contact
      • toolkit: github.com/paris-saclay-cds/ramp-workflow
      • examples: github.com/ramp-kits
      • blogs: medium.com/@balazskegl
      • slack: ramp-studio.slack.com
      • frontend: www.ramp.studio
      • mail: balazs.kegl@gmail.com
