Kaggle bikeshare Competition - Part 1

Kaggle Competition on Bikeshare

  1. Kaggle <- BikeShare. Taposh Dutta Roy, Jan 26th 2015. Presented at YHAT, Oakland, CA
  2. Contents: About Bikeshare • Data • Tools • R • Factor Engineering • Matrix • Random Forest • Neural Nets
  3. About Bike Share. Competition: http://www.kaggle.com/c/bike-sharing-demand Challenge: Forecast use of a city's bike share system. Data: UCI Machine Learning Repository
  4. About Bike Share. Publication: Fanaee-T, Hadi, and Gama, Joao, "Event labeling combining ensemble detectors and background knowledge", Progress in Artificial Intelligence (2013): pp. 1-15, Springer Berlin Heidelberg.
  5. Data. The goal is to predict counts, either as the sum of casual and registered rentals or directly.
  6. Data Fields
     datetime - hourly date + timestamp
     season - 1 = spring, 2 = summer, 3 = fall, 4 = winter
     holiday - whether the day is considered a holiday
     workingday - whether the day is neither a weekend nor a holiday
     weather -
       1: Clear, Few clouds, Partly cloudy
       2: Mist + Cloudy, Mist + Broken clouds, Mist + Few clouds, Mist
       3: Light Snow, Light Rain + Thunderstorm + Scattered clouds, Light Rain + Scattered clouds
       4: Heavy Rain + Ice Pellets + Thunderstorm + Mist, Snow + Fog
     temp - temperature in Celsius
     atemp - "feels like" temperature in Celsius
     humidity - relative humidity
     windspeed - wind speed
     casual - number of non-registered user rentals initiated
     registered - number of registered user rentals initiated
     count - number of total rentals
  8. Data
     Datetime - hourly date + timestamp
     Predefined Factors:
     season - 1 = spring, 2 = summer, 3 = fall, 4 = winter
     holiday - whether the day is considered a holiday
     workingday - whether the day is neither a weekend nor a holiday
     weather -
       1: Clear, Few clouds, Partly cloudy
       2: Mist + Cloudy, Mist + Broken clouds, Mist + Few clouds, Mist
       3: Light Snow, Light Rain + Thunderstorm + Scattered clouds, Light Rain + Scattered clouds
       4: Heavy Rain + Ice Pellets + Thunderstorm + Mist, Snow + Fog
  9. Data - Continuous
  10. Data: Workday busy hours
  11. Data
  12. Data
  13. Tools: Weka, R, Python, H2O + R, Vowpal Wabbit
  14. Using R
  15. Feature Engineering
  16. Citations
      Joseph Sill, Gabor Takacs, Lester Mackey, and David Lin, "Feature-Weighted Linear Stacking"
      Michael Jahrer, Andreas Töscher, and Robert Legenstein, "Combining Predictions for Accurate Recommender Systems"
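
The data slides describe an hourly datetime column, four predefined factors, four continuous fields, and a count that can be predicted directly or as casual plus registered. A minimal sketch of how such rows might be prepared is given here. It uses Python (listed under Tools) with only the standard library, not the R code from the talk, which is not in this transcript. The sample values, the helper names `parse_row` and `load_rows`, and the derived features (hour, weekday, month, year) are assumptions for illustration.

```python
import csv
import io
from datetime import datetime

# Illustrative rows in the competition's column layout (values are made up).
SAMPLE_CSV = """datetime,season,holiday,workingday,weather,temp,atemp,humidity,windspeed,casual,registered,count
2011-01-01 00:00:00,1,0,0,1,9.84,14.395,81,0.0,3,13,16
2011-01-03 08:00:00,1,0,1,2,8.2,9.85,47,15.0,5,83,88
"""

CATEGORICAL = ("season", "holiday", "workingday", "weather")  # predefined factors
CONTINUOUS = ("temp", "atemp", "humidity", "windspeed")
TARGETS = ("casual", "registered", "count")


def parse_row(raw):
    """Convert one CSV row (a dict of strings) into typed features."""
    stamp = datetime.strptime(raw["datetime"], "%Y-%m-%d %H:%M:%S")
    # Hypothetical derived features; the talk's own feature list is not shown.
    row = {
        "hour": stamp.hour,
        "weekday": stamp.weekday(),  # Monday = 0 ... Sunday = 6
        "month": stamp.month,
        "year": stamp.year,
    }
    for name in CATEGORICAL:
        row[name] = int(raw[name])
    for name in CONTINUOUS:
        row[name] = float(raw[name])
    for name in TARGETS:
        row[name] = int(raw[name])
    return row


def load_rows(text):
    """Parse CSV text into a list of typed feature dicts."""
    return [parse_row(raw) for raw in csv.DictReader(io.StringIO(text))]


rows = load_rows(SAMPLE_CSV)
for row in rows:
    # count is the sum of its two parts, so a model can target it directly
    # or predict casual and registered separately and add the predictions.
    assert row["count"] == row["casual"] + row["registered"]
```

Keeping the factors as integers and the continuous fields as floats leaves the choice of encoding (for example one-hot columns for a neural net, raw codes for a random forest) to whichever model consumes the rows.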
