Mother Nature’s Impact 
on Bike Ridership 
Jackie Zajac 
Kays Fattal 
Naumaan Nasir
Does weather have a relationship 
with bike ridership? 
Can we predict bike usage based 
on weather?
INTRODUCTION 
• Our team 
• Research questions 
• Picking datasets 
• Our audience
METHODOLOGY 
• Why linear regression? 
• How we manipulated the data 
• MySQL engine aggregated 
3M table into sum of rental 
counts and duration 
• Mashed up with 731 rows of 
weather data (2011, 2012) 
• Added a Year field 
• Tools: Excel, MySQL database, 
R (Rattle)
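
The aggregation and mash-up steps on this slide can be sketched as a single SQL query. This is a minimal illustration, not the team's actual query: an in-memory SQLite database stands in for MySQL, and the table names, column names, and sample rows are all hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for the MySQL database named on the slide.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schemas and rows; the real trip table had about 3M rows
    -- and the real weather table had 731 rows (2011, 2012).
    CREATE TABLE trips (start_date TEXT, duration_sec INTEGER);
    CREATE TABLE weather (date TEXT, max_temp REAL);
    INSERT INTO trips VALUES
        ('2011-01-01', 600), ('2011-01-01', 900),
        ('2011-01-02', 1200), ('2012-01-01', 300);
    INSERT INTO weather VALUES
        ('2011-01-01', 45.0), ('2011-01-02', 50.0), ('2012-01-01', 38.0);
""")

# Collapse trips to one row per day (rental count and summed duration),
# join that day's weather row, and derive a Year field from the date.
query = """
    SELECT w.date,
           CAST(substr(w.date, 1, 4) AS INTEGER) AS year,
           COUNT(*)            AS rental_count,
           SUM(t.duration_sec) AS total_duration_sec,
           w.max_temp
    FROM trips AS t
    JOIN weather AS w ON w.date = t.start_date
    GROUP BY w.date, w.max_temp
    ORDER BY w.date
"""
daily_rows = conn.execute(query).fetchall()
for row in daily_rows:
    print(row)
```

Each output row is one day, which is the shape a regression over daily weather needs.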
METHODOLOGY 
• Picking our best configuration 
• Categoric vs. numeric variables 
• Must decide how to measure bike usage 
• Must pick best variables 
• Error analysis
PHASE I 
• Began with a broad study of six regressions 
• Two target variables (rental counts, duration) 
• Three temperature measures 
• Minimum, Average, Maximum 
• Chunked the day into three time ranges to reflect 
temperature during bike rides 
• Evaluated multiple weather variables’ effect on 
regressions 
• Ignored Date field
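
The "six regressions" above are the two target variables crossed with the three temperature measures. The sketch below enumerates those six combinations and scores each with R-squared. It is a simplification under stated assumptions: it fits one temperature predictor at a time with ordinary least squares, whereas the study also evaluated other weather variables, and every number in `days` is invented for illustration.

```python
from itertools import product


def fit_line(x, y):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(x, y, slope, intercept):
    """Share of the variance in y explained by the fitted line."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Hypothetical daily values, invented only to make the sketch runnable.
days = {
    "min_temp": [30.0, 41.0, 48.0, 60.0, 66.0],
    "avg_temp": [38.0, 50.0, 57.0, 70.0, 75.0],
    "max_temp": [45.0, 58.0, 66.0, 80.0, 85.0],
    "rental_count": [1500.0, 2600.0, 3300.0, 4700.0, 5000.0],
    "total_duration_hr": [400.0, 800.0, 1000.0, 1600.0, 1700.0],
}

targets = ["rental_count", "total_duration_hr"]
temperatures = ["min_temp", "avg_temp", "max_temp"]

# Two targets x three temperature measures = six regressions.
results = {}
for target, temp in product(targets, temperatures):
    slope, intercept = fit_line(days[temp], days[target])
    results[(target, temp)] = r_squared(days[temp], days[target], slope, intercept)

# Print the configurations from best to worst fit.
for (target, temp), score in sorted(results.items(), key=lambda kv: -kv[1]):
    print(f"{target:>18} ~ {temp:<8}  R-squared = {score:.4f}")
```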
Plots
PHASE II 
• Combining the data sets 
• Picking best variables: 
• Bike rental counts as sole target variable 
• Maximum temperature 
• Utilized date/year field 
• Switched Snow to categoric variable 
• Analyzed and refined our regression 
• Higher accuracy – R-squared = .8374 or 83.74%
MSE and R-squared 
• A measure of accuracy in one dataset 
predicting another 
• Relationship between R-squared and MSE
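
One standard way to state the relationship between the two measures is R-squared = 1 - MSE / Var(actual), where the variance is the population variance (dividing by n) of the actual values in the same dataset being scored. The sketch below checks that identity numerically. The actual and predicted values are hypothetical, and the slide does not say which formulation the team presented.

```python
def mse(actual, predicted):
    """Mean squared error: average squared gap between actual and predicted."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def population_variance(values):
    """Variance with divisor n, so it is directly comparable to MSE."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def r_squared(actual, predicted):
    """1 - (residual sum of squares / total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Hypothetical held-out rental counts and model predictions.
actual = [3000.0, 4200.0, 5100.0, 6100.0]
predicted = [3200.0, 4000.0, 5000.0, 6300.0]

error = mse(actual, predicted)
r2_direct = r_squared(actual, predicted)
r2_from_mse = 1 - error / population_variance(actual)

print(f"MSE = {error:.1f}")
print(f"R-squared (direct)   = {r2_direct:.6f}")
print(f"R-squared (from MSE) = {r2_from_mse:.6f}")
```

Because the divisor n cancels, a lower MSE on a given dataset always corresponds to a higher R-squared on that dataset.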
FINAL MODEL 
Weight       Variable 
-4004.501    Intercept 
   62.118    Maximum Temperature 
 -132.741    Average Wind 
   93.162    Precipitation 
  416.818    Visibility 
 2063.069    Year 
 -161.038    Snow [0.0-1.2] inches 
   -4.945    Snow [1.2-2.0] inches 
 -588.349    Snow [2.0-3.1] inches 
   -5.390    Snow [3.1-3.9] inches 
Y = Intercept + sum of (Weight × Variable)
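
The weights on this slide can be applied as an ordinary linear prediction. The sketch below is illustrative only, and several details are assumptions because the slides do not state them: the units of each input, how the Year field is encoded, which snow range is the reference level, and which range a boundary value falls into. To avoid guessing, the caller passes the snow range label directly, and `snow_bin=None` is assumed to mean that no snow term is added. The feature names and the example day are hypothetical.

```python
# Weights copied from the FINAL MODEL slide.
INTERCEPT = -4004.501
WEIGHTS = {
    "max_temp": 62.118,
    "avg_wind": -132.741,
    "precipitation": 93.162,
    "visibility": 416.818,
    "year": 2063.069,
}
# Snow is categoric: at most one of these weights applies to a given day.
SNOW_WEIGHTS = {
    "0.0-1.2": -161.038,
    "1.2-2.0": -4.945,
    "2.0-3.1": -588.349,
    "3.1-3.9": -5.390,
}


def predict_rentals(features, snow_bin=None):
    """Intercept + sum(weight * value) for numeric inputs, plus one snow weight.

    Assumption: snow_bin=None adds no snow term. The slides do not say
    which level the regression treated as the reference.
    """
    total = INTERCEPT
    for name, weight in WEIGHTS.items():
        total += weight * features[name]
    if snow_bin is not None:
        total += SNOW_WEIGHTS[snow_bin]
    return total


# Hypothetical inputs; units and the Year encoding are assumptions.
example_day = {
    "max_temp": 70.0,
    "avg_wind": 5.0,
    "precipitation": 0.0,
    "visibility": 10.0,
    "year": 1.0,
}

no_snow = predict_rentals(example_day)
with_snow = predict_rentals(example_day, snow_bin="2.0-3.1")
print(round(no_snow, 3), round(with_snow, 3))
```

Keeping the snow weights in a separate lookup mirrors the switch from a numeric to a categoric Snow variable in Phase II: the snow term is a fixed offset per range rather than a slope.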
LESSONS LEARNED 
• Too many independent variables to incorporate 
crime dataset in addition to weather dataset 
• Mean Squared Error (MSE), R-squared 
• Only two years’ worth of data was available due to 
Bikeshare’s short history (2011, 2012) 
• Final model would be even more accurate with 
additional historical data
CONCLUSION 
• Our hypotheses proved true: weather does affect 
bike ridership 
• Why is Maximum Temperature better? 
• Why does the Year improve accuracy? 
• The categorical range of snow inches
QUESTIONS? 
Thanks!

Data Science: Can weather predict Bikeshare usage?


Editor's Notes

  1. Does weather affect Bikeshare, and how? Can we predict it? To what limit can we be accurate? Found the datasets on Capital Bikeshare and the Farmer’s Almanac. Who can use this study? Discuss what this could do for Bikeshare as a company.
  2. Linear regression was best suited. We were doing a comparison rather than classification. It was not a true/false research question. We used charts in Excel to study the difference between predicted values and actual values.
  3. Min, avg, max temperature: which are the best variables? Error analysis: used both MSE and R-squared. Kays will discuss in further detail later.
  4. TOP LEFT: Minimum temperature. TOP RIGHT: Average temperature, date is numeric. LOWER LEFT: Maximum temperature, date is numeric. LOWER RIGHT: Best combination: Maximum temperature, Year variable (numeric, two years only).