This document summarizes an analysis of Airbnb bookings by first-time users. The objectives were to analyze where these users book and to identify social and seasonal trends. Five datasets were used, including training, test, and session data, which were cleaned and preprocessed. Initial analysis found trends in social media and in seasonal bookings over time. Excel, RStudio, and ArcGIS were used to apply decision trees, XGBoost, and linear regression. The models analyzed associations between variables, and earlier models were compared. Key findings include the expected number of bookings by country, with the US receiving the most. The document concludes with limitations and a comparison of the results with those of the competition winners.
4. Our Data
• 5 original datasets:
• Training Set (213,451 x 16)
• Test Set (62,096 x 15)
• Sessions (10,567,737 x 6)
• Age Brackets
• Countries
5. Data Cleansing and Dummy Variables
• Merging of sessions dataset with training set and with test set
• Dealing with the missing data
• Creation of the dummy variables
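The three cleansing steps above can be sketched in a few lines. The slides name RStudio as the tool and do not say how missing values were handled, so this is only an illustration in standard-library Python: the miniature tables, the column names (`age`, `signup_method`, `secs_elapsed`), the per-user session aggregates, and the choice of median imputation are all assumptions, not the authors' method.

```python
from collections import defaultdict
from statistics import median

# Hypothetical miniature stand-ins for the training/test users and the sessions log.
users = [
    {"id": "u1", "age": 34, "signup_method": "facebook"},
    {"id": "u2", "age": None, "signup_method": "basic"},
    {"id": "u3", "age": 27, "signup_method": "google"},
]
sessions = [
    {"user_id": "u1", "action": "search", "secs_elapsed": 120.0},
    {"user_id": "u1", "action": "show", "secs_elapsed": None},
    {"user_id": "u3", "action": "search", "secs_elapsed": 45.0},
]


def aggregate_sessions(rows):
    """Collapse the session log to one summary record per user."""
    agg = defaultdict(lambda: {"n_actions": 0, "total_secs": 0.0})
    for row in rows:
        entry = agg[row["user_id"]]
        entry["n_actions"] += 1
        # A missing duration is counted as zero seconds (assumption).
        entry["total_secs"] += row["secs_elapsed"] or 0.0
    return agg


def clean(users, sessions):
    """Merge session summaries onto users, impute ages, add dummy variables."""
    agg = aggregate_sessions(sessions)
    # Median imputation is an assumed strategy for the missing ages.
    age_fill = median(u["age"] for u in users if u["age"] is not None)
    methods = sorted({u["signup_method"] for u in users})
    cleaned = []
    for u in users:
        # Left join: users without any session keep zero-valued summaries.
        stats = agg.get(u["id"], {"n_actions": 0, "total_secs": 0.0})
        row = {
            "id": u["id"],
            "age": u["age"] if u["age"] is not None else age_fill,
            "n_actions": stats["n_actions"],
            "total_secs": stats["total_secs"],
        }
        # One 0/1 dummy column per level of the categorical variable.
        for m in methods:
            row[f"signup_method_{m}"] = 1 if u["signup_method"] == m else 0
        cleaned.append(row)
    return cleaned


cleaned = clean(users, sessions)
for row in cleaned:
    print(row)
```

The same function would be applied once to the training set and once to the test set, so that both end up with identical columns.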
13. Findings
Country                     Expected User Bookings
Australia                   552
Canada                      802
Germany                     662
Spain                       883
France                      1,359
Great Britain               951
Italy                       1,041
Netherlands                 723
Portugal                    514
United States of America    13,885
Other                       3,008
No Booking                  37,716
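As a quick check on the findings, the counts can be totalled and converted to shares. The sketch below uses only the figures listed above; the variable names are illustrative. The counts sum to 62,096, the same as the number of rows in the test set, which suggests (an inference, not something the slides state) that the table is the distribution of predicted outcomes over the test users.

```python
# Expected user bookings per destination, copied from the findings.
expected_bookings = {
    "Australia": 552,
    "Canada": 802,
    "Germany": 662,
    "Spain": 883,
    "France": 1359,
    "Great Britain": 951,
    "Italy": 1041,
    "Netherlands": 723,
    "Portugal": 514,
    "United States of America": 13885,
    "Other": 3008,
    "No Booking": 37716,
}

total = sum(expected_bookings.values())
booked = total - expected_bookings["No Booking"]

# Most common destination among users expected to book at all.
top_destination = max(
    (c for c in expected_bookings if c != "No Booking"),
    key=expected_bookings.get,
)

# Print each outcome with its count and its share of all users, largest first.
for country, count in sorted(
    expected_bookings.items(), key=lambda kv: kv[1], reverse=True
):
    print(f"{country:<26}{count:>7,}{count / total:>8.1%}")

print(f"Total users: {total:,}; users expected to book: {booked:,}")
print(f"Top destination: {top_destination}")
```

Read this way, about 61% of users are expected not to book, and the US accounts for roughly 57% of the expected bookings.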