Paris is the most populous city in France, with over 2 million residents within its 105 square km area. As a popular tourist destination with rich French culture, tour advisors in Paris face significant competition. This document uses data sources like Wikipedia and APIs to collect data on venues and cluster the 20 boroughs of Paris into 5 groups based on venue types and locations, to help advisors recommend accommodations and tours tailored to visitor interests. A map visualization shows the clustered boroughs to aid advising. The analysis could be expanded to provide more detailed neighborhood guidance to tourists.
2. Background
most populous city of France with an area of
105 square kilometers (41 square miles) and an
official estimated population of 2,140,526
residents i.e. with 252 residents per hectare, not
counting parks.
Paris being rich in French culture and heritage is
a tourist attraction to a large set of people.
3. Problem and Interest
Looking at the large number figure of the visitors
it’s obvious the tour advisors face a tough
competition.
It’s needed to know where to invest money and
time during a tour and to choose a Borough
wisely to take accommodation in.
The satisfaction of customer is necessary for the
business profit of Tour advisors.
4. Data Sources and Data Cleaning
Paris,Wikipedia was the major source
The unnecessary columns were dropped and the
usable dataframe now looked like this-
5. Collecting more data
Latitudes and Longitudes of the Boroughs were
obtained from Nominatim API using geopy
module,python.
Further, venues were found using Foursquare API
6. Using Foursquare API
I designed the limit as 100 venue and the
radius 500 meter for each borough from their
given latitude and longitude informations.
8. K-means Clustering
One hot encoding was used to deal with
Categorical Data
Clusters=5
Venues were taken in account
Folium library was used to superimpose clusters
on Paris map
10. Discussion
I used the Kmeans algorithm as part of this
clustering study where I assumed no. of clusters
to be 5. However, only 20 Arrondissement
coordinates were used. For more detailed and
accurate guidance, the data set can be expanded
and the details of the neighborhood or
monuments can also be drilled.
I ended the study by visualizing the data and
clustering information on the Paris map.
11. Conclusion
As the tourism plays a great role in Paris , there
must a cut throat competitions between tour
advisors and for being the best customer
satisfaction is the foremost thing to taken care of.
Advising the neighborhoods for accommodation
or visiting according to their interests of venues
and monuments can save time and money of the
visitors which would only result into good tour and
satisfied customers for the tour advisors.