
Revealing spatial and temporal patterns from Flickr photography: a case study with tourists in Amsterdam

An exploratory visual analytics approach was used to identify temporal distributions, spatial clusters and popular routes of tourists in Amsterdam from geotagged photos on the social media platform Flickr. The presented methods combine the analytical strength of humans with the data processing power of computers, using geovisualisations and charts to explore data, find patterns, and draw conclusions. For this research, the metadata of 2,849,261 geotagged photos was harvested from Flickr and stored in a spatial database. Of these, 393,828 photos were located in the municipality of Amsterdam. A semi-automatic classification method classified 39.1% of the users as tourists with very high precision and recall. The temporal distributions of tourists and locals are compared at different temporal granularities. A method is presented to assess photo timestamps using photos that show a real clock. An existing grid-based clustering method was implemented and improved to explore the spatial distribution of tourists in Amsterdam in Google Earth. The major tourist hotspots were detected using the density-based clustering algorithm DBSCAN. Finally, the most probable routes of tourists between subsequent photo locations were estimated and aggregated into a route density map. The study outcomes were validated qualitatively by interviewing eight tourism experts of the municipality of Amsterdam. Their knowledge of the city closely matches the detected spatial clusters and the route density map of tourists. Despite several imperfections of geosocial data, we conclude that the methods provide meaningful insight into the spatial and temporal patterns of tourists in urban spaces and are a valuable addition to traditional tourism surveys.

Published in: Data & Analytics

  1. 1. REVEALING SPATIAL AND TEMPORAL PATTERNS FROM FLICKR: A CASE STUDY WITH TOURISTS IN AMSTERDAM
  2. 2. TOURISM IN AMSTERDAM: RAPID GROWTH (Source: Nicky Otten, Flickr)
  3. 3. MORE AND MORE CONCERNS ABOUT TOURISM: A SELECTION OF RECENT NEWS ARTICLES • "They are puking and peeing on the Zeedijk" (NOS, December 5, 2014) • "Is Amsterdam becoming a second Venice?" (De Morgen, March 27, 2015) • "The center of Amsterdam should not become too popular" (Volkskrant, October 25, 2014) • "Amsterdam taken over by tourists" (RTL, April 3, 2015) • "Amsterdam will welcome twice as many tourists in 2030" (Het Parool, December 9, 2014)
  4. 4. INITIAL RESEARCH TOPIC: WAGENINGEN UNIVERSITY AND AMS Explore the possibilities to use (geo)tweets for detecting spatial and temporal patterns of tourists in Amsterdam. But why Twitter? How about Flickr? Comparison (ratings listed for Twitter, then Flickr): • Number of users: + + + / - • Amount of data: + + + • Connection of data to real location: + / - + + • Use by tourists: + / - + + • Interval between subsequent posts: + / - + +
  5. 5. RESEARCH PROJECT OBJECTIVE: The objective of this exploratory research project is to develop, implement and test methods that reveal spatial and temporal patterns of tourists from a large dataset of geotagged Flickr photos. RESEARCH QUESTIONS: • RQ-01: What methods are available to detect spatial and temporal patterns from geosocial data? • RQ-02: What methods need to be implemented to identify temporal distributions, spatial clusters and popular routes of tourists from the metadata of Flickr photos? • RQ-03: How well do the identified temporal distributions, spatial clusters and popular routes resemble the spatial and temporal behaviour of tourists?
  6. 6. FLICKR DATA COLLECTION
  7. 7. FLICKR DATA COLLECTION: OVERVIEW OF STEPS & TECHNIQUES [Diagram components: Flickr database (API), Java application, local database (PostgreSQL); flow labels: Request, XML file, Metadata] Restriction: 1 request per second
  8. 8. FLICKR DATA COLLECTION STEP 1: HARVESTING PHOTO IDS WITHIN BOUNDING BOXES (1550) Search parameters: • Xmin, Xmax, Ymin, Ymax • Min date: January 1, 2005 • Max date: December 31, 2014 Search result: • Photo ID • User ID • Photo title
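Step 1 queries Flickr once per bounding box. The following is a minimal sketch of how a search extent could be tiled into such boxes. The extent coordinates, the grid dimensions, and the function name `tile_extent` are assumptions made for illustration; the slides do not state how the boxes were generated.

```python
from typing import List, Tuple

# (xmin, ymin, xmax, ymax), matching the search parameters named on the slide
BBox = Tuple[float, float, float, float]


def tile_extent(xmin: float, ymin: float, xmax: float, ymax: float,
                cols: int, rows: int) -> List[BBox]:
    """Split an extent into cols x rows equally sized bounding boxes."""
    if cols < 1 or rows < 1:
        raise ValueError("cols and rows must be at least 1")
    width = (xmax - xmin) / cols
    height = (ymax - ymin) / rows
    boxes = []
    for r in range(rows):
        for c in range(cols):
            boxes.append((xmin + c * width,
                          ymin + r * height,
                          xmin + (c + 1) * width,
                          ymin + (r + 1) * height))
    return boxes


# Hypothetical extent roughly around Amsterdam (longitude/latitude).
boxes = tile_extent(4.7, 52.3, 5.1, 52.5, cols=4, rows=2)
```

Each box would then be passed as the Xmin, Xmax, Ymin, Ymax parameters of one search request, respecting the limit of one request per second.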
  9. 9. FLICKR DATA COLLECTION STEP 2: REQUESTING ADDITIONAL METADATA Search parameters: • Photo ID Search result: • Latitude, longitude • Date and time • User name • User home location • Tags • Photo URL • Location accuracy Total: 2,849,261 photos; approximately 5 weeks of harvesting
  10. 10. FLICKR DATA COLLECTION STEP 2: REQUESTING ADDITIONAL METADATA Search parameters: • Photo ID Search result: • Latitude, longitude • Date and time • User name • User home location • Tags • Photo URL • Location accuracy Selection shown: 484,346 photos
  11. 11. FLICKR DATA EXPLORATION PHOTOS IN QGIS
  12. 12. FLICKR DATA EXPLORATION SELECTION OF PHOTOS IN GOOGLE EARTH
  13. 13. TOURIST CLASSIFICATION BASED ON USER’S HOME LOCATION
  14. 14. TOURIST CLASSIFICATION: SQL AND ONLINE GEOCODING 1. Classification of user location by SQL: UPDATE users SET countryname = 'Japan', istourist = 'True', classification = 'SQL' WHERE geoname = '' AND userid IN (SELECT userid FROM users WHERE (userlocation ~* '\y(japan|nippon|日本)\y')) (8,628 users, 54%) 2. Classification of user location by online geocoding: User location = Tokyo; Tokyo = Japan (450 users, 3%) [Diagram components: Geonames API (external database), PostgreSQL (local database), Java application]
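The keyword matching in step 1 can be sketched outside the database as well. This is a minimal illustration, assuming a small hand-made keyword table; only the Japan pattern appears on the slide, and the Netherlands entry, the table name `COUNTRY_PATTERNS`, and the function `classify_country` are hypothetical. Python's `\b` is used here as the counterpart of the word-boundary escape in the SQL pattern.

```python
import re
from typing import Optional

# Hypothetical keyword table: only the Japan pattern is taken from the slide.
COUNTRY_PATTERNS = {
    "Japan": re.compile(r"\b(japan|nippon|日本)\b", re.IGNORECASE),
    "Netherlands": re.compile(r"\b(netherlands|nederland|holland)\b", re.IGNORECASE),
}


def classify_country(user_location: str) -> Optional[str]:
    """Return the first country whose keywords occur as whole words, else None."""
    for country, pattern in COUNTRY_PATTERNS.items():
        if pattern.search(user_location):
            return country
    return None
```

Locations that match no pattern would be left for the second step (online geocoding) or remain unclassified.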
  15. 15. TOURIST CLASSIFICATION: NUMBER OF UNIQUE USERS • Locals: 6,914 (43.2%) • Tourists: 6,257 (39.1%) • Unclassified: 2,821 (17.6%) Overall accuracy = 99%
  16. 16. TOURIST CLASSIFICATION: NUMBER OF UNIQUE PHOTOS • Local photos: 132,213 (33.6%) • Tourist photos: 107,016 (27.2%) • Unclassified photos: 154,599 (39.3%) Overall accuracy = 99%
  17. 17. CLASSIFICATION RESULTS AMSTERDAM: RELATIVE AMOUNT OF TOURISTS PER NATIONALITY (2013) [Bar chart, 0% to 20%; nationalities: United States, United Kingdom, Germany, Italy, Spain, France; series: Flickr nationalities 2013, CBS hotel nationalities 2013]
  18. 18. TEMPORAL DISTRIBUTIONS DIFFERENT GRANULARITIES
  19. 19. TEMPORAL DISTRIBUTIONS: RELATIVE NUMBER OF TOURISTS AND PHOTOS PER HOUR (2005-2014) [Chart: share per hour of the day, 0% to 10%; series: Tourists, Tourist photos] Annotation: Many daytime photos
  20. 20. TEMPORAL DISTRIBUTIONS: RELATIVE NUMBER OF TOURISTS AND LOCALS PER HOUR (2005-2014) [Chart: share per hour of the day, 0% to 10%; series: Tourists, Locals] Annotations: Maximums shifted • Relatively more tourist photos in the night • More local photos in the evening
  21. 21. TIMESTAMP VALIDATION: TIME DIFFERENCE BETWEEN PHOTO TIMESTAMP AND REAL TIME [Photo examples: Exact match; 2 hours off]
  22. 22. TIMESTAMP VALIDATION: TIME DIFFERENCE BETWEEN PHOTO TIMESTAMP AND REAL TIME Selecting: • all photos tagged with ‘clock’ • all photos near Central Station → 1,032 photos of locals, 1,134 photos of tourists Result: • 70 suitable photos of tourists • 50 suitable photos of locals
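The comparison between a photo's timestamp and the time shown on a clock in the picture can be sketched as follows. This is a minimal illustration under stated assumptions: the clock time has already been read manually and resolved to a full date and time, the sign convention (photo timestamp minus clock time) is chosen here for illustration, and the names `offset_hours` and `offset_histogram` are hypothetical.

```python
from collections import Counter
from datetime import datetime
from typing import Iterable, Tuple


def offset_hours(photo_timestamp: datetime, clock_time: datetime) -> int:
    """Photo timestamp minus clock time, rounded to the nearest whole hour."""
    seconds = (photo_timestamp - clock_time).total_seconds()
    return round(seconds / 3600)


def offset_histogram(observations: Iterable[Tuple[datetime, datetime]]) -> Counter:
    """Count how many photos fall into each whole-hour offset."""
    return Counter(offset_hours(photo, clock) for photo, clock in observations)


# Made-up observations: (photo timestamp, time read from the clock in the photo).
observations = [
    (datetime(2013, 7, 1, 14, 5), datetime(2013, 7, 1, 14, 3)),
    (datetime(2013, 7, 1, 16, 2), datetime(2013, 7, 1, 14, 1)),
    (datetime(2013, 7, 2, 9, 58), datetime(2013, 7, 2, 10, 0)),
]
histogram = offset_histogram(observations)
```

A histogram with most of its mass at offset 0 would indicate reliable timestamps; offsets of whole hours are what one would expect from cameras left in a different time zone.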
  23. 23. TIMESTAMP VALIDATION: TIME DIFFERENCE BETWEEN PHOTO TIMESTAMP AND REAL TIME [Chart: share of photos (0% to 80%) per time difference from -10:00:00 to +10:00:00; series: Locals, Tourists]
  24. 24. TEMPORAL DISTRIBUTIONS: PHOTOGRAPHERS PER DAY OF THE WEEK (2005-2014) [Chart: share per weekday, Monday to Sunday, 0% to 20%; series: Tourists, Locals]
  25. 25. TEMPORAL DISTRIBUTIONS: PHOTOGRAPHERS PER MONTH (2005-2014) [Chart: share per month, January to December, 0% to 12%; series: Tourists, Locals]
  26. 26. TEMPORAL DISTRIBUTIONS: TOURISTS AND FOREIGN HOTEL GUESTS PER MONTH (2012+2013) [Chart: share per month, January to December, 0% to 12%; series: Tourists (Flickr 2012 + 2013), Hotel guests (CBS 2012 + 2013)]
  27. 27. TEMPORAL DISTRIBUTIONS: PHOTOGRAPHERS PER DAY OF THE YEAR (2005-2014) [Chart: photographers per day (0 to 200) for days 1 to 365; series: Locals, Tourists] Annotation: Queen's Day
  28. 28. SPATIAL DISTRIBUTION GRID-BASED CLUSTERING
  29. 29. SPATIAL DISTRIBUTION: GRID-BASED CLUSTERING [Illustration: grid cells labelled with their counts]
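Grid-based clustering boils down to assigning every photo to a cell and counting per cell. The sketch below is a simplification: it uses square cells in projected coordinates, whereas the slides mention hexagons, and it counts unique users rather than photos so that one prolific photographer does not dominate a cell. The function name and the input layout are assumptions.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def count_users_per_cell(photos: Iterable[Tuple[str, float, float]],
                         cell_size: float) -> Dict[Tuple[int, int], int]:
    """Count distinct users per square grid cell.

    photos: (user_id, x, y) with x and y in projected coordinates (e.g. metres).
    """
    users_in_cell = defaultdict(set)
    for user_id, x, y in photos:
        cell = (math.floor(x / cell_size), math.floor(y / cell_size))
        users_in_cell[cell].add(user_id)
    return {cell: len(users) for cell, users in users_in_cell.items()}


# Made-up sample: two users in the first cell, one user in the neighbouring cell.
sample = [("u1", 10, 10), ("u1", 20, 30), ("u2", 40, 40),
          ("u3", 150, 20), ("u3", 160, 25)]
counts = count_users_per_cell(sample, cell_size=100)
```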
  30. 30. EXPLORING THE DATA TOURIST COUNT PER HEXAGON IN GOOGLE EARTH
  31. 31. SPATIAL DISTRIBUTION DENSITY-BASED CLUSTERING
  32. 32. SPATIAL DISTRIBUTION: DENSITY-BASED CLUSTERING DBSCAN: Density-Based Spatial Clustering of Applications with Noise • Detects clusters with different shapes and sizes • Not sensitive to noise, so very suitable for geosocial data • Eps: radius of the search area • MinPts: minimum number of points in the neighbourhood [Diagram labels: Eps, Noise, MinPts = 4]
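To make the roles of Eps and MinPts concrete, here is a compact, unoptimised sketch of DBSCAN in plain Python. It assumes points in projected coordinates and uses a brute-force neighbour search, which is fine for illustration but too slow for hundreds of thousands of photos; it is not the implementation used in the study.

```python
import math
from typing import List, Optional, Sequence, Tuple

NOISE = -1


def dbscan(points: Sequence[Tuple[float, float]], eps: float,
           min_pts: int) -> List[int]:
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    labels: List[Optional[int]] = [None] * len(points)

    def neighbours(i: int) -> List[int]:
        # Brute-force search; the point itself counts towards min_pts.
        return [j for j, q in enumerate(points)
                if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:  # core point: keep expanding
                queue.extend(reachable)
    return labels


# Two dense groups and one isolated point (made-up coordinates).
pts = [(0, 0), (0, 1), (1, 0), (1, 1),
       (10, 10), (10, 11), (11, 10), (11, 11),
       (50, 50)]
labels = dbscan(pts, eps=1.5, min_pts=4)
```

With these settings the two groups of four points each form a cluster and the isolated point is labelled as noise.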
  33. 33. SPATIAL DISTRIBUTION DENSITY-BASED CLUSTERING
  34. 34. SPATIAL DISTRIBUTION DENSITY-BASED CLUSTERING
  35. 35. SPATIAL DISTRIBUTION DENSITY-BASED CLUSTERING
  36. 36. SPATIAL DISTRIBUTION DENSITY-BASED CLUSTERING
  37. 37. TOURISTIC ROUTES
  38. 38. TOURISTIC ROUTES: ONE DAY IN THE LIFE OF A TOURIST
  39. 39. TOURISTIC ROUTES: LINEAR TRAJECTORIES OF MANY TOURISTS
  40. 40. TOURISTIC ROUTES: LINEAR TRAJECTORIES BETWEEN CLUSTERS
  41. 41. TOURISTIC ROUTES: RELATING TRAJECTORIES TO STREET NETWORK USING ROUTING ALGORITHM [Figure panels: As the crow flies; Trajectory over network]
  42. 42. TOURISTIC ROUTES STEP 1: CREATE A SIMPLIFIED PEDESTRIAN NETWORK [Figure panels: Original; Aggregate road links; Densify road links]
  43. 43. TOURISTIC ROUTES: TOURISTS TAKE THE MOST POPULAR ROUTES
  44. 44. TOURISTIC ROUTES STEP 2: REDUCE TRAVEL COST PER ROAD SEGMENT BASED ON PHOTO DENSITY [Diagram values: 2.6, 1.9, 1.4, 4.2, 3.1, 1.8, 6.9, 6.2, 4.1, 7.3, 9.3, 9.6]
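The slides do not give the formula used to reduce travel cost, so the following is only one possible weighting, shown to illustrate the idea: a segment's cost starts at its length and is discounted in proportion to how many photos were taken along it. The function name, the linear form, and the `max_discount` parameter are all assumptions.

```python
def reduced_cost(length_m: float, photo_count: int, max_photo_count: int,
                 max_discount: float = 0.5) -> float:
    """Hypothetical weighting: discount a segment's length-based cost by photo density.

    The segment with the most photos receives the full `max_discount`;
    a segment without photos keeps its original cost.
    """
    if max_photo_count <= 0:
        return length_m
    share = min(photo_count, max_photo_count) / max_photo_count
    return length_m * (1.0 - max_discount * share)
```

With such a weighting, a routing algorithm that minimises total cost is drawn towards streets where many tourists took photos, which matches the assumption stated on the previous slide that tourists take the most popular routes.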
  45. 45. TOURISTIC ROUTES STEP 3: CREATE PHOTO PAIRS FOR ROUTING 1. Create pairs of time-ordered photo locations per user: (Point A, Point B), (Point B, Point C), … 2. Calculate distance, time interval and speed per photo pair 3. Select all photo pairs within thresholds: • Distance > 50 m and < 750 m • Time interval > 0 sec and < 600 sec • Speed > 1 km/h and < 5 km/h 4. Calculate closest network node for start and end of every pair
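Steps 1 to 3 above can be sketched as follows, using the thresholds from the slide. The input layout (tuples of user ID, timestamp, latitude, longitude), the haversine distance, and the function names are assumptions; the study itself worked in a spatial database, and step 4 (snapping to the network) is not covered here.

```python
import math
from datetime import datetime
from itertools import groupby
from typing import List, Tuple

Photo = Tuple[str, datetime, float, float]  # (user_id, timestamp, lat, lon)


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres on a sphere of radius 6,371 km."""
    radius = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * radius * math.asin(math.sqrt(a))


def photo_pairs(photos: List[Photo]) -> List[Tuple[Photo, Photo, float, float, float]]:
    """Return (start, end, distance_m, seconds, speed_kmh) for pairs within thresholds."""
    pairs = []
    ordered = sorted(photos, key=lambda p: (p[0], p[1]))  # per user, by time
    for _, group in groupby(ordered, key=lambda p: p[0]):
        sequence = list(group)
        for a, b in zip(sequence, sequence[1:]):  # subsequent photos only
            distance = haversine_m(a[2], a[3], b[2], b[3])
            seconds = (b[1] - a[1]).total_seconds()
            if not (50 < distance < 750 and 0 < seconds < 600):
                continue
            speed_kmh = distance / seconds * 3.6
            if 1 < speed_kmh < 5:
                pairs.append((a, b, distance, seconds, speed_kmh))
    return pairs


# Made-up photos: u1 walks about 200 m in 5 minutes, then "jumps" 500 m in 30 s.
sample = [
    ("u1", datetime(2013, 7, 1, 12, 0, 0), 52.3730, 4.8930),
    ("u1", datetime(2013, 7, 1, 12, 5, 0), 52.3748, 4.8930),
    ("u1", datetime(2013, 7, 1, 12, 5, 30), 52.3793, 4.8930),
    ("u2", datetime(2013, 7, 1, 12, 0, 0), 52.3700, 4.9000),
]
pairs = photo_pairs(sample)
```

The speed window keeps pairs that plausibly correspond to walking and drops both stationary photographers and pairs covered by tram, bike or car.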
  46. 46. TOURISTIC ROUTES STEP 4: CALCULATE ROUTES AND AGGREGATE INTO ROUTE DENSITY MAP 1. Calculate route for 6,477 photo pairs with pgRouting 2. Aggregate and count overlaying route segments 3. Visualize touristic route densities
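The study computed routes with pgRouting inside the database. As a language-neutral illustration of the same two steps (shortest route per photo pair, then counting overlapping segments), here is a small sketch with a textbook Dijkstra search over an in-memory graph. The graph, its costs, and the function names are made up for the example.

```python
import heapq
from collections import Counter
from typing import Dict, Hashable, Iterable, List, Optional, Tuple

Graph = Dict[Hashable, Dict[Hashable, float]]  # node -> {neighbour: cost}


def shortest_path(graph: Graph, start, goal) -> Optional[List]:
    """Dijkstra's algorithm; returns the node sequence or None if unreachable."""
    best = {start: 0.0}
    previous = {}
    heap = [(0.0, start)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == goal:
            path = [node]
            while node in previous:
                node = previous[node]
                path.append(node)
            return path[::-1]
        if cost > best.get(node, float("inf")):
            continue  # stale heap entry
        for neighbour, weight in graph.get(node, {}).items():
            new_cost = cost + weight
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                previous[neighbour] = node
                heapq.heappush(heap, (new_cost, neighbour))
    return None


def route_density(graph: Graph, pairs: Iterable[Tuple]) -> Counter:
    """Count how many routes use each (undirected) network segment."""
    counts = Counter()
    for start, goal in pairs:
        path = shortest_path(graph, start, goal)
        if path is None:
            continue
        for a, b in zip(path, path[1:]):
            counts[tuple(sorted((a, b)))] += 1
    return counts


# Made-up network with four nodes; costs could come from length and photo density.
network = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 1, "D": 5},
    "C": {"A": 4, "B": 1, "D": 1},
    "D": {"B": 5, "C": 1},
}
density = route_density(network, [("A", "D"), ("B", "D"), ("A", "C")])
```

The resulting counts per segment are what a route density map visualises, for example by line width or colour.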
  47. 47. TOURISTIC CLUSTERS AND ROUTES: VALIDATION OF RESULTS Problem: No comparable quantitative data available Solution: Expert judgement by a questionnaire Participants: 8 tourism experts from different departments of the municipality of Amsterdam
  48. 48. TOURISTIC ROUTES: VALIDATION OF RESULTS BY 8 TOURISM EXPERTS [Map annotations] Match: 75%, 38%, 75%, 100%, 100%, 63%, 100%, 67%, 67%, 100%, 100%, 100% WITH HIGH CONFIDENCE (5/5)3
  49. 49. TOURISTIC CLUSTERS AND ROUTES: VALIDATION OF RESULTS Columns: Expert # | Profession | Validity of results [1-5] (*) | Usefulness of results [1-5] (**) • 1 | Policy Advisor Traffic & Public Space | 4 | 5 • 2 | Data Analyst, Information and Statistics | 4 | 4 • 3 | Senior Advisor Traffic Management | 4 | 4 • 4 | Researcher, Information and Statistics | 3 | 4 • 5 | Senior Advisor Traffic Research | 5 | 4 • 6 | Urban Planner | 5 | 5 • 7 | Urban Planner | 4 | 5 • 8 | Urban Designer | 4 | 5 • Mean | 4.1 | 4.5 (*) How well do the study outcomes resemble the real world? (**) Are the study outcomes useful for you or for your organization?
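The two summary values on this slide are consistent with the arithmetic means of the eight expert scores, rounded to one decimal. A short check, with the scores copied from the table above (the variable names are illustrative):

```python
from statistics import mean

# Scores per expert (1 to 8), as listed on the slide.
validity = [4, 4, 4, 3, 5, 5, 4, 4]
usefulness = [5, 4, 4, 4, 4, 5, 5, 5]

mean_validity = round(mean(validity), 1)      # 33 / 8 = 4.125
mean_usefulness = round(mean(usefulness), 1)  # 36 / 8 = 4.5
```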
  50. 50. SUGGESTIONS FOR FUTURE WORK AND POTENTIAL THESIS TOPICS • Calibrate thresholds with quantitative data • Extensive validation of results in cooperation with tourism experts • Cooperate with the municipality to define objectives. Some suggestions: additional data sources (Instagram, Twitter, Sina Weibo); split the spatial distributions into different temporal intervals; compare the spatial distributions of locals and tourists; split the spatial distributions by nationality; use the presented patterns as input for an agent-based model; discover typical tourism problems with other geosocial data types
  51. 51. THANK YOU FOR YOUR ATTENTION! ANY QUESTIONS OR REMARKS?
