Do photo sharing websites represent a sufficient database to aid in national map updating or change detection?
1. Do photo sharing websites represent a sufficient database to aid in national map updating or change detection? Vyron Antoniou, Jeremy Morley, Muki Haklay Department of Civil, Environmental and Geomatic Engineering
7. Popular tiles (>15 photos) Flickr Geograph ≈ 5.1 % of the total area 3.7 % of the total area
8. Data flow in popular tiles: For both sources, about 3,800 of the popular tiles had geotagged photos submitted for three consecutive semesters. This is approximately 1.6% of the total area.
16. What is the spatial distribution of changes? [Diagram: Changes compared (=, ≠ or ≈) with Explicit coverage or Implicit popular tiles]
17. User Behaviour. Time between capturing and uploading a photograph
18. Conclusions
 - Two types of sources: explicit and implicit
 - Implicit: huge volumes of data, poor spatial coverage
 - Explicit: moderate data volume, sufficient spatial coverage
 - We need the amount of data of a spatially implicit source and the spatial coverage of an explicit source
 - Spatially speaking … no Long-Tail in implicit Web 2.0 apps
 - We need to understand the pattern of changes on the ground
What is the challenge? The phenomenon of user-generated content, and the increased presence of geographic information in such applications, has motivated researchers to use them for geographic information retrieval. Can we do that, and to what extent?
Early in our research we saw that there are two types of available sources. Spatially implicit sources encourage their users to upload any kind of photo, geotagged or not, without any spatial constraints. Spatially explicit sources encourage their users to interact directly with, and capture, spatial entities. We examined user contribution in different photo-sharing applications for the UK.
We used the UK National Grid to collect geotagged photos in January 2009. It is important to note the total number of photos for each source and the maximum number of photos submitted to a single 1 km² tile.
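As an illustration of the per-tile counting described above, here is a minimal sketch. It assumes photo locations are already available as easting/northing pairs in metres; the function names and the sample coordinates are hypothetical and are not taken from the study.

```python
from collections import Counter

TILE_SIZE_M = 1000  # 1 km x 1 km tiles, as in the grid described above


def tile_of(easting_m, northing_m, tile_size=TILE_SIZE_M):
    """Return the (column, row) index of the tile that contains a point."""
    return (int(easting_m // tile_size), int(northing_m // tile_size))


def count_photos_per_tile(points):
    """Count photos per tile from an iterable of (easting, northing) pairs."""
    return Counter(tile_of(e, n) for e, n in points)


# Hypothetical capture locations in metres (illustrative only, not real data).
sample_points = [
    (530100.0, 180250.0),
    (530900.0, 180999.0),
    (531000.0, 180000.0),
]

counts = count_photos_per_tile(sample_points)
total_photos = sum(counts.values())   # total number of photos for the source
max_per_tile = max(counts.values())   # maximum number of photos in one tile
```

The two summary values at the end correspond to the figures the slide asks the audience to note: the total per source and the maximum per tile.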
Here is the percentage of tiles versus the number of photos for each source. Notably, with the exception of Geograph, all sources have no photos submitted for almost 80% of the study area.
Putting those numbers on a map, we can see the spatial distribution for each source. It is clear that only Geograph provides good coverage of the UK, and that the other three sources behave similarly.
At the next level we associated our photo data with population data. Here the purple colour shows where the number of photos submitted is greater than expected according to the population data. The cyan colour shows the opposite: where there are fewer photos than expected given the number of people who live in that area. It is interesting to note that in London, for example, apart from the centre of the city, fewer photos were submitted than expected.
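The notes do not say how the expected number of photos was derived. One simple possibility, shown here purely as an assumption, is to distribute the total number of photos across areas in proportion to population and then compare the observed count with that share. All names and figures below are hypothetical, and the sketch assumes a non-zero total population.

```python
def expected_photos(population, total_photos):
    """Share out total_photos across areas in proportion to population."""
    total_population = sum(population.values())
    return {
        area: total_photos * people / total_population
        for area, people in population.items()
    }


def classify_areas(observed, population):
    """Label each area by comparing observed and expected photo counts."""
    total_photos = sum(observed.get(area, 0) for area in population)
    expected = expected_photos(population, total_photos)
    labels = {}
    for area, exp in expected.items():
        obs = observed.get(area, 0)
        if obs > exp:
            labels[area] = "more than expected"
        elif obs < exp:
            labels[area] = "fewer than expected"
        else:
            labels[area] = "as expected"
    return labels


# Hypothetical figures (illustrative only): a small centre and a larger suburb.
population = {"centre": 1000, "suburb": 3000}
observed = {"centre": 90, "suburb": 10}

labels = classify_areas(observed, population)
```

In this toy case the centre receives far more photos than its population share would suggest and the suburb far fewer, which mirrors the pattern described for London.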
Going one step further, we looked into how many 1 km² tiles have more than 15 photos submitted to them. The findings are not that encouraging: with that threshold, only 5% of the UK is covered by Geograph and less than 4% by Flickr. Here we see that the spatial distribution patterns of Geograph and Flickr are more similar. Some clusters are similar (green) and others are not (red).
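The threshold step can be sketched as follows, assuming a mapping from tile identifiers to photo counts and a known total number of tiles in the study area. The function name and the sample counts are hypothetical.

```python
POPULAR_THRESHOLD = 15  # a tile is "popular" when it has more than 15 photos


def popular_share(counts, total_tiles, threshold=POPULAR_THRESHOLD):
    """Return the fraction of all tiles whose photo count exceeds threshold."""
    popular = sum(1 for c in counts.values() if c > threshold)
    return popular / total_tiles


# Hypothetical counts for four tiles in a study area of 20 tiles in total.
sample_counts = {"A": 16, "B": 15, "C": 40, "D": 1}
share = popular_share(sample_counts, total_tiles=20)
```

Note that a tile with exactly 15 photos is not counted as popular here, following the ">15 photos" wording on the slide.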
By examining the data flow to those popular tiles, we saw that the trend has started to stabilize. Although this stabilization might be a temporary phenomenon, the important thing is that only 1.6% of the UK has a constant flow of data for three consecutive semesters.
At the end of July we collected some more data from Flickr.
The left image shows the differences recorded in each tile. We see that about 11% of the UK was active in terms of geotagged photo submission (or deletion). The right image shows the tiles that had no photos submitted earlier and now do. Only for an extra 1% of the UK have more than 3 photos been submitted over a period of 6 months.
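A comparison of two collection snapshots like the one described above can be sketched as below. This assumes per-tile photo counts are held as dictionaries keyed by tile index; the function names and sample values are hypothetical.

```python
def tile_changes(before, after):
    """Return the net change in photo count for every tile that changed."""
    changes = {}
    for tile in set(before) | set(after):
        delta = after.get(tile, 0) - before.get(tile, 0)
        if delta != 0:  # positive = net submissions, negative = net deletions
            changes[tile] = delta
    return changes


def newly_covered(before, after, more_than=3):
    """Tiles with no photos before that now have more than `more_than`."""
    return {
        tile
        for tile, count in after.items()
        if before.get(tile, 0) == 0 and count > more_than
    }


# Hypothetical snapshots (illustrative only).
january = {(1, 1): 5, (1, 2): 2}
july = {(1, 1): 5, (1, 2): 1, (2, 2): 4, (3, 3): 2}

active = tile_changes(january, july)
new_tiles = newly_covered(january, july)
```

Because only net differences are compared, a tile where submissions and deletions cancel out exactly would not be flagged as active in this sketch.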
So, the overall pattern of the spatial distribution and the intensity of the phenomenon did not actually change over a period of 6 months.
In our next step we looked at the phenomenon at a larger scale for 15 test areas in the UK. This is one of these test areas, 15 km² at Hampstead Heath in north London. This is a point-density surface created from the capture locations of about 8,000 photos available in Flickr. We can see a few intense clusters in the popular places of the area. This is a point-density surface created from the capture locations of about 1,100 photos available in Geograph. We can see a very different pattern: the photos cover the test area more adequately and are not confined to just a few popular spots.
Nevertheless, the true importance of the implicit and explicit sources will emerge by examining where changes take place on the ground. This is the next step on which we will focus our research in the future.
Finally, we looked into the behaviour of users participating in Flickr and Geograph by examining a sample of 50,000 photos from Flickr and 10,000 photos from Geograph. This gives an indication of how up to date the data available in these sources is. Firstly, we see that user behaviour is similar for both sources, and secondly that 70% of the photos for Geograph and 85% for Flickr are published within 6 months of their capture.
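The capture-to-upload delay can be summarised along these lines. This is a minimal sketch that assumes each photo record carries a capture date and an upload date, and that approximates 6 months as 183 days; both the record layout and the sample dates are hypothetical.

```python
from datetime import date

SIX_MONTHS_DAYS = 183  # assumption: 6 months approximated as 183 days


def share_uploaded_within(records, max_days=SIX_MONTHS_DAYS):
    """Fraction of photos uploaded at most max_days after capture.

    `records` is a non-empty list of (captured, uploaded) date pairs.
    """
    lags = [(uploaded - captured).days for captured, uploaded in records]
    within = sum(1 for lag in lags if 0 <= lag <= max_days)
    return within / len(lags)


# Hypothetical (captured, uploaded) pairs (illustrative only).
sample_records = [
    (date(2008, 1, 1), date(2008, 1, 2)),    # uploaded the next day
    (date(2008, 1, 1), date(2008, 3, 1)),    # uploaded after two months
    (date(2008, 1, 1), date(2008, 12, 31)),  # uploaded after a year
    (date(2007, 6, 1), date(2007, 6, 1)),    # uploaded the same day
]

share = share_uploaded_within(sample_records)
```

Records with a negative lag (an upload date earlier than the capture date, usually a camera clock error) are not counted as within the window in this sketch.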