
Mapping the Maps (Europeana Tech, Feb 12, 2015 - Ignite talk)

How to find 50,000 maps in a haystack of 1,000,000 images; geolocate them, and categorise them ... on a budget of no or not many euros.

The 1,000,000-image collection extracted by the British Library from 19th-century books is a wonderful resource, but one that Wikimedia Commons felt it could not accept other than through exhaustive hand-uploading: without good image-level metadata about the subject of each image, the images could not be categorised and so would simply not be discoverable. This talk describes a joint BL/Wikimedia initiative to go through the images systematically, which found 50,000 maps in eight weeks. Geo-locating these map images then makes it possible to use automated tools to group, organise and categorise them in different ways, the key step in making them valuable and reusable.

(5-minute "ignite" talk, given at the start of Europeana Tech 2015)
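
As a rough illustration of how co-ordinates make grouping automatic, the sketch below assigns a georeferenced map to the smallest region whose bounding box contains it, and buckets it by ground extent as a crude stand-in for scale. The `BBox` type, the region boxes in `REGIONS`, and the 50 km and 2,000 km thresholds are all hypothetical values chosen for this example; it is not the method used by the Georeferencer or by the project's own tools.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt


@dataclass(frozen=True)
class BBox:
    """Geographic bounding box in decimal degrees (hypothetical helper type)."""
    west: float
    south: float
    east: float
    north: float

    def contains(self, other: "BBox") -> bool:
        return (self.west <= other.west and self.east >= other.east
                and self.south <= other.south and self.north >= other.north)

    def area(self) -> float:
        # Area in square degrees; only used to rank regions by size.
        return (self.east - self.west) * (self.north - self.south)


# Assumption: very rough illustrative boxes, not authoritative boundaries.
REGIONS = {
    "Europe": BBox(-25.0, 34.0, 45.0, 72.0),
    "British Isles": BBox(-11.0, 49.5, 2.0, 61.0),
    "North America": BBox(-170.0, 7.0, -50.0, 84.0),
    "South America": BBox(-82.0, -56.0, -34.0, 13.0),
    "Africa": BBox(-18.0, -35.0, 52.0, 38.0),
    "Asia": BBox(26.0, -11.0, 180.0, 78.0),
}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


def classify_region(bbox: BBox) -> str:
    """Return the smallest region that fully contains the map, else 'misc'."""
    candidates = [name for name, box in REGIONS.items() if box.contains(bbox)]
    if not candidates:
        return "misc"
    return min(candidates, key=lambda name: REGIONS[name].area())


def extent_bucket(bbox: BBox) -> str:
    """Bucket a map by the length of its diagonal (thresholds are arbitrary)."""
    diagonal = haversine_km(bbox.south, bbox.west, bbox.north, bbox.east)
    if diagonal < 50:
        return "city-sized"
    if diagonal < 2000:
        return "country-sized"
    return "continent-sized"


london = BBox(-0.3, 51.4, 0.1, 51.6)
france = BBox(-5.0, 42.0, 8.5, 51.5)
print(classify_region(london), extent_bucket(london))
print(classify_region(france), extent_bucket(france))
```

Because the smallest containing box wins, a sheet covering London is filed under the British Isles rather than Europe, while anything that does not sit wholly inside one region falls back to "misc" for a human to look at.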

  1. Case Study: Mapping the Maps. How to find 50,000 maps in a haystack of 1,000,000 images; geolocate them, and categorise them ... on a budget of no or not many euros. James Heald, Wikimedia volunteer (@heald_j); Kimberly Kowal, British Library (Kimberly.Kowal@bl.uk)
  2. 1,000,000 images. Fantastic, but …
  3. Very limited metadata. Wikimedia said no bulk upload.
  4. Volunteer response… Create a subject index by book…
  5. … encouraging images to be uploaded by the book (20,000 so far, mostly by one man)
  6. … however, manual categorisation of images is very, very time-consuming.
  7. Could anything be done more automatically…
  8. Could anything be done more automatically… Maps: natural classification, given co-ordinates
  9. So: find the maps on Flickr, and tag them…
  10. … using the index to drive the process (31 Oct)
  11. … using the index to drive the process (31 Oct)
  12. … using the index to drive the process (31 Oct)
  13. … using the index to drive the process (03 Nov)
  14. … using the index to drive the process (17 Dec)
  15. … using the index to drive the process (19 Dec)
  16. But how many maps were there? (Oct 31)
  17. But how many maps were there? (Oct 31)
  18. But how many maps were there? (Nov 2)
  19. But how many maps were there? (Nov 7)
  20. But how many maps were there? (Nov 14)
  21. But how many maps were there? (Dec 1)
  22. But how many maps were there? (Dec 10)
  23. But how many maps were there? (Dec 17)
  24. But how many maps were there? (Dec 28)
  25. 50,000 maps in all, including 20,000 found independently by @Quasimondo, machine-assisted using his own pattern recognition methods:

                        totals   classmark index   detailed index
                        ------   ---------------   --------------
      misc               16074             14091             1983
      Europe             13136              6254             6882
      British Isles       7191               269             6922
      North America       6758              1524             5234
      USA                 5782              1209             4573
      Asia                2736              1280             1456
      Africa              2300              1075             1225
      South America        895               659              236

  26. Next step: geo-location, using the Klokan/BL Georeferencer (free alternatives are also available)
  27. Next step: 10x more images than the BL has ever attempted before
  28. Success allows the old map to be laid over the top of a modern one
  29. Pilot run of 3,000 completed
  30. Pilot run of 3,000 completed. Now characterised by location …
  31. ... and scale
  32. All that is needed to identify individual continents …
  33. … countries …
  34. … nation … nations …
  35. … cities …
  36. … and beyond.
  37. Ready to be uploaded to Wikimedia
  38. Ready to be uploaded to Wikimedia … using Europeana’s GlamWiki Uploader
  39. Ready to be uploaded to Wikimedia … using Europeana’s GlamWiki Uploader. THANK YOU, Europeana!
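
The regional breakdown on slide 25 can be sanity-checked with a few lines of arithmetic. The sketch below assumes the three numeric columns are, in order, the total, the count found via the classmark index, and the count found via the detailed index (this reading of the slide's header is an assumption). It checks that the two index counts add up to each row's total and works out what share of each region's maps came from the detailed index. The names `ROWS`, `inconsistent_rows` and `detailed_share` are made up for the example.

```python
# Figures copied from slide 25: (total, classmark index, detailed index).
# The column interpretation is an assumption based on the slide's header.
ROWS = {
    "misc": (16074, 14091, 1983),
    "Europe": (13136, 6254, 6882),
    "British Isles": (7191, 269, 6922),
    "North America": (6758, 1524, 5234),
    "USA": (5782, 1209, 4573),
    "Asia": (2736, 1280, 1456),
    "Africa": (2300, 1075, 1225),
    "South America": (895, 659, 236),
}


def inconsistent_rows(rows):
    """Return the regions whose two index counts do not sum to the total."""
    return [name for name, (total, classmark, detailed) in rows.items()
            if classmark + detailed != total]


def detailed_share(rows):
    """Fraction of each region's maps that came from the detailed index."""
    return {name: round(detailed / total, 2)
            for name, (total, _classmark, detailed) in rows.items()}


print(inconsistent_rows(ROWS))
for region, share in detailed_share(ROWS).items():
    print(f"{region:<15}{share:.2f}")
```

The slide does not say whether rows such as USA and North America overlap, so the sketch deliberately avoids summing the rows into a grand total.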
