How to find 50,000 maps in a haystack of 1,000,000 images, geolocate them, and categorise them ... on a budget of few or no euros.
The collection of 1,000,000 images extracted by the British Library from 19th-century books is a wonderful resource, but one that Wikimedia Commons felt it could not accept other than through exhaustive hand-uploading: without good image-level metadata about the subject of each image, the images could not be categorised and so would simply not be discoverable. This talk describes a joint BL/Wikimedia initiative to go through the images systematically, which discovered 50,000 maps in eight weeks.
In the second stage of the process, just getting under way, crowd geolocation of these map images is making it possible to use automated tools to group, organise and categorise them in different ways. The aim is to upload them to Commons with a full provisional categorisation, the key step in making them valuable and reusable.
(25-minute talk given at GlamWiki 2015 in The Hague)