Thanks for joining us today – to my knowledge this is the Natural History Museum’s first hackathon based on its specimen data!
80M awesome objects - the aim is to digitise them in under a lifetime
Default policy is openness - data and images are going on the portal
Hopefully 3D at some point!
The Natural History Open Data Challenge @ OTA16
Diverse collections spanning
space and time
Challenge of scale:
>80 million specimens!
Challenge of speed
(digitising within a lifetime)
“open by default”
Scientific name: Thymelicus lineola (Ochsenheimer, 1808)
Locality: Tilbury Docks
Country: United Kingdom
Decimal latitude: 51.4605
Decimal longitude: 0.3449
Recorded by: T G. Howarth; Howarth
Collection date: 31 / 07 / 1938
Most iCollections specimens will have ~30 fields containing data
(over 100 different fields across all collections)
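As a rough illustration of working with a record like the one above, here is a minimal Python sketch that turns the example values into typed fields. The dictionary keys simply reuse the labels from the slide, and `parse_record` and its output key names are hypothetical; the actual field names in the portal data may differ.

```python
from datetime import date, datetime

# The example record from the slide, keyed by the labels shown there.
# (Assumption: the real export may use different field names.)
record = {
    "Scientific name": "Thymelicus lineola (Ochsenheimer, 1808)",
    "Locality": "Tilbury Docks",
    "Country": "United Kingdom",
    "Decimal latitude": "51.4605",
    "Decimal longitude": "0.3449",
    "Recorded by": "T G. Howarth; Howarth",
    "Collection date": "31 / 07 / 1938",
}


def parse_record(raw):
    """Convert the string values of one record into typed values."""
    # The date on the slide is day / month / year, separated by " / ".
    collected = datetime.strptime(raw["Collection date"], "%d / %m / %Y").date()
    # "Recorded by" can hold several names separated by semicolons.
    collectors = [n.strip() for n in raw["Recorded by"].split(";") if n.strip()]
    return {
        "scientific_name": raw["Scientific name"],
        "locality": raw["Locality"],
        "country": raw["Country"],
        "latitude": float(raw["Decimal latitude"]),
        "longitude": float(raw["Decimal longitude"]),
        "collectors": collectors,
        "collected": collected,
    }


parsed = parse_record(record)
print(parsed["collectors"], parsed["collected"].year)
```

Note that splitting "Recorded by" on semicolons yields both "T G. Howarth" and "Howarth", so the same person may appear under more than one spelling.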
There are some issues…
(where is H. M. Edelsten!?)
How did collecting effort change over time?
Which collector collected from the most distinct localities? Can we make a ranking table and mash up the data with Wikipedia or other sources?
What can we learn about the collectors – who travelled the furthest or most regularly?
Were most specimens collected in rural areas? Is there collection bias in particular counties?
How can we make the data more attractive to different audiences?
How could we display the data in more engaging or informative ways?
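To give a feel for how two of the questions above might be approached, here is a small Python sketch that ranks collectors by number of distinct localities and counts specimens per year. Only the first row comes from the example record; the "Collector A/B" and "Locality 1/2" rows are hypothetical placeholders, and the function names `rank_collectors` and `specimens_per_year` are made up for this illustration.

```python
from collections import Counter, defaultdict

# (collector, locality, year) tuples.
# Only the first row is from the example record; the rest are
# hypothetical placeholders, not real specimen data.
records = [
    ("T G. Howarth", "Tilbury Docks", 1938),
    ("Collector A", "Locality 1", 1938),
    ("Collector A", "Locality 2", 1939),
    ("Collector A", "Locality 2", 1939),
    ("Collector B", "Locality 1", 1950),
]


def rank_collectors(rows):
    """Rank collectors by how many distinct localities they collected from."""
    localities = defaultdict(set)
    for collector, locality, _year in rows:
        localities[collector].add(locality)
    counts = ((name, len(places)) for name, places in localities.items())
    # Highest count first; ties are broken alphabetically by name.
    return sorted(counts, key=lambda pair: (-pair[1], pair[0]))


def specimens_per_year(rows):
    """Count specimens per collection year, as a simple proxy for effort."""
    per_year = Counter(year for _collector, _locality, year in rows)
    return dict(sorted(per_year.items()))


ranking = rank_collectors(records)
effort = specimens_per_year(records)
print(ranking)
print(effort)
```

Collector names would probably need cleaning first, since variant spellings of one person are otherwise counted as separate collectors.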