Unknown knowns, long tails, and long data
@rdmpage
http://iphylo.blogspot.com
Stressors and Drivers of Food Security: Evidence from Scientific Collections
Known knowns: We know this
Known unknowns: We know we don’t know this
Unknown unknowns: We don’t know that we don’t know this
Unknown knowns: We know this, but we don’t know that
Biological Diversity in the Patent System
Paul Oldham, Stephen Hall, Oscar Forero
PLoS ONE http://dx.doi.org/10.1371/journal.pone.0078737
“…human innovative activity involving
biodiversity in the patent system focuses on
approximately 4% of taxonomically described…”
@junglepaul
BHL and GBIF as biomedical databases
http://iphylo.blogspot.co.uk/2012/03/bhl-and-gbif-as-biomedical-databases.html
PubMed (disease)
BioStor (publication)
GBIF (specimen)
Summary
• Open access literature is a potential goldmine
of information (long data, long tail)
• Text mining for entities (scientific names,
places, specimens, attributes); search is still
the killer app
• Linking things together (unknown knowns)
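
The last two points can be sketched in a few lines of Python. This is only an illustration, not the method behind the talk: the regular expression is a deliberately naive stand-in for a real name-finding tool, and the record identifiers and texts are hypothetical. The sketch pulls candidate Latin binomials out of free text, then keeps the names that occur in more than one source.

```python
import re
from collections import defaultdict

# Naive pattern for a Latin binomial: capitalised genus, lowercase epithet.
# Assumption: a real pipeline would use a dedicated name-finding service.
BINOMIAL = re.compile(r"\b([A-Z][a-z]{2,})\s([a-z]{3,})\b")


def find_names(text):
    """Return the set of candidate binomials found in a piece of text."""
    return {f"{genus} {epithet}" for genus, epithet in BINOMIAL.findall(text)}


def link_records(records):
    """Index (source, identifier, text) records by name.

    Only names seen in at least two different sources are returned,
    which also drops many of the regex's false positives.
    """
    index = defaultdict(set)
    for source, identifier, text in records:
        for name in find_names(text):
            index[name].add((source, identifier))
    return {
        name: hits
        for name, hits in index.items()
        if len({source for source, _ in hits}) > 1
    }


# Hypothetical records standing in for a disease paper, a publication,
# and a specimen label.
records = [
    ("pubmed", "hypothetical-1", "Hantavirus detected in Rattus rattus from urban sites."),
    ("biostor", "hypothetical-2", "A revision of Rattus rattus and allied forms."),
    ("gbif", "hypothetical-3", "Rattus rattus, collected 1921."),
]

linked = link_records(records)
for name, hits in linked.items():
    print(name, sorted(hits))
```

The pattern also matches ordinary phrases such as "Hantavirus detected", so requiring a match in a second source is a crude filter rather than a guarantee; it only shows how a shared name can connect records that were never explicitly linked.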