Liberating OA figures from PDF to Flickr (A Pro-iBiosphere talk)
Liberating OA Figures
– Content Discovery
– Innovation (seriously)
PDFs are 'electronic paper' –
we can do better than this
● Free-to-use platform (free as in beer; it's not open)
● One terabyte of free storage per account
● Highly popular platform for image sharing
(in top 100 most frequently visited websites of the world)
● Supports Creative Commons licensing (many platforms don't)
● Feature-rich, good UI, useful API, etc.
– to extract just images from PDFs
– to embed appropriate metadata in the images
– e.g. the Publisher
– the Authors
– the paper Title and DOI
– the figure Caption Text
– the licence under which the image is available for use
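The image-extraction step can be sketched as follows. This is a minimal sketch that assumes the Poppler command-line tool pdfimages is installed; the slides do not say which extractor was actually used, and the function names are hypothetical. Note that pdfimages only recovers embedded raster images, so figures drawn as vector graphics would need a different approach.

```python
import subprocess
from pathlib import Path


def build_pdfimages_command(pdf_path, output_prefix):
    """Build the pdfimages call (assumed tool, not confirmed by the talk)."""
    # -png writes PNG files instead of the default PBM/PPM formats
    return ["pdfimages", "-png", str(pdf_path), str(output_prefix)]


def extract_images(pdf_path, output_dir):
    """Extract embedded images from one PDF and return the files written."""
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    # pdfimages names its output <prefix>-000.png, <prefix>-001.png, ...
    prefix = output_dir / Path(pdf_path).stem
    subprocess.run(build_pdfimages_command(pdf_path, prefix), check=True)
    return sorted(output_dir.glob(f"{prefix.name}-*.png"))
```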
● Open Access facilitates and empowers re-use
– BOAI-compliant OA papers are typically licensed under CC BY or CC0.
● Flickr free accounts are ad-supported
● Advertising is a commercial endeavour – it generates money
● Thus CC BY-NC or CC BY-NC-ND content for which you are not the
original copyright holder cannot be reposted to your free Flickr account
Re-posting to Flickr is only possible if content
is openly licensed. Licensing details are important!
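The licensing rule above can be expressed as a small check run before any upload. This is an illustrative sketch only: the function name is hypothetical, and the allowlist is deliberately limited to the two licences named in this talk (CC BY and CC0), so anything else, including unknown licence strings, is refused.

```python
# Licences named in the talk as safe to repost; an assumption-free allowlist.
REPOSTABLE_LICENCES = {"CC0", "CC BY"}


def normalise_licence(licence):
    """Reduce strings like 'cc-by 4.0' or 'CC BY-NC-ND' to a canonical form."""
    tokens = licence.upper().replace("-", " ").split()
    # Drop version numbers such as '4.0' or '1.0'
    tokens = [t for t in tokens if not t.replace(".", "").isdigit()]
    return " ".join(tokens)


def can_repost_to_free_flickr(licence):
    """True only for licences that allow reposting on an ad-supported account."""
    return normalise_licence(licence) in REPOSTABLE_LICENCES
```

For example, `can_repost_to_free_flickr("CC BY 4.0")` is true, while `can_repost_to_free_flickr("CC BY-NC")` is false.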
A Flickr 'album' for the figure content of each paper.
Clark, J.L. & Mora, M.M. (2014) Nautilocalyx erytranthus (Gesneriaceae), a new species
from Northwestern Amazonia. Phytotaxa. Licensed under CC BY
Using Content Negotiation for CrossRef DOIs you can get the full citation details of the
paper from the DOI.
But ONLY if the publisher has done their job and registered the DOI with CrossRef.
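A content-negotiation lookup might look like the sketch below. It assumes the general resolver at https://doi.org/ and the `text/x-bibliography` media type for a formatted plain-text citation; the function names are hypothetical, and this is not necessarily how the project itself made its requests. A DOI that was never registered is expected to come back as HTTP 404, which is treated here as "no citation available".

```python
import urllib.error
import urllib.parse
import urllib.request

DOI_RESOLVER = "https://doi.org/"  # assumption: the general DOI resolver


def build_citation_request(doi, style="apa"):
    """Build a request asking the resolver for a formatted citation."""
    url = DOI_RESOLVER + urllib.parse.quote(doi, safe="/")
    headers = {"Accept": f"text/x-bibliography; style={style}"}
    return urllib.request.Request(url, headers=headers)


def fetch_citation(doi, style="apa", timeout=30):
    """Return the plain-text citation, or None if the DOI does not resolve."""
    request = build_citation_request(doi, style)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.read().decode("utf-8").strip()
    except urllib.error.HTTPError as error:
        if error.code == 404:  # DOI not registered
            return None
        raise
```

Calling `fetch_citation("10.1234/example")` with a real DOI in place of that placeholder would return the citation string, or `None` if the DOI is unregistered.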
A major problem for this project is that often, newly-published Magnolia Press article
DOIs are NOT registered with CrossRef. I can find articles from 2013 with DOIs that
still aren't registered with CrossRef. Extremely annoying – this causes real problems.
Results of content negotiation performed 11-June-2014:
Full attribution visible next to figure. One-click link to source. Full caption text. Searchable.
View-counter (METRICS!). Open licensing marked (tells you it's CC BY on mouse-over)
Enriching OA literature – adding embedded metadata to figures
Using exiftool one can embed the attribution (plain-text citation),
Publisher, Re-use Rights, and Figure Caption inside the figure
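As a rough sketch of that embedding step, the snippet below builds an exiftool call from Python. The mapping of fields to XMP Dublin Core tags (Source, Publisher, Rights, Description) is an assumption made for illustration; the talk does not say which tags were used, and the function names are hypothetical.

```python
import subprocess


def build_exiftool_command(image_path, citation, publisher, rights, caption):
    """Build an exiftool call that writes attribution metadata into an image."""
    return [
        "exiftool",
        f"-XMP-dc:Source={citation}",       # plain-text citation (assumed tag)
        f"-XMP-dc:Publisher={publisher}",
        f"-XMP-dc:Rights={rights}",         # re-use rights, e.g. "CC BY"
        f"-XMP-dc:Description={caption}",   # figure caption text
        "-overwrite_original",              # do not keep a *_original backup
        image_path,
    ]


def embed_metadata(image_path, citation, publisher, rights, caption):
    """Run exiftool; raises CalledProcessError if exiftool reports failure."""
    command = build_exiftool_command(
        image_path, citation, publisher, rights, caption
    )
    subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting problems with long captions that contain quotes or punctuation.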
Only one publisher currently embeds
useful metadata in their figure images
Well done PLOS!
Not perfect though.
Author names &
the paper title are
Searching for phylogeny is hard
Make it a lot easier!
Search by “presence
of phylogenetic trees”
Link to journal search here
Status as of 11-June-2014
● 4045 phylogeny figures from PLOS ONE
● 5074 phylogeny figures from 128 other OA journals
(Pensoft, BMC, other PLOS journals, Hindawi, MDPI)
● 708 figures from 152 (OA-only) Phytotaxa papers
– On Twitter @PhytoFigs, and on Flickr
● 1326 figures from 146 (OA-only) Zootaxa papers
– on Flickr
Screencrop of the @PhytoFigs Twitter account:
Other work in this area
(people are already re-using content)
Rod Page (2010) has re-imagined Zootaxa content http://iphylo.org/~rpage/zootaxa/
explained here: http://iphylo.blogspot.be/2010/08/extracting-semantic-goodness-from.html
Numerous other Rod Page projects are probably relevant too. Thoughts on a comparison of
Flickr & Pinterest in terms of features and use would be interesting.
Yale Image Finder: http://krauthammerlab.med.yale.edu/imagefinder/
takes OA figures + captions from PubMed Central articles and offers good search capability
BUT it's only for PMC articles. A lot of biodiversity literature is NOT in PMC.
British Library content on Flickr: https://www.flickr.com/photos/britishlibrary/
explained more here:
Biosearch (last updated October 2007!!!): http://biosearch.berkeley.edu/
another seemingly abandoned open access figure search database
PLOS's Tumblr, highlighting visually appealing figures they publish: