1. In your worst nightmares
How experimental scientists are
doing provenance for themselves
Uses Excel for data analysis?!?!!
4. ...a typical dataset...
5. We have...
6. But how did we end up here?
7. ...we used to be good at this...
© Cell Press, Nature Publishing Group, American Chemical
Society, American Society for Microbiology; fair use claimed
8. When it was on paper...
...you had to ask for a copy...
...and you said so in the paper...
9. But in the online world...
...too many people
...too many files
...too much movement
...it’s all too hard, isn’t it?
10. But all is not lost...
11. ...because even online researchers
still care about citation
14. Link to information...
...evolving best practice
17. Expectations of link behaviour
Granularity of citation
Evolving best practice
Some technical problems...mostly social
18. Some real research data...
19. Published data... http://is.gd/thCK
20. Published data... http://is.gd/thEg
21. Data summary... http://is.gd/thEX
22. Original experiment http://is.gd/thFa
23. Versioning... http://is.gd/thGb
24. Versioning and provenance...
...through linked open data...
...and third party timestamps
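A minimal sketch of how versioning and a third-party timestamp could fit together: each version of a record is hashed together with the digest of the version before it, so the latest digest commits to the whole history. That single digest is what one could hand to an outside timestamping service. The function name and the record fields are illustrative assumptions, not taken from any system shown in the talk.

```python
import hashlib
import json


def version_digest(record: dict, parent_digest: str = "") -> str:
    """Return a SHA-256 digest covering this record and its parent version."""
    # Canonical JSON (sorted keys, fixed separators) so that equal records
    # always produce the same bytes and therefore the same digest.
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((parent_digest + payload).encode("utf-8")).hexdigest()


# Hypothetical notebook entries; the field names are assumptions.
v1 = version_digest({"sample": "A1", "reading": 0.42})
v2 = version_digest({"sample": "A1", "reading": 0.43}, parent_digest=v1)
```

Because `v2` depends on `v1`, a timestamp on `v2` also vouches for the earlier version having existed in that form.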
26. URI for every object...
...can link in or out
No semantics to links
(at the moment)
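The idea of a URI for every object, with links that carry no semantics, can be sketched as a plain directed graph. This is an illustration only: the class name and the example.org URIs are hypothetical, and the links deliberately have no type, matching the "no semantics (at the moment)" point.

```python
from collections import defaultdict


class LinkGraph:
    """Objects identified by URI, joined by untyped directed links."""

    def __init__(self):
        self._out = defaultdict(set)  # source URI -> set of target URIs
        self._in = defaultdict(set)   # target URI -> set of source URIs

    def link(self, source: str, target: str) -> None:
        """Record that `source` links to `target`; no link type is stored."""
        self._out[source].add(target)
        self._in[target].add(source)

    def links_out(self, uri: str) -> list:
        """URIs this object points at, sorted for stable output."""
        return sorted(self._out.get(uri, ()))

    def links_in(self, uri: str) -> list:
        """URIs of objects that point at this one."""
        return sorted(self._in.get(uri, ()))


# Hypothetical URIs for a notebook post and the dataset it produced.
graph = LinkGraph()
graph.link("http://example.org/notebook/post/12", "http://example.org/data/42")
```

Following `links_in` from a dataset back to the post that produced it is the simplest form of a provenance query; adding a type to each link would be the next step.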
28. Technical solutions...
• Push data to the open web
• Highly granular URIs... repositories for which “the
file” is not the atomic concept
• Strong versioning and forking functionality...like any
halfway decent code repository
• Strong identity management solutions for people,
• Tools for linking objects
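A rough sketch of what granular URIs combined with versioning and forking might look like, loosely modelled on how a code repository tracks history. Everything here is an assumption for illustration: the `Repo` class, the `/v1`, `/v2` URI scheme, and the example.org base are hypothetical, and the store is kept in memory only.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Version:
    uri: str
    content: str
    parent: Optional[str]  # URI of the version this one was derived from


class Repo:
    """In-memory store where every version of every object gets its own URI."""

    def __init__(self, base: str):
        self.base = base.rstrip("/")
        self.versions: Dict[str, Version] = {}
        self.history: Dict[str, List[str]] = {}

    def commit(self, object_id: str, content: str, parent: Optional[str] = None) -> str:
        chain = self.history.setdefault(object_id, [])
        if parent is None and chain:
            parent = chain[-1]  # default parent is the previous version
        uri = f"{self.base}/{object_id}/v{len(chain) + 1}"
        self.versions[uri] = Version(uri, content, parent)
        chain.append(uri)
        return uri

    def fork(self, source_uri: str, new_object_id: str) -> str:
        """Start a new object whose first version points back at its source."""
        source = self.versions[source_uri]
        return self.commit(new_object_id, source.content, parent=source_uri)

    def lineage(self, uri: str) -> List[str]:
        """Walk parent pointers from a version back to its origin."""
        trail = []
        current: Optional[str] = uri
        while current is not None:
            trail.append(current)
            current = self.versions[current].parent
        return trail


repo = Repo("http://example.org/lab")
first = repo.commit("row-7", "0.42")
second = repo.commit("row-7", "0.43")
rerun = repo.fork(first, "row-7-rerun")
```

The atomic unit here is a single value rather than a file, and `lineage` recovers where any version came from, including across a fork.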
29. Social solutions...
• Use the strong culture of citation in
• Leverage the need of researchers to
track their own data properly
• A discussion of best practice for citation,
30. Problems are primarily
social, not technical...
...technical solutions are
needed to make it easy
31. ...but the first problem
is to tell people why
they should care...