Repositories for Scientific Data: An #animalgarden show (Pecha Kucha) - Peter Murray-Rust

Peter Murray-Rust's Pecha Kucha presentation "Repositories for Scientific Data: An #animalgarden show" which was delivered on Friday 2nd August 2013 at the Repository Fringe 2013.

Transcript

  • 1. REPOSITORIES FOR SCIENTIFIC DATA An #animalgarden show Peter Murray-Rust, OKFN and University of Cambridge Chuff OWL Moomin AMI Gulliver Sleepless cleanTux UncleSam
  • 2. I’m AMI studying biodiversity. I compute phylogenetic trees. Only 4% of computed trees are saved. I’m in a pear tree.
  • 3. Where can I put my data? Institutional repos don’t work, we’ve tried. WE NEED DOMAIN REPOSITORIES FOR SCIENCE
  • 4. So how do you manage data? We’re BIG DATA at NASA. We hire data experts.
  • 5. But I’m a LONG-TAIL scientist!
  • 6. Australia have a national data service (ANDS). We could use their TARDIS*. Let’s ask the crystallographers. They save their data.
  • 7. I want to publish this paper. You MUST send ALL the data. The IUCr will check if it’s correct.
  • 8. It takes years to create vocabularies. Core dictionary (coreCIF) version 2.4.3. ID: _diffrn_ambient_temperature. For humans, the definition: The mean temperature in kelvins at which the intensities were measured. For machines, constraint + type: Range: 0.0 -> infinity; Type: numb. We need domain vocabularies through inter/national efforts.
  • 9. PMR group also built a crystal structure repo (Crystaleye). It’s got 200,000 entries. But none from Elsevier, Wiley, Springer.
  • 10. And NONE of the results are archived. Computational materials scientists cost 1,000 million USD / year. PMR wrote software to turn FORTRAN into XML.
  • 11. PMR and others have started a global effort to create vocabularies. It’s hard and slow work. PMR group built a compchem repository, Chempound (XML, RDF, NoSQL, SPARQL).
  • 12. Is PMR making progress? Hoping to work with Obama’s 500 M USD “materials genome”
  • 13. WE NEED DOMAIN REPOSITORIES FOR BIODIVERSITY
  • 14. We could use Figshare. As long as it’s Open. Or OKFN’s CKAN.
  • 15. And we can also do theses! PMR and Ross Mounce will index the whole of published bioscience! 5 years of JISC projects helped
  • 16. We’re going to index SPECIES, PLACES, DATES. I’m a baby Buddleja davidii.
  • 17. OKFN Chuff! I’m an Okapi balloonii
  • 18. WE NEED DOMAIN REPOSITORIES FOR SCIENCE. Wake up, nearly finished. PechaKucha is knackering.
  • 19. Chuff REPOSITORIES FOR SCIENTIFIC DATA An #animalgarden show Peter Murray-Rust, OKFN and University of Cambridge WE NEED DOMAIN REPOSITORIES FOR SCIENCE
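
Slide 8 shows a dictionary entry that pairs a definition for humans with a type and range constraint for machines. As a rough illustration of why the machine-readable part matters to a repository, here is a minimal Python sketch that checks a submitted value against that one entry. The `DictionaryEntry` class and the `validate` function are hypothetical names invented for this example and are not part of any CIF software; only the entry's name, definition, type and range come from the slide.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class DictionaryEntry:
    """Hypothetical, simplified stand-in for one vocabulary definition."""
    name: str
    definition: str   # the part for humans
    type_: str        # the part for machines; "numb" is treated as numeric
    minimum: float
    maximum: float


# Field values copied from slide 8; the class layout is an assumption.
AMBIENT_TEMPERATURE = DictionaryEntry(
    name="_diffrn_ambient_temperature",
    definition="The mean temperature in kelvins at which the "
               "intensities were measured.",
    type_="numb",
    minimum=0.0,
    maximum=math.inf,
)


def validate(entry: DictionaryEntry, raw: str) -> float:
    """Parse raw text as a number and check it against the entry's range."""
    if entry.type_ != "numb":
        raise ValueError(f"{entry.name}: unsupported type {entry.type_!r}")
    try:
        value = float(raw)
    except ValueError:
        raise ValueError(f"{entry.name}: {raw!r} is not a number") from None
    if math.isnan(value) or not (entry.minimum <= value <= entry.maximum):
        raise ValueError(
            f"{entry.name}: {value} is outside "
            f"{entry.minimum} -> {entry.maximum}"
        )
    return value


if __name__ == "__main__":
    print(validate(AMBIENT_TEMPERATURE, "293"))  # a plausible room temperature
    try:
        validate(AMBIENT_TEMPERATURE, "-5")      # negative kelvins are rejected
    except ValueError as err:
        print(err)
```

The sketch only accepts plain numbers. Real CIF values can carry a standard uncertainty in parentheses, such as `293(2)`, which a fuller parser would need to handle.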