Create and receive scientific data

A talk given at the DCC Digital Curation 101 workshop, illustrating how to curate and manage scientific data by considering the content, syntax and semantics of the data.

Transcript

  • 1. a centre of expertise in data curation and preservation Create or Receive Scientific data Dr. Frank Gibson and Dr. Phillip Lord Frank.Gibson@newcastle.ac.uk Phillip.Lord@newcastle.ac.uk Funded by: This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 2.5 UK: Scotland License. To view a copy of this license, (a) visit http://creativecommons.org/licenses/by-nc-sa/2.5/scotland/; or, (b) send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA. Digital Curation 101, October 6th-10th, 2008, NeSC, Edinburgh
  • 2. “In the standard model, one collects data, publishes a paper or papers and then gradually loses the original dataset.” - Geoffrey Bowker
  • 3. Slide by Cameron Neylon http://www.slideshare.net/CameronNeylon
  • 4. Slide by Cameron Neylon http://www.slideshare.net/CameronNeylon
  • 5. Slide by Cameron Neylon http://www.slideshare.net/CameronNeylon
  • 6. Slide by Cameron Neylon http://www.slideshare.net/CameronNeylon
  • 7. If we have a paper, who cares about the data? http://flickr.com/photos/nicmcphee/2756494307/
  • 8. A paper = a claim (or claims). The full record that supports that claim should be available for detailed examination and critique. Slide by Cameron Neylon http://www.slideshare.net/CameronNeylon
  • 9. Slide by Cameron Neylon http://www.slideshare.net/CameronNeylon
  • 10. 1000+ Databases
  • 11. Biocuration: Databases
  • 12. Biocuration: Wiki
  • 13. Slide by Cameron Neylon http://www.slideshare.net/CameronNeylon
  • 15. Funders http://flickr.com/photos/luismimunoznajar/2093185804/
  • 16. Create or Receive
  • 17. Curation aims: Amenable, Preservable, Ownable, Accessible, Citable
  • 18. Significant Properties of Data: Content, Syntax, Semantics
  • 19. Content
  • 20. Publisher, Type, Title, Creator, Source, Identifier, Date, Rights
  • 21. Simple Dublin Core: Type, Format, Title, Identifier, Creator, Source, Subject, Language, Description, Relation, Publisher, Coverage, Contributor, Rights, Date
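
As an illustration of slide 21, the fifteen Simple Dublin Core elements could be recorded in XML along the following lines. This is only a sketch: the wrapper element `metadata` and the helper `dublin_core_record` are hypothetical names, and the field values are taken from the title slide of this talk. The namespace URI is the standard one for the Dublin Core element set.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# The fifteen Simple Dublin Core elements listed on the slide
DC_ELEMENTS = [
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
]


def dublin_core_record(fields):
    """Build a <metadata> element (hypothetical wrapper) with one dc:* child per field."""
    unknown = set(fields) - set(DC_ELEMENTS)
    if unknown:
        raise ValueError(f"not Simple Dublin Core elements: {sorted(unknown)}")
    root = ET.Element("metadata")
    for name in DC_ELEMENTS:
        if name in fields:
            child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
            child.text = fields[name]
    return root


# Values below come from slide 1 of this talk
record = dublin_core_record({
    "title": "Create or Receive Scientific data",
    "creator": "Frank Gibson and Phillip Lord",
    "date": "2008-10",
    "rights": "Creative Commons Attribution-NonCommercial-ShareAlike 2.5 UK: Scotland",
})
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Rejecting unknown field names is a design choice of this sketch; it keeps a record limited to the generic content elements, with domain-specific content described separately.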
  • 22. Content: Domain Specific
  • 23. Syntax
  • 25. Choosing a Syntax
      • Openness: is there an open, publicly available specification for the format; are its specifications in the public domain; is it unencrypted?
      • Portability: is the format independent of hardware, operating system, of other software; is it independent of particular institutions, groups, or events; is it in widespread current use; does it contain little or no built-in functionality?
      • Quality: is it robust; simple; highly tested; loss-free?
  • 26. Semantics
  • 27. Semantics can be complex: One semantic = many words; Many words = one semantic
  • 28. Excel data example – do I need it? Zeeberg et al. BMC Bioinformatics 2004 5:80 doi:10.1186/1471-2105-5-80
  • 29. What is fly?
      • Fly: http://en.wikipedia.org/wiki/Image:Air_india_b747-400_vt-esn_arp.jpg
      • Fly: http://en.wikipedia.org/wiki/Image:MuscuDomestica.jpg
      • Fly: http://en.wikipedia.org/wiki/Image:Green_Highlander_salmon_fly.jpg
      • Fly: http://en.wikipedia.org/wiki/Image:Fly_poster.jpg
  • 30. Ontology
      • A controlled vocabulary is an association between formal names (identifiers) and their definitions.
      • An ontology is a controlled vocabulary augmented with logical constraints that describe their interrelationships.
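
The distinction drawn on slide 30 can be sketched in a few lines. In this minimal example the identifiers (`EX:0001` and so on), the definitions and the single `is_a` relation are all hypothetical and are not taken from any real ontology; they only show how adding relationships to a controlled vocabulary lets a program tell one "fly" from another.

```python
# Controlled vocabulary: formal names (identifiers) associated with definitions.
# Identifiers and definitions are invented for illustration.
VOCABULARY = {
    "EX:0001": "A living organism.",
    "EX:0002": "An organism belonging to the insects.",
    "EX:0003": "The house fly, Musca domestica.",
    "EX:0004": "An artificial lure used in fly fishing.",
}

# Ontology: the same vocabulary plus logical constraints between terms.
# Only one kind of relationship, is_a (child -> parent), is modelled here.
IS_A = {
    "EX:0002": "EX:0001",  # insect is_a organism
    "EX:0003": "EX:0002",  # house fly is_a insect
}


def ancestors(term):
    """Follow is_a links upward and return every more general term, nearest first."""
    found = []
    while term in IS_A:
        term = IS_A[term]
        found.append(term)
    return found


def is_a(term, candidate):
    """True if candidate is a (possibly indirect) parent of term."""
    return candidate in ancestors(term)


print(is_a("EX:0003", "EX:0001"))  # the house fly is an organism
print(is_a("EX:0004", "EX:0001"))  # the fishing fly is not
```

Annotating data with `EX:0003` rather than the word "fly" removes the ambiguity, and the `is_a` links allow a query for organisms to find the annotated record.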
  • 31. Ontologies for Life science
      • Emergence has occurred for two reasons
          - Consistent annotation of data
          - To add meaning and understanding that can be interpreted computationally
      • Bio-ontologies registered on the OBO foundry
  • 32. Application of Significant Properties In Proteomics
  • 33. Minimum Information about a Proteomics Experiment (MIAPE)
      • Sufficiency. The MIAPE guidelines should require sufficient information about a dataset and its experimental context to allow a reader to understand and critically evaluate the interpretation and conclusions, and to support their experimental corroboration.
      • Practicability. Achieving compliance with MIAPE should not be so burdensome as to prohibit its widespread use.
  • 35. Minimum reporting guidelines
      • Describe content
      • Implementation independent
      • Impacts
      • Publication
      • Syntax
      • Semantics
  • 36. Syntax for proteomics
      • The content in MIAPE GE needs to be structured to facilitate
          - dissemination
          - transfer
          - storage
      • A community development process to agree on a syntax
          - building upon the FuGE data model
      • A pre-existing community developed representation of scientific experiments
      • Interoperable
  • 37. FuGE
      • Model of common components in science investigations, such as materials, data, protocols, equipment and software.
      • Provides a framework for capturing complete laboratory workflows, enabling the integration of pre-existing data formats.
  • 38. UML/XML/RDBMS
      • UML gives structure (but not syntax)
          - Very abstract, very general
      • XML provides a concrete syntax
          - Meta language is interoperable, checkable, viable and has basic metadata support (language, character coding and so on).
          - Tends toward the verbose. Not (very) searchable for itself.
          - Therefore, transfer and archive format.
      • RDBMS
          - SQL is (sort of) a standard
          - Highly computationally amenable form; v. good for searching
          - Conversion from XML is possible, but in a number of ways.
          - Hard work – nice to have an off-the-shelf implementation.
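
To make the XML-to-RDBMS point on slide 38 concrete, here is a minimal sketch of one of the many possible conversions, using only the Python standard library. The XML document, its element and attribute names, and the table layout are all invented for illustration; they are not FuGE or GelML.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical transfer document: verbose, but self-describing and checkable
XML_DOC = """
<experiment id="exp-1">
  <sample id="s1" material="protein extract"/>
  <sample id="s2" material="buffer"/>
</experiment>
""".strip()


def load_samples(xml_text, conn):
    """Flatten each <sample> element into one row of a relational table."""
    root = ET.fromstring(xml_text)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sample "
        "(id TEXT PRIMARY KEY, experiment TEXT, material TEXT)"
    )
    rows = [
        (sample.get("id"), root.get("id"), sample.get("material"))
        for sample in root.iter("sample")
    ]
    conn.executemany("INSERT INTO sample VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
count = load_samples(XML_DOC, conn)

# Once in the database, the data is easy to search with SQL
materials = [
    row[0]
    for row in conn.execute(
        "SELECT material FROM sample WHERE experiment = ? ORDER BY id", ("exp-1",)
    )
]
print(count, materials)
```

The mapping chosen here (one table, one row per element) is only one option; nested or repeated structures would need further tables, which is part of why an off-the-shelf implementation is attractive.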
  • 39. GelML
  • 40. Semantics for Gels
  • 41. Semantics for science
  • 42. Curation of Gel experiments
      • Stages: Laboratory; Data entry and transfer; Public repositories
      • I) GelML data entry tools
      • II) Direct database submission
      • III) Automated export of GelInfoML
      • Labels: GelML, MIAPE GE, GelInfoML, MIAPE GI, sepCV
  • 43. Discoverability and reuse
      • Persistent Identifiers
      • Rights management
  • 44. Persistent Identifiers
      • a name for a resource which will remain the same regardless of where the resource is located
      • In biology typically assigned to data upon publication
      • Type of identifier dependent on publication method
      • Description and Representation Information provides more information about persistent identifiers
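
The first point on slide 44 can be illustrated with a toy resolver. This is a sketch under stated assumptions: the `Resolver` class, the identifier `example:gel-dataset-1` and the `example.org` locations are all hypothetical, and real identifier schemes rely on a maintained resolution service rather than an in-memory table.

```python
class Resolver:
    """Toy lookup table from persistent identifiers to current locations."""

    def __init__(self):
        self._locations = {}

    def register(self, identifier, location):
        """Assign an identifier once, for example when the data is published."""
        if identifier in self._locations:
            raise ValueError(f"{identifier} is already assigned")
        self._locations[identifier] = location

    def move(self, identifier, new_location):
        """Record that the resource moved; the identifier itself never changes."""
        if identifier not in self._locations:
            raise KeyError(identifier)
        self._locations[identifier] = new_location

    def resolve(self, identifier):
        """Return the current location for an identifier."""
        return self._locations[identifier]


PID = "example:gel-dataset-1"  # hypothetical identifier

resolver = Resolver()
resolver.register(PID, "https://example.org/archive-a/gel-dataset-1")
before = resolver.resolve(PID)

# The hosting repository changes, but citations that use PID stay valid
resolver.move(PID, "https://example.org/archive-b/gel-dataset-1")
after = resolver.resolve(PID)
print(before, after)
```

Anything that cites `PID` keeps working after the move, because only the resolver's table has to be updated.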
  • 45. Rights management
      • Difficult to determine
      • Lots of legal issues
      • In biology/bioinformatics tends to be open access
      • Creative commons
  • 46. Receiving data for curation: Content, Syntax, Semantics
  • 47. Route map
      • Who will receive it?
      • What are their policies on: Content, Syntax, Semantics
      • Plan your experiment to conform to Content, Syntax, Semantics
      • Implement experiment to:
          - Collect appropriate Content
          - Structure in appropriate Syntax
          - Ensure Semantics are preserved
      • Curate…
  • 48. Meta Route Map
      • How to build the map if you don’t have one yet.
  • 49. Appraise and Select
      • Investigates the evaluation and selection of data for long-term curation and preservation
  • 50. Acknowledgments
      • The CARMEN project: www.carmen.org.uk
      • The Proteomics Standards Initiative (PSI): http://psidev.info
      • Colleagues at Newcastle University: Phillip Lord, Anil Wipat, Allyson Lister
