Data Management (from day 0)
 


Practical experiences around data handling from a chemist, now working in biology. The story starts on the day that I started managing my PhD thesis research in a version control system, originally Subversion, later Git. It then moves on to all the issues around data in publishing, data licensing (be sure you understand what you're using), online repositories, a bit of data citation, repository-integrated data analysis, finding a conclusion, and returning to day 1.


Presentation Transcript

  • Data Management (from day 0)
    Egon Willighagen (@egonwillighagen), Department of Bioinformatics - BiGCaT
    3 April 2014, Masterclass RDM in NL
  • Day 0: data plan
    Before you start doing an experiment, you get a lab notebook. (Some universities already require electronic lab notebooks!)
  • Day 1: the electronic lab notebook
      • Version Control System
          – Allows backups
          – Allows annotation
          – Dated changes
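The three properties on the Day 1 slide can be seen in a few lines. What follows is a minimal sketch, not the workflow used for the thesis: it assumes the `git` command-line tool is installed, and the file name `notebook.txt`, the entry texts, and the helper functions `git` and `record_entry` are hypothetical names made up for the illustration. Each commit message acts as the annotation, and the log supplies the dates.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def git(repo: Path, *args: str) -> str:
    """Run one git command inside `repo` and return its standard output."""
    result = subprocess.run(
        ["git", "-C", str(repo), *args],
        check=True, capture_output=True, text=True,
    )
    return result.stdout


def record_entry(repo: Path, filename: str, text: str, note: str) -> None:
    """Write a notebook file and commit it; the commit message is the annotation."""
    (repo / filename).write_text(text, encoding="utf-8")
    git(repo, "add", filename)
    # The identity is passed per command, so the sketch does not rely on
    # any global git configuration. Name and address are placeholders.
    git(repo, "-c", "user.name=Example Researcher",
        "-c", "user.email=researcher@example.org",
        "commit", "-m", note)


history = []
if shutil.which("git"):
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp)
        git(repo, "init")
        record_entry(repo, "notebook.txt",
                     "Plan: describe the experiment here.\n",
                     "Day 1: plan experiment")
        record_entry(repo, "notebook.txt",
                     "Plan: describe the experiment here.\n"
                     "Result: first measurement recorded.\n",
                     "Day 1: add first measurement")
        # One line per commit, oldest first: the date, then the annotation.
        history = git(repo, "log", "--reverse", "--date=short",
                      "--format=%ad %s").splitlines()
    for line in history:
        print(line)
else:
    print("git is not installed; nothing was recorded")
```

The backup part is not shown: it would come from pushing the repository to a second location, which depends on the hosting you have available.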
  • Day 2: be careful what data you use
      • Availability in 4 years?
          – Your Library/University has a copy?
      • Can you read the format?
      • Can you copy the data and share (e.g. with collaborators)?
      • What if the journal you publish in requires you to share data?
  • Day 3: store everything
      • Experiments
          – Description
          – Results (images, measurements, …)
      • Written output
          – Reports, papers, presentations
  • Day 4: Analyse data directly from a repository
    Willighagen E. (2014) Accessing biological data in R with semantic web technologies. PeerJ PrePrints 2:e185v3. doi:10.7287/peerj.preprints.185v3

        # Look up the SNPs listed in `brca1` in a BioMart SNP dataset.
        # `attribs` (the attribute names to retrieve) is not defined on the slide.
        mart = biomaRt::useMart(biomart="snp", dataset="hsapiens_snp")
        brca1 = c("rs16940", "rs16941", "rs16942", "rs799916", "rs799917")
        data = biomaRt::getBM(attributes=attribs, filters=c("snp_filter"),
                              values=brca1, mart=mart)

        # Query a remote SPARQL end point; sparql.remote() comes from the rrdf package.
        library(rrdf)
        results = sparql.remote(
          "http://rdf.farmbio.uu.se/chembl/sparql",
          paste(
            "SELECT DISTINCT ?predicate ?object WHERE {",
            "  ?assay <http://www.w3.org/2000/01/rdf-schema#label> \"CHEMBL615603\" ;",
            "         ?predicate ?object . }"
          ))
  • Day 4: Analyses inside your report
    http://yihui.name/knitr/

        <p>We can also produce plots (centered by the option <code>fig.align='center'</code>):</p>
        <!--begin.rcode html-cars-scatter, message=FALSE, fig.align='center'
        library(ggplot2)
        plot(mpg~hp, mtcars)
        qplot(hp, mpg, data=mtcars) + geom_smooth()
        end.rcode-->
  • Day 5: Large Repositories
      • UniProt, ChEMBL, Gene Ontology
          – Is there a deposition workflow?
      • Growing repositories
          – WikiPathways
      • Set up a new database (paper+1)
          – e.g. DrugMet
          – Problem: what about small data?
      • Journal driven
          – CSD
          – PDB
  • Day 5: Database Seeds
      • Set up a new database (paper += 1)
          – e.g. DrugMet
    S. Lampa + me, CC-SA, but data CC0
  • Day 5: National Repositories
  • Day 5: Small Data @ FigShare
  • Day 5: Scientific dissemination
      • Data sharing: copyright
          – Can data be copyrighted?
          – Data Source: you, lab mates, others?
          – Ownership
      • Data sharing: license
          – Do you want your data reused?
          – And be modified (format!)?
          – Commercial use?
  • Day 6: Format? Why not SemWeb?
      • 5 Star Open Data (5stardata.info): openly available, reusable, open format, URIs (ontologies etc.), linked data
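To make the last three stars concrete (open format, URIs, linked data), here is a minimal sketch that writes two statements as N-Triples, a plain-text RDF serialization. Everything under `http://example.org/` is a hypothetical namespace invented for the illustration, as are the helper names `iri`, `literal` and `to_ntriples`; only the `rdfs:label` URI is taken from the query on the Day 4 slide.

```python
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
EX = "http://example.org/"  # hypothetical namespace, for illustration only


def iri(value: str) -> str:
    """Wrap a URI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(value: str) -> str:
    """Quote a plain string literal, escaping backslashes, quotes and line breaks."""
    escaped = (value.replace("\\", "\\\\")
                    .replace('"', '\\"')
                    .replace("\n", "\\n")
                    .replace("\r", "\\r"))
    return f'"{escaped}"'


def to_ntriples(triples) -> str:
    """Serialize (subject, predicate, object) terms, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


triples = [
    # A URI identifies the thing, and a shared vocabulary term names it.
    (iri(EX + "assay/1"), iri(RDFS_LABEL), literal("Example assay")),
    # Linked data: the object is another URI that others can look up or reuse.
    (iri(EX + "assay/1"), iri(EX + "partOf"), iri(EX + "study/1")),
]

print(to_ntriples(triples))
```

In practice an RDF library would handle the serialization; it is spelled out by hand here only to show how little the format needs.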
  • Linked Open Data Cloud
  • Day 7: are people using your work?
  • Day 8: back to step 0
      • Take feedback (“peer review”), study new uses
      • Plan your next study
    CC-BY frankensteinnn@flickr