This document discusses tools for improving reproducibility in research, including hosting data in GigaDB, sharing images using OMERO, implementing workflows using Galaxy and executable documents, and sharing virtual machines. It emphasizes the need for publishers to host and curate research objects like data, code, and workflows and provide citations for reproducible research. Key tools highlighted are GigaDB for data hosting, OMERO for image hosting, Galaxy for implementing workflows, and virtual machines for sharing full computational environments.
12. Hosting Images
• Image LIMS
• Web embedding
– View online, no need for software
• Full res
• Link all images to publication
– No cherry picking
http://www.openmicroscopy.org/site/products/omero
14. Accessible Cyber-Centipede images
OMERO: providing access to imaging data
View, filter, measure raw images with direct links from the journal article.
See all image data, not just cherry-picked examples.
Download and reprocess.
19. Implement workflows in a community-accepted format
http://galaxyproject.org
Open source
Over 45,000 main Galaxy server users
Over 1,000 papers citing Galaxy use
Over 55 Galaxy servers deployed
20. Implement workflows in an intuitive format
[Screenshot panels: Tool list | Tool parameterisation | Results panel. Copyright NBAF-B 2013]
23. Birmingham Metabo-Galaxy
Tools wrapped in Python and XML
User sees web form (easy!)
Data stored centrally (secure!)
Work done centrally (easy update)
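To make "tools wrapped in Python and XML" concrete, here is a minimal sketch of how a Galaxy tool definition might be generated for a command-line script. The element names (`tool`, `command`, `inputs`, `param`, `outputs`, `data`) follow the general Galaxy tool XML layout; the tool id, the script name `normalise.py`, its `--input`/`--output` flags, and the helper `build_tool_wrapper` are hypothetical and are not taken from the Birmingham Metabo-Galaxy wrappers.

```python
import xml.etree.ElementTree as ET


def build_tool_wrapper(tool_id, name, script, version="0.1.0"):
    """Return a Galaxy-style tool XML string wrapping a (hypothetical) Python script."""
    tool = ET.Element("tool", id=tool_id, name=name, version=version)
    ET.SubElement(tool, "description").text = "illustrative wrapper only"

    # Galaxy substitutes $input and $output when the job runs on the server,
    # which is why the user only ever sees a web form.
    command = ET.SubElement(tool, "command")
    command.text = (
        f"python '$__tool_directory__/{script}' "
        "--input '$input' --output '$output'"
    )

    # Each <param> becomes one field in the web form.
    inputs = ET.SubElement(tool, "inputs")
    ET.SubElement(inputs, "param", name="input", type="data",
                  format="tabular", label="Input table")

    # Each <data> output is stored centrally in the user's history.
    outputs = ET.SubElement(tool, "outputs")
    ET.SubElement(outputs, "data", name="output", format="tabular")

    ET.SubElement(tool, "help").text = "Hypothetical example tool."
    return ET.tostring(tool, encoding="unicode")


wrapper_xml = build_tool_wrapper("normalise_table", "Normalise table", "normalise.py")
print(wrapper_xml)
```

Because the wrapper lives on the server, updating the script or its XML updates the tool for every user at once, which is the "easy update" point on the slide.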
29. Open lab books, dynamic documents
• Facilitate reuse and sharing with tools like Knitr, Sweave and IPython Notebook
• Working towards executable papers…
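The idea behind dynamic documents can be sketched in a few lines: prose and code live in one source, the code chunks are executed in order, and their output is woven back into the text. The example below is a toy illustration only. The `<<code>>` / `<<end>>` chunk markers and the `weave` function are made up for this sketch and do not reproduce the syntax or behaviour of Knitr, Sweave or IPython Notebook.

```python
import contextlib
import io

CHUNK_OPEN = "<<code>>"   # hypothetical chunk delimiters for this sketch
CHUNK_CLOSE = "<<end>>"


def weave(source):
    """Run each code chunk in order and interleave its printed output with the prose."""
    namespace = {}          # shared across chunks, so later chunks see earlier variables
    woven = []
    chunk = []
    in_chunk = False
    for line in source.splitlines():
        marker = line.strip()
        if marker == CHUNK_OPEN and not in_chunk:
            in_chunk, chunk = True, []
        elif marker == CHUNK_CLOSE and in_chunk:
            in_chunk = False
            captured = io.StringIO()
            with contextlib.redirect_stdout(captured):
                exec("\n".join(chunk), namespace)
            # Show the code, then its output, inside the document.
            woven.extend("    " + code_line for code_line in chunk)
            woven.extend("    #> " + out for out in captured.getvalue().splitlines())
        elif in_chunk:
            chunk.append(line)
        else:
            woven.append(line)
    return "\n".join(woven)


document = """The mean of the measurements is computed below.
<<code>>
values = [2.0, 4.0, 6.0]
print(sum(values) / len(values))
<<end>>
Later chunks can reuse earlier variables.
<<code>>
print(max(values))
<<end>>"""

report = weave(document)
print(report)
```

Since the results are regenerated from the code every time the document is woven, the text, the analysis and the reported numbers cannot drift apart.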
32. Some testimonials for Knitr
Authors (Wolfgang Huber)
“I do all my projects in Knitr. Having the textual
explanation, the associated code and the results all in one
place really increases productivity, and helps explaining
my analyses to colleagues, or even just to my future self.”
Reviewers (Christophe Pouzat)
“It took me a couple of hours to get the data, the few
custom developed routines, the “vignette” and to
REPRODUCE EXACTLY the analysis presented in the
manuscript. With few more hours, I was able to modify the
authors’ code to change their Fig. 4. In addition to making
the presented research trustworthy, the reproducible
research paradigm definitely makes the reviewer’s job
much more fun!”
37. Share data in GigaDB
Share all images in GigaDB
– View images via OMERO
Share code in GigaDB!
Share pipeline using:
Executable docs!
Galaxy!
VMs!
38. Improve
reproducibility!
Give us data, papers
& pipelines*
Contact us:
scott@gigasciencejournal.com
editorial@gigasciencejournal.com
database@gigasciencejournal.com
* APCs currently generously covered by BGI until 2015
www.gigasciencejournal.com
39. Thanks to:
team: Our collaborators: Case study:
Ruibang Luo (BGI/HKU)
Shaoguang Liang (BGI-SZ)
Tin-Lap Lee (CUHK)
Qiong Luo (HKUST)
Senghong Wang (HKUST)
Yan Zhou (HKUST)
Funding from: CBIIT
@gigascience
facebook.com/GigaScience
blogs.biomedcentral.com/gigablog/
Peter Li
Huayan Gao
Chris Hunter
Jesse Si Zhe
Nicole Nogoy
Laurie Goodman
Amye Kenall (BMC)
Marco Roos (LUMC)
Mark Thompson (LUMC)
Jun Zhao (Lancaster)
Susanna Sansone (Oxford)
Philippe Rocca-Serra (Oxford)
Alejandra Gonzalez-Beltran (Oxford)
www.gigadb.org
galaxy.cbiit.cuhk.edu.hk
www.gigasciencejournal.com