Scott Edmunds flashtalk on "Rewarding Reproducibility and Method Publishing the GigaScience Way" from Beyond the PDF 2 "Making it Happen" session. 20/3/13
Scott Edmunds flashtalk slides from Beyond the PDF2
1. Rewarding Reproducibility and Method
Publishing the GigaScience Way
Scott Edmunds
GigaScience
scott@gigasciencejournal.com
@gigascience/SCEdmunds
2. The Issue:
(Mo Data, Mo Problems…)
Data-driven science era brings:
• Huge opportunities
• Huge challenges with:
data curation, review/QA, handling, sharing
= growing reproducibility gap
3. GigaSolution: deconstructing the paper
Take data publication approach further and reward:
• Data availability
• Metadata/curation
• Interoperability
• Availability of workflows
• Transparent analyses
[Diagram labels: Metadata, Analyses, Methods, Data]
4. GigaSolution: deconstructing the paper
Combines and integrates:
Open-access journal
Data Publishing Platform
Data Analysis Platform
Utilizes big-data infrastructure and expertise from:
World's largest genomics organisation with:
17PB storage, 20.5K cores, 212TFlops,
>1000 bioinformaticians
www.gigadb.org
www.gigasciencejournal.com
6. How are we supporting data
reproducibility?
Open-Paper
DOI:10.1186/2047-217X-1-18
>6500 accesses
Open-Data
Data sets DOI:10.5524/100038
78GB CC0 data
Open-Pipelines
Open-Workflows
Analyses DOI:10.5524/100044
Open-Review
8 reviewers tested data on the FTP server & named reports published
Enabled code to be picked apart by bloggers in a wiki
http://homolog.us/wiki/index.php?title=SOAPdenovo2
Open-Code
Code in sourceforge under GPLv3: http://soapdenovo2.sourceforge.net/
>4000 downloads
8. SOAPdenovo2 workflows implemented in Galaxy
Implemented entire workflow in our Galaxy server, inc.:
• 3 pre-processing steps
• 4 SOAPdenovo modules
• 1 post-processing step
• Evaluation and visualization tools
Also available to download by >25K Galaxy users in
galaxy.cbiit.cuhk.edu.hk
14. What is needed to
make it happen?
Give us your data &
pipelines!*
Contact us:
scott@gigasciencejournal.com
editorial@gigasciencejournal.com
database@gigasciencejournal.com
* APCs currently generously covered by BGI
www.gigasciencejournal.com
15. Thanks to:
team:
Peter Li
Chris Hunter
Jesse Si Zhe
Nicole Nogoy
Tam Sneddon
Alexandra Basford
Laurie Goodman
Our collaborators:
Ruibang Luo (BGI/HKU)
Shaoguang Liang (BGI-SZ)
Tin-Lap Lee (CUHK)
Huayen Gao (CUHK)
Qiong Luo (HKUST)
Senghong Wang (HKUST)
Yan Zhou (HKUST)
Funding from:
CBIIT
Follow us:
@gigascience
facebook.com/GigaScience
blogs.openaccesscentral.com/blogs/gigablog/
www.gigadb.org
galaxy.cbiit.cuhk.edu.hk
www.gigasciencejournal.com
Editor's Notes
That just leaves me to thank the GigaScience team: Laurie, Scott, Alexandra, Peter and Jesse; BGI for their support, specifically Shaoguang for IT and bioinformatics support; our collaborators on the database, website and tools: Tin-Lap, Qiong, Senhong, Yan, the Cogini web design team; DataCite for providing the DOI service; and the isacommons team for their support and advocacy for best practice use of metadata reporting and sharing. Thank you for listening.