Scott Edmunds at Tech4Dev on Open Publishing for the Big-Data Era

Scott Edmunds at the Tech4Dev "The Openness Paradigm" session on Open Publishing for the Big-Data Era, Lausanne 4th June 2014

Transcript

  • 1. Open Publishing for the Big-Data Era. "Information is the currency of the future world" (William Gibson). Scott Edmunds, Peter Li, Huayan Gao, Chris Hunter, Si Zhe Xiao, Tin-Lap Lee, Laurie Goodman. #Tech4Dev: 4th June 2014
  • 2. Challenges/Opportunities in the Data-Driven Era. Challenges: quick response to climate change, food security & disease outbreaks. Enables: using the networking power of the internet to tackle problems; asking new questions & finding hidden patterns & connections; building on each other's efforts quicker & more efficiently; more collaborations across more disciplines; harnessing the wisdom of the crowds (crowdsourcing, citizen science, crowdfunding). Enabled by: removing silos, standards/formats, open-access/data.
  • 3. Not enabled by: paywalls, silos, dead trees [images: 1812, 1665, 1869] • "Scholarly articles are merely advertisement of scholarship. The actual scholarly artefacts, i.e. the data and computational methods, which support the scholarship, remain largely inaccessible." Jon B. Buckheit and David L. Donoho, WaveLab and reproducible research, 1995 • Lack of transparency, lack of credit for anything other than "regular" dead tree publication • If there is interest in data, only to monetise & repackage
  • 4. New incentives/credit. Data, software, re-use… = credit. Credit where credit is overdue: "One option would be to provide researchers who release data to public repositories with a means of accreditation." "An ability to search the literature for all online papers that used a particular data set would enable appropriate attribution for those who share." Nature Biotechnology 27, 579 (2009)
  • 5. GigaSolution: deconstructing the paper. Combines and integrates: open-access journal, data publishing platform, data analysis platform. Utilizes big-data infrastructure and expertise from: [logo]. www.gigadb.org, www.gigasciencejournal.com
  • 6. Rewarding open data
  • 7. Democratization: the "People's Parrot". Puerto Rican Parrot Genome Project (Amazona vittata). Was the rarest parrot, national bird of Puerto Rico. Community funded from artworks, fashion shows, beer brands, crowdfunding… Genome annotated by students in community college as part of bioinformatics education. Paper and data published and promoted in GigaScience and GigaDB. References: Taras K Oleksyk, et al. (2012): A Locally Funded Puerto Rican Parrot (Amazona vittata) Genome Sequencing Project Increases Avian Data and Advances Young Researcher Education. GigaScience 2012, 1:14. Steven J. O'Brien (2012): Genome empowerment for the Puerto Rican parrot – Amazona vittata. GigaScience 2012, 1:13. Oleksyk et al. (2012): Genomic data of the Puerto Rican Parrot (Amazona vittata) from a locally funded project. GigaScience. http://dx.doi.org/10.5524/100039
  • 8. Crowdsourcing disease outbreaks: To maximize its utility to the research community and aid those fighting the current epidemic, genomic data is released here into the public domain under a CC0 license. Until the publication of research papers on the assembly and whole-genome analysis of this isolate we would ask you to cite this dataset as: Li, D; Xi, F; Zhao, M; Liang, Y; Chen, W; Cao, S; Xu, R; Wang, G; Wang, J; Zhang, Z; Li, Y; Cui, Y; Chang, C; Cui, C; Luo, Y; Qin, J; Li, S; Li, J; Peng, Y; Pu, F; Sun, Y; Chen, Y; Zong, Y; Ma, X; Yang, X; Cen, Z; Zhao, X; Chen, F; Yin, X; Song, Y; Rohde, H; Li, Y; Wang, J; Wang, J and the Escherichia coli O104:H4 TY-2482 isolate genome sequencing consortium (2011): Genomic data from Escherichia coli O104:H4 isolate TY-2482. BGI Shenzhen. http://dx.doi.org/10.5524/100001. To the extent possible under law, BGI Shenzhen has waived all copyright and related or neighboring rights to Genomic Data from the 2011 E. coli outbreak. This work is published from: China.
  • 9. Downstream consequences: "Last summer, biologist Andrew Kasarskis was eager to help decipher the genetic origin of the Escherichia coli strain that infected roughly 4,000 people in Germany between May and July. But he knew that it might take days for the lawyers at his company, Pacific Biosciences, to parse the agreements governing how his team could use data collected on the strain. Luckily, one team had released its data under a Creative Commons licence that allowed free use of the data, allowing Kasarskis and his colleagues to join the international research effort and publish their work without wasting time on legal wrangling." 1. Citations (>200) 2. Therapeutics (primers, antimicrobials) 3. Platform comparisons 4. Example for faster & more open science
  • 10. 1.3 The power of intelligently open data. The benefits of intelligently open data were powerfully illustrated by events following an outbreak of a severe gastro-intestinal infection in Hamburg in Germany in May 2011. This spread through several European countries and the US, affecting about 4000 people and resulting in over 50 deaths. All tested positive for an unusual and little-known Shiga-toxin-producing E. coli bacterium. The strain was initially analysed by scientists at BGI-Shenzhen in China, working together with those in Hamburg, and three days later a draft genome was released under an open data licence. This generated interest from bioinformaticians on four continents. 24 hours after the release of the genome it had been assembled. Within a week two dozen reports had been filed on an open-source site dedicated to the analysis of the strain. These analyses provided crucial information about the strain's virulence and resistance genes – how it spreads and which antibiotics are effective against it. They produced results in time to help contain the outbreak. By July 2011, scientists published papers based on this work. By opening up their early sequencing results to international collaboration, researchers in Hamburg produced results that were quickly tested by a wide range of experts, used to produce new knowledge and ultimately to control a public health emergency.
  • 11. Beneficiaries of the genomics revolution? Rice 3K project: 3,000 rice genomes, 13.4 TB of public data. IRRI GALAXY
  • 12. Thanks to: Laurie Goodman, Editor in Chief; Nicole Nogoy, Commissioning Editor; Peter Li, Lead Data Manager; Chris Hunter, Lead BioCurator; Rob Davidson, Data Scientist; Xiao (Jesse) Si Zhe, Database Developer; Amye Kenall, Journal Development Manager. Contact us: editorial@gigasciencejournal.com, database@gigasciencejournal.com. Follow us: @gigascience, facebook.com/GigaScience, blogs.biomedcentral.com/gigablog/. www.gigasciencejournal.com, www.gigadb.org
