Scott Edmunds: Revolutionizing Data Dissemination: GigaScience

Scott Edmunds' talk on "Revolutionizing Data Dissemination: GigaScience", given in the "Policies and Standards for Reproducible Research" session at the Genomic Standards Consortium meeting in Shenzhen, 6th March 2012.

Published in: Technology, Health & Medicine

  • Helps reproducibility, but there is some debate over whether it can help much with scaling.
  • Raw data has been submitted to the SRA, the assembly to GenBank (no number), and SV data to dbVar (it’s the first plant data they’ve received). Complements the traditional public databases by holding all these “extra” data types: it’s all in one place, and it’s citable.
  • Scott Edmunds: Revolutionizing Data Dissemination: GigaScience

    1. Revolutionizing data dissemination. GSC13, Shenzhen. Scott Edmunds. www.gigasciencejournal.com
    2. Now taking submissions… Large-Scale Data Journal/Database. In conjunction with: Editor-in-Chief: Laurie Goodman, PhD; Editor: Scott Edmunds, PhD; Assistant Editor: Alexandra Basford, PhD; Lead Curator: Tam Sneddon, D.Phil. www.gigasciencejournal.com
    3. Associated Database www.gigaDB.org
    4. Funders Data Reuse Data Producers BGI Databases Journals Users
    5. Data Re-use. Effort($) Usability
    6. Need to lower the hurdles… Effort($) Usability
    7. Need to lower the hurdles… Effort($) Usability
    8. Need to lower the hurdles… Better handling of metadata… Cloud solutions? Better tools for assessing data quality…
    9. Need to lower the hurdles… More efficient handling of data… Cloud? Do we need to keep everything? Compression?
    10. Better incentives? Effort($) Usability
    11. New incentives/credit. Credit where credit is overdue: “One option would be to provide researchers who release data to public repositories with a means of accreditation.” “An ability to search the literature for all online papers that used a particular data set would enable appropriate attribution for those who share.” Nature Biotechnology 27, 579 (2009). Prepublication data sharing (Toronto International Data Release Workshop): “Data producers benefit from creating a citable reference, as it can later be used to reflect impact of the data sets.” Nature 461, 168-170 (2009)
    12. Data citation: DataCite and DOIs. Aims to: “increase acceptance of research data as legitimate, citable contributions to the scholarly record”. “data generated in the course of research are just as valuable to the ongoing academic discourse as papers and monographs”.
    13. For data citation to work, needs: • Proven utility/potential user base. • Acceptance/inclusion by journals. • Data+Citation: inclusion in the references. • Tracking by citation indexes. • Usage of the metrics by the community…
    14. Data citation: utility/user base. >1.3 million DOIs since Dec 2009
    15. BGI Datasets Get DOI®s. Many released pre-publication…
       Invertebrate: Ant (Florida carpenter ant, Jerdon’s jumping ant, Leaf-cutter ant); Roundworm; Silkworm
       Human: Asian individual (YH) (DNA Methylome, Genome Assembly, Transcriptome); Ancient DNA (coming soon) (Saqqaq Eskimo, Aboriginal Australian)
       Vertebrates: Giant panda; Macaque (Chinese rhesus, Crab-eating); Naked mole rat; Penguin (Emperor penguin, Adelie penguin); Pigeon, domestic; Polar bear; Sheep; Tibetan antelope
       Microbe: E. coli O104:H4 TY-2482
       Cell-Line: Chinese Hamster Ovary
       Plants: Chinese cabbage; Cucumber; Foxtail millet; Pigeonpea; Potato; Sorghum
       doi:10.5524/100004
    16. Our first DOI: To maximize its utility to the research community and aid those fighting the current epidemic, genomic data is released here into the public domain under a CC0 license. Until the publication of research papers on the assembly and whole-genome analysis of this isolate we would ask you to cite this dataset as: Li, D; Xi, F; Zhao, M; Liang, Y; Chen, W; Cao, S; Xu, R; Wang, G; Wang, J; Zhang, Z; Li, Y; Cui, Y; Chang, C; Cui, C; Luo, Y; Qin, J; Li, S; Li, J; Peng, Y; Pu, F; Sun, Y; Chen, Y; Zong, Y; Ma, X; Yang, X; Cen, Z; Zhao, X; Chen, F; Yin, X; Song, Y; Rohde, H; Li, Y; Wang, J; Wang, J and the Escherichia coli O104:H4 TY-2482 isolate genome sequencing consortium (2011) Genomic data from Escherichia coli O104:H4 isolate TY-2482. BGI Shenzhen. doi:10.5524/100001 http://dx.doi.org/10.5524/100001 To the extent possible under law, BGI Shenzhen has waived all copyright and related or neighboring rights to Genomic Data from the 2011 E. coli outbreak. This work is published from: China.
    17. Data Citation: acceptance by journals
    18. Data Citation: acceptance by journals
    19. Data+Citation: inclusion in the references
    20. • Data submitted to NCBI databases: - Raw data: SRA:SRA046843 - Assemblies of 3 strains: GenBank:AHAO00000000-AHAQ00000000 - SNPs: dbSNP:1056306 - CNVs, InDels, SV: dbGAP:nstd63 • Submission to public databases complemented by its citable form in GigaDB.
    21. In the references…
    22. Is the DOI…
    23. And now in Nature Biotech…
    24. Data citation: tracking?
    25. Data citation: tracking? DataCite metadata in harvestable form (OAI-PMH). Plans in 2012 to link central metadata repository with WoS - Will finally track and credit use! To be continued…
    26. Thanks to: Laurie Goodman, Alexandra Basford, Tam Sneddon, Shaoguang Liang, Tin-Lap Lee (CUHK), Qiong Luo (HKUST). Contact us: scott@gigasciencejournal.com, editorial@gigasciencejournal.com. Follow us: @gigascience, facebook.com/GigaScience, blogs.openaccesscentral.com/blogs/gigablog/. www.gigasciencejournal.com
    27. GSC13 special series. Seeking submissions highlighting best practice in genomics research: • Discussion/comment/white papers • Cloud computing, software for data handling • Research highlighting best practice • Rapid review - rolling publication after launch issue • High-visibility – published/promoted by BMC/GigaScience • Article Processing Charge covered by BGI • Hosting of any test datasets in GigaDB. Contact: editorial@gigasciencejournal.com www.gigasciencejournal.com
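
Slide 16 gives the dataset identifier in two forms, `doi:10.5524/100001` and `http://dx.doi.org/10.5524/100001`, and asks for a citation built from authors, year, title, publisher and DOI. A minimal Python sketch of how such an identifier could be normalized and assembled into that citation layout follows. The function names and the validation pattern are illustrative assumptions, not part of any GigaDB or DataCite tooling.

```python
import re

# Assumption: a loose pattern for DOI syntax ("10.<registrant>/<suffix>"),
# used here only as a sanity check.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

# Prefixes seen on the slide, plus the newer https resolver form.
KNOWN_PREFIXES = ("doi:", "http://dx.doi.org/", "https://doi.org/")


def normalize_doi(raw: str) -> str:
    """Strip a known prefix and return the bare DOI, e.g. '10.5524/100001'."""
    doi = raw.strip()
    for prefix in KNOWN_PREFIXES:
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"not a recognizable DOI: {raw!r}")
    return doi


def doi_url(raw: str) -> str:
    """Build the resolver URL in the form shown on slide 16."""
    return "http://dx.doi.org/" + normalize_doi(raw)


def format_data_citation(authors: str, year: int, title: str,
                         publisher: str, doi: str) -> str:
    """Assemble a citation in the layout used on slide 16 (hypothetical helper)."""
    return f"{authors} ({year}) {title}. {publisher}. doi:{normalize_doi(doi)}"


if __name__ == "__main__":
    print(doi_url("doi:10.5524/100001"))
    print(format_data_citation(
        "Li, D; et al.",  # author list abbreviated for the example
        2011,
        "Genomic data from Escherichia coli O104:H4 isolate TY-2482",
        "BGI Shenzhen",
        "http://dx.doi.org/10.5524/100001",
    ))
```

Keeping the bare DOI as the canonical form means the same record can be matched whether a reference list cites it with the `doi:` prefix or as a resolver URL, which is what citation tracking of datasets would depend on.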
