GigaScience: data and beta-database launch. Announcing GigaDB

GigaScience Editor-in-Chief Laurie Goodman's talk at the International Conference on Genomics pre-conference press session, covering the release of new unpublished datasets and a new-look beta version of the journal's database, GigaDB.org


Transcript

  • 1. Announces the launch of With the release of seventeen new genomic datasets from both plants and animals
  • 2. An upcoming open-access, open-data journal and database
    Innovative article publishing and data hosting … “big and sharable”
    www.gigasciencejournal.com
    Published by BGI in partnership with BioMed Central
  • 3. About The Journal
    • Open-access, open-data online journal optimized for the publication of all types of biological studies that use or create large-scale datasets.
    • Novel publication format that combines standard manuscript publication with an extensive database that hosts all associated data.
    • Scope includes studies from the entire spectrum of life and biomedical sciences, including imaging, neuroscience, ecology, medicine, ‘omics, and other types of large-scale shareable data.
    • Data are released under a CC0 license, placing them, as far as possible under law, in the public domain so that others may freely use them for any purpose without restriction under copyright or database law.
    • Editorial interaction with the different biological communities to determine the best means of hosting and accessing their type of data.
    • Integrated tools to promote more widespread access, viewing, and analysis of the stored data.
    • BGI Cloud Computing resources for handling and analyzing large-scale data.
    • All data are given a DOI to make datasets easy to find and cite, and to allow citation tracking.
  • 4. Why DOI®s?
    A clear method for data tracking and data citation, allowing:
    • Increased searchability (and use) of data
    • Credit for data production, making it clear who produced the data and when
    • The ability to track and receive feedback on data usage
    • Credit to original authors for their data’s use
    • A data citation metric potentially rivaling and complementary to the impact factor
    • The potential to publish papers relating to a dataset, while making the data available and receiving credit for it earlier
  • 5. Our first DOI®:
    To maximize its utility to the research community and aid those fighting the current epidemic, genomic data is released here into the public domain under a CC0 license. Until the publication of research papers on the assembly and whole-genome analysis of this isolate we would ask you to cite this dataset as:
    Li, D; Xi, F; Zhao, M; Liang, Y; Chen, W; Cao, S; Xu, R; Wang, G; Wang, J; Zhang, Z; Li, Y; Cui, Y; Chang, C; Cui, C; Luo, Y; Qin, J; Li, S; Li, J; Peng, Y; Pu, F; Sun, Y; Chen, Y; Zong, Y; Ma, X; Yang, X; Cen, Z; Zhao, X; Chen, F; Yin, X; Song, Y; Rohde, H; Li, Y; Wang, J; Wang, J and the Escherichia coli O104:H4 TY-2482 isolate genome sequencing consortium (2011) Genomic data from Escherichia coli O104:H4 isolate TY-2482. BGI Shenzhen. doi:10.5524/100001
    http://dx.doi.org/10.5524/100001
    To the extent possible under law, BGI Shenzhen has waived all copyright and related or neighboring rights to Genomic Data from the 2011 E. coli outbreak. This work is published from: China.
  • 6. Nine Previously Available Datasets with DOIs
    Animals
    • Giant panda (Ailuropoda melanoleuca)
    • Macaque: Chinese rhesus macaque (Macaca mulatta lasiota); Crab-eating macaque (Macaca fascicularis)
    • Penguin: Emperor penguin (Aptenodytes forsteri); Adelie penguin (Pygoscelis adeliae)
    • Pigeon, domestic (Columba livia domestica)
    • Polar bear (Ursus maritimus)
    Microbes
    • E. coli (Escherichia coli) O104:H4 strain TY-2482
    Cell Lines
    • CHO-K1: Chinese hamster (Cricetulus griseus) ovary cell line K1
  • 7. http://www.GigaDB.org
  • 8. Releasing During ICG-VI
    Animals: Both Vertebrates and Invertebrates
    • Ant: Florida carpenter ant (Camponotus floridanus); Jerdon’s jumping ant (Harpegnathos saltator); Leaf-cutter ant (Acromyrmex echinatior)
    • Human (Homo sapiens), Asian individual (YH): Genome Assembly Data; DNA Methylome of Blood Cells Data; Lymphoblastoid Cell Transcriptome Data
    • Naked mole rat (Heterocephalus glaber)
    • Roundworm (Ascaris suum)
    • Sheep, domestic (Ovis aries)
    • Silkworm: Domestic (Bombyx mori) and wild (Bombyx mandarina), multiple strains
    • Tibetan antelope (Pantholops hodgsonii)
  • 9. Releasing During ICG-VI
    Plants
    • Chinese cabbage (Brassica rapa)
    • Cucumber, domestic (Cucumis sativus var. sativus L.)
    • Foxtail millet (Setaria italica)
    • Pigeonpea (Cajanus cajan)
    • Potato (Solanum tuberosum L.)
    • Sorghum (Sorghum bicolor): two strains, sweet and grain
    Coming: Additional Human Individuals
    • Aboriginal Australian
    • Saqqaq palaeoeskimo
    • And others that are currently under review
  • 10. Datasets without published analysis papers
    • Five of these datasets illustrate the future of early data release: they are being released before their analysis papers are published.
    • These data can now be used by the community, and the data cited with a DOI.
    • This promotes very rapid data release, as the data producers can receive citable credit, the primary means by which most academicians receive career advancement.
    • Thus, DOIs and citation of data reduce the need to delay data release until after publication of the more detailed data analysis paper.
    (1) Foxtail millet; (2) Sorghum; (3) Human Asian individual lymphoblastoid cell transcriptome data; (4) Domestic sheep; (5) Tibetan antelope
  • 11. GDSAP: Genomic Data Submission and Analytical Platform
    Tin-Lap Lee, CUHK
    • Big data from the “Sequencing Farm”: Data, Data, Data…
    • Data modeling
    • Pipeline design
    • Validation
    • Commercial applications
    • “Apps”
  • 12. First demonstration of New Gold Standard for Data Citation
    Dr. Clare Garvey, Editor of Genome Biology, has informed us, and agreed for us to announce, that the sorghum genome analysis paper has just been accepted in Genome Biology. It will be published later this month, and that paper will include the data citation in the references, where it can be easily tracked by Thomson ISI, giving readers the easiest way currently possible to find and use that data.
    Zheng, L-Y; Guo, X-S; He, B; Sun, L-J; Peng, Y; Dong, S-S; Liu, T-F; Jiang, S; Ramachandran, S; Liu, C-M; Jing, H-C: Genome data from sweet and grain sorghum (Sorghum bicolor). GigaScience (2011). http://dx.doi.org/10.5524/100012
  • 13. Editor-in-Chief: Laurie Goodman, PhD
    Editor: Scott Edmunds, PhD
    Assistant Editor: Alexandra Basford, PhD
    Contact: editorial@gigasciencejournal.com
    Follow GigaScience on Twitter @GigaScience
    www.gigasciencejournal.com
    www.gigaDB.org