Tin-Lap Lee: GDSAP- A Galaxy-based platform for large-scale genomics analysis

Tin-Lap Lee (CUHK) presentation "GDSAP- A Galaxy-based platform for large-scale genomics analysis" from the Galaxy Community Conference 2012, Chicago, July 26th 2012

  1. GDSAP- A Galaxy-based platform for large-scale genomics analysis. Tin-Lap Lee, School of Biomedical Sciences, CUHK-BGI Innovation Institute of Trans-omics, The Chinese University of Hong Kong, Hong Kong SAR, China.
  2. CBIIT
     • Jointly established between The Chinese University of Hong Kong (CUHK) and BGI.
     • “We aim to provide a platform conducive to training of multi-disciplinary talents conversant with the knowledge and application of genomics, proteomics, genetics, computational biology and bioinformatics, by capitalizing on both institutions’ expertise and strengths in genomic science.”
  3. Genomic Data Submission and Analytical Platform (GDSAP)
     Objectives:
     • Provides enhanced functionality in addition to the original Galaxy functions:
       – Customized public instances.
       – Seamless integration with the SBS-UCSC genome database mirror and the MyExperiment workflow environment.
       – Exchange and publish data through the GigaScience journal portal.
     Outcomes:
     • Simplifies complicated bioinformatics tasks, accelerates data processing and allows flexible analysis.
     • Significantly reduces software and hardware costs and encourages research collaboration.
  4. GDSAP Structure: Tool Development; Biomedical and bioinformatics research; Publishing
  5. Galaxy/CUHK-BGI: http://www.cuhk.edu.hk/cbiit/galaxy.html
  6. GDSAP Structure: Tool Development; Biomedical and bioinformatics research; Publishing
  7. What is SOAP?
     • SOAP: a tool package from BGI that provides a full solution for NGS data analysis.
  8. Why SOAP?
     • Galaxy has been using SAMtools for consensus sequence calling, but the recent upgrade has left this part out, which is very limiting for some biologists.
     • SOAPsnp is the only other method besides SAMtools that can call full consensus sequences.
     • The main Galaxy site supports none of the SOAP tools, including SOAPsnp.
  9. Galaxy Tool Shed
     • Enables sharing of Galaxy tools across Galaxy servers around the world.
     • SOAP package tools configured for use in Galaxy:
       – SOAPsnp/SOAPdenovo
  10. Implement: SOAPsnp
  11. Implement: SOAPdenovo configuration file
  12. Implement: SOAPdenovo
  13. GDSAP structure: Bioinformatics Development; Biomedical and bioinformatics research; Publishing
  14. How does it work?
     • MyExperiment works as a repository for workflows.
     • Taverna workflows.
     • New: Galaxy workflows.
     • GDSAP integration
  15. Taverna workflow
  16. Galaxy workflow
  17. Import (1)
  18. Import (2)
  19. Export (1)
  20. Export (2)
  21. GDSAP structure: Bioinformatics Development; Biomedical and bioinformatics research; Publishing
  22. Now taking submissions… Large-Scale Data Journal/Database
     In conjunction with:
     Editor-in-Chief: Laurie Goodman, PhD
     Editor: Scott Edmunds, PhD
     Assistant Editor: Alexandra Basford, PhD
     www.gigasciencejournal.com
  23. GigaScience is go…
  24. Data Publishing: www.gigaDB.org
  25. 37 Datasets with DOI®s
     Legend: Released pre-publication; Non-BGI; Paper in GigaScience
     • Invertebrate: Ant (Florida carpenter ant, Jerdon’s jumping ant, Leaf-cutter ant); Roundworm; Schistosoma; Silkworm
     • Human: Asian individual (YH) v1+v2 (DNA Methylome, Genome Assembly, Transcriptome); Cancer (14TB); Hep B infected exomes; Single Cell Bladder Cancer; Ancient DNA (Saqqaq Eskimo, Aboriginal Australian)
     • Vertebrates: Giant panda; Macaque (Chinese rhesus, Crab-eating); Mini-Pig; Naked mole rat; Penguin (Emperor penguin, Adelie penguin); Pigeon, domestic; Polar bear; Sheep; Tibetan antelope; Parrot
     • Microbes: E. Coli O104:H4 TY-2482
     • Cell-Line: Chinese Hamster Ovary
     • Plants: Chinese cabbage; Cucumber; Foxtail millet; Pigeonpea; Potato; Sorghum
     • Coming soon…: Microbiome data; Mouse Methylomes
  26. GDSAP: Genomic Data Submission and Analytical Platform. GigaDB v2 export to GDSAP
  27. GDSAP: Genomic Data Submission and Analytical Platform
     • Data, Data, Data…: Big data from the “Sequencing Coal Face”
     • Data Modeling
     • Pipeline design
     • Validation
     • Applications
     Tin-Lap Lee, CUHK
  28. Acknowledgements
     • Lee Lab (CUHK)
       – Huayan Gao
     • GigaScience
       – Scott Edmunds
       – Peter Li
       – Tam Sneddon
     • BGI-Hong Kong
       – Dennis Chan
       – Edmond Leung
     • Galaxy team
       – Nate Coraor
     • myExperiment
       – Finn Bacall
       – Dave De Roure
     • NBIC
       – Kostas Karasavvas
  29. Thank you
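
The "Implement: SOAPdenovo configuration file" slide shows only a screenshot, which the transcript does not reproduce. As a rough illustration of what such a file contains, the sketch below writes a minimal single-library configuration. The key names (max_rd_len, [LIB], avg_ins, reverse_seq, asm_flags, rank, q1, q2) follow the SOAPdenovo manual as best recalled and should be checked against the installed version. The function name, the read file names and all numeric values are hypothetical and are not taken from the GDSAP wrapper.

```python
def build_soapdenovo_config(max_rd_len, libraries):
    """Return SOAPdenovo-style config text.

    One global max_rd_len line, then one [LIB] section per library.
    Key names are assumed from the SOAPdenovo manual; verify before use.
    """
    lines = [f"max_rd_len={max_rd_len}"]
    for lib in libraries:
        lines.append("[LIB]")
        # Library-level settings, written in a fixed order.
        for key in ("avg_ins", "reverse_seq", "asm_flags", "rank"):
            lines.append(f"{key}={lib[key]}")
        # Paired-end FASTQ files for this library.
        lines.append(f"q1={lib['q1']}")
        lines.append(f"q2={lib['q2']}")
    return "\n".join(lines) + "\n"


# Hypothetical paired-end library; every value here is a placeholder.
example_library = {
    "avg_ins": 200,      # average insert size
    "reverse_seq": 0,    # 0 = reads are not reverse-complemented
    "asm_flags": 3,      # use reads for both contig and scaffold assembly
    "rank": 1,           # order in which libraries are used for scaffolding
    "q1": "reads_1.fq",
    "q2": "reads_2.fq",
}

config_text = build_soapdenovo_config(100, [example_library])
print(config_text)
```

A Galaxy tool wrapper would typically generate a file like this from the user's form inputs and pass its path to the assembler. How the GDSAP wrapper actually does this is not stated in the slides.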
