SlideShare a Scribd company logo
Next-Gen Sequencing Analysis
by GigaGalaxy
Tin-Lap, LEE
School of Biomedical Sciences
CUHK-BGI Innovation Institute of Trans-omics,
The Chinese University of Hong Kong
CUHK-BGI Innovation Institute of Trans-Omics (CBIIT)
• Jointly established between The Chinese
University of Hong Kong (CUHK) and BGI
in July 2011.
• “We aim to provide a platform conductive
to training of multi-disciplinary talents
conversant with the knowledge and
application of genomics, proteomics,
genetics, computation biology and
bioinformatics, by capitalizing on both
institutions’ expertise and strengths in
genomic science.”
Galaxy
http://galaxyproject.org/
www.gigasciencejournal.com
Journal, data-platform and
database for large-scale data
Editor-in-Chief: Laurie Goodman
Executive Editor: Scott Edmunds
Commissioning Editor: Nicole Nogoy
Lead Curator: Chris Hunter
Data Platform: Peter Li
in conjunction with
GigaDB
Giga-Galaxy
 Collaboration between GigaScience and CBIIT
 A publicly accessible Galaxy Servers
 Share some of the workload of the main Galaxy server
 Host data and workflows published in GigaScience, particularly involving
NGS data analysis
 SOAP package: advantages from GigaGalaxy
 Application Instance: SOAPdenovo2 tool
http://www.cuhk.edu.hk/cbiit/galaxy.html
Galaxy/CUHK-BGI
Import data from GigaDB to GigaGalaxy
GigaSolution: deconstructing the paper
www.gigadb.org
www.gigasciencejournal.com
galaxy.cbiit.cuhk.edu.hk
Combines and integrates:
Open-access journal
Data Publishing Platform
Data Analysis Platform
doi:10.1186/2047-217X-1-18doi:10.5524/100038
AnalysisData Methods
doi:10.5524/100044+ =
Wang J et al., (2012): Updated genome assembly of YH: the first diploid genome sequence of a
Han Chinese individual (version 2, 07/2012). GigaScience Database.
http://dx.doi.org/10.5524/100038
Luo R et al., (2012): Software and supporting material for “SOAPdenovo2: An empirically improved
memory-efficient short read de novo assembly”. GigaScience Database.
http://dx.doi.org/10.5524/100044
Data
Methods
Luo R et al., (2012): SOAPdenovo2: an empirically improved memory-efficient short-read de novo
assembler GigaScience, 1:18 (28th December 2012) http://dx.doi.org/10.1186/2047-217X-1-18
Analysis
Example
Tin-Lap Lee: Next-Gen Sequencing Analysis by GigaGalaxy
CBIIT GigaGalaxy Structure
Tool
Development PublishingBiomedical and bioinformatics research
What is SOAP?
• SOAP - a tool package that provides full solution to NGS data analysis by BGI.
http://soap.genomics.org.cn/
SOAPdenovo2 tools
 An assembly tool for short reads generated from NGS
technology
 Four modules
 Pregraph: construct bruijn graph
 Contig: identification from overlapping sequence reads
 Map: reads onto contigs
 Scaff: generate final assembly results
 Generate 1. Contig and 2. Scaffold files
SOAPdenovo2 in GigaGalaxy
Integrate BGI SOAP tools into Giga-Galaxy
Assembly Supporting Tools
• SOAPfilter: removed reads with artifacts
• Kmerfreq HA: a kmer frequency counter
• Corrector HA: corrects sequencing errors in short reads
• Gapcloser: close gaps in scaffolds
Put them together
Sequencing
Data
SOAPfilter kmerFreq HA
Corrector HASOAPdenovo2GAGE evaluation
Soapdenovo2 Workflow
S. Aureus Dataset
GAGE
Visualization Tool: CONTIGuator2
CONTIGuator2 output
Visualization
NC_010079.pdf
gi_161510924_ref_NC_010063.1_.pdf
Help Center: Shared Data
• Several Datasets are available from the shared data menu
for test-running the tools.
• Data Libraries
• Published Workflows
• Published Pages
What is in the shared data menu?
SOAPdenovo2 tutorial
Tin-Lap Lee: Next-Gen Sequencing Analysis by GigaGalaxy
How is GigaScience supporting data
reproducibility?
Data sets
Analyses
Open-Paper
Open-Review
DOI:10.1186/2047-217X-1-18
~10000 accesses
Open-Code
8 reviewers tested data in ftp server & named reports published
DOI:10.5524/100044
Open-Pipelines
Open-Workflows
DOI:10.5524/100038
Open-Data
78GB CC0 data
Code in sourceforge under GPLv3: http://soapdenovo2.sourceforge.net/
~5000 downloads
Enabled code to being picked apart by bloggers in wiki
http://homolog.us/wiki/index.php?title=SOAPdenovo2
SOAPdenovo2 workflows implemented in
galaxy.cbiit.cuhk.edu.hk
Implemented entire workflow in GigaGalaxy server, inc.:
• 3 pre-processing steps
• 4 SOAPdenovo modules
• 1 post processing steps
• Evaluation and visualization tools
Will be available for >25K Galaxy users in Galaxy Toolshed
Acknowledgements
• CUHK
• Huayuan Gao
• BGI-HK and GigaScience
• Peter Li
• Scott Edmunds
• Galaxy team members

More Related Content

Viewers also liked

Scott Edmunds slides from #IDCC13 Data Science session
Scott Edmunds slides from #IDCC13 Data Science sessionScott Edmunds slides from #IDCC13 Data Science session
Scott Edmunds slides from #IDCC13 Data Science session
GigaScience, BGI Hong Kong
 
Puneet Laboratories Pvt. Ltd. Mumbai, Mumbai, Zinc Carnosine Capsules
 Puneet Laboratories Pvt. Ltd. Mumbai, Mumbai, Zinc Carnosine Capsules Puneet Laboratories Pvt. Ltd. Mumbai, Mumbai, Zinc Carnosine Capsules
Puneet Laboratories Pvt. Ltd. Mumbai, Mumbai, Zinc Carnosine Capsules
IndiaMART InterMESH Limited
 
Alldelite Heat Pumps Limited, Chennai, Heat Pumps
Alldelite Heat Pumps Limited, Chennai, Heat PumpsAlldelite Heat Pumps Limited, Chennai, Heat Pumps
Alldelite Heat Pumps Limited, Chennai, Heat Pumps
IndiaMART InterMESH Limited
 
Unique SPM Solutions & Engineering, Ghaziabad , Broaching Machines
Unique SPM Solutions & Engineering, Ghaziabad , Broaching Machines Unique SPM Solutions & Engineering, Ghaziabad , Broaching Machines
Unique SPM Solutions & Engineering, Ghaziabad , Broaching Machines
IndiaMART InterMESH Limited
 
Element14 India Private Limited, Bengaluru
Element14 India Private Limited, BengaluruElement14 India Private Limited, Bengaluru
Element14 India Private Limited, Bengaluru
IndiaMART InterMESH Limited
 
Techno Electronics System, Delhi, DC Motor & Transformer
Techno Electronics System, Delhi, DC Motor & TransformerTechno Electronics System, Delhi, DC Motor & Transformer
Techno Electronics System, Delhi, DC Motor & Transformer
IndiaMART InterMESH Limited
 
Wink Lifestyles Pvt. Ltd., Mumbai, Aviator Sunglasses
Wink Lifestyles Pvt. Ltd., Mumbai, Aviator SunglassesWink Lifestyles Pvt. Ltd., Mumbai, Aviator Sunglasses
Wink Lifestyles Pvt. Ltd., Mumbai, Aviator Sunglasses
IndiaMART InterMESH Limited
 
Scott Edmunds: Channeling the Deluge: Reproducibility & Data Dissemination in...
Scott Edmunds: Channeling the Deluge: Reproducibility & Data Dissemination in...Scott Edmunds: Channeling the Deluge: Reproducibility & Data Dissemination in...
Scott Edmunds: Channeling the Deluge: Reproducibility & Data Dissemination in...
GigaScience, BGI Hong Kong
 
GigaScience: data and beta-database launch. Announcing GigaDB
GigaScience: data and beta-database launch. Announcing GigaDBGigaScience: data and beta-database launch. Announcing GigaDB
GigaScience: data and beta-database launch. Announcing GigaDB
GigaScience, BGI Hong Kong
 
DNV Creations, New Delhi, Wood Packaging Solutions
DNV Creations, New Delhi, Wood Packaging SolutionsDNV Creations, New Delhi, Wood Packaging Solutions
DNV Creations, New Delhi, Wood Packaging Solutions
IndiaMART InterMESH Limited
 
Channel Co-operation - A Distant Dream?
Channel Co-operation - A Distant Dream?Channel Co-operation - A Distant Dream?
Channel Co-operation - A Distant Dream?
Richard Tubb
 

Viewers also liked (11)

Scott Edmunds slides from #IDCC13 Data Science session
Scott Edmunds slides from #IDCC13 Data Science sessionScott Edmunds slides from #IDCC13 Data Science session
Scott Edmunds slides from #IDCC13 Data Science session
 
Puneet Laboratories Pvt. Ltd. Mumbai, Mumbai, Zinc Carnosine Capsules
 Puneet Laboratories Pvt. Ltd. Mumbai, Mumbai, Zinc Carnosine Capsules Puneet Laboratories Pvt. Ltd. Mumbai, Mumbai, Zinc Carnosine Capsules
Puneet Laboratories Pvt. Ltd. Mumbai, Mumbai, Zinc Carnosine Capsules
 
Alldelite Heat Pumps Limited, Chennai, Heat Pumps
Alldelite Heat Pumps Limited, Chennai, Heat PumpsAlldelite Heat Pumps Limited, Chennai, Heat Pumps
Alldelite Heat Pumps Limited, Chennai, Heat Pumps
 
Unique SPM Solutions & Engineering, Ghaziabad , Broaching Machines
Unique SPM Solutions & Engineering, Ghaziabad , Broaching Machines Unique SPM Solutions & Engineering, Ghaziabad , Broaching Machines
Unique SPM Solutions & Engineering, Ghaziabad , Broaching Machines
 
Element14 India Private Limited, Bengaluru
Element14 India Private Limited, BengaluruElement14 India Private Limited, Bengaluru
Element14 India Private Limited, Bengaluru
 
Techno Electronics System, Delhi, DC Motor & Transformer
Techno Electronics System, Delhi, DC Motor & TransformerTechno Electronics System, Delhi, DC Motor & Transformer
Techno Electronics System, Delhi, DC Motor & Transformer
 
Wink Lifestyles Pvt. Ltd., Mumbai, Aviator Sunglasses
Wink Lifestyles Pvt. Ltd., Mumbai, Aviator SunglassesWink Lifestyles Pvt. Ltd., Mumbai, Aviator Sunglasses
Wink Lifestyles Pvt. Ltd., Mumbai, Aviator Sunglasses
 
Scott Edmunds: Channeling the Deluge: Reproducibility & Data Dissemination in...
Scott Edmunds: Channeling the Deluge: Reproducibility & Data Dissemination in...Scott Edmunds: Channeling the Deluge: Reproducibility & Data Dissemination in...
Scott Edmunds: Channeling the Deluge: Reproducibility & Data Dissemination in...
 
GigaScience: data and beta-database launch. Announcing GigaDB
GigaScience: data and beta-database launch. Announcing GigaDBGigaScience: data and beta-database launch. Announcing GigaDB
GigaScience: data and beta-database launch. Announcing GigaDB
 
DNV Creations, New Delhi, Wood Packaging Solutions
DNV Creations, New Delhi, Wood Packaging SolutionsDNV Creations, New Delhi, Wood Packaging Solutions
DNV Creations, New Delhi, Wood Packaging Solutions
 
Channel Co-operation - A Distant Dream?
Channel Co-operation - A Distant Dream?Channel Co-operation - A Distant Dream?
Channel Co-operation - A Distant Dream?
 

Similar to Tin-Lap Lee: Next-Gen Sequencing Analysis by GigaGalaxy

Global Network Advancement Group - Next Generation Network-Integrated Systems
Global Network Advancement Group - Next Generation Network-Integrated SystemsGlobal Network Advancement Group - Next Generation Network-Integrated Systems
Global Network Advancement Group - Next Generation Network-Integrated Systems
Larry Smarr
 
Global Network Advancement Group Next Generation Network-Integrated Sys...
      Global Network Advancement GroupNext Generation Network-Integrated Sys...      Global Network Advancement GroupNext Generation Network-Integrated Sys...
Global Network Advancement Group Next Generation Network-Integrated Sys...
Larry Smarr
 
Ogf27 Ligo
Ogf27 LigoOgf27 Ligo
Ogf27 Ligo
kentblackburn
 
Grid computing
Grid computingGrid computing
Grid computing
Ramraj Choudhary
 
C02-Visualization-Applying visual analytics
C02-Visualization-Applying visual analyticsC02-Visualization-Applying visual analytics
C02-Visualization-Applying visual analytics
Bioinformatics Open Source Conference
 
OpenACC and Hackathons Monthly Highlights: April 2023
OpenACC and Hackathons Monthly Highlights: April  2023OpenACC and Hackathons Monthly Highlights: April  2023
OpenACC and Hackathons Monthly Highlights: April 2023
OpenACC
 
2015 FOSS4G Track: Open Specifications for the Storage, Transport and Process...
2015 FOSS4G Track: Open Specifications for the Storage, Transport and Process...2015 FOSS4G Track: Open Specifications for the Storage, Transport and Process...
2015 FOSS4G Track: Open Specifications for the Storage, Transport and Process...
GIS in the Rockies
 
COBWEB technology platform and future development needs
COBWEB technology platform and future development needsCOBWEB technology platform and future development needs
COBWEB technology platform and future development needs
EDINA, University of Edinburgh
 
Indiana University's Advanced Science Gateway Support
Indiana University's Advanced Science Gateway SupportIndiana University's Advanced Science Gateway Support
Indiana University's Advanced Science Gateway Support
marpierc
 
EOSC-Life Workflow Collaboratory
EOSC-Life Workflow CollaboratoryEOSC-Life Workflow Collaboratory
EOSC-Life Workflow Collaboratory
Carole Goble
 
BioNLPSADI
BioNLPSADIBioNLPSADI
Security Challenges and the Pacific Research Platform
Security Challenges and the Pacific Research PlatformSecurity Challenges and the Pacific Research Platform
Security Challenges and the Pacific Research Platform
Larry Smarr
 
IDB-Cloud Providing Bioinformatics Services on Cloud
IDB-Cloud Providing Bioinformatics Services on CloudIDB-Cloud Providing Bioinformatics Services on Cloud
IDB-Cloud Providing Bioinformatics Services on Cloud
stratuslab
 
GlobusWorld 2020 Keynote
GlobusWorld 2020 KeynoteGlobusWorld 2020 Keynote
GlobusWorld 2020 Keynote
Globus
 
OGCE SC10
OGCE SC10OGCE SC10
OGCE SC10
marpierc
 
Validation of services, data and metadata
Validation of services, data and metadataValidation of services, data and metadata
Validation of services, data and metadata
Luis Bermudez
 
OpenACC and Open Hackathons Monthly Highlights June 2022.pdf
OpenACC and Open Hackathons Monthly Highlights June 2022.pdfOpenACC and Open Hackathons Monthly Highlights June 2022.pdf
OpenACC and Open Hackathons Monthly Highlights June 2022.pdf
OpenACC
 
Using e-infrastructures for biodiversity conservation - Gianpaolo Coro (CNR)
Using e-infrastructures for biodiversity conservation - Gianpaolo Coro (CNR)Using e-infrastructures for biodiversity conservation - Gianpaolo Coro (CNR)
Using e-infrastructures for biodiversity conservation - Gianpaolo Coro (CNR)
Blue BRIDGE
 
G3 talk rld_2
G3 talk rld_2G3 talk rld_2
G3 talk rld_2
Robert Davidson
 
Overview of Next Gen Sequencing Data Analysis
Overview of Next Gen Sequencing Data AnalysisOverview of Next Gen Sequencing Data Analysis
Overview of Next Gen Sequencing Data Analysis
Bioinformatics and Computational Biosciences Branch
 

Similar to Tin-Lap Lee: Next-Gen Sequencing Analysis by GigaGalaxy (20)

Global Network Advancement Group - Next Generation Network-Integrated Systems
Global Network Advancement Group - Next Generation Network-Integrated SystemsGlobal Network Advancement Group - Next Generation Network-Integrated Systems
Global Network Advancement Group - Next Generation Network-Integrated Systems
 
Global Network Advancement Group Next Generation Network-Integrated Sys...
      Global Network Advancement GroupNext Generation Network-Integrated Sys...      Global Network Advancement GroupNext Generation Network-Integrated Sys...
Global Network Advancement Group Next Generation Network-Integrated Sys...
 
Ogf27 Ligo
Ogf27 LigoOgf27 Ligo
Ogf27 Ligo
 
Grid computing
Grid computingGrid computing
Grid computing
 
C02-Visualization-Applying visual analytics
C02-Visualization-Applying visual analyticsC02-Visualization-Applying visual analytics
C02-Visualization-Applying visual analytics
 
OpenACC and Hackathons Monthly Highlights: April 2023
OpenACC and Hackathons Monthly Highlights: April  2023OpenACC and Hackathons Monthly Highlights: April  2023
OpenACC and Hackathons Monthly Highlights: April 2023
 
2015 FOSS4G Track: Open Specifications for the Storage, Transport and Process...
2015 FOSS4G Track: Open Specifications for the Storage, Transport and Process...2015 FOSS4G Track: Open Specifications for the Storage, Transport and Process...
2015 FOSS4G Track: Open Specifications for the Storage, Transport and Process...
 
COBWEB technology platform and future development needs
COBWEB technology platform and future development needsCOBWEB technology platform and future development needs
COBWEB technology platform and future development needs
 
Indiana University's Advanced Science Gateway Support
Indiana University's Advanced Science Gateway SupportIndiana University's Advanced Science Gateway Support
Indiana University's Advanced Science Gateway Support
 
EOSC-Life Workflow Collaboratory
EOSC-Life Workflow CollaboratoryEOSC-Life Workflow Collaboratory
EOSC-Life Workflow Collaboratory
 
BioNLPSADI
BioNLPSADIBioNLPSADI
BioNLPSADI
 
Security Challenges and the Pacific Research Platform
Security Challenges and the Pacific Research PlatformSecurity Challenges and the Pacific Research Platform
Security Challenges and the Pacific Research Platform
 
IDB-Cloud Providing Bioinformatics Services on Cloud
IDB-Cloud Providing Bioinformatics Services on CloudIDB-Cloud Providing Bioinformatics Services on Cloud
IDB-Cloud Providing Bioinformatics Services on Cloud
 
GlobusWorld 2020 Keynote
GlobusWorld 2020 KeynoteGlobusWorld 2020 Keynote
GlobusWorld 2020 Keynote
 
OGCE SC10
OGCE SC10OGCE SC10
OGCE SC10
 
Validation of services, data and metadata
Validation of services, data and metadataValidation of services, data and metadata
Validation of services, data and metadata
 
OpenACC and Open Hackathons Monthly Highlights June 2022.pdf
OpenACC and Open Hackathons Monthly Highlights June 2022.pdfOpenACC and Open Hackathons Monthly Highlights June 2022.pdf
OpenACC and Open Hackathons Monthly Highlights June 2022.pdf
 
Using e-infrastructures for biodiversity conservation - Gianpaolo Coro (CNR)
Using e-infrastructures for biodiversity conservation - Gianpaolo Coro (CNR)Using e-infrastructures for biodiversity conservation - Gianpaolo Coro (CNR)
Using e-infrastructures for biodiversity conservation - Gianpaolo Coro (CNR)
 
G3 talk rld_2
G3 talk rld_2G3 talk rld_2
G3 talk rld_2
 
Overview of Next Gen Sequencing Data Analysis
Overview of Next Gen Sequencing Data AnalysisOverview of Next Gen Sequencing Data Analysis
Overview of Next Gen Sequencing Data Analysis
 

More from GigaScience, BGI Hong Kong

IDW2022: A decades experiences in transparent and interactive publication of ...
IDW2022: A decades experiences in transparent and interactive publication of ...IDW2022: A decades experiences in transparent and interactive publication of ...
IDW2022: A decades experiences in transparent and interactive publication of ...
GigaScience, BGI Hong Kong
 
Scott Edmunds: Preparing a data paper for GigaByte
Scott Edmunds: Preparing a data paper for GigaByteScott Edmunds: Preparing a data paper for GigaByte
Scott Edmunds: Preparing a data paper for GigaByte
GigaScience, BGI Hong Kong
 
STM Week: Demonstrating bringing publications to life via an End-to-end XML p...
STM Week: Demonstrating bringing publications to life via an End-to-end XML p...STM Week: Demonstrating bringing publications to life via an End-to-end XML p...
STM Week: Demonstrating bringing publications to life via an End-to-end XML p...
GigaScience, BGI Hong Kong
 
Measuring richness. A RCT to quantify the benefits of metadata quality; Scott...
Measuring richness. A RCT to quantify the benefits of metadata quality; Scott...Measuring richness. A RCT to quantify the benefits of metadata quality; Scott...
Measuring richness. A RCT to quantify the benefits of metadata quality; Scott...
GigaScience, BGI Hong Kong
 
Scott Edmunds: A new publishing workflow for rapid dissemination of genomes u...
Scott Edmunds: A new publishing workflow for rapid dissemination of genomes u...Scott Edmunds: A new publishing workflow for rapid dissemination of genomes u...
Scott Edmunds: A new publishing workflow for rapid dissemination of genomes u...
GigaScience, BGI Hong Kong
 
Scott Edmunds: Quantifying how FAIR is Hong Kong: The Hong Kong Shareability ...
Scott Edmunds: Quantifying how FAIR is Hong Kong: The Hong Kong Shareability ...Scott Edmunds: Quantifying how FAIR is Hong Kong: The Hong Kong Shareability ...
Scott Edmunds: Quantifying how FAIR is Hong Kong: The Hong Kong Shareability ...
GigaScience, BGI Hong Kong
 
Scott Edmunds talk at IARC: How can we make science more trustworthy and FAIR...
Scott Edmunds talk at IARC: How can we make science more trustworthy and FAIR...Scott Edmunds talk at IARC: How can we make science more trustworthy and FAIR...
Scott Edmunds talk at IARC: How can we make science more trustworthy and FAIR...
GigaScience, BGI Hong Kong
 
PAGAsia19 - The Digitalization of Ruili Botanical Garden Project: Production...
PAGAsia19 - The Digitalization of Ruili Botanical Garden Project:  Production...PAGAsia19 - The Digitalization of Ruili Botanical Garden Project:  Production...
PAGAsia19 - The Digitalization of Ruili Botanical Garden Project: Production...
GigaScience, BGI Hong Kong
 
Democratising biodiversity and genomics research: open and citizen science to...
Democratising biodiversity and genomics research: open and citizen science to...Democratising biodiversity and genomics research: open and citizen science to...
Democratising biodiversity and genomics research: open and citizen science to...
GigaScience, BGI Hong Kong
 
Hong Kong Open Access & GigaScience: CCHK@10
Hong Kong Open Access & GigaScience: CCHK@10Hong Kong Open Access & GigaScience: CCHK@10
Hong Kong Open Access & GigaScience: CCHK@10
GigaScience, BGI Hong Kong
 
Ricardo Wurmus: Reproducible genomics analysis pipelines with GNU Guix
Ricardo Wurmus: Reproducible genomics analysis pipelines with GNU GuixRicardo Wurmus: Reproducible genomics analysis pipelines with GNU Guix
Ricardo Wurmus: Reproducible genomics analysis pipelines with GNU Guix
GigaScience, BGI Hong Kong
 
Anil Thanki at #ICG13: Aequatus: An open-source homology browser
Anil Thanki at #ICG13: Aequatus: An open-source homology browserAnil Thanki at #ICG13: Aequatus: An open-source homology browser
Anil Thanki at #ICG13: Aequatus: An open-source homology browser
GigaScience, BGI Hong Kong
 
Paul Pavlidis at #ICG13: Monitoring changes in the Gene Ontology and their im...
Paul Pavlidis at #ICG13: Monitoring changes in the Gene Ontology and their im...Paul Pavlidis at #ICG13: Monitoring changes in the Gene Ontology and their im...
Paul Pavlidis at #ICG13: Monitoring changes in the Gene Ontology and their im...
GigaScience, BGI Hong Kong
 
Venice Juanillas at #ICG13: Rice Galaxy: an open resource for plant science
Venice Juanillas at #ICG13: Rice Galaxy: an open resource for plant scienceVenice Juanillas at #ICG13: Rice Galaxy: an open resource for plant science
Venice Juanillas at #ICG13: Rice Galaxy: an open resource for plant science
GigaScience, BGI Hong Kong
 
Stefan Prost at #ICG13: Genome analyses show strong selection on coloration, ...
Stefan Prost at #ICG13: Genome analyses show strong selection on coloration, ...Stefan Prost at #ICG13: Genome analyses show strong selection on coloration, ...
Stefan Prost at #ICG13: Genome analyses show strong selection on coloration, ...
GigaScience, BGI Hong Kong
 
Lisa Johnson at #ICG13: Re-assembly, quality evaluation, and annotation of 67...
Lisa Johnson at #ICG13: Re-assembly, quality evaluation, and annotation of 67...Lisa Johnson at #ICG13: Re-assembly, quality evaluation, and annotation of 67...
Lisa Johnson at #ICG13: Re-assembly, quality evaluation, and annotation of 67...
GigaScience, BGI Hong Kong
 
Chris Armit at IDW2018: Democratising Data Publishing: A Global Perspective
Chris Armit at IDW2018: Democratising Data Publishing: A Global PerspectiveChris Armit at IDW2018: Democratising Data Publishing: A Global Perspective
Chris Armit at IDW2018: Democratising Data Publishing: A Global Perspective
GigaScience, BGI Hong Kong
 
EMBL OA Week: FAIR or unfair? Principled publishing for more Open & Democrati...
EMBL OA Week: FAIR or unfair? Principled publishing for more Open & Democrati...EMBL OA Week: FAIR or unfair? Principled publishing for more Open & Democrati...
EMBL OA Week: FAIR or unfair? Principled publishing for more Open & Democrati...
GigaScience, BGI Hong Kong
 
Reproducible method and benchmarking publishing for the data (and evidence) d...
Reproducible method and benchmarking publishing for the data (and evidence) d...Reproducible method and benchmarking publishing for the data (and evidence) d...
Reproducible method and benchmarking publishing for the data (and evidence) d...
GigaScience, BGI Hong Kong
 
Mary Ann Tuli: What MODs can learn from Journals – a GigaDB curator’s perspec...
Mary Ann Tuli: What MODs can learn from Journals – a GigaDB curator’s perspec...Mary Ann Tuli: What MODs can learn from Journals – a GigaDB curator’s perspec...
Mary Ann Tuli: What MODs can learn from Journals – a GigaDB curator’s perspec...
GigaScience, BGI Hong Kong
 

More from GigaScience, BGI Hong Kong (20)

IDW2022: A decades experiences in transparent and interactive publication of ...
IDW2022: A decades experiences in transparent and interactive publication of ...IDW2022: A decades experiences in transparent and interactive publication of ...
IDW2022: A decades experiences in transparent and interactive publication of ...
 
Scott Edmunds: Preparing a data paper for GigaByte
Scott Edmunds: Preparing a data paper for GigaByteScott Edmunds: Preparing a data paper for GigaByte
Scott Edmunds: Preparing a data paper for GigaByte
 
STM Week: Demonstrating bringing publications to life via an End-to-end XML p...
STM Week: Demonstrating bringing publications to life via an End-to-end XML p...STM Week: Demonstrating bringing publications to life via an End-to-end XML p...
STM Week: Demonstrating bringing publications to life via an End-to-end XML p...
 
Measuring richness. A RCT to quantify the benefits of metadata quality; Scott...
Measuring richness. A RCT to quantify the benefits of metadata quality; Scott...Measuring richness. A RCT to quantify the benefits of metadata quality; Scott...
Measuring richness. A RCT to quantify the benefits of metadata quality; Scott...
 
Scott Edmunds: A new publishing workflow for rapid dissemination of genomes u...
Scott Edmunds: A new publishing workflow for rapid dissemination of genomes u...Scott Edmunds: A new publishing workflow for rapid dissemination of genomes u...
Scott Edmunds: A new publishing workflow for rapid dissemination of genomes u...
 
Scott Edmunds: Quantifying how FAIR is Hong Kong: The Hong Kong Shareability ...
Scott Edmunds: Quantifying how FAIR is Hong Kong: The Hong Kong Shareability ...Scott Edmunds: Quantifying how FAIR is Hong Kong: The Hong Kong Shareability ...
Scott Edmunds: Quantifying how FAIR is Hong Kong: The Hong Kong Shareability ...
 
Scott Edmunds talk at IARC: How can we make science more trustworthy and FAIR...
Scott Edmunds talk at IARC: How can we make science more trustworthy and FAIR...Scott Edmunds talk at IARC: How can we make science more trustworthy and FAIR...
Scott Edmunds talk at IARC: How can we make science more trustworthy and FAIR...
 
PAGAsia19 - The Digitalization of Ruili Botanical Garden Project: Production...
PAGAsia19 - The Digitalization of Ruili Botanical Garden Project:  Production...PAGAsia19 - The Digitalization of Ruili Botanical Garden Project:  Production...
PAGAsia19 - The Digitalization of Ruili Botanical Garden Project: Production...
 
Democratising biodiversity and genomics research: open and citizen science to...
Democratising biodiversity and genomics research: open and citizen science to...Democratising biodiversity and genomics research: open and citizen science to...
Democratising biodiversity and genomics research: open and citizen science to...
 
Hong Kong Open Access & GigaScience: CCHK@10
Hong Kong Open Access & GigaScience: CCHK@10Hong Kong Open Access & GigaScience: CCHK@10
Hong Kong Open Access & GigaScience: CCHK@10
 
Ricardo Wurmus: Reproducible genomics analysis pipelines with GNU Guix
Ricardo Wurmus: Reproducible genomics analysis pipelines with GNU GuixRicardo Wurmus: Reproducible genomics analysis pipelines with GNU Guix
Ricardo Wurmus: Reproducible genomics analysis pipelines with GNU Guix
 
Anil Thanki at #ICG13: Aequatus: An open-source homology browser
Anil Thanki at #ICG13: Aequatus: An open-source homology browserAnil Thanki at #ICG13: Aequatus: An open-source homology browser
Anil Thanki at #ICG13: Aequatus: An open-source homology browser
 
Paul Pavlidis at #ICG13: Monitoring changes in the Gene Ontology and their im...
Paul Pavlidis at #ICG13: Monitoring changes in the Gene Ontology and their im...Paul Pavlidis at #ICG13: Monitoring changes in the Gene Ontology and their im...
Paul Pavlidis at #ICG13: Monitoring changes in the Gene Ontology and their im...
 
Venice Juanillas at #ICG13: Rice Galaxy: an open resource for plant science
Venice Juanillas at #ICG13: Rice Galaxy: an open resource for plant scienceVenice Juanillas at #ICG13: Rice Galaxy: an open resource for plant science
Venice Juanillas at #ICG13: Rice Galaxy: an open resource for plant science
 
Stefan Prost at #ICG13: Genome analyses show strong selection on coloration, ...
Stefan Prost at #ICG13: Genome analyses show strong selection on coloration, ...Stefan Prost at #ICG13: Genome analyses show strong selection on coloration, ...
Stefan Prost at #ICG13: Genome analyses show strong selection on coloration, ...
 
Lisa Johnson at #ICG13: Re-assembly, quality evaluation, and annotation of 67...
Lisa Johnson at #ICG13: Re-assembly, quality evaluation, and annotation of 67...Lisa Johnson at #ICG13: Re-assembly, quality evaluation, and annotation of 67...
Lisa Johnson at #ICG13: Re-assembly, quality evaluation, and annotation of 67...
 
Chris Armit at IDW2018: Democratising Data Publishing: A Global Perspective
Chris Armit at IDW2018: Democratising Data Publishing: A Global PerspectiveChris Armit at IDW2018: Democratising Data Publishing: A Global Perspective
Chris Armit at IDW2018: Democratising Data Publishing: A Global Perspective
 
EMBL OA Week: FAIR or unfair? Principled publishing for more Open & Democrati...
EMBL OA Week: FAIR or unfair? Principled publishing for more Open & Democrati...EMBL OA Week: FAIR or unfair? Principled publishing for more Open & Democrati...
EMBL OA Week: FAIR or unfair? Principled publishing for more Open & Democrati...
 
Reproducible method and benchmarking publishing for the data (and evidence) d...
Reproducible method and benchmarking publishing for the data (and evidence) d...Reproducible method and benchmarking publishing for the data (and evidence) d...
Reproducible method and benchmarking publishing for the data (and evidence) d...
 
Mary Ann Tuli: What MODs can learn from Journals – a GigaDB curator’s perspec...
Mary Ann Tuli: What MODs can learn from Journals – a GigaDB curator’s perspec...Mary Ann Tuli: What MODs can learn from Journals – a GigaDB curator’s perspec...
Mary Ann Tuli: What MODs can learn from Journals – a GigaDB curator’s perspec...
 

Recently uploaded

Retrieval Augmented Generation Evaluation with Ragas
Retrieval Augmented Generation Evaluation with RagasRetrieval Augmented Generation Evaluation with Ragas
Retrieval Augmented Generation Evaluation with Ragas
Zilliz
 
EuroPython 2024 - Streamlining Testing in a Large Python Codebase
EuroPython 2024 - Streamlining Testing in a Large Python CodebaseEuroPython 2024 - Streamlining Testing in a Large Python Codebase
EuroPython 2024 - Streamlining Testing in a Large Python Codebase
Jimmy Lai
 
Connector Corner: Leveraging Snowflake Integration for Smarter Decision Making
Connector Corner: Leveraging Snowflake Integration for Smarter Decision MakingConnector Corner: Leveraging Snowflake Integration for Smarter Decision Making
Connector Corner: Leveraging Snowflake Integration for Smarter Decision Making
DianaGray10
 
Feature sql server terbaru performance.pptx
Feature sql server terbaru performance.pptxFeature sql server terbaru performance.pptx
Feature sql server terbaru performance.pptx
ssuser1915fe1
 
Use Cases & Benefits of RPA in Manufacturing in 2024.pptx
Use Cases & Benefits of RPA in Manufacturing in 2024.pptxUse Cases & Benefits of RPA in Manufacturing in 2024.pptx
Use Cases & Benefits of RPA in Manufacturing in 2024.pptx
SynapseIndia
 
Gen AI: Privacy Risks of Large Language Models (LLMs)
Gen AI: Privacy Risks of Large Language Models (LLMs)Gen AI: Privacy Risks of Large Language Models (LLMs)
Gen AI: Privacy Risks of Large Language Models (LLMs)
Debmalya Biswas
 
UX Webinar Series: Essentials for Adopting Passkeys as the Foundation of your...
UX Webinar Series: Essentials for Adopting Passkeys as the Foundation of your...UX Webinar Series: Essentials for Adopting Passkeys as the Foundation of your...
UX Webinar Series: Essentials for Adopting Passkeys as the Foundation of your...
FIDO Alliance
 
The Path to General-Purpose Robots - Coatue
The Path to General-Purpose Robots - CoatueThe Path to General-Purpose Robots - Coatue
The Path to General-Purpose Robots - Coatue
Razin Mustafiz
 
Semantic-Aware Code Model: Elevating the Future of Software Development
Semantic-Aware Code Model: Elevating the Future of Software DevelopmentSemantic-Aware Code Model: Elevating the Future of Software Development
Semantic-Aware Code Model: Elevating the Future of Software Development
Baishakhi Ray
 
Patch Tuesday de julio
Patch Tuesday de julioPatch Tuesday de julio
Patch Tuesday de julio
Ivanti
 
Finetuning GenAI For Hacking and Defending
Finetuning GenAI For Hacking and DefendingFinetuning GenAI For Hacking and Defending
Finetuning GenAI For Hacking and Defending
Priyanka Aash
 
kk vathada _digital transformation frameworks_2024.pdf
kk vathada _digital transformation frameworks_2024.pdfkk vathada _digital transformation frameworks_2024.pdf
kk vathada _digital transformation frameworks_2024.pdf
KIRAN KV
 
Acumatica vs. Sage Intacct vs. NetSuite _ NOW CFO.pdf
Acumatica vs. Sage Intacct vs. NetSuite _ NOW CFO.pdfAcumatica vs. Sage Intacct vs. NetSuite _ NOW CFO.pdf
Acumatica vs. Sage Intacct vs. NetSuite _ NOW CFO.pdf
BrainSell Technologies
 
Mule Experience Hub and Release Channel with Java 17
Mule Experience Hub and Release Channel with Java 17Mule Experience Hub and Release Channel with Java 17
Mule Experience Hub and Release Channel with Java 17
Bhajan Mehta
 
Sonkoloniya documentation - ONEprojukti.pdf
Sonkoloniya documentation - ONEprojukti.pdfSonkoloniya documentation - ONEprojukti.pdf
Sonkoloniya documentation - ONEprojukti.pdf
SubhamMandal40
 
Types of Weaving loom machine & it's technology
Types of Weaving loom machine & it's technologyTypes of Weaving loom machine & it's technology
Types of Weaving loom machine & it's technology
ldtexsolbl
 
Russian Girls Call Navi Mumbai 🎈🔥9920725232 🔥💋🎈 Provide Best And Top Girl Ser...
Russian Girls Call Navi Mumbai 🎈🔥9920725232 🔥💋🎈 Provide Best And Top Girl Ser...Russian Girls Call Navi Mumbai 🎈🔥9920725232 🔥💋🎈 Provide Best And Top Girl Ser...
Russian Girls Call Navi Mumbai 🎈🔥9920725232 🔥💋🎈 Provide Best And Top Girl Ser...
bellared2
 
Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery
Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery
Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery
sunilverma7884
 
Premium Girls Call Mumbai 9920725232 Unlimited Short Providing Girls Service ...
Premium Girls Call Mumbai 9920725232 Unlimited Short Providing Girls Service ...Premium Girls Call Mumbai 9920725232 Unlimited Short Providing Girls Service ...
Premium Girls Call Mumbai 9920725232 Unlimited Short Providing Girls Service ...
shanihomely
 
Using LLM Agents with Llama 3, LangGraph and Milvus
Using LLM Agents with Llama 3, LangGraph and MilvusUsing LLM Agents with Llama 3, LangGraph and Milvus
Using LLM Agents with Llama 3, LangGraph and Milvus
Zilliz
 

Recently uploaded (20)

Retrieval Augmented Generation Evaluation with Ragas
Retrieval Augmented Generation Evaluation with RagasRetrieval Augmented Generation Evaluation with Ragas
Retrieval Augmented Generation Evaluation with Ragas
 
EuroPython 2024 - Streamlining Testing in a Large Python Codebase
EuroPython 2024 - Streamlining Testing in a Large Python CodebaseEuroPython 2024 - Streamlining Testing in a Large Python Codebase
EuroPython 2024 - Streamlining Testing in a Large Python Codebase
 
Connector Corner: Leveraging Snowflake Integration for Smarter Decision Making
Connector Corner: Leveraging Snowflake Integration for Smarter Decision MakingConnector Corner: Leveraging Snowflake Integration for Smarter Decision Making
Connector Corner: Leveraging Snowflake Integration for Smarter Decision Making
 
Feature sql server terbaru performance.pptx
Feature sql server terbaru performance.pptxFeature sql server terbaru performance.pptx
Feature sql server terbaru performance.pptx
 
Use Cases & Benefits of RPA in Manufacturing in 2024.pptx
Use Cases & Benefits of RPA in Manufacturing in 2024.pptxUse Cases & Benefits of RPA in Manufacturing in 2024.pptx
Use Cases & Benefits of RPA in Manufacturing in 2024.pptx
 
Gen AI: Privacy Risks of Large Language Models (LLMs)
Gen AI: Privacy Risks of Large Language Models (LLMs)Gen AI: Privacy Risks of Large Language Models (LLMs)
Gen AI: Privacy Risks of Large Language Models (LLMs)
 
UX Webinar Series: Essentials for Adopting Passkeys as the Foundation of your...
UX Webinar Series: Essentials for Adopting Passkeys as the Foundation of your...UX Webinar Series: Essentials for Adopting Passkeys as the Foundation of your...
UX Webinar Series: Essentials for Adopting Passkeys as the Foundation of your...
 
The Path to General-Purpose Robots - Coatue
The Path to General-Purpose Robots - CoatueThe Path to General-Purpose Robots - Coatue
The Path to General-Purpose Robots - Coatue
 
Semantic-Aware Code Model: Elevating the Future of Software Development
Semantic-Aware Code Model: Elevating the Future of Software DevelopmentSemantic-Aware Code Model: Elevating the Future of Software Development
Semantic-Aware Code Model: Elevating the Future of Software Development
 
Patch Tuesday de julio
Patch Tuesday de julioPatch Tuesday de julio
Patch Tuesday de julio
 
Finetuning GenAI For Hacking and Defending
Finetuning GenAI For Hacking and DefendingFinetuning GenAI For Hacking and Defending
Finetuning GenAI For Hacking and Defending
 
kk vathada _digital transformation frameworks_2024.pdf
kk vathada _digital transformation frameworks_2024.pdfkk vathada _digital transformation frameworks_2024.pdf
kk vathada _digital transformation frameworks_2024.pdf
 
Acumatica vs. Sage Intacct vs. NetSuite _ NOW CFO.pdf
Acumatica vs. Sage Intacct vs. NetSuite _ NOW CFO.pdfAcumatica vs. Sage Intacct vs. NetSuite _ NOW CFO.pdf
Acumatica vs. Sage Intacct vs. NetSuite _ NOW CFO.pdf
 
Mule Experience Hub and Release Channel with Java 17
Mule Experience Hub and Release Channel with Java 17Mule Experience Hub and Release Channel with Java 17
Mule Experience Hub and Release Channel with Java 17
 
Sonkoloniya documentation - ONEprojukti.pdf
Sonkoloniya documentation - ONEprojukti.pdfSonkoloniya documentation - ONEprojukti.pdf
Sonkoloniya documentation - ONEprojukti.pdf
 
Types of Weaving loom machine & it's technology
Types of Weaving loom machine & it's technologyTypes of Weaving loom machine & it's technology
Types of Weaving loom machine & it's technology
 
Russian Girls Call Navi Mumbai 🎈🔥9920725232 🔥💋🎈 Provide Best And Top Girl Ser...
Russian Girls Call Navi Mumbai 🎈🔥9920725232 🔥💋🎈 Provide Best And Top Girl Ser...Russian Girls Call Navi Mumbai 🎈🔥9920725232 🔥💋🎈 Provide Best And Top Girl Ser...
Russian Girls Call Navi Mumbai 🎈🔥9920725232 🔥💋🎈 Provide Best And Top Girl Ser...
 
Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery
Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery
Girls call Kolkata 👀 XXXXXXXXXXX 👀 Rs.9.5 K Cash Payment With Room Delivery
 
Premium Girls Call Mumbai 9920725232 Unlimited Short Providing Girls Service ...
Premium Girls Call Mumbai 9920725232 Unlimited Short Providing Girls Service ...Premium Girls Call Mumbai 9920725232 Unlimited Short Providing Girls Service ...
Premium Girls Call Mumbai 9920725232 Unlimited Short Providing Girls Service ...
 
Using LLM Agents with Llama 3, LangGraph and Milvus
Using LLM Agents with Llama 3, LangGraph and MilvusUsing LLM Agents with Llama 3, LangGraph and Milvus
Using LLM Agents with Llama 3, LangGraph and Milvus
 

Tin-Lap Lee: Next-Gen Sequencing Analysis by GigaGalaxy

  • 1. Next-Gen Sequencing Analysis by GigaGalaxy Tin-Lap, LEE School of Biomedical Sciences CUHK-BGI Innovation Institute of Trans-omics, The Chinese University of Hong Kong
  • 2. CUHK-BGI Innovation Institute of Trans-Omics (CBIIT) • Jointly established between The Chinese University of Hong Kong (CUHK) and BGI in July 2011. • “We aim to provide a platform conductive to training of multi-disciplinary talents conversant with the knowledge and application of genomics, proteomics, genetics, computation biology and bioinformatics, by capitalizing on both institutions’ expertise and strengths in genomic science.”
  • 4. www.gigasciencejournal.com Journal, data-platform and database for large-scale data Editor-in-Chief: Laurie Goodman Executive Editor: Scott Edmunds Commissioning Editor: Nicole Nogoy Lead Curator: Chris Hunter Data Platform: Peter Li in conjunction with
  • 6. Giga-Galaxy  Collaboration between GigaScience and CBIIT  A publicly accessible Galaxy Servers  Share some of the workload of the main Galaxy server  Host data and workflows published in GigaScience, particularly involving NGS data analysis  SOAP package: advantages from GigaGalaxy  Application Instance: SOAPdenovo2 tool
  • 8. Import data from GigaDB to GigaGalaxy
  • 9. GigaSolution: deconstructing the paper www.gigadb.org www.gigasciencejournal.com galaxy.cbiit.cuhk.edu.hk Combines and integrates: Open-access journal Data Publishing Platform Data Analysis Platform
  • 10. doi:10.1186/2047-217X-1-18doi:10.5524/100038 AnalysisData Methods doi:10.5524/100044+ = Wang J et al., (2012): Updated genome assembly of YH: the first diploid genome sequence of a Han Chinese individual (version 2, 07/2012). GigaScience Database. http://dx.doi.org/10.5524/100038 Luo R et al., (2012): Software and supporting material for “SOAPdenovo2: An empirically improved memory-efficient short read de novo assembly”. GigaScience Database. http://dx.doi.org/10.5524/100044 Data Methods Luo R et al., (2012): SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler GigaScience, 1:18 (28th December 2012) http://dx.doi.org/10.1186/2047-217X-1-18 Analysis Example
  • 12. CBIIT GigaGalaxy Structure Tool Development PublishingBiomedical and bioinformatics research
  • 13. What is SOAP? • SOAP - a tool package that provides full solution to NGS data analysis by BGI. http://soap.genomics.org.cn/
  • 14. SOAPdenovo2 tools  An assembly tool for short reads generated from NGS technology  Four modules  Pregraph: construct bruijn graph  Contig: identification from overlapping sequence reads  Map: reads onto contigs  Scaff: generate final assembly results  Generate 1. Contig and 2. Scaffold files
  • 16. Integrate BGI SOAP tools into Giga-Galaxy
  • 17. Assembly Supporting Tools • SOAPfilter: removed reads with artifacts • Kmerfreq HA: a kmer frequency counter • Corrector HA: corrects sequencing errors in short reads • Gapcloser: close gaps in scaffolds
  • 18. Put them together Sequencing Data SOAPfilter kmerFreq HA Corrector HASOAPdenovo2GAGE evaluation
  • 21. GAGE
  • 25. Help Center: Shared Data • Several Datasets are available from the shared data menu for test-running the tools. • Data Libraries • Published Workflows • Published Pages
  • 26. What is in the shared data menu?
  • 29. How is GigaScience supporting data reproducibility? Data sets Analyses Open-Paper Open-Review DOI:10.1186/2047-217X-1-18 ~10000 accesses Open-Code 8 reviewers tested data in ftp server & named reports published DOI:10.5524/100044 Open-Pipelines Open-Workflows DOI:10.5524/100038 Open-Data 78GB CC0 data Code in sourceforge under GPLv3: http://soapdenovo2.sourceforge.net/ ~5000 downloads Enabled code to being picked apart by bloggers in wiki http://homolog.us/wiki/index.php?title=SOAPdenovo2
  • 30. SOAPdenovo2 workflows implemented in galaxy.cbiit.cuhk.edu.hk Implemented entire workflow in GigaGalaxy server, inc.: • 3 pre-processing steps • 4 SOAPdenovo modules • 1 post processing steps • Evaluation and visualization tools Will be available for >25K Galaxy users in Galaxy Toolshed
  • 31. Acknowledgements • CUHK • Huayuan Gao • BGI-HK and GigaScience • Peter Li • Scott Edmunds • Galaxy team members

Editor's Notes

  1. Galaxy is a web-based data analysis platform developed by PSUAccessible, Reproducible, and transparentEasy to use, no command line, much shorter learning curve for biologists
  2. The first section of this talk is about implementation of public instance using galaxy tool shed. We are currently implement the first public SOAP instance to the platform.
  3. The SOAP package provides a set of tools for processing NGS data. There are different versions of SOAP for mapping short reads to reference sequences. There are also tools like soapdenovo for construction of a new genome sequence and soapsnp which can assemble a consensus sequence and identify SNPs present on it in relation to a reference. Documentation in the BGI SOAP package is limited in scope, making the tools difficult to use. We will be working with the BGI developers in providing test data and Galaxy pipelines demonstrating the use of SOAP.