Sean La
Intern
Simon Fraser University
laseanl@sfu.ca
Cheryl Ames, Ph.D.
Research Fellow
Smithsonian National Museum of Natural History
amesc@si.edu
Ben Busby, Ph.D.
Genomics Outreach Coordinator
NCBI
ben.busby@nih.gov
Motivation
The scientific community wants to detect viruses in SRA.
SIDEARM workflow (figure): SRR and a BLASTDB of viruses; Magic-BLAST (optimized version of BLAST); BAM alignments to viruses; statistics and viral contigs.
Image credits:
1. https://media1.britannica.com/eb-media/82/126182-004-A23C1423.jpg
2. https://github.com/NCBI-Hackathons/Virus_Detection_SRA
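The "statistics" output of the workflow above is not specified on the slide. As one hedged illustration, a per-reference count of aligned records can be derived from SAM output with awk alone. This is a minimal sketch, not SIDEARM's actual implementation: `count_hits`, the reference names, and the toy reads are all made up for the example.

```shell
# Sketch: count aligned records per reference sequence in a SAM file.
# count_hits is a hypothetical helper name, not part of SIDEARM or Magic-BLAST.
count_hits() {
    awk -F '\t' '
        /^@/ { next }                                    # skip SAM header lines
        int($2 / 4) % 2 == 0 && $3 != "*" { n[$3]++ }    # FLAG bit 0x4 unset: mapped
        END { for (ref in n) printf "%s\t%d\n", ref, n[ref] }
    ' "$@" | sort
}

# Toy SAM with made-up reference names and reads (assumption, for illustration).
toy_sam=$(mktemp)
{
    printf '@SQ\tSN:virusA\tLN:5000\n'
    printf '@SQ\tSN:virusB\tLN:7000\n'
    printf 'r1\t0\tvirusA\t10\t255\t4M\t*\t0\t0\tACGT\t*\n'
    printf 'r2\t16\tvirusA\t50\t255\t4M\t*\t0\t0\tGGCA\t*\n'
    printf 'r3\t0\tvirusB\t7\t255\t4M\t*\t0\t0\tTTAG\t*\n'
    printf 'r4\t4\t*\t0\t0\t*\t*\t0\t0\tCCCC\t*\n'
} > "$toy_sam"

count_hits "$toy_sam"
```

The sketch counts alignment records rather than unique reads, so a read with several reported alignments would be counted more than once.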
Detect bacteria in metagenomics samples (image 1)
Identify proteins (image 2)
Detect plasmid sequences in bacterial reads (image 3)
Detect mitochondrial DNA (image 4)
Image credits:
1. http://www.newhealthguide.org/images/19999893/image001.jpg
2. https://3c1703fe8d.site.internapcdn.net/newman/gfx/news/hires/2014/auroraakinase.png
3. https://upload.wikimedia.org/wikipedia/commons/thumb/c/cf/Plasmid_%28english%29.svg/300px-Plasmid_%28english%29.svg.png
4. http://www.penrules.com/_Media/art_mito_300.png
Step 1: Generate A. alata NGS data sets (n=15)
    Illumina.1.fasta = short reads (forward)
    Illumina.2.fasta = short reads (reverse)
    Pacbio.fasta = long reads
Step 2: Trim adaptors from NGS data sets
    -- trimmomatic Illumina.fasta > trimmed.Illumina.fasta
    -- removesmartbell.sh Pacbio.fasta > trimmed.pacbio.fasta
Step 3: Generate A. alata mtDNA database from the 8 mitochondrial chromosomes (GenBank)
    makeblastdb -in alatina_mitochondria.fasta -out ala_mito_db -dbtype nucl
Step 4: Create Magic-BLAST report (.sam) of mapped and unmapped reads
    magicblast -query trimmed.read.fasta -db ala_mito_db -splice F -perc_identity 90 -paired > trimmed.read.sam
Step 5: Extract reads that don't map to the mtDNA database (SAM column 4, POS, is 0 for unmapped reads)
    awk '$4 == 0 {print $0}' trimmed.read.sam > trimmed.read.nomtDNA.sam
Step 6: Convert mitochondria-free files from SAM to FASTA format
    samtools fasta trimmed.read.nomtDNA.sam > trimmed.read.nomtDNA.fasta
Step 7: Pipe mitochondria-free reads (n=15) into downstream pipelines, e.g., genome assembly
    trimmed.read.nomtDNA.fasta
Figure: box jellyfish A. alata
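Steps 5 and 6 can also be sketched with awk alone, selecting unmapped reads by the SAM FLAG field (bit 0x4) instead of the position column and keeping the header lines. This is a minimal sketch under stated assumptions, not the authors' method: `filter_unmapped` and `sam_to_fasta` are hypothetical helper names, and the toy records are made up.

```shell
# Sketch: keep SAM header lines plus reads whose FLAG has bit 0x4 set (unmapped).
# filter_unmapped and sam_to_fasta are hypothetical names for this example.
filter_unmapped() {
    awk -F '\t' '/^@/ { print; next } int($2 / 4) % 2 == 1 { print }' "$1"
}

# Convert SAM records on stdin to FASTA (read name and sequence only).
sam_to_fasta() {
    awk -F '\t' '!/^@/ { print ">" $1; print $10 }'
}

# Toy input (assumption): one read mapped to the mtDNA database, one unmapped.
toy=$(mktemp)
{
    printf '@HD\tVN:1.0\tSO:unsorted\n'
    printf '@SQ\tSN:mito_chr1\tLN:1000\n'
    printf 'read1\t0\tmito_chr1\t101\t255\t4M\t*\t0\t0\tACGT\t*\n'
    printf 'read2\t4\t*\t0\t0\t*\t*\t0\t0\tTTGA\t*\n'
} > "$toy"

filter_unmapped "$toy" | sam_to_fasta
```

With the toy input, only `read2` reaches the FASTA output. For paired data, a mate can be unmapped while its partner maps, so whether to drop the whole pair is a separate decision that this sketch does not make.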
Neisseria meningitidis genome (ERR1865236)
BLAST DB of known bacterial plasmids
SIDEARM
Image credit: https://upload.wikimedia.org/wikipedia/commons/thumb/c/cf/Plasmid_%28english%29.svg/300px-Plasmid_%28english%29.svg.png
Viral metagenome (ERR1301508, à la Chris O’Sullivan)
BLAST DB of complete bacterial genomes
SIDEARM
On SRA…
Using SIDEARM…
• Greg Boratyn
• Mike Muchow
• Paul Cantalupo
• Alex Goncearenco
• Unix.systems

Contamination Detection and Taxonomic confirmation with magicBLAST


Editor's Notes

  1. Alatina alata has one of the most unusual mtDNA organizations in Metazoa, where genes are distributed on eight linear chromosomes with long terminal inverted repeats.