Luca Cozzuto
Bioinformatics Core Facility
vectorQC
A pipeline for assembling
and annotation of vectors
Background
A vector is a DNA molecule used as a vehicle to carry foreign genetic material into a
cell, where it can be replicated and/or expressed.
The vector itself is generally a DNA sequence that consists of an insert (transgene) and
a larger sequence that serves as the "backbone" of the vector.
Diagram labels: Vector; Host cell; Amplification (cloning vector); Expression (expression vector)
Background
A vector is composed of different elements:
• Origin of replication
• Cloning sites: one or more targets for restriction enzymes
• Reporter genes: genes whose function is activated or inactivated by a successful insertion, colouring the positive colonies
• Antibiotic resistance: for selecting only the colonies containing the vector
• Promoter
• …
Figure: the pBR322 plasmid (source: Wikipedia)
The problem
Vectors are now a basic tool in biotechnology, and it is quite common for a lab or facility to keep a library of vectors.
With each passing year, the risk of mislabelling, construct degradation and contamination increases.
Quality control of the integrity of the vector backbone and of the inserted DNA could help avoid wasted time and money and reduce errors.
Solution
Biomolecular Screening & Protein Technologies Unit
Genomics Unit
Bioinformatics Unit
Solution
Pool of vectors
Massive sequencing
Analysis: reproducible pipeline
Result: report and map of each vector
Database
The pipeline: vectorQC
Fragmented DNA
Quality trimming and assembly
Scaffolds / whole constructs
Annotation of features (DB of features + list of inserts)
Annotations
Generating maps
Generating report and sequences
vectorQC
Quality control and trimming
• FastQC: QC of initial and trimmed reads
• Skewer: trimming of the raw reads
Read assembly
• FLASH: merging of overlapping reads (optional)
• SPAdes: assembly, then corrected with a custom script that accounts for circularity
• Custom script: randomly joins the scaffolds into a single molecule
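One way to picture the circularity correction: assemblers often report a circular molecule as a linear contig whose end repeats its start. The sketch below is not the vectorQC custom script. It is a minimal Python illustration, and the function name `trim_circular_overlap` and the default `min_overlap` of 20 are assumptions made for this example.

```python
def trim_circular_overlap(contig: str, min_overlap: int = 20) -> tuple[str, bool]:
    """Return (sequence, is_circular).

    If the last k bases of the contig equal its first k bases
    (k >= min_overlap), the repeated tail is removed and the
    contig is flagged as circular. Otherwise it is returned unchanged.
    """
    # The overlap cannot be longer than half of the contig.
    max_len = len(contig) // 2
    # Try the longest candidate overlap first.
    for k in range(max_len, min_overlap - 1, -1):
        if contig[:k] == contig[-k:]:
            return contig[:-k], True
    return contig, False
```

A contig that comes back with `is_circular` set to `False` may simply be incomplete, so a real pipeline would probably report it rather than force it into a circle.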
Annotation
• BLAST: annotating features and, where possible, detecting the DNA insert
• restrict (EMBOSS): detecting restriction enzyme sites
• Circular Genome Viewer: generating the maps
• MultiQC: collecting the results in a comprehensive report
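To illustrate what restriction site detection on a circular construct involves, here is a small Python sketch. It does not call EMBOSS restrict or read REBASE. The two recognition sequences (EcoRI, BamHI) are hard-coded for illustration, and the function name `find_sites` is an assumption of this example. Both sites are palindromic, so scanning one strand is enough here.

```python
import re

# Recognition sequences of two common enzymes (illustrative subset only).
ENZYMES = {"EcoRI": "GAATTC", "BamHI": "GGATCC"}


def find_sites(seq: str, enzymes: dict[str, str] = ENZYMES) -> dict[str, list[int]]:
    """Return 1-based start positions of each site on a circular sequence."""
    seq = seq.upper()
    hits = {}
    for name, site in enzymes.items():
        # Append the first len(site)-1 bases so that sites spanning
        # the origin of the circular molecule are also found.
        extended = seq + seq[: len(site) - 1]
        # A lookahead pattern reports overlapping occurrences as well.
        hits[name] = [m.start() + 1 for m in re.finditer(f"(?={site})", extended)]
    return hits
```

Treating the sequence as circular matters for vectors: a site split across the arbitrary start of the assembled sequence would be missed by a plain linear search.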
vectorQC
Available resources
• Database of features: from the PlasMapper tool, but can be expanded
• Database of restriction enzymes: REBASE
Custom resources
• Insert list: custom FASTA file with the names of the inserts
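As a rough idea of how a custom insert list in FASTA format could be loaded, here is a minimal Python reader. The function name `read_fasta` is an assumption of this example, and the code is not taken from vectorQC.

```python
def read_fasta(lines) -> dict[str, str]:
    """Parse FASTA-formatted lines into a {name: sequence} dictionary.

    The name is the first whitespace-delimited word after '>'.
    """
    records: dict[str, str] = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            parts = line[1:].split()
            name = parts[0] if parts else ""
            records[name] = ""
        elif name is not None:
            # Sequence lines may be wrapped, so concatenate them.
            records[name] += line.upper()
    return records
```

It can be used on an open file handle, for example `read_fasta(open("inserts.fa"))`, where `inserts.fa` is a hypothetical file name.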
https://github.com/biocorecrg/vectorQC
Good practices
• Continuous integration
• Docker image on Docker Hub with automatic builds
Next developments
• Improving the assembly: removing low-coverage contigs
• Comparison with a reference: if one is provided, check the concordance of the contigs with it
• Detection of variants: SNP / indel calling against the reference, if provided
https://github.com/biocorecrg/vectorQC
Thank you!
Toni Hermoso Pulido
Julia Ponomarenko
Sarah Bonnin
Jochen Hecht (Genomics Unit)
Carlo Carolis (BS&PT Unit)
